eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	abc815798b	Added a new operation to enable more powerful tensor indexing.	2016-05-27 12:22:25 -07:00
Benoit Steiner	5707537592	Fixed the warning "option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'" generated by nvcc 7.5	2016-05-27 10:47:53 -07:00
Gael Guennebaud	22a035db95	Fix compilation when defaulting to row-major	2016-05-27 10:31:11 +02:00
Benoit Steiner	1ae2567861	Fixed some compilation warnings	2016-05-26 15:57:19 -07:00
Benoit Steiner	1a47844529	Preserve the ability to vectorize the evaluation of an expression even when it involves a cast that isn't vectorized (e.g. fp16 to float)	2016-05-26 14:37:09 -07:00
Benoit Steiner	36369ab63c	Resolved merge conflicts	2016-05-26 13:39:39 -07:00
Benoit Steiner	28fcb5ca2a	Merged latest reduction improvements	2016-05-26 12:19:33 -07:00
Benoit Steiner	c1c7f06c35	Improved the performance of inner reductions.	2016-05-26 11:53:59 -07:00
Benoit Steiner	22d02c9855	Improved the coverage of the fp16 reduction tests	2016-05-26 11:12:16 -07:00
Benoit Steiner	8288b0aec2	Code cleanup.	2016-05-26 09:00:04 -07:00
Benoit Steiner	2d7ed54ba2	Made the static storage class qualifier come first.	2016-05-25 22:16:15 -07:00
Benoit Steiner	e1fca8866e	Deleted unnecessary explicit qualifiers.	2016-05-25 22:15:26 -07:00
Benoit Steiner	9b0aaf5113	Don't mark inline functions as static since it confuses the ICC compiler	2016-05-25 22:10:11 -07:00
Benoit Steiner	037a463fd5	Marked unused variables as such	2016-05-25 22:07:48 -07:00
Benoit Steiner	3ac4045272	Made the IndexPair code compile in non cxx11 mode	2016-05-25 15:15:12 -07:00
Benoit Steiner	66556d0e05	Made the index pair list code more portable across various compilers	2016-05-25 14:34:27 -07:00
Benoit Steiner	034aa3b2c0	Improved the performance of tensor padding	2016-05-25 11:43:08 -07:00
Benoit Steiner	58026905ae	Added support for statically known lists of pairs of indices	2016-05-25 11:04:14 -07:00
Benoit Steiner	0835667329	There is no need to make the fp16 full reduction kernel a static function.	2016-05-24 23:11:56 -07:00
Benoit Steiner	b5d6b52a4d	Fixed compilation warning	2016-05-24 23:10:57 -07:00
Benoit Steiner	a09cbf9905	Merged in rmlarsen/eigen (pull request PR-188) Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.	2016-05-23 12:55:12 -07:00
Christoph Hertzberg	718521d5cf	Silenced several double-promotion warnings	2016-05-22 18:17:04 +02:00
Christoph Hertzberg	b5a7603822	fixed macro name	2016-05-22 16:49:29 +02:00
Christoph Hertzberg	25a03c02d6	Fix some sign-compare warnings	2016-05-22 16:42:27 +02:00
Gael Guennebaud	ccaace03c9	Make EIGEN_HAS_CONSTEXPR user configurable	2016-05-20 15:10:08 +02:00
Gael Guennebaud	c3410804cd	Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable	2016-05-20 15:05:38 +02:00
Gael Guennebaud	48bf5ec216	Make EIGEN_HAS_RVALUE_REFERENCES user configurable	2016-05-20 14:54:20 +02:00
Gael Guennebaud	f43ae88892	Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES	2016-05-20 14:48:51 +02:00
Gael Guennebaud	2f656ce447	Remove std:: to enable custom scalar types.	2016-05-19 23:13:47 +02:00
Rasmus Larsen	b1e080c752	Merged eigen/eigen into default	2016-05-18 15:21:50 -07:00
Rasmus Munk Larsen	5624219b6b	Merge.	2016-05-18 15:16:06 -07:00
Rasmus Munk Larsen	7df811cfe5	Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.	2016-05-18 15:09:48 -07:00
Benoit Steiner	bb3ff8e9d9	Advertise the packet api of the tensor reducers iff the corresponding packet primitives are available.	2016-05-18 14:52:49 -07:00
Gael Guennebaud	548a487800	bug #1229 : bypass usage of Derived::Options which is available for plain matrix types only. Better use column-major storage anyway.	2016-05-18 16:44:05 +02:00
Gael Guennebaud	43790e009b	Pass argument by const ref instead of by value in pow(AutoDiffScalar...)	2016-05-18 16:28:02 +02:00
Gael Guennebaud	1fbfab27a9	bug #1223 : fix compilation of AutoDiffScalar's min/max operators, and add regression unit test.	2016-05-18 16:26:26 +02:00
Gael Guennebaud	448d9d943c	bug #1222 : fix compilation in AutoDiffScalar and add respective unit test	2016-05-18 16:00:11 +02:00
Rasmus Munk Larsen	f519fca72b	Reduce overhead for small tensors and cheap ops by short-circuiting the cost computation and block size calculation in parallelFor.	2016-05-17 16:06:00 -07:00
Benoit Steiner	86ae94462e	#if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly.	2016-05-17 14:06:15 -07:00
Benoit Steiner	997c335970	Fixed compilation error	2016-05-17 12:54:18 -07:00
Benoit Steiner	ebf6ada5ee	Fixed compilation error in the tensor thread pool	2016-05-17 12:33:46 -07:00
Rasmus Munk Larsen	0bb61b04ca	Merge upstream.	2016-05-17 10:26:10 -07:00
Rasmus Munk Larsen	0dbd68145f	Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h.	2016-05-17 10:25:19 -07:00
Rasmus Larsen	00228f2506	Merged eigen/eigen into default	2016-05-17 09:49:31 -07:00
Benoit Steiner	e7e64c3277	Enable the use of the packet api to evaluate tensor broadcasts. This speeds things up quite a bit. Before: BM_broadcasting/10 500000 3690 27.10 MFlops/s; BM_broadcasting/80 500000 4014 1594.24 MFlops/s; BM_broadcasting/640 100000 14770 27731.35 MFlops/s; BM_broadcasting/4K 5000 632711 39512.48 MFlops/s. After: BM_broadcasting/10 500000 4287 23.33 MFlops/s; BM_broadcasting/80 500000 4455 1436.41 MFlops/s; BM_broadcasting/640 200000 10195 40173.01 MFlops/s; BM_broadcasting/4K 5000 423746 58997.57 MFlops/s.	2016-05-17 09:24:35 -07:00
Benoit Steiner	5fa27574dd	Allow vectorized padding on GPU. This helps speed things up a little. Before: BM_padding/10 5000000 460 217.03 MFlops/s; BM_padding/80 5000000 460 13899.40 MFlops/s; BM_padding/640 5000000 461 888421.17 MFlops/s; BM_padding/4K 5000000 460 54316322.55 MFlops/s. After: BM_padding/10 5000000 454 220.20 MFlops/s; BM_padding/80 5000000 455 14039.86 MFlops/s; BM_padding/640 5000000 452 904968.83 MFlops/s; BM_padding/4K 5000000 411 60750049.21 MFlops/s.	2016-05-17 09:17:26 -07:00
Benoit Steiner	a910bcee43	Merged latest updates from trunk	2016-05-17 09:14:22 -07:00
Benoit Steiner	8d06c02ffd	Allow vectorized padding on GPU. This helps speed things up a little. Before: BM_padding/10 5000000 460 217.03 MFlops/s; BM_padding/80 5000000 460 13899.40 MFlops/s; BM_padding/640 5000000 461 888421.17 MFlops/s; BM_padding/4K 5000000 460 54316322.55 MFlops/s. After: BM_padding/10 5000000 454 220.20 MFlops/s; BM_padding/80 5000000 455 14039.86 MFlops/s; BM_padding/640 5000000 452 904968.83 MFlops/s; BM_padding/4K 5000000 411 60750049.21 MFlops/s.	2016-05-17 09:13:27 -07:00
Benoit Steiner	86da77cb9b	Pulled latest updates from trunk.	2016-05-17 07:21:48 -07:00
Benoit Steiner	92fc6add43	Don't rely on c++11 extension when we don't have to.	2016-05-17 07:21:22 -07:00

1843 Commits