eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	a20b58845f	CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.	2016-08-03 10:00:43 -07:00
Benoit Steiner	fd220dd8b0	Use numext::conj instead of std::conj	2016-08-01 18:16:16 -07:00
Benoit Steiner	e256acec7c	Avoid unnecessary object copies	2016-08-01 17:03:39 -07:00
Benoit Steiner	2693fd54bf	bug #1266 : half implementation has been moved to half_impl namespace	2016-07-29 13:45:56 -07:00
Benoit Steiner	3d3d34e442	Deleted dead code.	2016-07-25 08:53:37 -07:00
Gael Guennebaud	6d5daf32f5	bug #1255 : comment out broken and unused line.	2016-07-25 14:48:30 +02:00
Gael Guennebaud	9908020d36	Add minimal support for Array<string>, and fix Tensor<string>	2016-07-25 14:25:56 +02:00
Benoit Steiner	c6b0de2c21	Improved partial reductions in more cases	2016-07-22 17:18:20 -07:00
Gael Guennebaud	0f350a8b7e	Fix CUDA compilation	2016-07-21 18:47:07 +02:00
Benoit Steiner	20f7ef2f89	An evalTo expression is aligned iff both the lhs and the rhs are aligned.	2016-07-12 10:56:42 -07:00
Benoit Steiner	3a2dd352ae	Improved the contraction mapper to properly support tensor products	2016-07-11 13:43:41 -07:00
Benoit Steiner	0bc020be9d	Improved the detection of packet size in the tensor scan evaluator.	2016-07-11 12:14:56 -07:00
Gael Guennebaud	194daa3048	Fix assertion (it did not make sense for static_val types)	2016-07-11 11:39:27 +02:00
Gael Guennebaud	18c35747ce	Emulate _BitScanReverse64 for 32 bits builds	2016-07-11 11:38:04 +02:00
Gael Guennebaud	599f8ba617	Change runtime to compile-time conditional.	2016-07-08 11:39:43 +02:00
Gael Guennebaud	544935101a	Fix warnings	2016-07-08 11:38:52 +02:00
Gael Guennebaud	179ebb88f9	Fix warning	2016-07-07 09:16:40 +02:00
Gael Guennebaud	ce9fc0ce14	fix clang compilation	2016-07-04 12:59:02 +02:00
Gael Guennebaud	440020474c	Workaround compilation issue with msvc	2016-07-04 12:49:19 +02:00
Benoit Steiner	cb2d8b8fa6	Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.	2016-06-29 15:42:01 -07:00
Benoit Steiner	b2a47641ce	Made the code compile when using CUDA architecture < 300	2016-06-29 15:32:47 -07:00
Igor Babuschkin	85699850d9	Add missing CUDA kernel to tensor scan op. The TensorScanOp implementation was missing a CUDA kernel launch. This adds a simple placeholder implementation.	2016-06-29 11:54:35 +01:00
Benoit Steiner	75c333f94c	Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor. Also avoid taking references to values that may become stale after a copy construction.	2016-06-27 10:32:38 -07:00
Rasmus Munk Larsen	a9c1e4d7b7	Return -1 from CurrentThreadId when called by thread outside the pool.	2016-06-23 16:40:07 -07:00
Rasmus Munk Larsen	d39df320d2	Resolve merge.	2016-06-23 15:08:03 -07:00
Gael Guennebaud	360a743a10	bug #1241 : does not emit anything for empty tensors	2016-06-23 18:47:31 +02:00
Gael Guennebaud	7c6561485a	merge PR 194	2016-06-23 15:29:57 +02:00
Benoit Steiner	a29a2cb4ff	Silenced a couple of compilation warnings generated by xcode	2016-06-22 16:43:02 -07:00
Benoit Steiner	f8fcd6b32d	Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers	2016-06-22 16:03:11 -07:00
Benoit Steiner	c58df31747	Handle empty tensors in the print functions	2016-06-21 09:22:43 -07:00
Benoit Steiner	de32f8d656	Fixed the printing of rank-0 tensors	2016-06-20 10:46:45 -07:00
Benoit Steiner	7d495d890a	Merged in ibab/eigen (pull request PR-197): Implement exclusive scan option for Tensor library	2016-06-14 17:54:59 -07:00
Benoit Steiner	aedc5be1d6	Avoid generating pseudo random numbers that are multiples of 5: this helps spread the load over multiple cpus without having to rely on work stealing.	2016-06-14 17:51:47 -07:00
Igor Babuschkin	c4d10e921f	Implement exclusive scan option	2016-06-14 19:44:07 +01:00
Gael Guennebaud	76236cdea4	merge	2016-06-14 15:33:47 +02:00
Gael Guennebaud	5d38203735	Update Tensor module to use bind1st_op and bind2nd_op	2016-06-14 15:06:03 +02:00
Benoit Steiner	65d33e5898	Merged in ibab/eigen (pull request PR-195): Add small fixes to TensorScanOp	2016-06-10 19:31:17 -07:00
Benoit Steiner	a05607875a	Don't refer to the half2 type unless it's been defined	2016-06-10 11:53:56 -07:00
Igor Babuschkin	86aedc9282	Add small fixes to TensorScanOp	2016-06-07 20:06:38 +01:00
Benoit Steiner	84b2060a9e	Fixed compilation error with gcc 4.4	2016-06-06 17:16:19 -07:00
Benoit Steiner	7ef9f47b58	Misc small improvements to the reduction code.	2016-06-06 14:09:46 -07:00
Benoit Steiner	9137f560f0	Moved assertions to the constructor to make the code more portable	2016-06-06 07:26:48 -07:00
Rasmus Munk Larsen	f1f2ff8208	size_t -> int	2016-06-03 18:06:37 -07:00
Rasmus Munk Larsen	76308e7fd2	Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool.	2016-06-03 16:28:58 -07:00
Benoit Steiner	37638dafd7	Simplified the code that dispatches vectorized reductions on GPU	2016-06-09 10:29:52 -07:00
Benoit Steiner	66796e843d	Fixed definition of some of the reducer_traits	2016-06-09 08:50:01 -07:00
Benoit Steiner	14a112ee15	Use signed integers more consistently to encode the number of threads to use to evaluate a tensor expression.	2016-06-09 08:25:22 -07:00
Benoit Steiner	8f92c26319	Improved code formatting	2016-06-09 08:23:42 -07:00
Benoit Steiner	aa33446dac	Improved support for vectorization of 16-bit floats	2016-06-09 08:22:27 -07:00
Benoit Steiner	d6d39c7ddb	Added missing EIGEN_DEVICE_FUNC	2016-06-07 14:35:08 -07:00

707 Commits