Benoit Steiner
|
e4d4d15588
|
Register the cxx11_tensor_device test only for recent cuda architectures (i.e. >= 3.0) since the test instantiates contractions that require a modern gpu.
|
2016-09-12 19:01:52 -07:00 |
|
Benoit Steiner
|
4dfd888c92
|
CUDA contractions require arch >= 3.0: don't compile the cuda contraction tests on older architectures.
|
2016-09-12 18:49:01 -07:00 |
|
Benoit Steiner
|
028e299577
|
Fixed a bug impacting some outer reductions on GPU
|
2016-09-12 18:36:52 -07:00 |
|
Benoit Steiner
|
5f50f12d2c
|
Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem.
|
2016-09-12 13:46:13 -07:00 |
|
Benoit Steiner
|
8321dcce76
|
Merged latest updates from trunk
|
2016-09-12 10:33:05 -07:00 |
|
Benoit Steiner
|
eb6ba00cc8
|
Properly size the list of waiters
|
2016-09-12 10:31:55 -07:00 |
|
Benoit Steiner
|
a618094b62
|
Added a resize method to MaxSizeVector
|
2016-09-12 10:30:53 -07:00 |
|
Gael Guennebaud
|
471eac5399
|
bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)
|
2016-09-08 08:36:27 +02:00 |
|
Gael Guennebaud
|
e1642f485c
|
bug #1288: fix memory leak in arpack wrapper.
|
2016-09-05 18:01:30 +02:00 |
|
Gael Guennebaud
|
dabc81751f
|
Fix compilation when cuda_fp16.h does not exist.
|
2016-09-05 17:14:20 +02:00 |
|
Benoit Steiner
|
87a8a1975e
|
Fixed a regression test
|
2016-09-02 19:29:33 -07:00 |
|
Benoit Steiner
|
13df3441ae
|
Use MaxSizeVector instead of std::vector: xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instructions to initialize it. This can result in random crashes when compiling with AVX instructions enabled.
|
2016-09-02 19:25:47 -07:00 |
|
Benoit Steiner
|
cadd124d73
|
Pulled latest update from trunk
|
2016-09-02 15:30:02 -07:00 |
|
Benoit Steiner
|
05b0518077
|
Made the index type an explicit template parameter to help some compilers compile the code.
|
2016-09-02 15:29:34 -07:00 |
|
Benoit Steiner
|
adf864fec0
|
Merged in rmlarsen/eigen (pull request PR-222)
Fix CUDA build broken by changes to min and max reduction.
|
2016-09-02 14:11:20 -07:00 |
|
Rasmus Munk Larsen
|
13e93ca8b7
|
Fix CUDA build broken by changes to min and max reduction.
|
2016-09-02 13:41:36 -07:00 |
|
Benoit Steiner
|
6c05c3dd49
|
Fix the cxx11_tensor_cuda.cu test on 32bit platforms.
|
2016-09-02 11:12:16 -07:00 |
|
Benoit Steiner
|
039e225f7f
|
Added a test for nullary expressions on CUDA
Also check that we can mix 64 and 32 bit indices in the same compilation unit
|
2016-09-01 13:28:12 -07:00 |
|
Benoit Steiner
|
c53f783705
|
Updated the contraction code to support constant inputs.
|
2016-09-01 11:41:27 -07:00 |
|
Gael Guennebaud
|
46475eff9a
|
Adjust Tensor module wrt recent change in nullary functor
|
2016-09-01 13:40:45 +02:00 |
|
Gael Guennebaud
|
72a4d49315
|
Fix compilation with CUDA 8
|
2016-09-01 13:39:33 +02:00 |
|
Rasmus Munk Larsen
|
a1e092d1e8
|
Fix bugs to make min and max reducers work correctly with IEEE infinities.
|
2016-08-31 15:04:16 -07:00 |
|
Gael Guennebaud
|
1f84f0d33a
|
merge EulerAngles module
|
2016-08-30 10:01:53 +02:00 |
|
Gael Guennebaud
|
e074f720c7
|
Include missing forward declaration of SparseMatrix
|
2016-08-29 18:56:46 +02:00 |
|
Gael Guennebaud
|
6cd7b9ea6b
|
Fix compilation with cuda 8
|
2016-08-29 11:06:08 +02:00 |
|
Gael Guennebaud
|
35a8e94577
|
bug #1167: simplify installation of header files using cmake's install(DIRECTORY ...) command.
|
2016-08-29 10:59:37 +02:00 |
|
Gael Guennebaud
|
0f56b5a6de
|
enable vectorization path when testing half on cuda, and add test for log1p
|
2016-08-26 14:55:51 +02:00 |
|
Gael Guennebaud
|
965e595f02
|
Add missing log1p method
|
2016-08-26 14:55:00 +02:00 |
|
Benoit Steiner
|
7944d4431f
|
Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC cuda compiler compile this code.
|
2016-08-18 13:46:36 -07:00 |
|
Benoit Steiner
|
647a51b426
|
Force the inlining of a simple accessor.
|
2016-08-18 12:31:02 -07:00 |
|
Benoit Steiner
|
a452dedb4f
|
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
|
2016-08-18 12:29:54 -07:00 |
|
Igor Babuschkin
|
18c67df31c
|
Fix remaining CUDA >= 300 checks
|
2016-08-18 17:18:30 +01:00 |
|
Igor Babuschkin
|
1569a7d7ab
|
Add the necessary CUDA >= 300 checks back
|
2016-08-18 17:15:12 +01:00 |
|
Benoit Steiner
|
2b17f34574
|
Properly detect the type of the result of a contraction.
|
2016-08-16 16:00:30 -07:00 |
|
Benoit Steiner
|
34ae80179a
|
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
|
2016-08-15 10:29:14 -07:00 |
|
Benoit Steiner
|
fe73648c98
|
Fixed a bug in the documentation.
|
2016-08-12 10:00:43 -07:00 |
|
Benoit Steiner
|
e3a8dfb02f
|
std::erfcf doesn't exist: use numext::erfc instead
|
2016-08-11 15:24:06 -07:00 |
|
Benoit Steiner
|
64e68cbe87
|
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
|
2016-08-08 19:29:59 -07:00 |
|
Igor Babuschkin
|
841e075154
|
Remove CUDA >= 300 checks and enable outer reduction for doubles
|
2016-08-06 18:07:50 +01:00 |
|
Igor Babuschkin
|
0425118e2a
|
Merge upstream changes
|
2016-08-05 14:34:57 +01:00 |
|
Igor Babuschkin
|
9537e8b118
|
Make use of atomicExch for atomicExchCustom
|
2016-08-05 14:29:58 +01:00 |
|
Benoit Steiner
|
5eea1c7f97
|
Fixed cut-and-paste bug in debug message
|
2016-08-04 17:34:13 -07:00 |
|
Benoit Steiner
|
b50d8f8c4a
|
Extended a regression test to validate that basic fp16 support works with cuda 7.0
|
2016-08-03 16:50:13 -07:00 |
|
Benoit Steiner
|
fad9828769
|
Deleted redundant regression test.
|
2016-08-03 16:08:37 -07:00 |
|
Benoit Steiner
|
ca2cee2739
|
Merged in ibab/eigen (pull request PR-206)
Expose real and imag methods on Tensors
|
2016-08-03 11:53:04 -07:00 |
|
Benoit Steiner
|
d92df04ce8
|
Cleaned up the new float16 test a bit
|
2016-08-03 11:50:07 -07:00 |
|
Benoit Steiner
|
81099ef482
|
Added a test for fp16
|
2016-08-03 11:41:17 -07:00 |
|
Benoit Steiner
|
a20b58845f
|
CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.
|
2016-08-03 10:00:43 -07:00 |
|
Benoit Steiner
|
fd220dd8b0
|
Use numext::conj instead of std::conj
|
2016-08-01 18:16:16 -07:00 |
|
Benoit Steiner
|
e256acec7c
|
Avoid unnecessary object copies
|
2016-08-01 17:03:39 -07:00 |
|