eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-03-07 18:27:40 +08:00

Author	SHA1	Message	Date
Benoit Steiner	780623261e	Re-enabled the optimized reduction CUDA code.	2016-01-11 09:07:14 -08:00
Benoit Steiner	e76904af1b	Simplified the dispatch code.	2016-01-08 16:50:57 -08:00
Benoit Steiner	d726e864ac	Made it possible to use array of size 0 on CUDA devices	2016-01-08 16:38:14 -08:00
Benoit Steiner	3358dfd5dd	Reworked the dispatch of optimized cuda reduction kernels to work around a nvcc bug that prevented the code from compiling in optimized mode in some cases	2016-01-08 16:28:53 -08:00
Benoit Steiner	53749ff415	Prevent nvcc from miscompiling the cuda metakernel. Unfortunately this reintroduces some compilation warnings but it's much better than having to deal with random assertion failures.	2016-01-08 13:53:40 -08:00
Benoit Steiner	6639b7d6e8	Removed a couple of partial specializations that confuse nvcc and result in errors such as this: error: more than one partial specialization matches the template argument list of class "Eigen::internal::get<3, Eigen::internal::numeric_list<std::size_t, 1UL, 1UL, 1UL, 1UL>>" "Eigen::internal::get<n, Eigen::internal::numeric_list<T, a, as...>>" "Eigen::internal::get<n, Eigen::internal::numeric_list<T, as...>>"	2016-01-07 18:45:19 -08:00
Benoit Steiner	0cb2ca5de2	Fixed a typo.	2016-01-06 18:50:28 -08:00
Benoit Steiner	213459d818	Optimized the performance of broadcasting of scalars.	2016-01-06 18:47:45 -08:00
Benoit Steiner	cfff40b1d4	Improved the performance of reductions on CUDA devices	2016-01-04 17:25:00 -08:00
Benoit Steiner	515dee0baf	Added a 'divup' util to compute the ceiling of the quotient of two integers	2016-01-04 16:29:26 -08:00
Gael Guennebaud	8b0d1eb0f7	Fix numerous doxygen shortcomings, and work around some clang -Wdocumentation warnings	2016-01-01 21:45:06 +01:00
Gael Guennebaud	978c379ed7	Add missing ctor from uint	2015-12-30 12:52:38 +01:00
Benoit Steiner	bdcbc66a5c	Don't attempt to vectorize mean reductions of integers since we can't use SSE or AVX instructions to divide 2 integers.	2015-12-22 17:51:55 -08:00
Benoit Steiner	a1e08fb2a5	Optimized the configuration of the outer reduction cuda kernel	2015-12-22 16:30:10 -08:00
Benoit Steiner	9c7d96697b	Added missing define	2015-12-22 16:11:07 -08:00
Benoit Steiner	e7e6d01810	Made sure the optimized gpu reduction code is actually compiled.	2015-12-22 15:07:33 -08:00
Benoit Steiner	b5d2078c4a	Optimized outer reduction on GPUs.	2015-12-22 15:06:17 -08:00
Benoit Steiner	1c3e78319d	Added missing const	2015-12-21 15:05:01 -08:00
Benoit Steiner	1b82969559	Add alignment requirement for local buffer used by the slicing op.	2015-12-18 14:36:35 -08:00
Benoit Steiner	75a7fa1919	Doubled the speed of full reductions on GPUs.	2015-12-18 14:07:31 -08:00
Benoit Steiner	8dd17cbe80	Fixed a clang compilation warning triggered by the use of arrays of size 0.	2015-12-17 14:00:33 -08:00
Benoit Steiner	4aac55f684	Silenced some compilation warnings triggered by nvcc	2015-12-17 13:39:01 -08:00
Benoit Steiner	40e6250fc3	Made it possible to run tensor chipping operations on CUDA devices	2015-12-17 13:29:08 -08:00
Benoit Steiner	2ca55a3ae4	Fixed some compilation errors triggered by the tensor code with msvc 2008	2015-12-16 20:45:58 -08:00
Gael Guennebaud	35d8725c73	Disable AutoDiffScalar generic copy ctor for non compatible scalar types (fix ambiguous template instantiation)	2015-12-16 10:14:24 +01:00
Christoph Hertzberg	92655e7215	bug #1136 : Protect isinf for Intel compilers. Also don't distinguish GCC from ICC and don't rely on EIGEN_NOT_A_MACRO, which might not be defined when including this.	2015-12-15 11:34:52 +01:00
Benoit Steiner	17352e2792	Made the entire TensorFixedSize api callable from a CUDA kernel.	2015-12-14 15:20:31 -08:00
Benoit Steiner	75e19fc7ca	Marked the tensor constructors as EIGEN_DEVICE_FUNC: This makes it possible to call them from a CUDA kernel.	2015-12-14 15:12:55 -08:00
Gael Guennebaud	ca39b1546e	Merged in ebrevdo/eigen (pull request PR-148) Add special functions to eigen: lgamma, erf, erfc.	2015-12-11 11:52:09 +01:00
Benoit Steiner	6af52a1227	Fixed a typo in the constructor of tensors of rank 5.	2015-12-10 23:31:12 -08:00
Benoit Steiner	2d8f2e4042	Made 2 tests compile without cxx11.	2015-12-10 23:20:04 -08:00
Benoit Steiner	8d28a161b2	Use the proper accessor to refer to the value of a scalar tensor	2015-12-10 22:53:56 -08:00
Benoit Steiner	8e00ea9a92	Fixed the coefficient accessors used for the 2d and 3d cases when compiling without cxx11 support.	2015-12-10 22:45:10 -08:00
Benoit Steiner	9db8316c93	Updated the cxx11_tensor_custom_op to not require cxx11.	2015-12-10 20:53:44 -08:00
Benoit Steiner	4e324ca6ae	Updated the cxx11_tensor_assign test to make it compile without support for cxx11	2015-12-10 20:47:25 -08:00
Eugene Brevdo	fa4f933c0f	Add special functions to Eigen: lgamma, erf, erfc. Includes CUDA support and unit tests.	2015-12-07 15:24:49 -08:00
Benoit Steiner	7dfe75f445	Fixed compilation warnings	2015-12-07 08:12:30 -08:00
Gael Guennebaud	ad3d68400e	Add matrix-free solver example	2015-12-07 12:33:38 +01:00
Gael Guennebaud	b37036afce	Implement wrapper for matrix-free iterative solvers	2015-12-07 12:23:22 +01:00
Benoit Steiner	f4ca8ad917	Use signed integers instead of unsigned ones more consistently in the codebase.	2015-12-04 18:14:16 -08:00
Benoit Steiner	490d26e4c1	Use integers instead of std::size_t to encode the number of dimensions in the Tensor class since most of the code already uses integers.	2015-12-04 10:15:11 -08:00
Benoit Steiner	d20efc974d	Made it possible to use the sigmoid functor within a CUDA kernel.	2015-12-04 09:38:15 -08:00
Benoit Steiner	029052d276	Deleted redundant code	2015-12-03 17:08:47 -08:00
Gael Guennebaud	fd727249ad	Update ADOL-C support.	2015-11-30 16:00:22 +01:00
Gael Guennebaud	da46b1ed54	bug #1112 : fix compilation on exotic architectures	2015-11-27 15:57:18 +01:00
Mark Borgerding	7ddcf97da7	added scalar_sign_op (both real,complex)	2015-11-24 17:15:07 -05:00
Benoit Steiner	44848ac39b	Fixed a bug in TensorArgMax.h	2015-11-23 15:58:47 -08:00
Benoit Steiner	547a8608e5	Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC. Also updated the code to silence bogus warnings generated by nvcc when compiling this function.	2015-11-23 12:17:45 -08:00
Benoit Steiner	562078780a	Don't create more cuda blocks than necessary	2015-11-23 11:00:10 -08:00
Benoit Steiner	df31ca3b9e	Made it possible to refer to a GPUDevice from code compiled with a regular C++ compiler	2015-11-23 10:03:53 -08:00


1424 Commits