eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	e3a15a03a4	Don't explicitly evaluate the subexpression from TensorForcedEval::evalSubExprIfNeeded, as it will be done when executing the EvalTo subexpression	2016-01-24 23:04:50 -08:00
Benoit Steiner	bd207ce11e	Added missing EIGEN_DEVICE_FUNC qualifier	2016-01-24 20:36:05 -08:00
Benoit Steiner	cb4e53ff7f	Merged in ville-k/eigen/tensorflow_fix (pull request PR-153) Add ctor for long	2016-01-22 19:11:31 -08:00
Ville Kallioniemi	9f94e030c1	Re-add executable flags to minimize changeset.	2016-01-22 20:08:45 -07:00
Benoit Steiner	3aeeca32af	Leverage the new blocking code in the tensor contraction code.	2016-01-22 16:36:30 -08:00
Benoit Steiner	4beb447e27	Created a mechanism to enable contraction mappers to determine the best blocking strategy.	2016-01-22 14:37:26 -08:00
Gael Guennebaud	6a44ccb58b	Backout changeset `690bc950f7`	2016-01-22 15:03:53 +01:00
Ville Kallioniemi	9b6c72958a	Update to latest default branch	2016-01-21 23:08:54 -07:00
Benoit Steiner	c33479324c	Fixed a constness bug	2016-01-21 17:08:11 -08:00
Jan Prach	690bc950f7	fix clang warnings "braces around scalar initializer"	2016-01-20 19:35:59 -08:00
Benoit Steiner	7ce932edd3	Small cleanup and small fix to the contraction of row major tensors	2016-01-20 18:12:08 -08:00
Benoit Steiner	47076bf00e	Reduce the register pressure exerted by the tensor mappers whenever possible. This improves the performance of the contraction of a matrix with a vector by about 35%.	2016-01-20 14:51:48 -08:00
Ville Kallioniemi	915e7667cd	Remove executable bit from header files	2016-01-19 21:17:29 -07:00
Ville Kallioniemi	2832175a68	Use explicitly 32 bit integer types in constructors.	2016-01-19 20:12:17 -07:00
Benoit Steiner	df79c00901	Improved the formatting of the code	2016-01-19 17:24:08 -08:00
Benoit Steiner	6d472d8375	Moved the contraction mapping code to its own file to make the code more manageable.	2016-01-19 17:22:05 -08:00
Benoit Steiner	b3b722905f	Improved code indentation	2016-01-19 17:09:47 -08:00
Benoit Steiner	5b7713dd33	Record whether the underlying tensor storage can be accessed directly during the evaluation of an expression.	2016-01-19 17:05:10 -08:00
Ville Kallioniemi	63fb66f53a	Add ctor for long	2016-01-17 21:25:36 -07:00
Benoit Steiner	34057cff23	Fixed a race condition that could affect some reductions on CUDA devices.	2016-01-15 15:11:56 -08:00
Benoit Steiner	0461f0153e	Made it possible to compare tensor dimensions inside a CUDA kernel.	2016-01-15 11:22:16 -08:00
Benoit Steiner	aed4cb1269	Use warp shuffles instead of shared memory access to speed up the inner reduction kernel.	2016-01-14 21:45:14 -08:00
Benoit Steiner	8fe2532e70	Fixed a boundary condition bug in the outer reduction kernel	2016-01-14 09:29:48 -08:00
Benoit Steiner	9f013a9d86	Properly record the rank of reduced tensors in the tensor traits.	2016-01-13 14:24:37 -08:00
Benoit Steiner	79b69b7444	Trigger the optimized matrix vector path more conservatively.	2016-01-12 15:21:09 -08:00
Benoit Steiner	d920d57f38	Improved the performance of the contraction of a 2d tensor with a 1d tensor by a factor of 3 or more. This helps speed up LSTM neural networks.	2016-01-12 11:32:27 -08:00
Benoit Steiner	bd7d901da9	Reverted a previous change that tripped nvcc when compiling in debug mode.	2016-01-11 17:49:44 -08:00
Benoit Steiner	c5e6900400	Silenced a few compilation warnings.	2016-01-11 17:06:39 -08:00
Benoit Steiner	f894736d61	Updated the tensor traits: the alignment is not part of the Flags enum anymore	2016-01-11 16:42:18 -08:00
Benoit Steiner	4f7714d72c	Enabled the use of fixed dimensions from within a cuda kernel.	2016-01-11 16:01:00 -08:00
Benoit Steiner	01c55d37e6	Deleted unused variable.	2016-01-11 15:53:19 -08:00
Benoit Steiner	0504c56ea7	Silenced a nvcc compilation warning	2016-01-11 15:49:21 -08:00
Benoit Steiner	b523771a24	Silenced several compilation warnings triggered by nvcc.	2016-01-11 14:25:43 -08:00
Benoit Steiner	2c3b13eded	Merged in jeremy_barnes/eigen/shader-model-3.0 (pull request PR-152) Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations.	2016-01-11 11:43:37 -08:00
Benoit Steiner	2ccb1c8634	Fixed a bug in the dispatch of optimized reduction kernels.	2016-01-11 10:36:37 -08:00
Benoit Steiner	780623261e	Re-enabled the optimized reduction CUDA code.	2016-01-11 09:07:14 -08:00
Jeremy Barnes	91678f489a	Cleaned up double-defined macro from last commit	2016-01-10 22:44:45 -05:00
Jeremy Barnes	403a7cb6c3	Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations. This allows Tensorflow to work on SM 3.0 (ie, Amazon EC2) machines.	2016-01-10 22:39:13 -05:00
Benoit Steiner	e76904af1b	Simplified the dispatch code.	2016-01-08 16:50:57 -08:00
Benoit Steiner	d726e864ac	Made it possible to use array of size 0 on CUDA devices	2016-01-08 16:38:14 -08:00
Benoit Steiner	3358dfd5dd	Reworked the dispatch of optimized cuda reduction kernels to work around an nvcc bug that prevented the code from compiling in optimized mode in some cases	2016-01-08 16:28:53 -08:00
Benoit Steiner	53749ff415	Prevent nvcc from miscompiling the cuda metakernel. Unfortunately this reintroduces some compilation warnings but it's much better than having to deal with random assertion failures.	2016-01-08 13:53:40 -08:00
Benoit Steiner	6639b7d6e8	Removed a couple of partial specializations that confuse nvcc and result in errors such as this: error: more than one partial specialization matches the template argument list of class "Eigen::internal::get<3, Eigen::internal::numeric_list<std::size_t, 1UL, 1UL, 1UL, 1UL>>" "Eigen::internal::get<n, Eigen::internal::numeric_list<T, a, as...>>" "Eigen::internal::get<n, Eigen::internal::numeric_list<T, as...>>"	2016-01-07 18:45:19 -08:00
Benoit Steiner	0cb2ca5de2	Fixed a typo.	2016-01-06 18:50:28 -08:00
Benoit Steiner	213459d818	Optimized the performance of broadcasting of scalars.	2016-01-06 18:47:45 -08:00
Benoit Steiner	cfff40b1d4	Improved the performance of reductions on CUDA devices	2016-01-04 17:25:00 -08:00
Benoit Steiner	515dee0baf	Added a 'divup' util to compute the ceiling of the quotient of two integers	2016-01-04 16:29:26 -08:00
Gael Guennebaud	8b0d1eb0f7	Fix numerous doxygen shortcomings, and workaround some clang -Wdocumentation warnings	2016-01-01 21:45:06 +01:00
Gael Guennebaud	978c379ed7	Add missing ctor from uint	2015-12-30 12:52:38 +01:00
Eugene Brevdo	f7362772e3	Add digamma for CPU + CUDA. Includes tests.	2015-12-24 21:15:38 -08:00
Benoit Steiner	bdcbc66a5c	Don't attempt to vectorize mean reductions of integers since we can't use SSE or AVX instructions to divide 2 integers.	2015-12-22 17:51:55 -08:00
Benoit Steiner	a1e08fb2a5	Optimized the configuration of the outer reduction cuda kernel	2015-12-22 16:30:10 -08:00
Benoit Steiner	9c7d96697b	Added missing define	2015-12-22 16:11:07 -08:00
Benoit Steiner	e7e6d01810	Made sure the optimized gpu reduction code is actually compiled.	2015-12-22 15:07:33 -08:00
Benoit Steiner	b5d2078c4a	Optimized outer reduction on GPUs.	2015-12-22 15:06:17 -08:00
Benoit Steiner	1c3e78319d	Added missing const	2015-12-21 15:05:01 -08:00
Benoit Steiner	1b82969559	Add alignment requirement for local buffer used by the slicing op.	2015-12-18 14:36:35 -08:00
Benoit Steiner	75a7fa1919	Doubled the speed of full reductions on GPUs.	2015-12-18 14:07:31 -08:00
Benoit Steiner	8dd17cbe80	Fixed a clang compilation warning triggered by the use of arrays of size 0.	2015-12-17 14:00:33 -08:00
Benoit Steiner	4aac55f684	Silenced some compilation warnings triggered by nvcc	2015-12-17 13:39:01 -08:00
Benoit Steiner	40e6250fc3	Made it possible to run tensor chipping operations on CUDA devices	2015-12-17 13:29:08 -08:00
Benoit Steiner	2ca55a3ae4	Fixed some compilation errors triggered by the tensor code with msvc 2008	2015-12-16 20:45:58 -08:00
Gael Guennebaud	35d8725c73	Disable AutoDiffScalar generic copy ctor for non compatible scalar types (fix ambiguous template instantiation)	2015-12-16 10:14:24 +01:00
Benoit Steiner	17352e2792	Made the entire TensorFixedSize api callable from a CUDA kernel.	2015-12-14 15:20:31 -08:00
Benoit Steiner	75e19fc7ca	Marked the tensor constructors as EIGEN_DEVICE_FUNC: This makes it possible to call them from a CUDA kernel.	2015-12-14 15:12:55 -08:00
Gael Guennebaud	ca39b1546e	Merged in ebrevdo/eigen (pull request PR-148) Add special functions to eigen: lgamma, erf, erfc.	2015-12-11 11:52:09 +01:00
Benoit Steiner	6af52a1227	Fixed a typo in the constructor of tensors of rank 5.	2015-12-10 23:31:12 -08:00
Benoit Steiner	8e00ea9a92	Fixed the coefficient accessors use for the 2d and 3d case when compiling without cxx11 support.	2015-12-10 22:45:10 -08:00
Eugene Brevdo	fa4f933c0f	Add special functions to Eigen: lgamma, erf, erfc. Includes CUDA support and unit tests.	2015-12-07 15:24:49 -08:00
Benoit Steiner	7dfe75f445	Fixed compilation warnings	2015-12-07 08:12:30 -08:00
Gael Guennebaud	ad3d68400e	Add matrix-free solver example	2015-12-07 12:33:38 +01:00
Gael Guennebaud	b37036afce	Implement wrapper for matrix-free iterative solvers	2015-12-07 12:23:22 +01:00
Benoit Steiner	f4ca8ad917	Use signed integers instead of unsigned ones more consistently in the codebase.	2015-12-04 18:14:16 -08:00
Benoit Steiner	490d26e4c1	Use integers instead of std::size_t to encode the number of dimensions in the Tensor class since most of the code already uses integers.	2015-12-04 10:15:11 -08:00
Benoit Steiner	d20efc974d	Made it possible to use the sigmoid functor within a CUDA kernel.	2015-12-04 09:38:15 -08:00
Benoit Steiner	029052d276	Deleted redundant code	2015-12-03 17:08:47 -08:00
Gael Guennebaud	fd727249ad	Update ADOL-C support.	2015-11-30 16:00:22 +01:00
Gael Guennebaud	da46b1ed54	bug #1112 : fix compilation on exotic architectures	2015-11-27 15:57:18 +01:00
Mark Borgerding	7ddcf97da7	added scalar_sign_op (both real,complex)	2015-11-24 17:15:07 -05:00
Benoit Steiner	44848ac39b	Fixed a bug in TensorArgMax.h	2015-11-23 15:58:47 -08:00
Benoit Steiner	547a8608e5	Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC. Also updated the code to silence bogus warnings generated by nvcc when compiling this function.	2015-11-23 12:17:45 -08:00
Benoit Steiner	562078780a	Don't create more cuda blocks than necessary	2015-11-23 11:00:10 -08:00
Benoit Steiner	df31ca3b9e	Made it possible to refer to a GPUDevice from code compiled with a regular C++ compiler	2015-11-23 10:03:53 -08:00
Benoit Steiner	1e04059012	Deleted unused variable.	2015-11-23 08:36:54 -08:00
Benoit Steiner	9fa65d3838	Split TensorDeviceType.h in 3 files to make it more manageable	2015-11-20 17:42:50 -08:00
Benoit Steiner	a367804856	Added option to force the usage of the Eigen array class instead of the std::array class.	2015-11-20 12:41:40 -08:00
Benoit Steiner	383d1cc2ed	Added proper support for fast 64bit integer division on CUDA	2015-11-20 11:09:46 -08:00
Benoit Steiner	f37a5f1c53	Fixed compilation error triggered by nvcc	2015-11-19 14:34:26 -08:00
Benoit Steiner	f8df393165	Added support for 128bit integers on CUDA devices.	2015-11-19 13:57:27 -08:00
Benoit Steiner	1dd444ea71	Avoid using the version of TensorIntDiv optimized for 32-bit integers when the divisor can be equal to one since it isn't supported.	2015-11-18 11:37:58 -08:00
Benoit Steiner	f1fbd74db9	Added sanity check	2015-11-13 09:07:27 -08:00
Benoit Steiner	7815b84be4	Fixed a compilation warning	2015-11-12 20:16:59 -08:00
Benoit Steiner	10a91930cc	Fixed a compilation warning triggered by nvcc	2015-11-12 20:10:52 -08:00
Benoit Steiner	ed4b37de02	Fixed a few compilation warnings	2015-11-12 20:08:01 -08:00
Benoit Steiner	b69248fa2a	Added a couple of missing EIGEN_DEVICE_FUNC	2015-11-12 20:01:50 -08:00
Benoit Steiner	0aaa5941df	Silenced some compilation warnings triggered by nvcc	2015-11-12 19:11:43 -08:00
Benoit Steiner	2c73633b28	Fixed a few more typos	2015-11-12 18:39:19 -08:00
Benoit Steiner	be08e82953	Fixed typos	2015-11-12 18:37:40 -08:00
Benoit Steiner	150c12e138	Completed the IndexList rewrite	2015-11-12 18:11:56 -08:00
Benoit Steiner	8037826367	Simplified more of the IndexList code.	2015-11-12 17:19:45 -08:00
Benoit Steiner	e9ecfad796	Started to make the IndexList code compile by more compilers	2015-11-12 16:41:14 -08:00
Benoit Steiner	7a1316fcc5	Fixed compilation error with xcode.	2015-11-12 11:05:54 -08:00
Benoit Steiner	737d237722	Made it possible to run some of the CXXMeta functions on a CUDA device.	2015-11-12 09:02:59 -08:00
Benoit Steiner	1e072424e8	Moved the array code into its own file.	2015-11-12 08:57:04 -08:00
Benoit Steiner	aa5f1ca714	gen_numeric_list takes a size_t, not a int	2015-11-12 08:30:10 -08:00
Benoit Steiner	9fa10fe52d	Don't use std::array when compiling with nvcc since nvidia doesn't support the use of STL containers on GPU.	2015-11-11 15:38:30 -08:00
Benoit Steiner	c587293e48	Fixed a compilation warning	2015-11-11 15:35:12 -08:00
Benoit Steiner	7f1c29fb0c	Make it possible for a vectorized tensor expression to be executed in a CUDA kernel.	2015-11-11 15:22:50 -08:00
Benoit Steiner	99f4778506	Disable SFINAE when compiling with nvcc	2015-11-11 15:04:58 -08:00
Benoit Steiner	5cb18e5b5e	Fixed CUDA compilation errors	2015-11-11 14:36:33 -08:00
Benoit Steiner	228edfe616	Use Eigen::NumTraits instead of std::numeric_limits	2015-11-11 09:26:23 -08:00
Benoit Steiner	20e2ab1121	Fixed another compilation warning	2015-12-07 16:17:57 -08:00
Benoit Steiner	d573efe303	Code cleanup	2015-11-06 14:54:28 -08:00
Benoit Steiner	9fa283339f	Silenced a compilation warning	2015-11-06 11:44:22 -08:00
Benoit Steiner	53432a17b2	Added static assertions to avoid misuses of padding, broadcasting and concatenation ops.	2015-11-06 10:26:19 -08:00
Benoit Steiner	6857a35a11	Fixed typos	2015-11-06 09:42:05 -08:00
Benoit Steiner	33cbdc2d15	Added more missing EIGEN_DEVICE_FUNC	2015-11-06 09:29:59 -08:00
Benoit Steiner	ed1962b464	Reimplement the tensor comparison operators by using the scalar_cmp_op functors. This makes them more cuda friendly.	2015-11-06 09:18:43 -08:00
Benoit Steiner	29038b982d	Added support for modulo operation	2015-11-05 19:39:48 -08:00
Benoit Steiner	fbcf8cc8c1	Pulled latest updates from trunk	2015-11-05 14:30:02 -08:00
Benoit Steiner	c75a19f815	Misc fixes to full reductions	2015-11-05 14:21:20 -08:00
Benoit Steiner	ec5a81b45a	Fixed a bug in the extraction of sizes of fixed sized tensors of rank 0	2015-11-05 13:39:48 -08:00
Gael Guennebaud	9ceaa8e445	bug #1063 : nest AutoDiffScalar by value to avoid dead references	2015-11-05 13:54:26 +01:00
Benoit Steiner	beedd9630d	Updated the reduction code so that full reductions now return a tensor of rank 0.	2015-11-04 13:57:36 -08:00
Benoit Steiner	6a02c2a85d	Fixed a compilation warning	2015-10-29 20:21:29 -07:00
Benoit Steiner	ca12d4c3b3	Pulled latest updates from trunk	2015-10-29 17:57:48 -07:00
Benoit Steiner	ce19e38c1f	Added support for tensor maps of rank 0.	2015-10-29 17:49:04 -07:00
Benoit Steiner	3785c69294	Added support for fixed sized tensors of rank 0	2015-10-29 17:31:03 -07:00
Benoit Steiner	0d7a23d34e	Extended the reduction code so that reducing an empty set returns the neutral element for the operation	2015-10-29 17:29:49 -07:00
Benoit Steiner	1b0685d09a	Added support for rank-0 tensors	2015-10-29 17:27:38 -07:00
Benoit Steiner	c444a0a8c3	Consistently use the same index type in the fft codebase.	2015-10-29 16:39:47 -07:00
Benoit Steiner	09ea3a7acd	Silenced a few more compilation warnings	2015-10-29 16:22:52 -07:00
Benoit Steiner	0974a57910	Silenced compiler warning	2015-10-29 15:00:06 -07:00
Gael Guennebaud	77ff3386b7	Refactoring of the cost model: - Dynamic is now an invalid value - introduce a HugeCost constant to be used for runtime-cost values or arbitrarily huge cost - add sanity checks for cost values: must be >=0 and not too large. This change provides several benefits: - it fixes shortcomings in some cost computations where the Dynamic case was not properly handled. - it simplifies the cost computation logic, and should avoid similar shortcomings in the future. - it makes it possible to distinguish between different levels of dynamic/huge/infinite cost - it should enable further simplifications in the computation of costs (saving compilation time)	2015-10-28 11:42:14 +01:00
Benoit Steiner	1c8312c811	Started to add support for tensors of rank 0	2015-10-26 14:29:26 -07:00
Benoit Steiner	1f4c98abb1	Fixed compilation warning	2015-10-26 12:42:55 -07:00
Benoit Steiner	9dc236bc83	Fixed compilation warning	2015-10-26 12:41:48 -07:00
Benoit Steiner	9f721384e0	Added support for empty dimensions	2015-10-26 11:21:27 -07:00
Benoit Steiner	a3e144727c	Fixed compilation warning	2015-10-26 10:48:11 -07:00
Gael Guennebaud	a5324a131f	bug #1092 : fix iterative solver ctors for expressions as input	2015-10-26 16:16:24 +01:00
Gael Guennebaud	af2e25d482	Merged in infinitei/eigen (pull request PR-140) bug #1097 Added ArpackSupport to cmake install target	2015-10-26 15:31:39 +01:00
Abhijit Kundu	0ed41bdefa	ArpackSupport was missing here also.	2015-10-16 18:21:02 -07:00
Abhijit Kundu	1127ca8586	Added ArpackSupport to cmake install target	2015-10-16 16:41:33 -07:00
Benoit Steiner	de1e9f29f4	Updated the custom indexing code: we can now use any container that provides the [] operator to index a tensor. Added unit tests to validate the use of std::map and a few more types as valid custom index containers	2015-10-15 14:58:49 -07:00
Benoit Steiner	6585efc553	Tightened the definition of isOfNormalIndex to take into account integer types in addition to arrays of indices. Only compile the custom index code when EIGEN_HAS_SFINAE is defined. For the time being, EIGEN_HAS_SFINAE is a synonym for EIGEN_HAS_VARIADIC_TEMPLATES, but this might evolve in the future. Moved some code around.	2015-10-14 09:31:37 -07:00
Gabriel Nützi	fc7478c04d	name changes 2 user: Gabriel Nützi <gnuetzi@gmx.ch> branch 'default' changed unsupported/Eigen/CXX11/src/Tensor/Tensor.h changed unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h	2015-10-09 19:10:08 +02:00
Gabriel Nützi	7b34834f64	name changes user: Gabriel Nützi <gnuetzi@gmx.ch> branch 'default' changed unsupported/Eigen/CXX11/src/Tensor/Tensor.h	2015-10-09 19:08:14 +02:00
Gabriel Nützi	6edae2d30d	added CustomIndex capability only to Tensor and not yet to TensorBase. using Sfinae and is_base_of to select correct template which converts to array<Index,NumIndices> user: Gabriel Nützi <gnuetzi@gmx.ch> branch 'default' added unsupported/Eigen/CXX11/src/Tensor/TensorMetaMacros.h added unsupported/test/cxx11_tensor_customIndex.cpp changed unsupported/Eigen/CXX11/Tensor changed unsupported/Eigen/CXX11/src/Tensor/Tensor.h changed unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h changed unsupported/test/CMakeLists.txt	2015-10-09 18:52:48 +02:00
Gael Guennebaud	186ec1437c	Cleanup EIGEN_SPARSE_PUBLIC_INTERFACE, it is now a simple alias to EIGEN_GENERIC_PUBLIC_INTERFACE	2015-10-08 22:06:49 +02:00
Gael Guennebaud	1b148d9e2e	Move IncompleteCholesky to official modules	2015-10-08 11:32:46 +02:00

1296 Commits