eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	dc413dbe8a	Merged in ville-k/eigen/explicit_long_constructors (pull request PR-158) Add constructor for long types.	2016-02-02 20:58:06 -08:00
Ville Kallioniemi	783018d8f6	Use EIGEN_STATIC_ASSERT for backward compatibility.	2016-02-02 16:45:12 -07:00
Benoit Steiner	99cde88341	Don't try to use direct offsets when computing a tensor product, since the required stride isn't available.	2016-02-02 11:06:53 -08:00
Ville Kallioniemi	aedea349aa	Replace separate low word constructors with a single templated constructor.	2016-02-01 20:25:02 -07:00
Ville Kallioniemi	f0fdefa96f	Rebase to latest.	2016-02-01 19:32:31 -07:00
Benoit Steiner	64ce78c2ec	Cleaned up a tensor contraction test	2016-02-01 13:57:41 -08:00
Benoit Steiner	0ce5d32be5	Sharded the cxx11_tensor_contract_cuda test	2016-02-01 13:33:23 -08:00
Benoit Steiner	922b5f527b	Silenced a few compilation warnings	2016-02-01 13:30:49 -08:00
Benoit Steiner	6b5dff875e	Made it possible to limit the number of blocks that will be used to evaluate a tensor expression on a CUDA device. This makes it possible to set aside streaming multiprocessors for other computations.	2016-02-01 12:46:32 -08:00
Benoit Steiner	264f8141f8	Sharded the tensor reduction test	2016-02-01 07:44:31 -08:00
Benoit Steiner	11bb71c8fc	Sharded the tensor device test	2016-02-01 07:34:59 -08:00
Benoit Steiner	e80ed948e1	Fixed a number of compilation warnings generated by the cuda tests	2016-01-31 20:09:41 -08:00
Benoit Steiner	6720b38fbf	Fixed a few compilation warnings	2016-01-31 16:48:50 -08:00
Benoit Steiner	4a2ddfb81d	Sharded the CUDA argmax tensor test	2016-01-31 10:44:15 -08:00
Benoit Steiner	483082ef6e	Fixed a few memory leaks in the cuda tests	2016-01-30 11:59:22 -08:00
Benoit Steiner	bd21aba181	Sharded the cxx11_tensor_cuda test and fixed a memory leak	2016-01-30 11:47:09 -08:00
Benoit Steiner	9de155d153	Added a test to cover threaded tensor shuffling	2016-01-30 10:56:47 -08:00
Benoit Steiner	32088c06a1	Made the comparison between single and multithreaded contraction results more resistant to numerical noise to prevent spurious test failures.	2016-01-30 10:51:14 -08:00
Benoit Steiner	2053478c56	Made sure to use a tensor of rank 0 to store the result of a full reduction in the tensor thread pool test	2016-01-30 10:46:36 -08:00
Benoit Steiner	d0db95f730	Sharded the tensor thread pool test	2016-01-30 10:43:57 -08:00
Benoit Steiner	ba27c8a7de	Made the CUDA contract test more robust to numerical noise.	2016-01-30 10:28:43 -08:00
Benoit Steiner	963f2d2a8f	Marked several methods EIGEN_DEVICE_FUNC	2016-01-28 23:37:48 -08:00
Benoit Steiner	c5d25bf1d0	Fixed a couple of compilation warnings.	2016-01-28 23:15:45 -08:00
Benoit Steiner	7b3044d086	Made sure to call nvcc with the relaxed-constexpr flag.	2016-01-28 15:36:34 -08:00
Gael Guennebaud	ddf64babde	merge	2016-01-28 13:21:48 +01:00
Gael Guennebaud	7802a6bb1c	Fix unit test filename.	2016-01-28 09:35:37 +01:00
Benoit Steiner	4bf9eaf77a	Deleted an invalid assertion that prevented the assignment of empty tensors.	2016-01-27 17:09:30 -08:00
Benoit Steiner	291069e885	Fixed some compilation problems with nvcc + clang	2016-01-27 15:37:03 -08:00
Benoit Steiner	47ca9dc809	Fixed the tensor_cuda test	2016-01-27 14:58:48 -08:00
Benoit Steiner	55a5204319	Fixed the flags passed to nvcc to compile the tensor code.	2016-01-27 14:46:34 -08:00
Benoit Steiner	9dfbd4fe8d	Made the cuda tests compile using make check	2016-01-27 12:22:17 -08:00
Benoit Steiner	5973bcf939	Properly specify the namespace when calling cout/endl	2016-01-27 12:04:42 -08:00
Gael Guennebaud	9c8f7dfe94	bug #1156 : fix several function declarations whose arguments were passed by value instead of being passed by reference	2016-01-27 18:34:42 +01:00
Ville Kallioniemi	02db1228ed	Add constructor for long types.	2016-01-26 23:41:01 -07:00
Hauke Heibel	5eb2790be0	Fixed minor typo in SplineFitting.	2016-01-25 22:17:52 +01:00
Benoit Steiner	e3a15a03a4	Don't explicitly evaluate the subexpression from TensorForcedEval::evalSubExprIfNeeded, as it will be done when executing the EvalTo subexpression	2016-01-24 23:04:50 -08:00
Benoit Steiner	bd207ce11e	Added missing EIGEN_DEVICE_FUNC qualifier	2016-01-24 20:36:05 -08:00
Benoit Steiner	cb4e53ff7f	Merged in ville-k/eigen/tensorflow_fix (pull request PR-153) Add ctor for long	2016-01-22 19:11:31 -08:00
Ville Kallioniemi	9f94e030c1	Re-add executable flags to minimize changeset.	2016-01-22 20:08:45 -07:00
Benoit Steiner	3aeeca32af	Leverage the new blocking code in the tensor contraction code.	2016-01-22 16:36:30 -08:00
Benoit Steiner	4beb447e27	Created a mechanism to enable contraction mappers to determine the best blocking strategy.	2016-01-22 14:37:26 -08:00
Gael Guennebaud	6a44ccb58b	Backout changeset `690bc950f7`	2016-01-22 15:03:53 +01:00
Ville Kallioniemi	9b6c72958a	Update to latest default branch	2016-01-21 23:08:54 -07:00
Benoit Steiner	c33479324c	Fixed a constness bug	2016-01-21 17:08:11 -08:00
Jan Prach	690bc950f7	fix clang warnings "braces around scalar initializer"	2016-01-20 19:35:59 -08:00
Benoit Steiner	7ce932edd3	Small cleanup and small fix to the contraction of row major tensors	2016-01-20 18:12:08 -08:00
Benoit Steiner	47076bf00e	Reduce the register pressure exerted by the tensor mappers whenever possible. This improves the performance of the contraction of a matrix with a vector by about 35%.	2016-01-20 14:51:48 -08:00
Ville Kallioniemi	915e7667cd	Remove executable bit from header files	2016-01-19 21:17:29 -07:00
Ville Kallioniemi	2832175a68	Use explicitly 32 bit integer types in constructors.	2016-01-19 20:12:17 -07:00
Benoit Steiner	df79c00901	Improved the formatting of the code	2016-01-19 17:24:08 -08:00
Benoit Steiner	6d472d8375	Moved the contraction mapping code to its own file to make the code more manageable.	2016-01-19 17:22:05 -08:00
Benoit Steiner	b3b722905f	Improved code indentation	2016-01-19 17:09:47 -08:00
Benoit Steiner	5b7713dd33	Record whether the underlying tensor storage can be accessed directly during the evaluation of an expression.	2016-01-19 17:05:10 -08:00
Ville Kallioniemi	63fb66f53a	Add ctor for long	2016-01-17 21:25:36 -07:00
Benoit Steiner	34057cff23	Fixed a race condition that could affect some reductions on CUDA devices.	2016-01-15 15:11:56 -08:00
Benoit Steiner	0461f0153e	Made it possible to compare tensor dimensions inside a CUDA kernel.	2016-01-15 11:22:16 -08:00
Benoit Steiner	aed4cb1269	Use warp shuffles instead of shared memory access to speed up the inner reduction kernel.	2016-01-14 21:45:14 -08:00
Benoit Steiner	8fe2532e70	Fixed a boundary condition bug in the outer reduction kernel	2016-01-14 09:29:48 -08:00
Benoit Steiner	9f013a9d86	Properly record the rank of reduced tensors in the tensor traits.	2016-01-13 14:24:37 -08:00
Benoit Steiner	79b69b7444	Trigger the optimized matrix vector path more conservatively.	2016-01-12 15:21:09 -08:00
Benoit Steiner	d920d57f38	Improved the performance of the contraction of a 2d tensor with a 1d tensor by a factor of 3 or more. This helps speed up LSTM neural networks.	2016-01-12 11:32:27 -08:00
Benoit Steiner	bd7d901da9	Reverted a previous change that tripped nvcc when compiling in debug mode.	2016-01-11 17:49:44 -08:00
Benoit Steiner	c5e6900400	Silenced a few compilation warnings.	2016-01-11 17:06:39 -08:00
Benoit Steiner	f894736d61	Updated the tensor traits: the alignment is not part of the Flags enum anymore	2016-01-11 16:42:18 -08:00
Benoit Steiner	4f7714d72c	Enabled the use of fixed dimensions from within a cuda kernel.	2016-01-11 16:01:00 -08:00
Benoit Steiner	01c55d37e6	Deleted unused variable.	2016-01-11 15:53:19 -08:00
Benoit Steiner	0504c56ea7	Silenced a nvcc compilation warning	2016-01-11 15:49:21 -08:00
Benoit Steiner	b523771a24	Silenced several compilation warnings triggered by nvcc.	2016-01-11 14:25:43 -08:00
Benoit Steiner	2c3b13eded	Merged in jeremy_barnes/eigen/shader-model-3.0 (pull request PR-152) Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations.	2016-01-11 11:43:37 -08:00
Benoit Steiner	2ccb1c8634	Fixed a bug in the dispatch of optimized reduction kernels.	2016-01-11 10:36:37 -08:00
Benoit Steiner	780623261e	Re-enabled the optimized reduction CUDA code.	2016-01-11 09:07:14 -08:00
Jeremy Barnes	91678f489a	Cleaned up double-defined macro from last commit	2016-01-10 22:44:45 -05:00
Jeremy Barnes	403a7cb6c3	Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations. This allows Tensorflow to work on SM 3.0 (ie, Amazon EC2) machines.	2016-01-10 22:39:13 -05:00
Benoit Steiner	e76904af1b	Simplified the dispatch code.	2016-01-08 16:50:57 -08:00
Benoit Steiner	d726e864ac	Made it possible to use array of size 0 on CUDA devices	2016-01-08 16:38:14 -08:00
Benoit Steiner	3358dfd5dd	Reworked the dispatch of optimized cuda reduction kernels to work around an nvcc bug that prevented the code from compiling in optimized mode in some cases	2016-01-08 16:28:53 -08:00
Benoit Steiner	53749ff415	Prevent nvcc from miscompiling the cuda metakernel. Unfortunately this reintroduces some compilation warnings but it's much better than having to deal with random assertion failures.	2016-01-08 13:53:40 -08:00
Benoit Steiner	6639b7d6e8	Removed a couple of partial specialization that confuse nvcc and result in errors such as this: error: more than one partial specialization matches the template argument list of class "Eigen::internal::get<3, Eigen::internal::numeric_list<std::size_t, 1UL, 1UL, 1UL, 1UL>>" "Eigen::internal::get<n, Eigen::internal::numeric_list<T, a, as...>>" "Eigen::internal::get<n, Eigen::internal::numeric_list<T, as...>>"	2016-01-07 18:45:19 -08:00
Benoit Steiner	0cb2ca5de2	Fixed a typo.	2016-01-06 18:50:28 -08:00
Benoit Steiner	213459d818	Optimized the performance of broadcasting of scalars.	2016-01-06 18:47:45 -08:00
Benoit Steiner	cfff40b1d4	Improved the performance of reductions on CUDA devices	2016-01-04 17:25:00 -08:00
Benoit Steiner	515dee0baf	Added a 'divup' util to compute the ceiling of the quotient of two integers	2016-01-04 16:29:26 -08:00
Gael Guennebaud	8b0d1eb0f7	Fix numerous doxygen shortcomings, and workaround some clang -Wdocumentation warnings	2016-01-01 21:45:06 +01:00
Gael Guennebaud	978c379ed7	Add missing ctor from uint	2015-12-30 12:52:38 +01:00
Eugene Brevdo	f7362772e3	Add digamma for CPU + CUDA. Includes tests.	2015-12-24 21:15:38 -08:00
Benoit Steiner	bdcbc66a5c	Don't attempt to vectorize mean reductions of integers since we can't use SSE or AVX instructions to divide 2 integers.	2015-12-22 17:51:55 -08:00
Benoit Steiner	a1e08fb2a5	Optimized the configuration of the outer reduction cuda kernel	2015-12-22 16:30:10 -08:00
Benoit Steiner	9c7d96697b	Added missing define	2015-12-22 16:11:07 -08:00
Benoit Steiner	e7e6d01810	Made sure the optimized gpu reduction code is actually compiled.	2015-12-22 15:07:33 -08:00
Benoit Steiner	b5d2078c4a	Optimized outer reduction on GPUs.	2015-12-22 15:06:17 -08:00
Benoit Steiner	1c3e78319d	Added missing const	2015-12-21 15:05:01 -08:00
Benoit Steiner	1b82969559	Add alignment requirement for local buffer used by the slicing op.	2015-12-18 14:36:35 -08:00
Benoit Steiner	75a7fa1919	Doubled the speed of full reductions on GPUs.	2015-12-18 14:07:31 -08:00
Benoit Steiner	8dd17cbe80	Fixed a clang compilation warning triggered by the use of arrays of size 0.	2015-12-17 14:00:33 -08:00
Benoit Steiner	4aac55f684	Silenced some compilation warnings triggered by nvcc	2015-12-17 13:39:01 -08:00
Benoit Steiner	40e6250fc3	Made it possible to run tensor chipping operations on CUDA devices	2015-12-17 13:29:08 -08:00
Benoit Steiner	2ca55a3ae4	Fixed some compilation error triggered by the tensor code with msvc 2008	2015-12-16 20:45:58 -08:00
Gael Guennebaud	35d8725c73	Disable AutoDiffScalar generic copy ctor for non compatible scalar types (fix ambiguous template instantiation)	2015-12-16 10:14:24 +01:00
Christoph Hertzberg	92655e7215	bug #1136 : Protect isinf for Intel compilers. Also don't distinguish GCC from ICC and don't rely on EIGEN_NOT_A_MACRO, which might not be defined when including this.	2015-12-15 11:34:52 +01:00
Benoit Steiner	17352e2792	Made the entire TensorFixedSize api callable from a CUDA kernel.	2015-12-14 15:20:31 -08:00


1547 Commits