Benoit Steiner
ba32ded021
Fixed include path
2016-04-29 15:11:09 -07:00
Gael Guennebaud
0f3c4c8ff4
Fix compilation of sparse.cast<>().transpose().
2016-04-29 18:26:08 +02:00
Benoit Steiner
a524a26fdc
Fixed a few memory leaks
2016-04-28 18:55:53 -07:00
Benoit Steiner
dacb23277e
Fixed the igamma and igammac implementations to make them callable from a gpu kernel.
2016-04-28 18:54:54 -07:00
Benoit Steiner
a5d4545083
Deleted unused variable
2016-04-28 14:14:48 -07:00
Justin Lebar
40d1e2f8c7
Eliminate mutual recursion in igamma{,c}_impl::Run.
...
Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls igammac_impl::Run.
This isn't actually mutual recursion; the calls are guarded such that we never get into a loop. Nonetheless, it's a stretch for clang to prove this. As a result, clang emits a recursive call in both igammac_impl::Run and igamma_impl::Run.
That this is suboptimal code is bad enough, but it's particularly bad when compiling for CUDA/nvptx. nvptx allows recursion, but only begrudgingly: if you have recursive calls in a kernel, it's on you to manually specify the kernel's stack size. Otherwise, ptxas will dump a warning, make a guess, and who knows if it's right.
This change explicitly eliminates the mutual recursion in igammac_impl::Run and igamma_impl::Run.
2016-04-28 13:57:08 -07:00
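For context, a minimal sketch of the refactoring pattern described in the commit above, not Eigen's actual implementation: both entry points dispatch directly to shared, non-recursive kernels, so neither ever calls the other and no apparent recursion is left for the compiler to worry about. The function names are hypothetical, and the kernels are the textbook power series / continued fraction for the regularized incomplete gamma functions, used here only for illustration.

```cpp
#include <cmath>
#include <cstdio>

// Shared, non-recursive kernel #1: lower regularized incomplete gamma P(a, x)
// via its power series (converges well for x < a + 1).
static double igamma_series(double a, double x) {
  double ap = a, sum = 1.0 / a, del = sum;
  for (int n = 0; n < 300; ++n) {
    ap += 1.0;
    del *= x / ap;
    sum += del;
    if (std::fabs(del) < std::fabs(sum) * 1e-15) break;
  }
  return sum * std::exp(-x + a * std::log(x) - std::lgamma(a));
}

// Shared, non-recursive kernel #2: upper regularized incomplete gamma Q(a, x)
// via a continued fraction, evaluated with the modified Lentz method
// (converges well for x >= a + 1).
static double igammac_cf(double a, double x) {
  const double tiny = 1e-30;
  double b = x + 1.0 - a, c = 1.0 / tiny, d = 1.0 / b, h = d;
  for (int i = 1; i <= 300; ++i) {
    const double an = -i * (i - a);
    b += 2.0;
    d = an * d + b; if (std::fabs(d) < tiny) d = tiny;
    c = b + an / c; if (std::fabs(c) < tiny) c = tiny;
    d = 1.0 / d;
    const double del = d * c;
    h *= del;
    if (std::fabs(del - 1.0) < 1e-15) break;
  }
  return h * std::exp(-x + a * std::log(x) - std::lgamma(a));
}

// Both entry points pick a kernel directly; neither calls the other, so there
// is no (apparent) mutual recursion left for the compiler to prove away.
double igamma_run(double a, double x) {   // P(a, x)
  return (x < a + 1.0) ? igamma_series(a, x) : 1.0 - igammac_cf(a, x);
}

double igammac_run(double a, double x) {  // Q(a, x)
  return (x < a + 1.0) ? 1.0 - igamma_series(a, x) : igammac_cf(a, x);
}

int main() {
  // Spot check against the closed form P(1, x) = 1 - exp(-x).
  std::printf("P(1,2) = %f (expected %f)\n",
              igamma_run(1.0, 2.0), 1.0 - std::exp(-2.0));
  return 0;
}
```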
Benoit Steiner
3ec81fc00f
Fixed compilation error with clang.
2016-04-27 19:32:12 -07:00
Benoit Steiner
2b917291d9
Merged in rmlarsen/eigen2 (pull request PR-183)
...
Detect cxx_constexpr support when compiling with clang.
2016-04-27 15:19:54 -07:00
Rasmus Munk Larsen
09b9e951e3
Depend on the more extensive support for constexpr in clang:
...
http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr
2016-04-27 14:59:11 -07:00
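For illustration, a minimal sketch of the detection idiom the two commits above refer to, assuming the goal is to gate code on C++1y/C++14 relaxed constexpr: clang exposes __has_feature(cxx_relaxed_constexpr) (documented on the linked LanguageExtensions page), so support can be tested directly instead of being inferred from the compiler version. The macro name HAS_RELAXED_CONSTEXPR is hypothetical, not Eigen's actual configuration macro.

```cpp
// Portability shim: __has_feature is a clang extension, so define it away
// on compilers that do not provide it.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Prefer the clang feature test; otherwise fall back to the language version.
#if (defined(__clang__) && __has_feature(cxx_relaxed_constexpr)) || \
    (defined(__cplusplus) && __cplusplus >= 201402L)
#define HAS_RELAXED_CONSTEXPR 1
#else
#define HAS_RELAXED_CONSTEXPR 0
#endif
```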
Rasmus Munk Larsen
1a325ef71c
Detect cxx_constexpr support when compiling with clang.
2016-04-27 14:33:51 -07:00
Benoit Steiner
1a97fd8b4e
Merged latest update from trunk
2016-04-27 14:22:45 -07:00
Benoit Steiner
c61170e87d
fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen.
2016-04-27 14:22:20 -07:00
Gael Guennebaud
318e65e0ae
Fix missing inclusion of Eigen/Core
2016-04-27 23:05:40 +02:00
Benoit Steiner
f629fe95c8
Made the index type a template parameter to evaluateProductBlockingSizes
...
Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes.
2016-04-27 13:11:19 -07:00
Benoit Steiner
66b215b742
Merged latest updates from trunk
2016-04-27 12:57:48 -07:00
Benoit Steiner
25141b69d4
Improved support for min and max on 16-bit floats when running on recent CUDA GPUs
2016-04-27 12:57:21 -07:00
Rasmus Larsen
ff33798acd
Merged eigen/eigen into default
2016-04-27 12:27:00 -07:00
Rasmus Munk Larsen
463738ccbe
Use computeProductBlockingSizes to compute blocking for both ShardByCol and ShardByRow cases.
2016-04-27 12:26:18 -07:00
Benoit Steiner
6744d776ba
Added support for fpclassify in Eigen::numext
2016-04-27 12:10:25 -07:00
Rasmus Munk Larsen
1f48f47ab7
Implement stricter argument checking for SYRK and SYR2K and real matrices. To implement the BLAS API, they should return info=2 if op='C' is passed for a complex matrix. Without this change, the Eigen BLAS fails the strict zblat3 and cblat3 tests in LAPACK 3.5.
2016-04-27 19:59:44 +02:00
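For context, a hedged sketch of the kind of check the reference-BLAS test drivers (cblat3/zblat3) exercise for the complex SYRK entry point; the function name and free-standing form are illustrative, not Eigen's actual BLAS layer. The info values follow the argument positions of the reference zsyrk.f.

```cpp
#include <cctype>

// Returns 0 if the arguments are valid, otherwise the 1-based position of the
// first invalid argument (the BLAS "info" convention). For the complex SYRK
// routines only TRANS = 'N' or 'T' is allowed; 'C' must be rejected with info = 2.
int zsyrk_check_args(char uplo, char trans, int n, int k, int lda, int ldc) {
  uplo  = static_cast<char>(std::toupper(static_cast<unsigned char>(uplo)));
  trans = static_cast<char>(std::toupper(static_cast<unsigned char>(trans)));
  const int nrow_a = (trans == 'N') ? n : k;      // rows of A actually referenced
  if (uplo != 'U' && uplo != 'L')      return 1;
  if (trans != 'N' && trans != 'T')    return 2;  // catches op = 'C'
  if (n < 0)                           return 3;
  if (k < 0)                           return 4;
  if (lda < (nrow_a > 1 ? nrow_a : 1)) return 7;
  if (ldc < (n > 1 ? n : 1))           return 10;
  return 0;
}
```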
Gael Guennebaud
3dddd34133
Refactor the unsupported CXX11/Core module to internal headers only.
2016-04-26 11:20:25 +02:00
Benoit Steiner
4a164d2c46
Fixed the partial evaluation of non vectorizable tensor subexpressions
2016-04-25 10:43:03 -07:00
Benoit Steiner
fd9401f260
Refined the cost of the striding operation.
2016-04-25 09:16:08 -07:00
Benoit Steiner
5c372d19e3
Merged in rmlarsen/eigen (pull request PR-179)
...
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
2016-04-21 18:06:36 -07:00
Benoit Steiner
4bbc97be5e
Provide access to the base threadpool classes
2016-04-21 17:59:33 -07:00
Rasmus Munk Larsen
a3256d78d8
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
2016-04-21 16:49:28 -07:00
Benoit Steiner
33adce5c3a
Added the ability to switch to the new thread pool with a #define
2016-04-21 11:59:58 -07:00
Benoit Steiner
79b900375f
Use index list for the striding benchmarks
2016-04-21 11:58:27 -07:00
Benoit Steiner
f670613e4b
Fixed several compilation warnings
2016-04-21 11:03:02 -07:00
Benoit Steiner
6015422ee6
Added an option to enable the use of the F16C instruction set
2016-04-21 10:30:29 -07:00
Benoit Steiner
32ffce04fc
Use EIGEN_THREAD_YIELD instead of std::this_thread::yield to make the code more portable.
2016-04-21 08:47:28 -07:00
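As an illustration, a minimal sketch of a portable yield macro in the spirit of EIGEN_THREAD_YIELD; the macro name and the exact per-platform fallbacks are assumptions for this sketch, not Eigen's actual definition.

```cpp
// Pick a yield primitive per platform so the thread-pool code never has to
// call std::this_thread::yield directly.
#if defined(_WIN32)
  #include <windows.h>
  #define PORTABLE_THREAD_YIELD() SwitchToThread()
#elif defined(__unix__) || defined(__APPLE__)
  #include <sched.h>
  #define PORTABLE_THREAD_YIELD() sched_yield()
#else
  #include <thread>
  #define PORTABLE_THREAD_YIELD() std::this_thread::yield()
#endif
```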
Benoit Steiner
2dde1b1028
Don't crash when attempting to reduce empty tensors.
2016-04-20 18:08:20 -07:00
Benoit Steiner
a792cd357d
Added more tests
2016-04-20 17:33:58 -07:00
Benoit Steiner
80200a1828
Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang, since it's unclear which versions of clang actually support these instructions.
2016-04-20 12:10:27 -07:00
Benoit Steiner
c7c2054bb5
Started to implement a portable way to yield.
2016-04-19 17:59:58 -07:00
Benoit Steiner
1d0238375d
Made sure all the required header files are included when trying to use fp16
2016-04-19 17:44:12 -07:00
Benoit Steiner
2b72163028
Implemented a more portable version of thread local variables
2016-04-19 15:56:02 -07:00
Benoit Steiner
04f954956d
Fixed a few typos
2016-04-19 15:27:09 -07:00
Benoit Steiner
5b1106c56b
Fixed a compilation error with nvcc 7.
2016-04-19 14:57:57 -07:00
Benoit Steiner
7129d998db
Simplified the code that launches cuda kernels.
2016-04-19 14:55:21 -07:00
Benoit Steiner
b9ea40c30d
Don't take the address of a kernel on CUDA devices that don't support this feature.
2016-04-19 14:35:11 -07:00
Benoit Steiner
884c075058
Use numext::ceil instead of std::ceil
2016-04-19 14:33:30 -07:00
Benoit Steiner
a278414d1b
Avoid an unnecessary copy of the evaluator.
2016-04-19 13:54:28 -07:00
Benoit Steiner
f953c60705
Fixed 2 recent regression tests
2016-04-19 12:57:39 -07:00
Benoit Steiner
50968a0a3e
Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors.
2016-04-19 11:53:58 -07:00
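To illustrate the overflow the commit above guards against (the reducer below is a simplified, hypothetical stand-in, not Eigen's MeanReducer): a narrow counter wraps around once a tensor has more than about 2^31 coefficients, so a wide signed index type such as DenseIndex (typically std::ptrdiff_t) is used for the element count.

```cpp
#include <cstddef>

// Simplified mean reducer: the element counter uses a wide signed index type
// (ptrdiff_t, 64-bit on typical platforms) so it does not overflow when
// reducing tensors with more than ~2 billion coefficients.
struct MeanReducerSketch {
  double sum = 0.0;
  std::ptrdiff_t count = 0;  // a 32-bit counter would overflow here

  void accumulate(double v) { sum += v; ++count; }
  double finalize() const {
    return count > 0 ? sum / static_cast<double>(count) : 0.0;
  }
};
```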
Benoit Steiner
84543c8be2
Worked around the lack of a rand_r function on Windows systems
2016-04-17 19:29:27 -07:00
Benoit Steiner
5fbcfe5eb4
Worked around the lack of a rand_r function on Windows systems
2016-04-17 18:42:31 -07:00
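For illustration, one hedged way such a workaround can look (the helper name is hypothetical and this is not necessarily the approach taken in the commits above): keep the PRNG state in a caller-owned argument, exactly as rand_r does, so no platform-specific reentrant API is needed.

```cpp
// Minimal reentrant PRNG with the same calling convention as POSIX rand_r:
// the caller owns the state, so this works on platforms without rand_r.
inline unsigned portable_rand_r(unsigned* state) {
  *state = *state * 1103515245u + 12345u;   // classic LCG step
  return (*state >> 16) & 0x7fffu;          // 15-bit result, like RAND_MAX == 32767
}
```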
Gael Guennebaud
e4fe611e2c
Enable lazy-coeff-based-product for vector*(1x1) products
2016-04-16 15:17:39 +02:00
Benoit Steiner
c8e8f93d6c
Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators.
2016-04-15 16:48:10 -07:00
Benoit Steiner
1a16fb1532
Deleted extraneous comma.
2016-04-15 15:50:13 -07:00