eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-24 14:45:14 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	21cf4a1a8b	Make is_convertible more robust and conformant to std::is_convertible	2018-07-12 09:57:19 +02:00
Gael Guennebaud	8a5955a052	Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.	2018-07-11 17:16:50 +02:00
Gael Guennebaud	d193cc87f4	Fix regression in `9357838f94`	2018-07-11 17:09:23 +02:00
Gael Guennebaud	fb33687736	Fix double ;;	2018-07-11 17:08:30 +02:00
Gael Guennebaud	f00d08cc0a	Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix.	2018-07-11 14:01:47 +02:00
Gael Guennebaud	1625476091	Add internal::is_identity compile-time helper	2018-07-11 14:00:24 +02:00
Gael Guennebaud	fe723d6129	Fix conversion warning	2018-07-10 09:10:32 +02:00
Gael Guennebaud	9357838f94	bug #1543 : improve linear indexing for general block expressions	2018-07-10 09:10:15 +02:00
Gael Guennebaud	de9e31a06d	Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer products make good use of it. If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.	2018-07-09 15:41:14 +02:00
Gael Guennebaud	ec323b7e66	Skip null numerators in triangular-vector-solve (as in BLAS TRSV).	2018-07-09 11:13:19 +02:00
Gael Guennebaud	359dd77ec3	Fix legitimate "declaration shadows a typedef" warning	2018-07-09 11:03:39 +02:00
Mark D Ryan	90a53ca6fd	Fix the Packet16h version of ptranspose. The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was reordering the PacketBlock argument incorrectly. This led to errors in the multiplication of matrices composed of 16 bit floats on AVX512 machines, if at least one of the matrices was using RowMajor order. This error is responsible for one tensorflow unit test failure on AVX512 machines: //tensorflow/python/kernel_tests:batch_matmul_op_test	2018-06-16 15:13:06 -07:00
Gael Guennebaud	1f54164eca	Fix a few issues with Packet16h	2018-07-07 00:15:07 +02:00
Gael Guennebaud	f2dc048df9	complete implementation of Packet16h (AVX512)	2018-07-06 17:43:11 +02:00
Gael Guennebaud	f4d623ffa7	Complete Packet8h implementation and test it in packetmath unit test	2018-07-06 17:13:36 +02:00
Andrea Bocci	f7124b3e46	Extend CUDA support to matrix inversion and selfadjointeigensolver	2018-06-11 18:33:24 +02:00
Gael Guennebaud	0537123953	bug #1565 : help MSVC to generate not too bad ASM in reductions.	2018-07-05 09:21:26 +02:00
Gael Guennebaud	6a241bd8ee	Implement custom inplace triangular product to avoid a temporary	2018-07-03 14:02:46 +02:00
Gael Guennebaud	3ae2083e23	Make is_same_dense compatible with different scalar types.	2018-07-03 13:21:43 +02:00
Gael Guennebaud	047677a08d	Fix regression in changeset `f05dea6b23` : computeFromHessenberg can take any expression for matrixQ, not only a HouseholderSequence.	2018-07-02 12:18:25 +02:00
Gael Guennebaud	d625564936	Simplify redux_evaluator using inheritance, and properly rename parameters in reducers.	2018-07-02 11:50:41 +02:00
Gael Guennebaud	d428a199ab	bug #1562 : optimize evaluation of small products of the form sAB by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x...	2018-07-02 11:41:09 +02:00
Gael Guennebaud	0cdacf3fa4	update comment	2018-06-29 11:28:36 +02:00
Gael Guennebaud	9a81de1d35	Fix order of EIGEN_DEVICE_FUNC and returned type	2018-06-28 00:20:59 +02:00
Gael Guennebaud	f9d337780d	First step towards a generic vectorised quaternion product	2018-06-25 14:26:51 +02:00
Gael Guennebaud	ee5864f72e	bug #1560 fix product with a 1x1 diagonal matrix	2018-06-25 10:30:12 +02:00
Rasmus Munk Larsen	bda71ad394	Fix typo in pbend for AltiVec.	2018-06-22 15:04:35 -07:00
Benoit Steiner	d3a380af4d	Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403). Derivative of the incomplete Gamma function and the sample of a Gamma random variable. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-06-11 17:57:47 +00:00
Gael Guennebaud	d6813fb1c5	bug #1531 : expose NumDimensions for solve and sparse expressions.	2018-06-08 16:55:10 +02:00
Gael Guennebaud	89d65bb9d6	bug #1531 : expose NumDimensions for compatibility with Tensor	2018-06-08 16:50:17 +02:00
Gael Guennebaud	f05dea6b23	bug #1550 : prevent avoidable memory allocation in RealSchur	2018-06-08 10:14:57 +02:00
Benoit Steiner	522d3ca54d	Don't use std::equal_to inside cuda kernels since it's not supported.	2018-06-07 13:02:07 -07:00
Christoph Hertzberg	7d7bb91537	Missing line during manual rebase of PR-374	2018-06-07 20:30:09 +02:00
Michael Figurnov	30fa3d0454	Merge from eigen/eigen	2018-06-07 17:57:56 +01:00
Gael Guennebaud	af7c83b9a2	Fix warning	2018-06-07 15:45:24 +02:00
Gael Guennebaud	7fe29aceeb	Fix MSVC warning C4290: C++ exception specification ignored except to indicate a function is not __declspec(nothrow)	2018-06-07 15:36:20 +02:00
Christoph Hertzberg	e5f9f4768f	Avoid unnecessary C++11 dependency	2018-06-07 15:03:50 +02:00
Gael Guennebaud	b3fd93207b	Fix typos found using codespell	2018-06-07 14:43:02 +02:00
Michael Figurnov	4bd158fa37	Derivative of the incomplete Gamma function and the sample of a Gamma random variable. In addition to igamma(a, x), this code implements: (1) igamma_der_a(a, x) = d igamma(a, x) / da, the derivative of igamma with respect to the parameter; (2) gamma_sample_der_alpha(alpha, sample), the reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter. The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.	2018-06-06 18:49:26 +01:00
Michael Figurnov	f216854453	Exponentially scaled modified Bessel functions of order zero and one. The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x). The code is ported from Cephes and tested against SciPy.	2018-05-31 15:34:53 +01:00
Gael Guennebaud	647b724a36	Define pcast<> for SSE types even when AVX is enabled. (otherwise floats are silently reinterpreted as int instead of being converted)	2018-05-29 20:46:46 +02:00
Gael Guennebaud	49262dfee6	Fix compilation and SSE support with PGI compiler	2018-05-29 15:09:31 +02:00
Gael Guennebaud	f0862b062f	Fix internal::is_integral<size_t/ptrdiff_t> with MSVC 2013 and older.	2018-05-22 19:29:51 +02:00
Gael Guennebaud	36e413a534	Workaround a MSVC 2013 compilation issue with MatrixBase(Index,int)	2018-05-22 18:51:35 +02:00
Gael Guennebaud	725bd92903	fix stupid typo	2018-05-18 17:46:43 +02:00
Gael Guennebaud	a382bc9364	is_convertible<T,Index> does not seem to work well with MSVC 2013, so let's rather use __is_enum(T) for old MSVC versions	2018-05-18 17:02:27 +02:00
Gael Guennebaud	4dd767f455	add some internal checks	2018-05-18 13:59:55 +02:00
Mark D Ryan	405859f18d	Set EIGEN_IDEAL_MAX_ALIGN_BYTES correctly for AVX512 builds (bug #1548). The macro EIGEN_IDEAL_MAX_ALIGN_BYTES is being incorrectly set to 32 on AVX512 builds. It should be set to 64. In the current code it is only set to 64 if the macro EIGEN_VECTORIZE_AVX512 is defined. This macro does get defined in AVX512 builds in Core, but only after Macros.h, the file that defines EIGEN_IDEAL_MAX_ALIGN_BYTES, has been included. This commit fixes the issue by setting EIGEN_IDEAL_MAX_ALIGN_BYTES to 64 if __AVX512F__ is defined.	2018-05-17 17:04:00 +01:00
Gael Guennebaud	7134fa7a2e	Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the latter being the one that has the wrong prototype).	2018-06-07 09:33:10 +02:00
Robert Lukierski	b2053990d0	Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment specializations. Otherwise causes problems with small fixed size matrix multiplication (call to 0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).	2018-03-14 16:19:43 +00:00
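
Commit f216854453 above states the relation i0(x) = exp(|x|) i0e(x). As a rough illustration only, the sketch below evaluates I0 from its power series and scales it by exp(-|x|). The function names `i0_series` and `i0e_naive` are hypothetical and are not part of Eigen; the commit itself says the actual code is ported from Cephes, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Eigen's API): modified Bessel function I0(x)
// evaluated from its power series
//   I0(x) = sum_{k>=0} (x/2)^(2k) / (k!)^2
// Assumption: moderate |x| only; the unscaled value overflows a double
// once |x| grows past roughly 700.
double i0_series(double x) {
  const double q = 0.25 * x * x;  // (x/2)^2
  double term = 1.0;              // k = 0 term
  double sum = 1.0;
  for (int k = 1; k < 1000; ++k) {
    term *= q / (static_cast<double>(k) * k);  // ratio term_k / term_{k-1}
    sum += term;
    if (term < 1e-17 * sum) break;  // remaining terms are negligible
  }
  return sum;
}

// Hypothetical helper: exponentially scaled form, i0e(x) = exp(-|x|) * I0(x).
// This naive product inherits the overflow limit of i0_series; it only
// demonstrates the identity, not a numerically robust evaluation.
double i0e_naive(double x) {
  return std::exp(-std::fabs(x)) * i0_series(x);
}
```

Multiplying `i0e_naive(x)` by `std::exp(std::fabs(x))` recovers `i0_series(x)` up to rounding, which is the identity quoted in the commit message. A dedicated scaled routine avoids forming the large unscaled value in the first place, which is presumably why the commit describes the scaled version as more numerically stable.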

5561 Commits