eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	f2767112c8	Simplify a bit.	2019-01-09 16:29:18 -08:00
Rasmus Munk Larsen	cb955df9a6	Add packet up "pones". Write pnot(a) as pxor(pones(a), a).	2019-01-09 16:17:08 -08:00
Rasmus Larsen	cb3c059fa4	Merged eigen/eigen into default	2019-01-09 15:04:17 -08:00
Gael Guennebaud	d812f411c3	bug #1654 : fix compilation with cuda and no c++11	2019-01-09 18:00:05 +01:00
Gael Guennebaud	3492a1ca74	fix plog(+inf) with AVX512	2019-01-09 16:53:37 +01:00
Gael Guennebaud	47810cf5b7	Add dedicated implementations of predux_any for AVX512, NEON, and Altivec/VSE	2019-01-09 16:40:42 +01:00
Gael Guennebaud	3f14e0d19e	fix warning	2019-01-09 15:45:21 +01:00
Gael Guennebaud	aeec68f77b	Add missing pcmp_lt and others for AVX512	2019-01-09 15:36:41 +01:00
Gael Guennebaud	e6b217b8dd	bug #1652 : implements a much more accurate version of vectorized sin/cos. This new version achieve same speed for SSE/AVX, and is slightly faster with FMA. Guarantees are as follows: - no FMA: 1ULP up to 3pi, 2ULP up to sin(25966) and cos(18838), fallback to std::sin/cos for larger inputs - FMA: 1ULP up to sin(117435.992) and cos(71476.0625), fallback to std::sin/cos for larger inputs	2019-01-09 15:25:17 +01:00
Eugene Zhulenev	e70ffef967	Optimize evalShardedByInnerDim	2019-01-08 16:26:31 -08:00
Rasmus Munk Larsen	055f0b73db	Add support for pcmp_eq and pnot, including for complex types.	2019-01-07 16:53:36 -08:00
Eugene Zhulenev	190d053e41	Explicitly set fill character when printing aligned data to ostream	2019-01-03 14:55:28 -08:00
Mark D Ryan	bc5dd4cafd	PR560: Fix the AVX512f only builds Commit `c53eececb0` introduced AVX512 support for complex numbers but required avx512dq to build. Commit `1d683ae2f5` fixed some but not, it would seem all, of the hard avx512dq dependencies. Build failures are still evident on Eigen and TensorFlow when compiling with just avx512f and no avx512dq using gcc 7.3. Looking at the code there does indeed seem to be a problem. Commit `c53eececb0` calls avx512dq intrinsics directly, e.g, _mm512_extractf32x8_ps and _mm512_and_ps. This commit fixes the issue by replacing the direct intrinsic calls with the various wrapper functions that are safe to use on avx512f only builds.	2019-01-03 14:33:04 +01:00
Gael Guennebaud	697fba3bb0	Fix unit test	2018-12-27 11:20:47 +01:00
Gael Guennebaud	60d3fe9a89	One more stupid AVX 512 fix (I don't have direct access to AVX512 machines)	2018-12-24 13:05:03 +01:00
Gael Guennebaud	4aa667b510	Add EIGEN_STRONG_INLINE where required	2018-12-24 10:45:01 +01:00
Gael Guennebaud	961ff567e8	Add missing pcmp_lt_or_nan for AVX512	2018-12-23 22:13:29 +01:00
Gael Guennebaud	0f6f75bd8a	Implement a faster fix for sin/cos of large entries that also correctly handle INF input.	2018-12-23 17:26:21 +01:00
Gael Guennebaud	38d704def8	Make sure that psin/pcos return number in [-1,1] for large inputs (though sin/cos on large entries is quite useless because it's inaccurate)	2018-12-23 16:13:24 +01:00
Gael Guennebaud	5713fb7feb	Fix plog(+INF): it returned ~87 instead of +INF	2018-12-23 15:40:52 +01:00
Christoph Hertzberg	6dd93f7e3b	Make code compile again for older compilers. See https://stackoverflow.com/questions/7411515/	2018-12-22 13:09:07 +01:00
Gael Guennebaud	efa4c9c40f	bug #1615 : slightly increase the default unrolling limit to compensate for changeset `101ea26f5e` . This solves a performance regression with clang and 3x3 matrix products.	2018-12-13 10:42:39 +01:00
Gael Guennebaud	f20c991679	add changesets related to matrix product perf.	2018-12-13 10:33:29 +01:00
Rasmus Munk Larsen	dd6d65898a	Fix shorten-64-to-32 warning. Use regular memcpy if num_threads==0.	2018-12-12 14:45:31 -08:00
Gael Guennebaud	f582ea3579	Fix compilation with expression template scalar type.	2018-12-12 22:47:00 +01:00
Gael Guennebaud	cfc70dc13f	Add regression test for bug #1174	2018-12-12 18:03:31 +01:00
Gael Guennebaud	2de8da70fd	bug #1557 : fix RealSchur and EigenSolver for matrices with only zeros on the diagonal.	2018-12-12 17:30:08 +01:00
Gael Guennebaud	72c0bbe2bd	Simplify handling of tests that must fail to compile. Each test is now a normal ctest target, and build properties (compiler+flags) are preserved (instead of starting a new build-dir from scratch).	2018-12-12 15:48:36 +01:00
Gael Guennebaud	37c91e1836	bug #1644 : fix warning	2018-12-11 22:07:20 +01:00
Gael Guennebaud	f159cf3d75	Artificially increase l1-blocking size for AVX512. +10% speedup with current kernels. With a 6pX4 kernel (not committed yet), this provides a +20% speedup.	2018-12-11 15:36:27 +01:00
Gael Guennebaud	0a7e7af6fd	Properly set the number of registers for AVX512	2018-12-11 15:33:17 +01:00
Gael Guennebaud	7166496f70	bug #1643 : fix compilation issue with gcc and no optimizaion	2018-12-11 13:24:42 +01:00
Gael Guennebaud	0d90637838	enable spilling workaround on architectures with SSE/AVX	2018-12-10 23:22:44 +01:00
Gael Guennebaud	cf697272e1	Remove debug code.	2018-12-09 23:05:46 +01:00
Gael Guennebaud	450dc97c6b	Various fixes in polynomial solver and its unit tests: - cleanup noise in imaginary part of real roots - take into account the magnitude of the derivative to check roots. - use <= instead of < at appropriate places	2018-12-09 22:54:39 +01:00
Gael Guennebaud	348bb386d1	Enable "old" CMP0026 policy (not perfect, but better than dozens of warning)	2018-12-08 18:59:51 +01:00
Gael Guennebaud	bff90bf270	workaround "may be used uninitialized" warning	2018-12-08 18:58:28 +01:00
Gael Guennebaud	81c27325ae	bug #1641 : fix testing of pandnot and fix pandnot for complex on SSE/AVX/AVX512	2018-12-08 14:27:48 +01:00
Gael Guennebaud	426bce7529	fix EIGEN_GEBP_2PX4_SPILLING_WORKAROUND for non vectorized type, and non x86/64 target	2018-12-08 09:44:21 +01:00
Gael Guennebaud	cd25b538ab	Fix noise in sparse_basic_3 (numerical cancellation)	2018-12-08 00:13:37 +01:00
Gael Guennebaud	efaf03bf96	Fix noise in lu unit test	2018-12-08 00:05:03 +01:00
Gael Guennebaud	956678a4ef	bug #1515 : disable gebp's 3pX4 micro kernel for MSVC<=19.14 because of register spilling.	2018-12-07 18:03:36 +01:00
Gael Guennebaud	7b6d0ff1f6	Enable FMA with MSVC (through /arch:AVX2). To make this possible, I also has to turn the #warning regarding AVX512-FMA to a #error.	2018-12-07 15:14:50 +01:00
Gael Guennebaud	f233c6194d	bug #1637 : workaround register spilling in gebp with clang>=6.0+AVX+FMA	2018-12-07 10:01:09 +01:00
Gael Guennebaud	ae59a7652b	bug #1638 : add a warning if avx512 is enabled without SSE/AVX FMA	2018-12-07 09:23:28 +01:00
Gael Guennebaud	4e7746fe22	bug #1636 : fix gemm performance issue with gcc>=6 and no FMA	2018-12-07 09:15:46 +01:00
Gael Guennebaud	cbf2f4b7a0	AVX512f includes FMA but GCC does not define __FMA__ with -mavx512f only	2018-12-06 18:21:56 +01:00
Gael Guennebaud	1d683ae2f5	Fix compilation with avx512f only, i.e., no AVX512DQ	2018-12-06 18:11:07 +01:00
Gael Guennebaud	aab749b1c3	fix test regarding AVX512 vectorization of complexes.	2018-12-06 16:55:00 +01:00
Gael Guennebaud	c53eececb0	Implement AVX512 vectorization of std::complex<float/double>	2018-12-06 15:58:06 +01:00

1 2 3 4 5 ...

10305 Commits