eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	47d8b741b2	#elif -> #else to fix GPU build.	2018-12-05 13:19:31 -08:00
Rasmus Munk Larsen	8a02883d58	Merged in markdryan/eigen/avx512-contraction-2 (pull request PR-554) Fix tensor contraction on AVX512 builds Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>	2018-12-05 18:19:32 +00:00
Gael Guennebaud	acc3459a49	Add help messages in the quick ref/ascii docs regarding slicing, indexing, and reshaping.	2018-12-05 17:17:23 +01:00
Gael Guennebaud	e2e897298a	Fix page nesting	2018-12-05 17:13:46 +01:00
Christoph Hertzberg	c1d356e8b4	bug #1635 : Use infinity from NumTraits instead of creating it manually.	2018-12-05 15:01:04 +01:00
Mark D Ryan	36f8f6d0be	Fix evalShardedByInnerDim for AVX512 builds. evalShardedByInnerDim ensures that the values it passes for start_k and end_k to evalGemmPartialWithoutOutputKernel are multiples of 8, as the kernel does not work correctly when the values of k are not multiples of the packet_size. While this precaution works for AVX builds, it is insufficient for AVX512 builds, where the maximum packet size is 16. The result is slightly incorrect float32 contractions on AVX512 builds. This commit fixes the problem by ensuring that k is always a multiple of the packet_size if the packet_size is > 8.	2018-12-05 12:29:03 +01:00
Rasmus Munk Larsen	b57b31cce9	Merged in ezhulenev/eigen-01 (pull request PR-553) Do not disable alignment with EIGEN_GPUCC Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>	2018-12-04 23:47:19 +00:00
Eugene Zhulenev	0bb15bb6d6	Update checks in ConfigureVectorization.h	2018-12-03 17:10:40 -08:00
Eugene Zhulenev	fd0fbfa9b5	Do not disable alignment with EIGEN_GPUCC	2018-12-03 15:54:10 -08:00
Christoph Hertzberg	919414b9fe	bug #785 : Make Cholesky decomposition work for empty matrices	2018-12-03 16:18:15 +01:00
Gael Guennebaud	0ea7ae7213	Add missing padd for Packet8i (it was implicitly generated by clang and gcc)	2018-11-30 21:52:25 +01:00
Gael Guennebaud	ab4df3e6ff	bug #1634 : remove double copy in move-ctor of non movable Matrix/Array	2018-11-30 21:25:51 +01:00
Gael Guennebaud	c785464430	Add packet sin and cos to Altivec/VSX and NEON	2018-11-30 16:21:33 +01:00
Gael Guennebaud	69ace742be	Several improvements regarding packet-bitwise operations: add unit tests; optimize their AVX512f implementation; add missing implementations (half, Packet4f, ...)	2018-11-30 15:56:08 +01:00
Gael Guennebaud	fa87f9d876	Add psin/pcos on AVX512 -> almost for free, at last!	2018-11-30 14:33:13 +01:00
Gael Guennebaud	c68bd2fa7a	Cleanup	2018-11-30 14:32:31 +01:00
Gael Guennebaud	f91500d303	Fix pandnot order in AVX512	2018-11-30 14:32:06 +01:00
Gael Guennebaud	b477d60bc6	Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX)	2018-11-30 11:26:30 +01:00
Gael Guennebaud	e19ece822d	Disable fma gcc's workaround for gcc >= 8 (based on GEMM benchmarks)	2018-11-28 17:56:24 +01:00
Gael Guennebaud	41052f63b7	same for pmax	2018-11-28 17:17:28 +01:00
Gael Guennebaud	3e95e398b6	pmin/pmax on SSE: make sure to use AVX instruction with AVX enabled, and disable gcc workaround for fixed gcc versions	2018-11-28 17:14:20 +01:00
Gael Guennebaud	aa6097395b	Add missing SSE/AVX type-casting in AVX512 mode	2018-11-28 16:09:08 +01:00
Gael Guennebaud	48fe78c375	bug #1630 : fix linspaced when requesting smaller packet size than default one.	2018-11-28 13:15:06 +01:00
Eugene Zhulenev	80f1651f35	Use explicit packet type in SSE/PacketMath pldexp	2018-11-27 17:25:49 -08:00
Benoit Jacob	a4159dba08	do not read buffers out of bounds -- load only the 4 bytes we know exist here. Could also have done a vld1_lane_f32 but doing so here, without the overhead of initializing the unused lane, would have triggered use-of-uninitialized-value errors in tools such as ASan. Note that this code is sub-optimal before or after this change: we should be reading either 2 or 4 float32 values per load-instruction (2 for ARM in-order cores with an affinity for 8-byte loads; 4 for ARM out-of-order cores able to dual-issue 16-byte load instructions with arithmetic instructions). Before or after this patch, we are only loading 4 bytes of useful data here (even if before this patch, we were technically loading 8, only to use the first 4).	2018-11-27 16:53:14 -05:00
Gael Guennebaud	b131a4db24	bug #1631 : fix compilation with ARM NEON and clang, and cleanup the weird pshiftright_and_cast and pcast_and_shiftleft functions.	2018-11-27 23:45:00 +01:00
Gael Guennebaud	a1a5fbbd21	Update pshiftleft to pass the shift as a true compile-time integer.	2018-11-27 22:57:30 +01:00
Gael Guennebaud	fa7fd61eda	Unify SSE/AVX psin functions. It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: add packet-float to packet-int type traits; add packet float<->int reinterpret casts; add faster pselect for AVX based on blendv	2018-11-27 22:41:51 +01:00
Rasmus Munk Larsen	08edbc8cfe	Merged in bjacob/eigen/fixbuild (pull request PR-549) fix the build on 64-bit ARM when NEON is disabled	2018-11-27 20:14:12 +00:00
Benoit Jacob	7b1cb8a440	fix the build on 64-bit ARM when NEON is disabled	2018-11-27 11:11:02 -05:00
Gael Guennebaud	b5695a6008	Unify Altivec/VSX pexp(double) with default implementation	2018-11-27 13:53:05 +01:00
Gael Guennebaud	7655a8af6e	cleanup	2018-11-26 23:21:29 +01:00
Gael Guennebaud	502f92fa10	Unify SSE and AVX pexp for double.	2018-11-26 23:12:44 +01:00
Gael Guennebaud	4a347a0054	Unify NEON's pexp with generic implementation	2018-11-26 22:15:44 +01:00
Gael Guennebaud	5c8406babc	Unify Altivec/VSX's pexp with generic implementation	2018-11-26 16:47:13 +01:00
Gael Guennebaud	cf8b85d5c5	Unify SSE and AVX implementation of pexp	2018-11-26 16:36:19 +01:00
Gael Guennebaud	c2f35b1b47	Unify Altivec/VSX's plog with generic implementation, and enable it!	2018-11-26 15:58:11 +01:00
Gael Guennebaud	c24e98e6a8	Unify NEON's plog with generic implementation	2018-11-26 15:02:16 +01:00
Gael Guennebaud	2c44c40114	First step toward a unification of packet log implementation, currently only SSE and AVX are unified. To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits functions.	2018-11-26 14:21:24 +01:00
Gael Guennebaud	5f6045077c	Make SSE/AVX pandnot(A,B) consistent with generic version, i.e., "A and not B"	2018-11-26 14:14:07 +01:00
Gael Guennebaud	382279eb7f	Extend unit test to recursively check half-packet types and non packet types	2018-11-26 14:10:07 +01:00
Gael Guennebaud	0836a715d6	bug #1611 : fix plog(0) on NEON	2018-11-26 09:08:38 +01:00
Patrik Huber	95566eeed4	Fix typos	2018-11-23 22:22:14 +00:00
Gael Guennebaud	e3b22a6bd0	merge	2018-11-23 16:06:21 +01:00
Gael Guennebaud	ccabdd88c9	Fix reserved usage of double __ in macro names	2018-11-23 16:01:47 +01:00
Gael Guennebaud	572d62697d	check two ctors	2018-11-23 15:37:09 +01:00
Gael Guennebaud	354f14293b	Fix double = bool !	2018-11-23 15:12:06 +01:00
Gael Guennebaud	a7842daef2	Fix several uninitialized members from ctor	2018-11-23 15:10:28 +01:00
Christoph Hertzberg	ea60a172cf	Add default constructor to Bar to make test compile again with clang-3.8	2018-11-23 14:24:22 +01:00
Christoph Hertzberg	806352d844	Small typo found by Patrick Huber (pull request PR-547)	2018-11-23 12:34:27 +00:00
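
Commit 36f8f6d0be describes an alignment rule for the k range of a sharded contraction: multiples of 8 are enough while packets hold at most 8 floats, but wider packets need multiples of the packet size. The following is a minimal sketch of that rule only, not Eigen's actual code. The helper names `round_up` and `aligned_block_size` are hypothetical, and the sketch assumes the adjustment is done by rounding a block size up.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Eigen): round `value` up to the next
// multiple of `align`. Assumes value >= 0 and align > 0.
constexpr std::ptrdiff_t round_up(std::ptrdiff_t value, std::ptrdiff_t align) {
  return ((value + align - 1) / align) * align;
}

// Hypothetical illustration of the rule in the commit message: use a
// multiple of 8 by default, and a multiple of packet_size when the packet
// is wider than 8 (for example 16 floats on AVX512).
constexpr std::ptrdiff_t aligned_block_size(std::ptrdiff_t block_size,
                                            std::ptrdiff_t packet_size) {
  return round_up(block_size, packet_size > 8 ? packet_size : 8);
}
```

With a requested block size of 24, this sketch leaves the value unchanged for a packet size of 8 and rounds it up to 32 for a packet size of 16, so every shard boundary stays a multiple of the packet size.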


10555 Commits
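
Commits 5f6045077c and f91500d303 both concern the operand order of pandnot, which the log defines as "A and not B". The x86 andnot instructions (for example the `_mm_andnot_ps` intrinsic) compute the opposite order, (not A) and B, so a wrapper has to swap its arguments. The scalar model below is a sketch of that relationship on 32-bit masks. The function names are hypothetical and this is not Eigen's implementation.

```cpp
#include <cstdint>

// Semantics stated in the commit message: pandnot(A, B) is "A and not B".
constexpr std::uint32_t pandnot_generic(std::uint32_t a, std::uint32_t b) {
  return a & ~b;
}

// Scalar model of the x86 andnot instruction family, which computes
// (not A) and B, i.e. the first operand is the one that gets inverted.
constexpr std::uint32_t x86_andnot(std::uint32_t a, std::uint32_t b) {
  return ~a & b;
}

// Hypothetical wrapper: swapping the operands recovers "A and not B"
// from the hardware-style operation.
constexpr std::uint32_t pandnot_via_x86(std::uint32_t a, std::uint32_t b) {
  return x86_andnot(b, a);
}
```

Calling `x86_andnot(a, b)` directly where "A and not B" is intended silently inverts the wrong operand, which is the kind of ordering mistake these commits address.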