eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Eugene Zhulenev	3c02fefec5	Add async evaluation support to TensorSlicingOp. Device::memcpy is not async-safe and might lead to deadlocks. Always evaluate slice expression in async mode.	2020-04-22 19:55:01 +00:00
Pedro Caldeira	0c67b855d2	Add Packet8s and Packet8us to support signed/unsigned int16/short Altivec vector operations	2020-04-21 14:52:46 -03:00
Rasmus Munk Larsen	e8f40e4670	Fix bug in ptrue for Packet16b.	2020-04-20 21:45:10 +00:00
Rasmus Munk Larsen	2f6ddaa25c	Add partial vectorization for matrices and tensors of bool. This speeds up boolean operations on Tensors by up to 25x. Benchmark numbers for the logical and of two NxN tensors are listed below.	2020-04-20 20:16:28 +00:00
    name                                    old time/op    new time/op    delta
    BM_booleanAnd_1T/3 [using 1 threads]    14.6ns ± 0%    14.4ns ± 0%    -0.96%
    BM_booleanAnd_1T/4 [using 1 threads]    20.5ns ±12%    9.0ns ± 0%     -56.07%
    BM_booleanAnd_1T/7 [using 1 threads]    41.7ns ± 0%    10.5ns ± 0%    -74.87%
    BM_booleanAnd_1T/8 [using 1 threads]    52.1ns ± 0%    10.1ns ± 0%    -80.59%
    BM_booleanAnd_1T/10 [using 1 threads]   76.3ns ± 0%    13.8ns ± 0%    -81.87%
    BM_booleanAnd_1T/15 [using 1 threads]   167ns ± 0%     16ns ± 0%      -90.45%
    BM_booleanAnd_1T/16 [using 1 threads]   188ns ± 0%     16ns ± 0%      -91.57%
    BM_booleanAnd_1T/31 [using 1 threads]   667ns ± 0%     34ns ± 0%      -94.83%
    BM_booleanAnd_1T/32 [using 1 threads]   710ns ± 0%     35ns ± 0%      -95.01%
    BM_booleanAnd_1T/64 [using 1 threads]   2.80µs ± 0%    0.11µs ± 0%    -95.93%
    BM_booleanAnd_1T/128 [using 1 threads]  11.2µs ± 0%    0.4µs ± 0%     -96.11%
    BM_booleanAnd_1T/256 [using 1 threads]  44.6µs ± 0%    2.5µs ± 0%     -94.31%
    BM_booleanAnd_1T/512 [using 1 threads]  178µs ± 0%     10µs ± 0%      -94.35%
    BM_booleanAnd_1T/1k [using 1 threads]   717µs ± 0%     78µs ± 1%      -89.07%
    BM_booleanAnd_1T/2k [using 1 threads]   2.87ms ± 0%    0.31ms ± 1%    -89.08%
    BM_booleanAnd_1T/4k [using 1 threads]   11.7ms ± 0%    1.9ms ± 4%     -83.55%
    BM_booleanAnd_1T/10k [using 1 threads]  70.3ms ± 0%    17.2ms ± 4%    -75.48%
dlazenby	00f6340153	Update PreprocessorDirectives.dox - Added line for the new VectorwiseOp plugin directive (and re-alphabetized the plugin section)	2020-04-17 21:43:37 +00:00
Rasmus Munk Larsen	5ab87d8aba	Move eigen_packet_wrapper to GenericPacketMath.h and use it for SSE/AVX/AVX512 as it is already used for NEON. This will allow us to define multiple packet types backed by the same vector type, e.g., __m128i. Use this mechanism to define packets for half and clean up the packet op implementations.	2020-04-15 18:17:19 +00:00
Rasmus Munk Larsen	4aae8ac693	Fix typo in TypeCasting.h	2020-04-14 02:55:51 +00:00
Rasmus Munk Larsen	1d674003b2	Fix bug in vectorized casting of {uint8, int8} -> {int16, uint16, int32, uint32, float} and {uint16, int16} -> {int32, uint32, int64, uint64, float} for NEON. These conversions were advertised as vectorized, but not actually implemented.	2020-04-14 02:11:06 +00:00
Changming Sun	b1aa07a8d3	Fix a bug in TensorIndexList.h	2020-04-13 18:22:03 +00:00
Christoph Hertzberg	d46d726e9d	CommaInitializer wrongfully asserted for 0-sized blocks. The commainitializer unit test never actually called `test_block_recursion`, which also was not correctly implemented and would have caused too deep template recursion.	2020-04-13 16:41:20 +02:00
Antonio Sanchez	c854e189e6	Fixed commainitializer test. The removed `finished()` call was responsible for enforcing that the initializer was provided the correct number of values. Putting it back in to restore previous behavior.	2020-04-10 13:53:26 -07:00
jangsoopark	39142904cc	Resolve C4346 when building eigen on windows	2020-04-08 14:55:39 +09:00
Rasmus Munk Larsen	f0577a2bfd	Speed up matrix multiplication for small to medium size matrices by using half- or quarter-packet vectorized loads in gemm_pack_rhs if they have size 4, instead of dropping down to the scalar path. Benchmark measurements below are for computing `c.noalias() = a.transpose() * b;` for square RowMajor matrices of varying size.	2020-04-07 22:09:51 +00:00
    Measured improvement with AVX+FMA:
    name                 old time/op    new time/op    delta
    BM_MatMul_ATB/8      139ns ± 1%     129ns ± 1%     -7.49%   (p=0.008 n=5+5)
    BM_MatMul_ATB/32     1.46µs ± 1%    1.22µs ± 0%    -16.72%  (p=0.008 n=5+5)
    BM_MatMul_ATB/64     8.43µs ± 1%    7.41µs ± 0%    -12.04%  (p=0.008 n=5+5)
    BM_MatMul_ATB/128    56.8µs ± 1%    52.9µs ± 1%    -6.83%   (p=0.008 n=5+5)
    BM_MatMul_ATB/256    407µs ± 1%     395µs ± 3%     -2.94%   (p=0.032 n=5+5)
    BM_MatMul_ATB/512    3.27ms ± 3%    3.18ms ± 1%    ~        (p=0.056 n=5+5)
    Measured improvement for AVX512:
    name                 old time/op    new time/op    delta
    BM_MatMul_ATB/8      167ns ± 1%     154ns ± 1%     -7.63%   (p=0.008 n=5+5)
    BM_MatMul_ATB/32     1.08µs ± 1%    0.83µs ± 3%    -23.58%  (p=0.008 n=5+5)
    BM_MatMul_ATB/64     6.21µs ± 1%    5.06µs ± 1%    -18.47%  (p=0.008 n=5+5)
    BM_MatMul_ATB/128    36.1µs ± 2%    31.3µs ± 1%    -13.32%  (p=0.008 n=5+5)
    BM_MatMul_ATB/256    263µs ± 2%     242µs ± 2%     -7.92%   (p=0.008 n=5+5)
    BM_MatMul_ATB/512    1.95ms ± 2%    1.91ms ± 2%    ~        (p=0.095 n=5+5)
    BM_MatMul_ATB/1k     15.4ms ± 4%    14.8ms ± 2%    ~        (p=0.095 n=5+5)
Antonio Sanchez	8e875719b3	Replace norm() with squaredNorm() to address integer overflows. For random matrices with integer coefficients, many of the tests here lead to integer overflows. When taking the norm() of a row/column, the squaredNorm() often overflows to a negative value, leading to domain errors when taking the sqrt(). This leads to a crash on some systems. By replacing the norm() call by a squaredNorm(), the values still overflow, but at least there is no domain error. Addresses https://gitlab.com/libeigen/eigen/-/issues/1856	2020-04-07 19:48:28 +00:00
Antonio Sanchez	9dda5eb7d2	Missing struct definition in NumTraits	2020-04-07 09:01:11 -07:00
Akshay Naresh Modi	bcc0e9e15c	Add numeric_limits min and max for bool. This will allow (among other things) computation of argmax and argmin of bool tensors	2020-04-06 23:38:57 +00:00
Bernardo Bahia Monteiro	54a0a9c9dd	Bugfix: conjugate_gradient did not compile with lazy-evaluated RealScalar. The error generated by the compiler was: no matching function for call to 'maxi': RealScalar threshold = numext::maxi(tol*tol*rhsNorm2,considerAsZero); The important part in the following notes was: "candidate template ignored: deduced conflicting types for parameter 'T'" ('codi::Multiply11<...>' vs. 'codi::ActiveReal<...>') EIGEN_ALWAYS_INLINE T maxi(const T& x, const T& y). I am using CoDiPack to provide the RealScalar type. This bug was introduced in `bc000deaa` (Fix conjugate-gradient for very small rhs).	2020-03-29 19:44:12 -04:00
Rasmus Munk Larsen	4fd5d1477b	Fix packetmath test build for AVX.	2020-03-27 17:05:39 +00:00
Rasmus Munk Larsen	393dbd8ee9	Fix bug in `52d54278be`	2020-03-27 16:42:18 +00:00
Rasmus Munk Larsen	55c8fe8d0f	Fix bug in `52d54278be`	2020-03-27 16:41:15 +00:00
Joel Holdsworth	6d2dbfc453	NEON: Fixed MSVC types definitions	2020-03-26 20:19:58 +00:00
Joel Holdsworth	52d54278be	Additional NEON packet-math operations	2020-03-26 20:18:19 +00:00
Everton Constantino	deb93ed1bf	Adhere to recommended load/store intrinsics for ppc64le	2020-03-23 15:18:15 -03:00
Aaron Franke	5c22c7a7de	Make file formatting comply with POSIX and Unix standards: UTF-8, LF, no BOM, and newlines at the end of files	2020-03-23 18:09:02 +00:00
Everton Constantino	5afdaa473a	Fixing float32's pround halfway criteria to match STL's criteria.	2020-03-21 22:30:54 -05:00
Alessio M	96cd1ff718	Fixed: - access violation when initializing 0x0 matrices - exception can be thrown during stack unwind while comma-initializing a matrix if eigen_assert is configured to throw	2020-03-21 05:11:21 +00:00
dlazenby	cc954777f2	Update VectorwiseOp.h to allow Plugins similar to MatrixBase.h or ArrayBase.h	2020-03-20 19:30:01 +00:00
Masaki Murooka	55ecd58a3c	Bug https://gitlab.com/libeigen/eigen/-/issues/1415 : add missing EIGEN_DEVICE_FUNC to diagonal_product_evaluator_base.	2020-03-20 13:37:37 +09:00
Rasmus Munk Larsen	4da2c6b197	Remove reference to non-existent unary_op_base class.	2020-03-19 18:23:06 +00:00
Rasmus Munk Larsen	eda90baf35	Add missing arguments to numext::absdiff().	2020-03-19 18:16:55 +00:00
Joel Holdsworth	d5c665742b	Add absolute_difference coefficient-wise binary Array function	2020-03-19 17:45:20 +00:00
Everton Constantino	6ff5a14091	Reenabling packetmath unsigned tests, adding dummy pabs for relevant unsigned types.	2020-03-19 17:31:49 +00:00
Joel Holdsworth	232f904082	Add shift_left<N> and shift_right<N> coefficient-wise unary Array functions	2020-03-19 17:24:06 +00:00
Joel Holdsworth	54aa8fa186	Implement integer square-root for NEON	2020-03-19 17:05:13 +00:00
Allan Leal	37ccb86916	Update NullaryFunctors.h	2020-03-16 11:59:02 +00:00
Deven Desai	7158ed4e0e	Fixing HIP breakage caused by the recent commit that introduces Packet4h2 as the Eigen::Half packet type	2020-03-12 01:06:24 +00:00
Joel Holdsworth	d53ae40f7b	NEON: Added int64_t and uint64_t packet math	2020-03-10 22:46:19 +00:00
Joel Holdsworth	4b9ecf2924	NEON: Added int8_t and uint8_t packet math	2020-03-10 22:46:19 +00:00
Joel Holdsworth	ceaabd4e16	NEON: Added int16_t and uint16_t packet math	2020-03-10 22:46:19 +00:00
Joel Holdsworth	d5d3cf9339	NEON: Added uint32_t packet math	2020-03-10 22:46:19 +00:00
Joel Holdsworth	eacf97f727	NEON: Implemented half-size vectors	2020-03-10 22:46:19 +00:00
Joel Holdsworth	5f411b729e	NEON: Set packet_traits<double> flags	2020-03-10 22:46:19 +00:00
Joel Holdsworth	88337acae2	test/packetmath: Add tests for all integer types	2020-03-10 22:46:19 +00:00
Joel Holdsworth	9e68977578	test/packetmath: Made negate non-mandatory	2020-03-10 22:46:19 +00:00
Sami Kama	b733b8b680	Remove duplicate pset1 for half and add some comments about why we need to expose pmul/add/div/min/max on host	2020-03-10 20:28:43 +00:00
Ram-Z	a45d28256d	Don't restrict CMAKE_BUILD_TYPE. The restriction prevents projects that add Eigen using `add_subdirectory` from using their own custom CMAKE_BUILD_TYPE and having Eigen respect the same custom flags.	2020-02-28 20:46:53 +00:00
Cédric Hubert	98bfc5aaa8	Update MarketIO.h	2020-02-28 12:41:51 +00:00
Rasmus Munk Larsen	52a2fbbb00	Revert "avoid selecting half-packets when unnecessary". This reverts commit `5ca10480b0`	2020-02-25 01:07:43 +00:00
Rasmus Munk Larsen	235bcfe08d	Revert "Pick full packet unconditionally when EIGEN_UNALIGNED_VECTORIZE". This reverts commit `44df2109c8`	2020-02-25 01:07:28 +00:00
Rasmus Munk Larsen	d7a42eade6	Revert "do not pick full-packet if it'd result in more operations". This reverts commit `e9cc0cd353`	2020-02-25 01:07:15 +00:00
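
Note on commit 54a0a9c9dd: the "deduced conflicting types for parameter 'T'" error quoted in that message can be reproduced without Eigen or CoDiPack. The sketch below is only an illustration under stated assumptions: `Real`, `MulExpr`, and `threshold` are hypothetical stand-ins (not Eigen or CoDiPack types), and `maxi` merely mirrors the shape of the signature quoted in the commit message. It shows one common remedy, forcing the lazy expression to its scalar type before the call; whether the actual patch does exactly this is not stated in the log.

```cpp
// Hypothetical stand-in for an AD scalar type (e.g. what RealScalar might be).
struct Real {
  double value;
  Real(double v = 0.0) : value(v) {}
};

inline bool operator<(const Real& a, const Real& b) { return a.value < b.value; }

// Hypothetical lazy product node, standing in for an expression-template type.
struct MulExpr {
  double lhs, rhs;
  operator Real() const { return Real(lhs * rhs); }  // evaluates on conversion
};

// Products are not evaluated immediately; they yield a MulExpr, not a Real.
inline MulExpr operator*(const Real& a, const Real& b) {
  return MulExpr{a.value, b.value};
}
inline MulExpr operator*(const MulExpr& a, const Real& b) {
  return MulExpr{a.lhs * a.rhs, b.value};
}

// Same shape as the signature quoted in the commit message:
// both arguments must deduce to the same T.
template <typename T>
inline T maxi(const T& x, const T& y) {
  return x < y ? y : x;
}

inline Real threshold(const Real& tol, const Real& rhsNorm2,
                      const Real& considerAsZero) {
  // maxi(tol * tol * rhsNorm2, considerAsZero) would not compile here:
  // T would be deduced as MulExpr from the first argument and Real from
  // the second. Converting explicitly makes both arguments Real.
  return maxi(Real(tol * tol * rhsNorm2), considerAsZero);
}
```

With these stand-ins, `threshold(Real(0.5), Real(8.0), Real(1.0))` evaluates the product to 2.0 and returns it, while a smaller `rhsNorm2` of 2.0 falls back to the `considerAsZero` value of 1.0.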

11119 Commits