eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Eugene Zhulenev	4ac93f8edc	Allocate non-const scalar buffer for block evaluation with DefaultDevice	2019-07-01 10:55:19 -07:00
Mehdi Goli	9ea490c82c	[SYCL] : * Modifying TensorDeviceSYCL to use `EIGEN_THROW_X`. * Modifying TensorMacro to use `EIGEN_TRY/CATCH(X)` macro. * Modifying TensorReverse.h to use `EIGEN_DEVICE_REF` instead of `&`. * Fixing the SYCL device macro in SpecialFunctionsImpl.h.	2019-07-01 16:27:28 +01:00
Eugene Zhulenev	81a03bec75	Fix TensorReverse on GPU with m_stride[i]==0	2019-06-28 15:50:39 -07:00
Rasmus Munk Larsen	8053eeb51e	Fix CUDA compilation error for pselect<half>.	2019-06-28 12:07:29 -07:00
Rasmus Munk Larsen	74a9dd1102	Fix preprocessor condition to only generate a warning when calling eigen::GpuDevice::synchronize() from device code, but not when calling from a non-GPU compilation unit.	2019-06-28 11:56:21 -07:00
Rasmus Munk Larsen	70d4020ad9	Remove comma causing warning in c++03 mode.	2019-06-28 11:39:45 -07:00
Eugene Zhulenev	6e7c76481a	Merge with Eigen head	2019-06-28 11:22:46 -07:00
Eugene Zhulenev	878845cb25	Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred	2019-06-28 11:13:44 -07:00
Rasmus Munk Larsen	1f61aee5ca	[SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.	2019-06-28 10:11:56 -07:00
Mehdi Goli	7d08fa805a	[SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.	2019-06-28 10:08:23 +01:00
Mehdi Goli	16a56b2ddd	[SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL. * Adding SYCL memory model * Enabling/Disabling SYCL backend in Core * Supporting Vectorization	2019-06-27 12:25:09 +01:00
Christoph Hertzberg	adec097c61	Remove extra comma (causes warnings in C++03)	2019-06-26 16:14:28 +02:00
Eugene Zhulenev	229db81572	Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp	2019-06-25 15:41:37 -07:00
Deven Desai	ba506d5bd2	fix for a ROCm/HIP specificcompile errror introduced by a recent commit.	2019-06-22 00:06:05 +00:00
Rasmus Munk Larsen	c9394d7a0e	Remove extra "one" in comment.	2019-06-20 16:23:19 -07:00
Rasmus Munk Larsen	b8f8dac4eb	Update comment as suggested by tra@google.com.	2019-06-20 16:18:37 -07:00
Rasmus Munk Larsen	e5e63c2cad	Fix grammar.	2019-06-20 16:03:59 -07:00
Rasmus Munk Larsen	302a404b7e	Added comment explaining the surprising EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC clause.	2019-06-20 15:59:08 -07:00
Rasmus Munk Larsen	b5237f53b1	Fix CUDA build on Mac.	2019-06-20 15:44:14 -07:00
Rasmus Munk Larsen	988f24b730	Various fixes for packet ops. 1. Fix buggy pcmp_eq and unit test for half types. 2. Add unit test for pselect and add specializations for SSE 4.1, AVX512, and half types. 3. Get rid of FIXME: Implement faster pnegate for half by XOR'ing with a sign bit mask.	2019-06-20 11:47:49 -07:00
Christoph Hertzberg	e0be7f30e1	bug #1724 : Mask buggy warnings with g++-7 (grafted from `427f2f66d6` )	2019-06-14 14:57:46 +02:00
Anshul Jaiswal	fab51d133e	Updated Eigen_Colamd.h, namespacing macros ALIVE & DEAD as COLAMD_ALIVE & COLAMD_DEAD to prevent conflicts with other libraries / code.	2019-06-08 21:09:06 +00:00
Eugene Zhulenev	79c402e40e	Fix shadow warnings in TensorContractionThreadPool	2019-08-30 15:38:31 -07:00
Eugene Zhulenev	edf2ec28d8	Fix block mapper type name in TensorExecutor	2019-08-30 15:29:25 -07:00
Eugene Zhulenev	f0b36fb9a4	evalSubExprsIfNeededAsync + async TensorContractionThreadPool	2019-08-30 15:13:38 -07:00
Eugene Zhulenev	619cea9491	Revert accidentally removed <memory> header from ThreadPool	2019-08-30 14:51:17 -07:00
Eugene Zhulenev	66665e7e76	Asynchronous expression evaluation with TensorAsyncDevice	2019-08-30 14:49:40 -07:00
Rasmus Munk Larsen	f6c51d9209	Fix missing header inclusion and colliding definitions for half type casting, which broke build with -march=native on Haswell/Skylake.	2019-08-30 14:03:29 -07:00
Eugene Zhulenev	bc40d4522c	Const correctness in TensorMap<const Tensor<T, ...>> expressions	2019-08-28 17:46:05 -07:00
Rasmus Munk Larsen	1187bb65ad	Add more tests for corner cases of log1p and expm1. Add handling of infinite arguments to log1p such that log1p(inf) = inf.	2019-08-28 12:20:21 -07:00
Eugene Zhulenev	6e77f9bef3	Remove shadow warnings in TensorDeviceThreadPool	2019-08-28 10:32:19 -07:00
Rasmus Munk Larsen	9aba527405	Revert changes to std_falback::log1p that broke handling of arguments less than -1. Fix packet op accordingly.	2019-08-27 15:35:29 -07:00
Rasmus Munk Larsen	b021cdea6d	Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories.	2019-08-27 11:30:31 -07:00
Rasmus Larsen	84fefdf321	Merged in ezhulenev/eigen-01 (pull request PR-683) Asynchronous parallelFor in Eigen ThreadPoolDevice	2019-08-26 21:49:17 +00:00
maratek	8b5ab0e4dd	Fix get_random_seed on Native Client Newlib in Native Client SDK does not provide ::random function. Implement get_random_seed for NaCl using ::rand, similarly to Windows version.	2019-08-23 15:25:56 -07:00
Eugene Zhulenev	6901788013	Asynchronous parallelFor in Eigen ThreadPoolDevice	2019-08-22 10:50:51 -07:00
Christoph Hertzberg	2fb24384c9	Merged in jaopaulolc/eigen (pull request PR-679) Fixes for Altivec/VSX and compilation with clang on PowerPC	2019-08-22 15:57:33 +00:00
Rasmus Larsen	57f6b62597	Merged in rmlarsen/eigen (pull request PR-680) Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.	2019-08-22 00:25:29 +00:00
Eugene Zhulenev	071311821e	Remove XSMM support from Tensor module	2019-08-19 11:44:25 -07:00
João P. L. de Carvalho	5ac7984ffa	Fix debug macros in p{load,store}u	2019-08-14 11:59:12 -06:00
João P. L. de Carvalho	db9147ae40	Add missing pcmp_XX methods for double/Packet2d This actually fixes an issue in unit-test packetmath_2 with pcmp_eq when it is compiled with clang. When pcmp_eq(Packet4f,Packet4f) is used instead of pcmp_eq(Packet2d,Packet2d), the unit-test does not pass due to NaN on ref vector.	2019-08-14 10:37:39 -06:00
Rasmus Munk Larsen	a3298b22ec	Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments. Depending on instruction set, significant speedups are observed for the vectorized path: log1p wall time is reduced 60-93% (2.5x - 15x speedup) expm1 wall time is reduced 0-85% (1x - 7x speedup) The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly. Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM	2019-08-12 13:53:28 -07:00
João P. L. de Carvalho	787f6ef025	Fix packed load/store for PowerPC's VSX The vec_vsx_ld/vec_vsx_st builtins were wrongly used for aligned load/store. In fact, they perform unaligned memory access and, even when the address is 16-byte aligned, they are much slower (at least 2x) than their aligned counterparts. For double/Packet2d vec_xl/vec_xst should be prefered over vec_ld/vec_st, although the latter works when casted to float/Packet4f. Silencing some weird warning with throw but some GCC versions. Such warning are not thrown by Clang.	2019-08-09 16:02:55 -06:00
João P. L. de Carvalho	4d29aa0294	Fix offset argument of ploadu/pstoreu for Altivec If no offset is given, them it should be zero. Also passes full address to vec_vsx_ld/st builtins. Removes userless _EIGEN_ALIGNED_PTR & _EIGEN_MASK_ALIGNMENT. Removes unnecessary casts.	2019-08-09 15:59:26 -06:00
João P. L. de Carvalho	66d073c38e	bug #1718 : Add cast to successfully compile with clang on PowerPC Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h	2019-08-09 15:56:26 -06:00
Rasmus Munk Larsen	6d432eae5d	Make is_valid_index_type return false for float and double when EIGEN_HAS_TYPE_TRAITS is off.	2019-06-05 16:42:27 -07:00
Rasmus Munk Larsen	f715f6e816	Add workaround for choosing the right include files with FP16C support with clang.	2019-06-05 13:36:37 -07:00
Justin Carpentier	ffaf658ecd	PR 655: Fix missing Eigen namespace in Macros	2019-06-05 09:51:59 +02:00
Mehdi Goli	0b24e1cb5c	[SYCL] Adding the SYCL memory model. The SYCL memory model provides : * an interface for SYCL buffers to behave as a non-dereferenceable pointer * an interface for placeholder accessor to behave like a pointer on both host and device	2019-07-01 16:02:30 +01:00
Rasmus Larsen	c1b0aea653	Merged in Artem-B/eigen (pull request PR-654) Minor build improvements Approved-by: Rasmus Larsen <rmlarsen@google.com>	2019-05-31 22:27:04 +00:00

... 3 4 5 6 7 ...

10836 Commits