eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-03-07 18:27:40 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	4e696901f8	Remove __host__ annotation for device-only function.	2019-12-03 14:33:19 -08:00
Rasmus Munk Larsen	ead81559c8	Use EIGEN_DEVICE_FUNC macro instead of __device__.	2019-12-03 12:08:22 -08:00
Mehdi Goli	00f32752f7	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. * Unifying all loadLocalTile from lhs and rhs to an extract_block function. * Adding get_tensor operation which was missing in TensorContractionMapper. * Adding the -D method missing from cmake for Disable_Skinny Contraction operation. * Wrapping all the indices in TensorScanSycl into Scan parameter struct. * Fixing typo in Device SYCL * Unifying load to private register for tall/skinny no shared * Unifying load to vector tile for tensor-vector/vector-tensor operation * Removing all the LHS/RHS class for extracting data from global * Removing Outputfunction from TensorContractionSkinnyNoshared. * Combining the local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining General Tensor-Vector and VectorTensor contraction into one kernel. * Making double buffering optional for Tensor contraction when the local memory version is used. * Modifying benchmark to accept custom Reduction Sizes * Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host * Adding Test for SYCL * Modifying SYCL CMake	2019-11-28 10:08:54 +00:00
Eugene Zhulenev	5496d0da0b	Add async evaluation support to TensorReverse	2019-11-26 15:02:24 -08:00
Eugene Zhulenev	bc66c88255	Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling	2019-11-26 11:41:57 -08:00
Hans Johnson	8c8cab1afd	STYLE: Convert CMake-language commands to lower case Ancient CMake versions required upper-case commands. Later command names became case-insensitive. Now the preferred style is lower-case.	2019-10-31 11:36:37 -05:00
Hans Johnson	6fb3e5f176	STYLE: Remove CMake-language block-end command arguments Ancient versions of CMake required else(), endif(), and similar block termination commands to have arguments matching the command starting the block. This is no longer the preferred style.	2019-10-31 11:36:27 -05:00
Gael Guennebaud	c3f6fcf2c0	bug #1747 : one more fix for MSVC regarding the Bessel implementation.	2019-11-15 11:12:35 +01:00
Gael Guennebaud	b9837ca9ae	bug #1281 : fix AutoDiffScalar's make_coherent for nested expression of constant ADs.	2019-11-14 14:58:08 +01:00
Eugene Zhulenev	13c3327f5c	Remove legacy block evaluation support	2019-11-12 10:12:28 -08:00
Rasmus Munk Larsen	0ed0338593	Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs.	2019-11-11 12:26:41 -08:00
Eugene Zhulenev	c952b8dfda	Break loop dependence in TensorGenerator block access	2019-11-11 10:32:57 -08:00
Rasmus Munk Larsen	ebf04fb3e8	Fix data race in cxx11_tensor_notification test.	2019-11-08 17:44:50 -08:00
Rasmus Munk Larsen	cc3d0e6a40	Add EIGEN_HAS_INTRINSIC_INT128 macro Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice.	2019-11-06 14:24:33 -08:00
Rasmus Munk Larsen	ee404667e2	Rollback of PR-746 and partial rollback of `668ab3fc47`. std::array is still not supported in CUDA device code on Windows.	2019-11-05 17:17:58 -08:00
Rasmus Larsen	0c9745903a	Merged in ezhulenev/eigen-01 (pull request PR-746) Remove internal::smart_copy and replace with std::copy	2019-11-04 20:18:38 +00:00
Eugene Zhulenev	73ecb2c57d	Cleanup includes in Tensor module after switch to C++11 and above	2019-10-29 15:49:54 -07:00
Eugene Zhulenev	e7ed4bd388	Remove internal::smart_copy and replace with std::copy	2019-10-29 11:25:24 -07:00
Eugene Zhulenev	fbc0a9a3ec	Fix CXX11Meta compilation with MSVC	2019-10-28 18:30:10 -07:00
Eugene Zhulenev	bd864ab42b	Prevent potential ODR in TensorExecutor	2019-10-28 15:45:09 -07:00
Mehdi Goli	6332aff0b2	This PR fixes: * The specialization of array class in the different namespace for GCC<=6.4 * The implicit call to `std::array` constructor using the initializer list for GCC <=6.1	2019-10-23 15:56:56 +01:00
Rasmus Larsen	8e4e29ae99	Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738) Fix for the HIP build+test errors.	2019-10-22 22:18:38 +00:00
Rasmus Munk Larsen	97c0c5d485	Add block evaluation V2 to TensorAsyncExecutor. Add async evaluation to a number of ops.	2019-10-22 12:42:44 -07:00
Deven Desai	102cf2a72d	Fix for the HIP build+test errors. The errors were introduced by this commit : After the above mentioned commit, some of the tests started failing with the following error ``` Built target cxx11_tensor_reduction Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117: /home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible DestinationBufferKind m_kind; ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible DestinationBuffer m_destination; ^ ``` For some reason HIPCC does not like device code to contain enum types which do not have the base-type explicitly declared. The fix is trivial, explicitly state "int" as the basetype	2019-10-22 19:21:27 +00:00
Rasmus Munk Larsen	668ab3fc47	Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers.	2019-10-18 16:42:00 -07:00
Eugene Zhulenev	df0e8b8137	Propagate block evaluation preference through rvalue tensor expressions	2019-10-17 11:17:33 -07:00
Eugene Zhulenev	0d2a14ce11	Cleanup Tensor block destination and materialized block storage allocation	2019-10-16 17:14:37 -07:00
Eugene Zhulenev	02431cbe71	TensorBroadcasting support for random/uniform blocks	2019-10-16 13:26:28 -07:00
Eugene Zhulenev	d380c23b2c	Block evaluation for TensorGenerator/TensorReverse/TensorShuffling	2019-10-14 14:31:59 -07:00
Gael Guennebaud	39fb9eeccf	bug #1747 : fix compilation with MSVC	2019-10-14 22:50:23 +02:00
Eugene Zhulenev	a411e9f344	Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op	2019-10-10 10:56:58 -07:00
Eugene Zhulenev	33e1746139	Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing	2019-10-09 12:45:31 -07:00
Gael Guennebaud	f0a4642bab	Implement c++03 compatible fix for changeset `7a43af1a33`	2019-10-09 16:00:57 +02:00
Gael Guennebaud	7a43af1a33	Fix compilation of FFTW unit test	2019-10-08 08:58:35 +02:00
Eugene Zhulenev	f74ab8cb8d	Add block evaluation to TensorEvalTo and fix few small bugs	2019-10-07 15:34:26 -07:00
Brian Zhao	3afb640b56	Fixing incorrect size in Tensor documentation.	2019-10-04 21:30:35 -07:00
Rasmus Munk Larsen	20c4a9118f	Use "pdiv" rather than operator/ to support packet types.	2019-10-04 16:54:03 -07:00
Eugene Zhulenev	98bdd7252e	Fix compilation warnings and errors with clang in TensorBlockV2 code and tests	2019-10-04 10:15:33 -07:00
Eugene Zhulenev	60ae24ee1a	Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect	2019-10-02 12:44:06 -07:00
Eugene Zhulenev	6e40454a6e	Add beta to TensorContractionKernel and make memset optional	2019-10-02 11:06:02 -07:00
Rasmus Munk Larsen	13ef08e5ac	Move implementation of vectorized error function erf() to SpecialFunctionsImpl.h.	2019-09-27 13:56:04 -07:00
Eugene Zhulenev	7c8bc0d928	Fix cxx11_tensor_block_io test	2019-09-25 11:48:11 -07:00
Eugene Zhulenev	71d5bedf72	Fix compilation warnings and errors with clang in TensorBlockV2	2019-09-25 11:25:22 -07:00
Deven Desai	5e186b1987	Fix for the HIP build+test errors. The errors were introduced by this commit : `d38e6fbc27` After the above mentioned commit, some of the tests started failing with the following error ``` Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:70: /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h:28:22: error: call to 'erf' is ambiguous return Eigen::half(Eigen::numext::erf(static_cast<float>(a))); ^~~~~~~~~~~~~~~~~~ /home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1600:7: note: candidate function [with T = float] float erf(const float &x) { return ::erff(x); } ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = float] erf(const Scalar& x) { ^ In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75: /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:23: error: call to 'erf' is ambiguous return make_double2(erf(a.x), erf(a.y)); ^~~ /home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double] double erf(const double &x) { return ::erf(x); } ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double] erf(const Scalar& x) { 
^ In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75: /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:33: error: call to 'erf' is ambiguous return make_double2(erf(a.x), erf(a.y)); ^~~ /home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double] double erf(const double &x) { return ::erf(x); } ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double] erf(const Scalar& x) { ^ 3 errors generated. ``` This PR fixes the compile error by removing the "old" implementation for "erf" (assuming that the "new" implementation is what we want going forward. from a GPU point-of-view both implementations are the same). This PR also fixes what seems like a cut-n-paste error in the aforementioned commit	2019-09-25 15:39:13 +00:00
Eugene Zhulenev	f35b9ab510	Fix a bug in a packed block type in TensorContractionThreadPool	2019-09-24 16:54:36 -07:00
Rasmus Larsen	d38e6fbc27	Merged in rmlarsen/eigen (pull request PR-704) Add generic PacketMath implementation of the Error Function (erf).	2019-09-24 23:40:29 +00:00
Rasmus Munk Larsen	591a554c68	Add TODO to cleanup FMA cost modelling.	2019-09-24 16:39:25 -07:00
Eugene Zhulenev	c64396b4c6	Choose TensorBlock StridedLinearCopy type statically	2019-09-24 16:04:29 -07:00
Eugene Zhulenev	c97b208468	Add new TensorBlock api implementation + tests	2019-09-24 15:17:35 -07:00
Eugene Zhulenev	ef9dfee7bd	Tensor block evaluation V2 support for unary/binary/broadcasting	2019-09-24 12:52:45 -07:00


2791 Commits