eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
João P. L. de Carvalho	66d073c38e	bug #1718 : Add cast to successfully compile with clang on PowerPC Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h	2019-08-09 15:56:26 -06:00
Rasmus Munk Larsen	d55d392e7b	Fix bugs in log1p and expm1 where repeated using statements would clobber each other. Add specializations for complex types since std::log1p and std::expm1 do not support complex.	2019-08-08 16:27:32 -07:00
Rasmus Munk Larsen	85928e5f47	Guard against repeated definition of EIGEN_MPL2_ONLY	2019-08-07 14:19:00 -07:00
Rasmus Munk Larsen	facc4e4536	Disable tests for contraction with output kernels when using libxsmm, which does not support this.	2019-08-07 14:11:15 -07:00
Rasmus Munk Larsen	eab7e52db2	[Eigen] Vectorize evaluation of coefficient-wise functions over tensor blocks if the strides are known to be 1. Provides up to 20-25% speedup of the TF cross entropy op with AVX. A few benchmark numbers: name old time/op new time/op delta BM_Xent_16_10000_cpu 448µs ± 3% 389µs ± 2% -13.21% (p=0.008 n=5+5) BM_Xent_32_10000_cpu 575µs ± 6% 454µs ± 3% -21.00% (p=0.008 n=5+5) BM_Xent_64_10000_cpu 933µs ± 4% 712µs ± 1% -23.71% (p=0.008 n=5+5)	2019-08-07 12:57:42 -07:00
Rasmus Munk Larsen	0987126165	Clean up unnecessary namespace specifiers in TensorBlock.h.	2019-08-07 12:12:52 -07:00
Gael Guennebaud	0050644b23	Fix doc regarding alignment and c++17	2019-08-04 01:09:41 +02:00
Rasmus Munk Larsen	e2999d4c38	Fix performance regressions due to https://bitbucket.org/eigen/eigen/pull-requests/662 . The change caused the device struct to be copied for each expression evaluation, and caused, e.g., a 10% regression in the TensorFlow multinomial op on GPU: Benchmark Time(ns) CPU(ns) Iterations ---------------------------------------------------------------------- BM_Multinomial_gpu_1_100000_4 128173 231326 2922 1.610G items/s VS Benchmark Time(ns) CPU(ns) Iterations ---------------------------------------------------------------------- BM_Multinomial_gpu_1_100000_4 146683 246914 2719 1.509G items/s	2019-08-02 11:18:13 -07:00
Kyle Vedder	f22b7283a3	Added leading asterisk for Doxygen to consume as it was removing asterisk intended to be part of the code.	2019-07-18 18:12:14 +00:00
Michael Grupp	6e17491f45	Fix typo in Umeyama method documentation	2019-07-17 11:20:41 +00:00
Christoph Hertzberg	e0f5a2a456	Remove {} accidentally added in previous commit	2019-07-18 20:22:17 +02:00
Christoph Hertzberg	ea6d7eb32f	Move variadic constructors outside `#ifndef EIGEN_PARSED_BY_DOXYGEN` block, to make it actually appear in the generated documentation.	2019-07-12 19:46:37 +02:00
Christoph Hertzberg	9237883ff1	Escape \# inside doxygen docu	2019-07-12 19:45:13 +02:00
Christoph Hertzberg	c2671e5315	Build deprecated snippets with -DEIGEN_NO_DEPRECATED_WARNING Also, document LinSpaced only where it is implemented	2019-07-12 19:43:32 +02:00
Eugene Zhulenev	3cd148f983	Fix expression evaluation heuristic for TensorSliceOp	2019-07-09 12:10:26 -07:00
Rasmus Munk Larsen	23b958818e	Fix compiler for unsigned integers.	2019-07-09 11:18:25 -07:00
Eugene Zhulenev	6083014594	Add outer/inner chipping optimization for chipping dimension specified at runtime	2019-07-03 11:35:25 -07:00
Deven Desai	7eb2e0a95b	adding the EIGEN_DEVICE_FUNC attribute to the constCast routine. Not having this attribute results in the following failures in the `--config=rocm` TF build. ``` In file included from tensorflow/core/kernels/cross_op_gpu.cu.cc:20: In file included from ./tensorflow/core/framework/register_types.h:20: In file included from ./tensorflow/core/framework/numeric_types.h:20: In file included from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1: In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:140: external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorChipping.h:356:37: error: 'Eigen::constCast': no overloaded function has restriction specifiers that are compatible with the ambient context 'data' typename Storage::Type result = constCast(m_impl.data()); ^ external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorChipping.h:356:37: error: 'Eigen::constCast': no overloaded function has restriction specifiers that are compatible with the ambient context 'data' external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h:148:56: note: in instantiation of member function 'Eigen::TensorEvaluator<const Eigen::TensorChippingOp<1, Eigen::TensorMap<Eigen::Tensor<int, 2, 1, long>, 16, MakePointer> >, Eigen::GpuDevice>::data' requested here return m_rightImpl.evalSubExprsIfNeeded(m_leftImpl.data()); ``` Adding the EIGEN_DEVICE_FUNC attribute resolves those errors	2019-07-02 20:02:46 +00:00
Gael Guennebaud	ef8aca6a89	Merged in codeplaysoftware/eigen (pull request PR-667) [SYCL] : Approved-by: Gael Guennebaud <g.gael@free.fr> Approved-by: Rasmus Larsen <rmlarsen@google.com>	2019-07-02 12:45:23 +00:00
Eugene Zhulenev	4ac93f8edc	Allocate non-const scalar buffer for block evaluation with DefaultDevice	2019-07-01 10:55:19 -07:00
Mehdi Goli	9ea490c82c	[SYCL] : * Modifying TensorDeviceSYCL to use `EIGEN_THROW_X`. * Modifying TensorMacro to use `EIGEN_TRY/CATCH(X)` macro. * Modifying TensorReverse.h to use `EIGEN_DEVICE_REF` instead of `&`. * Fixing the SYCL device macro in SpecialFunctionsImpl.h.	2019-07-01 16:27:28 +01:00
Eugene Zhulenev	81a03bec75	Fix TensorReverse on GPU with m_stride[i]==0	2019-06-28 15:50:39 -07:00
Rasmus Munk Larsen	8053eeb51e	Fix CUDA compilation error for pselect<half>.	2019-06-28 12:07:29 -07:00
Rasmus Munk Larsen	74a9dd1102	Fix preprocessor condition to only generate a warning when calling Eigen::GpuDevice::synchronize() from device code, but not when calling from a non-GPU compilation unit.	2019-06-28 11:56:21 -07:00
Rasmus Munk Larsen	70d4020ad9	Remove comma causing warning in c++03 mode.	2019-06-28 11:39:45 -07:00
Eugene Zhulenev	6e7c76481a	Merge with Eigen head	2019-06-28 11:22:46 -07:00
Eugene Zhulenev	878845cb25	Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred	2019-06-28 11:13:44 -07:00
Rasmus Munk Larsen	1f61aee5ca	[SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.	2019-06-28 10:11:56 -07:00
Mehdi Goli	7d08fa805a	[SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.	2019-06-28 10:08:23 +01:00
Mehdi Goli	16a56b2ddd	[SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL. * Adding SYCL memory model * Enabling/Disabling SYCL backend in Core * Supporting Vectorization	2019-06-27 12:25:09 +01:00
Christoph Hertzberg	adec097c61	Remove extra comma (causes warnings in C++03)	2019-06-26 16:14:28 +02:00
Eugene Zhulenev	229db81572	Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp	2019-06-25 15:41:37 -07:00
Deven Desai	ba506d5bd2	fix for a ROCm/HIP-specific compile error introduced by a recent commit.	2019-06-22 00:06:05 +00:00
Rasmus Munk Larsen	c9394d7a0e	Remove extra "one" in comment.	2019-06-20 16:23:19 -07:00
Rasmus Munk Larsen	b8f8dac4eb	Update comment as suggested by tra@google.com.	2019-06-20 16:18:37 -07:00
Rasmus Munk Larsen	e5e63c2cad	Fix grammar.	2019-06-20 16:03:59 -07:00
Rasmus Munk Larsen	302a404b7e	Added comment explaining the surprising EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC clause.	2019-06-20 15:59:08 -07:00
Rasmus Munk Larsen	b5237f53b1	Fix CUDA build on Mac.	2019-06-20 15:44:14 -07:00
Rasmus Munk Larsen	988f24b730	Various fixes for packet ops. 1. Fix buggy pcmp_eq and unit test for half types. 2. Add unit test for pselect and add specializations for SSE 4.1, AVX512, and half types. 3. Get rid of FIXME: Implement faster pnegate for half by XOR'ing with a sign bit mask.	2019-06-20 11:47:49 -07:00
Christoph Hertzberg	e0be7f30e1	bug #1724 : Mask buggy warnings with g++-7 (grafted from `427f2f66d6` )	2019-06-14 14:57:46 +02:00
Rasmus Munk Larsen	6d432eae5d	Make is_valid_index_type return false for float and double when EIGEN_HAS_TYPE_TRAITS is off.	2019-06-05 16:42:27 -07:00
Rasmus Munk Larsen	f715f6e816	Add workaround for choosing the right include files with FP16C support with clang.	2019-06-05 13:36:37 -07:00
Justin Carpentier	ffaf658ecd	PR 655: Fix missing Eigen namespace in Macros	2019-06-05 09:51:59 +02:00
Mehdi Goli	0b24e1cb5c	[SYCL] Adding the SYCL memory model. The SYCL memory model provides : * an interface for SYCL buffers to behave as a non-dereferenceable pointer * an interface for placeholder accessor to behave like a pointer on both host and device	2019-07-01 16:02:30 +01:00
Rasmus Larsen	c1b0aea653	Merged in Artem-B/eigen (pull request PR-654) Minor build improvements Approved-by: Rasmus Larsen <rmlarsen@google.com>	2019-05-31 22:27:04 +00:00
Rasmus Munk Larsen	b08527b0c1	Clean up CUDA/NVCC version macros and their use in Eigen, and a few other CUDA build failures.	2019-05-31 15:26:06 -07:00
tra	b4c49bf00e	Minor build improvements * Allow specifying multiple GPU architectures. E.g.: cmake -DEIGEN_CUDA_COMPUTE_ARCH="60;70" * Pass CUDA SDK path to clang. Without it, clang defaults to /usr/local/cuda, which may not be the right location if cmake was invoked with -DCUDA_TOOLKIT_ROOT_DIR=/some/other/CUDA/path	2019-05-31 14:08:34 -07:00
Christoph Hertzberg	5614400581	digits10() needs to return an integer Problem reported on https://stackoverflow.com/questions/56395899	2019-05-31 15:45:41 +02:00
Rasmus Larsen	36e0a2b93f	Merged in deven-amd/eigen-hip-fix-190524 (pull request PR-649) fix for HIP build errors that were introduced by a commit earlier this week	2019-05-24 16:05:31 +00:00
Deven Desai	2c38930161	fix for HIP build errors that were introduced by a commit earlier this week	2019-05-24 14:25:32 +00:00

10630 Commits