Commit Graph

10639 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
039ee52125 Tweak cost model for tensor contraction when parallelizing over the inner dimension.
https://bitbucket.org/snippets/rmlarsen/MexxLo
2019-04-12 13:35:10 -07:00
Jonathon Koyle
9a3f06d836 Update ThreadPoolDevice example to include ThreadPool creation and passing pointer into constructor. 2019-04-10 10:02:33 -06:00
Deven Desai
66a885b61e adding EIGEN_DEVICE_FUNC to the recently added TensorContractionKernel constructor. Not having the EIGEN_DEVICE_FUNC attribute on it was leading to compiler errors when compiling Eigen in the ROCm/HIP path 2019-04-08 13:45:08 +00:00
Eugene Zhulenev
629ddebd15 Add missing semicolon 2019-04-02 15:04:26 -07:00
Eugene Zhulenev
4e2f6de1a8 Add support for custom packed Lhs/Rhs blocks in tensor contractions 2019-04-01 11:47:31 -07:00
Gael Guennebaud
45e65fbb77 bug #1695: fix a numerical robustness issue. Computing the secular equation at the middle range without a shift might give a wrong sign. 2019-03-27 20:16:58 +01:00
William D. Irons
8de66719f9 Collapsed revision from PR-619
* Add support for pcmp_eq in AltiVec/Complex.h
* Fixed implementation of pcmp_eq for double

The new logic is based on the logic from NEON for double.
2019-03-26 18:14:49 +00:00
Gael Guennebaud
f11364290e ICC does not support -fno-unsafe-math-optimizations 2019-03-22 09:26:24 +01:00
Deven Desai
51e399fc15 updates requested in the PR feedback. Also dropping code within #ifdef EIGEN_HAS_OLD_HIP_FP16 2019-03-19 21:45:25 +00:00
Deven Desai
2dbea5510f Merged eigen/eigen into default 2019-03-19 16:52:38 -04:00
Rasmus Larsen
5c93b38c5f Merged in rmlarsen/eigen (pull request PR-618)
Make clipping outside [-18:18] consistent for vectorized and non-vectorized paths of scalar_logistic_op<float>.

Approved-by: Gael Guennebaud <g.gael@free.fr>
2019-03-18 15:51:55 +00:00
Gael Guennebaud
48898a988a fix unit test in c++03: c++03 does not allow passing local or anonymous enum as template param 2019-03-18 11:38:36 +01:00
Gael Guennebaud
cf7e2e277f bug #1692: enable enum as sizes of Matrix and Array 2019-03-17 21:59:30 +01:00
Rasmus Munk Larsen
e42f9aa68a Make clipping outside [-18:18] consistent for vectorized and non-vectorized paths of scalar_logistic_op<float>. 2019-03-15 17:15:14 -07:00
Rasmus Larsen
1936aac43f Merged in tellenbach/eigen/sykline_consistent_include_guards (pull request PR-617)
Fix include guard comments for Skyline module
2019-03-15 20:04:56 +00:00
David Tellenbach
bd9c2ae3fd Fix include guard comments 2019-03-15 15:29:17 +01:00
Rasmus Munk Larsen
8450a6d519 Clean up half packet traits and add a few more missing packet ops. 2019-03-14 15:18:06 -07:00
David Tellenbach
b013176e52 Remove undefined std::complex<int> 2019-03-14 11:40:28 +01:00
David Tellenbach
97f9a46cb9 PR 593: Add variadic ctor for DiagonalMatrix with unit tests 2019-03-14 10:18:24 +01:00
Gael Guennebaud
45ab514fe2 revert debug stuff 2019-03-14 10:08:12 +01:00
Rasmus Munk Larsen
6a34003141 Remove EIGEN_MPL2_ONLY guard in IncompleteCholesky that is no longer needed after the AMD reordering code was relicensed to MPL2. 2019-03-13 11:52:41 -07:00
Gael Guennebaud
d7d2f0680e bug #1684: partially work around clang 6/7 bug #40815 2019-03-13 10:40:01 +01:00
Rasmus Larsen
690f0795d0 Merged in rmlarsen/eigen (pull request PR-615)
Clean up PacketMathHalf.h and add a few missing logical packet ops.
2019-03-12 16:09:48 +00:00
Thomas Capricelli
1901433674 erm.. use proper id 2019-03-12 13:53:38 +01:00
Thomas Capricelli
90302aa8c9 update tracking code 2019-03-12 13:47:01 +01:00
Rasmus Munk Larsen
77f7d4a894 Clean up PacketMathHalf.h and add a few missing logical packet ops. 2019-03-11 17:51:16 -07:00
Eugene Zhulenev
001f10e3c9 Fix segfaults with cuda compilation 2019-03-11 09:43:33 -07:00
Eugene Zhulenev
899c16fa2c Fix a bug in TensorGenerator for 1d tensors 2019-03-11 09:42:01 -07:00
Eugene Zhulenev
0f8bfff23d Fix a data race in NonBlockingThreadPool 2019-03-11 09:38:44 -07:00
Gael Guennebaud
656d9bc66b Apply SSE's pmin/pmax fix for GCC <= 5 to AVX's pmin/pmax 2019-03-10 21:19:18 +01:00
Gael Guennebaud
2df4f00246 Change license from LGPL to MPL2 with agreement from David Harmon. 2019-03-07 18:17:10 +01:00
Rasmus Munk Larsen
3c3f639fe2 Merge. 2019-03-06 11:54:30 -08:00
Rasmus Munk Larsen
f4ec8edea8 Add macro EIGEN_AVOID_THREAD_LOCAL to make it possible to manually disable the use of thread_local. 2019-03-06 11:52:04 -08:00
Rasmus Munk Larsen
41cdc370d0 Fix placement of "#if defined(EIGEN_GPUCC)" guard region.
Found with -Wundefined-func-template.

Author: tkoeppe@google.com
2019-03-06 11:42:22 -08:00
Rasmus Munk Larsen
cc407c9d4d Fix placement of "#if defined(EIGEN_GPUCC)" guard region.
Found with -Wundefined-func-template.

Author: tkoeppe@google.com
2019-03-06 11:40:06 -08:00
Eugene Zhulenev
1bc2a0a57c Add missing return to NonBlockingThreadPool::LocalSteal 2019-03-06 10:49:49 -08:00
Eugene Zhulenev
4e4dcd9026 Remove redundant steal loop 2019-03-06 10:39:07 -08:00
Rasmus Larsen
4d808e834a Merged in rmlarsen/eigen_threadpool (pull request PR-606)
Remove EIGEN_MPL2_ONLY guards around code re-licensed from LGPL to MPL2 in 2ca1e73239


Approved-by: Sameer Agarwal <sameeragarwal@google.com>
2019-03-06 17:59:03 +00:00
Rasmus Larsen
2ea18e505f Merged in ezhulenev/eigen-01 (pull request PR-610)
Block evaluation for TensorGeneratorOp
2019-03-06 16:49:38 +00:00
Eugene Zhulenev
25abaa2e41 Check that inner block dimension is continuous 2019-03-05 17:34:35 -08:00
Eugene Zhulenev
5d9a6686ed Block evaluation for TensorGeneratorOp 2019-03-05 16:35:21 -08:00
Rasmus Larsen
b4861f4778 Merged in ezhulenev/eigen-01 (pull request PR-609)
Tune tensor contraction threadpool heuristics
2019-03-05 23:54:40 +00:00
Gael Guennebaud
bfbf7da047 bug #1689 fix used-but-marked-unused warning 2019-03-05 23:46:24 +01:00
Eugene Zhulenev
a407e022e6 Tune tensor contraction threadpool heuristics 2019-03-05 14:19:59 -08:00
Eugene Zhulenev
56c6373f82 Add an extra check for the RunQueue size estimate 2019-03-05 11:51:26 -08:00
Eugene Zhulenev
b1a8627493 Do not create Tensor<const T> in cxx11_tensor_forced_eval test 2019-03-05 11:19:25 -08:00
Rasmus Munk Larsen
0318fc7f44 Remove EIGEN_MPL2_ONLY guards around code re-licensed from LGPL to MPL2 in 2ca1e73239 2019-03-05 10:24:54 -08:00
Eugene Zhulenev
efb5080d31 Do not initialize invalid fast_strides in TensorGeneratorOp 2019-03-04 16:58:49 -08:00
Eugene Zhulenev
b95941e5c2 Add tiled evaluation for TensorForcedEvalOp 2019-03-04 16:02:22 -08:00
Eugene Zhulenev
694084ecbd Use fast divisors in TensorGeneratorOp 2019-03-04 11:10:21 -08:00