Rasmus Munk Larsen
ee404667e2
Rollback of PR-746 and partial rollback of 668ab3fc47.
...
std::array is still not supported in CUDA device code on Windows.
2019-11-05 17:17:58 -08:00
Joel Holdsworth
743c925286
test/packetmath: Silence alignment warnings
2019-11-05 19:06:12 +00:00
Rasmus Larsen
0c9745903a
Merged in ezhulenev/eigen-01 (pull request PR-746)
...
Remove internal::smart_copy and replace with std::copy
2019-11-04 20:18:38 +00:00
Hans Johnson
8c8cab1afd
STYLE: Convert CMake-language commands to lower case
...
Ancient CMake versions required upper-case commands. Later command names
became case-insensitive. Now the preferred style is lower-case.
2019-10-31 11:36:37 -05:00
Hans Johnson
6fb3e5f176
STYLE: Remove CMake-language block-end command arguments
...
Ancient versions of CMake required else(), endif(), and similar block
termination commands to have arguments matching the command starting the block.
This is no longer the preferred style.
2019-10-31 11:36:27 -05:00
Rasmus Munk Larsen
f1e8307308
1. Fix a bug in psqrt and make it return 0 for +inf arguments.
...
2. Simplify handling of special cases by taking advantage of the fact that the
builtin vrsqrt approximation handles negative, zero and +inf arguments correctly.
This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:
Before: y = y * (1.5 - x/2 * y^2)
After: y = y * (1.5 - y * (x/2) * y)
Forming y^2 explicitly can overflow or underflow for very small (denormalized) or very large values of x, whereas the reordered product stays in range because y * (x/2) ~= sqrt(x)/2 and x * y^2 ~= 1. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.
4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.
Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
2019-11-15 17:09:46 -08:00
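The reordering in item 3 of the commit above can be illustrated with a scalar sketch. This is not the packet code from the commit: the function names `rsqrt_step_before` and `rsqrt_step_after` are hypothetical, and the sketch assumes plain IEEE single-precision arithmetic without flush-to-zero.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for y ~= 1/sqrt(x), written as in the "Before"
// formula: y * (1.5 - x/2 * y^2). The explicit y * y can overflow to +inf
// when x is tiny and y is correspondingly huge.
inline float rsqrt_step_before(float x, float y) {
  assert(x > 0.0f);
  const float y2 = y * y;  // intermediate that may leave the float range
  return y * (1.5f - (0.5f * x) * y2);
}

// The same step written as in the "After" formula: y * (1.5 - y * (x/2) * y).
// y * (x/2) is roughly sqrt(x)/2, and multiplying by y again gives roughly 1/2,
// so the intermediates stay well scaled.
inline float rsqrt_step_after(float x, float y) {
  assert(x > 0.0f);
  const float t = y * (0.5f * x);  // ~ sqrt(x) / 2
  return y * (1.5f - t * y);       // t * y ~ 1/2
}
```

For a denormal input such as `x = 1e-42f`, a starting guess near 1e21 makes `y * y` about 1e42 in the first form, which exceeds the largest finite float (about 3.4e38). The second form never forms that product.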
Gael Guennebaud
2cb2915f90
bug #1744 : fix compilation with MSVC 2017 and AVX512: plog1p/pexpm1 require plog/pexp, but the latter were disabled on some compilers
2019-11-15 13:39:51 +01:00
Gael Guennebaud
c3f6fcf2c0
bug #1747 : one more fix for MSVC regarding the Bessel implementation.
2019-11-15 11:12:35 +01:00
Gael Guennebaud
b9837ca9ae
bug #1281 : fix AutoDiffScalar's make_coherent for nested expression of constant ADs.
2019-11-14 14:58:08 +01:00
Gael Guennebaud
0fb6e24408
Fix case issue with Lapack unit tests
2019-11-14 14:16:05 +01:00
Gael Guennebaud
8af045a287
bug #1774 : fix VectorwiseOp::begin()/end() return types regarding constness.
2019-11-14 11:45:52 +01:00
Sakshi Goynar
75b4c0a3e0
PR 751: Fixed compilation issue when compiling using MSVC with /arch:AVX512 flag
2019-10-31 16:09:16 -07:00
Gael Guennebaud
8496f86f84
Enable CompleteOrthogonalDecomposition::pseudoInverse with non-square fixed-size matrices.
2019-11-13 21:16:53 +01:00
Gael Guennebaud
002e5b6db6
Move to my.cdash.org
2019-11-13 13:33:49 +01:00
Eugene Zhulenev
13c3327f5c
Remove legacy block evaluation support
2019-11-12 10:12:28 -08:00
Gael Guennebaud
71aa53dd6d
Disable AVX on broken xcode versions. See PR 748.
...
Patch adapted from Hans Johnson's PR 748.
2019-11-12 11:40:38 +01:00
Rasmus Munk Larsen
0ed0338593
Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs.
2019-11-11 12:26:41 -08:00
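The ordering rule in the commit above (finish all device work before signalling completion) can be sketched as follows. `FakeDevice` and `finish_async` are hypothetical names used only for illustration; they are not the tensor executor's actual types.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a device; it only counts deallocations.
struct FakeDevice {
  int* deallocations;
  void deallocate() { ++*deallocations; }
};

// Run every step that touches the device first, then invoke the completion
// callback last, because the callback is allowed to destroy the device.
inline void finish_async(FakeDevice& device, const std::function<void()>& on_done) {
  assert(on_done);      // a completion callback must be provided
  device.deallocate();  // last use of the device
  on_done();            // the device must not be touched after this call
}
```

If the two calls in `finish_async` were swapped, a callback that destroys the device would leave `deallocate()` running on a dead object.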
Eugene Zhulenev
c952b8dfda
Break loop dependence in TensorGenerator block access
2019-11-11 10:32:57 -08:00
Rasmus Munk Larsen
ebf04fb3e8
Fix data race in cxx11_tensor_notification test.
2019-11-08 17:44:50 -08:00
Eugene Zhulenev
73ecb2c57d
Cleanup includes in Tensor module after switch to C++11 and above
2019-10-29 15:49:54 -07:00
Eugene Zhulenev
e7ed4bd388
Remove internal::smart_copy and replace with std::copy
2019-10-29 11:25:24 -07:00
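A minimal sketch of the replacement described in the commit above, assuming the removed helper took a `(start, end, target)` pointer triple. That signature and the name `copy_range` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical wrapper with the assumed (start, end, target) shape.
// It forwards directly to std::copy instead of using a hand-written copy.
template <typename T>
void copy_range(const T* start, const T* end, T* target) {
  assert(start <= end);  // the source range must be well formed
  std::copy(start, end, target);
}
```

Common standard library implementations already lower `std::copy` on trivially copyable types to a block memory move, so a custom helper for that case is usually unnecessary.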
Eugene Zhulenev
fbc0a9a3ec
Fix CXX11Meta compilation with MSVC
2019-10-28 18:30:10 -07:00
Eugene Zhulenev
bd864ab42b
Prevent potential ODR in TensorExecutor
2019-10-28 15:45:09 -07:00
Mehdi Goli
6332aff0b2
This PR fixes:
...
* The specialization of array class in the different namespace for GCC<=6.4
* The implicit call to `std::array` constructor using the initializer list for GCC <=6.1
2019-10-23 15:56:56 +01:00
Rasmus Larsen
8e4e29ae99
Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738)
...
Fix for the HIP build+test errors.
2019-10-22 22:18:38 +00:00
Rasmus Munk Larsen
97c0c5d485
Add block evaluation V2 to TensorAsyncExecutor.
...
Add async evaluation to a number of ops.
2019-10-22 12:42:44 -07:00
Deven Desai
102cf2a72d
Fix for the HIP build+test errors.
...
The errors were introduced by this commit :
After the above-mentioned commit, some of the tests started failing with the following error:
```
Built target cxx11_tensor_reduction
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible
DestinationBufferKind m_kind;
^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible
DestinationBuffer m_destination;
^
```
For some reason, HIPCC does not like device code to contain enum types that do not have their base type explicitly declared. The fix is trivial: explicitly state "int" as the base type.
2019-10-22 19:21:27 +00:00
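The fix described in the commit above amounts to giving the enum an explicit underlying type. A minimal sketch follows. The enum name is taken from the error log, but the enumerator names are illustrative assumptions.

```cpp
#include <type_traits>

// Before (sketch): enum DestinationBufferKind { ... };
// The underlying type was left for the compiler to choose.
// After (sketch): the underlying type is stated explicitly as int.
enum DestinationBufferKind : int {
  kEmpty,       // illustrative enumerator names
  kContiguous,
  kStrided
};

// Compile-time check that the underlying type is exactly int.
static_assert(
    std::is_same<std::underlying_type<DestinationBufferKind>::type, int>::value,
    "DestinationBufferKind must have int as its underlying type");
```

The `: int` syntax requires C++11. That is consistent with the log, which shows C++03 support being dropped from the tensor module in 668ab3fc47.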
Rasmus Munk Larsen
668ab3fc47
Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers.
2019-10-18 16:42:00 -07:00
Eugene Zhulenev
df0e8b8137
Propagate block evaluation preference through rvalue tensor expressions
2019-10-17 11:17:33 -07:00
Eugene Zhulenev
0d2a14ce11
Cleanup Tensor block destination and materialized block storage allocation
2019-10-16 17:14:37 -07:00
Eugene Zhulenev
02431cbe71
TensorBroadcasting support for random/uniform blocks
2019-10-16 13:26:28 -07:00
Eugene Zhulenev
d380c23b2c
Block evaluation for TensorGenerator/TensorReverse/TensorShuffling
2019-10-14 14:31:59 -07:00
Gael Guennebaud
39fb9eeccf
bug #1747 : fix compilation with MSVC
2019-10-14 22:50:23 +02:00
Eugene Zhulenev
a411e9f344
Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op
2019-10-10 10:56:58 -07:00
Rasmus Larsen
b03eb63d7c
Merged in ezhulenev/eigen-01 (pull request PR-726)
...
Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
2019-10-10 16:58:11 +00:00
Gael Guennebaud
e7d8ba747c
bug #1752 : make is_convertible behave like its std c++11 counterpart and fall back to std::is_convertible when c++11 is enabled.
2019-10-10 17:41:47 +02:00
Gael Guennebaud
fb557aec5c
bug #1752 : disable some is_convertible tests for recent compilers.
2019-10-10 11:40:21 +02:00
Eugene Zhulenev
33e1746139
Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
2019-10-09 12:45:31 -07:00
Gael Guennebaud
f0a4642bab
Implement c++03 compatible fix for changeset 7a43af1a33
2019-10-09 16:00:57 +02:00
Gael Guennebaud
196de2efe3
Explicitly bypass resize and memmoves when there is already the exact right number of elements available.
2019-10-08 21:44:33 +02:00
Gael Guennebaud
36da231a41
Disable an expected warning in unit test
2019-10-08 16:28:14 +02:00
Gael Guennebaud
d1def335dc
Fix one more possible conflict with real/imag
2019-10-08 16:19:10 +02:00
Gael Guennebaud
87427d2eaa
PR 719: fix real/imag namespace conflict
2019-10-08 09:15:17 +02:00
Gael Guennebaud
7a43af1a33
Fix compilation of FFTW unit test
2019-10-08 08:58:35 +02:00
Eugene Zhulenev
f74ab8cb8d
Add block evaluation to TensorEvalTo and fix a few small bugs
2019-10-07 15:34:26 -07:00
Brian Zhao
3afb640b56
Fixing incorrect size in Tensor documentation.
2019-10-04 21:30:35 -07:00
Rasmus Munk Larsen
20c4a9118f
Use "pdiv" rather than operator/ to support packet types.
2019-10-04 16:54:03 -07:00
Rasmus Larsen
d1dd51cb5f
Merged in ezhulenev/eigen-01 (pull request PR-723)
...
Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect
Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-10-04 17:19:13 +00:00
Eugene Zhulenev
98bdd7252e
Fix compilation warnings and errors with clang in TensorBlockV2 code and tests
2019-10-04 10:15:33 -07:00
Rasmus Munk Larsen
fab4e3a753
Address comments on Chebyshev evaluation code:
...
1. Use pmadd when possible.
2. Add casts to avoid c++03 warnings.
2019-10-02 12:48:17 -07:00