Commit Graph

5577 Commits

Author SHA1 Message Date
Gael Guennebaud
98728312c8 Fix compilation regarding std::array 2018-07-12 17:00:37 +02:00
Gael Guennebaud
eb3d8f68bb fix unused warning 2018-07-12 16:59:47 +02:00
Gael Guennebaud
006e18e52b Clean up the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h),
and alignment/vectorization logic is now in util/ConfigureVectorization.h
2018-07-12 16:57:41 +02:00
Julian Kent
6d451cf2b6 Add missing consts for rows and cols functions in SparseLU 2018-02-10 13:44:05 +01:00
Gael Guennebaud
8bdb214fd0 remove double ;; 2018-07-12 11:17:53 +02:00
Gael Guennebaud
a9060378d3 bug #1570: fix warning 2018-07-12 11:07:09 +02:00
Gael Guennebaud
da0c604078 Merged in deven-amd/eigen (pull request PR-402)
Adding support for using Eigen in HIP kernels.
2018-07-12 08:07:16 +00:00
Gael Guennebaud
a4ea611ca7 Remove useless specialization thanks to is_convertible being more robust. 2018-07-12 09:59:44 +02:00
Gael Guennebaud
8ef267ccbd spellcheck 2018-07-12 09:58:29 +02:00
Gael Guennebaud
21cf4a1a8b Make is_convertible more robust and conformant to std::is_convertible 2018-07-12 09:57:19 +02:00
Gael Guennebaud
8a5955a052 Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product. 2018-07-11 17:16:50 +02:00
Gael Guennebaud
d193cc87f4 Fix regression in 9357838f94 2018-07-11 17:09:23 +02:00
Gael Guennebaud
fb33687736 Fix double ;; 2018-07-11 17:08:30 +02:00
Deven Desai
876f392c39 Updates corresponding to the latest round of PR feedback
The major changes are

1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
    The above three changes effectively enable the Eigen "Packet" layer for the HIP platform

4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places

The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
471cfe5ff7 renaming CUDA* to GPU* for some header files 2018-07-11 09:22:04 -04:00
Deven Desai
38807a2575 merging updates from upstream 2018-07-11 09:17:33 -04:00
Gael Guennebaud
f00d08cc0a Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix. 2018-07-11 14:01:47 +02:00
Gael Guennebaud
1625476091 Add internal::is_identity compile-time helper 2018-07-11 14:00:24 +02:00
Gael Guennebaud
fe723d6129 Fix conversion warning 2018-07-10 09:10:32 +02:00
Gael Guennebaud
9357838f94 bug #1543: improve linear indexing for general block expressions 2018-07-10 09:10:15 +02:00
Gael Guennebaud
de9e31a06d Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer-products make good use of it.
If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
2018-07-09 15:41:14 +02:00
Gael Guennebaud
ec323b7e66 Skip null numerators in triangular-vector-solve (as in BLAS TRSV). 2018-07-09 11:13:19 +02:00
Gael Guennebaud
359dd77ec3 Fix legitimate "declaration shadows a typedef" warning 2018-07-09 11:03:39 +02:00
Mark D Ryan
90a53ca6fd Fix the Packet16h version of ptranspose
The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was
reordering the PacketBlock argument incorrectly.  This led to errors in
the multiplication of matrices composed of 16 bit floats on AVX512
machines, if at least one of the matrices was using RowMajor order.  This
error is responsible for one tensorflow unit test failure on AVX512
machines:

//tensorflow/python/kernel_tests:batch_matmul_op_test
2018-06-16 15:13:06 -07:00
Gael Guennebaud
1f54164eca Fix a few issues with Packet16h 2018-07-07 00:15:07 +02:00
Gael Guennebaud
f2dc048df9 complete implementation of Packet16h (AVX512) 2018-07-06 17:43:11 +02:00
Gael Guennebaud
f4d623ffa7 Complete Packet8h implementation and test it in packetmath unit test 2018-07-06 17:13:36 +02:00
Deven Desai
b6cc0961b1 updates based on PR feedback
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)

1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`

After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.

2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
Deven Desai
ba972fb6b4 moving Half headers from CUDA dir to GPU dir, removing the HIP versions 2018-06-13 12:26:18 -04:00
Deven Desai
d1d22ef0f4 syncing this fork with upstream 2018-06-13 12:09:52 -04:00
Benoit Steiner
d3a380af4d Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
Derivative of the incomplete Gamma function and the sample of a Gamma random variable

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-11 17:57:47 +00:00
Andrea Bocci
f7124b3e46 Extend CUDA support to matrix inversion and selfadjointeigensolver 2018-06-11 18:33:24 +02:00
Gael Guennebaud
0537123953 bug #1565: help MSVC to generate not too bad ASM in reductions. 2018-07-05 09:21:26 +02:00
Gael Guennebaud
6a241bd8ee Implement custom inplace triangular product to avoid a temporary 2018-07-03 14:02:46 +02:00
Gael Guennebaud
3ae2083e23 Make is_same_dense compatible with different scalar types. 2018-07-03 13:21:43 +02:00
Gael Guennebaud
047677a08d Fix regression in changeset f05dea6b23: computeFromHessenberg can take any expression for matrixQ, not only a HouseholderSequence.
2018-07-02 12:18:25 +02:00
Gael Guennebaud
d625564936 Simplify redux_evaluator using inheritance, and properly rename parameters in reducers. 2018-07-02 11:50:41 +02:00
Gael Guennebaud
d428a199ab bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x... 2018-07-02 11:41:09 +02:00
Gael Guennebaud
0cdacf3fa4 update comment 2018-06-29 11:28:36 +02:00
Gael Guennebaud
9a81de1d35 Fix order of EIGEN_DEVICE_FUNC and returned type 2018-06-28 00:20:59 +02:00
Gael Guennebaud
f9d337780d First step towards a generic vectorised quaternion product 2018-06-25 14:26:51 +02:00
Gael Guennebaud
ee5864f72e bug #1560 fix product with a 1x1 diagonal matrix 2018-06-25 10:30:12 +02:00
Rasmus Munk Larsen
bda71ad394 Fix typo in pbend for AltiVec. 2018-06-22 15:04:35 -07:00
Gael Guennebaud
d6813fb1c5 bug #1531: expose NumDimensions for solve and sparse expressions. 2018-06-08 16:55:10 +02:00
Gael Guennebaud
89d65bb9d6 bug #1531: expose NumDimensions for compatibility with Tensor 2018-06-08 16:50:17 +02:00
Gael Guennebaud
f05dea6b23 bug #1550: prevent avoidable memory allocation in RealSchur 2018-06-08 10:14:57 +02:00
Benoit Steiner
522d3ca54d Don't use std::equal_to inside cuda kernels since it's not supported. 2018-06-07 13:02:07 -07:00
Christoph Hertzberg
7d7bb91537 Missing line during manual rebase of PR-374 2018-06-07 20:30:09 +02:00
Michael Figurnov
30fa3d0454 Merge from eigen/eigen 2018-06-07 17:57:56 +01:00
Gael Guennebaud
af7c83b9a2 Fix warning 2018-06-07 15:45:24 +02:00