Gael Guennebaud
98728312c8
Fix compilation regarding std::array
2018-07-12 17:00:37 +02:00
Gael Guennebaud
eb3d8f68bb
fix unused warning
2018-07-12 16:59:47 +02:00
Gael Guennebaud
006e18e52b
Clean up the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h),
...
and alignment/vectorization logic is now in util/ConfigureVectorization.h
2018-07-12 16:57:41 +02:00
Julian Kent
6d451cf2b6
Add missing consts for rows and cols functions in SparseLU
2018-02-10 13:44:05 +01:00
Gael Guennebaud
8bdb214fd0
remove double ;;
2018-07-12 11:17:53 +02:00
Gael Guennebaud
a9060378d3
bug #1570 : fix warning
2018-07-12 11:07:09 +02:00
Gael Guennebaud
da0c604078
Merged in deven-amd/eigen (pull request PR-402)
...
Adding support for using Eigen in HIP kernels.
2018-07-12 08:07:16 +00:00
Gael Guennebaud
a4ea611ca7
Remove useless specialization thanks to is_convertible being more robust.
2018-07-12 09:59:44 +02:00
Gael Guennebaud
8ef267ccbd
spellcheck
2018-07-12 09:58:29 +02:00
Gael Guennebaud
21cf4a1a8b
Make is_convertible more robust and conformant to std::is_convertible
2018-07-12 09:57:19 +02:00
Gael Guennebaud
8a5955a052
Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using a faster blocked product.
2018-07-11 17:16:50 +02:00
Gael Guennebaud
d193cc87f4
Fix regression in 9357838f94
2018-07-11 17:09:23 +02:00
Gael Guennebaud
fb33687736
Fix double ;;
2018-07-11 17:08:30 +02:00
Deven Desai
876f392c39
Updates corresponding to the latest round of PR feedback
...
The major changes are
1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
The above three changes effectively enable the Eigen "Packet" layer for the HIP platform
4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places
The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
471cfe5ff7
renaming CUDA* to GPU* for some header files
2018-07-11 09:22:04 -04:00
Deven Desai
38807a2575
merging updates from upstream
2018-07-11 09:17:33 -04:00
Gael Guennebaud
f00d08cc0a
Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix.
2018-07-11 14:01:47 +02:00
Gael Guennebaud
1625476091
Add internal::is_identity compile-time helper
2018-07-11 14:00:24 +02:00
Gael Guennebaud
fe723d6129
Fix conversion warning
2018-07-10 09:10:32 +02:00
Gael Guennebaud
9357838f94
bug #1543 : improve linear indexing for general block expressions
2018-07-10 09:10:15 +02:00
Gael Guennebaud
de9e31a06d
Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer products make good use of it.
...
If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
2018-07-09 15:41:14 +02:00
Gael Guennebaud
ec323b7e66
Skip null numerators in triangular-vector-solve (as in BLAS TRSV).
2018-07-09 11:13:19 +02:00
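The "skip null numerators" idea from the commit above can be sketched in plain C++ without Eigen. This is only an illustration of the column-oriented forward substitution used by reference BLAS TRSV, not the code from the commit; the function name `lower_solve_in_place` and the flat column-major storage are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Eigen's API): solves L * x = b in place, where L is
// an n x n lower-triangular matrix stored column-major in a flat vector.
// On entry x holds b; on exit it holds the solution.
inline void lower_solve_in_place(const std::vector<double>& L, std::size_t n,
                                 std::vector<double>& x) {
  assert(L.size() == n * n && x.size() == n);
  for (std::size_t j = 0; j < n; ++j) {
    // Null numerator: x[j] stays 0, so the division and the column update
    // below would change nothing. Skip both.
    if (x[j] == 0.0) continue;
    x[j] /= L[j + j * n];
    // Eliminate column j from the remaining right-hand side entries.
    for (std::size_t i = j + 1; i < n; ++i) {
      x[i] -= x[j] * L[i + j * n];
    }
  }
}
```

Besides saving work on sparse right-hand sides, the skip also means a zero numerator is never divided by the diagonal entry, so a zero on the diagonal does not turn that entry into NaN.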
Gael Guennebaud
359dd77ec3
Fix legitimate "declaration shadows a typedef" warning
2018-07-09 11:03:39 +02:00
Mark D Ryan
90a53ca6fd
Fix the Packet16h version of ptranspose
...
The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was
reordering the PacketBlock argument incorrectly. This led to errors in
the multiplication of matrices composed of 16 bit floats on AVX512
machines, if at least one of the matrices was using RowMajor order. This
error is responsible for one tensorflow unit test failure on AVX512
machines:
//tensorflow/python/kernel_tests:batch_matmul_op_test
2018-06-16 15:13:06 -07:00
Gael Guennebaud
1f54164eca
Fix a few issues with Packet16h
2018-07-07 00:15:07 +02:00
Gael Guennebaud
f2dc048df9
complete implementation of Packet16h (AVX512)
2018-07-06 17:43:11 +02:00
Gael Guennebaud
f4d623ffa7
Complete Packet8h implementation and test it in packetmath unit test
2018-07-06 17:13:36 +02:00
Deven Desai
b6cc0961b1
updates based on PR feedback
...
There are two major changes (and a few minor ones which are not listed here; see the PR discussion for details)
1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`
After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.
2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
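The macro equivalences listed in the commit above follow a common pattern: fold two platform-specific macros into one platform-neutral macro so that shared code needs a single check. A minimal sketch of that pattern follows. The `SKETCH_*` names and the two helper functions are hypothetical and are not Eigen's macros; the sketch keys directly off the compiler-provided macros `__CUDACC__`, `__HIPCC__`, `__CUDA_ARCH__` and `__HIP_DEVICE_COMPILE__`, which may differ from how Eigen derives its own definitions.

```cpp
// Sketch only: the SKETCH_* names are hypothetical stand-ins, not Eigen's macros.
// __CUDACC__ / __HIPCC__ are defined by nvcc / hipcc for the whole translation
// unit; __CUDA_ARCH__ / __HIP_DEVICE_COMPILE__ only while compiling device code.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_GPUCC 1
#endif

#if defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
#define SKETCH_GPU_COMPILE_PHASE 1
#endif

// Code shared by both platforms then tests one macro instead of two.
constexpr bool sketch_is_gpu_compiler() {
#if defined(SKETCH_GPUCC)
  return true;
#else
  return false;
#endif
}

constexpr bool sketch_is_device_pass() {
#if defined(SKETCH_GPU_COMPILE_PHASE)
  return true;
#else
  return false;
#endif
}
```

With an ordinary host compiler neither macro is defined, so both helpers return false.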
Deven Desai
ba972fb6b4
moving Half headers from CUDA dir to GPU dir, removing the HIP versions
2018-06-13 12:26:18 -04:00
Deven Desai
d1d22ef0f4
syncing this fork with upstream
2018-06-13 12:09:52 -04:00
Benoit Steiner
d3a380af4d
Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
...
Derivative of the incomplete Gamma function and the sample of a Gamma random variable
Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-11 17:57:47 +00:00
Andrea Bocci
f7124b3e46
Extend CUDA support to matrix inversion and SelfAdjointEigenSolver
2018-06-11 18:33:24 +02:00
Gael Guennebaud
0537123953
bug #1565 : help MSVC to generate not too bad ASM in reductions.
2018-07-05 09:21:26 +02:00
Gael Guennebaud
6a241bd8ee
Implement custom inplace triangular product to avoid a temporary
2018-07-03 14:02:46 +02:00
Gael Guennebaud
3ae2083e23
Make is_same_dense compatible with different scalar types.
2018-07-03 13:21:43 +02:00
Gael Guennebaud
047677a08d
Fix regression in changeset f05dea6b23
...
: computeFromHessenberg can take any expression for matrixQ, not only a HouseholderSequence.
2018-07-02 12:18:25 +02:00
Gael Guennebaud
d625564936
Simplify redux_evaluator using inheritance, and properly rename parameters in reducers.
2018-07-02 11:50:41 +02:00
Gael Guennebaud
d428a199ab
bug #1562 : optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x...
2018-07-02 11:41:09 +02:00
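Why rewriting s*A*B as s*(A.lazyProduct(B)) saves a temporary can be illustrated without Eigen. The sketch below (C++14 or later) contrasts a two-pass evaluation, which materializes A*B before scaling, with a single pass that computes each coefficient of the product on demand and scales it immediately. `Mat2` and both function names are hypothetical, and this is not how Eigen implements the rewrite internally.

```cpp
// Hypothetical 2x2 value type for illustration; not an Eigen class.
struct Mat2 {
  double v[2][2];
};

// Two passes: evaluate A*B into a temporary, then scale the temporary.
constexpr Mat2 scaled_product_with_temporary(double s, const Mat2& a, const Mat2& b) {
  Mat2 tmp{};
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      for (int k = 0; k < 2; ++k)
        tmp.v[i][j] += a.v[i][k] * b.v[k][j];
  Mat2 out{};
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      out.v[i][j] = s * tmp.v[i][j];
  return out;
}

// One pass: each coefficient of A*B is computed when needed and scaled right
// away, so no intermediate matrix is ever stored.
constexpr Mat2 scaled_product_lazy(double s, const Mat2& a, const Mat2& b) {
  Mat2 out{};
  for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 2; ++j) {
      double acc = 0.0;
      for (int k = 0; k < 2; ++k) acc += a.v[i][k] * b.v[k][j];
      out.v[i][j] = s * acc;
    }
  }
  return out;
}
```

Both functions return the same coefficients. For small matrices the cost of creating and filling the temporary is comparable to the product itself, which is consistent with the speedup the commit reports for small products.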
Gael Guennebaud
0cdacf3fa4
update comment
2018-06-29 11:28:36 +02:00
Gael Guennebaud
9a81de1d35
Fix order of EIGEN_DEVICE_FUNC and returned type
2018-06-28 00:20:59 +02:00
Gael Guennebaud
f9d337780d
First step towards a generic vectorised quaternion product
2018-06-25 14:26:51 +02:00
Gael Guennebaud
ee5864f72e
bug #1560 fix product with a 1x1 diagonal matrix
2018-06-25 10:30:12 +02:00
Rasmus Munk Larsen
bda71ad394
Fix typo in pblend for AltiVec.
2018-06-22 15:04:35 -07:00
Gael Guennebaud
d6813fb1c5
bug #1531 : expose NumDimensions for solve and sparse expressions.
2018-06-08 16:55:10 +02:00
Gael Guennebaud
89d65bb9d6
bug #1531 : expose NumDimensions for compatibility with Tensor
2018-06-08 16:50:17 +02:00
Gael Guennebaud
f05dea6b23
bug #1550 : prevent avoidable memory allocation in RealSchur
2018-06-08 10:14:57 +02:00
Benoit Steiner
522d3ca54d
Don't use std::equal_to inside cuda kernels since it's not supported.
2018-06-07 13:02:07 -07:00
Christoph Hertzberg
7d7bb91537
Missing line during manual rebase of PR-374
2018-06-07 20:30:09 +02:00
Michael Figurnov
30fa3d0454
Merge from eigen/eigen
2018-06-07 17:57:56 +01:00
Gael Guennebaud
af7c83b9a2
Fix warning
2018-06-07 15:45:24 +02:00