eigen/Eigen
Antonio Sanchez 45e67a6fda Use reinterpret_cast on GPU for bit_cast.
This seems to be the recommended approach for doing type punning in
CUDA. See for example
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/
(the latter puns a double to an `int2`).
The issue is that under CUDA the `memcpy` is not elided and ends up
being an expensive operation. We already have similar `reinterpret_cast`s across
the Eigen codebase for GPU (as does TensorFlow).
2021-10-20 21:34:40 +00:00
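A minimal sketch of the kind of device/host split the commit describes: on the device the bits are punned with `reinterpret_cast`, while the host path keeps the well-defined `memcpy` form. The helper name `gpu_bit_cast` is illustrative only and is not Eigen's actual internal API.

```cpp
#include <cstring>

// Illustrative helper, not Eigen's real implementation.
template <typename Target, typename Source>
__host__ __device__ Target gpu_bit_cast(const Source& src) {
  static_assert(sizeof(Target) == sizeof(Source),
                "bit_cast requires source and target of equal size");
#if defined(__CUDA_ARCH__)
  // Device path: nvcc does not reliably elide the memcpy, so pun the bits
  // directly via reinterpret_cast (the approach used in the links above).
  return *reinterpret_cast<const Target*>(&src);
#else
  // Host path: memcpy is the well-defined way to pun bits and is elided by
  // optimizing compilers.
  Target dst;
  std::memcpy(&dst, &src, sizeof(Target));
  return dst;
#endif
}
```

For example, a device function could pun a `double` into a `long long` with `gpu_bit_cast<long long>(x)` to shuffle or compare its raw bits, mirroring the double-to-`int2` punning in the NVIDIA reduction post linked above.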
src Use reinterpret_cast on GPU for bit_cast. 2021-10-20 21:34:40 +00:00
Cholesky
CholmodSupport
Core Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h 2021-10-20 12:00:19 -07:00
Dense
Eigen
Eigenvalues
Geometry
Householder
IterativeLinearSolvers
Jacobi
KLUSupport
LU
MetisSupport
OrderingMethods
PardisoSupport
PaStiXSupport
QR
QtAlignedMalloc
Sparse
SparseCholesky
SparseCore
SparseLU
SparseQR
SPQRSupport
StdDeque
StdList
StdVector
SuperLUSupport
SVD
UmfPackSupport