Mirror of https://gitlab.com/libeigen/eigen.git (synced 2024-12-21 07:19:46 +08:00)
Commit 50df8d3d6d:

- The current implementation computes `size + total_threads`, which can overflow and cause `CUDA_ERROR_ILLEGAL_ADDRESS` when `size` is close to the maximum representable value.
- The `num_blocks` calculation can also overflow due to the implementation of `divup()`.
- This patch prevents these overflows and allows the kernel to work correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
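
The two overflows described in the commit message can be illustrated with a minimal host-side sketch. This is not the code from the patch: `safe_divup` and `count_visits` are hypothetical names introduced here, and the loop is a plain C++ model of a single thread's strided walk, written on the assumption that the index type is a signed integer and that the stride is positive.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not Eigen's divup()): ceil(x / y) for x >= 0, y > 0.
// It never forms x + y - 1, so it cannot exceed the maximum of T.
template <typename T>
constexpr T safe_divup(T x, T y) {
  return x == 0 ? T(0) : static_cast<T>((x - 1) / y + 1);
}

// Hypothetical model of one thread's strided loop over [first, size).
// Returns the number of indices visited. The usual form
//   for (i = first; i < size; i += step)
// can overflow in `i += step` when size is near the maximum of T;
// here the remaining distance is checked before advancing.
template <typename T>
std::int64_t count_visits(T first, T size, T step) {
  std::int64_t visits = 0;
  T i = first;
  while (i < size) {
    ++visits;                     // stand-in for "process element i"
    if (size - i <= step) break;  // i + step would reach or pass size
    i += step;                    // safe: i + step < size <= max of T
  }
  return visits;
}
```

The design choice in both helpers is the same: rearrange the arithmetic so that every intermediate value stays at or below `size`, rather than widening the index type.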

| Name |
|---|
| CXX11 |
| src |
| AdolcForward |
| AlignedVector3 |
| ArpackSupport |
| AutoDiff |
| BVH |
| CMakeLists.txt |
| EulerAngles |
| FFT |
| IterativeSolvers |
| KroneckerProduct |
| LevenbergMarquardt |
| MatrixFunctions |
| MoreVectorization |
| MPRealSupport |
| NonLinearOptimization |
| NumericalDiff |
| OpenGLSupport |
| Polynomials |
| Skyline |
| SparseExtra |
| SpecialFunctions |
| Splines |