eigen/unsupported/Eigen
Deven Desai 46f8a18567 Adding an explicit launch_bounds(1024) attribute for GPU kernels.
Starting with ROCm 3.5, the HIP compiler will change from HCC to hip-clang.

This compiler change introduces a change in the default value of the `__launch_bounds__` attribute associated with a GPU kernel. (The default value is the value the compiler assumes for the `__launch_bounds__` attribute when the user does not specify it explicitly.)

Currently (i.e. for HIP with ROCm 3.3 and older), the default value is 1024. That changes to 256 with ROCm 3.5 (i.e. the hip-clang compiler). As a consequence, if a GPU kernel with a `__launch_bounds__` attribute of 256 is launched at runtime with a threads_per_block value > 256, the launch fails with a runtime error. This is causing a couple of Eigen unit test failures with ROCm 3.5.

This commit adds an explicit `__launch_bounds__(1024)` attribute to every GPU kernel that does not currently specify one explicitly (and hence would otherwise get the default value of 256 with the change to hip-clang).
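
To make the change described above concrete, here is a minimal sketch of what an explicitly bounded kernel can look like. It is not taken from the Eigen sources: `EXAMPLE_LAUNCH_BOUNDS_1024`, `scale_kernel`, and `launch_fits` are hypothetical names chosen for illustration. The kernel is compiled only under a GPU compiler (hipcc or nvcc). The host-side helper only models the rule stated in the commit message, namely that a launch is valid when threads_per_block does not exceed the kernel's launch bound.

```cpp
#if defined(__HIPCC__)
#include <hip/hip_runtime.h>  // HIP builds need the runtime header for __global__, threadIdx, etc.
#endif

// Hypothetical macro (an assumption for this sketch, not Eigen's own macro):
// expands to the attribute under a GPU compiler and to nothing on the host.
#if defined(__HIPCC__) || defined(__CUDACC__)
#define EXAMPLE_LAUNCH_BOUNDS_1024 __launch_bounds__(1024)
#else
#define EXAMPLE_LAUNCH_BOUNDS_1024
#endif

#if defined(__HIPCC__) || defined(__CUDACC__)
// Hypothetical kernel: the explicit bound lets it be launched with up to
// 1024 threads per block regardless of the compiler's default.
__global__ void EXAMPLE_LAUNCH_BOUNDS_1024 scale_kernel(float* data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= factor;
}
#endif

// Host-side model of the rule from the commit message: a launch is valid
// only if threads_per_block does not exceed the kernel's launch bound.
constexpr bool launch_fits(int threads_per_block, int launch_bounds) {
  return threads_per_block > 0 && threads_per_block <= launch_bounds;
}

// 512 threads per block fit an explicit bound of 1024 ...
static_assert(launch_fits(512, 1024), "512 threads fit a bound of 1024");
// ... but exceed the hip-clang default of 256 described above.
static_assert(!launch_fits(512, 256), "512 threads exceed a bound of 256");
```

Wrapping the attribute in a macro keeps the same source compiling with a plain host compiler, where `__launch_bounds__` is not available.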
2020-08-05 01:46:34 +00:00
..
CXX11 Adding an explicit launch_bounds(1024) attribute for GPU kernels. 2020-08-05 01:46:34 +00:00
src Support BFloat16 in Eigen 2020-06-20 19:16:24 +00:00
AdolcForward
AlignedVector3 fix AlignedVector3 inconsistent interface with other Vector classes, default constructor and operator- were missing. 2019-12-06 21:07:39 +01:00
ArpackSupport
AutoDiff
BVH
CMakeLists.txt
EulerAngles
FFT
IterativeSolvers
KroneckerProduct
LevenbergMarquardt
MatrixFunctions
MoreVectorization
MPRealSupport
NonLinearOptimization
NumericalDiff
OpenGLSupport
Polynomials
Skyline
SparseExtra
SpecialFunctions Support BFloat16 in Eigen 2020-06-20 19:16:24 +00:00
Splines