Mirror of https://gitlab.com/libeigen/eigen.git, synced 2024-12-21 07:19:46 +08:00
e2999d4c38
The change caused the device struct to be copied for each expression evaluation, and caused, e.g., a 10% regression in the TensorFlow multinomial op on GPU:

Benchmark | Time(ns) | CPU(ns) | Iterations | |
---|---|---|---|---|
BM_Multinomial_gpu_1_100000_4 | 128173 | 231326 | 2922 | 1.610G items/s |

VS

Benchmark | Time(ns) | CPU(ns) | Iterations | |
---|---|---|---|---|
BM_Multinomial_gpu_1_100000_4 | 146683 | 246914 | 2719 | 1.509G items/s |

| | | |
---|---|---|
.. | ||
CXX11 | ||
src | ||
AdolcForward | ||
AlignedVector3 | ||
ArpackSupport | ||
AutoDiff | ||
BVH | ||
CMakeLists.txt | ||
EulerAngles | ||
FFT | ||
IterativeSolvers | ||
KroneckerProduct | ||
LevenbergMarquardt | ||
MatrixFunctions | ||
MoreVectorization | ||
MPRealSupport | ||
NonLinearOptimization | ||
NumericalDiff | ||
OpenGLSupport | ||
Polynomials | ||
Skyline | ||
SparseExtra | ||
SpecialFunctions | ||
Splines |