Gael Guennebaud
4263f23c28
Improve doc on multi-threading and warn about hyper-threading
2018-11-14 14:42:29 +01:00
Gael Guennebaud
db529ae4ec
doxygen does not like \addtogroup and \ingroup in the same line
2018-11-14 14:42:06 +01:00
Rasmus Munk Larsen
72928a2c8a
Merged in rmlarsen/eigen2 (pull request PR-543)
...
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
Approved-by: Eugene Zhulenev <ezhulenev@google.com>
2018-11-13 17:10:30 +00:00
Rasmus Munk Larsen
cda479d626
Remove accidental changes.
2018-11-12 18:34:04 -08:00
Rasmus Munk Larsen
719d9aee65
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
2018-11-12 17:46:02 -08:00
Rasmus Munk Larsen
77b447c24e
Add optimized version of logistic function for float. As an example, this is about 50% faster than the existing version on Haswell using AVX.
2018-11-12 13:42:24 -08:00
Gael Guennebaud
c81bdbdadc
Add manual doc on STL-compatible iterators
2018-11-12 22:06:33 +01:00
Gael Guennebaud
0105146915
Fix warning in c++03
2018-11-10 09:11:38 +01:00
Rasmus Munk Larsen
93f9988a7e
A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) support matrix exponential on platforms with 113 bits of mantissa for long doubles.
2018-11-09 14:15:32 -08:00
Gael Guennebaud
784a3f13cf
bug #1619 : fix mixing of const and non-const generic iterators
2018-11-09 21:45:10 +01:00
Gael Guennebaud
db9a9a12ba
bug #1619 : make const and non-const iterators compatible
2018-11-09 16:49:19 +01:00
Gael Guennebaud
fbd6e7b025
add missing ref to a.zeta(b)
2018-11-09 13:53:42 +01:00
Gael Guennebaud
dffd1e11de
Limit the size of the toc
2018-11-09 13:52:34 +01:00
Gael Guennebaud
a88e0a0e95
Update doxy hacks wrt doxygen 1.8.13/14
2018-11-09 13:52:10 +01:00
Gael Guennebaud
bd9a00718f
Let doxygen see lastN
2018-11-09 11:35:48 +01:00
Gael Guennebaud
d7c644213c
Add and update manual pages for slicing, indexing, and reshaping.
2018-11-09 11:35:27 +01:00
Gael Guennebaud
a368848473
Recent Xcode versions do support EIGEN_HAS_STATIC_ARRAY_TEMPLATE
2018-11-09 10:33:17 +01:00
Gael Guennebaud
f62a0f69c6
Fix max-size in indexed-view
2018-11-08 18:40:22 +01:00
Gael Guennebaud
bf495859ff
Merged in glchaves/eigen (pull request PR-539)
...
Vectorize row-by-row gebp loop iterations on 16 packets as well
2018-11-07 07:21:15 +00:00
Gael Guennebaud
995730fc6c
Add option to disable plot generation
2018-11-07 00:41:16 +01:00
Gustavo Lima Chaves
4ad359237a
Vectorize row-by-row gebp loop iterations on 16 packets as well
...
Signed-off-by: Gustavo Lima Chaves <gustavo.lima.chaves@intel.com>
Signed-off-by: Mark D. Ryan <mark.d.ryan@intel.com>
2018-11-06 10:48:42 -08:00
Gael Guennebaud
9d318b92c6
add unit tests for bug #1619
2018-11-01 15:14:50 +01:00
Matthieu Vigne
8d7a73e48e
bug #1617 : Fix SolveTriangular.solveInPlace crashing for empty matrix.
...
This made FullPivLU.kernel() crash when used on the zero matrix.
Add unit test for FullPivLU.kernel() on the zero matrix.
2018-10-31 20:28:18 +01:00
Christoph Hertzberg
66b28e290d
bug #1618 : Use different power-of-2 check to avoid MSVC warning
2018-11-01 13:23:19 +01:00
Rasmus Munk Larsen
07fcdd1438
Merged in ezhulenev/eigen-02 (pull request PR-534)
...
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 18:34:35 +00:00
Eugene Zhulenev
8a977c1f46
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 11:31:29 -07:00
Halie Murray-Davis
fb62d6d96e
Fix typo in tutorial documentation.
2018-10-25 04:55:34 +00:00
Christoph Hertzberg
b5f077d22c
Document EIGEN_NO_IO preprocessor directive
2018-10-25 16:49:25 +02:00
Christian von Schultz
4a40b3785d
Collapsed revision (based on pull request PR-325)
...
* Support compiling without IO streams
Add the preprocessor definition EIGEN_NO_IO which, if defined,
disables all use of the IO streams part of the standard library.
2018-10-22 21:14:40 +02:00
Rasmus Munk Larsen
14054e217f
Do not rely on the compiler generating __device__ functions for constexpr in CUDA (via EIGEN_CONSTEXPR_ARE_DEVICE_FUNC). This breaks several targets in the TensorFlow CUDA build, e.g.,
...
INFO: From Compiling tensorflow/core/kernels/maxpooling_op_gpu.cu.cc:
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNHWC< ::Eigen::half> ") is not allowed
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code"
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNCHW< ::Eigen::half> ") is not allowed
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code
4 errors detected in the compilation of "/tmp/tmpxft_00000011_00000000-6_maxpooling_op_gpu.cu.cpp1.ii".
ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: output 'tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o' was not created
ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: Couldn't build file tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o: not all outputs were created or valid
2018-10-22 16:18:24 -07:00
Rasmus Munk Larsen
954b4ca9d0
Suppress compiler warning about unused global variable.
2018-10-22 13:48:56 -07:00
Rasmus Munk Larsen
9caafca550
Merged in rmlarsen/eigen (pull request PR-532)
...
Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available.
2018-10-19 21:37:14 +00:00
Christoph Hertzberg
449ff74672
Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file).
...
Manually grafted from d107a371c6
2018-10-19 21:10:28 +02:00
Rasmus Munk Larsen
39fec15d5c
Merged eigen/eigen into default
2018-10-19 09:48:19 -07:00
Christoph Hertzberg
40fa6f98bf
bug #1606 : Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 with EIGEN_COMPILER_SUPPORT_CPP11.
...
Grafted manually from a4afa90d16
2018-10-19 17:20:51 +02:00
Rasmus Munk Larsen
d8f285852b
Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available.
2018-10-18 16:55:02 -07:00
Rasmus Munk Larsen
dda68f56ec
Fix GPU build due to gpu_assert not always being defined.
2018-10-18 16:29:29 -07:00
Gael Guennebaud
1dcf5a6ed8
fix typo in doc
2018-10-17 09:29:36 +02:00
Eugene Zhulenev
9e96e91936
Move from rvalue arguments in ThreadPool enqueue* methods
2018-10-16 16:48:32 -07:00
Eugene Zhulenev
217d839816
Reduce thread scheduling overhead in parallelFor
2018-10-16 14:53:06 -07:00
Rasmus Munk Larsen
d52763bb4f
Merged in ezhulenev/eigen-02 (pull request PR-528)
...
[TensorBlockIO] Check if it's allowed to squeeze inner dimensions
Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-10-16 15:39:40 +00:00
Gael Guennebaud
0f780bb0b4
Fix float-to-double warning
2018-10-16 09:19:45 +02:00
Eugene Zhulenev
900c7c61bb
Check if it's allowed to squeeze inner dimensions in TensorBlockIO
2018-10-15 16:52:33 -07:00
Gael Guennebaud
a39e0f7438
bug #1612 : fix regression in "outer-vectorization" of partial reductions for PacketSize==1 (aka complex<double>)
2018-10-16 01:04:25 +02:00
Gael Guennebaud
e3b85771d7
Show call stack in case of failing sparse solving.
2018-10-16 00:43:44 +02:00
Gael Guennebaud
d2d570c116
Remove useless (and broken) resize
2018-10-16 00:42:48 +02:00
Gael Guennebaud
f0fb95135d
Iterative solvers: unify and fix handling of multiple rhs.
...
m_info was not properly computed and the logic was repeated in several places.
2018-10-15 23:47:46 +02:00
Gael Guennebaud
2747b98cfc
DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve
2018-10-15 23:46:00 +02:00
Gael Guennebaud
d835a0bf53
relax number of iterations checks to avoid false negatives
2018-10-15 10:23:32 +02:00
Gael Guennebaud
3a33db4de5
merge
2018-10-15 09:22:27 +02:00