Christoph Hertzberg
c9825b967e
Renaming even more `I` identifiers
2019-01-26 13:22:13 +01:00
Christoph Hertzberg
934b8a1304
Avoid `I` as an identifier, since it may clash with the C-header complex.h
2019-01-25 14:54:39 +01:00
Rasmus Munk Larsen
ee550a2ac3
Fix flaky test for tensor fft.
2019-01-16 14:03:12 -08:00
Eugene Zhulenev
1e6d15b55b
Fix shorten-64-to-32 warning in TensorContractionThreadPool
2019-01-11 11:41:53 -08:00
Eugene Zhulenev
0abe03764c
Fix shorten-64-to-32 warning in TensorContractionThreadPool
2019-01-10 10:27:55 -08:00
Gael Guennebaud
d812f411c3
bug #1654 : fix compilation with cuda and no c++11
2019-01-09 18:00:05 +01:00
Eugene Zhulenev
e70ffef967
Optimize evalShardedByInnerDim
2019-01-08 16:26:31 -08:00
Rasmus Munk Larsen
dd6d65898a
Fix shorten-64-to-32 warning. Use regular memcpy if num_threads==0.
2018-12-12 14:45:31 -08:00
Gael Guennebaud
cf697272e1
Remove debug code.
2018-12-09 23:05:46 +01:00
Gael Guennebaud
450dc97c6b
Various fixes in polynomial solver and its unit tests:
...
- cleanup noise in imaginary part of real roots
- take into account the magnitude of the derivative to check roots.
- use <= instead of < at appropriate places
2018-12-09 22:54:39 +01:00
Rasmus Munk Larsen
8a02883d58
Merged in markdryan/eigen/avx512-contraction-2 (pull request PR-554)
...
Fix tensor contraction on AVX512 builds
Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-12-05 18:19:32 +00:00
Mark D Ryan
36f8f6d0be
Fix evalShardedByInnerDim for AVX512 builds
...
evalShardedByInnerDim ensures that the values it passes for start_k and
end_k to evalGemmPartialWithoutOutputKernel are multiples of 8 as the kernel
does not work correctly when the values of k are not multiples of the
packet_size. While this precaution works for AVX builds, it is insufficient
for AVX512 builds where the maximum packet size is 16. The result is slightly
incorrect float32 contractions on AVX512 builds.
This commit fixes the problem by ensuring that k is always a multiple of
the packet_size if the packet_size is > 8.
2018-12-05 12:29:03 +01:00
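The rounding described in commit 36f8f6d0be can be sketched as follows. This is a minimal illustration, not Eigen's code: the helper names `round_up_to_multiple` and `sharded_block_size` are hypothetical. The only facts taken from the commit message are that the k boundaries must be multiples of 8 and, when the packet size exceeds 8, multiples of the packet size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Eigen): round `value` up to the next
// multiple of `multiple`.
inline std::ptrdiff_t round_up_to_multiple(std::ptrdiff_t value,
                                           std::ptrdiff_t multiple) {
  assert(value >= 0 && multiple > 0);
  return ((value + multiple - 1) / multiple) * multiple;
}

// Hypothetical helper: size of each shard of the inner dimension k, so that
// every shard boundary (start_k, end_k) is a multiple of max(8, packet_size).
inline std::ptrdiff_t sharded_block_size(std::ptrdiff_t k,
                                         std::ptrdiff_t num_blocks,
                                         std::ptrdiff_t packet_size) {
  assert(k > 0 && num_blocks > 0 && packet_size > 0);
  // Multiples of 8 suffice for AVX; wider packets (16 on AVX512) need more.
  const std::ptrdiff_t alignment = packet_size > 8 ? packet_size : 8;
  const std::ptrdiff_t raw_size = (k + num_blocks - 1) / num_blocks;
  return round_up_to_multiple(raw_size, alignment);
}
```

With k = 100 split three ways, a packet size of 8 gives shards of 40 elements, while a packet size of 16 gives shards of 48, so every interior boundary stays packet-aligned in both cases.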
Christoph Hertzberg
0ec8afde57
Fixed most conversion warnings in MatrixFunctions module
2018-11-20 16:23:28 +01:00
Deven Desai
e7e6809e6b
ROCm/HIP specific fixes + updates
...
1. Eigen/src/Core/arch/GPU/Half.h
Updating the HIPCC implementation of half so that it can be declared as a __shared__ variable
2. Eigen/src/Core/util/Macros.h, Eigen/src/Core/util/Memory.h
Introducing an EIGEN_USE_STD(func) macro that calls
- std::func by default
- ::func when eigen is being compiled with HIPCC
This change was requested in the previous HIP PR
(https://bitbucket.org/eigen/eigen/pull-requests/518/pr-with-hip-specific-fixes-for-the-eigen/diff )
3. unsupported/Eigen/CXX11/src/Tensor/TensorDeviceThreadPool.h
Removing EIGEN_DEVICE_FUNC attribute from pure virtual methods as it is not supported by HIPCC
4. unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
Disabling the template specializations of InnerMostDimReducer as they run into HIPCC link errors
2018-11-19 18:13:59 +00:00
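The dispatch macro described in item 2 of commit e7e6809e6b might look roughly like the sketch below. The macro name `SKETCH_USE_STD` and the two wrapper functions are hypothetical stand-ins, not Eigen's definitions. The sketch assumes that `__HIPCC__` is the predefined macro identifying the HIP compiler.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the macro described above: pick the global
// namespace version of a C library function under HIPCC, std:: otherwise.
#if defined(__HIPCC__)
  #define SKETCH_USE_STD(FUNC) ::FUNC
#else
  #define SKETCH_USE_STD(FUNC) std::FUNC
#endif

// Illustrative wrappers (not Eigen functions) that allocate and release a
// buffer of ints through the macro.
inline int* allocate_ints(std::size_t count) {
  assert(count > 0);
  void* raw = SKETCH_USE_STD(malloc)(count * sizeof(int));
  return static_cast<int*>(raw);
}

inline void release_ints(int* buffer) {
  SKETCH_USE_STD(free)(buffer);
}
```

Call sites then stay identical on both toolchains, and only the macro definition knows which namespace is in use.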
Rasmus Munk Larsen
72928a2c8a
Merged in rmlarsen/eigen2 (pull request PR-543)
...
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
Approved-by: Eugene Zhulenev <ezhulenev@google.com>
2018-11-13 17:10:30 +00:00
Rasmus Munk Larsen
cda479d626
Remove accidental changes.
2018-11-12 18:34:04 -08:00
Rasmus Munk Larsen
719d9aee65
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
2018-11-12 17:46:02 -08:00
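The idea behind commit 719d9aee65 (split a copy into blocks, but never use more than 4 workers) can be sketched as below. This is not the TensorThreadPoolDevice implementation: `blocked_memcpy` and its `schedule` callback are hypothetical, with the callback standing in for a thread pool's enqueue call. The caller is assumed to wait for all scheduled tasks before reading the destination.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>

// Hypothetical sketch: copy n bytes in at most 4 blocks, handing each block
// to `schedule`. Returns the number of blocks that were issued.
inline std::size_t blocked_memcpy(
    void* dst, const void* src, std::size_t n, std::size_t pool_threads,
    const std::function<void(std::function<void()>)>& schedule) {
  assert(dst != nullptr && src != nullptr);
  // Cap taken from the commit message: more than 4 threads only contend
  // for memory bandwidth.
  const std::size_t kMaxCopyThreads = 4;
  const std::size_t num_blocks = std::min(pool_threads, kMaxCopyThreads);
  if (num_blocks <= 1 || n < num_blocks) {
    std::memcpy(dst, src, n);  // Plain memcpy when splitting cannot help.
    return 1;
  }
  const std::size_t block = (n + num_blocks - 1) / num_blocks;
  std::size_t issued = 0;
  for (std::size_t begin = 0; begin < n; begin += block) {
    const std::size_t len = std::min(block, n - begin);
    char* d = static_cast<char*>(dst) + begin;
    const char* s = static_cast<const char*>(src) + begin;
    schedule([d, s, len] { std::memcpy(d, s, len); });
    ++issued;
  }
  return issued;
}
```

Passing a callback that runs each task inline is enough to exercise the partitioning logic; a real pool would run the blocks concurrently and synchronize afterwards.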
Rasmus Munk Larsen
93f9988a7e
A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) support matrix exponential on platforms with 113 bits of mantissa for long doubles.
2018-11-09 14:15:32 -08:00
Rasmus Munk Larsen
07fcdd1438
Merged in ezhulenev/eigen-02 (pull request PR-534)
...
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 18:34:35 +00:00
Eugene Zhulenev
8a977c1f46
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 11:31:29 -07:00
Christoph Hertzberg
449ff74672
Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file).
...
Manually grafted from d107a371c6
2018-10-19 21:10:28 +02:00
Christoph Hertzberg
40fa6f98bf
bug #1606 : Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 in favor of EIGEN_COMPILER_SUPPORT_CPP11.
...
Grafted manually from a4afa90d16
2018-10-19 17:20:51 +02:00
Rasmus Munk Larsen
dda68f56ec
Fix GPU build due to gpu_assert not always being defined.
2018-10-18 16:29:29 -07:00
Eugene Zhulenev
9e96e91936
Move from rvalue arguments in ThreadPool enqueue* methods
2018-10-16 16:48:32 -07:00
Eugene Zhulenev
217d839816
Reduce thread scheduling overhead in parallelFor
2018-10-16 14:53:06 -07:00
Rasmus Munk Larsen
d52763bb4f
Merged in ezhulenev/eigen-02 (pull request PR-528)
...
[TensorBlockIO] Check if it's allowed to squeeze inner dimensions
Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-10-16 15:39:40 +00:00
Eugene Zhulenev
900c7c61bb
Check if it's allowed to squeeze inner dimensions in TensorBlockIO
2018-10-15 16:52:33 -07:00
Gael Guennebaud
f0fb95135d
Iterative solvers: unify and fix handling of multiple rhs.
...
m_info was not properly computed and the logic was repeated in several places.
2018-10-15 23:47:46 +02:00
Gael Guennebaud
2747b98cfc
DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve
2018-10-15 23:46:00 +02:00
Gael Guennebaud
d835a0bf53
relax number of iterations checks to avoid false negatives
2018-10-15 10:23:32 +02:00
Gael Guennebaud
8214cf1896
Make sparse_basic includable from sparse_extra, but disable it since sparse_basic(DynamicSparseMatrix) does not compile at all anyways
2018-10-11 10:27:23 +02:00
Christoph Hertzberg
3f2c8b7ff0
Fix a lot of Doxygen warnings in Tensor module
2018-10-09 20:22:47 +02:00
Gael Guennebaud
93a6192e98
fix mpreal for mpfr<4.0.0
2018-10-09 09:15:22 +02:00
Rasmus Munk Larsen
d16634c4d4
Fix out-of-bounds access in TensorArgMax.h.
2018-10-08 16:41:36 -07:00
Rasmus Munk Larsen
1a737e1d6a
Fix contraction test.
2018-10-08 16:37:07 -07:00
Gael Guennebaud
2eda9783de
typo
2018-10-08 21:37:46 +02:00
Gael Guennebaud
6cc9b2c831
fix warning in mpreal.h
2018-10-08 18:25:37 +02:00
Gael Guennebaud
e29bfe8479
Update included mpreal header to 3.6.5 and fix deprecated warnings.
2018-10-08 17:09:23 +02:00
Gael Guennebaud
64b1a15318
Workaround stupid warning
2018-10-08 12:01:18 +02:00
Christoph Hertzberg
c5f1d0a72a
Fix shadow warning
2018-10-02 19:01:08 +02:00
Christoph Hertzberg
b92c71235d
Move struct outside of method for C++03 compatibility.
2018-10-02 18:59:10 +02:00
Christoph Hertzberg
051f9c1aff
Make code compile in C++03 mode again
2018-10-02 18:36:30 +02:00
Christoph Hertzberg
b786ce8c72
Fix conversion warning ... again
2018-10-02 18:35:25 +02:00
Christoph Hertzberg
564ca71e39
Merged in deven-amd/eigen/HIP_fixes (pull request PR-518)
...
PR with HIP specific fixes (for the eigen nightly regression failures in HIP mode)
2018-10-01 16:51:04 +00:00
Deven Desai
94898488a6
This commit contains the following (HIP specific) updates:
...
- unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
Changing "pass-by-reference" argument to be "pass-by-value" instead
(in a __global__ function decl).
"pass-by-reference" arguments to __global__ functions are unwise,
and will be explicitly flagged as errors by the newer versions of HIP.
- Eigen/src/Core/util/Memory.h
- unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h
Changes introduced in recent commits break the HIP compile.
Adding EIGEN_DEVICE_FUNC attribute to some functions and
calling ::malloc/free instead of the corresponding std:: versions
to get the HIP compile working again
- unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
A change introduced in a recent commit breaks the HIP compile
(link stage errors out due to failure to inline a function).
Disabling the recently introduced code (only for HIP compile), to get
the eigen nightly testing going again.
Will submit another PR once we have the proper fix.
- Eigen/src/Core/util/ConfigureVectorization.h
Enabling GPU VECTOR support when HIP compiler is in use
(for both the host and device compile phases)
2018-10-01 14:28:37 +00:00
Rasmus Munk Larsen
2088c0897f
Merged eigen/eigen into default
2018-09-28 16:00:46 -07:00
Rasmus Munk Larsen
31629bb964
Get rid of unused variable warning.
2018-09-28 16:00:09 -07:00
Eugene Zhulenev
bb13d5d917
Fix bug in copy optimization in Tensor slicing.
2018-09-28 14:34:42 -07:00
Rasmus Munk Larsen
104e8fa074
Fix a few warnings and rename a variable to not shadow "last".
2018-09-28 12:00:08 -07:00
Rasmus Munk Larsen
7c1b47840a
Merged in ezhulenev/eigen-01 (pull request PR-514)
...
Add tests for evalShardedByInnerDim contraction + fix bugs
2018-09-28 18:37:54 +00:00
Eugene Zhulenev
524c81f3fa
Add tests for evalShardedByInnerDim contraction + fix bugs
2018-09-28 11:24:08 -07:00
Christoph Hertzberg
86ba50be39
Fix integer conversion warnings
2018-09-28 19:33:39 +02:00
Eugene Zhulenev
e95696acb3
Optimize TensorBlockCopyOp
2018-09-27 14:49:26 -07:00
Eugene Zhulenev
9f33e71e9d
Revert code lost in merge
2018-09-27 12:08:17 -07:00
Eugene Zhulenev
a7a3e9f2b6
Merge with eigen/eigen default
2018-09-27 12:05:06 -07:00
Eugene Zhulenev
9f4988959f
Remove explicit mkldnn support and redundant TensorContractionKernelBlocking
2018-09-27 11:49:19 -07:00
Eugene Zhulenev
b314376f9c
Test mkldnn pack for doubles
2018-09-26 18:22:24 -07:00
Eugene Zhulenev
22ed98a331
Conditionally add mkldnn test
2018-09-26 17:57:37 -07:00
Rasmus Munk Larsen
d956204ab2
Remove "false &&" left over from test.
2018-09-26 17:03:30 -07:00
Rasmus Munk Larsen
3815aeed7a
Parallelize tensor contraction over the inner dimension in cases where one or both of the outer dimensions (m and n) are small but k is large. This speeds up individual matmul microbenchmarks by up to 85%.
...
Naming below is BM_Matmul_M_K_N_THREADS, measured on a 2-socket Intel Broadwell-based server.
Benchmark                 Base (ns)   New (ns)   Improvement
------------------------------------------------------------
BM_Matmul_1_80_13522_1       387457     396013         -2.2%
BM_Matmul_1_80_13522_2       406487     230789        +43.2%
BM_Matmul_1_80_13522_4       395821     123211        +68.9%
BM_Matmul_1_80_13522_6       391625      97002        +75.2%
BM_Matmul_1_80_13522_8       408986     113828        +72.2%
BM_Matmul_1_80_13522_16      399988      67600        +83.1%
BM_Matmul_1_80_13522_22      411546      60044        +85.4%
BM_Matmul_1_80_13522_32      393528      57312        +85.4%
BM_Matmul_1_80_13522_44      390047      63525        +83.7%
BM_Matmul_1_80_13522_88      387876      63592        +83.6%
BM_Matmul_1_1500_500_1       245359     248119         -1.1%
BM_Matmul_1_1500_500_2       401833     143271        +64.3%
BM_Matmul_1_1500_500_4       210519     100231        +52.4%
BM_Matmul_1_1500_500_6       251582      86575        +65.6%
BM_Matmul_1_1500_500_8       211499      80444        +62.0%
BM_Matmul_3_250_512_1         70297      68551         +2.5%
BM_Matmul_3_250_512_2         70141      52450        +25.2%
BM_Matmul_3_250_512_4         67872      58204        +14.2%
BM_Matmul_3_250_512_6         71378      63340        +11.3%
BM_Matmul_3_250_512_8         69595      41652        +40.2%
BM_Matmul_3_250_512_16        72055      42549        +40.9%
BM_Matmul_3_250_512_22        70158      54023        +23.0%
BM_Matmul_3_250_512_32        71541      56042        +21.7%
BM_Matmul_3_250_512_44        71843      57019        +20.6%
BM_Matmul_3_250_512_88        69951      54045        +22.7%
BM_Matmul_3_1500_512_1       369328     374284         -1.4%
BM_Matmul_3_1500_512_2       428656     223603        +47.8%
BM_Matmul_3_1500_512_4       205599     139508        +32.1%
BM_Matmul_3_1500_512_6       214278     139071        +35.1%
BM_Matmul_3_1500_512_8       184149     142338        +22.7%
BM_Matmul_3_1500_512_16      156462     156983         -0.3%
BM_Matmul_3_1500_512_22      163905     158259         +3.4%
BM_Matmul_3_1500_512_32      155314     157662         -1.5%
BM_Matmul_3_1500_512_44      235434     158657        +32.6%
BM_Matmul_3_1500_512_88      156779     160275         -2.2%
BM_Matmul_1500_4_512_1       363358     349528         +3.8%
BM_Matmul_1500_4_512_2       303134     263319        +13.1%
BM_Matmul_1500_4_512_4       176208     130086        +26.2%
BM_Matmul_1500_4_512_6       148026     115449        +22.0%
BM_Matmul_1500_4_512_8       131656      98421        +25.2%
BM_Matmul_1500_4_512_16      134011      82861        +38.2%
BM_Matmul_1500_4_512_22      134950      85685        +36.5%
BM_Matmul_1500_4_512_32      133165      90081        +32.4%
BM_Matmul_1500_4_512_44      133203      90644        +32.0%
BM_Matmul_1500_4_512_88      134106     100566        +25.0%
BM_Matmul_4_1500_512_1       439243     435058         +1.0%
BM_Matmul_4_1500_512_2       451830     257032        +43.1%
BM_Matmul_4_1500_512_4       276434     164513        +40.5%
BM_Matmul_4_1500_512_6       182542     144827        +20.7%
BM_Matmul_4_1500_512_8       179411     166256         +7.3%
BM_Matmul_4_1500_512_16      158101     155560         +1.6%
BM_Matmul_4_1500_512_22      152435     155448         -1.9%
BM_Matmul_4_1500_512_32      155150     149538         +3.6%
BM_Matmul_4_1500_512_44      193842     149777        +22.7%
BM_Matmul_4_1500_512_88      149544     154468         -3.3%
2018-09-26 16:47:13 -07:00
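Sharding a contraction over the inner dimension, as commit 3815aeed7a describes, amounts to computing partial products over slices of k and then summing them. The sketch below shows that structure for a plain row-major matrix product. It is a serial illustration under stated assumptions, not Eigen's evalShardedByInnerDim: the function name `matmul_sharded_by_k` is hypothetical, and each shard loop iteration stands in for a task that a thread pool would run concurrently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: C (m x n) = A (m x k) * B (k x n), all row-major,
// computed as a sum of per-shard partial products over slices of k.
inline std::vector<float> matmul_sharded_by_k(const std::vector<float>& a,
                                              const std::vector<float>& b,
                                              std::size_t m, std::size_t k,
                                              std::size_t n,
                                              std::size_t num_shards) {
  assert(num_shards > 0 && a.size() == m * k && b.size() == k * n);
  const std::size_t shard = (k + num_shards - 1) / num_shards;
  // One private output buffer per shard, so shards never write to shared data.
  std::vector<std::vector<float> > partial(num_shards,
                                           std::vector<float>(m * n, 0.0f));
  for (std::size_t s = 0; s < num_shards; ++s) {  // independent tasks
    const std::size_t k_begin = std::min(s * shard, k);
    const std::size_t k_end = std::min(k_begin + shard, k);
    for (std::size_t i = 0; i < m; ++i)
      for (std::size_t p = k_begin; p < k_end; ++p)
        for (std::size_t j = 0; j < n; ++j)
          partial[s][i * n + j] += a[i * k + p] * b[p * n + j];
  }
  // Final reduction: add the partial results together.
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t s = 0; s < num_shards; ++s)
    for (std::size_t idx = 0; idx < m * n; ++idx) c[idx] += partial[s][idx];
  return c;
}
```

The extra buffers and the final reduction are the price of this scheme, which is consistent with the benchmark table showing gains mainly when m or n is small and k is large.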
Eugene Zhulenev
71cd3fbd6a
Support multiple contraction kernel types in TensorContractionThreadPool
2018-09-26 11:08:47 -07:00
Christoph Hertzberg
0a3356f4ec
Don't deactivate BVH test for clang (probably, this was failing for very old versions of clang)
2018-09-25 20:26:16 +02:00
Christoph Hertzberg
2c083ace3e
Provide EIGEN_OVERRIDE and EIGEN_FINAL macros to mark virtual function overrides
2018-09-24 18:01:17 +02:00
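Macros of the kind introduced in commit 2c083ace3e typically expand to the C++11 keywords when they are available and to nothing otherwise. The sketch below uses hypothetical names (`SKETCH_OVERRIDE`, `SKETCH_FINAL`) and a simple `__cplusplus` check; Eigen's actual definitions and feature test may differ.

```cpp
#include <cassert>

// Hypothetical stand-ins for the macros described above.
#if __cplusplus >= 201103L
  #define SKETCH_OVERRIDE override
  #define SKETCH_FINAL final
#else
  #define SKETCH_OVERRIDE
  #define SKETCH_FINAL
#endif

struct Base {
  virtual ~Base() {}
  virtual int id() const { return 0; }
};

// In C++11 and later the compiler checks that id() really overrides a
// virtual function and rejects further derivation from Derived.
struct Derived SKETCH_FINAL : Base {
  int id() const SKETCH_OVERRIDE { return 1; }
};

inline int id_through_base(const Base& object) { return object.id(); }

inline void self_check() {
  Derived derived;
  assert(id_through_base(derived) == 1);
}
```

In C++03 mode both macros vanish, so the same source still compiles, only without the extra checking.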
Gael Guennebaud
c696dbcaa6
Fix shadowing of last and all
2018-09-21 23:02:33 +02:00
Gael Guennebaud
4291f167ee
Add missing plugins to DynamicSparseMatrix -- fix sparse_extra_3
2018-09-21 14:53:43 +02:00
Eugene Zhulenev
719e438a20
Collapsed revision
...
* Split cxx11_tensor_executor test
* Register test parts with EIGEN_SUFFIXES
* Fix EIGEN_SUFFIXES in cxx11_tensor_executor test
2018-09-20 15:19:12 -07:00
Rasmus Munk Larsen
8e2be7777e
Merged eigen/eigen into default
2018-09-20 11:41:15 -07:00
Rasmus Munk Larsen
5d2e759329
Initialize BlockIteratorState in a C++03 compatible way.
2018-09-20 11:40:43 -07:00
Gael Guennebaud
e04faca930
merge
2018-09-20 18:33:54 +02:00
Gael Guennebaud
d37188b9c1
Fix MPrealSupport
2018-09-20 18:30:10 +02:00
Gael Guennebaud
3c6dc93f99
Fix GPU support.
2018-09-20 18:29:21 +02:00
Gael Guennebaud
9419f506d0
Fix regression introduced by the previous fix for AVX512.
...
It broke the complex-complex case on SSE.
2018-09-20 17:32:34 +02:00
Christoph Hertzberg
a0166ab651
Workaround for spurious "array subscript is above array bounds" warnings with g++4.x
2018-09-20 17:08:43 +02:00
Christoph Hertzberg
c50250cb24
Avoid warning "suggest braces around initialization of subobject".
...
This test is not run in C++03 mode, so no compatibility is lost.
2018-09-20 17:03:42 +02:00
Gael Guennebaud
71496b0e25
Fix gebp kernel for real+complex in case only reals are vectorized (e.g., AVX512).
...
This commit also removes "half-packet" from data-mappers: it was not used and conceptually broken anyways.
2018-09-20 17:01:24 +02:00
Rasmus Munk Larsen
44d8274383
Cast to longer type.
2018-09-19 13:31:42 -07:00
Rasmus Munk Larsen
d638b62dda
Silence compiler warning.
2018-09-19 13:27:55 -07:00
Rasmus Munk Larsen
db9c9df59a
Silence more compiler warnings.
2018-09-19 11:50:27 -07:00
Rasmus Munk Larsen
febd09dcc0
Silence compiler warnings in ThreadPoolInterface.h.
2018-09-19 11:11:04 -07:00
luz.paz
f67b19a884
[PATCH 1/2] Misc. typos
...
From 68d431b4c14ad60a778ee93c1f59ecc4b931950e Mon Sep 17 00:00:00 2001
Found via `codespell -q 3 -I ../eigen-word-whitelist.txt` where the whitelist consists of:
```
als
ans
cas
dum
lastr
lowd
nd
overfl
pres
preverse
substraction
te
uint
whch
```
---
CMakeLists.txt | 26 +++++++++----------
Eigen/src/Core/GenericPacketMath.h | 2 +-
Eigen/src/SparseLU/SparseLU.h | 2 +-
bench/bench_norm.cpp | 2 +-
doc/HiPerformance.dox | 2 +-
doc/QuickStartGuide.dox | 2 +-
.../Eigen/CXX11/src/Tensor/TensorChipping.h | 6 ++---
.../Eigen/CXX11/src/Tensor/TensorDeviceGpu.h | 2 +-
.../src/Tensor/TensorForwardDeclarations.h | 4 +--
.../src/Tensor/TensorGpuHipCudaDefines.h | 2 +-
.../Eigen/CXX11/src/Tensor/TensorReduction.h | 2 +-
.../CXX11/src/Tensor/TensorReductionGpu.h | 2 +-
.../test/cxx11_tensor_concatenation.cpp | 2 +-
unsupported/test/cxx11_tensor_executor.cpp | 2 +-
14 files changed, 29 insertions(+), 29 deletions(-)
2018-09-18 04:15:01 -04:00
Eugene Zhulenev
c4627039ac
Support static dimensions (aka IndexList) in Tensor::resize(...)
2018-09-18 14:25:21 -07:00
Eugene Zhulenev
218a7b9840
Enable DSizes type promotion with c++03 compilers
2018-09-18 10:57:00 -07:00
Ravi Kiran
1f0c941c3d
Collapsed revision
...
* Merged eigen/eigen into default
2018-09-17 18:29:12 -07:00
Rasmus Munk Larsen
03a88c57e1
Merged in ezhulenev/eigen-02 (pull request PR-498)
...
Add DSizes index type promotion
2018-09-17 21:58:38 +00:00
Rasmus Munk Larsen
5ca0e4a245
Merged in ezhulenev/eigen-01 (pull request PR-497)
...
Fix warnings in IndexList array_prod
2018-09-17 20:15:06 +00:00
Eugene Zhulenev
a5cd4e9ad1
Replace deprecated Eigen::DenseIndex with Eigen::Index in TensorIndexList
2018-09-17 10:58:07 -07:00
Gael Guennebaud
b311bfb752
bug #1596 : fix inclusion of Eigen's header within unsupported modules.
2018-09-17 09:54:29 +02:00
Gael Guennebaud
72f19c827a
typo
2018-09-16 22:10:34 +02:00
Eugene Zhulenev
66f056776f
Add DSizes index type promotion
2018-09-15 15:17:38 -07:00
Eugene Zhulenev
f313126dab
Fix warnings in IndexList array_prod
2018-09-15 13:47:54 -07:00
Christoph Hertzberg
42705ba574
Fix weird error for building with g++-4.7 in C++03 mode.
2018-09-15 12:43:41 +02:00
Rasmus Munk Larsen
c2383f95af
Merged in ezhulenev/eigen/fix_dsizes (pull request PR-494)
...
Fix DSizes IndexList constructor
2018-09-15 02:36:19 +00:00
Rasmus Munk Larsen
30290cdd56
Merged in ezhulenev/eigen/moar_eigen_fixes_3 (pull request PR-493)
...
Const cast scalar pointer in TensorSlicingOp evaluator
Approved-by: Sameer Agarwal <sameeragarwal@google.com>
2018-09-15 02:35:07 +00:00
Eugene Zhulenev
f7d0053cf0
Fix DSizes IndexList constructor
2018-09-14 19:19:13 -07:00
Rasmus Munk Larsen
601e289d27
Merged in ezhulenev/eigen/moar_eigen_fixes_1 (pull request PR-492)
...
Explicitly construct tensor block dimensions from evaluator dimensions
2018-09-15 01:36:21 +00:00
Eugene Zhulenev
71070a1e84
Const cast scalar pointer in TensorSlicingOp evaluator
2018-09-14 17:17:50 -07:00
Eugene Zhulenev
4863375723
Explicitly construct tensor block dimensions from evaluator dimensions
2018-09-14 16:55:05 -07:00
Rasmus Munk Larsen
14e35855e1
Merged in chtz/eigen-maxsizevector (pull request PR-490)
...
Let MaxSizeVector respect alignment of objects
Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-09-14 23:29:24 +00:00
Eugene Zhulenev
1b8d70a22b
Support reshaping with static shapes and dimensions conversion in tensor broadcasting
2018-09-14 15:25:27 -07:00
Christoph Hertzberg
007f165c69
bug #1598 : Let MaxSizeVector respect alignment of objects and add a unit test
...
Also revert 8b3d9ed081
2018-09-14 20:21:56 +02:00