Steven Peters
953ca5ba2f
Spline.h: fix spelling "spang" -> "span"
2019-02-08 06:23:24 +00:00
Eugene Zhulenev
59998117bb
Don't do parallel_pack if we can use thread_local memory in tensor contractions
2019-02-07 09:21:25 -08:00
Eugene Zhulenev
8491127082
Do not reduce parallelism too much in contractions with small number of threads
2019-02-04 12:59:33 -08:00
Eugene Zhulenev
eb21bab769
Parallelize tensor contraction only by sharding dimension and use 'thread-local' memory for packing
2019-02-04 10:43:16 -08:00
Gael Guennebaud
d586686924
Workaround lack of support for arbitrary packet-type in Tensor by manually loading half/quarter packets in tensor contraction mapper.
2019-01-30 16:48:01 +01:00
Christoph Hertzberg
a7779a9b42
Hide some annoying unused variable warnings in g++8.1
2019-01-29 16:48:21 +01:00
Christoph Hertzberg
c9825b967e
Renaming even more I
identifiers
2019-01-26 13:22:13 +01:00
Christoph Hertzberg
934b8a1304
Avoid I
as an identifier, since it may clash with the C-header complex.h
2019-01-25 14:54:39 +01:00
Rasmus Munk Larsen
ee550a2ac3
Fix flaky test for tensor fft.
2019-01-16 14:03:12 -08:00
Eugene Zhulenev
1e6d15b55b
Fix shorten-64-to-32 warning in TensorContractionThreadPool
2019-01-11 11:41:53 -08:00
Eugene Zhulenev
0abe03764c
Fix shorten-64-to-32 warning in TensorContractionThreadPool
2019-01-10 10:27:55 -08:00
Gael Guennebaud
d812f411c3
bug #1654 : fix compilation with cuda and no c++11
2019-01-09 18:00:05 +01:00
Eugene Zhulenev
e70ffef967
Optimize evalShardedByInnerDim
2019-01-08 16:26:31 -08:00
Rasmus Munk Larsen
dd6d65898a
Fix shorten-64-to-32 warning. Use regular memcpy if num_threads==0.
2018-12-12 14:45:31 -08:00
Gael Guennebaud
cf697272e1
Remove debug code.
2018-12-09 23:05:46 +01:00
Gael Guennebaud
450dc97c6b
Various fixes in polynomial solver and its unit tests:
...
- cleanup noise in imaginary part of real roots
- take into account the magnitude of the derivative to check roots.
- use <= instead of < at appropriate places
2018-12-09 22:54:39 +01:00
Rasmus Munk Larsen
8a02883d58
Merged in markdryan/eigen/avx512-contraction-2 (pull request PR-554)
...
Fix tensor contraction on AVX512 builds
Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-12-05 18:19:32 +00:00
Mark D Ryan
36f8f6d0be
Fix evalShardedByInnerDim for AVX512 builds
...
evalShardedByInnerDim ensures that the values it passes for start_k and
end_k to evalGemmPartialWithoutOutputKernel are multiples of 8 as the kernel
does not work correctly when the values of k are not multiples of the
packet_size. While this precaution works for AVX builds, it is insufficient
for AVX512 builds where the maximum packet size is 16. The result is slightly
incorrect float32 contractions on AVX512 builds.
This commit fixes the problem by ensuring that k is always a multiple of
the packet_size if the packet_size is > 8.
2018-12-05 12:29:03 +01:00
Christoph Hertzberg
0ec8afde57
Fixed most conversion warnings in MatrixFunctions module
2018-11-20 16:23:28 +01:00
Rasmus Munk Larsen
72928a2c8a
Merged in rmlarsen/eigen2 (pull request PR-543)
...
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
Approved-by: Eugene Zhulenev <ezhulenev@google.com>
2018-11-13 17:10:30 +00:00
Rasmus Munk Larsen
cda479d626
Remove accidental changes.
2018-11-12 18:34:04 -08:00
Rasmus Munk Larsen
719d9aee65
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.
2018-11-12 17:46:02 -08:00
Rasmus Munk Larsen
93f9988a7e
A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) supporting matrix exponential on platforms with 113 bits of mantissa for long doubles.
2018-11-09 14:15:32 -08:00
Rasmus Munk Larsen
07fcdd1438
Merged in ezhulenev/eigen-02 (pull request PR-534)
...
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 18:34:35 +00:00
Eugene Zhulenev
8a977c1f46
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 11:31:29 -07:00
Christoph Hertzberg
449ff74672
Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file).
...
Manually grafted from d107a371c6
2018-10-19 21:10:28 +02:00
Christoph Hertzberg
40fa6f98bf
bug #1606 : Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 in favor of EIGEN_COMPILER_SUPPORT_CPP11.
...
Grafted manually from a4afa90d16
2018-10-19 17:20:51 +02:00
Rasmus Munk Larsen
dda68f56ec
Fix GPU build due to gpu_assert not always being defined.
2018-10-18 16:29:29 -07:00
Eugene Zhulenev
9e96e91936
Move from rvalue arguments in ThreadPool enqueue* methods
2018-10-16 16:48:32 -07:00
Eugene Zhulenev
217d839816
Reduce thread scheduling overhead in parallelFor
2018-10-16 14:53:06 -07:00
Rasmus Munk Larsen
d52763bb4f
Merged in ezhulenev/eigen-02 (pull request PR-528)
...
[TensorBlockIO] Check if it's allowed to squeeze inner dimensions
Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-10-16 15:39:40 +00:00
Eugene Zhulenev
900c7c61bb
Check if it's allowed to squueze inner dimensions in TensorBlockIO
2018-10-15 16:52:33 -07:00
Gael Guennebaud
f0fb95135d
Iterative solvers: unify and fix handling of multiple rhs.
...
m_info was not properly computed and the logic was repeated in several places.
2018-10-15 23:47:46 +02:00
Gael Guennebaud
2747b98cfc
DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve
2018-10-15 23:46:00 +02:00
Gael Guennebaud
d835a0bf53
relax number of iterations checks to avoid false negatives
2018-10-15 10:23:32 +02:00
Gael Guennebaud
8214cf1896
Make sparse_basic includable from sparse_extra, but disable it since sparse_basic(DynamicSparseMatrix) does not compile at all anyways
2018-10-11 10:27:23 +02:00
Christoph Hertzberg
3f2c8b7ff0
Fix a lot of Doxygen warnings in Tensor module
2018-10-09 20:22:47 +02:00
Gael Guennebaud
93a6192e98
fix mpreal for mpfr<4.0.0
2018-10-09 09:15:22 +02:00
Rasmus Munk Larsen
d16634c4d4
Fix out-of bounds access in TensorArgMax.h.
2018-10-08 16:41:36 -07:00
Rasmus Munk Larsen
1a737e1d6a
Fix contraction test.
2018-10-08 16:37:07 -07:00
Gael Guennebaud
2eda9783de
typo
2018-10-08 21:37:46 +02:00
Gael Guennebaud
6cc9b2c831
fix warning in mpreal.h
2018-10-08 18:25:37 +02:00
Gael Guennebaud
e29bfe8479
Update included mpreal header to 3.6.5 and fix deprecated warnings.
2018-10-08 17:09:23 +02:00
Gael Guennebaud
64b1a15318
Workaround stupid warning
2018-10-08 12:01:18 +02:00
Christoph Hertzberg
c5f1d0a72a
Fix shadow warning
2018-10-02 19:01:08 +02:00
Christoph Hertzberg
b92c71235d
Move struct outside of method for C++03 compatibility.
2018-10-02 18:59:10 +02:00
Christoph Hertzberg
051f9c1aff
Make code compile in C++03 mode again
2018-10-02 18:36:30 +02:00
Christoph Hertzberg
b786ce8c72
Fix conversion warning ... again
2018-10-02 18:35:25 +02:00
Christoph Hertzberg
564ca71e39
Merged in deven-amd/eigen/HIP_fixes (pull request PR-518)
...
PR with HIP specific fixes (for the eigen nightly regression failures in HIP mode)
2018-10-01 16:51:04 +00:00
Deven Desai
94898488a6
This commit contains the following (HIP specific) updates:
...
- unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
Changing "pass-by-reference" argument to be "pass-by-value" instead
(in a __global__ function decl).
"pass-by-reference" arguments to __global__ functions are unwise,
and will be explicitly flagged as errors by the newer versions of HIP.
- Eigen/src/Core/util/Memory.h
- unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h
Changes introduced in recent commits breaks the HIP compile.
Adding EIGEN_DEVICE_FUNC attribute to some functions and
calling ::malloc/free instead of the corresponding std:: versions
to get the HIP compile working again
- unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
Change introduced a recent commit breaks the HIP compile
(link stage errors out due to failure to inline a function).
Disabling the recently introduced code (only for HIP compile), to get
the eigen nightly testing going again.
Will submit another PR once we have te proper fix.
- Eigen/src/Core/util/ConfigureVectorization.h
Enabling GPU VECTOR support when HIP compiler is in use
(for both the host and device compile phases)
2018-10-01 14:28:37 +00:00