Deven Desai
8fbd47052b
Adding support for using Eigen in HIP kernels.
...
This commit enables the use of Eigen in HIP kernels / on AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / on NVIDIA GPUs.
Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default when compiling Eigen, irrespective of whether or not the underlying compiler is CUDACC/NVCC (e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to the HIP versions of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP-specific unit tests.
2018-06-06 10:12:58 -04:00
Benoit Steiner
e206f8d4a4
Merged in mfigurnov/eigen (pull request PR-400)
...
Exponentially scaled modified Bessel functions of order zero and one.
Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-05 17:05:21 +00:00
Penporn Koanantakool
e2ed0cf8ab
Add a ThreadPoolInterface* getter for ThreadPoolDevice.
2018-06-02 12:07:49 -07:00
Gael Guennebaud
84868da904
Don't run hg on non mercurial clone
2018-05-31 21:21:57 +02:00
Michael Figurnov
f216854453
Exponentially scaled modified Bessel functions of order zero and one.
...
The functions are conventionally called i0e and i1e. The exponentially scaled versions are more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x).
The code is ported from Cephes and tested against SciPy.
2018-05-31 15:34:53 +01:00
Gael Guennebaud
6af1433cb5
Doc: add aliasing in common pitfalls.
2018-05-29 22:37:47 +02:00
Katrin Leinweber
ea94543190
Hyperlink DOIs against preferred resolver
2018-05-24 18:55:40 +02:00
Gael Guennebaud
999b552c16
Search for sequential Pastix.
2018-05-29 20:49:25 +02:00
Gael Guennebaud
eef4b7bd87
Fix handling of path names containing spaces and the like.
2018-05-29 20:49:06 +02:00
Gael Guennebaud
647b724a36
Define pcast<> for SSE types even when AVX is enabled (otherwise floats are silently reinterpreted as ints instead of being converted).
2018-05-29 20:46:46 +02:00
Gael Guennebaud
49262dfee6
Fix compilation and SSE support with PGI compiler
2018-05-29 15:09:31 +02:00
Christoph Hertzberg
750af06362
Add an option to test with external BLAS library
2018-05-22 21:04:32 +02:00
Christoph Hertzberg
d06a753d10
Make qr_fullpivoting unit test run for fixed-sized matrices
2018-05-22 20:29:17 +02:00
Gael Guennebaud
f0862b062f
Fix internal::is_integral<size_t/ptrdiff_t> with MSVC 2013 and older.
2018-05-22 19:29:51 +02:00
Gael Guennebaud
36e413a534
Workaround a MSVC 2013 compilation issue with MatrixBase(Index,int)
2018-05-22 18:51:35 +02:00
Gael Guennebaud
725bd92903
fix stupid typo
2018-05-18 17:46:43 +02:00
Gael Guennebaud
a382bc9364
is_convertible<T,Index> does not seem to work well with MSVC 2013, so let's rather use __is_enum(T) for old MSVC versions
2018-05-18 17:02:27 +02:00
Gael Guennebaud
4dd767f455
add some internal checks
2018-05-18 13:59:55 +02:00
Gael Guennebaud
345c0ab450
check that all integer types are properly handled by mat(i,j)
2018-05-18 13:46:46 +02:00
Jeff Trull
e7147f69ae
Add tests for sparseQR results (value and size) covering bugs #1522 and #1544
2018-04-21 10:26:30 -07:00
Jeff Trull
9f0c5c3669
Make sparse QR result sizes consistent with dense QR, with the following rules:
...
1) Q is always square
2) Q*R*P' is valid and recovers the original matrix
This implies that Q is square, with as many rows as the original matrix,
and that R has the same size as the original matrix.
2018-02-15 15:00:31 -08:00
Christoph Hertzberg
d655900953
bug #1544 : Generate correct Q matrix in complex case. Original patch was by Jeff Trull in PR-386.
2018-05-17 19:17:01 +02:00
Benoit Steiner
0371380d5b
Merged in rmlarsen/eigen2 (pull request PR-393)
...
Rename scalar_clip_op to scalar_clamp_op to prevent collision with existing functor in TensorFlow.
2018-05-16 21:45:42 +00:00
Rasmus Munk Larsen
b8d36774fa
Rename clip2 to clamp.
2018-05-16 14:04:48 -07:00
Rasmus Munk Larsen
812480baa3
Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow.
2018-05-16 09:49:24 -07:00
Benoit Steiner
1403c2c15b
Merged in didierjansen/eigen (pull request PR-360)
...
Fix bugs and typos in the contraction example of the tensor README
2018-05-16 01:16:36 +00:00
Benoit Steiner
ad355b3f05
Merged in rmlarsen/eigen2 (pull request PR-392)
...
Add vectorized clip functor for Eigen Tensors
2018-05-16 01:15:56 +00:00
Christoph Hertzberg
0272f2451a
Fix "suggest parentheses around comparison" warning
2018-05-15 19:35:53 +02:00
Rasmus Munk Larsen
afec3021f7
Use numext::maxi & numext::mini.
2018-05-14 16:35:39 -07:00
Rasmus Munk Larsen
b8c8e5f436
Add vectorized clip functor for Eigen Tensors.
2018-05-14 16:07:13 -07:00
Benoit Steiner
6118c6ff4f
Enable RawAccess to tensor slices whenever possible.
...
Avoid 32-bit integer overflow in TensorSlicingOp
2018-04-30 11:28:12 -07:00
Gael Guennebaud
6e7118265d
Fix compilation with NEON+MSVC
2018-04-26 10:50:41 +02:00
Gael Guennebaud
097dd4616d
Fix unit test for SIMD engine not supporting sqrt
2018-04-26 10:47:39 +02:00
Gael Guennebaud
8810baaed4
Add multi-threading for sparse-row-major * dense-row-major
2018-04-25 10:14:48 +02:00
Gael Guennebaud
2f3287da7d
Fix "used uninitialized" warnings
2018-04-24 17:17:25 +02:00
Gael Guennebaud
3ffd449ef5
Workaround warning
2018-04-24 17:11:51 +02:00
Gael Guennebaud
e8ca5166a9
bug #1428 : attempt to make NEON vectorization compilable by MSVC.
...
The workaround is to wrap NEON packet types to make them distinct C++ types.
2018-04-24 11:19:49 +02:00
Benoit Steiner
6f5935421a
fix AVX512 plog
2018-04-23 15:49:26 +00:00
Gael Guennebaud
e9da464e20
Add specializations of is_arithmetic for long long in c++11
2018-04-23 16:26:29 +02:00
Gael Guennebaud
a57e6e5f0f
workaround MSVC 2013 compilation issue (ambiguous call)
2018-04-23 15:31:51 +02:00
Gael Guennebaud
11123175db
typo in doc
2018-04-23 15:30:35 +02:00
Gael Guennebaud
5679e439e0
bug #1543 : fix linear indexing in generic block evaluation (this completes the fix in commit 12efc7d41b)
2018-04-23 14:40:16 +02:00
Gael Guennebaud
35b31353ab
Fix unit test
2018-04-22 22:49:08 +02:00
Christoph Hertzberg
34e499ad36
Disable -Wshadow when compiling with g++
2018-04-21 22:08:26 +02:00
Jayaram Bobba
b7b868d1c4
fix AVX512 plog
2018-04-20 13:39:18 -07:00
Gael Guennebaud
686fb57233
fix const cast in NEON
2018-04-18 18:46:34 +02:00
Dmitriy Korchemkin
02d2f1cb4a
Cast zeros to Scalar in RealSchur
2018-04-18 13:52:46 +03:00
Christoph Hertzberg
50633d1a83
Renamed .trans() et al. to .reverseFlag() et al. Adapted documentation of .setReverseFlag()
2018-04-17 11:30:27 +02:00
nicolov
39c2cba810
Add a specialization of Eigen::numext::conj for std::complex<T> to be used when compiling a cuda kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as CUDA compiler with libc++.
...
This follows the previous change in 2a69290ddb, which mentions OSX (I guess because it uses libc++ too).
2018-04-13 22:29:10 +00:00
Christoph Hertzberg
775766d175
Add parentheses to fix compiler warnings
2018-04-15 18:43:56 +02:00