Michael Figurnov
4bd158fa37
Derivative of the incomplete Gamma function and the sample of a Gamma random variable.
...
In addition to igamma(a, x), this code implements:
* igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter
* gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter
The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
2018-06-06 18:49:26 +01:00
Deven Desai
8fbd47052b
Adding support for using Eigen in HIP kernels.
...
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor)
Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
2018-06-06 10:12:58 -04:00
Michael Figurnov
f216854453
Exponentially scaled modified Bessel functions of order zero and one.
...
The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x)
The code is ported from Cephes and tested against SciPy.
2018-05-31 15:34:53 +01:00
Gael Guennebaud
647b724a36
Define pcast<> for SSE types even when AVX is enabled. (otherwise float are silently reinterpreted as int instead of being converted)
2018-05-29 20:46:46 +02:00
Gael Guennebaud
49262dfee6
Fix compilation and SSE support with PGI compiler
2018-05-29 15:09:31 +02:00
Gael Guennebaud
f0862b062f
Fix internal::is_integral<size_t/ptrdiff_t> with MSVC 2013 and older.
2018-05-22 19:29:51 +02:00
Gael Guennebaud
36e413a534
Workaround a MSVC 2013 compilation issue with MatrixBase(Index,int)
2018-05-22 18:51:35 +02:00
Gael Guennebaud
725bd92903
fix stupid typo
2018-05-18 17:46:43 +02:00
Gael Guennebaud
a382bc9364
is_convertible<T,Index> does not seems to work well with MSVC 2013, so let's rather use __is_enum(T) for old MSVC versions
2018-05-18 17:02:27 +02:00
Gael Guennebaud
4dd767f455
add some internal checks
2018-05-18 13:59:55 +02:00
Mark D Ryan
405859f18d
Set EIGEN_IDEAL_MAX_ALIGN_BYTES correctly for AVX512 builds
...
bug #1548
The macro EIGEN_IDEAL_MAX_ALIGN_BYTES is being incorrectly set to 32
on AVX512 builds. It should be set to 64. In the current code it is
only set to 64 if the macro EIGEN_VECTORIZE_AVX512 is defined. This
macro does get defined in AVX512 builds in Core, but only after Macros.h,
the file that defines EIGEN_IDEAL_MAX_ALIGN_BYTES, has been included.
This commit fixes the issue by setting EIGEN_IDEAL_MAX_ALIGN_BYTES to
64 if __AVX512F__ is defined.
2018-05-17 17:04:00 +01:00
Gael Guennebaud
7134fa7a2e
Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the later being the one that has the wrong prototype).
2018-06-07 09:33:10 +02:00
Robert Lukierski
b2053990d0
Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment
...
specializations. Otherwise causes problems with small fixed size matrix multiplication (call to
0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).
2018-03-14 16:19:43 +00:00
Jeff Trull
9f0c5c3669
Make sparse QR result sizes consistent with dense QR, with the following rules:
...
1) Q is always square
2) Q*R*P' is valid and recovers the original matrix
This implies that the size of Q is the number of rows in the original matrix, square,
and that the size of R is the size of the original matrix.
2018-02-15 15:00:31 -08:00
Christoph Hertzberg
d655900953
bug #1544 : Generate correct Q matrix in complex case. Original patch was by Jeff Trull in PR-386.
2018-05-17 19:17:01 +02:00
Christoph Hertzberg
0272f2451a
Fix "suggest parentheses around comparison" warning
2018-05-15 19:35:53 +02:00
Gael Guennebaud
6e7118265d
Fix compilation with NEON+MSVC
2018-04-26 10:50:41 +02:00
Gael Guennebaud
8810baaed4
Add multi-threading for sparse-row-major * dense-row-major
2018-04-25 10:14:48 +02:00
Gael Guennebaud
e8ca5166a9
bug #1428 : atempt to make NEON vectorization compilable by MSVC.
...
The workaround is to wrap NEON packet types to make them different c++ types.
2018-04-24 11:19:49 +02:00
Benoit Steiner
6f5935421a
fix AVX512 plog
2018-04-23 15:49:26 +00:00
Gael Guennebaud
e9da464e20
Add specializations of is_arithmetic for long long in c++11
2018-04-23 16:26:29 +02:00
Gael Guennebaud
a57e6e5f0f
workaround MSVC 2013 compilation issue (ambiguous call)
2018-04-23 15:31:51 +02:00
Gael Guennebaud
11123175db
typo in doc
2018-04-23 15:30:35 +02:00
Gael Guennebaud
5679e439e0
bug #1543 : fix linear indexing in generic block evaluation (this completes the fix in commit 12efc7d41b
...
)
2018-04-23 14:40:16 +02:00
Christoph Hertzberg
34e499ad36
Disable -Wshadow when compiling with g++
2018-04-21 22:08:26 +02:00
Jayaram Bobba
b7b868d1c4
fix AVX512 plog
2018-04-20 13:39:18 -07:00
Gael Guennebaud
686fb57233
fix const cast in NEON
2018-04-18 18:46:34 +02:00
Dmitriy Korchemkin
02d2f1cb4a
Cast zeros to Scalar in RealSchur
2018-04-18 13:52:46 +03:00
Christoph Hertzberg
50633d1a83
Renamed .trans() et al. to .reverseFlag() et at. Adapted documentation of .setReverseFlag()
2018-04-17 11:30:27 +02:00
nicolov
39c2cba810
Add a specialization of Eigen::numext::conj for std::complex<T> to be used when compiling a cuda kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as CUDA compiler with libc++.
...
This follows the previous change in 2a69290ddb
, which mentions OSX (I guess because it uses libc++ too).
2018-04-13 22:29:10 +00:00
Christoph Hertzberg
42715533f1
bug #1493 : Make representation of HouseholderSequence consistent and working for complex numbers. Made corresponding unit test actually test that. Also simplify implementation of QR decompositions
2018-04-15 10:15:28 +02:00
Christoph Hertzberg
4d392d93aa
Make hypot_impl compile again for types with expression-templates (e.g., boost::multiprecision)
2018-04-13 19:01:37 +02:00
Christoph Hertzberg
072e111ec0
SelfAdjointView<...,Mode> causes a static assert since commit d820ab9edc
2018-04-13 19:00:34 +02:00
Gael Guennebaud
7a9089c33c
fix linking issue
2018-04-13 08:51:47 +02:00
Gael Guennebaud
e43ca0320d
bug #1520 : workaround some -Wfloat-equal warnings by calling std::equal_to
2018-04-11 15:24:13 +02:00
Gael Guennebaud
c91906b065
Umfpack: UF_long has been removed in recent versions of suitesparse, and fix a few long-to-int conversions issues.
2018-04-11 09:59:59 +02:00
Gael Guennebaud
0050709ea7
Merged in v_huber/eigen (pull request PR-378)
...
Add interface to umfpack_*l_* functions
2018-04-11 07:43:04 +00:00
Guillaume Jacob
8c1652055a
Fix code sample output in block(int, int, int, int) doxygen
2018-04-09 17:23:59 +02:00
Gael Guennebaud
add15924ac
Fix MKL backend for symmetric eigenvalues on row-major matrices.
2018-04-09 13:29:26 +02:00
Gael Guennebaud
04b1628e55
Add missing empty line.
2018-04-09 13:28:31 +02:00
Gael Guennebaud
2f833b1c64
bug #1509 : fix computeInverseWithCheck for complexes
2018-04-04 15:47:46 +02:00
Gael Guennebaud
b903fa74fd
Extend list of MSVC versions
2018-04-04 15:14:09 +02:00
Gael Guennebaud
403f09ccef
Make stableNorm and blueNorm compatible with 2D matrices.
2018-04-04 15:13:31 +02:00
Gael Guennebaud
4213b63f5c
Factories code between numext::hypot and scalar_hyot_op functor.
2018-04-04 15:12:43 +02:00
Gael Guennebaud
368dd4cd9d
Make innerVector() and innerVectors() methods available to all expressions supported by Block.
...
Before, only SparseBase exposed such methods.
2018-04-04 15:09:21 +02:00
Gael Guennebaud
e116f6847e
bug #1521 : avoid signalling NaN in hypot and make it std::complex<> friendly.
2018-04-04 13:47:23 +02:00
Gael Guennebaud
13f5df9f67
Add a note on vec_min vs asm
2018-04-04 13:10:38 +02:00
Gael Guennebaud
e91e314347
bug #1494 : makes pmin/pmax behave on Altivec/VSX as on x86 regading NaNs
2018-04-04 11:39:19 +02:00
Gael Guennebaud
112c899304
comment unreachable code
2018-04-03 23:16:43 +02:00
Gael Guennebaud
a1292395d6
Fix compilation of product with inverse transpositions (e.g., mat * Transpositions().inverse())
2018-04-03 23:06:44 +02:00
Gael Guennebaud
8c7b5158a1
commit 45e9c9996da790b55ed9c4b0dfeae49492ac5c46 (HEAD -> memory_fix)
...
Author: George Burgess IV <gbiv@google.com>
Date: Thu Mar 1 11:20:24 2018 -0800
Prefer `::operator new` to `new`
The C++ standard allows compilers much flexibility with `new`
expressions, including eliding them entirely
(https://godbolt.org/g/yS6i91 ). However, calls to `operator new` are
required to be treated like opaque function calls.
Since we're calling `new` for side-effects other than allocating heap
memory, we should prefer the less flexible version.
Signed-off-by: George Burgess IV <gbiv@google.com>
2018-04-03 17:15:38 +02:00
Gael Guennebaud
dd4cc6bd9e
bug #1527 : fix support for MKL's VML (destination was not properly resized)
2018-04-03 17:11:15 +02:00
Gael Guennebaud
c5b56f1fb2
bug #1528 : better use numeric_limits::min() instead of 1/highest() that with underflow.
2018-04-03 16:49:35 +02:00
Gael Guennebaud
8d0ffe3655
bug #1516 : add assertion for out-of-range diagonal index in MatrixBase::diagonal(i)
2018-04-03 16:15:43 +02:00
Gael Guennebaud
407e3e2621
bug #1532 : disable stl::*_negate in C++17 (they are deprecated)
2018-04-03 15:59:30 +02:00
Gael Guennebaud
40b4bf3d32
AVX512: _mm512_rsqrt28_ps is available for AVX512ER only
2018-04-03 14:36:27 +02:00
Gael Guennebaud
584951ca4d
Rename predux_downto4 to be more accurate on its semantic.
2018-04-03 14:28:38 +02:00
Gael Guennebaud
7b0630315f
AVX512: fix psqrt and prsqrt
2018-04-03 14:12:50 +02:00
Gael Guennebaud
6719409cd9
AVX512: add missing pinsertfirst and pinsertlast, implement pblend for Packet8d, fix compilation without AVX512DQ
2018-04-03 14:11:56 +02:00
vhuber
267a144da5
Remove unnecessary define
2018-03-30 23:04:53 +02:00
vhuber
baf9a5a776
Add interface to umfpack_*l_* functions
2018-03-30 18:53:34 +02:00
luz.paz
e3912f5e63
MIsc. source and comment typos
...
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
Basil Fierz
624df50945
Adds missing EIGEN_STRONG_INLINE to support MSVC properly inlining small vector calculations
...
When working with MSVC often small vector operations are not properly inlined. This behaviour is observed even on the most recent compiler versions.
2017-10-26 22:44:28 +02:00
Benoit Steiner
d2631ef61d
Merged in facaiy/eigen/ENH/exp_support_complex_for_gpu (pull request PR-359)
...
ENH: exp supports complex type for cuda
2018-03-23 00:59:15 +00:00
Benoit Steiner
8fcbd6d4c9
Merged in dtrebbien/eigen (pull request PR-369)
...
Move up the specialization of std::numeric_limits
2018-03-23 00:54:58 +00:00
Gael Guennebaud
f7d17689a5
Add static assertion for fixed sizes Ref<>
2018-03-09 10:11:13 +01:00
Gael Guennebaud
f6be7289d7
Implement better static assertion checking to make sure that the first assertion is a static one and not a runtime one.
2018-03-09 10:00:51 +01:00
Gael Guennebaud
d820ab9edc
Add static assertion on selfadjoint-view's UpLo parameter.
2018-03-09 09:33:43 +01:00
Daniel Trebbien
0c57be407d
Move up the specialization of std::numeric_limits
...
This fixes a compilation error seen when building TensorFlow on macOS:
https://github.com/tensorflow/tensorflow/issues/17067
2018-02-18 15:35:45 -08:00
Gael Guennebaud
adb134d47e
Fix implicit conversion from 0.0 to scalar
2018-02-16 22:26:01 +04:00
Gael Guennebaud
5deeb19e7b
bug #1517 : fix triangular product with unit diagonal and nested scaling factor: (s*A).triangularView<UpperUnit>()*B
2018-02-09 16:52:35 +01:00
Gael Guennebaud
12efc7d41b
Fix linear indexing in generic block evaluation.
2018-02-09 16:45:49 +01:00
Gael Guennebaud
f4a6863c75
Fix typo
2018-02-09 16:43:49 +01:00
Gael Guennebaud
09a16ba42f
bug #1412 : fix compilation with nvcc+MSVC
2018-01-17 23:13:16 +01:00
Yan Facai (颜发才)
42a8334668
ENH: exp supports complex type for cuda
2018-01-04 16:01:01 +08:00
Eugene Chereshnev
f558ad2955
Fix incorrect ldvt in LAPACKE call from JacobiSVD
2018-01-03 12:55:52 -08:00
Gael Guennebaud
73629f8b68
Fix gcc7 warning
2018-01-09 08:59:27 +01:00
nluehr
f9bdcea022
For cuda 9.1 replace math_functions.hpp with cuda_runtime.h
2017-12-18 16:51:15 -08:00
Gael Guennebaud
06bf1047f9
Fix compilation of stableNorm with some expressions as input
2017-12-15 15:15:37 +01:00
Gael Guennebaud
546ab97d76
Add possibility to overwrite EIGEN_STRONG_INLINE.
2017-12-14 14:47:38 +01:00
Gael Guennebaud
9c3aed9d48
Fix packet and alignment propagation logic of Block<Xpr> expressions. In particular, (A+B).col(j) lost vectorisation.
2017-12-14 14:24:33 +01:00
nluehr
aefd5fd5c4
Replace __float2half_rn with __float2half
...
The latter provides a consistent definition for CUDA 8.0 and 9.0.
2017-11-28 10:15:46 -08:00
Gael Guennebaud
d0b028e173
clarify Pastix requirements
2017-11-27 22:11:57 +01:00
Gael Guennebaud
3587e481fb
silent MSVC warning
2017-11-27 21:53:02 +01:00
nluehr
dd6de618c3
Fix incorrect integer cast in predux<half2>().
...
Bug corrupts results on Maxwell and earlier GPU architectures.
2017-11-21 10:47:00 -08:00
Gael Guennebaud
672bdc126b
bug #1479 : fix failure detection in LDLT
2017-11-16 17:55:24 +01:00
Gael Guennebaud
7cc503f9f5
bug #1485 : fix linking issue of non template functions
2017-11-15 21:33:37 +01:00
Gael Guennebaud
00bc67c374
Move KLU support to official
2017-11-10 14:11:22 +01:00
Gael Guennebaud
1495b98a8e
Merged in spraetor/eigen (pull request PR-305)
...
Issue with mpreal and std::numeric_limits::digits
2017-11-10 10:28:54 +00:00
Gael Guennebaud
d306b96fb7
Merged in carpent/eigen (pull request PR-342)
...
Use col method for column-major matrix
2017-11-10 10:09:53 +00:00
Gael Guennebaud
f86bb89d39
Add EIGEN_MKL_NO_DIRECT_CALL option
2017-11-09 11:07:45 +01:00
Gael Guennebaud
5fa79f96b8
Patch from Konstantin Arturov to enable MKL's direct call by default
2017-11-09 10:58:38 +01:00
Gael Guennebaud
4c03b3511e
Fix issue with boost::multiprec in previous commit
2017-11-08 23:28:01 +01:00
Gael Guennebaud
e9d2888e74
Improve debugging tests and output in BDCSVD
2017-11-08 10:26:03 +01:00
Gael Guennebaud
e8468ea91b
Fix overflow issues in BDCSVD
2017-11-08 10:24:28 +01:00
Christoph Hertzberg
11ddac57e5
Merged in guillaume_michel/eigen (pull request PR-334)
...
- Add support for NEON plog PacketMath function
2017-10-23 13:22:22 +00:00
Benoit Steiner
f16ba2a630
Merged in LaFeuille/eigen-1/LaFeuille/typo-fix-alignmeent-alignment-1505889397887 (pull request PR-335)
...
Typo fix alignmeent ->alignment
2017-10-21 01:59:55 +00:00
Henry Schreiner
9bb26eb8f1
Restore __device__
2017-10-21 00:50:38 +00:00
Henry Schreiner
4245475d22
Fixing missing inlines on device functions for newer CUDA cards
2017-10-20 03:20:13 +00:00
Justin Carpentier
a020d9b134
Use col method for column-major matrix
2017-10-17 21:51:27 +02:00