Rasmus Munk Larsen
acce4dd050
Change Eigen's ColPivHouseholderQR to use the numerically stable norm downdate formula from http://www.netlib.org/lapack/lawnspdf/lawn176.pdf , which has been used in LAPACK's xGEQPF and xGEQP3 since 2006. With the old formula, the code chooses the wrong pivots and fails to correctly determine rank on graded matrices.
...
This change also adds additional checks for non-increasing diagonal in R11 to existing unit tests, and adds a new unit test with the Kahan matrix, which consistently fails for the original code.
Benchmark timings on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz. Code compiled with AVX & FMA. I just ran on square matrices of 3 difference sizes.
Benchmark Time(ns) CPU(ns) Iterations
-------------------------------------------------------
Before:
BM_EigencolPivQR/64 53677 53627 12890
BM_EigencolPivQR/512 15265408 15250784 46
BM_EigencolPivQR/4k 15403556228 15388788368 2
After (non-vectorized version):
Benchmark Time(ns) CPU(ns) Iterations Degradation
--------------------------------------------------------------------
BM_EigencolPivQR/64 63736 63669 10844 18.5%
BM_EigencolPivQR/512 16052546 16037381 43 5.1%
BM_EigencolPivQR/4k 15149263620 15132025316 2 -2.0%
Performance-wise there seems to be a ~18.5% degradation for small (64x64) matrices, probably due to the cost of more O(min(m,n)^2) sqrt operations that are not needed for the unstable formula.
2016-01-28 15:07:26 -08:00
Gael Guennebaud
b908e071a8
bug #178 : get rid of some const_cast in SparseCore
2016-01-28 22:11:18 +01:00
Gael Guennebaud
c1d900af61
bug #178 : remove additional const on nested expression, and remove several const_cast.
2016-01-28 21:43:20 +01:00
Gael Guennebaud
f50bb1e6f3
Fix compilation with gcc
2016-01-28 13:25:26 +01:00
Gael Guennebaud
ddf64babde
merge
2016-01-28 13:21:48 +01:00
Gael Guennebaud
df15fbc452
bug #1158 : PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag
2016-01-28 13:16:30 +01:00
Gael Guennebaud
9bcadb7fd1
Disable stupid MSVC warning
2016-01-28 12:14:16 +01:00
Gael Guennebaud
b4d87fff4a
Fix MSVC warning.
2016-01-28 12:12:30 +01:00
Gael Guennebaud
2bad3e78d9
bug #96 , bug #1006 : fix by value argument in result_of.
2016-01-28 12:12:06 +01:00
Benoit Steiner
291069e885
Fixed some compilation problems with nvcc + clang
2016-01-27 15:37:03 -08:00
Gael Guennebaud
4865e1e732
Update link to suitesparse.
2016-01-27 22:48:40 +01:00
Eugene Brevdo
c8d94ae944
digamma special function: merge shared code.
...
Moved type-specific code into a helper class digamma_impl_maybe_poly<Scalar>.
2016-01-27 09:52:29 -08:00
Gael Guennebaud
9c8f7dfe94
bug #1156 : fix several function declarations whose arguments were passed by value instead of being passed by reference
2016-01-27 18:34:42 +01:00
Gael Guennebaud
9aa6fae123
bug #1154 : move to dynamic scheduling for spmv products.
2016-01-27 18:03:51 +01:00
Gael Guennebaud
9801c959e6
Fix tri = complex * real product, and add respective unit test.
2016-01-27 17:12:25 +01:00
Gael Guennebaud
21b5345782
Add meta_least_common_multiple helper.
2016-01-27 17:11:39 +01:00
Gael Guennebaud
fecea26d93
Extend doc on shifting strategy
2016-01-27 15:55:15 +01:00
Gael Guennebaud
cfa21f8123
Remove dead code.
2016-01-26 23:33:15 +01:00
Gael Guennebaud
6850eab33b
Re-enable blocking on rows in non-l3 blocking mode.
2016-01-26 23:32:48 +01:00
Gael Guennebaud
aa8c6a251e
Make sure that micro-panel-size is smaller than blocking sizes (otherwise we might get a buffer overflow)
2016-01-26 23:31:48 +01:00
Gael Guennebaud
5b0a9ee003
Make sure that block sizes are smaller than input matrix sizes.
2016-01-26 23:30:24 +01:00
Christoph Hertzberg
44d4674955
bug #1153 : Don't rely on __GXX_EXPERIMENTAL_CXX0X__ to detect C++11 support
2016-01-26 16:45:33 +01:00
Gael Guennebaud
8328caa618
bug #51 : add block preallocation mechanism to selfadjoit*matrix product.
2016-01-25 22:06:42 +01:00
Gael Guennebaud
e58827d2ed
bug #51 : make general_matrix_matrix_triangular_product use L3-blocking helper so that general symmetric rank-updates and general-matrix-to-triangular products do not trigger dynamic memory allocation for fixed size matrices.
2016-01-25 17:16:33 +01:00
Gael Guennebaud
b114e6fd3b
Improve documentation.
2016-01-25 11:56:25 +01:00
Gael Guennebaud
869b4443ac
Add SparseVector::conservativeResize() method.
2016-01-25 11:55:39 +01:00
Gael Guennebaud
acf6f7af6b
Merged in larsmans/eigen (pull request PR-156)
...
Documentation fixes
2016-01-24 22:28:49 +01:00
Lars Buitinck
cc482e32f1
Method is called visit, not visitor
2016-01-24 15:50:59 +01:00
Gael Guennebaud
1cf85bd875
bug #977 : add stableNormalize[d] methods: they are analogues to normalize[d] but with carefull handling of under/over-flow
2016-01-23 22:40:11 +01:00
Gael Guennebaud
369d6d1ae3
Add link to reference paper.
2016-01-23 22:16:03 +01:00
Gael Guennebaud
0caa4b1531
bug #1150 : make IncompleteCholesky more robust by iteratively increase the shift until the factorization succeed (with at most 10 attempts).
2016-01-23 22:13:54 +01:00
Gael Guennebaud
5358c38589
bug #1095 : add Cholmod*::logDeterminant/determinant (from patch of Joshua Pritikin)
2016-01-22 16:05:29 +01:00
Gael Guennebaud
06971223ef
Unify std::numeric_limits and device::numeric_limits within numext namespace
2016-01-22 15:02:21 +01:00
Gael Guennebaud
ee37eb4eed
bug #977 : avoid division by 0 in normalize() and normalized().
2016-01-21 20:43:42 +01:00
Gael Guennebaud
7cae8918c0
Fix compilation on old gcc+AVX
2016-01-21 20:30:32 +01:00
Gael Guennebaud
8dca9f97e3
Add numext::sqrt function to enable custom optimized implementation.
...
This changeset add two specializations for float/double on SSE. Those
are mostly usefull with GCC for which std::sqrt add an extra and costly
check on the result of _mm_sqrt_*. Clang does not add this burden.
In this changeset, only DenseBase::norm() makes use of it.
2016-01-21 20:18:51 +01:00
Gael Guennebaud
34340458cb
bug #1151 : remove useless critical section
2016-01-21 14:29:45 +01:00
Gael Guennebaud
ed8ade9c65
bug #1149 : fix Pastix*::*parm()
2016-01-20 19:01:24 +01:00
Gael Guennebaud
4c5e96aab6
bug #1148 : silent Pastix by default
2016-01-20 18:56:17 +01:00
Gael Guennebaud
db237d0c75
bug #1145 : fix PastixSupport LLT/LDLT wrappers (missing resize prior to calls to selfAdjointView)
2016-01-20 18:49:01 +01:00
Gael Guennebaud
0b7169d1f7
bug #1147 : fix compilation of PastixSupport
2016-01-20 18:15:59 +01:00
Gael Guennebaud
234a1094b7
Add static assertion to y(), z(), w() accessors
2016-01-20 09:18:44 +01:00
Eugene Brevdo
6a75e7e0d5
Digamma cleanup
...
* Added permission from cephes author to use his code
* Cleanup in ArrayCwiseUnaryOps
2016-01-15 16:32:21 -08:00
Benoit Steiner
bbdabbb379
Made the blas utils usable from within a cuda kernel
2016-01-11 17:26:56 -08:00
Gael Guennebaud
8b9dc9f0df
bug #1144 : fix regression in x=y+A*x (aliasing), and move evaluator_traits::AssumeAliasing to evaluator_assume_aliasing.
2016-01-09 08:30:38 +01:00
Gael Guennebaud
ee738321aa
rm remaining debug code
2016-01-06 14:49:40 +01:00
Christoph Hertzberg
54bf582303
bug #1143 : Work-around gcc bug
2016-01-06 11:59:24 +01:00
Gael Guennebaud
715f6f049f
Improve inline documentation of SparseCompressedBase and its derived classes
2016-01-03 21:56:30 +01:00
Gael Guennebaud
8b0d1eb0f7
Fix numerous doxygen shortcomings, and workaround some clang -Wdocumentation warnings
2016-01-01 21:45:06 +01:00
Gael Guennebaud
9900782e88
Mark AlignedBit and EvalBeforeNestingBit with deprecated attribute, and remove the remaining usages of EvalBeforeNestingBit.
2015-12-30 16:47:49 +01:00
Gael Guennebaud
70404e07c2
Workaround clang -Wdocumentation warning about "/*<"
2015-12-30 16:46:45 +01:00
Gael Guennebaud
addb7066e8
Workaround "empty paragraph" warning with clang -Wdocumentation
2015-12-30 16:45:44 +01:00
Gael Guennebaud
eadc377b3f
Add missing doc of Derived template parameter
2015-12-30 16:43:19 +01:00
Gael Guennebaud
29bb599e03
Fix numerous doxygen issues in auto-link generation
2015-12-30 16:04:24 +01:00
Gael Guennebaud
25f2b8d824
bug #1141 : add missing initialization of CholmodBase::m_*IsOk
2015-12-29 15:50:11 +01:00
Eugene Brevdo
f2471f31e0
Modify constants in SpecialFunctions to lowercase (avoid name conflicts).
2015-12-28 17:48:38 -08:00
Eugene Brevdo
afb35385bf
Change PI* to M_PI* in SpecialFunctions to avoid possible breakage
...
with external DEFINEs.
2015-12-28 17:34:06 -08:00
Eugene Brevdo
cef81c9084
Merged eigen/eigen into default
2015-12-24 21:17:33 -08:00
Eugene Brevdo
f7362772e3
Add digamma for CPU + CUDA. Includes tests.
2015-12-24 21:15:38 -08:00
Gael Guennebaud
d2e288ae50
Workaround compilers that do not even define _mm256_set_m128.
2015-12-24 16:53:43 +01:00
Benoit Steiner
3504ae47ca
Made it possible to run the lgamma, erf, and erfc functors on a CUDA gpu.
2015-12-21 15:20:06 -08:00
Benoit Steiner
a6c243617b
Fixed a typo in previous change.
2015-12-21 09:05:45 -08:00
Benoit Steiner
51be91f15e
Added support for CUDA architectures that don's support for 3.5 capabilities
2015-12-21 08:42:58 -08:00
Benoit Steiner
6d777e1bc7
Fixed a typo.
2015-12-18 19:25:50 -08:00
Gael Guennebaud
3abd8470ca
bug #1140 : remove custom definition and use of _mm256_setr_m128
2015-12-18 14:18:59 +01:00
Gael Guennebaud
9f9de1aaa9
bump to 3.3-beta1
2015-12-16 21:48:48 +01:00
Gael Guennebaud
ae8b217a01
Update doc to make it clear that only SuperLU 4.x is supported
2015-12-16 10:47:03 +01:00
Gael Guennebaud
140f3a02a8
Fix MKL wrapper for ComplexSchur
2015-12-11 23:31:21 +01:00
Gael Guennebaud
4483c0fdf6
Fix unused variable warning.
2015-12-11 23:29:53 +01:00
Gael Guennebaud
774dba87c8
merge
2015-12-11 23:28:44 +01:00
Gael Guennebaud
c884a8e7f4
merge
2015-12-11 23:07:33 +01:00
Gael Guennebaud
b60a8967f5
bug #1134 : fix JacobiSVD pre-allocation
...
(grafted from f22036f5f8
)
2015-12-11 11:59:11 +01:00
Gael Guennebaud
ca39b1546e
Merged in ebrevdo/eigen (pull request PR-148)
...
Add special functions to eigen: lgamma, erf, erfc.
2015-12-11 11:52:09 +01:00
Gael Guennebaud
82152f2ae6
bug #1132 : add EIGEN_MAPBASE_PLUGIN
2015-12-11 11:43:49 +01:00
Gael Guennebaud
4519fd5d40
Fix MKL compilation issue
2015-12-11 11:11:38 +01:00
Gael Guennebaud
7385e6e2ef
Remove useless explicit
2015-12-11 11:11:19 +01:00
Gael Guennebaud
bcb4f126a7
Fix compilation of PardisoSupport
2015-12-11 11:11:00 +01:00
Gael Guennebaud
30b5c4cd14
Remove useless "explicit", and fix inline/static order.
2015-12-11 10:59:39 +01:00
Gael Guennebaud
79c1e6d0a6
Fix compilation of MKL support.
2015-12-11 10:55:07 +01:00
Gael Guennebaud
c684a07eba
merge
2015-12-11 10:06:38 +01:00
Benoit Steiner
b820b097b8
Created EIGEN_HAS_C99_MATH define as Gael suggested.
2015-12-10 13:52:05 -08:00
Gael Guennebaud
df6f54ff63
Fix storage order of PartialRedux
2015-12-10 22:24:58 +01:00
Mark Borgerding
22dd368ea0
sign(complex) compiles for GPU
2015-12-10 16:14:29 -05:00
Benoit Steiner
58e06447de
Silence a compilation warning
2015-12-10 13:11:36 -08:00
Benoit Steiner
48877a6933
Only implement the lgamma, erf, and erfc functions when using a compiler compliant with the C99 specification.
2015-12-10 13:09:49 -08:00
Gael Guennebaud
7ad1aaec1d
bug #1103 : fix neon vectorization of pmul(Packet1cd,Packet1cd)
2015-12-10 16:06:33 +01:00
Benoit Steiner
53b196aa5f
Simplified the implementation of lgamma, erf, and erfc
2015-12-08 14:17:34 -08:00
Benoit Steiner
e535450573
Cleanup
2015-12-08 14:06:39 -08:00
Benoit Steiner
b1ae39794c
Simplified the code a bit
2015-12-07 16:46:35 -08:00
Benoit Steiner
73b68d4370
Fixed a couple of typos
...
Cleaned up the code a bit.
2015-12-07 16:38:48 -08:00
Eugene Brevdo
fa4f933c0f
Add special functions to Eigen: lgamma, erf, erfc.
...
Includes CUDA support and unit tests.
2015-12-07 15:24:49 -08:00
Gael Guennebaud
ad3d68400e
Add matrix-free solver example
2015-12-07 12:33:38 +01:00
Gael Guennebaud
b37036afce
Implement wrapper for matrix-free iterative solvers
2015-12-07 12:23:22 +01:00
Benoit Steiner
e25e3a041b
Added rsqrt() method to the Array class: this method computes the coefficient-wise inverse square root much more efficiently than calling sqrt().inverse().
2015-12-03 18:16:35 -08:00
Benoit Steiner
c41e9e4bd0
Merged in Unril/eigen-1/Unril/fixes-internal-compiler-error-while-comp-1449156092576 (pull request PR-147)
...
Fixes internal compiler error while compiling with VC2015 Update1 x64.
2015-12-03 14:26:14 -08:00
Gael Guennebaud
1562e13aba
Add missing Rotation2D::operator=(Matrix2x2)
2015-12-03 22:25:26 +01:00
Nikolay Fedorov
944647c0aa
Fixes internal compiler error while compiling with VC2015 Update1 x64.
2015-12-03 15:21:43 +00:00
Benoit Steiner
d2d4c45d55
Made it possible to leverage several binary functor in a CUDA kernel
...
Explicitely specified the return type of the various scalar_cmp_op functors.
2015-12-02 17:21:33 -08:00
Gael Guennebaud
c5b86893e7
bug #1123 : add missing documentation of angle() and axis()
2015-12-01 14:45:08 +01:00
Gael Guennebaud
0bb12fa614
Add LU::transpose().solve() and LU::adjoint().solve() API.
2015-12-01 14:38:47 +01:00
Rasmus Munk Larsen
1663d15da7
Add internal method _solve_impl_transposed() to LU decomposition classes that solves A^T x = b or A^* x = b.
2015-11-30 13:39:24 -08:00
Gael Guennebaud
6c02cbbb0f
Fix matrix to quaternion (and angleaxis) conversion for matrix expression.
2015-12-01 09:45:56 +01:00
Gael Guennebaud
1d906d883d
Fix degenerate cases in syrk and trsm
2015-11-30 22:20:31 +01:00
Gael Guennebaud
afa11d646d
Fix UmfPackLU ctor for exppressions
2015-11-27 22:04:22 +01:00
Gael Guennebaud
6bdeb8cfbe
bug #918 , umfpack: add access to umfpack return code and parameters
2015-11-27 21:58:36 +01:00
Gael Guennebaud
3f32f5ec22
ArrayBase::sign: add unit test and fix doc
2015-11-27 16:27:53 +01:00
Gael Guennebaud
1261d020c3
bug #1120 , superlu: mem_usage_t is now uniquely defined, so let's use it.
2015-11-27 10:39:09 +01:00
Gael Guennebaud
ca001d7c2a
Big 1009, part 2/2: add static assertion on LinearAccessBit in coeff(index)-like methods.
2015-11-27 10:06:47 +01:00
Gael Guennebaud
91a7059459
bug #1009 , part 1/2: make sure vector expressions expose LinearAccessBit flag.
2015-11-27 10:06:07 +01:00
Mark Borgerding
7ddcf97da7
added scalar_sign_op (both real,complex)
2015-11-24 17:15:07 -05:00
Gael Guennebaud
f9fff67a56
Disable "decorated name length exceeded, name was truncated" MSVC warning.
2015-11-23 15:03:24 +01:00
Gael Guennebaud
f3dca16a1d
bug #1117 : workaround unused-local-typedefs warning when EIGEN_NO_STATIC_ASSERT and NDEBUG are both defined.
2015-11-23 14:07:52 +01:00
Gael Guennebaud
82bd4e546a
Merged in dr15jones/eigen (pull request PR-146)
...
Use a class constructor to initialize CPU cache sizes
2015-11-22 22:50:31 +01:00
Gael Guennebaud
35c17a3fc8
Use overload instead of template full specialization to please old MSVC
2015-11-22 22:09:57 +01:00
Gael Guennebaud
b265979a70
Make FullPivLU::solve use rank() instead of nonzeroPivots().
2015-11-21 15:03:04 +01:00
Chris Jones
4946d758c9
Use a class constructor to initialize CPU cache sizes
...
Using a static instance of a class to initialize the values for
the CPU cache sizes guarantees thread-safe initialization of the
values when using C++11. Therefore under C++11 it is no longer
necessary to call Eigen::initParallel() before calling any eigen
functions on different threads.
2015-11-20 19:58:08 +01:00
Gael Guennebaud
027a846b34
Use .data() instead of &coeffRef(0).
2015-11-20 15:30:10 +01:00
Gael Guennebaud
4fc36079e7
Fix overload instantiation for clang
2015-11-20 15:29:03 +01:00
Gael Guennebaud
5c9c0dca4d
Add missing using statement to enable fast Array<complex> / real operations. (was ok for Matrix only)
2015-11-20 14:51:36 +01:00
Gael Guennebaud
e1b27bcb0b
Workaround MSVC missing overloads of std::fpclassify for integral types
2015-11-20 13:55:34 +01:00
Gael Guennebaud
e52d4f8d8d
Add is_integral<> type traits
2015-11-20 13:54:28 +01:00
Benoit Steiner
7d1cedd0fe
Added numeric limits for unsigned integers
2015-11-18 17:17:44 -08:00
Benoit Jacob
4926251f13
bug #1115 : enable static alignment on ARM outside of old-GCC
2015-11-18 10:55:23 -05:00
Benoit Steiner
bf792f59e3
Only enable the use of constexpr with nvcc if we're using version 7.5 or above
2015-11-13 12:24:22 -08:00
Benoit Steiner
1e1755352d
Made it possible to compute atan, tanh, sinh and cosh on GPU
2015-11-12 20:19:38 -08:00
Benoit Steiner
e4d45f3440
Only enable the use of const expression when nvcc is called with the -std=c++11 option
2015-11-12 18:18:35 -08:00
Benoit Steiner
8037826367
Simplified more of the IndexList code.
2015-11-12 17:19:45 -08:00
Gael Guennebaud
dfbb889fe9
Fix missing Dynamic versus HugeCost changes
2015-11-12 12:09:48 +01:00
Gael Guennebaud
e701cb2c7c
Update EIGEN_FAST_MATH doc
2015-11-12 12:09:19 +01:00
Benoit Steiner
4f471146fb
Allow the vectorized version of the Binary and the Nullary functors to run on GPU
2015-11-11 15:19:00 -08:00
Gael Guennebaud
e73ef4f25e
bug #1109 : use noexcept instead of throw for C++11 compilers
2015-12-10 14:21:23 +01:00
Gael Guennebaud
145ad5d800
Use more explicit names.
2015-12-10 12:03:38 +01:00
Gael Guennebaud
75f0fe3795
Fix usage of "Index" as a compile time integral.
2015-12-10 12:01:06 +01:00
Gael Guennebaud
f248249c1f
bug #1113 : fix name conflict with C99's "I".
2015-12-10 11:57:57 +01:00
Gael Guennebaud
fbe18d5507
Forbid the creation of SparseCompressedBase object
2015-12-09 15:47:32 +01:00
Gael Guennebaud
dc73430d4b
bug #1074 : forbid the creation of PlainObjectBase object by making its ctor protected
2015-12-09 15:47:08 +01:00
Gael Guennebaud
1257fbd2f9
Fix sign-unsigned issue in enum
2015-12-09 10:06:42 +01:00
Gael Guennebaud
4549549992
Fix and clarify documentation of Transform wrt operator*(MatrixBase)
2015-12-08 16:21:49 +01:00
Gael Guennebaud
543bd28a24
Fix Alignment in coeff-based product, and enable unaligned vectorization
2015-12-08 11:28:05 +01:00
Benoit Steiner
d27e4f1cba
Added missing EIGEN_DEVICE_FUNC statements
2015-11-06 09:23:58 -08:00
Benoit Steiner
ed1962b464
Reimplement the tensor comparison operators by using the scalar_cmp_op functors. This makes them more cuda friendly.
2015-11-06 09:18:43 -08:00
Gael Guennebaud
bfd6ee64f3
bug #1105 : fix default preallocation when moving from compressed to uncompressed mode
2015-11-06 15:05:37 +01:00
Gael Guennebaud
ae87f094eb
Fix "," in non SSE4 mode
2015-11-05 12:08:36 +01:00
Gael Guennebaud
90323f1751
Fix AVX round/ceil/floor, and fix respective unit test
2015-11-04 22:15:57 +01:00
Gael Guennebaud
3dd24bdf99
Merged in aavenel/eigen (pull request PR-142)
...
Add round, ceil and floor for SSE4.1/AVX (Bug #70 )
2015-11-04 18:26:38 +01:00
Gael Guennebaud
902750826b
Add support for dense.cwiseProduct(sparse)
...
This also fixes a regression regarding (dense*sparse).diagonal()
2015-11-04 17:42:07 +01:00
Gael Guennebaud
f6b1deebab
Fix compilation of sparse-triangular to dense assignment
2015-11-04 17:02:32 +01:00
Benoit Steiner
36cd6daaae
Made the CUDA implementation of ploadt_ro compatible with cuda implementations older than 3.5
2015-11-03 16:36:30 -08:00
Gael Guennebaud
29a94c8055
compilation issue
2015-11-02 16:11:59 +01:00
Alexandre Avenel
38832e0791
Merge
2015-11-01 10:55:42 +01:00