Nicolas Mellado
1dd6a329e8
Cuda compatibility: remove explicit call to std math functions
2015-07-11 19:40:15 +02:00
Nicolas Mellado
bc40eb745d
Merged eigen/eigen into default
2015-07-11 19:33:43 +02:00
Benoit Steiner
6de6fa9483
Use NumTraits<T>::RequireInitialization instead of internal::is_arithmetic<T>::value to check whether it's possible to bypass the type constructor in the tensor code.
2015-07-07 15:23:56 -07:00
Benoit Steiner
7b7df7b6b8
Updated internal::is_arithmetic::value to be true for complex numbers
2015-07-07 12:57:35 -07:00
Gael Guennebaud
7fa6fe8d8c
typo
2015-07-07 17:47:24 +02:00
Gael Guennebaud
fa17358c4b
Rotation2D: fix slerp to take the shortest path, and add convenient method to get the angle in [-pi,pi] or [0,pi]
2015-07-07 17:27:12 +02:00
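A minimal sketch of what "taking the shortest path" means when interpolating between two planar rotation angles, assuming plain doubles rather than Eigen's Rotation2D class. The names wrap_to_pi and slerp_angle are hypothetical and are not part of Eigen's API.

```cpp
#include <cmath>

// Hypothetical helper: wrap an angle (radians) into the range [-pi, pi].
double wrap_to_pi(double a) {
    const double pi = 3.14159265358979323846;
    double r = std::fmod(a, 2.0 * pi);  // r is now in (-2*pi, 2*pi)
    if (r > pi) {
        r -= 2.0 * pi;
    } else if (r < -pi) {
        r += 2.0 * pi;
    }
    return r;
}

// Hypothetical helper: interpolate from angle a to angle b along the
// shorter arc. The difference b - a is wrapped first, so the rotation
// never sweeps more than half a turn.
double slerp_angle(double a, double b, double t) {
    return a + t * wrap_to_pi(b - a);
}
```

With a = 3.0 and b = -3.0, the wrapped difference is about 0.283 rather than -6, so the midpoint lands near pi instead of at 0.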
Nicolas Mellado
5359e5cdb2
Protect against compilation errors with nvcc and numext/complex.
...
Disable functions explicitly involving std::complex when compiling with nvcc.
Improve code compatibility using the new macro EIGEN_USING_NUMEXT_MATH (same spirit as EIGEN_USING_STD_MATH but for numext functions)
2015-07-06 20:55:01 +02:00
Gael Guennebaud
c2019dfeb3
Merged in Emie/eigen (pull request PR-121)
...
typo correction in mathFunction
2015-07-06 16:48:54 +02:00
Emilie Guy
ea7113dd0c
typo correction in mathFunction
2015-07-06 14:31:08 +02:00
Nicolas Mellado
9115896590
Merged eigen/eigen into default
2015-07-03 00:41:11 +02:00
Benoit Steiner
95ef94f1ee
Fixed a typo in the patch
2015-07-02 07:06:55 +00:00
Benoit Steiner
44eedd8915
Marked the cast functions as EIGEN_DEVICE_FUNC to ensure that we can run casting on GPUs
2015-06-30 15:48:55 -07:00
Gael Guennebaud
c911fc8dee
split compiler intensive bdcsvd_1 unit test
2015-06-26 16:14:23 +02:00
Gael Guennebaud
98ff17eb9e
Add special path for matrix<complex>/real.
...
This also fixes underflow issues when scaling complex matrices through complex/complex operator.
2015-06-26 16:08:15 +02:00
Gael Guennebaud
e102ddbf1f
bug #1026 : fix infinite loop for an empty input
2015-06-26 14:02:52 +02:00
Gael Guennebaud
555b9c6843
Doc: explain perf and multithreading issues in sparse iterative solvers
2015-06-26 10:49:40 +02:00
Gael Guennebaud
53b930887d
Enable OpenMP parallelization of row-major-sparse * dense products.
...
I observed significant speed-up of the CG solver.
2015-06-26 10:32:34 +02:00
Gael Guennebaud
7f824dd613
Optimize CG to enable faster sparse row-major * dense vector products when the input matrix is complete (Upper|Lower) but column major.
2015-06-25 17:17:38 +02:00
Gael Guennebaud
33e699c9fe
Remove redundant accessors in Reverse
2015-06-25 14:14:59 +02:00
Gael Guennebaud
973b0a90db
Clarify documentation of the tolerance and error returned in iterative solvers
2015-06-25 13:51:13 +02:00
Gael Guennebaud
b4ab72678c
bug #1000 : MSVC 2013 does need the operator= workaround
2015-06-25 09:45:22 +02:00
Gael Guennebaud
788941d3b1
Workaround MSVC ambiguous instantiation
2015-06-24 23:35:17 +02:00
Gael Guennebaud
4c8cd13b35
Add explicit ctor for diagonal to sparse conversion
2015-06-24 18:11:06 +02:00
Gael Guennebaud
c38c195321
Document how cross behaves on complex numbers
2015-06-24 18:02:33 +02:00
Gael Guennebaud
62f21e2d11
Add support for sparse = diagonal
2015-06-24 17:55:00 +02:00
Gael Guennebaud
763c833637
Make SparseSelfAdjointView, twists, and SparseQR more evaluator friendly
2015-06-24 17:54:09 +02:00
Gael Guennebaud
36643eec0c
Add a call_assignment_no_alias_no_transpose shortcut
2015-06-24 17:50:43 +02:00
Gael Guennebaud
02db7c9bc6
Inherit operator+= and -= with 'using' keyword
2015-06-24 17:49:20 +02:00
Gael Guennebaud
95e19be381
Fix compilation of MKL Pardiso support
2015-06-24 14:53:43 +02:00
Gael Guennebaud
b0d08869a9
Fix underflow in 3x3 tridiagonalization
2015-06-23 14:54:31 +02:00
Gael Guennebaud
71523a2e25
Fix a warning with icc
2015-06-23 14:20:20 +02:00
Gael Guennebaud
d9778f3391
Enable VML's pow wrapper on windows (the previous wrapper used the Fortran interface)
2015-06-23 14:04:50 +02:00
Gael Guennebaud
5f9630d7f9
bug #923 : update support for Intel's VML wrt new evaluation mechanisms
2015-06-23 14:03:25 +02:00
Gael Guennebaud
793e4c6d77
bug #923 : fix EIGEN_USE_BLAS mode
2015-06-23 11:13:24 +02:00
Gael Guennebaud
307c4fc292
Workaround misalignment produced by first_aligned for PacketSize==1 and size==1
2015-06-23 10:10:17 +02:00
Gael Guennebaud
bb3a9b4941
Use Ref<> to bypass forceAlignmentIf
2015-06-22 17:48:28 +02:00
Gael Guennebaud
476beed7f8
bug #1017 : apply Christoph's patch preventing underflows in makeHouseholder
2015-06-22 16:51:45 +02:00
Gael Guennebaud
0848ba0a6e
Fix nullary return types: they must be based on the PlainObject type instead of the expression type.
2015-06-22 10:52:08 +02:00
Nicolas Mellado
ad5fdc7ddd
Fix double to Scalar unwanted promotions
2015-06-21 20:21:23 +02:00
Gael Guennebaud
40821876ea
Fix regression on CompressedStorage::operator=
2015-06-20 13:59:13 +02:00
Gael Guennebaud
84aaef93ba
Merged in vanhoucke/eigen_vanhoucke (pull request PR-118)
...
Fix two small undefined behaviors caught by static analysis.
2015-06-20 13:56:48 +02:00
Gael Guennebaud
6b33b29f00
Get rid of must_nest_by_value
2015-06-19 18:12:40 +02:00
Gael Guennebaud
846b227bb7
Get rid of class internal::nested<> (still have to update the Tensor module)
2015-06-19 17:56:39 +02:00
vanhoucke
368ea23406
Fix undefined behavior. When resizing a default-constructed SparseArray, we end up calling memcpy(ptr, 0, 0), which is technically UB and gets caught by static analysis.
2015-06-19 15:53:30 +00:00
Gael Guennebaud
386d9e5ebd
Fix usage of nested versus nested_eval
2015-06-19 17:42:27 +02:00
Gael Guennebaud
a5a7b68b76
Fix ambiguous instantiation using clean class-level SFINAE in product_evaluator
2015-06-19 17:25:13 +02:00
Gael Guennebaud
6fc5438205
Remove a few deprecated internal expressions
2015-06-19 17:06:12 +02:00
Gael Guennebaud
5c84dd5665
Fix permutation/transpositions products wrt nested_eval
2015-06-19 16:37:04 +02:00
Gael Guennebaud
0c8b0e007b
Introduce an AliasFreeProduct option for Permutations and Transpositions
2015-06-19 15:38:19 +02:00
Gael Guennebaud
3f6aa4cd5d
Remove useless specializations of evaluator_traits
2015-06-19 14:18:29 +02:00
Gael Guennebaud
4a8888dfbc
Improve compatibility of Transpositions and evaluators
2015-06-19 14:10:44 +02:00
Gael Guennebaud
3af4c6c1c9
Make Transpositions use evaluators
2015-06-19 11:50:24 +02:00
Gael Guennebaud
82b6ac0864
Enforce eigenvectors to be column-major (for performance reasons)
2015-06-19 11:25:46 +02:00
Gael Guennebaud
fad36cc814
Clean implementation of permutation * matrix products.
2015-06-19 10:51:57 +02:00
Gael Guennebaud
06036d8bb1
Fix compilation of BDCSVD with DEFAULT_TO_ROWMAJOR
2015-06-19 10:37:25 +02:00
Gael Guennebaud
d2db15016b
Fix storage order computation in traits<Product>
2015-06-19 10:36:38 +02:00
Gael Guennebaud
7baa1ba03e
Remove the usage of result_of for DenseBase::redux as discussed in bug #1006
2015-06-15 22:40:18 +02:00
Gael Guennebaud
97cbe28829
Remove support of std::binder* in C++11
2015-06-15 15:34:05 +02:00
Gael Guennebaud
972a535288
Remove aligned-on-scalar assert and fallback to non vectorized path at runtime (first_aligned already had this runtime guard)
2015-06-14 15:04:07 +02:00
Gael Guennebaud
a546be56e0
typo
2015-06-15 15:08:51 +02:00
Gael Guennebaud
321a2cbe3d
Add missing forward declaration of AlignedBox
2015-06-15 15:01:20 +02:00
Gael Guennebaud
91b64a9c65
Relax aligned-on-scalar assert as in the 3.2 branch
2015-06-12 11:25:57 +02:00
Gael Guennebaud
84d103bee8
Enable C++11 math functions in a more conservative manner.
2015-06-11 21:45:02 +02:00
Gael Guennebaud
d93ba137f2
Introduce EIGEN_PI, get rid of M_PI and acos(-1.0)
2015-06-10 17:12:10 +02:00
Gael Guennebaud
9756c7fb4d
Make more use of EIGEN_HAS_C99_MATH
2015-06-10 16:26:55 +02:00
Gael Guennebaud
93a62265dc
fix isinf(complex(inf,NaN)) to return false.
2015-06-10 16:19:10 +02:00
Gael Guennebaud
b0d5aaafcc
Rename free functions isFinite, isInf, isNaN to be compatible with c++11
2015-06-10 16:17:09 +02:00
Gael Guennebaud
25a98be948
bug #80 : merge with d_hood branch on adding more coefficient-wise unary array functors
2015-06-10 15:52:05 +02:00
Gael Guennebaud
192bce2795
bug #890 , add a more general routine to check that two dense objects refer to the same data
2015-06-10 10:09:04 +02:00
Gael Guennebaud
0b2cbb2bdc
bug #897 : make umfpack support use Ref<>
2015-06-09 23:30:06 +02:00
Gael Guennebaud
feaf76c001
bug #910 : add a StandardCompressedFormat option to Ref<SparseMatrix> to enforce standard compressed storage format.
...
If the input is not compressed, then this triggers a copy for a const Ref, and a runtime assert for non-const Ref.
2015-06-09 23:11:24 +02:00
Gael Guennebaud
f899aeb301
bug #650 : fix sparse * dense wrt noalias and compound assignment
2015-06-09 18:33:24 +02:00
Gael Guennebaud
785b9c0127
bug #1003 : assert in MapBase if the provided pointer is not aligned on scalar while it is expected to be. Also add an EIGEN_ALIGN8 macro.
2015-06-09 17:42:09 +02:00
Gael Guennebaud
4aba24a1b2
Clean argument names of some functions
2015-06-09 13:32:12 +02:00
Gael Guennebaud
302cf8ffe2
Add missing documentation for TriangularViewImpl<MatrixType,Mode,Sparse>
2015-06-09 11:40:07 +02:00
Gael Guennebaud
3a4299b245
bug #872 : remove usage of deprecated bind1st.
2015-06-09 10:52:04 +02:00
Gael Guennebaud
9a2447b0c9
Fix shadow warnings triggered by clang
2015-06-09 09:11:12 +02:00
Gael Guennebaud
cd8b996f99
Extend unit test and documentation of SelfAdjointEigenSolver::computeDirect
2015-06-08 16:16:42 +02:00
Gael Guennebaud
8f031a3cee
bug #997 : add missing evaluators for m.lazyProduct(v.homogeneous())
2015-06-08 15:43:41 +02:00
Gael Guennebaud
274b1f5d7e
Fix homogeneous() for 1x1 matrix: in this case, homogeneous follows the storage order guaranteeing that v.transpose().homogeneous() == v.homogeneous().transpose()
2015-06-08 15:40:51 +02:00
Gael Guennebaud
cbe3a1a83e
Add missing accessors for 1D index based access to Replicate<> expressions.
2015-06-08 15:39:09 +02:00
Gael Guennebaud
a7ae628c9f
bug #1005 : fix regression regarding sparse coeff-wise binary operator that did not trigger a static assertion for mismatched storage
2015-06-08 10:14:08 +02:00
Gael Guennebaud
0a9b5d1396
bug #705 : fix handling of Lapack potrf return code
2015-06-05 15:59:13 +02:00
Gael Guennebaud
d0b7b5cb55
minor documentation fixes
2015-06-05 14:40:07 +02:00
Gael Guennebaud
56d4ef7ad6
BiCGSTAB: set default guess to 0, and improve restart mechanism by recomputing the accurate residual.
2015-06-05 14:37:57 +02:00
Gael Guennebaud
d457734a19
Avoid calling smart_copy with null pointers.
2015-05-25 22:30:56 +02:00
Benoit Jacob
051d5325cc
Abandon blocking size lookup table approach. Not performing as well in real world as in microbenchmark.
2015-05-19 11:03:59 -04:00
Christoph Hertzberg
ebea530782
bug #1014 : More stable direct computation of eigenvalues and -vectors for 3x3 matrices
2015-05-17 21:54:32 +02:00
Benoit Jacob
c88e1abaf3
also uninitialized here, see previous cset
2015-05-15 11:34:57 -04:00
Benoit Jacob
807793ec3b
Fix uninitialized var warning. The compiler was clearing the register anyway, so this does not change resulting code
2015-05-15 11:15:53 -04:00
Pete Warden
140f85bb99
Check for the macro __ARM_NEON__ (with two underscores at the end) as well as __ARM_NEON. The second macro is correct according to the ARM language extensions specification, but historically the first one has been more common. Some older compilers (e.g. gcc v4.6 on a Beaglebone Black) only define the first, so without this patch NEON isn't enabled.
2015-05-12 16:03:43 -07:00
Gael Guennebaud
ef81730625
Ignore denormal numbers in selfadjoint eigensolver.
2015-05-12 18:38:43 +02:00
Christoph Hertzberg
494fa991c3
bug #872 : Avoid deprecated binder1st/binder2nd usage by providing custom functors for comparison operators
2015-05-07 17:28:40 +02:00
Gael Guennebaud
4a936974a5
bug #1013 : fix 2x2 direct eigensolver for identical eigenvalues
2015-05-07 15:55:12 +02:00
Gael Guennebaud
ebf8ca4fa8
Fix bug #1010 : m_isInitialized was improperly updated
2015-05-07 14:20:42 +02:00
Konstantinos Margaritis
dd698e6680
Merged in doug_kwan/eigen (pull request PR-103)
...
Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of
2015-05-05 20:50:14 +03:00
Benoit Steiner
1dded10cb7
Added a double-precision implementation of the exp() function for AVX.
2015-05-04 10:42:51 -07:00
Christoph Hertzberg
28a4c92cbf
bug #998 : Started fixing doxygen warnings
2015-05-01 22:10:41 +02:00
Christoph Hertzberg
173b34e9ab
bug #999 : clarify that behavior of empty AlignedBoxes is undefined, and further improvements in documentation
2015-04-30 19:30:36 +02:00
Gael Guennebaud
de18cd413d
Disable posix_memalign on Solaris and SunOS, and allow bypassing the built-in posix_memalign detection rules.
2015-04-24 11:26:51 +02:00
Gael Guennebaud
40258078c6
bug #360 : add value_type typedef to DenseBase/SparseMatrixBase
2015-04-24 09:44:24 +02:00
Christoph Hertzberg
c460af414e
Fix bug #1000 : Manually inherit assignment operators for MSVC 2013 and later (as required by the standard).
2015-04-23 13:39:03 +02:00
Gael Guennebaud
dbd12b4cda
Make sure that BlockImpl<const SparseMatrix> ctor is called with the right type
2015-04-21 10:15:36 +02:00
Gael Guennebaud
d6a8b43b39
Fix typo in the definition of EIGEN_COMP_GNUC_STRICT
2015-04-21 10:12:38 +02:00
Deanna Hood
e5048b5501
Use std::isfinite when available
2015-04-20 14:59:57 -04:00
Deanna Hood
249c48ba00
Incorporate C++11 check into EIGEN_HAS_C99_MATH macro
2015-04-20 14:57:04 -04:00
Deanna Hood
0250f4a9f2
Merged default into unary-array-cwise-functors
2015-04-20 14:01:35 -04:00
Deanna Hood
0339502a4f
Only use std::isnan and std::isinf if they are available
2015-04-20 13:14:06 -04:00
Gael Guennebaud
fc2d5b86ce
simplify previous changeset: sub-expressions are nested by value
2015-04-18 22:50:16 +02:00
Gael Guennebaud
5a3c48e3c6
bug #942 : fix dangling references in evaluator of diagonal * sparse products.
2015-04-18 22:43:27 +02:00
Christoph Hertzberg
4f126b862d
Add internal assertions to purely fixed-size DenseStorage, mark optional variables always as unused
2015-04-17 11:36:21 +02:00
Christoph Hertzberg
9d7843d0d0
Add internal assertions to DenseStorage constructor
2015-04-16 15:47:06 +02:00
Christoph Hertzberg
3be9f5c4d7
Constructing a Matrix/Array with implicit transpose could lead to memory leaks.
...
Also reduced code duplication for Matrix/Array constructors
2015-04-16 13:25:20 +02:00
Gael Guennebaud
e0cff9ae0d
Fix bug #996 : fix comparisons to 0 instead of Scalar(0)
2015-04-15 14:48:53 +02:00
Gael Guennebaud
5dbe758dc3
Backed out changeset 04c8c5d9ef
2015-04-15 14:47:08 +02:00
Gael Guennebaud
04c8c5d9ef
Fix bug #996 : fix comparisons to 0 instead of Scalar(0)
2015-04-15 14:44:57 +02:00
Benoit Steiner
0f82399fe9
Pulled latest changes from trunk
2015-04-14 19:13:34 -07:00
Christoph Hertzberg
761691f18d
Make conversion from 0 to Scalar explicit (issue reported by Brad Bell)
2015-04-13 17:15:00 +02:00
Benoit Steiner
5401fbcc50
Improved the blocking strategy to speedup multithreaded tensor contractions.
2015-04-09 16:44:10 -07:00
Deanna Hood
085aa8e601
Don't use M_PI since it's only guaranteed to be defined in Eigen/Geometry
2015-04-08 13:59:18 -05:00
Gael Guennebaud
0eb220c00d
add a note on bug #992
2015-04-08 09:25:34 +02:00
Benoit Jacob
d7f51feb07
bug #992 : don't select a 3p GEMM path with non-vectorizable scalar types, this hits unsupported paths in symm/triangular products code
2015-04-07 15:13:55 -04:00
Benoit Steiner
7c18ab921c
Pulled latest updates from trunk
2015-04-04 20:07:04 -07:00
Gael Guennebaud
15b5adb327
Fix regression in DynamicSparseMatrix and SuperLUSupport wrt recent change on nonZeros/nonZerosEstimate
2015-04-02 22:21:41 +02:00
Benoit Steiner
74e558cfa8
Pulled latest updates from trunk
2015-04-01 23:24:11 -07:00
Gael Guennebaud
5861cfb55e
Remove unused GenericSparseBlockInnerIteratorImpl code.
2015-04-01 22:29:29 +02:00
Gael Guennebaud
3105986e71
bug #875 : remove broken SparseMatrixBase::nonZeros and introduce a nonZerosEstimate() method to sparse evaluators for internal uses.
...
Factorize some code in SparseCompressedBase.
2015-04-01 22:27:34 +02:00
Gael Guennebaud
39dcd01b0a
bug #973 : enable alignment of multiples of half-packet size (e.g., Vector6d with AVX)
2015-04-01 13:55:09 +02:00
Gael Guennebaud
8481dc21ea
bug #986 : add support for coefficient-based product with 0 depth.
2015-04-01 13:15:23 +02:00
Gael Guennebaud
79b4e6acaf
Fix bug #987 : wrong alignment guess in diagonal product.
2015-03-31 23:35:12 +02:00
Gael Guennebaud
3c38589984
Remove most of the dynamic memory allocations that occurred in D&C SVD. The calls to JacobiSVD and UpperBidiagonalization still remain.
2015-03-31 22:54:47 +02:00
Gael Guennebaud
8313fb7df7
Add row/column-wise reverseInPlace feature.
2015-03-31 21:35:53 +02:00
Gael Guennebaud
dfb674a25e
Make reverseInPlace really work in-place.
2015-03-31 20:17:10 +02:00
Gael Guennebaud
20d030f207
Fix vectorization of swap for non trivial expressions
2015-03-31 20:16:02 +02:00
Benoit Jacob
73cdeae1d3
Only use blocking sizes LUTs for single-thread products for now
2015-03-31 11:17:23 -04:00
Benoit Jacob
0cbd5ae3cb
Correctly detect Android with ndk_build
2015-03-31 11:17:21 -04:00
Gael Guennebaud
ae01c05e18
Fix computeProductBlockingSizes with m==0, and add respective unit test.
2015-03-31 15:19:57 +02:00
Gael Guennebaud
bd76d837e6
Fix sign of SuperLU::determinant
2015-03-31 14:57:32 +02:00
Gael Guennebaud
35d3053d55
Fix regression introduced in 3b169d792d
2015-03-31 09:23:53 +02:00
Christoph Hertzberg
3b169d792d
Suppress unused variable warning
2015-03-31 00:49:08 +02:00
Christoph Hertzberg
1efae98fee
bug #985 : RealQZ failed when either matrix had zero rows or columns (report and patch by Ben Goodrich)
...
Also added a regression test
2015-03-30 23:56:20 +02:00
Benoit Steiner
35722fa022
Made the index type a template parameter of the tensor class instead of encoding it in the options.
2015-03-30 14:55:54 -07:00
Christoph Hertzberg
58af8bf90c
bug #982 : Make sure numext::maxi and numext::mini are called correctly, in case Scalar expressions return expression templates.
2015-03-30 16:47:22 +02:00
Gael Guennebaud
2adbf6b8ca
fix stupid warning with old GCC
2015-03-28 22:34:54 +01:00
Gael Guennebaud
41e20248f8
merge
2015-03-28 14:43:35 +01:00
Christoph Hertzberg
09a5361d1b
bug #983 : Pass Vector3 by const reference and not by value
2015-03-28 12:36:24 +01:00
Gael Guennebaud
eb7e4c2b9c
Pass Vector3 type by reference
2015-03-27 12:11:24 +01:00
Gael Guennebaud
79cb875249
merge
2015-03-27 10:56:04 +01:00
Gael Guennebaud
1b8cc9af43
Slight numerical stability improvement in 2x2 svd
2015-03-27 10:55:00 +01:00
Gael Guennebaud
3d59ae0203
Fix hypot(0,0).
2015-03-27 09:59:24 +01:00
Benoit Steiner
d3f7915aeb
Pulled latest update from the eigen main codebase
2015-03-24 13:12:14 -07:00
Benoit Steiner
abdbe8562e
Fixed the CUDA packet primitives
2015-03-24 10:45:46 -07:00
Gael Guennebaud
29eaa2b0f1
Make MatrixBase::is* methods aware of nested_eval.
2015-03-24 13:42:42 +01:00
Gael Guennebaud
d27968eb7e
D&C SVD: directly falls back to JacobiSVD for very small problems (by-pass upper-bidiagonalization)
2015-03-24 13:38:07 +01:00
Gael Guennebaud
4472f3e578
Avoid SVD: consider denormalized small numbers as zero when computing the rank of the matrix
2015-03-23 09:40:21 +01:00
Deanna Hood
83e5b7656b
Use M_PI instead of acos(-1) for pi
2015-03-22 06:04:31 +10:00
Deanna Hood
4bab4790c0
Add \sa tags of isFinite/isInf for each other
2015-03-22 05:39:08 +10:00
Gael Guennebaud
4e2b18d909
Update approx. minimum ordering method to push and keep structural empty diagonal elements to the bottom-right part of the matrix
2015-03-20 16:33:48 +01:00
Gael Guennebaud
d6b2f300db
Fix MSVC compilation: aligned type must be passed by reference
2015-03-19 17:28:32 +01:00
Gael Guennebaud
61c45d7cfd
Fix comparison warning
2015-03-19 17:13:22 +01:00
Gael Guennebaud
f329d0908a
Improve random number generation for integers and add unit test
2015-03-19 15:10:36 +01:00
Benoit Jacob
dc04f12967
use unsigned short instead of uint16_t which doesn't exist in c++98
2015-03-17 10:31:45 -04:00
Deanna Hood
596be3cd86
Use std::arg for real numbers when c++11 is used, custom implementation otherwise
2015-03-17 15:28:12 +10:00
Deanna Hood
e26134ed62
Use std::round when c++11 is used, custom implementation otherwise
2015-03-17 14:55:14 +10:00
Deanna Hood
e21e29a088
Update cost of arg call to depend on if the scalar is complex or not
2015-03-17 14:04:33 +10:00
Deanna Hood
447a5a6b01
Fix VML declarations to only be for real/complex as appropriate
2015-03-17 13:33:31 +10:00
Deanna Hood
f52b78491c
Remove packet isNaN, isInf, isFinite
2015-03-17 09:26:24 +10:00
Deanna Hood
1c78d6f2a6
Add boolean not operator (!) array support
2015-03-17 08:29:57 +10:00
Benoit Jacob
364cfd529d
Similar to cset 3589a9c115
...
, also in 2px4 kernel: actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment
2015-03-16 16:28:44 -04:00
Deanna Hood
e1d6e6c972
Make cube, inverse and abs2 free-functions
2015-03-17 06:25:24 +10:00
Benoit Jacob
577056aa94
Include stdint.h. Not going for cstdint because it is a C++11 addition. Needed for uint16_t at least, in lookup-table code.
2015-03-16 16:21:50 -04:00
Benoit Jacob
eb6929cb19
fix bug in maxsize calculation, which would cause products of size > 2048 to address the lookup table out of bounds
2015-03-16 16:15:47 -04:00
Deanna Hood
fef4e071d7
Rename isinf to isInf
2015-03-17 05:58:47 +10:00
Deanna Hood
46cf9cda32
Add isfinite array support as isFinite
2015-03-17 04:33:12 +10:00
Deanna Hood
1d76ceab55
Remove floor, ceil, round for complex numbers
2015-03-17 02:36:07 +10:00
Deanna Hood
717b7954ce
Update cost of coeff-wise arg call
2015-03-17 02:11:57 +10:00
Deanna Hood
fb68b149cb
Rename isnan to isNaN
2015-03-17 02:04:42 +10:00
Benoit Jacob
35c3a8bb84
Update Nexus 5 lookup table from combining now 2 runs of the benchmark, using the analyze-blocking-sizes partition tool. Gives better worst-case performance.
2015-03-16 11:05:51 -04:00
Benoit Jacob
e274607d7f
fix compilation with GCC 4.8
2015-03-16 10:48:27 -04:00
Benoit Jacob
151b8b95c6
Fix bug in case where EIGEN_TEST_SPECIFIC_BLOCKING_SIZE is defined but false
2015-03-15 19:10:51 -04:00
Benoit Jacob
02babb9c0f
Provide an empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now.
2015-03-15 18:13:12 -04:00
Benoit Jacob
3589a9c115
actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment
2015-03-15 18:12:18 -04:00
Benoit Jacob
1dd3d89818
Fix an unused-var warning
2015-03-15 18:07:19 -04:00
Benoit Jacob
e56aabf205
Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables
2015-03-15 18:05:12 -04:00
Benoit Jacob
488c15615a
organize a little our default cache sizes, and use a saner default L1 outside of x86 (10% faster on Nexus 5)
2015-03-13 14:51:26 -07:00
Gael Guennebaud
1330f8bbd1
bug #973 , improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled.
2015-03-13 21:15:50 +01:00
Gael Guennebaud
d99ab35f9e
Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get an unbiased behavior.
2015-03-13 21:12:46 +01:00
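For readers unfamiliar with the technique named above, here is a minimal sketch of rejection sampling for an unbiased integer in a closed range. It assumes a 32-bit int and uses std::mt19937 as the source of raw bits; the function name uniform_int_by_rejection is hypothetical, and this is not Eigen's internal::random implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper (not Eigen's internal::random): draw an integer
// uniformly from the closed range [lo, hi], never returning hi + 1.
int uniform_int_by_rejection(int lo, int hi, std::mt19937& engine) {
    assert(lo <= hi);
    // Number of values in [lo, hi], computed in 64 bits to avoid overflow.
    const std::uint64_t span = static_cast<std::uint64_t>(
        static_cast<std::int64_t>(hi) - static_cast<std::int64_t>(lo)) + 1;
    // std::mt19937 yields values in [0, 2^32 - 1].
    const std::uint64_t total = std::uint64_t(1) << 32;
    // Largest multiple of span that fits in the engine's output range.
    // Raw values at or above this limit would bias the modulo, so they
    // are rejected and redrawn.
    const std::uint64_t limit = total - (total % span);
    std::uint64_t r;
    do {
        r = engine();
    } while (r >= limit);
    return static_cast<int>(static_cast<std::int64_t>(lo) +
                            static_cast<std::int64_t>(r % span));
}
```

The rejection loop discards fewer than half of the raw draws in the worst case, so the expected number of engine calls stays below two.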
Gael Guennebaud
8580eb6808
bug #949 : add static assertion for incompatible scalar types in dense end-user decompositions.
2015-03-13 21:06:20 +01:00
Gael Guennebaud
a9df28c95b
SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases)
2015-03-13 21:00:21 +01:00
Gael Guennebaud
5ffe29cb9f
Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible.
2015-03-13 20:57:33 +01:00
Gael Guennebaud
2f6f8bf31c
Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests.
2015-03-13 16:24:40 +01:00
Doug Kwan
657407227e
Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of
...
doubles instead of swapping the doubles.
2015-03-11 15:13:37 -07:00
Deanna Hood
f89fcefa79
Add hyperbolic trigonometric functions from std array support
2015-03-11 13:13:30 +10:00
Deanna Hood
a5e49976f5
Add log10 array support
2015-03-11 08:56:42 +10:00
Deanna Hood
19a71056ae
Allow calling of square(array) in addition to array.square()
2015-03-11 06:59:28 +10:00
Deanna Hood
31fdd67756
Additional unary coeff-wise functors (e.g., isnan, round, arg)
2015-03-11 06:39:23 +10:00
Gael Guennebaud
fd78874888
Fix compilation of iterative solvers with dense matrices
2015-03-09 21:31:03 +01:00
Gael Guennebaud
d4317a85e8
Add typedefs for return types of SparseMatrixBase::selfadjointView
2015-03-09 21:29:46 +01:00
Gael Guennebaud
9e885fb766
Add unit tests for CG and sparse-LLT for long int as storage-index
2015-03-09 14:33:15 +01:00
Gael Guennebaud
224a1fe4c6
bug #963 : make IncompleteLUT compatible with non-default storage index types.
2015-03-09 13:55:20 +01:00
Gael Guennebaud
0ee391863e
Avoid underflow when blocking sizes are tuned manually.
2015-03-06 21:51:09 +01:00
Gael Guennebaud
14a5f135a3
bug #969 : workaround ambiguous calls to Ref using enable_if.
2015-03-06 17:51:31 +01:00
Gael Guennebaud
87681e508f
bug #978 : early return for vanishing products
2015-03-06 16:11:22 +01:00
Gael Guennebaud
cd3bbffa73
Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1)
2015-03-06 14:31:39 +01:00
Gael Guennebaud
58740ce4c6
Improve product kernel: replace the previous dynamic loop swapping strategy by a more general one:
...
It consists in increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited.
2015-03-06 10:30:35 +01:00
Gael Guennebaud
4c8b95d5c5
Rename LSCG to LeastSquaresConjugateGradient
2015-03-05 10:16:32 +01:00
Gael Guennebaud
7550107028
Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large"
2015-03-05 10:03:46 +01:00
Gael Guennebaud
2dc968e453
bug #824 : improve accuracy of Quaternion::angularDistance using atan2 instead of acos.
2015-03-04 17:03:13 +01:00
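A minimal sketch of why atan2 helps here, assuming unit quaternions and a hypothetical Quat struct rather than Eigen's Quaternion class. The relative rotation d = a * conj(b) is formed explicitly and the angle is recovered from both the vector and scalar parts of d, which avoids the precision loss of acos when the scalar part is close to 1. This may differ in detail from the code in the commit.

```cpp
#include <cmath>

// Hypothetical minimal quaternion type for illustration: w + xi + yj + zk.
struct Quat {
    double w, x, y, z;
};

// Angle in [0, pi] between the rotations represented by the unit
// quaternions a and b.
double angular_distance(const Quat& a, const Quat& b) {
    // d = a * conj(b) is the relative rotation between a and b.
    const double dw = a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
    const double dx = -a.w * b.x + b.w * a.x - (a.y * b.z - a.z * b.y);
    const double dy = -a.w * b.y + b.w * a.y - (a.z * b.x - a.x * b.z);
    const double dz = -a.w * b.z + b.w * a.z - (a.x * b.y - a.y * b.x);
    const double vec_norm = std::sqrt(dx * dx + dy * dy + dz * dz);
    // Taking |dw| treats q and -q as the same rotation.
    return 2.0 * std::atan2(vec_norm, std::fabs(dw));
}
```

For the identity and a quarter turn about the z axis, the function returns pi/2 up to rounding.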
Benoit Steiner
0196141938
Fixed the optimized AVX implementation of the fast rsqrt function
2015-03-02 13:49:39 -08:00
Benoit Steiner
4fd7f47692
Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.
2015-03-02 09:38:47 -08:00
Benoit Steiner
fb53384b0f
Improved the default implementation of prsqrt
2015-02-28 01:51:26 -08:00
Benoit Steiner
306fceccbe
Pulled latest updates from trunk
2015-02-27 13:05:26 -08:00
Benoit Steiner
2386fc8528
Added support for 32bit indices on a per-tensor/tensor-expression basis. This enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary.
2015-02-27 12:57:13 -08:00
Benoit Jacob
6466fa63be
Reimplement the selection between rotating and non-rotating kernels
...
using templates instead of macros and if()'s.
That was needed to fix the build of unit tests on ARM, which I had
broken. My bad for not testing earlier.
2015-02-27 15:30:10 -05:00
Benoit Steiner
05089aba75
Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts
2015-02-27 09:27:30 -08:00
Benoit Steiner
573b377110
Added support for vectorized type casting of tensors
2015-02-27 08:46:04 -08:00
Benoit Jacob
2fc3b484d7
remove trailing comma
2015-02-27 11:37:45 -05:00
Benoit Jacob
33669348c4
Disable Packet2f/2i halfpacket support in NEON.
...
I believe that it was erroneously turned on, since Packet2f/2i intrinsics are unimplemented,
and code trying to use halfpackets just fails to compile on NEON, as it tries to use the
default implementation of pload/pstore and the types don't match.
2015-02-27 11:35:37 -05:00
Benoit Jacob
b7fc8746e0
Replace a static assert by a runtime one, fixes the build of unit tests on ARM
...
Also safely assert in the non-implemented path that should never be taken in practice,
and would return wrong results.
2015-02-27 10:01:59 -05:00
Benoit Steiner
f41b1f1666
Added support for fast reciprocal square root computation.
2015-02-26 09:42:41 -08:00
Gael Guennebaud
bcf9bb5c1f
Avoid packing rhs multiple-times when blocking on the lhs only.
2015-02-26 17:01:33 +01:00
Gael Guennebaud
4ec3f04b3a
Make sure that the block size computation is tested by our unit test.
2015-02-26 17:00:36 +01:00
Gael Guennebaud
a8ad8887bf
Implement a more generic blocking-size selection algorithm. See the inline explanations.
...
It performs extremely well on Haswell. The main issue is to reliably and quickly find the
actual cache size to be used for our 2nd level of blocking, that is: max(l2,l3/nb_core_sharing_l3)
2015-02-26 16:04:35 +01:00
Gael Guennebaud
400becc591
Fix typos in block-size testing code, and set peeling on k to 8.
2015-02-26 15:57:06 +01:00
Benoit Jacob
692136350b
So I extensively measured the impact of the offset in this prefetch. I tried offset values from 0 to 128 (on this float* pointer, so implicitly times 4 bytes).
...
On x86, I tested a Sandy Bridge with AVX with 12M cache and a Haswell with AVX+FMA with 6M cache on MatrixXf sizes up to 2400.
I could not see any significant impact of this offset.
On Nexus 5, the offset has a slight effect: values around 32 (times sizeof float) are worst. Anything else is the same: the current 64 (8*pk), or... 0.
So let's just go with 0!
Note that we needed a fix anyway for not accounting for the value of RhsProgress. 0 nicely avoids the issue altogether!
2015-02-25 12:37:14 -05:00
Christoph Hertzberg
531fa9de77
bug #970 : Add EIGEN_DEVICE_FUNC to RValue functions, in case Cuda supports RValue-references.
2015-02-24 21:03:28 +01:00
Benoit Jacob
26275b250a
Fix my recent prefetch changes:
...
- the first prefetch is actually harmful on Haswell with FMA,
but it is the most beneficial on ARM.
- the second prefetch... I was very stupid and multiplied by sizeof(scalar)
an offset of a scalar* pointer. The old offset was 64; pk = 8, so 64=pk*8.
So this effectively restores the older offset. Actually, there were
two prefetches here, one with offset 48 and one with offset 64. I could not
confirm any benefit from this strange 48 offset on either the haswell or
my ARM device.
2015-02-23 16:55:17 -05:00
Christoph Hertzberg
052b6b40f1
Fix two trivial warnings
2015-02-22 12:40:51 +01:00
Christoph Hertzberg
ecbf2a6656
log1p is defined only for real Scalars in C++11
2015-02-21 19:58:24 +01:00
Gael Guennebaud
3cf642baa3
Fix compilation of unit tests disabling assertion checking
2015-02-21 14:13:48 +01:00
Gael Guennebaud
2da1594750
Fix doc of Ref<>
2015-02-20 11:52:22 +01:00
Gael Guennebaud
b192e29eae
In C++11 destructors do not throw by default (fix CommaInitializer unit test)
2015-02-20 09:28:34 +01:00
Benoit Steiner
ab41652d81
Pulled latest changes from trunk
2015-02-19 21:23:37 -08:00
Benoit Steiner
7765039f1c
Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up being executed on the GPU device.
2015-02-19 21:22:51 -08:00
Gael Guennebaud
a66f5fc2fd
Fix regression with C++11 support of lambda: now internal::result_of falls back to std::result_of in C++11.
2015-02-19 23:32:12 +01:00
Gael Guennebaud
1b7e12847d
Fix some calls to result_of on binary functors as unary ones.
2015-02-19 23:30:41 +01:00
Gael Guennebaud
0f4dd15dfc
Declare const some const variables
2015-02-19 23:28:57 +01:00
Gael Guennebaud
829dddd0fd
Add support for C++11 result_of/lambdas
2015-02-19 15:18:37 +01:00
Benoit Jacob
db05f2d01e
rotating kernel: avoid compiling anything outside of ARM
2015-02-18 15:43:52 -05:00
Benoit Jacob
0ed00d5438
remove a newly introduced redundant typedef - sorry.
2015-02-18 15:05:01 -05:00
Benoit Jacob
9bd8a4bab5
bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path
...
This is substantially faster on ARM, where it's important to minimize the number of loads.
This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome.
Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).
2015-02-18 15:03:35 -05:00
Hauke Heibel
ee27d50633
Fixed template parameter.
2015-02-18 18:51:08 +01:00
Gael Guennebaud
73a24de424
merge
2015-02-18 15:51:00 +01:00
Gael Guennebaud
63eb0f6fe6
Clean a bit computeProductBlockingSizes (use Index type, remove CEIL macro)
2015-02-18 15:49:05 +01:00
Benoit Jacob
4a3e6c8be1
bug #958 - Allow testing specific blocking sizes
...
This is only a debugging/testing patch. It allows testing specific
product blocking sizes, typically to study the impact on performance.
Example usage:
int testk, testm, testn;
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZES
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_K testk
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_M testm
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_N testn
#include <Eigen/Core>
2015-02-18 09:43:55 -05:00
Gael Guennebaud
c7bb1e8ea8
Fix a regression when using OpenMP, and fix bug #714 : the number of threads might be lower than the number of requested ones
2015-02-18 15:19:23 +01:00
Jan Blechta
168ceb271e
Really use zero guess in ConjugateGradients::solve as documented
...
and expected for consistency with other methods.
2015-02-18 14:26:10 +01:00
Gael Guennebaud
8fdcaded5e
merge
2015-03-04 10:18:08 +01:00
Gael Guennebaud
c43154bbc5
Check for no-reallocation in SparseMatrix::insert (bug #974 )
2015-03-04 10:16:46 +01:00
Gael Guennebaud
1ce0178363
Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974 )
2015-03-04 09:39:26 +01:00