Gael Guennebaud
deefa54a54
Fix tracking of temporaries in unit tests
2017-02-19 10:32:54 +01:00
Gael Guennebaud
cbbf88c4d7
Use int32_t instead of int in NEON code. Some platforms with 16 bytes int supports ARM NEON.
2017-02-17 14:39:02 +01:00
Gael Guennebaud
582b5e39bf
bug #1393 : enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2)
2017-02-17 14:10:57 +01:00
Benoit Steiner
31a25ab226
Merged eigen/eigen into default
2017-02-14 15:36:21 -08:00
Gael Guennebaud
5937c4ae32
Fall back is_integral to std::is_integral in c++11
2017-02-13 17:14:26 +01:00
Jonathan Hseu
3453b00a1e
Fix vector indexing with uint64_t
2017-02-11 21:45:32 -08:00
Gael Guennebaud
e7ebe52bfb
bug #1391 : include IO.h before DenseBase to enable its usage in DenseBase plugins.
2017-02-13 09:46:20 +01:00
Gael Guennebaud
b3750990d5
Workaround some gcc 4.7 warnings
2017-02-11 23:24:06 +01:00
Gael Guennebaud
c16ee72b20
bug #1392 : fix #include <Eigen/Sparse> with mpl2-only
2017-02-11 10:35:01 +01:00
Gael Guennebaud
e43016367a
Forgot to include a file in previous commit
2017-02-11 10:34:18 +01:00
Gael Guennebaud
6486d4fc95
Worakound gcc 4.7 issue in c++11.
2017-02-11 10:29:10 +01:00
Gael Guennebaud
4a4a72951f
Fix previous commits: disbale only problematic indexed view methods for old compilers instead of disabling everything.
...
Tested with gcc 4.7 (c++03) and gcc 4.8 (c++03 & c++11)
2017-02-11 10:28:44 +01:00
Benoit Steiner
fad776492f
Merged eigen/eigen into default
2017-02-10 14:27:43 -08:00
Benoit Steiner
1ef30b8090
Fixed bug introduced in previous commit
2017-02-10 13:35:10 -08:00
Benoit Steiner
769208a17f
Pulled latest updates from upstream
2017-02-10 13:11:40 -08:00
Benoit Steiner
8b3cc54c42
Added a new EIGEN_HAS_INDEXED_VIEW define that set to 0 for older compilers that are known to fail to compile the indexed views (I used the define from the indexed_views.cpp test).
...
Only include the indexed view methods when the compiler supports the code.
This makes it possible to use Eigen again in complex code bases such as TensorFlow and older compilers such as gcc 4.8
2017-02-10 13:08:49 -08:00
Gael Guennebaud
a1ff24f96a
Fix prunning in (sparse*sparse).pruned() when the result is nearly dense.
2017-02-10 13:59:32 +01:00
Gael Guennebaud
0256c52359
Include clang in the list of non strict MSVC (just to be sure)
2017-02-10 13:41:52 +01:00
Alexander Neumann
dd58462e63
fixed inlining issue with clang-cl on visual studio
...
(grafted from 7962ac1a58
)
2017-02-08 23:50:38 +01:00
Gael Guennebaud
fc8fd5fd24
Improve multi-threading heuristic for matrix products with a small number of columns.
2017-02-07 17:19:59 +01:00
Gael Guennebaud
4254b3eda3
bug #1389 : MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX)
2017-02-03 15:22:35 +01:00
Benoit Steiner
2db75c07a6
fixed the ordering of the template and EIGEN_DEVICE_FUNC keywords in a few more places to get more of the Eigen codebase to compile with nvcc again.
2017-02-01 15:41:29 -08:00
Benoit Steiner
fcd257039b
Replaced EIGEN_DEVICE_FUNC template<foo> with template<foo> EIGEN_DEVICE_FUNC to make the code compile with nvcc8.
2017-02-01 15:30:49 -08:00
Gael Guennebaud
0eceea4efd
Define EIGEN_COMP_GNUC to reflect version number: 47, 48, 49, 50, 60, ...
2017-02-01 23:36:40 +01:00
Gael Guennebaud
645a8e32a5
Fix compilation of JacobiSVD for vectors type
2017-01-31 16:22:54 +01:00
Gael Guennebaud
53026d29d4
bug #478 : fix regression in the eigen decomposition of zero matrices.
2017-01-31 14:22:42 +01:00
Benoit Steiner
fbc39fd02c
Merge latest changes from upstream
2017-01-30 15:25:57 -08:00
Gael Guennebaud
c86911ac73
bug #1384 : fix evaluation of "sparse/scalar" that used the wrong evaluation path.
2017-01-30 13:38:24 +01:00
Gael Guennebaud
d024e9942d
MSVC 1900 release is not c++14 compatible enough for us. The 1910 update seems to be fine though.
2017-01-27 22:17:59 +01:00
Rasmus Munk Larsen
edaa0fc5d1
Revert PR-292. After further investigation, the memcpy->memmove change was only good for Haswell on older versions of glibc. Adding a switch for small sizes is perhaps useful for string copies, but also has an overhead for larger sizes, making it a poor trade-off for general memcpy.
...
This PR also removes a couple of unnecessary semi-colons in Eigen/src/Core/AssignEvaluator.h that caused compiler warning everywhere.
2017-01-26 12:46:06 -08:00
Gael Guennebaud
25a1703579
Merged in ggael/eigen-flexidexing (pull request PR-294)
...
generalized operator() for indexed access and slicing
2017-01-26 08:04:23 +00:00
Gael Guennebaud
98dfe0c13f
Fix useless ';' warning
2017-01-25 22:55:04 +01:00
Gael Guennebaud
28351073d8
Fix unamed type as template argument (ok in c++11 only)
2017-01-25 22:54:51 +01:00
Gael Guennebaud
607be65a03
Fix duplicates of array_size bewteen unsupported and Core
2017-01-25 22:53:58 +01:00
Rasmus Munk Larsen
7d39c6d50a
Merged eigen/eigen into default
2017-01-25 09:22:26 -08:00
Rasmus Munk Larsen
5c9ed4ba0d
Reverse arguments for pmin in AVX.
2017-01-25 09:21:57 -08:00
Gael Guennebaud
850ca961d2
bug #1383 : fix regression in LinSpaced for integers and high<low
2017-01-25 18:13:53 +01:00
Gael Guennebaud
296d24be4d
bug #1381 : fix sparse.diagonal() used as a rvalue.
...
The problem was that is "sparse" is not const, then sparse.diagonal() must have the
LValueBit flag meaning that sparse.diagonal().coeff(i) must returns a const reference,
const Scalar&. However, sparse::coeff() cannot returns a reference for a non-existing
zero coefficient. The trick is to return a reference to a local member of
evaluator<SparseMatrix>.
2017-01-25 17:39:01 +01:00
Gael Guennebaud
d06a48959a
bug #1383 : Fix regression from 3.2 with LinSpaced(n,0,n-1) with n==0.
2017-01-25 15:27:13 +01:00
Rasmus Munk Larsen
ae3e43a125
Remove extra space.
2017-01-24 16:16:39 -08:00
Benoit Steiner
e96c77668d
Merged in rmlarsen/eigen2 (pull request PR-292)
...
Adds a fast memcpy function to Eigen.
2017-01-25 00:14:04 +00:00
Rasmus Munk Larsen
3be5ee2352
Update copy helper to use fast_memcpy.
2017-01-24 14:22:49 -08:00
Rasmus Munk Larsen
e6b1020221
Adds a fast memcpy function to Eigen. This takes advantage of the following:
...
1. For small fixed sizes, the compiler generates inline code for memcpy, which is much faster.
2. My colleague eriche at googl dot com discovered that for large sizes, memmove is significantly faster than memcpy (at least on Linux with GCC or Clang). See benchmark numbers measured on a Haswell (HP Z440) workstation here: https://docs.google.com/a/google.com/spreadsheets/d/1jLs5bKzXwhpTySw65MhG1pZpsIwkszZqQTjwrd_n0ic/pubhtml This is of course surprising since memcpy is a less constrained version of memmove. This stackoverflow thread contains some speculation as to the causes: http://stackoverflow.com/questions/22793669/poor-memcpy-performance-on-linux
Below are numbers for copying and slicing tensors using the multithreaded TensorDevice. The numbers show significant improvements for memcpy of very small blocks and for memcpy of large blocks single threaded (we were already able to saturate memory bandwidth for >1 threads before on large blocks). The "slicingSmallPieces" benchmark also shows small consistent improvements, since memcpy cost is a fair portion of that particular computation.
The benchmarks operate on NxN matrices, and the names are of the form BM_$OP_${NUMTHREADS}T/${N}.
Measured improvements in wall clock time:
Run on rmlarsen3.mtv (12 X 3501 MHz CPUs); 2017-01-20T11:26:31.493023454-08:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark Base (ns) New (ns) Improvement
------------------------------------------------------------------
BM_memcpy_1T/2 3.48 2.39 +31.3%
BM_memcpy_1T/8 12.3 6.51 +47.0%
BM_memcpy_1T/64 371 383 -3.2%
BM_memcpy_1T/512 66922 66720 +0.3%
BM_memcpy_1T/4k 9892867 6849682 +30.8%
BM_memcpy_1T/5k 14951099 10332856 +30.9%
BM_memcpy_2T/2 3.50 2.46 +29.7%
BM_memcpy_2T/8 12.3 7.66 +37.7%
BM_memcpy_2T/64 371 376 -1.3%
BM_memcpy_2T/512 66652 66788 -0.2%
BM_memcpy_2T/4k 6145012 6117776 +0.4%
BM_memcpy_2T/5k 9181478 9010942 +1.9%
BM_memcpy_4T/2 3.47 2.47 +31.0%
BM_memcpy_4T/8 12.3 6.67 +45.8
BM_memcpy_4T/64 374 376 -0.5%
BM_memcpy_4T/512 67833 68019 -0.3%
BM_memcpy_4T/4k 5057425 5188253 -2.6%
BM_memcpy_4T/5k 7555638 7779468 -3.0%
BM_memcpy_6T/2 3.51 2.50 +28.8%
BM_memcpy_6T/8 12.3 7.61 +38.1%
BM_memcpy_6T/64 373 378 -1.3%
BM_memcpy_6T/512 66871 66774 +0.1%
BM_memcpy_6T/4k 5112975 5233502 -2.4%
BM_memcpy_6T/5k 7614180 7772246 -2.1%
BM_memcpy_8T/2 3.47 2.41 +30.5%
BM_memcpy_8T/8 12.4 10.5 +15.3%
BM_memcpy_8T/64 372 388 -4.3%
BM_memcpy_8T/512 67373 66588 +1.2%
BM_memcpy_8T/4k 5148462 5254897 -2.1%
BM_memcpy_8T/5k 7660989 7799058 -1.8%
BM_memcpy_12T/2 3.50 2.40 +31.4%
BM_memcpy_12T/8 12.4 7.55 +39.1
BM_memcpy_12T/64 374 378 -1.1%
BM_memcpy_12T/512 67132 66683 +0.7%
BM_memcpy_12T/4k 5185125 5292920 -2.1%
BM_memcpy_12T/5k 7717284 7942684 -2.9%
BM_slicingSmallPieces_1T/2 47.3 47.5 +0.4%
BM_slicingSmallPieces_1T/8 53.6 52.3 +2.4%
BM_slicingSmallPieces_1T/64 491 476 +3.1%
BM_slicingSmallPieces_1T/512 21734 18814 +13.4%
BM_slicingSmallPieces_1T/4k 394660 396760 -0.5%
BM_slicingSmallPieces_1T/5k 218722 209244 +4.3%
BM_slicingSmallPieces_2T/2 80.7 79.9 +1.0%
BM_slicingSmallPieces_2T/8 54.2 53.1 +2.0
BM_slicingSmallPieces_2T/64 497 477 +4.0%
BM_slicingSmallPieces_2T/512 21732 18822 +13.4%
BM_slicingSmallPieces_2T/4k 392885 390490 +0.6%
BM_slicingSmallPieces_2T/5k 221988 208678 +6.0%
BM_slicingSmallPieces_4T/2 80.8 80.1 +0.9%
BM_slicingSmallPieces_4T/8 54.1 53.2 +1.7%
BM_slicingSmallPieces_4T/64 493 476 +3.4%
BM_slicingSmallPieces_4T/512 21702 18758 +13.6%
BM_slicingSmallPieces_4T/4k 393962 404023 -2.6%
BM_slicingSmallPieces_4T/5k 249667 211732 +15.2%
BM_slicingSmallPieces_6T/2 80.5 80.1 +0.5%
BM_slicingSmallPieces_6T/8 54.4 53.4 +1.8%
BM_slicingSmallPieces_6T/64 488 478 +2.0%
BM_slicingSmallPieces_6T/512 21719 18841 +13.3%
BM_slicingSmallPieces_6T/4k 394950 397583 -0.7%
BM_slicingSmallPieces_6T/5k 223080 210148 +5.8%
BM_slicingSmallPieces_8T/2 81.2 80.4 +1.0%
BM_slicingSmallPieces_8T/8 58.1 53.5 +7.9%
BM_slicingSmallPieces_8T/64 489 480 +1.8%
BM_slicingSmallPieces_8T/512 21586 18798 +12.9%
BM_slicingSmallPieces_8T/4k 394592 400165 -1.4%
BM_slicingSmallPieces_8T/5k 219688 208301 +5.2%
BM_slicingSmallPieces_12T/2 80.2 79.8 +0.7%
BM_slicingSmallPieces_12T/8 54.4 53.4 +1.8
BM_slicingSmallPieces_12T/64 488 476 +2.5%
BM_slicingSmallPieces_12T/512 21931 18831 +14.1%
BM_slicingSmallPieces_12T/4k 393962 396541 -0.7%
BM_slicingSmallPieces_12T/5k 218803 207965 +5.0%
2017-01-24 13:55:18 -08:00
Rasmus Munk Larsen
7b6aaa3440
Fix NaN propagation for AVX512.
2017-01-24 13:37:08 -08:00
Rasmus Munk Larsen
5e144bbaa4
Make NaN propagatation consistent between the pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op.
...
See #1373 for details.
2017-01-24 13:32:50 -08:00
Gael Guennebaud
d83db761a2
Add support for std::integral_constant
2017-01-24 16:28:12 +01:00
Gael Guennebaud
bc10201854
Add test for multiple symbols
2017-01-24 16:27:51 +01:00
Gael Guennebaud
c43d254d13
Fix seq().reverse() in c++98
2017-01-24 11:36:43 +01:00
Gael Guennebaud
ddd83f82d8
Add support for "SymbolicExpr op fix<N>" in C++98/11 mode.
2017-01-24 10:54:42 +01:00
Gael Guennebaud
228fef1b3a
Extended the set of arithmetic operators supported by FixedInt (-,+,*,/,%,&,|)
2017-01-24 10:53:51 +01:00
Gael Guennebaud
bb52f74e62
Add internal doc
2017-01-24 10:13:35 +01:00
Gael Guennebaud
41c523a0ab
Rename fix_t to FixedInt
2017-01-24 09:39:49 +01:00
Gael Guennebaud
ba3f977946
bug #1376 : add missing assertion on size mismatch with compound assignment operators (e.g., mat += mat.col(j))
2017-01-23 22:06:08 +01:00
Gael Guennebaud
b0db4eff36
bug #1382 : move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!)
2017-01-23 22:03:57 +01:00
Gael Guennebaud
ca79c1545a
Add std:: namespace prefix to all (hopefully) instances if size_t/ptrdfiff_t
2017-01-23 22:02:53 +01:00
Gael Guennebaud
4b607b5692
Use Index instead of size_t
2017-01-23 22:00:33 +01:00
Gael Guennebaud
0fe278f7be
bug #1379 : fix compilation in sparse*diagonal*dense with openmp
2017-01-21 23:27:01 +01:00
Gael Guennebaud
22a172751e
bug #1378 : fix doc (DiagonalIndex vs Diagonal)
2017-01-21 22:09:59 +01:00
Gael Guennebaud
4d302a080c
Recover compile-time size from seq(A,B) when A and B are fixed values. (c++11 only)
2017-01-19 20:34:18 +01:00
Gael Guennebaud
54f3fbee24
Exploit fixed values in seq and reverse with C++98 compatibility
2017-01-19 19:57:32 +01:00
Gael Guennebaud
7691723e34
Add support for fixed-value in symbolic expression, c++11 only for now.
2017-01-19 19:25:29 +01:00
Benoit Steiner
924600a0e8
Made sure that enabling avx2 instructions enables avx and sse instructions as well.
2017-01-19 09:54:48 -08:00
Gael Guennebaud
e84ed7b6ef
Remove dead code
2017-01-18 23:18:28 +01:00
Gael Guennebaud
f3ccbe0419
Add a Symbolic::FixedExpr helper expression to make sure the compiler fully optimize the usage of last and end.
2017-01-18 23:16:32 +01:00
Gael Guennebaud
15471432fe
Add a .reverse() member to ArithmeticSequence.
2017-01-18 11:35:27 +01:00
Gael Guennebaud
e4f8dd860a
Add missing operator*
2017-01-18 10:49:01 +01:00
Gael Guennebaud
198507141b
Update all block expressions to accept compile-time sizes passed by fix<N> or fix<N>(n)
2017-01-18 09:43:58 +01:00
Gael Guennebaud
5484ddd353
Merge the generic and dynamic overloads of block()
2017-01-17 22:11:46 +01:00
Gael Guennebaud
655ba783f8
Defer set-to-zero in triangular = product so that no aliasing issue occur in the common:
...
A.triangularView() = B*A.sefladjointView()*B.adjoint()
case that used to work in 3.2.
2017-01-17 18:03:35 +01:00
Gael Guennebaud
5e36ec3b6f
Fix regression when passing enums to operator()
2017-01-17 17:10:16 +01:00
Gael Guennebaud
4f36dcfda8
Add a generic block() method compatible with Eigen::fix
2017-01-17 11:34:28 +01:00
Gael Guennebaud
71e5b71356
Add a get_runtime_value helper to deal with pointer-to-function hack,
...
plus some refactoring to make the internals more consistent.
2017-01-17 11:33:57 +01:00
Gael Guennebaud
23bfcfc15f
Add missing overload of get_compile_time for c++98/11
2017-01-17 10:30:21 +01:00
Gael Guennebaud
edff32c2c2
Disambiguate the two versions of fix for doxygen
2017-01-17 10:29:33 +01:00
Gael Guennebaud
4989922be2
Add support for symbolic expressions as arguments of operator()
2017-01-16 22:21:23 +01:00
Gael Guennebaud
12e22a2844
typos in doc
2017-01-16 16:31:19 +01:00
Gael Guennebaud
e70c4c97fa
Typo
2017-01-16 16:20:16 +01:00
Gael Guennebaud
a9232af845
Introduce a variable_or_fixed<N> proxy returned by fix<N>(val) to pass both a compile-time and runtime fallback value in case N means "runtime".
...
This mechanism is used by the seq/seqN functions. The proxy object is immediately converted to pure compile-time (as fix<N>) or pure runtime (i.e., an Index) to avoid redundant template instantiations.
2017-01-16 16:17:01 +01:00
Gael Guennebaud
6e97698161
Introduce a EIGEN_HAS_CXX14 macro
2017-01-16 16:13:37 +01:00
Mehdi Goli
e46e722381
Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree.
2017-01-16 13:58:49 +00:00
Luke Iwanski
23778a15d8
Reverting unintentional change to Eigen/Geometry
2017-01-16 11:05:56 +00:00
Fraser Cormack
8245d3c7ad
Fix case-sensitivity of file include
2017-01-12 12:13:18 +00:00
Gael Guennebaud
752bd92ba5
Large code refactoring:
...
- generalize some utilities and move them to Meta (size(), array_size())
- move handling of all and single indices to IndexedViewHelper.h
- several cleanup changes
2017-01-11 17:24:02 +01:00
Gael Guennebaud
f93d1c58e0
Make get_compile_time compatible with variable_if_dynamic
2017-01-11 17:08:59 +01:00
Gael Guennebaud
c020d307a6
Make variable_if_dynamic<T> implicitely convertible to T
2017-01-11 17:08:05 +01:00
Gael Guennebaud
43c617e2ee
merge
2017-01-11 14:33:37 +01:00
Gael Guennebaud
b1dc0fa813
Move fix and symbolic to their own file, and improve doxygen compatibility
2017-01-11 14:28:28 +01:00
Gael Guennebaud
04397f17e2
Add 1D overloads of operator()
2017-01-11 13:17:09 +01:00
Gael Guennebaud
1b5570988b
Add doc to seq, seqN, ArithmeticSequence, operator(), etc.
2017-01-10 22:58:58 +01:00
Gael Guennebaud
17eac60446
Factorize const and non-const version of the generic operator() method.
2017-01-10 21:45:55 +01:00
Gael Guennebaud
d072fc4b14
add writeable IndexedView
2017-01-10 17:10:35 +01:00
Gael Guennebaud
c9d5e5c6da
Simplify Symbolic API: std::tuple is now used internally and automatically built.
2017-01-10 16:55:07 +01:00
Gael Guennebaud
407e7b7a93
Simplify symbolic API by using "symbol=value" to associate a runtime value to a symbol.
2017-01-10 16:45:32 +01:00
Gael Guennebaud
96e6cf9aa2
Fix linking issue.
2017-01-10 16:35:46 +01:00
Gael Guennebaud
e63678bc89
Fix ambiguous call
2017-01-10 16:33:40 +01:00
Gael Guennebaud
8e247744a4
Fix linking issue
2017-01-10 16:32:06 +01:00
Gael Guennebaud
b47a7e5c3a
Add doc for IndexedView
2017-01-10 16:28:57 +01:00
Gael Guennebaud
87963f441c
Fallback to Block<> when possible (Index, all, seq with > increment).
...
This is important to take advantage of the optimized implementations (evaluator, products, etc.),
and to support sparse matrices.
2017-01-10 14:25:30 +01:00
Gael Guennebaud
a98c7efb16
Add a more generic evaluation mechanism and minimalistic doc.
2017-01-10 11:46:29 +01:00
Gael Guennebaud
13d954f270
Cleanup Eigen's namespace
2017-01-10 11:06:02 +01:00
Gael Guennebaud
9eaab4f9e0
Refactoring: move all symbolic stuff into its own namespace
2017-01-10 10:57:08 +01:00
Gael Guennebaud
acd08900c9
Move 'last' and 'end' to their own namespace
2017-01-10 10:31:07 +01:00
Gael Guennebaud
1df2377d78
Implement c++98 version of seq()
2017-01-10 10:28:45 +01:00
Gael Guennebaud
ecd9cc5412
Isolate legacy code (we keep it for performance comparison purpose)
2017-01-10 09:34:25 +01:00
Gael Guennebaud
b50c3e967e
Add a minimalistic symbolic scalar type with expression template and make use of it to define the last placeholder and to unify the return type of seq and seqN.
2017-01-09 23:42:16 +01:00
Gael Guennebaud
68064e14fa
Rename span/range to seqN/seq
2017-01-09 17:35:21 +01:00
Gael Guennebaud
ad3eef7608
Add link to SO
2017-01-09 13:01:39 +01:00
Gael Guennebaud
75aef5b37f
Fix extraction of compile-time size of std::array with gcc
2017-01-06 22:04:49 +01:00
Gael Guennebaud
233dff1b35
Add support for plain arrays for columns and both rows/columns
2017-01-06 22:01:53 +01:00
Gael Guennebaud
76e183bd52
Propagate compile-time size for plain arrays
2017-01-06 22:01:23 +01:00
Gael Guennebaud
3264d3c761
Add support for plain-array as indices, e.g., mat({1,2,3,4})
2017-01-06 21:53:32 +01:00
Gael Guennebaud
831fffe874
Add missing doc of SparseView
2017-01-06 18:01:29 +01:00
Gael Guennebaud
a875167d99
Propagate compile-time increment and strides.
...
Had to introduce a UndefinedIncr constant for non structured list of indices.
2017-01-06 15:54:55 +01:00
Gael Guennebaud
e383d6159a
MSVC 2015 has all we want about c++11 and MSVC 2017 fails on binder1st/binder2nd
2017-01-06 15:44:13 +01:00
Gael Guennebaud
fad1fa75b3
Propagate compile-time size with "all" and add c++11 array unit test
2017-01-06 13:29:33 +01:00
Gael Guennebaud
3730e3ca9e
Use "fix" for compile-time values, propagate compile-time sizes for span, clean some cleanup.
2017-01-06 13:10:10 +01:00
Gael Guennebaud
ac7e4ac9c0
Initial commit to add a generic indexed-based view of matrices.
...
This version already works as a read-only expression.
Numerous refactoring, renaming, extension, tuning passes are expected...
2017-01-06 00:01:44 +01:00
Jim Radford
0c226644d8
LLT: const the arg to solveInPlace() to allow passing .transpose(), .block(), etc.
2017-01-04 14:42:57 -08:00
Jim Radford
be281e5289
LLT: avoid making a copy when decomposing in place
2017-01-04 14:43:56 -08:00
Gael Guennebaud
e27f17bf5c
Gub 1453: fix Map with non-default inner-stride but no outer-stride.
2017-08-22 13:27:37 +02:00
Gael Guennebaud
21d0a0bcf5
bug #1456 : add perf recommendation for LLT and storage format
2017-08-22 12:46:35 +02:00
Gael Guennebaud
2c3d70d915
Re-enable hidden doc in LLT
2017-08-22 12:04:09 +02:00
Gael Guennebaud
a6e7a41a55
bug #1455 : Cholesky module depends on Jacobi for rank-updates.
2017-08-22 11:37:32 +02:00
Gael Guennebaud
e6021cc8cc
bug #1458 : fix documentation of LLT and LDLT info() method.
2017-08-22 11:32:55 +02:00
Gael Guennebaud
f727844658
use MKL's lapacke.h header when using MKL
2017-08-17 21:58:39 +02:00
Gael Guennebaud
8c858bd891
Clarify doc regarding the usage of MKL_DIRECT_CALL
2017-08-17 12:17:45 +02:00
Gael Guennebaud
b95f92843c
Fix support for MKL's BLAS when using MKL_DIRECT_CALL.
2017-08-17 12:07:10 +02:00
Gael Guennebaud
687bedfcad
Make NoAlias and JacobiRotation compatible with CUDA.
2017-08-17 11:51:22 +02:00
Gael Guennebaud
1f4b24d2df
Do not preallocate more space than the matrix size (when the sparse matrix boils down to a vector
2017-07-20 10:13:48 +02:00
Gael Guennebaud
55d7181557
Fix lazyness of operator* with CUDA
2017-07-20 09:47:28 +02:00
Gael Guennebaud
cda47c42c2
Fix compilation in c++98 mode.
2017-07-17 21:08:20 +02:00
Gael Guennebaud
3182bdbae6
Disable vectorization when compiled by nvcc, even is EIGEN_NO_CUDA is defined
2017-07-17 11:01:28 +02:00
Gael Guennebaud
9f8136ff74
disable nvcc boolean-expr-is-constant warning
2017-07-17 10:43:18 +02:00
Gael Guennebaud
bbd97b4095
Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases
2017-07-17 01:02:51 +02:00
Gael Guennebaud
2299717fd5
Fix and workaround several doxygen issues/warnings
2017-01-04 23:27:33 +01:00
Gael Guennebaud
ee6f7f6c0c
Add doc for sparse triangular solve functions
2017-01-04 23:10:36 +01:00
Gael Guennebaud
a0a36ad0ef
bug #1336 : workaround doxygen failing to include numerous members of MatriBase in Matrix
2017-01-04 22:02:39 +01:00
Gael Guennebaud
29a1a58113
Document selfadjointView
2017-01-04 22:01:50 +01:00
Gael Guennebaud
8702562177
bug #1370 : add doc for StorageIndex
2017-01-03 11:25:41 +01:00
Gael Guennebaud
575c078759
bug #1370 : rename _Index to _StorageIndex in SparseMatrix, and add a warning in the doc regarding the 3.2 to 3.3 change of SparseMatrix::Index
2017-01-03 11:19:14 +01:00
Valentin Roussellet
d3c5525c23
Added += and + operators to inner iterators
...
Fix #1340
#1340
2016-12-28 18:29:30 +01:00
Gael Guennebaud
5c27962453
Move common cwise-unary method from MatrixBase/ArrayBase to the common DenseBase class.
2017-01-02 22:27:07 +01:00
Gael Guennebaud
8d7810a476
bug #1365 : fix another type mismatch warning
...
(sync is set from and compared to an Index)
2016-12-28 23:35:43 +01:00
Gael Guennebaud
97812ff0d3
bug #1369 : fix type mismatch warning.
...
Returned values of omp thread id and numbers are int,
o let's use int instead of Index here.
2016-12-28 23:29:35 +01:00
Gael Guennebaud
7713e20fd2
Fix compilation
2016-12-27 22:04:58 +01:00
Gael Guennebaud
ab69a7f6d1
Cleanup because trait<CwiseBinaryOp>::Flags now expose the correct storage order
2016-12-27 16:55:47 +01:00
Gael Guennebaud
d32a43e33a
Make sure that traits<CwiseBinaryOp>::Flags reports the correct storage order so that methods like .outerSize()/.innerSize() work properly.
2016-12-27 16:35:45 +01:00
Gael Guennebaud
7136267461
Add missing .outer() member to iterators of evaluators of cwise sparse binary expression
2016-12-27 16:34:30 +01:00
Gael Guennebaud
fe0ee72390
Fix check of storage order mismatch for "sparse cwiseop sparse".
2016-12-27 16:33:19 +01:00
Gael Guennebaud
6b8f637ab1
Harmless typo
2016-12-27 16:31:17 +01:00
Benoit Steiner
354baa0fb1
Avoid using horizontal adds since they're not very efficient.
2016-12-21 20:55:07 -08:00
Benoit Steiner
d7825b6707
Use native AVX512 types instead of Eigen Packets whenever possible.
2016-12-21 20:06:18 -08:00
Gael Guennebaud
c6882a72ed
Merged in joaoruileal/eigen (pull request PR-276)
...
Minor improvements to Umfpack support
2016-12-21 21:39:48 +01:00
Joao Rui Leal
c8c89b5e19
renamed methods umfpackReportControl(), umfpackReportInfo(), and umfpackReportStatus() from UmfPackLU to printUmfpackControl(), printUmfpackInfo(), and printUmfpackStatus()
2016-12-21 09:16:28 +00:00
Gael Guennebaud
f2f9df8aa5
Remove MSVC warning 4127 - conditional expression is constant from the disabled list as we now have a local workaround.
2016-12-20 22:53:19 +01:00
Gael Guennebaud
2b3fc981b8
bug #1362 : workaround constant conditional warning produced by MSVC
2016-12-20 22:52:27 +01:00
Gael Guennebaud
94e8d8902f
Fix bug #1367 : compilation fix for gcc 4.1!
2016-12-20 22:17:01 +01:00
Gael Guennebaud
684cfc762d
Add transpose, adjoint, conjugate methods to SelfAdjointView (useful to write generic code)
2016-12-20 16:33:53 +01:00
Gael Guennebaud
11f55b2979
Optimize storage layout of Cwise* and PlainObjectBase evaluator to remove the functor or outer-stride if they are empty.
...
For instance, sizeof("(A-B).cwiseAbs2()") with A,B Vector4f is now 16 bytes, instead of 48 before this optimization.
In theory, evaluators should be completely optimized away by the compiler, but this might help in some cases.
2016-12-20 15:55:40 +01:00
Gael Guennebaud
5271474b15
Remove common "noncopyable" base class from evaluator_base to get a chance to get EBO (Empty Base Optimization)
...
Note: we should probbaly get rid of this class and define a macro instead.
2016-12-20 15:51:30 +01:00
Gael Guennebaud
316673bbde
Clean-up usage of ExpressionTraits in all/any implementation.
2016-12-20 14:38:05 +01:00
Christoph Hertzberg
10c6bcdc2e
Add support for long indexes and for (real-valued) row-major matrices to CholmodSupport module
2016-12-19 14:07:42 +01:00
Gael Guennebaud
f5d644b415
Make sure that HyperPlane::transform manitains a unit normal vector in the Affine case.
2016-12-20 09:35:00 +01:00
Benoit Steiner
923acadfac
Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics
2016-12-19 13:02:27 -08:00
Benoit Jacob
751e097c57
Use 32 registers on ARM64
2016-12-19 13:44:46 -05:00
Benoit Steiner
fb1d0138ec
Include SSE packet instructions when compiling with avx512 enabled.
2016-12-19 07:32:48 -08:00
Joao Rui Leal
95b804c0fe
it is now possible to change Umfpack control settings before factorizations; added access to the report functions of Umfpack
2016-12-19 10:45:59 +00:00
Gael Guennebaud
8c0e701504
bug #1360 : fix sign issue with pmull on altivec
2016-12-18 22:13:19 +00:00
Gael Guennebaud
fc94258e77
Fix unused warning
2016-12-18 22:11:48 +00:00
ermak
d60cca32e5
Transformation methods added to ParametrizedLine class.
2016-12-17 00:45:13 +07:00
Benoit Steiner
9e03dfb452
Made sure EIGEN_HAS_C99_MATH is defined when compiling OpenCL code
2016-12-17 09:23:37 -08:00
Rafael Guglielmetti
8f11df2667
NumTraits.h:
...
For the values 'ReadCost, AddCost and MulCost', information about value Eigen::HugeCost
2016-12-16 09:07:12 +00:00
Benoit Steiner
1324ffef2f
Reenabled the use of constexpr on OpenCL devices
2016-12-15 06:49:38 -08:00
Gael Guennebaud
5d00fdf0e8
bug #1363 : fix mingw's ABI issue
2016-12-15 11:58:31 +01:00
Gael Guennebaud
11b492e993
bug #1358 : fix compilation for sparse += sparse.selfadjointView();
2016-12-14 17:53:47 +01:00
Gael Guennebaud
e67397bfa7
bug #1359 : fix compilation of col_major_sparse.row() *= scalar
...
(used to work in 3.2.9 though the expression is not really writable)
2016-12-14 17:05:26 +01:00
Gael Guennebaud
98d7458275
bug #1359 : fix sparse /=scalar and *=scalar implementation.
...
InnerIterators must be obtained from an evaluator.
2016-12-14 17:03:13 +01:00
Gael Guennebaud
c817ce3ba3
bug #1361 : fix compilation issue in mat=perm.inverse()
2016-12-13 23:10:27 +01:00
Benoit Steiner
6811e6cf49
Merged in srvasude/eigen/fix_cuda_exp (pull request PR-268)
...
Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).
2016-12-08 05:14:11 -08:00
Angelos Mantzaflaris
7694684992
Remove superfluous const's (can cause warnings on some Intel compilers)
...
(grafted from e236d3443c
)
2016-12-07 00:37:48 +01:00
Gael Guennebaud
eb621413c1
Revert vec/y to vec*(1/y) in row-major TRSM:
...
- div is extremely costly
- this is consistent with the column-major case
- this is consistent with all other BLAS implementations
2016-12-06 15:04:50 +01:00
Gael Guennebaud
8365c2c941
Fix BLAS backend for symmetric rank K updates.
2016-12-06 14:47:09 +01:00
Srinivas Vasudevan
e6c8b5500c
Change comparisons to use Scalar instead of RealScalar.
2016-12-05 14:01:45 -08:00
Srinivas Vasudevan
f7d7c33a28
Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).
2016-12-05 12:19:01 -08:00
Srinivas Vasudevan
09ee7f0c80
Fix small nit where I changed name of plog1p to pexpm1.
2016-12-02 15:30:12 -08:00
Srinivas Vasudevan
a0d3ac760f
Sync from Head.
2016-12-02 14:14:45 -08:00
Srinivas Vasudevan
218764ee1f
Added support for expm1 in Eigen.
2016-12-02 14:13:01 -08:00
Gael Guennebaud
66f65ccc36
Ease compiler job to generate clean and efficient code in mat*vec.
2016-12-02 22:41:26 +01:00
Gael Guennebaud
fe696022ec
Operators += and -= do not resize!
2016-12-02 22:40:25 +01:00
Angelos Mantzaflaris
18de92329e
use numext::abs
...
(grafted from 0a08d4c60b
)
2016-12-02 11:48:06 +01:00
Angelos Mantzaflaris
e8a6aa518e
1. Add explicit template to abs2 (resolves deduction for some arithmetic types)
...
2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned)
(grafted from 4086187e49
)
2016-12-02 11:39:18 +01:00
Gael Guennebaud
a6b971e291
Fix memory leak in Ref<Sparse>
2016-12-05 16:59:30 +01:00
Gael Guennebaud
8640ffac65
Optimize SparseLU::solve for rhs vectors
2016-12-05 15:41:14 +01:00
Gael Guennebaud
62acd67903
remove temporary in SparseLU::solve
2016-12-05 15:11:57 +01:00
Gael Guennebaud
0db6d5b3f4
bug #1356 : fix calls to evaluator::coeffRef(0,0) to get the address of the destination
...
by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).
2016-12-05 15:08:09 +01:00
Gael Guennebaud
91003f3b86
typo
2016-12-05 13:51:07 +01:00
Gael Guennebaud
e3f613cbd4
Improve performance of row-major-dense-matrix * vector products for recent CPUs.
...
This revised version does not bother about aligned loads/stores,
and rather processes 8 rows at ones for better instruction pipelining.
2016-12-05 13:02:01 +01:00
Gael Guennebaud
3abc827354
Clean debugging code
2016-12-05 12:59:32 +01:00
Benoit Steiner
462c28e77a
Merged in srvasude/eigen (pull request PR-265)
...
Add Expm1 support to Eigen.
2016-12-05 02:31:11 +00:00
Gael Guennebaud
6a5fe86098
Complete rewrite of column-major-matrix * vector product to deliver higher performance of modern CPU.
...
The previous code has been optimized for Intel core2 for which unaligned loads/stores were prohibitively expensive.
This new version exhibits much higher instruction independence (better pipelining) and explicitly leverage FMA.
According to my benchmark, on Haswell this new kernel is always faster than the previous one, and sometimes even twice as fast.
Even higher performance could be achieved with a better blocking size heuristic and, perhaps, with explicit prefetching.
We should also check triangular product/solve to optimally exploit this new kernel (working on vertical panel of 4 columns is probably not optimal anymore).
2016-12-03 21:14:14 +01:00