Commit Graph

8613 Commits

Author SHA1 Message Date
Gael Guennebaud
66917299a9 Add debug output 2016-07-06 22:27:15 +02:00
Gael Guennebaud
5ca2457fa5 Fix unit test. 2016-07-06 22:25:24 +02:00
Gael Guennebaud
9b68ed4537 Relax is_equal to is_approx because scaling might modify last bit. 2016-07-06 15:02:49 +02:00
Gael Guennebaud
c3b23d7dbf Fix support of Intel's VML 2016-07-06 14:07:32 +02:00
Gael Guennebaud
8ec4d6480d Fix compilation with recent updates of icc 2016 2016-07-06 14:07:14 +02:00
Gael Guennebaud
5b3a6f51d3 Improve numerical robustness of RealSchur: add scaling and compare sub-diag entries to largest diagonal entry instead of the 2 neighbors. 2016-07-06 13:45:30 +02:00
Gael Guennebaud
d2b5a19e0f Fix warning. 2016-07-06 11:05:30 +02:00
Gael Guennebaud
367ef66af3 Re-enable some specializations for Assignment<.,Product<>> 2016-07-05 22:58:14 +02:00
Gael Guennebaud
155d8d8603 Fix compilation with msvc 2016-07-05 14:43:42 +02:00
Gael Guennebaud
43696ede8f Revert unwanted changes. 2016-07-04 22:40:36 +02:00
Gael Guennebaud
b39fd8217f Fix nesting of SolveWithGuess, and add unit test. 2016-07-04 17:47:47 +02:00
Gael Guennebaud
ec02af1047 Fix template resolution. 2016-07-04 17:37:33 +02:00
Gael Guennebaud
fbcfc2f862 Add unit test for solveWithGuess, and fix template resolution. 2016-07-04 17:19:38 +02:00
Gael Guennebaud
7f7839c12f Add documentation and examples for inplace decomposition. 2016-07-04 17:18:26 +02:00
Gael Guennebaud
32a41ee659 bug #707: add inplace decomposition through Ref<> for Cholesky, LU and QR decompositions. 2016-07-04 15:13:35 +02:00
Gael Guennebaud
75e80792cc Update relevant list of changesets. 2016-07-04 14:32:34 +02:00
Gael Guennebaud
dacc544b84 asm escape was not strong enough to prevent too aggressive compiler optimization; let's fall back to no-inline. 2016-07-04 14:32:15 +02:00
Gael Guennebaud
b74e45906c Few fixes in perf-monitoring. 2016-07-04 14:30:50 +02:00
Gael Guennebaud
ce9fc0ce14 fix clang compilation 2016-07-04 12:59:02 +02:00
Gael Guennebaud
440020474c Workaround compilation issue with msvc 2016-07-04 12:49:19 +02:00
Gael Guennebaud
e61cee7a50 Fix compilation of some unit tests with msvc 2016-07-04 11:49:03 +02:00
Gael Guennebaud
91b3039013 Change the semantic of the last template parameter of Assignment from "Scalar" to "SFINAE" only.
The previous "Scalar" semantic was obsolete since we allow for different scalar types in the source and destination expressions.
One can still specialize on scalar types through SFINAE and/or assignment functor.
2016-07-04 11:02:00 +02:00
Gael Guennebaud
0fa9e4a15c Fix performance regression in dgemm introduced by changeset 5d51a7f12c 2016-07-02 17:35:08 +02:00
Gael Guennebaud
672076db5d Fix performance regression introduced in changeset e56aabf205.
Register blocking sizes are better handled by the cache size heuristics.
The current code introduced very small blocks, for instance for 9x9 matrix,
thus killing performance.
2016-07-02 15:40:56 +02:00
Gael Guennebaud
d161b8f03a Merged in carpent/eigen (pull request PR-204)
Use complete nested namespace Eigen::internal, thus making the custom static assertion macros available outside Eigen's namespace.
2016-07-01 09:56:44 +02:00
Benoit Steiner
cb2d8b8fa6 Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu. 2016-06-29 15:42:01 -07:00
Benoit Steiner
b2a47641ce Made the code compile when using CUDA architecture < 300 2016-06-29 15:32:47 -07:00
Benoit Steiner
b047ca765f Merged in ibab/eigen/fix-tensor-scan-gpu (pull request PR-205)
Add missing CUDA kernel to tensor scan op
2016-06-29 14:52:19 -07:00
Igor Babuschkin
85699850d9 Add missing CUDA kernel to tensor scan op
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
2016-06-29 11:54:35 +01:00
Justin Carpentier
6126886a67 Use complete nested namespace Eigen::internal 2016-06-28 20:09:25 +02:00
Benoit Jacob
328c5d876a Undo changes in AltiVec --- I don't have any way to test there. 2016-06-28 11:15:25 -04:00
Benoit Jacob
38fb606052 Avoid global variables with static constructors in NEON/Complex.h 2016-06-28 11:12:49 -04:00
Benoit Steiner
1a9f92e781 Added a test to validate the tensor scan evaluation on GPU. The test is currently disabled since the code segfaults. 2016-06-27 16:02:52 -07:00
Benoit Steiner
75c333f94c Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
Also avoid taking references to values that may become stale after a copy construction.
2016-06-27 10:32:38 -07:00
xantares
c52c8d76da Disable pkgconfig only for native windows builds
i.e., enable it for MinGW
2016-06-27 16:43:08 +00:00
Gael Guennebaud
d937a420a2 Fix compilation with MSVC by using our portable numext::log1p implementation. 2016-08-22 15:44:21 +02:00
Gael Guennebaud
2d5731e40a bug #1270: bypass custom asm for pmadd and recent clang version 2016-08-22 15:38:03 +02:00
Gael Guennebaud
49b005181a Define EIGEN_COMP_CLANG to clang version as major*100+minor (e.g., 307 corresponds to clang 3.7) 2016-08-22 15:37:05 +02:00
Gael Guennebaud
130f891bb0 bug #1278: ease parsing 2016-08-22 15:00:29 +02:00
Benoit Steiner
7944d4431f Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC cuda compiler compile this code. 2016-08-18 13:46:36 -07:00
Benoit Steiner
647a51b426 Force the inlining of a simple accessor. 2016-08-18 12:31:02 -07:00
Benoit Steiner
a452dedb4f Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
2016-08-18 12:29:54 -07:00
Igor Babuschkin
18c67df31c Fix remaining CUDA >= 300 checks 2016-08-18 17:18:30 +01:00
Igor Babuschkin
1569a7d7ab Add the necessary CUDA >= 300 checks back 2016-08-18 17:15:12 +01:00
Benoit Steiner
2b17f34574 Properly detect the type of the result of a contraction. 2016-08-16 16:00:30 -07:00
Igor Babuschkin
59bacfe520 Fix compilation on CUDA 8 by removing call to h2log1p 2016-08-15 23:38:05 +01:00
Benoit Steiner
34ae80179a Use array_prod instead of calling TotalSize since TotalSize is only available on DSize. 2016-08-15 10:29:14 -07:00
Benoit Steiner
2556565b4b Merged in ibab/eigen/extend-log1p (pull request PR-218)
Fix compilation on CUDA 8 due to missing h2log1p function
2016-08-15 08:31:03 -07:00
Benoit Steiner
30dd6f5e34 Close branch extend-log1p 2016-08-15 08:31:03 -07:00
Benoit Steiner
fe73648c98 Fixed a bug in the documentation. 2016-08-12 10:00:43 -07:00