Gael Guennebaud
d4317a85e8
Add typedefs for return types of SparseMatrixBase::selfadjointView
2015-03-09 21:29:46 +01:00
Gael Guennebaud
9e885fb766
Add unit tests for CG and sparse-LLT for long int as storage-index
2015-03-09 14:33:15 +01:00
Gael Guennebaud
224a1fe4c6
bug #963 : make IncompleteLUT compatible with non-default storage index types.
2015-03-09 13:55:20 +01:00
Gael Guennebaud
cf9940e17b
Make sparse unit-test helpers aware of StorageIndex
2015-03-09 13:54:05 +01:00
Benoit Jacob
39228cb224
deserialization assumed benchmarks in same order, but we shuffle them.
2015-03-06 19:29:01 -05:00
Benoit Jacob
a4f956b1da
merge
2015-03-06 19:13:36 -05:00
Benoit Jacob
19bf13aa62
Automatically serialize partial results to disk, reboot, and resume, when timings are getting bad
2015-03-06 19:11:50 -05:00
Gael Guennebaud
0ee391863e
Avoid underflow when blocking sizes are tuned manually.
2015-03-06 21:51:09 +01:00
Gael Guennebaud
14a5f135a3
bug #969 : work around ambiguous calls to Ref using enable_if.
2015-03-06 17:51:31 +01:00
Gael Guennebaud
d23fcc0672
bug #978 : add unit test for zero-sized products
2015-03-06 16:12:08 +01:00
Gael Guennebaud
87681e508f
bug #978 : early return for vanishing products
2015-03-06 16:11:22 +01:00
Gael Guennebaud
4c8eeeaed6
update gemm changeset list
2015-03-06 15:08:20 +01:00
Gael Guennebaud
cd3bbffa73
Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1)
2015-03-06 14:31:39 +01:00
Gael Guennebaud
eedd5063fd
Update gemm performance monitoring tool:
...
- allow recomputing a subset of changesets
- update changeset list
- add a few more cases
2015-03-06 11:47:13 +01:00
Gael Guennebaud
58740ce4c6
Improve product kernel: replace the previous dynamic loop swapping strategy by a more general one:
...
It consists of increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited.
2015-03-06 10:30:35 +01:00
Benoit Jacob
4ab01f7c21
slightly increase tolerance to clock speed variation
2015-03-05 14:41:16 -05:00
Benoit Jacob
5db2baa573
Make benchmark-blocking-sizes detect changes to clock speed and be resilient to that.
2015-03-05 13:44:20 -05:00
Gael Guennebaud
4c8b95d5c5
Rename LSCG to LeastSquaresConjugateGradient
2015-03-05 10:16:32 +01:00
Gael Guennebaud
7550107028
Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large"
2015-03-05 10:03:46 +01:00
Gael Guennebaud
2dc968e453
bug #824 : improve accuracy of Quaternion::angularDistance using atan2 instead of acos.
2015-03-04 17:03:13 +01:00
Benoit Jacob
2231b3dece
output to cout, not cerr, the actual results
2015-03-04 09:45:12 -05:00
Benoit Jacob
00ea121881
Complete the tool to analyze the efficiency of default sizes.
2015-03-04 09:30:56 -05:00
Benoit Steiner
0196141938
Fixed the optimized AVX implementation of the fast rsqrt function
2015-03-02 13:49:39 -08:00
Benoit Steiner
b0f2b6f297
Updated the tensor type casting code as follows: in the case where TgtRatio < SrcRatio, disable the vectorization of the source expression unless it has direct access.
2015-03-02 10:11:40 -08:00
Benoit Steiner
d9cb604a5d
Disabled the use of aligned memory loads when converting a tensor from float to double since alignment can't always be guaranteed.
2015-03-02 09:41:36 -08:00
Benoit Steiner
4fd7f47692
Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.
2015-03-02 09:38:47 -08:00
Benoit Steiner
ae73859a0a
Fixed incorrect assertion
2015-02-28 08:02:02 -08:00
Benoit Steiner
131449298f
Fixed clang compilation warning
2015-02-28 03:01:19 -08:00
Benoit Steiner
56ea45ff0f
Silenced some compilation warnings
2015-02-28 02:37:41 -08:00
Benoit Steiner
bb483313f6
Fixed another batch of compilation warnings
2015-02-28 02:32:46 -08:00
Benoit Steiner
fb53384b0f
Improved the default implementation of prsqrt
2015-02-28 01:51:26 -08:00
Benoit Steiner
61409d9449
Silenced one more compilation warning
2015-02-28 01:49:09 -08:00
Benoit Steiner
1a7b84dc75
Silenced a few compilation warnings
2015-02-28 01:45:15 -08:00
Benoit Steiner
37357a310f
Fixed compilation warnings
2015-02-27 23:54:24 -08:00
Benoit Steiner
cf1eea11de
Fixed compilation warnings
2015-02-27 23:52:02 -08:00
Benoit Steiner
78732186ee
Fixed compilation warnings
2015-02-27 23:51:16 -08:00
Benoit Steiner
4250a0cab0
Fixed compilation warnings
2015-02-27 21:59:10 -08:00
Benoit Steiner
a4e37b0617
Reverted the README
2015-02-27 13:09:49 -08:00
Benoit Steiner
306fceccbe
Pulled latest updates from trunk
2015-02-27 13:05:26 -08:00
Benoit Steiner
75e7f381c8
Pulled latest updates from trunk
2015-02-27 12:57:55 -08:00
Benoit Steiner
2386fc8528
Added support for 32-bit indices on a per-tensor/tensor-expression basis. This enables us to use 32-bit indices to evaluate expressions on GPU faster while keeping the ability to use 64-bit indices to manipulate large tensors on CPU in the same binary.
2015-02-27 12:57:13 -08:00
Benoit Steiner
e1f6a45b14
README.md edited online with Bitbucket
2015-02-27 20:44:24 +00:00
Benoit Steiner
90893bbe18
README.md edited online with Bitbucket
2015-02-27 20:44:10 +00:00
Benoit Steiner
473e6d4c3d
README.md edited online with Bitbucket
2015-02-27 20:41:45 +00:00
Benoit Steiner
4369538227
README.md edited online with Bitbucket
2015-02-27 20:41:33 +00:00
Benoit Steiner
99cfbd6e84
README.md edited online with Bitbucket
2015-02-27 20:41:14 +00:00
Benoit Jacob
6466fa63be
Reimplement the selection between rotating and non-rotating kernels
...
using templates instead of macros and if()'s.
That was needed to fix the build of unit tests on ARM, which I had
broken. My bad for not testing earlier.
2015-02-27 15:30:10 -05:00
Benoit Steiner
05089aba75
Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts.
2015-02-27 09:27:30 -08:00
Benoit Steiner
bf9877a92a
Pulled latest updates from trunk
2015-02-27 09:23:22 -08:00
Benoit Steiner
90f4e90f1d
Fixed off-by-one error that prevented the evaluation of small tensor expressions from being vectorized
2015-02-27 09:22:37 -08:00