Benoit Jacob
|
e56aabf205
|
Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables
|
2015-03-15 18:05:12 -04:00 |
|
Benoit Jacob
|
b6b88c0808
|
update perf_monitoring/gemm/changesets.txt
|
2015-03-13 14:57:05 -07:00 |
|
Benoit Jacob
|
488c15615a
|
Organize our default cache sizes a little, and use a saner default L1 outside of x86 (10% faster on Nexus 5)
|
2015-03-13 14:51:26 -07:00 |
|
Gael Guennebaud
|
9f58524cbd
|
merge
|
2015-03-13 21:16:39 +01:00 |
|
Gael Guennebaud
|
1330f8bbd1
|
bug #973, improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled.
|
2015-03-13 21:15:50 +01:00 |
|
Gael Guennebaud
|
d99ab35f9e
|
Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get unbiased behavior.
|
2015-03-13 21:12:46 +01:00 |
|
Gael Guennebaud
|
8580eb6808
|
bug #949: add static assertion for incompatible scalar types in dense end-user decompositions.
|
2015-03-13 21:06:20 +01:00 |
|
Gael Guennebaud
|
a9df28c95b
|
SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases)
|
2015-03-13 21:00:21 +01:00 |
|
Gael Guennebaud
|
5ffe29cb9f
|
Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible.
|
2015-03-13 20:57:33 +01:00 |
|
Benoit Jacob
|
d73ccd717e
|
Add support for dumping blocking sizes tables
|
2015-03-13 10:36:01 -07:00 |
|
Gael Guennebaud
|
2f6f8bf31c
|
Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests.
|
2015-03-13 16:24:40 +01:00 |
|
Benoit Jacob
|
f2c3e2b10f
|
Add --only-cubic-sizes option to analyze-blocking-sizes tool
|
2015-03-12 13:16:33 -07:00 |
|
Doug Kwan
|
657407227e
|
Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of doubles instead of swapping the doubles.
|
2015-03-11 15:13:37 -07:00 |
|
Gael Guennebaud
|
fd78874888
|
Fix compilation of iterative solvers with dense matrices
|
2015-03-09 21:31:03 +01:00 |
|
Gael Guennebaud
|
d4317a85e8
|
Add typedefs for return types of SparseMatrixBase::selfadjointView
|
2015-03-09 21:29:46 +01:00 |
|
Gael Guennebaud
|
9e885fb766
|
Add unit tests for CG and sparse-LLT for long int as storage-index
|
2015-03-09 14:33:15 +01:00 |
|
Gael Guennebaud
|
224a1fe4c6
|
bug #963: make IncompleteLUT compatible with non-default storage index types.
|
2015-03-09 13:55:20 +01:00 |
|
Gael Guennebaud
|
cf9940e17b
|
Make sparse unit-test helpers aware of StorageIndex
|
2015-03-09 13:54:05 +01:00 |
|
Benoit Jacob
|
39228cb224
|
deserialization assumed benchmarks in same order, but we shuffle them.
|
2015-03-06 19:29:01 -05:00 |
|
Benoit Jacob
|
a4f956b1da
|
merge
|
2015-03-06 19:13:36 -05:00 |
|
Benoit Jacob
|
19bf13aa62
|
Automatically serialize partial results to disk, reboot, and resume, when timings are getting bad
|
2015-03-06 19:11:50 -05:00 |
|
Gael Guennebaud
|
0ee391863e
|
Avoid underflow when blocking sizes are tuned manually.
|
2015-03-06 21:51:09 +01:00 |
|
Gael Guennebaud
|
14a5f135a3
|
bug #969: work around ambiguous calls to Ref using enable_if.
|
2015-03-06 17:51:31 +01:00 |
|
Gael Guennebaud
|
d23fcc0672
|
bug #978: add unit test for zero-sized products
|
2015-03-06 16:12:08 +01:00 |
|
Gael Guennebaud
|
87681e508f
|
bug #978: early return for vanishing products
|
2015-03-06 16:11:22 +01:00 |
|
Gael Guennebaud
|
4c8eeeaed6
|
update gemm changeset list
|
2015-03-06 15:08:20 +01:00 |
|
Gael Guennebaud
|
cd3bbffa73
|
Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1)
|
2015-03-06 14:31:39 +01:00 |
|
Gael Guennebaud
|
eedd5063fd
|
Update gemm performance monitoring tool:
- allow recomputing a subset of changesets
- update changeset list
- add a few more cases
|
2015-03-06 11:47:13 +01:00 |
|
Gael Guennebaud
|
58740ce4c6
|
Improve product kernel: replace the previous dynamic loop swapping strategy with a more general one:
It consists of increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited.
|
2015-03-06 10:30:35 +01:00 |
|
Benoit Jacob
|
4ab01f7c21
|
slightly increase tolerance to clock speed variation
|
2015-03-05 14:41:16 -05:00 |
|
Benoit Jacob
|
5db2baa573
|
Make benchmark-blocking-sizes detect changes to clock speed and be resilient to that.
|
2015-03-05 13:44:20 -05:00 |
|
Gael Guennebaud
|
4c8b95d5c5
|
Rename LSCG to LeastSquaresConjugateGradient
|
2015-03-05 10:16:32 +01:00 |
|
Gael Guennebaud
|
7550107028
|
Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large"
|
2015-03-05 10:03:46 +01:00 |
|
Gael Guennebaud
|
2dc968e453
|
bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos.
|
2015-03-04 17:03:13 +01:00 |
|
Benoit Jacob
|
2231b3dece
|
output to cout, not cerr, the actual results
|
2015-03-04 09:45:12 -05:00 |
|
Benoit Jacob
|
00ea121881
|
Complete the tool to analyze the efficiency of default sizes.
|
2015-03-04 09:30:56 -05:00 |
|
Benoit Steiner
|
0196141938
|
Fixed the optimized AVX implementation of the fast rsqrt function
|
2015-03-02 13:49:39 -08:00 |
|
Benoit Steiner
|
b0f2b6f297
|
Updated the tensor type casting code as follows: in the case where TgtRatio < SrcRatio, disable the vectorization of the source expression unless it has direct-access.
|
2015-03-02 10:11:40 -08:00 |
|
Benoit Steiner
|
d9cb604a5d
|
Disabled the use of aligned memory loads when converting a tensor from float to double since alignment can't always be guaranteed.
|
2015-03-02 09:41:36 -08:00 |
|
Benoit Steiner
|
4fd7f47692
|
Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.
|
2015-03-02 09:38:47 -08:00 |
|
Benoit Steiner
|
ae73859a0a
|
Fixed incorrect assertion
|
2015-02-28 08:02:02 -08:00 |
|
Benoit Steiner
|
131449298f
|
Fixed clang compilation warning
|
2015-02-28 03:01:19 -08:00 |
|
Benoit Steiner
|
56ea45ff0f
|
Silenced some compilation warnings
|
2015-02-28 02:37:41 -08:00 |
|
Benoit Steiner
|
bb483313f6
|
Fixed another batch of compilation warnings
|
2015-02-28 02:32:46 -08:00 |
|
Benoit Steiner
|
fb53384b0f
|
Improved the default implementation of prsqrt
|
2015-02-28 01:51:26 -08:00 |
|
Benoit Steiner
|
61409d9449
|
Silenced one more compilation warning
|
2015-02-28 01:49:09 -08:00 |
|
Benoit Steiner
|
1a7b84dc75
|
Silenced a few compilation warnings
|
2015-02-28 01:45:15 -08:00 |
|
Benoit Steiner
|
37357a310f
|
Fixed compilation warnings
|
2015-02-27 23:54:24 -08:00 |
|
Benoit Steiner
|
cf1eea11de
|
Fixed compilation warnings
|
2015-02-27 23:52:02 -08:00 |
|
Benoit Steiner
|
78732186ee
|
Fixed compilation warnings
|
2015-02-27 23:51:16 -08:00 |
|