Benoit Jacob
|
ca5c12587b
|
Polish lookup tables generation
|
2015-03-15 18:05:53 -04:00 |
|
Benoit Jacob
|
e56aabf205
|
Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables
|
2015-03-15 18:05:12 -04:00 |
|
Benoit Jacob
|
b6b88c0808
|
update perf_monitoring/gemm/changesets.txt
|
2015-03-13 14:57:05 -07:00 |
|
Benoit Jacob
|
488c15615a
|
tidy up our default cache sizes a little, and use a saner default L1 outside of x86 (10% faster on Nexus 5)
|
2015-03-13 14:51:26 -07:00 |
|
Gael Guennebaud
|
9f58524cbd
|
merge
|
2015-03-13 21:16:39 +01:00 |
|
Gael Guennebaud
|
1330f8bbd1
|
bug #973, improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled.
|
2015-03-13 21:15:50 +01:00 |
|
Gael Guennebaud
|
d99ab35f9e
|
Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get an unbiased behavior.
|
2015-03-13 21:12:46 +01:00 |
|
Gael Guennebaud
|
8580eb6808
|
bug #949: add static assertion for incompatible scalar types in dense end-user decompositions.
|
2015-03-13 21:06:20 +01:00 |
|
Gael Guennebaud
|
a9df28c95b
|
SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases)
|
2015-03-13 21:00:21 +01:00 |
|
Gael Guennebaud
|
5ffe29cb9f
|
Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible.
|
2015-03-13 20:57:33 +01:00 |
|
Benoit Jacob
|
d73ccd717e
|
Add support for dumping blocking sizes tables
|
2015-03-13 10:36:01 -07:00 |
|
Gael Guennebaud
|
2f6f8bf31c
|
Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests.
|
2015-03-13 16:24:40 +01:00 |
|
Benoit Jacob
|
f2c3e2b10f
|
Add --only-cubic-sizes option to analyze-blocking-sizes tool
|
2015-03-12 13:16:33 -07:00 |
|
Gael Guennebaud
|
fd78874888
|
Fix compilation of iterative solvers with dense matrices
|
2015-03-09 21:31:03 +01:00 |
|
Gael Guennebaud
|
d4317a85e8
|
Add typedefs for return types of SparseMatrixBase::selfadjointView
|
2015-03-09 21:29:46 +01:00 |
|
Gael Guennebaud
|
9e885fb766
|
Add unit tests for CG and sparse-LLT for long int as storage-index
|
2015-03-09 14:33:15 +01:00 |
|
Gael Guennebaud
|
224a1fe4c6
|
bug #963: make IncompleteLUT compatible with non-default storage index types.
|
2015-03-09 13:55:20 +01:00 |
|
Gael Guennebaud
|
cf9940e17b
|
Make sparse unit-test helpers aware of StorageIndex
|
2015-03-09 13:54:05 +01:00 |
|
Benoit Jacob
|
39228cb224
|
deserialization assumed benchmarks in same order, but we shuffle them.
|
2015-03-06 19:29:01 -05:00 |
|
Benoit Jacob
|
a4f956b1da
|
merge
|
2015-03-06 19:13:36 -05:00 |
|
Benoit Jacob
|
19bf13aa62
|
Automatically serialize partial results to disk, reboot, and resume, when timings are getting bad
|
2015-03-06 19:11:50 -05:00 |
|
Gael Guennebaud
|
0ee391863e
|
Avoid underflow when blocking sizes are tuned manually.
|
2015-03-06 21:51:09 +01:00 |
|
Gael Guennebaud
|
14a5f135a3
|
bug #969: work around ambiguous calls to Ref using enable_if.
|
2015-03-06 17:51:31 +01:00 |
|
Gael Guennebaud
|
d23fcc0672
|
bug #978: add unit test for zero-sized products
|
2015-03-06 16:12:08 +01:00 |
|
Gael Guennebaud
|
87681e508f
|
bug #978: early return for vanishing products
|
2015-03-06 16:11:22 +01:00 |
|
Gael Guennebaud
|
4c8eeeaed6
|
update gemm changeset list
|
2015-03-06 15:08:20 +01:00 |
|
Gael Guennebaud
|
cd3bbffa73
|
Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (this allows keeping the packed rhs in L1)
|
2015-03-06 14:31:39 +01:00 |
|
Gael Guennebaud
|
eedd5063fd
|
Update gemm performance monitoring tool:
- allow recomputing a subset of changesets
- update changeset list
- add a few more cases
|
2015-03-06 11:47:13 +01:00 |
|
Gael Guennebaud
|
58740ce4c6
|
Improve product kernel: replace the previous dynamic loop-swapping strategy with a more general one:
It consists of increasing the actual number of rows of the lhs's micro horizontal panel for small depth such that the L1 cache is fully exploited.
|
2015-03-06 10:30:35 +01:00 |
|
Benoit Jacob
|
4ab01f7c21
|
slightly increase tolerance to clock speed variation
|
2015-03-05 14:41:16 -05:00 |
|
Benoit Jacob
|
5db2baa573
|
Make benchmark-blocking-sizes detect changes to clock speed and be resilient to that.
|
2015-03-05 13:44:20 -05:00 |
|
Gael Guennebaud
|
4c8b95d5c5
|
Rename LSCG to LeastSquaresConjugateGradient
|
2015-03-05 10:16:32 +01:00 |
|
Gael Guennebaud
|
7550107028
|
Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large"
|
2015-03-05 10:03:46 +01:00 |
|
Gael Guennebaud
|
2dc968e453
|
bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos.
|
2015-03-04 17:03:13 +01:00 |
|
Benoit Jacob
|
2231b3dece
|
output to cout, not cerr, the actual results
|
2015-03-04 09:45:12 -05:00 |
|
Benoit Jacob
|
00ea121881
|
Complete the tool to analyze the efficiency of default sizes.
|
2015-03-04 09:30:56 -05:00 |
|
Jan Blechta
|
168ceb271e
|
Really use zero guess in ConjugateGradients::solve as documented
and expected for consistency with other methods.
|
2015-02-18 14:26:10 +01:00 |
|
Gael Guennebaud
|
8fdcaded5e
|
merge
|
2015-03-04 10:18:08 +01:00 |
|
Gael Guennebaud
|
c43154bbc5
|
Check for no-reallocation in SparseMatrix::insert (bug #974)
|
2015-03-04 10:16:46 +01:00 |
|
Gael Guennebaud
|
1ce0178363
|
Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974)
|
2015-03-04 09:39:26 +01:00 |
|
Gael Guennebaud
|
3dca4a1efc
|
Update manual wrt new LSCG solver.
|
2015-03-04 09:35:30 +01:00 |
|
Gael Guennebaud
|
05274219a7
|
Add a CG-based solver for rectangular least-squares problems (bug #975).
|
2015-03-04 09:34:27 +01:00 |
|
Benoit Jacob
|
2aa09e6b4e
|
Fix asm comments in 1px1 kernel
|
2015-03-03 13:44:00 -05:00 |
|
Benoit Steiner
|
5d2fd64a1a
|
Fixed compilation error when compiling with gcc4.7
|
2015-03-03 08:56:49 -08:00 |
|
Benoit Jacob
|
f64b4480af
|
Add missing copyright notices
|
2015-03-03 11:43:56 -05:00 |
|
Benoit Jacob
|
eae8e27b7d
|
Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp
|
2015-03-03 11:41:21 -05:00 |
|
Marc Glisse
|
37a93c4263
|
New scoring functor to select the pivot.
This can be useful for non-floating point scalars, where choosing the biggest element is generally not the best choice.
|
2015-03-03 17:08:28 +01:00 |
|
Benoit Jacob
|
ccc1277a42
|
must also disable complex<double> when disabling double vectorization
|
2015-03-03 10:17:05 -05:00 |
|
Benoit Jacob
|
f839099512
|
Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics.
|
2015-03-03 09:35:22 -05:00 |
|
Benoit Jacob
|
9930e9583b
|
Improve analyze-blocking-sizes, and in particular give it an evaluate-defaults tool
that shows the efficiency of Eigen's default blocking sizes choices, using a
previously computed table from benchmark-blocking-sizes.
|
2015-03-02 18:08:38 -05:00 |
|