Gael Guennebaud
|
a9df28c95b
|
SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases)
|
2015-03-13 21:00:21 +01:00 |
|
Gael Guennebaud
|
5ffe29cb9f
|
Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible.
|
2015-03-13 20:57:33 +01:00 |
|
Benoit Jacob
|
d73ccd717e
|
Add support for dumping blocking sizes tables
|
2015-03-13 10:36:01 -07:00 |
|
Gael Guennebaud
|
2f6f8bf31c
|
Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests.
|
2015-03-13 16:24:40 +01:00 |
|
Benoit Jacob
|
f2c3e2b10f
|
Add --only-cubic-sizes option to analyze-blocking-sizes tool
|
2015-03-12 13:16:33 -07:00 |
|
Doug Kwan
|
657407227e
|
Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of doubles instead of swapping the doubles.
|
2015-03-11 15:13:37 -07:00 |
|
Gael Guennebaud
|
fd78874888
|
Fix compilation of iterative solvers with dense matrices
|
2015-03-09 21:31:03 +01:00 |
|
Gael Guennebaud
|
d4317a85e8
|
Add typedefs for return types of SparseMatrixBase::selfadjointView
|
2015-03-09 21:29:46 +01:00 |
|
Gael Guennebaud
|
9e885fb766
|
Add unit tests for CG and sparse-LLT for long int as storage-index
|
2015-03-09 14:33:15 +01:00 |
|
Gael Guennebaud
|
224a1fe4c6
|
bug #963: make IncompleteLUT compatible with non-default storage index types.
|
2015-03-09 13:55:20 +01:00 |
|
Gael Guennebaud
|
cf9940e17b
|
Make sparse unit-test helpers aware of StorageIndex
|
2015-03-09 13:54:05 +01:00 |
|
Benoit Jacob
|
39228cb224
|
Deserialization assumed benchmarks were in the same order, but we shuffle them.
|
2015-03-06 19:29:01 -05:00 |
|
Benoit Jacob
|
a4f956b1da
|
merge
|
2015-03-06 19:13:36 -05:00 |
|
Benoit Jacob
|
19bf13aa62
|
Automatically serialize partial results to disk, reboot, and resume, when timings are getting bad
|
2015-03-06 19:11:50 -05:00 |
|
Gael Guennebaud
|
0ee391863e
|
Avoid underflow when blocking sizes are tuned manually.
|
2015-03-06 21:51:09 +01:00 |
|
Gael Guennebaud
|
14a5f135a3
|
bug #969: work around ambiguous calls to Ref using enable_if.
|
2015-03-06 17:51:31 +01:00 |
|
Gael Guennebaud
|
d23fcc0672
|
bug #978: add unit test for zero-sized products
|
2015-03-06 16:12:08 +01:00 |
|
Gael Guennebaud
|
87681e508f
|
bug #978: early return for vanishing products
|
2015-03-06 16:11:22 +01:00 |
|
Gael Guennebaud
|
4c8eeeaed6
|
update gemm changeset list
|
2015-03-06 15:08:20 +01:00 |
|
Gael Guennebaud
|
cd3bbffa73
|
Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (this allows keeping the packed rhs in L1)
|
2015-03-06 14:31:39 +01:00 |
|
Gael Guennebaud
|
eedd5063fd
|
Update gemm performance monitoring tool:
- allow recomputing a subset of changesets
- update changeset list
- add a few more cases
|
2015-03-06 11:47:13 +01:00 |
|
Gael Guennebaud
|
58740ce4c6
|
Improve product kernel: replace the previous dynamic loop-swapping strategy with a more general one:
It consists of increasing the actual number of rows of the lhs's micro horizontal panel for small depths so that the L1 cache is fully exploited.
|
2015-03-06 10:30:35 +01:00 |
|
Benoit Jacob
|
4ab01f7c21
|
slightly increase tolerance to clock speed variation
|
2015-03-05 14:41:16 -05:00 |
|
Benoit Jacob
|
5db2baa573
|
Make benchmark-blocking-sizes detect changes to clock speed and be resilient to that.
|
2015-03-05 13:44:20 -05:00 |
|
Gael Guennebaud
|
4c8b95d5c5
|
Rename LSCG to LeastSquaresConjugateGradient
|
2015-03-05 10:16:32 +01:00 |
|
Gael Guennebaud
|
7550107028
|
Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update-like products, i.e., for products of the kind: "large x small" * "small x large"
|
2015-03-05 10:03:46 +01:00 |
|
Gael Guennebaud
|
2dc968e453
|
bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos.
|
2015-03-04 17:03:13 +01:00 |
|
Benoit Jacob
|
2231b3dece
|
output to cout, not cerr, the actual results
|
2015-03-04 09:45:12 -05:00 |
|
Benoit Jacob
|
00ea121881
|
Complete the tool to analyze the efficiency of default sizes.
|
2015-03-04 09:30:56 -05:00 |
|
Benoit Steiner
|
0196141938
|
Fixed the optimized AVX implementation of the fast rsqrt function
|
2015-03-02 13:49:39 -08:00 |
|
Benoit Steiner
|
b0f2b6f297
|
Updated the tensor type casting code as follows: in the case where TgtRatio < SrcRatio, disable the vectorization of the source expression unless it has direct access.
|
2015-03-02 10:11:40 -08:00 |
|
Benoit Steiner
|
d9cb604a5d
|
Disabled the use of aligned memory loads when converting a tensor from float to double since alignment can't always be guaranteed.
|
2015-03-02 09:41:36 -08:00 |
|
Benoit Steiner
|
4fd7f47692
|
Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.
|
2015-03-02 09:38:47 -08:00 |
|
Benoit Steiner
|
ae73859a0a
|
Fixed incorrect assertion
|
2015-02-28 08:02:02 -08:00 |
|
Benoit Steiner
|
131449298f
|
Fixed clang compilation warning
|
2015-02-28 03:01:19 -08:00 |
|
Benoit Steiner
|
56ea45ff0f
|
Silenced some compilation warnings
|
2015-02-28 02:37:41 -08:00 |
|
Benoit Steiner
|
bb483313f6
|
Fixed another batch of compilation warnings
|
2015-02-28 02:32:46 -08:00 |
|
Benoit Steiner
|
fb53384b0f
|
Improved the default implementation of prsqrt
|
2015-02-28 01:51:26 -08:00 |
|
Benoit Steiner
|
61409d9449
|
Silenced one more compilation warning
|
2015-02-28 01:49:09 -08:00 |
|
Benoit Steiner
|
1a7b84dc75
|
Silenced a few compilation warnings
|
2015-02-28 01:45:15 -08:00 |
|
Benoit Steiner
|
37357a310f
|
Fixed compilation warnings
|
2015-02-27 23:54:24 -08:00 |
|
Benoit Steiner
|
cf1eea11de
|
Fixed compilation warnings
|
2015-02-27 23:52:02 -08:00 |
|
Benoit Steiner
|
78732186ee
|
Fixed compilation warnings
|
2015-02-27 23:51:16 -08:00 |
|
Benoit Steiner
|
4250a0cab0
|
Fixed compilation warnings
|
2015-02-27 21:59:10 -08:00 |
|
Benoit Steiner
|
a4e37b0617
|
Reverted the README
|
2015-02-27 13:09:49 -08:00 |
|
Benoit Steiner
|
306fceccbe
|
Pulled latest updates from trunk
|
2015-02-27 13:05:26 -08:00 |
|
Benoit Steiner
|
75e7f381c8
|
Pulled latest updates from trunk
|
2015-02-27 12:57:55 -08:00 |
|
Benoit Steiner
|
2386fc8528
|
Added support for 32-bit indices on a per-tensor/tensor-expression basis. This enables us to use 32-bit indices to evaluate expressions on GPU faster while keeping the ability to use 64-bit indices to manipulate large tensors on CPU in the same binary.
|
2015-02-27 12:57:13 -08:00 |
|
Benoit Steiner
|
e1f6a45b14
|
README.md edited online with Bitbucket
|
2015-02-27 20:44:24 +00:00 |
|
Benoit Steiner
|
90893bbe18
|
README.md edited online with Bitbucket
|
2015-02-27 20:44:10 +00:00 |
|