Benoit Steiner
120e13b1b6
Added a readme to explain how to compile the tensor benchmarks.
2016-01-28 17:06:00 -08:00
Benoit Steiner
a68864b6bc
Updated the benchmarking code to print the number of flops processed instead of the number of bytes.
2016-01-28 16:51:40 -08:00
Benoit Steiner
8217281ae4
Merge latest updates from trunk
2016-01-28 16:20:53 -08:00
Benoit Steiner
c8d5f21941
Added extra tensor benchmarks
2016-01-28 16:20:36 -08:00
Benoit Steiner
7b3044d086
Made sure to call nvcc with the relaxed-constexpr flag.
2016-01-28 15:36:34 -08:00
Rasmus Munk Larsen
acce4dd050
Change Eigen's ColPivHouseholderQR to use the numerically stable norm downdate formula from http://www.netlib.org/lapack/lawnspdf/lawn176.pdf , which has been used in LAPACK's xGEQPF and xGEQP3 since 2006. With the old formula, the code chooses the wrong pivots and fails to correctly determine rank on graded matrices.
...
This change also adds additional checks for non-increasing diagonal in R11 to existing unit tests, and adds a new unit test with the Kahan matrix, which consistently fails for the original code.
Benchmark timings on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz. Code compiled with AVX & FMA. I just ran on square matrices of 3 different sizes.
Before:
Benchmark                  Time(ns)      CPU(ns)  Iterations
------------------------------------------------------------
BM_EigencolPivQR/64           53677        53627       12890
BM_EigencolPivQR/512       15265408     15250784          46
BM_EigencolPivQR/4k     15403556228  15388788368           2
After (non-vectorized version):
Benchmark                  Time(ns)      CPU(ns)  Iterations  Degradation
-------------------------------------------------------------------------
BM_EigencolPivQR/64           63736        63669       10844        18.5%
BM_EigencolPivQR/512       16052546     16037381          43         5.1%
BM_EigencolPivQR/4k     15149263620  15132025316           2        -2.0%
Performance-wise there seems to be a ~18.5% degradation for small (64x64) matrices, probably due to the cost of more O(min(m,n)^2) sqrt operations that are not needed for the unstable formula.
2016-01-28 15:07:26 -08:00
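The norm downdate described in the commit above can be sketched in isolation. The following is a minimal standalone illustration of the LAPACK-style update rule from the cited working note, not Eigen's actual code: the `ColumnNorm` struct and the `tail_norm` and `downdate_norm` helpers are hypothetical names, and the column is stored as a plain `std::vector<double>` for simplicity. Each column keeps two values, a running (downdated) norm and the norm from the last explicit recomputation; when the downdate would lose too much accuracy relative to the last direct value, the norm is recomputed from scratch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical bookkeeping for one column; not part of Eigen's API.
struct ColumnNorm {
  double updated;  // running estimate, downdated after each elimination step
  double direct;   // value obtained at the last explicit recomputation
};

// Euclidean norm of column[first], column[first + 1], ...
inline double tail_norm(const std::vector<double>& column, std::size_t first) {
  double sum = 0.0;
  for (std::size_t i = first; i < column.size(); ++i) sum += column[i] * column[i];
  return std::sqrt(sum);
}

// After step k, remove the contribution of column[k] from the partial norm.
inline void downdate_norm(ColumnNorm& n, const std::vector<double>& column, std::size_t k) {
  assert(k < column.size());
  if (n.updated == 0.0) return;  // nothing left to downdate

  // Threshold sqrt(machine epsilon), as used by LAPACK's pivoted QR routines.
  const double threshold = std::sqrt(std::numeric_limits<double>::epsilon());

  const double ratio = std::abs(column[k]) / n.updated;
  // (1 + r)(1 - r) instead of 1 - r*r, clamped at zero.
  const double temp = std::max(0.0, (1.0 + ratio) * (1.0 - ratio));
  const double rel = n.updated / n.direct;
  const double temp2 = temp * rel * rel;

  if (temp2 <= threshold) {
    // Too much cancellation since the last direct value: recompute explicitly.
    n.direct = tail_norm(column, k + 1);
    n.updated = n.direct;
  } else {
    n.updated *= std::sqrt(temp);
  }
}
```

The extra square root in the `else` branch, executed once per remaining column at every step, is consistent with the commit's explanation for the slowdown on small matrices.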
Gael Guennebaud
b908e071a8
bug #178 : get rid of some const_cast in SparseCore
2016-01-28 22:11:18 +01:00
Gael Guennebaud
c1d900af61
bug #178 : remove additional const on nested expression, and remove several const_cast.
2016-01-28 21:43:20 +01:00
Benoit Steiner
12f8bd12a2
Merged in jiayq/eigen (pull request PR-159)
...
Modifications to the tensor benchmarks to allow compilation in a standalone fashion.
2016-01-28 11:28:55 -08:00
Yangqing Jia
270c4e1ecd
bugfix
2016-01-28 11:11:45 -08:00
Yangqing Jia
c4e47630b1
benchmark modifications to make it compilable in a standalone fashion.
2016-01-28 10:35:14 -08:00
Gael Guennebaud
f50bb1e6f3
Fix compilation with gcc
2016-01-28 13:25:26 +01:00
Gael Guennebaud
ddf64babde
merge
2016-01-28 13:21:48 +01:00
Gael Guennebaud
df15fbc452
bug #1158 : PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag
2016-01-28 13:16:30 +01:00
Gael Guennebaud
9bcadb7fd1
Disable stupid MSVC warning
2016-01-28 12:14:16 +01:00
Gael Guennebaud
b4d87fff4a
Fix MSVC warning.
2016-01-28 12:12:30 +01:00
Gael Guennebaud
2bad3e78d9
bug #96 , bug #1006 : fix by value argument in result_of.
2016-01-28 12:12:06 +01:00
Gael Guennebaud
7802a6bb1c
Fix unit test filename.
2016-01-28 09:35:37 +01:00
Benoit Steiner
4bf9eaf77a
Deleted an invalid assertion that prevented the assignment of empty tensors.
2016-01-27 17:09:30 -08:00
Benoit Steiner
291069e885
Fixed some compilation problems with nvcc + clang
2016-01-27 15:37:03 -08:00
Benoit Steiner
47ca9dc809
Fixed the tensor_cuda test
2016-01-27 14:58:48 -08:00
Benoit Steiner
55a5204319
Fixed the flags passed to nvcc to compile the tensor code.
2016-01-27 14:46:34 -08:00
Gael Guennebaud
4865e1e732
Update link to suitesparse.
2016-01-27 22:48:40 +01:00
Benoit Steiner
9dfbd4fe8d
Made the cuda tests compile using make check
2016-01-27 12:22:17 -08:00
Benoit Steiner
5973bcf939
Properly specify the namespace when calling cout/endl
2016-01-27 12:04:42 -08:00
Eugene Brevdo
c8d94ae944
digamma special function: merge shared code.
...
Moved type-specific code into a helper class digamma_impl_maybe_poly<Scalar>.
2016-01-27 09:52:29 -08:00
Gael Guennebaud
9c8f7dfe94
bug #1156 : fix several function declarations whose arguments were passed by value instead of being passed by reference
2016-01-27 18:34:42 +01:00
Gael Guennebaud
9aa6fae123
bug #1154 : move to dynamic scheduling for spmv products.
2016-01-27 18:03:51 +01:00
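As a rough illustration of what dynamic scheduling means for a sparse matrix-vector product, here is a minimal sketch that assumes a row-major CSR layout and OpenMP. The `CsrMatrix` struct, the `spmv` function, and the chunk size of 16 are assumptions made for this example and are not taken from Eigen. Rows can hold very different numbers of nonzeros, so handing out rows in chunks at run time tends to balance the load better than a fixed, even split.

```cpp
#include <vector>

// Hypothetical CSR container for illustration; not Eigen's SparseMatrix.
struct CsrMatrix {
  int rows;
  std::vector<int> row_start;   // size rows + 1; row i owns [row_start[i], row_start[i + 1])
  std::vector<int> col_index;   // column of each stored value
  std::vector<double> values;   // stored nonzero values
};

// Computes y = A * x, one output entry per row.
inline std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
  std::vector<double> y(a.rows, 0.0);
  // Threads fetch chunks of 16 rows as they become idle. Without OpenMP
  // enabled, the pragma is ignored and the loop runs serially.
  #pragma omp parallel for schedule(dynamic, 16)
  for (int i = 0; i < a.rows; ++i) {
    double sum = 0.0;
    for (int p = a.row_start[i]; p < a.row_start[i + 1]; ++p) {
      sum += a.values[p] * x[a.col_index[p]];
    }
    y[i] = sum;  // each row writes a distinct entry, so no synchronization is needed
  }
  return y;
}
```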
Gael Guennebaud
9ac8e8c6a1
Extend mixing type unit test with trmv, and the following not yet supported products: trmm, symv, symm
2016-01-27 17:29:53 +01:00
Gael Guennebaud
6da5d87f92
add nomalloc unit test for rank2 updates
2016-01-27 17:26:48 +01:00
Gael Guennebaud
9801c959e6
Fix tri = complex * real product, and add respective unit test.
2016-01-27 17:12:25 +01:00
Gael Guennebaud
21b5345782
Add meta_least_common_multiple helper.
2016-01-27 17:11:39 +01:00
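A compile-time least-common-multiple helper of this kind can be sketched with recursive templates. The version below is only an illustration under the name `lcm_sketch`, which is hypothetical; it is not claimed to match Eigen's implementation. It tries K = 1, 2, ... until A*K is divisible by B, and assumes both arguments are positive and small enough that the recursion stays within the compiler's template depth limit.

```cpp
// Hypothetical name for illustration; not Eigen's helper.
// Primary template: A*K is not yet a multiple of B, so try K + 1.
template <int A, int B, int K = 1, bool Done = ((A * K) % B == 0)>
struct lcm_sketch {
  static const int value = lcm_sketch<A, B, K + 1>::value;
};

// Terminal case: A*K is the first multiple of A that B divides.
template <int A, int B, int K>
struct lcm_sketch<A, B, K, true> {
  static const int value = A * K;
};

static_assert(lcm_sketch<4, 6>::value == 12, "lcm(4, 6) is 12");
static_assert(lcm_sketch<3, 5>::value == 15, "lcm(3, 5) is 15");
```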
Gael Guennebaud
fecea26d93
Extend doc on shifting strategy
2016-01-27 15:55:15 +01:00
Ville Kallioniemi
02db1228ed
Add constructor for long types.
2016-01-26 23:41:01 -07:00
Gael Guennebaud
412bb5a631
Remove redundant test.
2016-01-26 23:35:30 +01:00
Gael Guennebaud
0f8d26c6a9
Doc: add MATLAB equivalents for flip* and arrayfun.
2016-01-26 23:34:48 +01:00
Gael Guennebaud
cfa21f8123
Remove dead code.
2016-01-26 23:33:15 +01:00
Gael Guennebaud
6850eab33b
Re-enable blocking on rows in non-l3 blocking mode.
2016-01-26 23:32:48 +01:00
Gael Guennebaud
aa8c6a251e
Make sure that micro-panel-size is smaller than blocking sizes (otherwise we might get a buffer overflow)
2016-01-26 23:31:48 +01:00
Gael Guennebaud
5b0a9ee003
Make sure that block sizes are smaller than input matrix sizes.
2016-01-26 23:30:24 +01:00
Benoit Jacob
639b1d864a
bug #1152 : Fix data race in static initialization of blas
2016-01-26 11:44:16 -05:00
Christoph Hertzberg
44d4674955
bug #1153 : Don't rely on __GXX_EXPERIMENTAL_CXX0X__ to detect C++11 support
2016-01-26 16:45:33 +01:00
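One common way to detect C++11 without the GCC-specific `__GXX_EXPERIMENTAL_CXX0X__` macro is to test the standard `__cplusplus` value, with a compiler-version fallback for MSVC. The sketch below is an assumption about how such a check might look, not the code from this commit, and `SKETCH_HAS_CXX11` is a hypothetical macro name.

```cpp
// Hypothetical macro name for illustration; not Eigen's configuration macro.
#if defined(__cplusplus) && __cplusplus >= 201103L
  // Conforming compilers report 201103L or later in C++11 mode.
  #define SKETCH_HAS_CXX11 1
#elif defined(_MSC_VER) && _MSC_VER >= 1900
  // MSVC keeps __cplusplus at 199711L unless /Zc:__cplusplus is passed,
  // so the compiler version is checked instead (1900 is Visual Studio 2015).
  #define SKETCH_HAS_CXX11 1
#else
  #define SKETCH_HAS_CXX11 0
#endif

#if SKETCH_HAS_CXX11
// C++11-only constructs can be guarded by the flag.
static_assert(sizeof(long long) >= 8, "long long holds at least 64 bits");
#endif
```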
Hauke Heibel
5eb2790be0
Fixed minor typo in SplineFitting.
2016-01-25 22:17:52 +01:00
Gael Guennebaud
8328caa618
bug #51 : add block preallocation mechanism to selfadjoint*matrix product.
2016-01-25 22:06:42 +01:00
Gael Guennebaud
2f9e6314b1
update BLAS interface to general_matrix_matrix_triangular_product
2016-01-25 21:56:05 +01:00
Gael Guennebaud
e58827d2ed
bug #51 : make general_matrix_matrix_triangular_product use L3-blocking helper so that general symmetric rank-updates and general-matrix-to-triangular products do not trigger dynamic memory allocation for fixed size matrices.
2016-01-25 17:16:33 +01:00
Gael Guennebaud
c10021c00a
bug #1144 : clarify the doc about aliasing in case of resizing and matrix product.
2016-01-25 15:50:55 +01:00
Gael Guennebaud
b114e6fd3b
Improve documentation.
2016-01-25 11:56:25 +01:00
Gael Guennebaud
869b4443ac
Add SparseVector::conservativeResize() method.
2016-01-25 11:55:39 +01:00
Benoit Steiner
e3a15a03a4
Don't explicitly evaluate the subexpression from TensorForcedEval::evalSubExprIfNeeded, as it will be done when executing the EvalTo subexpression
2016-01-24 23:04:50 -08:00