eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Benoit Steiner	120e13b1b6	Added a readme to explain how to compile the tensor benchmarks.	2016-01-28 17:06:00 -08:00
Benoit Steiner	a68864b6bc	Updated the benchmarking code to print the number of flops processed instead of the number of bytes.	2016-01-28 16:51:40 -08:00
Benoit Steiner	8217281ae4	Merge latest updates from trunk	2016-01-28 16:20:53 -08:00
Benoit Steiner	c8d5f21941	Added extra tensor benchmarks	2016-01-28 16:20:36 -08:00
Benoit Steiner	7b3044d086	Made sure to call nvcc with the relaxed-constexpr flag.	2016-01-28 15:36:34 -08:00
Rasmus Munk Larsen	acce4dd050	Change Eigen's ColPivHouseholderQR to use the numerically stable norm downdate formula from http://www.netlib.org/lapack/lawnspdf/lawn176.pdf, which has been used in LAPACK's xGEQPF and xGEQP3 since 2006. With the old formula, the code chooses the wrong pivots and fails to correctly determine rank on graded matrices. This change also adds additional checks for non-increasing diagonal in R11 to existing unit tests, and adds a new unit test with the Kahan matrix, which consistently fails for the original code.	2016-01-28 15:07:26 -08:00
		Benchmark timings on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz. Code compiled with AVX & FMA. I just ran on square matrices of 3 different sizes.
		Benchmark                  Time(ns)       CPU(ns)  Iterations
		-------------------------------------------------------------
		Before:
		BM_EigencolPivQR/64           53677         53627       12890
		BM_EigencolPivQR/512       15265408      15250784          46
		BM_EigencolPivQR/4k     15403556228   15388788368           2
		After (non-vectorized version):
		Benchmark                  Time(ns)       CPU(ns)  Iterations  Degradation
		--------------------------------------------------------------------------
		BM_EigencolPivQR/64           63736         63669       10844        18.5%
		BM_EigencolPivQR/512       16052546      16037381          43         5.1%
		BM_EigencolPivQR/4k     15149263620   15132025316           2        -2.0%
		Performance-wise there seems to be a ~18.5% degradation for small (64x64) matrices, probably due to the cost of more O(min(m,n)^2) sqrt operations that are not needed for the unstable formula.
Gael Guennebaud	b908e071a8	bug #178 : get rid of some const_cast in SparseCore	2016-01-28 22:11:18 +01:00
Gael Guennebaud	c1d900af61	bug #178 : remove additional const on nested expression, and remove several const_cast.	2016-01-28 21:43:20 +01:00
Benoit Steiner	12f8bd12a2	Merged in jiayq/eigen (pull request PR-159) Modifications to the tensor benchmarks to allow compilation in a standalone fashion.	2016-01-28 11:28:55 -08:00
Yangqing Jia	270c4e1ecd	bugfix	2016-01-28 11:11:45 -08:00
Yangqing Jia	c4e47630b1	benchmark modifications to make it compilable in a standalone fashion.	2016-01-28 10:35:14 -08:00
Gael Guennebaud	f50bb1e6f3	Fix compilation with gcc	2016-01-28 13:25:26 +01:00
Gael Guennebaud	ddf64babde	merge	2016-01-28 13:21:48 +01:00
Gael Guennebaud	df15fbc452	bug #1158 : PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag	2016-01-28 13:16:30 +01:00
Gael Guennebaud	9bcadb7fd1	Disable stupid MSVC warning	2016-01-28 12:14:16 +01:00
Gael Guennebaud	b4d87fff4a	Fix MSVC warning.	2016-01-28 12:12:30 +01:00
Gael Guennebaud	2bad3e78d9	bug #96 , bug #1006 : fix by value argument in result_of.	2016-01-28 12:12:06 +01:00
Gael Guennebaud	7802a6bb1c	Fix unit test filename.	2016-01-28 09:35:37 +01:00
Benoit Steiner	4bf9eaf77a	Deleted an invalid assertion that prevented the assignment of empty tensors.	2016-01-27 17:09:30 -08:00
Benoit Steiner	291069e885	Fixed some compilation problems with nvcc + clang	2016-01-27 15:37:03 -08:00
Benoit Steiner	47ca9dc809	Fixed the tensor_cuda test	2016-01-27 14:58:48 -08:00
Benoit Steiner	55a5204319	Fixed the flags passed to nvcc to compile the tensor code.	2016-01-27 14:46:34 -08:00
Gael Guennebaud	4865e1e732	Update link to suitesparse.	2016-01-27 22:48:40 +01:00
Benoit Steiner	9dfbd4fe8d	Made the cuda tests compile using make check	2016-01-27 12:22:17 -08:00
Benoit Steiner	5973bcf939	Properly specify the namespace when calling cout/endl	2016-01-27 12:04:42 -08:00
Eugene Brevdo	c8d94ae944	digamma special function: merge shared code. Moved type-specific code into a helper class digamma_impl_maybe_poly<Scalar>.	2016-01-27 09:52:29 -08:00
Gael Guennebaud	9c8f7dfe94	bug #1156 : fix several function declarations whose arguments were passed by value instead of being passed by reference	2016-01-27 18:34:42 +01:00
Gael Guennebaud	9aa6fae123	bug #1154 : move to dynamic scheduling for spmv products.	2016-01-27 18:03:51 +01:00
Gael Guennebaud	9ac8e8c6a1	Extend mixing type unit test with trmv, and the following not yet supported products: trmm, symv, symm	2016-01-27 17:29:53 +01:00
Gael Guennebaud	6da5d87f92	add nomalloc unit test for rank2 updates	2016-01-27 17:26:48 +01:00
Gael Guennebaud	9801c959e6	Fix tri = complex * real product, and add respective unit test.	2016-01-27 17:12:25 +01:00
Gael Guennebaud	21b5345782	Add meta_least_common_multiple helper.	2016-01-27 17:11:39 +01:00
Gael Guennebaud	fecea26d93	Extend doc on shifting strategy	2016-01-27 15:55:15 +01:00
Ville Kallioniemi	02db1228ed	Add constructor for long types.	2016-01-26 23:41:01 -07:00
Gael Guennebaud	412bb5a631	Remove redundant test.	2016-01-26 23:35:30 +01:00
Gael Guennebaud	0f8d26c6a9	Doc: add MATLAB equivalents for flip* and arrayfun.	2016-01-26 23:34:48 +01:00
Gael Guennebaud	cfa21f8123	Remove dead code.	2016-01-26 23:33:15 +01:00
Gael Guennebaud	6850eab33b	Re-enable blocking on rows in non-l3 blocking mode.	2016-01-26 23:32:48 +01:00
Gael Guennebaud	aa8c6a251e	Make sure that micro-panel-size is smaller than blocking sizes (otherwise we might get a buffer overflow)	2016-01-26 23:31:48 +01:00
Gael Guennebaud	5b0a9ee003	Make sure that block sizes are smaller than input matrix sizes.	2016-01-26 23:30:24 +01:00
Benoit Jacob	639b1d864a	bug #1152 : Fix data race in static initialization of blas	2016-01-26 11:44:16 -05:00
Christoph Hertzberg	44d4674955	bug #1153 : Don't rely on __GXX_EXPERIMENTAL_CXX0X__ to detect C++11 support	2016-01-26 16:45:33 +01:00
Hauke Heibel	5eb2790be0	Fixed minor typo in SplineFitting.	2016-01-25 22:17:52 +01:00
Gael Guennebaud	8328caa618	bug #51 : add block preallocation mechanism to selfadjoint*matrix product.	2016-01-25 22:06:42 +01:00
Gael Guennebaud	2f9e6314b1	update BLAS interface to general_matrix_matrix_triangular_product	2016-01-25 21:56:05 +01:00
Gael Guennebaud	e58827d2ed	bug #51 : make general_matrix_matrix_triangular_product use L3-blocking helper so that general symmetric rank-updates and general-matrix-to-triangular products do not trigger dynamic memory allocation for fixed size matrices.	2016-01-25 17:16:33 +01:00
Gael Guennebaud	c10021c00a	bug #1144 : clarify the doc about aliasing in case of resizing and matrix product.	2016-01-25 15:50:55 +01:00
Gael Guennebaud	b114e6fd3b	Improve documentation.	2016-01-25 11:56:25 +01:00
Gael Guennebaud	869b4443ac	Add SparseVector::conservativeResize() method.	2016-01-25 11:55:39 +01:00
Benoit Steiner	e3a15a03a4	Don't explicitly evaluate the subexpression from TensorForcedEval::evalSubExprIfNeeded, as it will be done when executing the EvalTo subexpression	2016-01-24 23:04:50 -08:00

7400 Commits
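
Commit acce4dd050 in the list above replaces the column-norm update in ColPivHouseholderQR with the numerically stable downdate from LAPACK Working Note 176. As a rough illustration of why that matters, the Python sketch below contrasts the textbook update with a safeguarded one modeled on the update in LAPACK's xLAQP2 routine. It is not Eigen's code: the function names, the tolerance (square root of Python's machine epsilon), and the small test columns are assumptions chosen for illustration.

```python
import math
import sys

# Assumed recompute threshold: square root of machine epsilon.
TOL = math.sqrt(sys.float_info.epsilon)


def naive_downdate(vn1, removed):
    """Textbook update: sqrt(old^2 - removed^2). Prone to cancellation."""
    return math.sqrt(max(0.0, vn1 * vn1 - removed * removed))


def stable_downdate(vn1, vn2, removed, remaining):
    """Safeguarded update of a partial column norm (hypothetical helper).

    vn1       -- current estimate of the partial column norm
    vn2       -- partial column norm at the last exact recomputation
    removed   -- entry that leaves the trailing part of the column
    remaining -- entries still in the trailing part (read only on recompute)

    Returns the updated pair (vn1, vn2).
    """
    if vn1 == 0.0:
        return 0.0, 0.0
    temp = max(0.0, 1.0 - (abs(removed) / vn1) ** 2)
    temp2 = temp * (vn1 / vn2) ** 2
    if temp2 <= TOL:
        # Too much cancellation since the last exact norm: recompute it.
        exact = math.sqrt(math.fsum(x * x for x in remaining))
        return exact, exact
    # Safe to scale the running estimate; vn2 is left unchanged.
    return vn1 * math.sqrt(temp), vn2


if __name__ == "__main__":
    # Graded column: one large entry followed by a tiny one.
    column = [1.0, 1e-9]
    norm = math.sqrt(math.fsum(x * x for x in column))  # rounds to 1.0
    print("naive :", naive_downdate(norm, column[0]))
    print("stable:", stable_downdate(norm, norm, column[0], column[1:])[0])
```

On the graded column, the naive update loses the small trailing entry entirely, while the safeguarded version detects the cancellation and recomputes the norm from the remaining entries. The extra square roots and occasional recomputations are consistent with the small-matrix slowdown reported in the commit message.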