eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	2c3224924b	Fix warning and replace min/max macros by calls to mini/maxi	2016-02-01 10:23:45 +01:00
Benoit Steiner	e80ed948e1	Fixed a number of compilation warnings generated by the cuda tests	2016-01-31 20:09:41 -08:00
Benoit Steiner	6720b38fbf	Fixed a few compilation warnings	2016-01-31 16:48:50 -08:00
Benoit Steiner	3f1ee45833	Fixed compilation errors triggered by duplicate inline declaration	2016-01-31 10:48:49 -08:00
Benoit Steiner	70be6f6531	Pulled latest changes from trunk	2016-01-31 10:44:45 -08:00
Benoit Steiner	4a2ddfb81d	Sharded the CUDA argmax tensor test	2016-01-31 10:44:15 -08:00
Gael Guennebaud	d142165942	bug #667 : declare several critical functions as FORCE_INLINE to make ICC happier. <g.gael@free.fr> HG: branch 'default' HG: changed Eigen/src/Core/ArrayBase.h HG: changed Eigen/src/Core/AssignEvaluator.h HG: changed Eigen/src/Core/CoreEvaluators.h HG: changed Eigen/src/Core/CwiseUnaryOp.h HG: changed Eigen/src/Core/DenseBase.h HG: changed Eigen/src/Core/MatrixBase.h	2016-01-31 16:34:10 +01:00
Gael Guennebaud	a4e4542b89	Avoid overflow in unit test.	2016-01-30 22:26:17 +01:00
Gael Guennebaud	3ba8a3ab1a	Disable underflow unit test on the i387 FPU.	2016-01-30 22:14:04 +01:00
Benoit Steiner	483082ef6e	Fixed a few memory leaks in the cuda tests	2016-01-30 11:59:22 -08:00
Benoit Steiner	bd21aba181	Sharded the cxx11_tensor_cuda test and fixed a memory leak	2016-01-30 11:47:09 -08:00
Benoit Steiner	9de155d153	Added a test to cover threaded tensor shuffling	2016-01-30 10:56:47 -08:00
Benoit Steiner	32088c06a1	Made the comparison between single and multithreaded contraction results more resistant to numerical noise to prevent spurious test failures.	2016-01-30 10:51:14 -08:00
Benoit Steiner	2053478c56	Made sure to use a tensor of rank 0 to store the result of a full reduction in the tensor thread pool test	2016-01-30 10:46:36 -08:00
Benoit Steiner	d0db95f730	Sharded the tensor thread pool test	2016-01-30 10:43:57 -08:00
Benoit Steiner	ba27c8a7de	Made the CUDA contract test more robust to numerical noise.	2016-01-30 10:28:43 -08:00
Benoit Steiner	4281eb1e2c	Added 2 benchmarks to the suite of tensor benchmarks running on GPU	2016-01-30 10:20:43 -08:00
Gael Guennebaud	102fa96a96	Extend doc on dense+sparse	2016-01-30 14:58:21 +01:00
Gael Guennebaud	1bc207c528	backout changeset `d4a9e61569` : the extended SparseView is not needed anymore	2016-01-30 14:43:21 +01:00
Gael Guennebaud	8ed1553d20	bug #632 : implement general coefficient-wise "dense op sparse" operations through specialized evaluators instead of using SparseView. This makes it possible to handle arbitrary storage orders and to bypass the more complex iterator of the sparse-sparse case.	2016-01-30 14:39:50 +01:00
Gael Guennebaud	699634890a	bug #946 : generalize Cholmod::solve to handle any rhs expression	2016-01-29 23:02:22 +01:00
Gael Guennebaud	15084cf1ac	bug #632 : add support for "dense +/- sparse" operations. The current implementation is based on SparseView to make the dense subexpression compatible with the sparse one.	2016-01-29 22:09:45 +01:00
Gael Guennebaud	d4a9e61569	Extend SparseView to allow keeping explicit zeros. This is equivalent to sparseView(1,-1) but faster because the test is removed at compile-time.	2016-01-29 22:07:56 +01:00
Gael Guennebaud	d8d37349c3	bug #696 : enable zero-sized block at compile-time by relaxing the respective assertion	2016-01-29 12:44:49 +01:00
Gael Guennebaud	e8ccc06fe5	merge	2016-01-29 09:40:38 +01:00
Benoit Steiner	963f2d2a8f	Marked several methods EIGEN_DEVICE_FUNC	2016-01-28 23:37:48 -08:00
Benoit Steiner	c5d25bf1d0	Fixed a couple of compilation warnings.	2016-01-28 23:15:45 -08:00
Benoit Steiner	e4f83bae5d	Fixed the tensor benchmarks on apple devices	2016-01-28 21:08:07 -08:00
Benoit Steiner	10bea90c4a	Fixed clang related compilation error	2016-01-28 20:52:08 -08:00
Benoit Steiner	d3f533b395	Fixed compilation warning	2016-01-28 20:09:45 -08:00
Abhijit Kundu	3fde202215	Making ceil() functor generic w.r.t packet type	2016-01-28 21:27:00 -05:00
Benoit Steiner	211d350fc3	Fixed a typo	2016-01-28 17:13:04 -08:00
Benoit Steiner	bd2e5a788a	Made sure the number of floating point operations done by a benchmark is computed using 64 bit integers to avoid overflows.	2016-01-28 17:10:40 -08:00
Benoit Steiner	120e13b1b6	Added a readme to explain how to compile the tensor benchmarks.	2016-01-28 17:06:00 -08:00
Benoit Steiner	a68864b6bc	Updated the benchmarking code to print the number of flops processed instead of the number of bytes.	2016-01-28 16:51:40 -08:00
Benoit Steiner	8217281ae4	Merge latest updates from trunk	2016-01-28 16:20:53 -08:00
Benoit Steiner	c8d5f21941	Added extra tensor benchmarks	2016-01-28 16:20:36 -08:00
Benoit Steiner	7b3044d086	Made sure to call nvcc with the relaxed-constexpr flag.	2016-01-28 15:36:34 -08:00
Rasmus Munk Larsen	acce4dd050	Change Eigen's ColPivHouseholderQR to use the numerically stable norm downdate formula from http://www.netlib.org/lapack/lawnspdf/lawn176.pdf , which has been used in LAPACK's xGEQPF and xGEQP3 since 2006. With the old formula, the code chooses the wrong pivots and fails to correctly determine rank on graded matrices. This change also adds additional checks for non-increasing diagonal in R11 to existing unit tests, and adds a new unit test with the Kahan matrix, which consistently fails for the original code. Benchmark timings on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz. Code compiled with AVX & FMA. I just ran on square matrices of 3 different sizes. Columns: Benchmark | Time(ns) | CPU(ns) | Iterations. Before: BM_EigencolPivQR/64 | 53677 | 53627 | 12890; BM_EigencolPivQR/512 | 15265408 | 15250784 | 46; BM_EigencolPivQR/4k | 15403556228 | 15388788368 | 2. After (non-vectorized version), with an added Degradation column: BM_EigencolPivQR/64 | 63736 | 63669 | 10844 | 18.5%; BM_EigencolPivQR/512 | 16052546 | 16037381 | 43 | 5.1%; BM_EigencolPivQR/4k | 15149263620 | 15132025316 | 2 | -2.0%. Performance-wise there seems to be a ~18.5% degradation for small (64x64) matrices, probably due to the cost of more O(min(m,n)^2) sqrt operations that are not needed for the unstable formula.	2016-01-28 15:07:26 -08:00
Gael Guennebaud	b908e071a8	bug #178 : get rid of some const_cast in SparseCore	2016-01-28 22:11:18 +01:00
Gael Guennebaud	c1d900af61	bug #178 : remove additional const on nested expression, and remove several const_cast.	2016-01-28 21:43:20 +01:00
Benoit Steiner	12f8bd12a2	Merged in jiayq/eigen (pull request PR-159) Modifications to the tensor benchmarks to allow compilation in a standalone fashion.	2016-01-28 11:28:55 -08:00
Yangqing Jia	270c4e1ecd	bugfix	2016-01-28 11:11:45 -08:00
Yangqing Jia	c4e47630b1	benchmark modifications to make it compilable in a standalone fashion.	2016-01-28 10:35:14 -08:00
Gael Guennebaud	f50bb1e6f3	Fix compilation with gcc	2016-01-28 13:25:26 +01:00
Gael Guennebaud	ddf64babde	merge	2016-01-28 13:21:48 +01:00
Gael Guennebaud	df15fbc452	bug #1158 : PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag	2016-01-28 13:16:30 +01:00
Gael Guennebaud	9bcadb7fd1	Disable stupid MSVC warning	2016-01-28 12:14:16 +01:00
Gael Guennebaud	b4d87fff4a	Fix MSVC warning.	2016-01-28 12:12:30 +01:00
Gael Guennebaud	2bad3e78d9	bug #96 , bug #1006 : fix by value argument in result_of.	2016-01-28 12:12:06 +01:00

7634 Commits