eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Benoit Steiner	a432fc102d	Moved the choice of ThreadPool to unsupported/Eigen/CXX11/ThreadPool	2016-12-12 15:24:16 -08:00
Benoit Steiner	8ae68924ed	Made ThreadPoolInterface::Cancel() an optional functionality	2016-12-12 11:58:38 -08:00
Gael Guennebaud	57acb05eef	Update and extend doc on alignment issues.	2016-12-11 22:45:32 +01:00
Benoit Steiner	76fca22134	Use a more accurate timer to sleep on Linux systems.	2016-12-09 15:12:24 -08:00
Benoit Steiner	4deafd35b7	Introduce a portable EIGEN_SLEEP macro.	2016-12-09 14:52:15 -08:00
Benoit Steiner	aafa97f4d2	Fixed build error with MSVC	2016-12-09 14:42:32 -08:00
Benoit Steiner	2f5b7a199b	Reworked the threadpool cancellation mechanism to not depend on pthread_cancel since it turns out that pthread_cancel doesn't work properly on numerous platforms.	2016-12-09 13:05:14 -08:00
Benoit Steiner	3d59a47720	Added a message to ease the detection of platforms on which thread cancellation isn't supported.	2016-12-08 14:51:46 -08:00
Benoit Steiner	28ee8f42b2	Added a Flush method to the RunQueue	2016-12-08 14:07:56 -08:00
Benoit Steiner	69ef267a77	Added the new threadpool cancel method to the threadpool interface base class.	2016-12-08 14:03:25 -08:00
Benoit Steiner	7bfff85355	Added support for thread cancellation on Linux	2016-12-08 08:12:49 -08:00
Benoit Steiner	6811e6cf49	Merged in srvasude/eigen/fix_cuda_exp (pull request PR-268) Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).	2016-12-08 05:14:11 -08:00
Gael Guennebaud	747202d338	typo	2016-12-08 12:48:15 +01:00
Gael Guennebaud	bb297abb9e	make sure we use the right eigen version	2016-12-08 12:00:11 +01:00
Gael Guennebaud	8b4b00d277	fix usage of custom compiler	2016-12-08 11:59:39 +01:00
Gael Guennebaud	7105596899	Add missing include and use -O3	2016-12-07 16:56:08 +01:00
Gael Guennebaud	780f3c1adf	Fix call to convert on linux	2016-12-07 16:30:11 +01:00
Gael Guennebaud	3855ab472f	Cleanup file structure	2016-12-07 14:23:49 +01:00
Gael Guennebaud	59a59fa8e7	Update perf monitoring scripts to generate html/svg outputs	2016-12-07 13:36:56 +01:00
Gael Guennebaud	f2c506b03d	Add a script example to run and upload performance tests	2016-12-06 16:46:52 +01:00
Gael Guennebaud	1b4e085a7f	generate png file for web upload	2016-12-06 16:46:22 +01:00
Gael Guennebaud	f725f1cebc	Mention the CMAKE_PREFIX_PATH variable.	2016-12-06 15:23:45 +01:00
Gael Guennebaud	f90c4aebc5	Update monitored changeset lists	2016-12-06 15:07:46 +01:00
Gael Guennebaud	eb621413c1	Revert vec/y to vec*(1/y) in row-major TRSM: - div is extremely costly - this is consistent with the column-major case - this is consistent with all other BLAS implementations	2016-12-06 15:04:50 +01:00
Gael Guennebaud	8365c2c941	Fix BLAS backend for symmetric rank K updates.	2016-12-06 14:47:09 +01:00
Gael Guennebaud	0c4d05b009	Explain how to choose your favorite Eigen version	2016-12-06 11:34:06 +01:00
Silvio Traversaro	e049a2a72a	Added relocatable cmake support also for CMake before 3.0 and after 2.8.8	2016-12-06 10:37:34 +01:00
Srinivas Vasudevan	e6c8b5500c	Change comparisons to use Scalar instead of RealScalar.	2016-12-05 14:01:45 -08:00
Srinivas Vasudevan	f7d7c33a28	Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).	2016-12-05 12:19:01 -08:00
Silvio Traversaro	18481b518f	Make CMake config file relocatable	2016-12-05 10:39:52 +01:00
Gael Guennebaud	c68c8631e7	fix compilation of BTL's blaze interface	2016-12-05 23:02:16 +01:00
Gael Guennebaud	1ff1d4a124	Add performance monitoring for LLT	2016-12-05 23:01:52 +01:00
Srinivas Vasudevan	09ee7f0c80	Fix small nit where I changed name of plog1p to pexpm1.	2016-12-02 15:30:12 -08:00
Srinivas Vasudevan	a0d3ac760f	Sync from Head.	2016-12-02 14:14:45 -08:00
Srinivas Vasudevan	218764ee1f	Added support for expm1 in Eigen.	2016-12-02 14:13:01 -08:00
Gael Guennebaud	66f65ccc36	Ease compiler job to generate clean and efficient code in mat*vec.	2016-12-02 22:41:26 +01:00
Gael Guennebaud	fe696022ec	Operators += and -= do not resize!	2016-12-02 22:40:25 +01:00
Angelos Mantzaflaris	18de92329e	use numext::abs (grafted from `0a08d4c60b` )	2016-12-02 11:48:06 +01:00
Angelos Mantzaflaris	e8a6aa518e	1. Add explicit template to abs2 (resolves deduction for some arithmetic types) 2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned) (grafted from `4086187e49` )	2016-12-02 11:39:18 +01:00
Gael Guennebaud	a6b971e291	Fix memory leak in Ref<Sparse>	2016-12-05 16:59:30 +01:00
Gael Guennebaud	8640ffac65	Optimize SparseLU::solve for rhs vectors	2016-12-05 15:41:14 +01:00
Gael Guennebaud	62acd67903	remove temporary in SparseLU::solve	2016-12-05 15:11:57 +01:00
Gael Guennebaud	0db6d5b3f4	bug #1356 : fix calls to evaluator::coeffRef(0,0) to get the address of the destination by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).	2016-12-05 15:08:09 +01:00
Gael Guennebaud	91003f3b86	typo	2016-12-05 13:51:07 +01:00
Gael Guennebaud	445c015751	extend monitoring benchmarks with transpose matrix-vector and triangular matrix-vectors.	2016-12-05 13:36:26 +01:00
Gael Guennebaud	e3f613cbd4	Improve performance of row-major-dense-matrix * vector products for recent CPUs. This revised version does not bother about aligned loads/stores, and rather processes 8 rows at once for better instruction pipelining.	2016-12-05 13:02:01 +01:00
Gael Guennebaud	3abc827354	Clean debugging code	2016-12-05 12:59:32 +01:00
Benoit Steiner	462c28e77a	Merged in srvasude/eigen (pull request PR-265) Add Expm1 support to Eigen.	2016-12-05 02:31:11 +00:00
Gael Guennebaud	4465d20403	Add missing generic load methods.	2016-12-03 21:25:04 +01:00
Gael Guennebaud	6a5fe86098	Complete rewrite of column-major-matrix * vector product to deliver higher performance on modern CPUs. The previous code had been optimized for Intel core2, for which unaligned loads/stores were prohibitively expensive. This new version exhibits much higher instruction independence (better pipelining) and explicitly leverages FMA. According to my benchmark, on Haswell this new kernel is always faster than the previous one, and sometimes even twice as fast. Even higher performance could be achieved with a better blocking size heuristic and, perhaps, with explicit prefetching. We should also check triangular product/solve to optimally exploit this new kernel (working on a vertical panel of 4 columns is probably not optimal anymore).	2016-12-03 21:14:14 +01:00

8968 Commits