eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-02-11 18:00:51 +08:00

Author	SHA1	Message	Date
Mehdi Goli	524fa4c46f	Reducing the code by generalising sycl backend functions/structs.	2016-10-14 12:09:55 +01:00
Benoit Steiner	737e4152c3	Merged in lukier/eigen (pull request PR-234) Enabling CUDA in Geometry	2016-10-13 18:09:28 +00:00
Robert Lukierski	a94791b69a	Fixes for min and abs after Benoit's comments, switched to numext.	2016-10-13 15:00:22 +01:00
Avi Ginsburg	ac63d6891c	Patch to allow VS2015 & CUDA 8.0 to compile with Eigen included. I'm not sure whether to limit the check to this compiler combination (` || (EIGEN_COMP_MSVC == 1900 && __CUDACC_VER__) `) or to leave it as it is. I also don't know if this will have any effect on including Eigen in device code (I'm not in my current project).	2016-10-13 08:47:32 +00:00
Benoit Steiner	7e4a6754b2	Merged eigen/eigen into default	2016-10-12 22:42:33 -07:00
Benoit Steiner	38b6048e14	Deleted redundant implementation of predux	2016-10-12 14:37:56 -07:00
Gael Guennebaud	e74612b9a0	Remove double ;;	2016-10-12 22:49:47 +02:00
Benoit Steiner	78d2926508	Merged eigen/eigen into default	2016-10-12 13:46:29 -07:00
Benoit Steiner	2e2f48e30e	Take advantage of AVX512 instructions whenever possible to speedup the processing of 16 bit floats.	2016-10-12 13:45:39 -07:00
Gael Guennebaud	f939c351cb	Fix SPQR for rectangular matrices	2016-10-12 22:39:33 +02:00
Robert Lukierski	471075f7ad	Fixes min() warnings.	2016-10-12 18:59:05 +01:00
Gael Guennebaud	5c366fe1d7	Merged in rmlarsen/eigen (pull request PR-230) Fix a bug in psqrt for SSE and AVX when EIGEN_FAST_MATH=1	2016-10-12 16:30:51 +00:00
Robert Lukierski	86711497c4	Adding EIGEN_DEVICE_FUNC in the Geometry module. Additional CUDA necessary fixes in the Core (mostly usage of EIGEN_USING_STD_MATH).	2016-10-12 16:35:17 +01:00
Rasmus Munk Larsen	47150af1c8	Fix copy-paste error: Must use _mm256_cmp_ps for AVX.	2016-10-12 08:34:39 -07:00
Gael Guennebaud	89e315152c	bug #1325 : fix compilation on NEON with clang	2016-10-12 16:55:47 +02:00
Benoit Steiner	5727e4d89c	Reenabled the use of variadic templates on tegra x1 provided that the latest version (i.e. JetPack 2.3) is used.	2016-10-08 22:19:03 +00:00
Benoit Steiner	5c68051cd7	Merge the content of the ComputeCpp branch into the default branch	2016-10-07 11:04:16 -07:00
Gael Guennebaud	4860727ac2	Remove static qualifier of free-functions (inline is enough and this helps ICC to find the right overload)	2016-10-07 09:21:12 +02:00
Benoit Steiner	507b661106	Renamed predux_half into predux_downto4	2016-10-06 17:57:04 -07:00
Benoit Steiner	a498ff7df6	Fixed incorrect comment	2016-10-06 15:27:27 -07:00
Benoit Steiner	a7473d6d5a	Fixed compilation error with gcc >= 5.3	2016-10-06 14:33:22 -07:00
Benoit Steiner	5e64cea896	Silenced a compilation warning	2016-10-06 14:24:17 -07:00
Benoit Steiner	d485d12c51	Added missing AVX intrinsics for fp16: in particular, implemented predux which is required by the matrix-vector code.	2016-10-06 10:41:03 -07:00
Rasmus Munk Larsen	48c635e223	Add a simple cost model to prevent Eigen's parallel GEMM from using too many threads when the inner dimension is small. Timing for square matrices is unchanged, but both CPU and Wall time are significantly improved for skinny matrices. The benchmarks below are for multiplying NxK * KxN matrices with test names of the form BM_OuterishProd/N/K.
Improvements in Wall time:
Run on [redacted] (12 X 3501 MHz CPUs); 2016-10-05T17:40:02.462497196-07:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                  Base (ns)   New (ns)  Improvement
------------------------------------------------------------------
BM_OuterishProd/64/1            3088       1610     +47.9%
BM_OuterishProd/64/4            3562       2414     +32.2%
BM_OuterishProd/64/32           8861       7815     +11.8%
BM_OuterishProd/128/1          11363       6504     +42.8%
BM_OuterishProd/128/4          11128       9794     +12.0%
BM_OuterishProd/128/64         27691      27396      +1.1%
BM_OuterishProd/256/1          33214      28123     +15.3%
BM_OuterishProd/256/4          34312      36818      -7.3%
BM_OuterishProd/256/128       174866     176398      -0.9%
BM_OuterishProd/512/1        7963684     104224     +98.7%
BM_OuterishProd/512/4        7987913     112867     +98.6%
BM_OuterishProd/512/256      8198378    1306500     +84.1%
BM_OuterishProd/1k/1         7356256     324432     +95.6%
BM_OuterishProd/1k/4         8129616     331621     +95.9%
BM_OuterishProd/1k/512      27265418    7517538     +72.4%
Improvements in CPU time:
Run on [redacted] (12 X 3501 MHz CPUs); 2016-10-05T17:40:02.462497196-07:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                  Base (ns)   New (ns)  Improvement
------------------------------------------------------------------
BM_OuterishProd/64/1            6169       1608     +73.9%
BM_OuterishProd/64/4            7117       2412     +66.1%
BM_OuterishProd/64/32          17702      15616     +11.8%
BM_OuterishProd/128/1          45415       6498     +85.7%
BM_OuterishProd/128/4          44459       9786     +78.0%
BM_OuterishProd/128/64        110657     109489      +1.1%
BM_OuterishProd/256/1         265158      28101     +89.4%
BM_OuterishProd/256/4         274234     183885     +32.9%
BM_OuterishProd/256/128      1397160    1408776      -0.8%
BM_OuterishProd/512/1       78947048     520703     +99.3%
BM_OuterishProd/512/4       86955578    1349742     +98.4%
BM_OuterishProd/512/256     74701613   15584661     +79.1%
BM_OuterishProd/1k/1        78352601    3877911     +95.1%
BM_OuterishProd/1k/4        78521643    3966221     +94.9%
BM_OuterishProd/1k/512     258104736   89480530     +65.3%	2016-10-06 10:33:10 -07:00
Benoit Steiner	9f3276981c	Enabling AVX512 should also enable AVX2.	2016-10-06 10:29:48 -07:00
Gael Guennebaud	80b5133789	Fix compilation of qr.inverse() for column and full pivoting variants.	2016-10-06 09:55:50 +02:00
Benoit Steiner	4131074818	Deleted unnecessary CMakeLists.txt file	2016-10-05 18:54:35 -07:00
Benoit Steiner	cb5cd69872	Silenced a compilation warning.	2016-10-05 18:50:53 -07:00
Benoit Steiner	78b569f685	Merged latest updates from trunk	2016-10-05 18:48:55 -07:00
Benoit Steiner	9c2b6c049b	Silenced a few compilation warnings	2016-10-05 18:37:31 -07:00
Benoit Steiner	ae1385c7e4	Pull the latest updates from trunk	2016-10-05 14:54:36 -07:00
Benoit Steiner	698ff69450	Properly characterize the CUDA packet primitives for fp16 as device only	2016-10-04 16:53:30 -07:00
Rasmus Munk Larsen	7f67e6dfdb	Update comment for fast sqrt.	2016-10-04 15:09:11 -07:00
Rasmus Munk Larsen	765615609d	Update comment for fast sqrt.	2016-10-04 15:08:41 -07:00
Rasmus Munk Larsen	3ed67cb0bb	Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
               SSE      AVX
Fast=1         2.529G   4.380G
Fast=0         1.944G   1.898G
Fast=1 fixed   2.214G   3.739G
This table illustrates the worst case in terms of speed impact: it was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions become almost negligible.	2016-10-04 14:22:56 -07:00
Benoit Steiner	881b90e984	Use explicit type casting to generate packets of zeros.	2016-10-04 08:23:38 -07:00
Benoit Steiner	409e887d78	Added support for constant std::complex numbers on GPU	2016-10-03 11:06:24 -07:00
Gael Guennebaud	9d6d0dff8f	bug #1317 : fix performance regression with some Block expressions and clang by helping it to remove dead code. The trick is to get rid of the nested expression in the evaluator by copying only the required information (here, the strides).	2016-10-01 15:37:00 +02:00
Gael Guennebaud	8b84801f7f	bug #1310 : workaround a compilation regression from 3.2 regarding triangular * homogeneous	2016-09-30 22:49:59 +02:00
Gael Guennebaud	67b4f45836	Fix angle range	2016-09-30 12:46:33 +02:00
Gael Guennebaud	27f3970453	Remove std:: prefix	2016-09-30 12:40:41 +02:00
Gael Guennebaud	3860a0bc8f	bug #1312 : Quaternion to AxisAngle conversion now ensures the angle will be in the range [-pi,pi]. This also increases accuracy when q.w is negative.	2016-09-29 23:23:35 +02:00
Gael Guennebaud	33500050c3	bug #1308 : fix compilation of some small products involving nullary-expressions.	2016-09-29 09:40:44 +02:00
Benoit Steiner	27d7628f16	Updated the list of warnings to reflect the new message ids introduced in cuda 8.0	2016-09-28 17:42:59 -07:00
Gael Guennebaud	f3a00dd2b5	Merged in sergiu/eigen (pull request PR-229) Disabled MSVC level 4 warning C4714	2016-09-27 09:28:08 +02:00
Gael Guennebaud	892afb9416	Add debug info.	2016-09-26 23:53:57 +02:00
Gael Guennebaud	779774f98c	bug #1311 : fix alignment logic in some cases of (scalar*small).lazyProduct(small)	2016-09-26 23:53:40 +02:00
Gael Guennebaud	48dfe98abd	bug #1308 : fix compilation of vector * rowvector::nullary.	2016-09-25 14:54:35 +02:00
Sergiu Deitsch	fe29157d02	disabled MSVC level 4 warning C4714 The level 4 warning (/W4) warns about functions marked as __forceinline not inlined, and generates a lot of noise.	2016-09-25 14:25:47 +02:00
Gael Guennebaud	86caba838d	bug #1304 : fix Projective * scaling and Projective *= scaling	2016-09-23 13:41:21 +02:00
Benoit Steiner	2a69290ddb	Added a specialization of Eigen::numext::real and Eigen::numext::imag for std::complex<T> to be used when compiling a cuda kernel. This is unfortunately necessary to be able to process complex numbers from a CUDA kernel on MacOS.	2016-09-22 15:52:23 -07:00
Gael Guennebaud	77e27fbeee	bump to 3.3-rc1	2016-09-22 22:37:39 +02:00
Gael Guennebaud	2ada122bc6	merge	2016-09-22 22:33:18 +02:00
Gael Guennebaud	8f2bdde373	merge	2016-09-22 22:32:55 +02:00
Gael Guennebaud	ba0f844d6b	Backout changeset `ce3557ca69`	2016-09-22 22:28:51 +02:00
Benoit Steiner	50e3bbfc90	Calls x.imag() instead of imag(x) when x is a complex number since the former is a constexpr while the latter isn't. This fixes compilation errors triggered by nvcc on Mac.	2016-09-22 13:17:25 -07:00
Gael Guennebaud	ca3746c6f8	Bypass identity reflectors.	2016-09-22 22:07:13 +02:00
Felix Gruber	8bde7da086	fix documentation of LinSpaced The index of the highest value in a LinSpace is size-1.	2016-09-22 14:50:07 +02:00
Gael Guennebaud	66cbabafed	Add a note regarding gcc bug #72867	2016-09-22 11:18:52 +02:00
Gael Guennebaud	9fa2c8650e	Fix alignment of statically allocated temporaries in symv and trmv.	2016-09-21 17:34:24 +02:00
Gael Guennebaud	ac5377e161	Improve cost estimation of complex division	2016-09-21 17:26:04 +02:00
Benoit Steiner	26f9907542	Added missing typedefs	2016-09-20 12:58:03 -07:00
RJ Ryan	b2c6dc48d9	Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op.	2016-09-20 07:18:20 -07:00
Benoit Steiner	8a66ca4b10	Pulled latest updates from trunk	2016-09-19 14:13:55 -07:00
Benoit Steiner	59e9edfbf1	Removed EIGEN_DEVICE_FUNC qualifiers for the lu(), fullPivLu(), partialPivLu(), and inverse() functions since they aren't ready to run on GPU	2016-09-19 14:13:20 -07:00
Hongkai Dai	5dcc6d301a	remove ternary operator in euler angles	2016-09-19 10:30:30 -07:00
Luke Iwanski	b91e021172	Merged with default.	2016-09-19 14:03:54 +01:00
Luke Iwanski	cb81975714	Partial OpenCL support via SYCL compatible with ComputeCpp CE.	2016-09-19 12:44:13 +01:00
Gael Guennebaud	4cc2c73e6a	Fix alignment of statically allocated temporaries in gemv.	2016-09-17 12:52:27 +02:00
Christoph Hertzberg	ce3557ca69	Make makeHouseholder more stable for cases where real(c0) is not very small (but the rest is).	2016-09-16 14:24:47 +02:00
Gael Guennebaud	ee62f168e6	Doc: add link from block methods to respective tutorial section.	2016-09-16 11:26:25 +02:00
Gael Guennebaud	ca7f061a5f	bug #828 : clarify documentation of SparseMatrixBase's methods returning a sub-matrix.	2016-09-16 11:23:19 +02:00
Gael Guennebaud	50e203c717	bug #828 : clarify documentation of SparseMatrixBase's unary methods.	2016-09-16 10:40:50 +02:00
Gael Guennebaud	fa9049a544	Let's be consistent and consider any denormal number as zero.	2016-09-15 11:24:03 +02:00
Gael Guennebaud	b33144e4df	merge	2016-09-15 11:22:16 +02:00
Benoit Steiner	c0d56a543e	Added several missing EIGEN_DEVICE_FUNC qualifiers	2016-09-14 14:06:21 -07:00
Benoit Steiner	779faaaeba	Fixed compilation warnings generated by nvcc 6.5 (and below) when compiling the EIGEN_THROW macro	2016-09-14 09:56:11 -07:00
Gael Guennebaud	1c8347e554	Fix product for custom complex type. (conjugation was ignored)	2016-09-14 18:28:49 +02:00
Benoit Steiner	ff47717f25	Suppress warning 2527 and 2529, which correspond to the "calling a __host__ function from a __host__ __device__ function is not allowed" message in nvcc 6.5.	2016-09-13 12:49:40 -07:00
Benoit Steiner	309190cf02	Suppress message 1222 when compiling with nvcc: this ensures that we don't get warnings about unknown warning messages when compiling with older versions of nvcc	2016-09-13 12:42:13 -07:00
Gael Guennebaud	c10620b2b0	Fix typo in doc.	2016-09-13 09:25:07 +02:00
Gael Guennebaud	73c8f2f697	bug #1285 : fix regression introduced in changeset `00c29c2cae`	2016-09-13 07:58:39 +02:00
Benoit Steiner	5f50f12d2c	Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem.	2016-09-12 13:46:13 -07:00
Gael Guennebaud	228ae29591	Fix compilation on 32 bits systems.	2016-09-09 22:34:38 +02:00
Gael Guennebaud	471eac5399	bug #1195 : move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)	2016-09-08 08:36:27 +02:00
Gael Guennebaud	d780983f59	Doc: explain minimal requirements on nullary functors	2016-09-06 23:14:52 +02:00
Gael Guennebaud	85fb517eaf	Generalize ScalarBinaryOpTraits to any complex-real combination as defined by NumTraits (instead of supporting std::complex only).	2016-09-06 17:23:15 +02:00
Gael Guennebaud	447f269561	Disable previous workaround.	2016-09-06 15:49:02 +02:00
Gael Guennebaud	b046a3f87d	Workaround MSVC instantiation failure of has_*ary_operator at the level of traits<Ref>::match so that the has_*ary_operator are really properly instantiated throughout the compilation unit.	2016-09-06 15:47:04 +02:00
Gael Guennebaud	3cb914f332	bug #1266 : remove CUDA guards on MatrixBase::<decomposition> definitions. (those used to break old nvcc versions that we probably don't care about anymore)	2016-09-06 09:55:50 +02:00
Gael Guennebaud	19a95b3309	Fix shadowing wrt Eigen::Index	2016-09-05 17:19:47 +02:00
Gael Guennebaud	e13071dd13	Workaround a weird msvc 2012 compilation error.	2016-09-05 15:50:41 +02:00
Gael Guennebaud	d123717e21	Fix for msvc 2012 and older	2016-09-05 15:26:56 +02:00
Benoit Steiner	373c340b71	Fixed a typo	2016-09-02 15:41:17 -07:00
Benoit Steiner	5a6be66cef	Turned the Index type used by the nullary wrapper into a template parameter.	2016-09-02 14:10:29 -07:00
Gael Guennebaud	d6c8366d84	Fix compilation with MSVC 2012	2016-09-02 15:23:32 +02:00
Gael Guennebaud	ef54723dbe	One more msvc fix iteration, the previous one was over-simplified for visual	2016-09-01 15:04:53 +02:00
Gael Guennebaud	f9f32e9e2d	Fix compilation with nvcc	2016-09-01 13:06:14 +02:00
Gael Guennebaud	3d946e42b3	Fix compilation with visual studio	2016-09-01 12:59:32 +02:00
Gael Guennebaud	836fa25a82	Make sure sizeof is truly needed, thus improving SFINAE portability.	2016-08-31 23:40:18 +02:00
Gael Guennebaud	84cf6e42ca	minor tweaks in has_* helpers	2016-08-31 23:04:14 +02:00
Gael Guennebaud	218c37beb4	bug #1286 : automatically detect the available prototypes of functors passed to CwiseNullaryExpr such that functors have only to implement the operators that matter among: operator()(), operator()(i), operator()(i,j). Linear access is also automatically detected based on the availability of operator()(i,j).	2016-08-31 15:45:25 +02:00
Gael Guennebaud	3456247437	bug #1283 : quick fix for products involving uncommon general block access to vectors.	2016-08-31 08:17:15 +02:00
Gael Guennebaud	8c48d42530	Fix 4x4 inverse with non-linear destination	2016-08-30 23:16:38 +02:00
Gael Guennebaud	e7fbbc2748	Doc: add links and discourage users from writing their own expression (better use CwiseNullaryOp)	2016-08-30 15:57:46 +02:00
Gael Guennebaud	9c9e23858e	Doc: split customizing-eigen page into sub-pages and re-structure a bit the different topics	2016-08-30 11:10:08 +02:00
Gael Guennebaud	cffe8bbff7	Doc: add link to example	2016-08-30 10:45:27 +02:00
Gael Guennebaud	68e803a26e	Fix warning	2016-08-30 09:21:57 +02:00
Gael Guennebaud	2915e1fc5d	Revert part of changeset `5b3a6f51d3` to keep accuracy of smallest eigenvalues.	2016-08-29 14:14:18 +02:00
Gael Guennebaud	7e029d1d6e	bug #1271 : add SparseMatrix::coeffs() methods returning a 1D view of the non zero coefficients.	2016-08-29 12:06:37 +02:00
Gael Guennebaud	8f4b4ad5fb	use ::hlog if available.	2016-08-29 11:05:32 +02:00
Gael Guennebaud	35a8e94577	bug #1167 : simplify installation of header files using cmake's install(DIRECTORY ...) command.	2016-08-29 10:59:37 +02:00
Gael Guennebaud	0decc31aa8	Add generic implementation of conj_helper for custom complex types.	2016-08-29 09:42:29 +02:00
Gael Guennebaud	fd9caa1bc2	bug #1282 : fix implicit double to float conversion warning	2016-08-28 22:45:56 +02:00
Gael Guennebaud	68d1897e8a	Make sure that our log1p implementation is called as a last resort only.	2016-08-26 15:30:55 +02:00
Gael Guennebaud	fe60856fed	Add overload of numext::log1p for float/double in CUDA	2016-08-26 15:28:59 +02:00
Gael Guennebaud	1329c55875	Fix compilation with boost::multiprec.	2016-08-25 14:54:39 +02:00
Gael Guennebaud	441b7eaab2	Add support for non trivial scalar factor in sparse selfadjoint * dense products, and enable +=/-= assignment for such products. This changeset also improves the performance by working on column of the result at once.	2016-08-24 13:06:34 +02:00
Gael Guennebaud	8132a12625	bug #1268 : detect failures in LDLT and report them through info()	2016-08-23 23:15:55 +02:00
Gael Guennebaud	bde9b456dc	Typo	2016-08-23 21:36:36 +02:00
Gael Guennebaud	ea2e968257	Address several implicit scalar conversions.	2016-08-23 18:44:33 +02:00
Gael Guennebaud	0a6a50d1b0	Cleanup eigenvector extraction: leverage matrix products and compile-time sizes, remove numerous useless temporaries.	2016-08-23 18:14:37 +02:00
Gael Guennebaud	00b2666853	bug #645 : patch from Tobias Wood implementing the extraction of eigenvectors in GeneralizedEigenSolver	2016-08-23 17:37:38 +02:00
Gael Guennebaud	504a4404f1	Optimize expression matching "d?=a-b*c" as "d?=a; d?=b*c;"	2016-08-23 16:52:22 +02:00
Gael Guennebaud	e47a8928ec	Fix compilation in check_for_aliasing due to ambiguous specializations	2016-08-23 16:19:10 +02:00
Gael Guennebaud	ef3de20481	Cleanup cost of tanh	2016-08-23 14:39:55 +02:00
Gael Guennebaud	b3151bca40	Implement pmadd for float and double to make it consistent with the vectorized path when FMA is available.	2016-08-23 14:24:08 +02:00
Gael Guennebaud	a4c266f827	Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.	2016-08-23 14:23:08 +02:00
Gael Guennebaud	82147cefff	Fix possible overflow and bias in integer random generator	2016-08-23 13:25:31 +02:00
Gael Guennebaud	581b6472d1	bug #1265 : remove outdated notes	2016-08-22 23:25:39 +02:00
Igor Babuschkin	59bacfe520	Fix compilation on CUDA 8 by removing call to h2log1p	2016-08-15 23:38:05 +01:00
Christoph Hertzberg	c83b754ee0	bug #1272 : Disable assertion when total number of columns is zero. Also moved assertion to finished() method and adapted unit-test	2016-08-12 15:15:34 +02:00
Igor Babuschkin	aee693ac52	Add log1p support for CUDA and half floats	2016-08-08 20:24:59 +01:00
Benoit Steiner	72096f3bd4	Merged in suiyuan2009/eigen/fix_tanh_inconsistent_for_tensorflow (pull request PR-215) Fix_tanh_inconsistent_for_tensorflow	2016-08-08 09:06:45 -07:00
Christoph Hertzberg	3e4a33d4ba	bug #1272 : Let CommaInitializer work for more border cases (enhances fix of bug #1242 ). The unit test tests all combinations of 2x2 block-sizes from 0 to 3.	2016-08-08 17:26:48 +02:00
Ziming Dong	1031223c09	fix tanh inconsistent	2016-08-06 19:48:50 +08:00
Benoit Steiner	fe778427f2	Fixed the constructors of the new half_base class.	2016-08-04 18:32:26 -07:00
Benoit Steiner	9506343349	Fixed the isnan, isfinite and isinf operations on GPU	2016-08-04 17:25:53 -07:00
Gael Guennebaud	17b9a55d98	Move Eigen::half_impl::half to Eigen::half while preserving the free functions to the Eigen::half_impl namespace together with ADL	2016-08-04 00:00:43 +02:00
Gael Guennebaud	7995cec90c	Fix vectorization logic for coeff-based product for some corner cases.	2016-07-31 15:20:22 +02:00
Benoit Steiner	02fe89f5ef	half implementation has been moved to half_impl namespace	2016-07-29 15:09:34 -07:00
Christoph Hertzberg	c5b893f434	bug #1266 : half implementation has been moved to half_impl namespace	2016-07-29 18:36:08 +02:00
klimpel	ca5effa16c	MSVC-2010 is making problems with SFINAE again. But restricting to the variant for very old compilers (enum, template<typename C> for both function definitions) fixes the problem.	2016-07-28 15:58:17 +01:00
Gael Guennebaud	4057f9b1fc	Enable slice-vectorization+inner-unrolling when unaligned vectorization is allowed. For instance, this allows vectorizing 5x5 matrices (including product)	2016-07-28 13:47:33 +02:00
Gael Guennebaud	a72752caac	Vectorize more small product expressions by letting the general assignment logic decide on the sizes that are OK for vectorization.	2016-07-28 11:21:07 +02:00
Christoph Hertzberg	d3d7c6245d	Add brackets to block matrix and fixed some typos	2016-07-27 09:55:39 +02:00
Gael Guennebaud	f6b3cf8de9	Bump to 3.3-beta2	2016-07-26 23:51:59 +02:00
Gael Guennebaud	95113cb15c	Improve robustness of 2x2 eigenvalue with shifting and scaling	2016-07-26 14:43:54 +02:00
Gael Guennebaud	7f7e84aa36	Fix compilation with MKL support	2016-07-26 13:31:29 +02:00
Gael Guennebaud	c581c8fa79	Fix with expression template scalar types.	2016-07-26 11:33:28 +02:00
Gael Guennebaud	757971e7ea	bug #1258 : fix compilation of Map<SparseMatrix>::coeffRef	2016-07-26 09:40:19 +02:00
Gael Guennebaud	9c663e4ee8	Clean references to MKL in LAPACKe support.	2016-07-25 18:20:08 +02:00
Gael Guennebaud	0c06077efa	Rename MKL files	2016-07-25 18:00:47 +02:00
Gael Guennebaud	4d54e3dd33	bug #173 : remove dependency to MKL for LAPACKe backend.	2016-07-25 17:55:07 +02:00
Gael Guennebaud	34b483e25d	bug #1249 : enable use of __builtin_prefetch for GCC, clang, and ICC only.	2016-07-25 15:17:45 +02:00
Gael Guennebaud	9908020d36	Add minimal support for Array<string>, and fix Tensor<string>	2016-07-25 14:25:56 +02:00
Gael Guennebaud	1b2049fbda	Enforce scalar types in calls to max/min (helps with expression template scalar types)	2016-07-25 12:35:10 +02:00
Gael Guennebaud	b118bc76eb	Add digits10 overload for complex.	2016-07-25 12:33:21 +02:00
Gael Guennebaud	c96af5381f	Remove custom complex division function cdiv.	2016-07-25 12:31:58 +02:00
Gael Guennebaud	e1c7c5968a	Update doc.	2016-07-25 11:18:04 +02:00
Gael Guennebaud	8fffc81606	Add NumTraits::digits10() function based on numeric_limits::digits10 and make use of it for printing matrices.	2016-07-25 11:13:01 +02:00
Gael Guennebaud	1b0353c659	Fix misuse of dummy_precision in eigenvalues solvers	2016-07-23 17:52:31 +02:00
Gael Guennebaud	72744d93ef	Allows the compiler to inline outer products (the change from default to dont-inline in changeset `737bed19c1` was not motivated)	2016-07-22 17:02:28 +02:00
Gael Guennebaud	395c835f4b	Fix CUDA compilation	2016-07-22 15:30:24 +02:00
Gael Guennebaud	47afc9a365	More cleaning in half: - put its definition and functions in its own half_impl namespace such that the free functions do not pollute the Eigen namespace while still making them visible for half through ADL. - expose Eigen::half through a using statement - move operator<< from std to half_float namespace	2016-07-22 14:33:28 +02:00
Gael Guennebaud	0f350a8b7e	Fix CUDA compilation	2016-07-21 18:47:07 +02:00
Gael Guennebaud	bf91a44f4a	Use ADL and log10 for printing matrices.	2016-07-21 15:48:24 +02:00
Gael Guennebaud	87fbda812f	Add missing log10 and random generator for half.	2016-07-21 15:46:45 +02:00
Gael Guennebaud	01d12d3e82	Some cleanup in Half: standard functions should be defined in the namespace of the class half to make ADL work, and thus the global is* functions can be removed.	2016-07-21 15:10:48 +02:00
Gael Guennebaud	7722913475	Fix ambiguous specialization with custom scalar type	2016-07-20 15:13:44 +02:00
Gael Guennebaud	fd057f86b3	Complete the coeff-wise math function table.	2016-07-20 12:14:10 +02:00
Gael Guennebaud	9e8476ef22	Add missing Eigen::rsqrt global function	2016-07-20 11:59:49 +02:00
Gael Guennebaud	4b4c296d6e	Simplify ScalarBinaryOpTraits by removing the Defined enum, and extend its documentation.	2016-07-20 09:56:39 +02:00
Gael Guennebaud	e3bf874c83	Workaround MSVC 2010 compilation issue.	2016-07-18 15:17:25 +02:00
Gael Guennebaud	0f89c6d6b5	Add a summary of possible values for EIGEN_COMP_MSVC	2016-07-18 15:16:13 +02:00
Gael Guennebaud	18884f17d7	Remove static constant declaration: this enforces compiler to generate costly code for thread safety.	2016-07-18 15:05:17 +02:00
Gael Guennebaud	79574e384e	Make scalar_product_op the default (instead of void)	2016-07-18 12:03:05 +02:00
Gael Guennebaud	6a3c451c1c	Permits call to explicit ctor.	2016-07-18 12:02:20 +02:00
Gael Guennebaud	0c3fe4aca5	merge	2016-07-18 10:44:15 +02:00
Gael Guennebaud	db9b154193	Add missing non-const reverse method in VectorwiseOp.	2016-07-16 15:19:28 +02:00
Gael Guennebaud	461cd819c2	Workaround VS2015 bug	2016-07-13 18:46:01 +02:00
Gael Guennebaud	5ea0864c81	Fix regression in a previous commit: some diagonal entry might not be treated by the 2x2 real preconditioner.	2016-07-13 18:37:54 +02:00
Gael Guennebaud	b4343aa67e	Avoid division by very small entries when extracting singularvalues, and explicitly handle the 1x1 complex case.	2016-07-12 17:22:03 +02:00
Gael Guennebaud	e2aa58b631	Consider denormals as zero in makeJacobi and 2x2 SVD. This also fixes serious issues with x387 for which values can be much smaller than the smallest denormal!	2016-07-12 17:21:03 +02:00
klimpel	8b3fc31b55	compile fix (SFINAE variant apparently didn't work for all compilers) for the following compiler/platform: gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46) Copyright (C) 2006 Free Software Foundation, Inc.	2016-07-11 17:42:22 +02:00
Gael Guennebaud	a96a7ce3f7	Move CUDA's special functions to SpecialFunctions module.	2016-07-11 18:39:11 +02:00
Gael Guennebaud	fd60966310	merge	2016-07-11 18:11:47 +02:00
Gael Guennebaud	3e348fdcf9	Workaround MSVC bug	2016-07-11 15:24:52 +02:00
Konstantinos Margaritis	ef05463fcf	Merged kmargar/eigen/tip into default, Altivec/VSX port should be working ok now.	2016-07-10 16:11:46 +03:00
Konstantinos Margaritis	9f7caa7e7d	minor fixes for big endian altivec/vsx	2016-07-10 07:05:10 -03:00
Christoph Hertzberg	3c795c6923	bug #1119 : Adjust call to ?gssvx for SuperLU 5 Also improved corresponding cmake module to detect versions 5.x Based on patch by Christoph Grüninger.	2016-07-10 02:29:57 +02:00
Gael Guennebaud	2f7e2614e7	bug #1232 : refactor special functions as a new SpecialFunctions module, currently in unsupported/.	2016-07-08 11:13:55 +02:00
Gael Guennebaud	66917299a9	Add debug output	2016-07-06 22:27:15 +02:00
Gael Guennebaud	c3b23d7dbf	Fix support of Intel's VML	2016-07-06 14:07:32 +02:00
Gael Guennebaud	8ec4d6480d	Fix compilation with recent updates of icc 2016	2016-07-06 14:07:14 +02:00
Gael Guennebaud	5b3a6f51d3	Improve numerical robustness of RealSchur: add scaling and compare sub-diag entries to largest diagonal entry instead of the 2 neighbors.	2016-07-06 13:45:30 +02:00
Gael Guennebaud	367ef66af3	Re-enable some specializations for Assignment<.,Product<>>	2016-07-05 22:58:14 +02:00
Gael Guennebaud	155d8d8603	Fix compilation with msvc	2016-07-05 14:43:42 +02:00
Gael Guennebaud	b39fd8217f	Fix nesting of SolveWithGuess, and add unit test.	2016-07-04 17:47:47 +02:00
Gael Guennebaud	ec02af1047	Fix template resolution.	2016-07-04 17:37:33 +02:00
Gael Guennebaud	fbcfc2f862	Add unit test for solveWithGuess, and fix template resolution.	2016-07-04 17:19:38 +02:00
Gael Guennebaud	7f7839c12f	Add documentation and examples for inplace decomposition.	2016-07-04 17:18:26 +02:00
Gael Guennebaud	32a41ee659	bug #707 : add inplace decomposition through Ref<> for Cholesky, LU and QR decompositions.	2016-07-04 15:13:35 +02:00
Gael Guennebaud	91b3039013	Change the semantic of the last template parameter of Assignment from "Scalar" to "SFINAE" only. The previous "Scalar" semantic was obsolete since we allow for different scalar types in the source and destination expressions. One can still specialize on scalar types through SFINAE and/or assignment functor.	2016-07-04 11:02:00 +02:00
Gael Guennebaud	0fa9e4a15c	Fix performance regression in dgemm introduced by changeset `5d51a7f12c`	2016-07-02 17:35:08 +02:00
Gael Guennebaud	672076db5d	Fix performance regression introduced in changeset `e56aabf205` . Register blocking sizes are better handled by the cache size heuristics. The current code introduced very small blocks, for instance for 9x9 matrix, thus killing performance.	2016-07-02 15:40:56 +02:00
Justin Carpentier	6126886a67	Use complete nested namespace Eigen::internal	2016-06-28 20:09:25 +02:00
Benoit Jacob	328c5d876a	Undo changes in AltiVec --- I don't have any way to test there.	2016-06-28 11:15:25 -04:00
Benoit Jacob	38fb606052	Avoid global variables with static constructors in NEON/Complex.h	2016-06-28 11:12:49 -04:00
Gael Guennebaud	d937a420a2	Fix compilation with MSVC by using our portable numext::log1p implementation.	2016-08-22 15:44:21 +02:00
Gael Guennebaud	2d5731e40a	bug #1270 : bypass custom asm for pmadd and recent clang version	2016-08-22 15:38:03 +02:00
Gael Guennebaud	49b005181a	Define EIGEN_COMP_CLANG to clang version as major*100+minor (e.g., 307 corresponds to clang 3.7)	2016-08-22 15:37:05 +02:00
Gael Guennebaud	130f891bb0	bug #1278 : ease parsing	2016-08-22 15:00:29 +02:00
Gael Guennebaud	d476cadbb8	bug #1247 : fix regression in compilation of pow(integer,integer), and add respective unit tests.	2016-06-25 10:12:06 +02:00
Gael Guennebaud	c50c73cae2	Fix missing specialization.	2016-06-24 23:10:39 +02:00
Gael Guennebaud	cd577a275c	Relax promote_scalar_arg logic to enable promotion to Expr::Scalar if conversion to Expr::Literal fails. This is useful to cancel expression template at the scalar level, e.g. with AutoDiff<AutoDiff<>>. This patch also defers calls to NumTraits in cases for which types are not directly compatible.	2016-06-24 11:28:54 +02:00
Gael Guennebaud	deb45ad4bc	bug #1245 : fix compilation with msvc	2016-06-24 09:52:25 +02:00
Gael Guennebaud	55fc04e8b5	Fix operator priority	2016-06-23 15:36:42 +02:00
Gael Guennebaud	bf2d5edecc	Fix warning.	2016-06-23 15:35:17 +02:00
Gael Guennebaud	7c6561485a	merge PR 194	2016-06-23 15:29:57 +02:00
Konstantinos Margaritis	be107e387b	fix compilation with clang 3.9, fix performance with pset1, use vector operators instead of intrinsics in some cases	2016-06-23 10:19:05 -03:00
Gael Guennebaud	76faf4a965	Introduce a NumTraits<T>::Literal type to be used for literals, and improve mixing type support in operations between arrays and scalars: - 2 * ArrayXcf is now optimized in the sense that the integer 2 is properly promoted to a float instead of a complex<float> (fix a regression) - 2.1 * ArrayXi is now forbidden (previously, 2.1 was converted to 2) - This mechanism should be applicable to any custom scalar type, assuming NumTraits<T>::Literal is properly defined (it defaults to T)	2016-06-23 14:27:20 +02:00
Gael Guennebaud	a3f7edf7e7	Bug 1242: fix comma init with empty matrices.	2016-06-23 10:25:04 +02:00
Konstantinos Margaritis	8c34b5a0e3	mostly cleanups and modernizing code	2016-06-19 16:13:17 -03:00
Konstantinos Margaritis	b410d46482	mostly cleanups and modernizing code	2016-06-19 16:12:52 -03:00
Konstantinos Margaritis	b80379bda0	fixed pexp<Packet2d>, was failing tests	2016-06-19 16:11:58 -03:00
Benoit Steiner	b055590e91	Made log1p_impl usable inside a GPU kernel	2016-06-16 11:37:40 -07:00
Gael Guennebaud	67c12531e5	Fix warnings with gcc	2016-06-15 18:11:33 +02:00
Gael Guennebaud	eb91345d64	Move scalar/expr to ArrayBase and fix documentation	2016-06-15 15:22:03 +02:00
Gael Guennebaud	4794834397	Propagate functor to ScalarBinaryOpTraits	2016-06-15 09:58:49 +02:00
Gael Guennebaud	c55035b9c0	Include the cost of stores in unrolling of triangular expressions.	2016-06-15 09:57:33 +02:00
Gael Guennebaud	4e7c3af874	Cleanup useless helper: internal::product_result_scalar	2016-06-15 00:04:10 +02:00
Gael Guennebaud	101ea26f5e	Include the cost of stores in unrolling (also fix infinite unrolling with expression costing 0 like Constant)	2016-06-15 00:01:16 +02:00
Gael Guennebaud	76236cdea4	merge	2016-06-14 15:33:47 +02:00
Gael Guennebaud	1004c4df99	Cleanup unused functors.	2016-06-14 15:27:28 +02:00
Gael Guennebaud	70dad84b73	Generalize expr/expr and scalar/expr wrt scalar types.	2016-06-14 15:26:37 +02:00
Gael Guennebaud	62134082aa	Update AutoDiffScalar wrt to scalar-multiple.	2016-06-14 15:06:35 +02:00
Gael Guennebaud	396d9cfb6e	Generalize expr.pow(scalar), pow(expr,scalar) and pow(scalar,expr). Internal: scalar_pow_op (unary) is removed, and scalar_binary_pow_op is renamed scalar_pow_op.	2016-06-14 14:10:07 +02:00
Gael Guennebaud	a8c08e8b8e	Implement expr+scalar, scalar+expr, expr-scalar, and scalar-expr as binary expressions, and generalize supported scalar types. The following functors are now deprecated: scalar_add_op, scalar_sub_op, and scalar_rsub_op.	2016-06-14 12:06:10 +02:00
Gael Guennebaud	756ac4a93d	Fix doc.	2016-06-14 12:03:39 +02:00
Gael Guennebaud	bcc0f38f98	Add unittesting plugins to scalar_product_op and scalar_quotient_op to help checking that types are properly propagated.	2016-06-14 11:31:27 +02:00
Gael Guennebaud	f57fd78e30	Generalize coeff-wise sparse products to support different scalar types	2016-06-14 11:29:54 +02:00
Gael Guennebaud	f5b1c73945	Set cost of constant expression to 0 (the cost should be amortized through the expression)	2016-06-14 11:29:06 +02:00
Gael Guennebaud	deb8306e60	Move MatrixBase::operator*(UniformScaling) as a free function in Scaling.h, and fix return type.	2016-06-14 11:28:03 +02:00
Gael Guennebaud	64fcfd314f	Implement scalar multiples and division by a scalar as a binary-expression with a constant expression. This slightly complexifies the type of the expressions and implies that we now have to distinguish between scalar*expr and expr*scalar to catch scalar-multiple expression (e.g., see BlasUtil.h), but this brings several advantages: - it makes it clear on which side the scalar is applied, - it clearly reflects that we are dealing with a binary-expression, - the complexity of the type is hidden through macros defined at the end of Macros.h, - distinguishing between "scalar op expr" and "expr op scalar" is important to support non commutative fields (like quaternions) - "scalar op expr" is now fully equivalent to "ConstantExpr(scalar) op expr" - scalar_multiple_op, scalar_quotient1_op and scalar_quotient2_op are not used anymore in officially supported modules (still used in Tensor)	2016-06-14 11:26:57 +02:00
Gael Guennebaud	3c12e24164	Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them.	2016-06-13 16:18:59 +02:00
Gael Guennebaud	7a9ef7bbb4	Add default template parameters for the second scalar type of binary functors. This enhances backward compatibility.	2016-06-13 16:17:23 +02:00
Gael Guennebaud	4c61f00838	Add missing explicit scalar conversion	2016-06-12 22:42:13 +02:00
Gael Guennebaud	83904a21c1	Make sure T(i+1,i)==0 when diagonalizing T(i:i+1,i:i+1)	2016-06-11 14:41:36 +02:00
Gael Guennebaud	fabae6c9a1	Cleanup	2016-06-10 15:58:33 +02:00

5233 Commits