eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	8313fb7df7	Add row/column-wise reverseInPlace feature.	2015-03-31 21:35:53 +02:00
Gael Guennebaud	dfb674a25e	Make reverseInPlace really work in-place.	2015-03-31 20:17:10 +02:00
Gael Guennebaud	20d030f207	Fix vectorization of swap for non trivial expressions	2015-03-31 20:16:02 +02:00
Benoit Jacob	73cdeae1d3	Only use blocking sizes LUTs for single-thread products for now	2015-03-31 11:17:23 -04:00
Benoit Jacob	0cbd5ae3cb	Correctly detect Android with ndk_build	2015-03-31 11:17:21 -04:00
Gael Guennebaud	ae01c05e18	Fix computeProductBlockingSizes with m==0, and add respective unit test.	2015-03-31 15:19:57 +02:00
Gael Guennebaud	bd76d837e6	Fix sign of SuperLU::determinant	2015-03-31 14:57:32 +02:00
Gael Guennebaud	35d3053d55	Fix regression introduced in `3b169d792d`	2015-03-31 09:23:53 +02:00
Christoph Hertzberg	3b169d792d	Suppress unused variable warning	2015-03-31 00:49:08 +02:00
Christoph Hertzberg	1efae98fee	bug #985 : RealQZ failed when either matrix had zero rows or columns (report and patch by Ben Goodrich) Also added a regression test	2015-03-30 23:56:20 +02:00
Benoit Steiner	35722fa022	Made the index type a template parameter of the tensor class instead of encoding it in the options.	2015-03-30 14:55:54 -07:00
Christoph Hertzberg	58af8bf90c	bug #982 : Make sure numext::maxi and numext::mini are called correctly, in case Scalar expressions return expression templates.	2015-03-30 16:47:22 +02:00
Gael Guennebaud	2adbf6b8ca	fix stupid warning with old GCC	2015-03-28 22:34:54 +01:00
Gael Guennebaud	41e20248f8	merge	2015-03-28 14:43:35 +01:00
Christoph Hertzberg	09a5361d1b	bug #983 : Pass Vector3 by const reference and not by value	2015-03-28 12:36:24 +01:00
Gael Guennebaud	eb7e4c2b9c	Pass Vector3 type by reference	2015-03-27 12:11:24 +01:00
Gael Guennebaud	79cb875249	merge	2015-03-27 10:56:04 +01:00
Gael Guennebaud	1b8cc9af43	Slight numerical stability improvement in 2x2 svd	2015-03-27 10:55:00 +01:00
Gael Guennebaud	3d59ae0203	Fix hypot(0,0).	2015-03-27 09:59:24 +01:00
Benoit Steiner	d3f7915aeb	Pulled latest update from the eigen main codebase	2015-03-24 13:12:14 -07:00
Benoit Steiner	abdbe8562e	Fixed the CUDA packet primitives	2015-03-24 10:45:46 -07:00
Gael Guennebaud	29eaa2b0f1	Make MatrixBase::is* methods aware of nested_eval.	2015-03-24 13:42:42 +01:00
Gael Guennebaud	d27968eb7e	D&C SVD: directly falls back to JacobiSVD for very small problems (by-pass upper-bidiagonalization)	2015-03-24 13:38:07 +01:00
Gael Guennebaud	4472f3e578	Avoid SVD: consider denormalized small numbers as zero when computing the rank of the matrix	2015-03-23 09:40:21 +01:00
Deanna Hood	83e5b7656b	Use M_PI instead of acos(-1) for pi	2015-03-22 06:04:31 +10:00
Deanna Hood	4bab4790c0	Add \sa tags of isFinite/isInf for each other	2015-03-22 05:39:08 +10:00
Gael Guennebaud	4e2b18d909	Update approx. minimum ordering method to push and keep structural empty diagonal elements to the bottom-right part of the matrix	2015-03-20 16:33:48 +01:00
Gael Guennebaud	d6b2f300db	Fix MSVC compilation: aligned type must be passed by reference	2015-03-19 17:28:32 +01:00
Gael Guennebaud	61c45d7cfd	Fix comparison warning	2015-03-19 17:13:22 +01:00
Gael Guennebaud	f329d0908a	Improve random number generation for integer and add unit test	2015-03-19 15:10:36 +01:00
Benoit Jacob	dc04f12967	use unsigned short instead of uint16_t which doesn't exist in c++98	2015-03-17 10:31:45 -04:00
Deanna Hood	596be3cd86	Use std::arg for real numbers when c++11 is used, custom implementation otherwise	2015-03-17 15:28:12 +10:00
Deanna Hood	e26134ed62	Use std::round when c++11 is used, custom implementation otherwise	2015-03-17 14:55:14 +10:00
Deanna Hood	e21e29a088	Update cost of arg call to depend on if the scalar is complex or not	2015-03-17 14:04:33 +10:00
Deanna Hood	447a5a6b01	Fix VML declarations to only be for real/complex as appropriate	2015-03-17 13:33:31 +10:00
Deanna Hood	f52b78491c	Remove packet isNaN, isInf, isFinite	2015-03-17 09:26:24 +10:00
Deanna Hood	1c78d6f2a6	Add boolean not operator (!) array support	2015-03-17 08:29:57 +10:00
Benoit Jacob	364cfd529d	Similar to cset `3589a9c115` , also in 2px4 kernel: actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment	2015-03-16 16:28:44 -04:00
Deanna Hood	e1d6e6c972	Make cube, inverse and abs2 free-functions	2015-03-17 06:25:24 +10:00
Benoit Jacob	577056aa94	Include stdint.h. Not going for cstdint because it is a C++11 addition. Needed for uint16_t at least, in lookup-table code.	2015-03-16 16:21:50 -04:00
Benoit Jacob	eb6929cb19	fix bug in maxsize calculation, which would cause products of size > 2048 to address the lookup table out of bounds	2015-03-16 16:15:47 -04:00
Deanna Hood	fef4e071d7	Rename isinf to isInf	2015-03-17 05:58:47 +10:00
Deanna Hood	46cf9cda32	Add isfinite array support as isFinite	2015-03-17 04:33:12 +10:00
Deanna Hood	1d76ceab55	Remove floor, ceil, round for complex numbers	2015-03-17 02:36:07 +10:00
Deanna Hood	717b7954ce	Update cost of coeff-wise arg call	2015-03-17 02:11:57 +10:00
Deanna Hood	fb68b149cb	Rename isnan to isNaN	2015-03-17 02:04:42 +10:00
Benoit Jacob	35c3a8bb84	Update Nexus 5 lookup table from combining now 2 runs of the benchmark, using the analyze-blocking-sizes partition tool. Gives better worst-case performance.	2015-03-16 11:05:51 -04:00
Benoit Jacob	e274607d7f	fix compilation with GCC 4.8	2015-03-16 10:48:27 -04:00
Benoit Jacob	151b8b95c6	Fix bug in case where EIGEN_TEST_SPECIFIC_BLOCKING_SIZE is defined but false	2015-03-15 19:10:51 -04:00
Benoit Jacob	02babb9c0f	Provide an empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now.	2015-03-15 18:13:12 -04:00
Benoit Jacob	3589a9c115	actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment	2015-03-15 18:12:18 -04:00
Benoit Jacob	1dd3d89818	Fix an unused-var warning	2015-03-15 18:07:19 -04:00
Benoit Jacob	e56aabf205	Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables	2015-03-15 18:05:12 -04:00
Benoit Jacob	488c15615a	organize a little our default cache sizes, and use a saner default L1 outside of x86 (10% faster on Nexus 5)	2015-03-13 14:51:26 -07:00
Gael Guennebaud	1330f8bbd1	bug #973 , improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled.	2015-03-13 21:15:50 +01:00
Gael Guennebaud	d99ab35f9e	Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get an unbiased behavior.	2015-03-13 21:12:46 +01:00
Gael Guennebaud	8580eb6808	bug #949 : add static assertion for incompatible scalar types in dense end-user decompositions.	2015-03-13 21:06:20 +01:00
Gael Guennebaud	a9df28c95b	SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrary large amount of memory was preallocated in some cases)	2015-03-13 21:00:21 +01:00
Gael Guennebaud	5ffe29cb9f	Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible.	2015-03-13 20:57:33 +01:00
Gael Guennebaud	2f6f8bf31c	Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests.	2015-03-13 16:24:40 +01:00
Doug Kwan	657407227e	Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of doubles instead of swapping the doubles.	2015-03-11 15:13:37 -07:00
Deanna Hood	f89fcefa79	Add hyperbolic trigonometric functions from std array support	2015-03-11 13:13:30 +10:00
Deanna Hood	a5e49976f5	Add log10 array support	2015-03-11 08:56:42 +10:00
Deanna Hood	19a71056ae	Allow calling of square(array) in addition to array.square()	2015-03-11 06:59:28 +10:00
Deanna Hood	31fdd67756	Additional unary coeff-wise functors (isnan, round, arg, e.g.)	2015-03-11 06:39:23 +10:00
Gael Guennebaud	fd78874888	Fix compilation of iterative solvers with dense matrices	2015-03-09 21:31:03 +01:00
Gael Guennebaud	d4317a85e8	Add typedefs for return types of SparseMatrixBase::selfadjointView	2015-03-09 21:29:46 +01:00
Gael Guennebaud	9e885fb766	Add unit tests for CG and sparse-LLT for long int as storage-index	2015-03-09 14:33:15 +01:00
Gael Guennebaud	224a1fe4c6	bug #963 : make IncompleteLUT compatible with non-default storage index types.	2015-03-09 13:55:20 +01:00
Gael Guennebaud	0ee391863e	Avoid underflow when blocking sizes are tuned manually.	2015-03-06 21:51:09 +01:00
Gael Guennebaud	14a5f135a3	bug #969 : workaround ambiguous calls to Ref using enable_if.	2015-03-06 17:51:31 +01:00
Gael Guennebaud	87681e508f	bug #978 : early return for vanishing products	2015-03-06 16:11:22 +01:00
Gael Guennebaud	cd3bbffa73	Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1)	2015-03-06 14:31:39 +01:00
Gael Guennebaud	58740ce4c6	Improve product kernel: replace the previous dynamic loop swapping strategy by a more general one: It consists in increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited.	2015-03-06 10:30:35 +01:00
Gael Guennebaud	4c8b95d5c5	Rename LSCG to LeastSquaresConjugateGradient	2015-03-05 10:16:32 +01:00
Gael Guennebaud	7550107028	Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large"	2015-03-05 10:03:46 +01:00
Gael Guennebaud	2dc968e453	bug #824 : improve accuracy of Quaternion::angularDistance using atan2 instead of acos.	2015-03-04 17:03:13 +01:00
Benoit Steiner	0196141938	Fixed the optimized AVX implementation of the fast rsqrt function	2015-03-02 13:49:39 -08:00
Benoit Steiner	4fd7f47692	Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined.	2015-03-02 09:38:47 -08:00
Benoit Steiner	fb53384b0f	Improved the default implementation of prsqrt	2015-02-28 01:51:26 -08:00
Benoit Steiner	306fceccbe	Pulled latest updates from trunk	2015-02-27 13:05:26 -08:00
Benoit Steiner	2386fc8528	Added support for 32bit index on a per tensor/tensor expression. This enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary.	2015-02-27 12:57:13 -08:00
Benoit Jacob	6466fa63be	Reimplement the selection between rotating and non-rotating kernels using templates instead of macros and if()'s. That was needed to fix the build of unit tests on ARM, which I had broken. My bad for not testing earlier.	2015-02-27 15:30:10 -05:00
Benoit Steiner	05089aba75	Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts	2015-02-27 09:27:30 -08:00
Benoit Steiner	573b377110	Added support for vectorized type casting of tensors	2015-02-27 08:46:04 -08:00
Benoit Jacob	2fc3b484d7	remove trailing comma	2015-02-27 11:37:45 -05:00
Benoit Jacob	33669348c4	Disable Packet2f/2i halfpacket support in NEON. I believe that it was erroneously turned on, since Packet2f/2i intrinsics are unimplemented, and code trying to use halfpackets just fails to compile on NEON, as it tries to use the default implementation of pload/pstore and the types don't match.	2015-02-27 11:35:37 -05:00
Benoit Jacob	b7fc8746e0	Replace a static assert by a runtime one, fixes the build of unit tests on ARM Also safely assert in the non-implemented path that should never be taken in practice, and would return wrong results.	2015-02-27 10:01:59 -05:00
Benoit Steiner	f41b1f1666	Added support for fast reciprocal square root computation.	2015-02-26 09:42:41 -08:00
Gael Guennebaud	bcf9bb5c1f	Avoid packing rhs multiple-times when blocking on the lhs only.	2015-02-26 17:01:33 +01:00
Gael Guennebaud	4ec3f04b3a	Make sure that the block size computation is tested by our unit test.	2015-02-26 17:00:36 +01:00
Gael Guennebaud	a8ad8887bf	Implement a more generic blocking-size selection algorithm. See the inline explanations. It performs extremely well on Haswell. The main issue is to reliably and quickly find the actual cache size to be used for our 2nd level of blocking, that is: max(l2,l3/nb_core_sharing_l3)	2015-02-26 16:04:35 +01:00
Gael Guennebaud	400becc591	Fix typos in block-size testing code, and set peeling on k to 8.	2015-02-26 15:57:06 +01:00
Benoit Jacob	692136350b	So I extensively measured the impact of the offset in this prefetch. I tried offset values from 0 to 128 (on this float* pointer, so implicitly times 4 bytes). On x86, I tested a Sandy Bridge with AVX with 12M cache and a Haswell with AVX+FMA with 6M cache on MatrixXf sizes up to 2400. I could not see any significant impact of this offset. On Nexus 5, the offset has a slight effect: values around 32 (times sizeof float) are worst. Anything else is the same: the current 64 (8*pk), or... 0. So let's just go with 0! Note that we needed a fix anyway for not accounting for the value of RhsProgress. 0 nicely avoids the issue altogether!	2015-02-25 12:37:14 -05:00
Christoph Hertzberg	531fa9de77	bug #970 : Add EIGEN_DEVICE_FUNC to RValue functions, in case Cuda supports RValue-references.	2015-02-24 21:03:28 +01:00
Benoit Jacob	26275b250a	Fix my recent prefetch changes: - the first prefetch is actually harmful on Haswell with FMA, but it is the most beneficial on ARM. - the second prefetch... I was very stupid and multiplied by sizeof(scalar) and offset of a scalar* pointer. The old offset was 64; pk = 8, so 64=pk*8. So this effectively restores the older offset. Actually, there were two prefetches here, one with offset 48 and one with offset 64. I could not confirm any benefit from this strange 48 offset on either the haswell or my ARM device.	2015-02-23 16:55:17 -05:00
Christoph Hertzberg	052b6b40f1	Fix two trivial warnings	2015-02-22 12:40:51 +01:00
Christoph Hertzberg	ecbf2a6656	log1p is defined only for real Scalars in C++11	2015-02-21 19:58:24 +01:00
Gael Guennebaud	3cf642baa3	Fix compilation of unit tests disabling assertion checking	2015-02-21 14:13:48 +01:00
Gael Guennebaud	2da1594750	Fix doc of Ref<>	2015-02-20 11:52:22 +01:00
Gael Guennebaud	b192e29eae	In C++11 destructors do not throw by default (fix CommaInitializer unit test)	2015-02-20 09:28:34 +01:00
Benoit Steiner	ab41652d81	Pulled latest changes from trunk	2015-02-19 21:23:37 -08:00
Benoit Steiner	7765039f1c	Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up being executed on the GPU device.	2015-02-19 21:22:51 -08:00
Gael Guennebaud	a66f5fc2fd	Fix regression with C++11 support of lambda: now internal::result_of falls back to std::result_of in C++11.	2015-02-19 23:32:12 +01:00
Gael Guennebaud	1b7e12847d	Fix some calls to result_of on binary functors as unary ones.	2015-02-19 23:30:41 +01:00
Gael Guennebaud	0f4dd15dfc	Declare const some const variables	2015-02-19 23:28:57 +01:00
Gael Guennebaud	829dddd0fd	Add support for C++11 result_of/lambdas	2015-02-19 15:18:37 +01:00
Benoit Jacob	db05f2d01e	rotating kernel: avoid compiling anything outside of ARM	2015-02-18 15:43:52 -05:00
Benoit Jacob	0ed00d5438	remove a newly introduced redundant typedef - sorry.	2015-02-18 15:05:01 -05:00
Benoit Jacob	9bd8a4bab5	bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path This is substantially faster on ARM, where it's important to minimize the number of loads. This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome. Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).	2015-02-18 15:03:35 -05:00
Hauke Heibel	ee27d50633	Fixed template parameter.	2015-02-18 18:51:08 +01:00
Gael Guennebaud	73a24de424	merge	2015-02-18 15:51:00 +01:00
Gael Guennebaud	63eb0f6fe6	Clean a bit computeProductBlockingSizes (use Index type, remove CEIL macro)	2015-02-18 15:49:05 +01:00
Benoit Jacob	4a3e6c8be1	bug #958 - Allow testing specific blocking sizes This is only a debugging/testing patch. It allows testing specific product blocking sizes, typically to study the impact on performance. Example usage: int testk, testm, testn; #define EIGEN_TEST_SPECIFIC_BLOCKING_SIZES #define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_K testk #define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_M testm #define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_N testn #include <Eigen/Core>	2015-02-18 09:43:55 -05:00
Gael Guennebaud	c7bb1e8ea8	Fix a regression when using OpenMP, and fix bug #714 : the number of threads might be lower than the number of requested ones	2015-02-18 15:19:23 +01:00
Jan Blechta	168ceb271e	Really use zero guess in ConjugateGradients::solve as documented and expected for consistency with other methods.	2015-02-18 14:26:10 +01:00
Gael Guennebaud	8fdcaded5e	merge	2015-03-04 10:18:08 +01:00
Gael Guennebaud	c43154bbc5	Check for no-reallocation in SparseMatrix::insert (bug #974 )	2015-03-04 10:16:46 +01:00
Gael Guennebaud	1ce0178363	Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974 )	2015-03-04 09:39:26 +01:00
Gael Guennebaud	3dca4a1efc	Update manual wrt new LSCG solver.	2015-03-04 09:35:30 +01:00
Gael Guennebaud	05274219a7	Add a CG-based solver for rectangular least-square problems (bug #975 ).	2015-03-04 09:34:27 +01:00
Benoit Jacob	2aa09e6b4e	Fix asm comments in 1px1 kernel	2015-03-03 13:44:00 -05:00
Benoit Jacob	eae8e27b7d	Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp	2015-03-03 11:41:21 -05:00
Marc Glisse	37a93c4263	New scoring functor to select the pivot. This can be useful for non-floating point scalars, where choosing the biggest element is generally not the best choice.	2015-03-03 17:08:28 +01:00
Benoit Jacob	ccc1277a42	must also disable complex<double> when disabling double vectorization	2015-03-03 10:17:05 -05:00
Benoit Jacob	f839099512	Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics.	2015-03-03 09:35:22 -05:00
Benoit Jacob	1ec0f4fadf	HalfPacket also needed to be disabled for double, on ARMv8.	2015-03-02 16:08:54 -05:00
Gael Guennebaud	3109f0e74e	Add SSE vectorization of Quaternion::conjugate. Significant speed-up when combined with products like q1*q2.conjugate()	2015-03-02 20:09:33 +01:00
Gael Guennebaud	9aee1e300a	Increase unit-test L1 cache size to ensure we are doing at least 2 peeled loop within product kernel.	2015-02-27 22:55:12 +01:00
Gael Guennebaud	b10cd3afd2	Re-enable detection of min/max parentheses protection, and re-enable mpreal_support unit test.	2015-02-27 22:38:00 +01:00
Gael Guennebaud	548b781380	Fix bug #945 : workaround MSVC warning	2015-02-18 12:53:49 +01:00
Gael Guennebaud	6f4adc9e94	Add missing install directives for arch/CUDA	2015-02-18 11:40:06 +01:00
Gael Guennebaud	63464754ef	Add an internal assertion in makeCompressed to catch a possible risk of null-pointer access.	2015-02-18 11:29:54 +01:00
Gael Guennebaud	eb563049f7	Remove some dead stores.	2015-02-18 11:26:48 +01:00
Gael Guennebaud	dc7e6acc05	Fix possible usage of a null pointer in CholmodSupport	2015-02-18 11:26:25 +01:00
Gael Guennebaud	d4eda01488	Bug 957, workaround MSVC/ICC compilation issue	2015-02-18 11:24:32 +01:00
Gael Guennebaud	20cac72b82	Packet must be passed by const reference and not by value to avoid alignment issue.	2015-02-17 22:58:32 +01:00
Christoph Hertzberg	97a36ecba4	Suppress some remaining Index conversion warnings	2015-02-17 18:52:39 +01:00
Gael Guennebaud	159fb181c2	Disable __m128* wrappers when compiling with AVX and -fabi-version=4	2015-02-17 16:27:20 +01:00
Gael Guennebaud	91ab2489dd	Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same type with default ABI)	2015-02-17 16:08:07 +01:00
Gael Guennebaud	9daf8eba6f	Fix compilation of Cholmod*(matrix) ctor	2015-02-17 15:24:52 +01:00
Gael Guennebaud	3373c903b3	Fix compilation of int*complex with gcc	2015-02-16 19:18:12 +01:00
Gael Guennebaud	f0b1b1df9b	Fix SparseLU::signDeterminant() method, and add a SparseLU::determinant() method.	2015-02-16 19:09:22 +01:00
Gael Guennebaud	8768ff3c31	Add PermutationMatrix::determinant method.	2015-02-16 19:08:25 +01:00
Martin Drozdik	64b29e06b9	bug #956 : Fixed bug in move constructors of DenseStorage which caused "moved-from" objects to be in an invalid state.	2015-02-16 18:18:46 +09:00
Gael Guennebaud	1c0e8bcf09	Fix unused variable warning.	2015-02-16 17:21:30 +01:00
Gael Guennebaud	0f464d9d87	bug #897 : fix regression in BiCGSTAB(mat) ctor (and all other iterative solvers). Add respective regression unit test.	2015-02-16 17:05:10 +01:00
Gael Guennebaud	470d26d580	Remove some useless typedefs	2015-02-16 16:48:21 +01:00
Gael Guennebaud	953d5ccfd5	Doc: explain how to free allocated memory in SparseMatrix	2015-02-16 15:56:11 +01:00
Gael Guennebaud	98604576d1	Merged in chtz/eigen-indexconversion (pull request PR-92) bug #877, bug #572: Get rid of Index conversion warnings, summary of changes: - Introduce a global typedef Eigen::Index making Eigen::DenseIndex and AnyExpr<>::Index deprecated (default is std::ptrdiff_t). - Eigen::Index is used throughout the API to represent indices, offsets, and sizes. - Classes storing an array of indices use the type StorageIndex to store them. This is a template parameter of the class. Default is int. - Methods that explicitly set or return an element of such an array take or return a StorageIndex type. In all other cases, the Index type is used.	2015-02-16 15:29:00 +01:00
Gael Guennebaud	45cbb0bbb1	The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index	2015-02-16 15:05:41 +01:00
Gael Guennebaud	cc641aabb7	Remove deprecated usage of expr::Index.	2015-02-16 14:46:51 +01:00
Gael Guennebaud	aa6c516ec1	Fix many long to int conversion warnings: - fix usage of Index (API) versus StorageIndex (when multiple indexes are stored) - use StorageIndex(val) when the input has already been checked - use internal::convert_index<StorageIndex>(val) when val is potentially unsafe (directly comes from user input)	2015-02-16 13:19:05 +01:00
Christoph Hertzberg	bd511dde9d	bug #952 : Missing \endcode made doxygen fail to build ColPivHouseholderQR	2015-02-15 06:08:25 +01:00
Benoit Steiner	e2cfddf75f	Pulled latest updates from trunk	2015-02-13 16:21:59 -08:00
Benoit Steiner	0927801a84	Optimized version of the sin(), exp(), log() and sqrt() function for AVX	2015-02-13 16:07:08 -08:00
Benoit Jacob	e972b55ec4	bug #953 - Fix prefetches in 3px4 product kernel This gives a 10% speedup on nexus 4 and on nexus 5.	2015-02-13 14:52:36 -05:00
Gael Guennebaud	fc202bab39	Index refactoring: StorageIndex must be used for storage only (and locally when it makes sense). In all other cases use the global Index type.	2015-02-13 18:57:41 +01:00
Gael Guennebaud	fe51319980	Merge Index-refactoring branch with default, fix PastixSupport, remove some useless typedefs	2015-02-13 10:03:53 +01:00
Gael Guennebaud	0918c51e60	merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper	2015-02-12 21:48:41 +01:00
Gael Guennebaud	409547a0c8	update EIGEN_FAST_MATH documentation	2015-02-12 21:04:31 +01:00
Benoit Steiner	f669f5656a	Marked a few functions as EIGEN_DEVICE_FUNC to enable the use of tensors in cuda kernels.	2015-02-10 14:29:47 -08:00
Gael Guennebaud	029d236ceb	merge	2015-02-10 23:12:47 +01:00
Gael Guennebaud	fe25f3b8e3	FMA has been wrongly disabled	2015-02-10 23:11:35 +01:00
Benoit Steiner	cc5d7ff523	Added vectorized implementation of the exponential function for ARM/NEON	2015-02-10 14:02:38 -08:00
Jan Blechta	c3f3580b8f	Fix bug #733 : step by step solving is not a good example for solveWithGuess	2015-02-10 14:24:39 +01:00
Gael Guennebaud	c6e8caf090	Allows Lower|Upper as a template argument of CG and MINRES: in this case the full matrix will be considered.	2015-02-10 18:57:41 +01:00
Gael Guennebaud	87629cd639	bug #897 : makes iterative sparse solvers use a Ref<SparseMatrix> instead of a SparseMatrix pointer. This fixes usage of iterative solvers with a Map<SparseMatrix>.	2015-02-09 11:41:25 +01:00
Gael Guennebaud	d4ec48575e	Make Block<SparseMatrix> inherit SparseCompressedBase in the case of an inner-panels and fix valuePtr() innerIndexPtr()	2015-02-09 11:14:36 +01:00
Gael Guennebaud	3af29caae8	Cleaning and add more unit tests for Ref<SparseMatrix> and Map<SparseMatrix>	2015-02-09 10:23:45 +01:00
Gael Guennebaud	f2ff8c091e	Add a Ref<SparseMatrix> specialization.	2015-02-07 22:04:18 +01:00
Gael Guennebaud	f3be317614	Add a Map<SparseMatrix> specialization.	2015-02-07 22:03:25 +01:00
Gael Guennebaud	08081f8293	Make SparseTranspose inherit SparseCompressBase when possible	2015-02-07 22:02:14 +01:00
Gael Guennebaud	7838fda82c	Add a SparseCompressedBase class providing (un)compressed accessors (like data()/*Stride() for dense matrices), and a CompressedAccessBit flag (similar to DirectAccessBit for dense matrices).	2015-02-07 22:00:46 +01:00
Benoit Steiner	01f7918788	Pulled latest fixes	2015-02-06 05:30:20 -08:00
Gael Guennebaud	b50ffaddf2	merge	2015-02-06 14:27:12 +01:00
Gael Guennebaud	74e460b995	Fix symmetric product	2015-02-06 14:26:24 +01:00
Benoit Steiner	c739102ef9	Pulled the latest changes from the trunk	2015-02-06 05:25:03 -08:00
Benoit Steiner	dcb2a8b184	Added the EIGEN_HAS_CONSTEXPR define Gate the tensor index list code based on the value of EIGEN_HAS_CONSTEXPR	2015-02-06 02:51:59 -08:00
Gael Guennebaud	b1eca55328	Use Ref<> to ensure that both x and b in Ax=b are compatible with Umfpack/SuperLU expectations	2015-02-03 23:46:05 +01:00
Gael Guennebaud	ebdf6a2dbb	SPQR: fix default threshold value	2015-02-03 22:32:34 +01:00
Benoit Jacob	5ef95fabee	bug #936 , patch 3/3: Properly detect FMA support on ARM (requires VFPv4) and use it instead of MLA when available, because it's both more accurate, and faster.	2015-01-30 17:45:03 -05:00
Benoit Jacob	0f21613698	bug #936 , patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD	2015-01-30 17:44:26 -05:00
Benoit Jacob	340b8afb14	bug #936 , patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_, because this is what they are about. "Fused" means "no intermediate rounding between the mul and the add, only one rounding at the end". Instead, what we are concerned about here is whether a temporary register is needed, i.e. whether the MUL and ADD are separate instructions. Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA. But a true fused mul-add is only available on VFPv4: VFMA.	2015-01-31 14:15:57 -05:00
Benoit Jacob	9f99f61e69	bug #936 , patch 1/3: some cleanup and renaming for consistency.	2015-01-30 17:43:56 -05:00
Benoit Jacob	759bd92a85	bug #935 : Add asm comments in GEBP kernels to work around a bug in both GCC and Clang on ARM/NEON, whereby they spill registers, severely harming performance. The reason why the asm comments make a difference is that they prevent the compiler from reordering code across these boundaries, which has the effect of extending the lifetime of local variables and increasing register pressure on this register-tight code.	2015-01-30 17:27:56 -05:00
Gael Guennebaud	f1092d2f73	bug #941 : fix accuracy issue in ColPivHouseholderQR, do not stop decomposition on a small pivot	2015-01-30 19:04:04 +01:00
Gael Guennebaud	9d82f7e30d	Supernodes was disabled.	2015-01-30 17:24:40 +01:00
Gael Guennebaud	a727a2c4ed	bug #933 : RealSchur, do not consider the input matrix norm to check negligible sub-diag entries. This also makes this test consistent with the complex and self-adjoint cases.	2015-01-28 16:07:51 +01:00
Gael Guennebaud	c6eb84aabc	Enable vectorization of transposeInPlace for PacketSize x PacketSize matrices	2015-01-26 17:09:01 +01:00
Gael Guennebaud	e1f1091fde	Add support for dense ?= diagonal	2015-01-24 10:32:49 +01:00
Gael Guennebaud	b9d314ae19	bug #329 : fix typo	2015-01-17 21:55:33 +01:00
Gael Guennebaud	279786e987	Fix missing evaluator in outer-product	2015-01-13 10:25:50 +01:00
Gael Guennebaud	ae4644cc68	bug #907 , ARM64: workaround ICE in xcode/clang	2015-01-13 10:03:00 +01:00
Gael Guennebaud	36f7c1337f	bug #907 , ARM64: workaround vreinterpretq_u64_* not defined in xcode/clang	2015-01-13 09:57:37 +01:00
Gael Guennebaud	63974bcb88	Bug 907: workaround some missing intrinsics in current NDK's gcc version (ARM64)	2015-01-07 09:44:25 +01:00
Gael Guennebaud	79f4a59ed9	bug #907 : fix compilation with ARM64	2015-01-07 09:41:56 +01:00
Benoit Steiner	9f98650d0a	Ensured that contractions that can be reduced to a matrix vector product work correctly even when the input coefficients aren't aligned.	2015-01-06 09:29:13 -08:00
Gael Guennebaud	f5f6e2c6f4	bug #921 : fix utilization of bitwise operation on enums in first_aligned	2014-12-19 14:41:59 +01:00
Gael Guennebaud	25c7d9164f	bug #920 : fix MSVC 2015 compilation issues	2014-12-18 22:58:15 +01:00
Gael Guennebaud	b8d9eaa19b	Use true compile time "if" for Transform::makeAffine	2014-12-13 22:16:39 +01:00
Gael Guennebaud	7dad5f797e	bug #821 : workaround MSVC 2013 issue with using Base::Base::operator=	2014-12-16 13:33:43 +01:00
Christoph Hertzberg	e8cdbedefb	bug #877 , bug #572 : Introduce a global Index typedef. Rename Sparse::Index to StorageIndex, make Dense::StorageIndex an alias to DenseIndex. Overall this commit gets rid of all Index conversion warnings.	2014-12-04 22:48:53 +01:00
Gael Guennebaud	433bce5c3a	UmfPack support: fix redundant evaluation/copies when calling compute() and support generic expressions as input	2014-12-02 17:30:57 +01:00
Gael Guennebaud	775f7e5fbb	bug #697 : make sure empty classes are at the end in case of multiple inheritance	2014-12-02 14:40:19 +01:00
Gael Guennebaud	a819fa148d	Fix MSVC compilation issue	2014-12-02 14:35:31 +01:00
Gael Guennebaud	1a8dc85142	bug #897 : fix UmfPack usage with mapped sparse matrices	2014-12-02 13:57:13 +01:00
Gael Guennebaud	4974d1d2b4	Fix bug #911 : m_extractedDataAreDirty was not initialized in UmfPackLU	2014-12-02 13:54:06 +01:00
Gael Guennebaud	e2f3e4e4aa	Document non-const SparseMatrix::diagonal() method.	2014-12-01 14:45:15 +01:00
Gael Guennebaud	b26e697182	Make SparseMatrix::coeff() return a const reference and add a non const version of SparseMatrix::diagonal()	2014-12-01 14:41:39 +01:00
Gael Guennebaud	b1f9f603a0	Simplify return type of diagonal(Index) (and ease compiler job)	2014-11-28 14:39:47 +01:00
Gael Guennebaud	5384e89147	Disable MatrixBase::bdcSvd with CUDA (just like MatrixBase::jacobiSvd)	2014-11-26 22:29:29 +01:00
Gael Guennebaud	8518ba0bbc	Fix Hyperplane::Through(a,b,c) when points are aligned or identical. We use the same strategy as in Quaternion::setFromTwoVectors.	2014-11-26 15:01:53 +01:00
Gael Guennebaud	0efaff9b3b	Fix out-of-bounds write	2014-12-11 16:15:20 +01:00
Gael Guennebaud	41a20994cc	In simplicial cholesky: avoid deep copy of the input matrix if the latter can be used readily	2014-12-08 17:56:33 +01:00
Gael Guennebaud	a910a7466e	Fix inner iterator type	2014-12-08 17:55:31 +01:00
Gael Guennebaud	4371911861	Remove useless and non standard numext::atanh2 function.	2014-12-08 16:44:34 +01:00
Gael Guennebaud	bea36925db	bug #876 : implement a portable log1p function	2014-12-08 16:26:53 +01:00
Gael Guennebaud	7f7a712062	Optimize Simplicial Cholesky when NaturalOrdering is used.	2014-12-08 15:02:25 +01:00
Gael Guennebaud	30c849669d	Fix dynamic allocation in JacobiSVD (regression)	2014-12-08 14:45:04 +01:00
Gael Guennebaud	80ed5bd90c	Workaround various "returning reference to temporary" warnings.	2014-12-05 12:49:30 +01:00
Gael Guennebaud	da584912b6	Fix memory pre-allocation when permuting inner vectors of a sparse matrix.	2014-11-24 17:31:59 +01:00
Benoit Steiner	509e4ddc02	Added reduction packet primitives for CUDA	2014-11-19 10:34:11 -08:00
Gael Guennebaud	722916e19d	bug #903 : clean swap API regarding extra enable_if parameters, and add failtests for swap	2014-11-06 09:25:26 +01:00
Gael Guennebaud	c6fefe5d8e	Bug 853: replace enable_if in Ref<> ctor by static assertions and add failtests for Ref<>	2014-11-05 16:15:17 +01:00
Gael Guennebaud	ee06f78679	Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.	2014-11-04 21:58:52 +01:00
Benoit Steiner	2dde63499c	Generalized the matrix vector product code.	2014-10-31 16:33:51 -07:00
Christoph Hertzberg	0833b82efd	Run sparse_basic unit tests also for rectangular matrices. TriangularView with UnitDiag does not work properly yet (bug #901)	2014-10-31 17:12:13 +01:00
Benoit Steiner	bc99c5f7db	fixed some potential alignment issues.	2014-10-30 18:09:53 -07:00
Benoit Steiner	1946cc4478	Added missing packet primitives for CUDA.	2014-10-30 17:52:32 -07:00
Christoph Hertzberg	4ec2f07a5b	Fixed bug in SparseBlock which caused a segfault in sparse_extra_3 test	2014-10-30 21:34:10 +01:00
Christoph Hertzberg	883168ed94	Make select CUDA compatible (comparison operators aren't yet, so no test case yet)	2014-10-30 20:16:16 +01:00
Christoph Hertzberg	e5f134006b	EIGEN_UNUSED_VARIABLE works better than casting to void. Make this also usable from CUDA code	2014-10-30 19:59:09 +01:00
Gael Guennebaud	21c0a2ce0c	Move D&C SVD to official SVD module.	2014-10-29 11:29:33 +01:00
Christoph Hertzberg	e2e7ba9f85	bug #898 : add inline hint to const_cast_ptr	2014-10-28 14:49:44 +01:00
Christoph Hertzberg	bd2d330b25	Temporary workaround for bug #875 : Let TriangularView<Sparse>::nonZeros() return nonZeros() of the nested expression	2014-10-28 13:31:00 +01:00
Konstantinos Margaritis	79225db0b6	Merged in kmargar/eigen (pull request PR-87) Extend NEON to add ARMv8 64-bit double support	2014-10-28 13:08:53 +02:00
Konstantinos Margaritis	94ed7c81e6	Bug #896 : Swap order of checking __VSX__/__ALTIVEC__	2014-10-22 06:15:18 -04:00
Konstantinos Margaritis	fcb3573d17	Merged eigen/eigen into default	2014-10-22 10:42:18 +03:00
Konstantinos Margaritis	fae4fd7a26	Added ARMv8 support	2014-10-22 07:39:49 +00:00
Christoph Hertzberg	cf09c5f687	Prevent CUDA `calling a __host__ function from a __host__ __device__ function is not allowed` error.	2014-10-21 20:40:09 +02:00
Konstantinos Margaritis	b508619392	working 64-bit support in PacketMath.h, Complex.h needed	2014-10-21 18:10:33 +00:00
Konstantinos Margaritis	87524922dc	check for __ARM_NEON instead as it's defined in arm64 as well	2014-10-21 18:08:50 +00:00
Gael Guennebaud	fe57b2f963	bug #701 : workaround (min) and (max) blocking ADL by introducing numext::mini and numext::maxi internal functions and a EIGEN_NOT_A_MACRO macro.	2014-10-20 15:55:32 +02:00
Gael Guennebaud	973e6a035f	bug #718 : Introduce a compilation error when using the wrong InnerIterator type with a SparseVector	2014-10-20 14:07:08 +02:00
Christoph Hertzberg	84aaa03182	Addendum to bug #859 : pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN. psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0	2014-10-20 13:13:43 +02:00
Gael Guennebaud	aa5f79206f	Fix bug #859 : pexp(NaN) returned Inf instead of NaN	2014-10-20 11:38:51 +02:00
Gael Guennebaud	b4a9b3f496	Add unit tests for Rotation2D's inverse(), operator*, slerp, and fix regression wrt explicit ctor change	2014-10-20 11:04:32 +02:00
Gael Guennebaud	d04f23260d	Fix bug #894 : the sign of LDLT was not re-initialized at each call of compute()	2014-10-20 10:48:40 +02:00
Gael Guennebaud	8838b0a1ff	Fix SparseQR::rank for a completely empty matrix.	2014-10-19 22:42:20 +02:00

4086 Commits