Commit Graph

2224 Commits

Author SHA1 Message Date
Gael Guennebaud
d551e99644 make HessenbergDecompositionMatrixHReturnType internal 2010-11-26 15:39:01 +01:00
Gael Guennebaud
e06c6553e0 make TridiagonalizationMatrixTReturnType internal and only export a public MatrixTReturnType typedef 2010-11-26 15:36:29 +01:00
Gael Guennebaud
0d63212257 add a TridiagonalizationMatrixTReturnType class to make Tridiagonalization::matrixT() more efficient and future proof. 2010-11-26 15:31:47 +01:00
Gael Guennebaud
421b2b5ff7 fix a couple of issues with TridiagonalMatrix 2010-11-26 13:04:20 +01:00
Gael Guennebaud
d8b26cfeec s/id/p to avoid name clash 2010-11-26 08:36:16 +01:00
Gael Guennebaud
156a31b0e9 fully implement scalar_fuzzy_impl<bool> as, e.g., the missing isMuchSmallerThan is convenient to filter out false values. 2010-11-25 18:00:30 +01:00
Gael Guennebaud
f1690fb9fa fix bug #122 : rank 2 update test and scalar multiple extraction were both wrong 2010-11-23 19:19:04 +01:00
Benoit Jacob
0ab9a0a2f7 make UpperBidiagonalization internal: don't want to support it, it's not used.
Keeping it because it tests BandMatrix.
2010-11-23 11:12:42 -05:00
Benoit Jacob
ee38dbf1e6 Rework nested<> to be cleaner, see bug #76. 2010-11-23 11:11:40 -05:00
Frederic Gosselin
4c5932f8f5 Improves the filter for hidden files in "Eigen" and "Eigen/src".
This generic solution prevents cmake from raising an error on .svn folders when the source folder is under subversion.
2010-11-22 10:47:07 -05:00
Gael Guennebaud
7213dd1e6b this product still badly read the imaginary part on the diagonal 2010-11-22 18:00:47 +01:00
Benoit Jacob
a3f214ade9 holy crap, i had disabled all static asserts in 71f023de3e 2010-11-22 08:21:30 -05:00
Gael Guennebaud
d8396a8da0 fix compilation of product_mmtr 2010-11-21 10:23:06 +01:00
Gael Guennebaud
fb6d9ca951 add missing non const data() method to MapBase 2010-11-21 10:17:25 +01:00
Gael Guennebaud
12bfe5e718 make sure our internal selfadjoint*vector product does not use the imaginary part of the diagonal entries 2010-11-21 10:08:48 +01:00
Gael Guennebaud
1ac9124fac implements TRMV level 2 blas routine 2010-11-20 23:29:20 +01:00
Gael Guennebaud
d72a8f1e50 make trmv uses direct access 2010-11-20 22:42:24 +01:00
Gael Guennebaud
86474115f5 IBM XL C compiler supports __attribute__((aligned(n))) syntax 2010-11-19 17:33:51 +01:00
Thomas Capricelli
94f59a92cb fix typo 2010-11-19 17:16:28 +01:00
Gael Guennebaud
5ce199b1dd update rank 2 update doc 2010-11-19 16:50:49 +01:00
Gael Guennebaud
f369b5a711 makes rank 2 update function conformant to BLAS HER2 2010-11-19 16:50:15 +01:00
Gael Guennebaud
3f24dbf6f5 fix compilation of transform * scaling 2010-11-19 14:45:45 +01:00
Gael Guennebaud
1618df55df Add support for sparse symmetric permutations 2010-11-18 10:28:39 +01:00
Gael Guennebaud
da05b6af0e fix some remaining issues with ei_ -> internal change 2010-11-16 15:54:48 +01:00
Gael Guennebaud
9a3ec637ff new feature: copy from a sparse selfadjoint view to a full sparse matrix 2010-11-15 14:14:05 +01:00
Gael Guennebaud
5a3a229550 fix return type of rightHouseholderSequence() 2010-11-15 11:11:22 +01:00
Jitse Niesen
cad73d9cdc Correct std::map fix (two commits ago); copy fix to aligned_allocator doc. 2010-11-12 12:06:24 +00:00
Gael Guennebaud
b4fa8261b1 properly use nested types 2010-11-10 19:06:20 +01:00
Gael Guennebaud
05ed9be639 prevent warning 2010-11-10 18:59:16 +01:00
Gael Guennebaud
2577ef90c0 generalize our internal rank K update routine to support more general A*B product while evaluating only one triangular part and make it available via, e.g.:
R.triangularView<Lower>() += s * A * B;
2010-11-10 18:58:55 +01:00
Gael Guennebaud
c810d14d4d add missing specialization 2010-11-09 12:03:20 +01:00
Gael Guennebaud
572b5585e3 fix Eigen's trsv for complexes 2010-11-05 14:36:34 +01:00
Gael Guennebaud
0e30c4ae3f blas level2: gemv and trsv are green 2010-11-05 14:14:50 +01:00
Gael Guennebaud
3fdea699b8 trsv: simplifications/cleaning 2010-11-05 12:54:32 +01:00
Gael Guennebaud
0e6c1170ab trsv: add support for inner-stride!=1, reduce code instantiation, move implementation to a new products/XX.h file 2010-11-05 12:43:14 +01:00
Gael Guennebaud
5a4f77716d fix bug #107: SelfAdjointEigenSolver and RowMajor (and add unit test) 2010-11-04 09:33:05 +01:00
Gael Guennebaud
1eea88bff7 fix matrix product bug with OpenMP 2010-11-03 16:12:37 +01:00
Gael Guennebaud
8d27f55eb3 rm auto normalization in favor of clamping 2010-11-03 15:32:40 +01:00
Hauke Heibel
3a3f163e31 Fix bug #65.
In order to prevent compilation errors, the default functor "struct func" must not be defined inside the function scope. I just moved it into a private section of SparseMatrix.
2010-11-02 14:32:41 +01:00
Hauke Heibel
b3007db131 Added a comment on why is_arithmetic is used in DenseCoeffsBase. 2010-11-02 10:11:22 +01:00
Hauke Heibel
96e4a4b59c Fixed compilation due to lacking Transform definitions. 2010-11-01 16:53:39 +01:00
Gael Guennebaud
d2e257cb5d oops (rm commented code) 2010-11-01 09:40:33 +01:00
Gael Guennebaud
c7eda0d866 Let's be safe: enable auto normalization in quaternion to angle-axis code since a slight numerical issue may trigger NaN. The overhead is small and I doubt the perf of this function could be critical for any application! 2010-10-31 23:26:01 +01:00
Benoit Jacob
99ccb26cfe add eigen2support Transform typedefs, add Eigen2To3 section on Transform 2010-10-29 09:00:35 -04:00
Benoit Jacob
868f753d10 document LvalueBit better 2010-10-28 09:40:20 -04:00
Gael Guennebaud
1d4e80f09d generalize the prune function 2010-10-28 11:39:31 +02:00
Gael Guennebaud
02c8b6af82 fix sparse rankUpdate and triangularView iterator 2010-10-27 15:13:03 +02:00
Gael Guennebaud
241e5ee3e7 add the possibility to solve for sparse rhs with Cholmod 2010-10-27 14:31:23 +02:00
Hauke Heibel
5d4ff3f99c Fixed bug #95 by changing _M_IX64 to _M_X64 as proposed by Jan Schlicht. 2010-10-27 11:07:38 +02:00
Hauke Heibel
c738cd56eb Renamed cleantype to remove_all since it is close to remove_{const|pointer|reference}. 2010-10-26 16:47:01 +02:00
Hauke Heibel
7bc8e3ac09 Initial fixes for bug #85.
Renamed meta_{true|false} to {true|false}_type, meta_if to conditional, is_same_type to is_same, un{ref|pointer|const} to remove_{reference|pointer|const} and makeconst to add_const.
Changed boolean type 'ret' member to 'value'.
Changed 'ret' members referring to types to 'type'.
Adapted all code occurrences.
2010-10-25 22:13:49 +02:00
Benoit Jacob
4716040703 bug #86 : use internal:: namespace instead of ei_ prefix 2010-10-25 10:15:22 -04:00
Hauke Heibel
ba86d3ef65 Fixed bug #84. 2010-10-21 10:13:17 +02:00
Benoit Jacob
8c17fab8f5 renaming: ei_matrix_storage -> DenseStorage
DenseStorageBase  -> PlainObjectBase
2010-10-20 09:34:13 -04:00
Benoit Jacob
e259f71477 rename PlanarRotation -> JacobiRotation 2010-10-19 21:56:26 -04:00
Benoit Jacob
9044c98cff work around stupid msvc error when constructing at compile time an expression
that involves a division by zero, even if the numeric type has floating point
2010-10-19 21:56:11 -04:00
Hauke Heibel
9f8b6ad43e Fixed bug #79. 2010-10-19 09:43:54 +02:00
Benoit Jacob
3481f10e7a re-fix the broken msvc warning in JacobiSVD 2010-10-18 09:46:22 -04:00
Benoit Jacob
3404d5fb14 improvements in pages 5 and 7 of the tutorial. 2010-10-18 09:09:30 -04:00
Benoit Jacob
597bb61c23 fix stupid msvc warning in jacobisvd 2010-10-18 06:54:11 -04:00
Benoit Jacob
8356bc8d06 add jacobiSvd() method, update test & docs 2010-10-17 09:40:52 -04:00
Benoit Jacob
3f79884f03 bump to 2.92.0 2010-10-15 09:46:20 -04:00
Benoit Jacob
6dc478fd77 doc typo 2010-10-14 10:19:46 -04:00
Benoit Jacob
65c01e2bf7 JacobiSVD doc fix 2010-10-14 10:17:40 -04:00
Benoit Jacob
8f0e80fe30 JacobiSVD:
* fix preallocating constructors, allocate U and V of the right size for computation options
* complete documentation and internal comments
* improve unit test, test inf/nan values
2010-10-14 10:14:43 -04:00
Gael Guennebaud
47197065da compilation fix 2010-10-14 10:19:55 +02:00
Gael Guennebaud
3a2bb7f782 fix compilation and warnings with fcc 4.0.1 2010-10-13 10:21:28 +02:00
Benoit Jacob
8eb0fc1e72 remove SVD class (was bad code taken from elsewhere)
Use JacobiSVD for now.
We do plan to reintroduce a bidiagonalizing SVD asap.
2010-10-12 10:19:59 -04:00
Benoit Jacob
dbedc70012 Jacobi improvements:
* add fixed-size vectorized path
* add missing restrict keywords
* use innerStride()
* allow vectorization even if innerStride()>1, if PacketSize==1
  (think of the case of rows of std::complex<double>)
2010-10-12 09:58:53 -04:00
Benoit Jacob
12a152031d fix the Jacobi bug, expand unit test 2010-10-12 09:43:40 -04:00
Benoit Jacob
b8bb804007 set ColPivHouseholderQR as default preconditioner for JacobiSVD 2010-10-11 21:00:42 -04:00
Benoit Jacob
5c3d21693b implement JacobiSVD::solve() and expand the unit test 2010-10-11 15:36:04 -04:00
Benoit Jacob
d229f99ba2 adapt Quaternion to JacobiSVD API changes. 2010-10-08 10:42:41 -04:00
Benoit Jacob
8ba8d90063 add option to compute thin U/V.
By default nothing is computed. You have to ask explicitly for thin/full U/V if you want them.
2010-10-08 10:42:40 -04:00
Benoit Jacob
6fad2eb97b Rework JacobiSVD api / template parameters.
There is now an integer QRPreconditioner template parameter, defaulting to full-piv QR.
Since we have to special-case each QR dec anyway, a template template parameter didn't add much value here.
There is an option NoQRPreconditioner if you know your matrices are already square (auto-detected for fixed-size matrices).
2010-10-08 10:42:32 -04:00
Benoit Jacob
58e0cce0f7 merge backout 2010-10-08 10:42:25 -04:00
Benoit Jacob
4a98cada26 Backed out changeset 2334291157
Sorry Thomas, these doc fixes are no longer relevant with the JacobiSVD API changes, and they are preventing me from applying my patches cleanly.
2010-10-08 10:42:06 -04:00
Gael Guennebaud
a76ce042e6 MSVC for windows mobile does not have the errno.h file 2010-10-07 18:09:15 +02:00
Gael Guennebaud
af22364988 an attempt to fix compilation on windows mobile 2010-10-07 17:54:46 +02:00
Gael Guennebaud
01fad14d78 mark LLT/LDLT solveInPlace func internal and rm their boolean returned value 2010-10-05 15:56:50 +02:00
Thomas Capricelli
2334291157 fix doc 2010-10-04 04:08:32 +02:00
Benoit Jacob
71f023de3e fix compilation on ubuntu 9.04's version of gcc 4.3 (yes, wtf) 2010-09-27 09:57:57 -04:00
Radu Bogdan Rusu
94ea1eed9a fix warning 2010-09-27 09:56:54 -04:00
Hauke Heibel
327ed3d1d3 Added a note to the Gram Schmidt code and improved some formatting. 2010-09-25 14:15:35 +02:00
Hauke Heibel
316dadc8e4 Fixed some SVD issues.
Make the SVD's output unitary.
Improved unit tests.
Added an assert to the SVD ctor to check whether rows>=cols.
2010-09-24 17:32:44 +02:00
Hauke Heibel
053261de88 Make the SVD's output unitary and improved unit tests. 2010-09-24 16:28:20 +02:00
Benoit Jacob
1c54514bfc merge 2010-09-23 09:53:21 -04:00
Benoit Jacob
c253cc3d53 SVD:
* fix unit test for rectangular matrices.
* enforce that rows >= cols since various places in the code assume that.
2010-09-23 09:51:08 -04:00
Hauke Heibel
62bf04b339 Fixed bad memory access in the SVD. 2010-09-23 11:15:36 +02:00
Benoit Jacob
77c943670e add cmakelists for 2 subdirs and make sure all subdirs are installed (GLOB) 2010-09-14 04:11:15 -04:00
Gael Guennebaud
91e9344be9 fix vectorization logic and code of cross3 which was never enabled.. 2010-09-08 14:10:01 +02:00
Gael Guennebaud
9bb75937cc fix += return by value like operations 2010-09-06 11:51:42 +02:00
Gael Guennebaud
62eb4dc99b noalias was wrongly skipping automatic transposition 2010-09-02 19:18:34 +02:00
Gael Guennebaud
4824db6444 add the possibility to extend QuaternionBase 2010-09-02 17:28:07 +02:00
Eamon Nerbonne
d17bb02ccd Fixes mingw32 compile issues 2010-09-02 10:38:23 +02:00
Gael Guennebaud
b49dde01dc fix bad mat * mat * scalar when the implicit conversion operator to a Matrix is used 2010-08-31 09:54:38 +02:00
Gael Guennebaud
dcff9ba785 fix bad "using typename" 2010-08-25 13:34:35 +02:00
Gael Guennebaud
cb7a72d5b0 Fix Sun CC parsing of Eigen/Core. In particular,
I moved all the block related methods to a plugin file. This also
significantly reduce code verbosity.
2010-08-25 13:09:56 +02:00
Benoit Jacob
bd8d06033d make a couple of typedefs public so stuff compiles 2010-08-24 10:53:33 -04:00
Gael Guennebaud
a47bbf664c fix 4x4 SSE inversion when storage orders don't match 2010-08-24 13:00:59 +02:00
Gael Guennebaud
ad9a7c69bc fix inversion of 4x4 unaligned matrices 2010-08-24 12:28:42 +02:00
Gael Guennebaud
6261f4629f add TriangularMatrix::conjugate to be consistent since we have adjoint 2010-08-23 23:38:35 +02:00
Jitse Niesen
d1111d625c Docs: Typos in ArrayBase doxygen comments 2010-08-23 11:44:51 +01:00
Jitse Niesen
103b9351fd Docs: Add references to TopicClassHierarchy 2010-08-22 18:28:19 +01:00
Jitse Niesen
a6da803873 Document DenseCoeffsBase 2010-08-22 17:30:31 +01:00
Hauke Heibel
60aad09878 Fixed DiagonalMatrix assignment. 2010-08-21 16:34:46 +02:00
Hauke Heibel
92b1674c79 Fixed typos. 2010-08-19 20:11:06 +02:00
Hauke Heibel
610d79e686 Simplified the product templates to a minimum of template parameters.
Removed the ei_is_any_projective helper and added ei_transform_traits.
2010-08-19 20:02:46 +02:00
Hauke Heibel
a64aabf73c Removed unused code. 2010-08-19 19:33:13 +02:00
Hauke Heibel
55c7848877 Matrix product refactoring (rhs products only).
Added strong inlines required for MSVC for proper inlining.
Added specializations for DiagonalMatrix products to RotationBase.
Added left- and right-hand-side products with DiagonalMatrix to Transform.
RHS Transform products now return Matrix objects only.
Split the geo_transformations unit test. Some tests were not made for projectivities.
Removed unused variables from main.h that caused warnings.
2010-08-19 19:25:35 +02:00
Gael Guennebaud
d4b664c4cd fix ugly conversion from double[2] to complex 2010-08-19 14:47:58 +02:00
Gael Guennebaud
5354ffbb4f add missing specialization for vector * selfadjoint 2010-08-19 14:05:21 +02:00
Gael Guennebaud
ddbbd7065d * disable unalignment detection when vectorization is not enabled
* revert MapBase unalignment detection
2010-08-18 09:35:55 +02:00
Hauke Heibel
85fdcdf055 Fixed Geometry module failures.
Removed default parameter from Transform.
Removed the TransformXX typedefs.
Removed references to TransformXX from unit tests and docs.
Assigning Transforms to a sub-group is now forbidden at compile time.
Products should now properly support the Isometry flag.
Fixed alignment checks in MapBase.
2010-08-17 20:03:50 +02:00
Benoit Jacob
87aafc9169 fix Transform() constructor taking a Transform with other mode.
Not really tested as the geometry tests are currently busted.
2010-08-16 12:30:33 -04:00
Benoit Jacob
19d9c835e0 fix warnings 2010-08-16 11:11:43 -04:00
Gael Guennebaud
b37551f62a further improve compilation error message for array+=matrix 2010-08-16 11:13:02 +02:00
Gael Guennebaud
c625a6a85b improve compilation error message for array+=matrix and the likes 2010-08-16 11:07:17 +02:00
Gael Guennebaud
453d54325e fix declaration of AffineTransformType in Translation 2010-08-16 10:44:27 +02:00
Gael Guennebaud
aa2b46aa91 allow vectorization of mat44.col() by adding a InnerPanel boolean
template parameter to Block
2010-07-23 16:29:29 +02:00
Gael Guennebaud
853c0e15df slightly generalize the alignment assert in MapBase 2010-08-16 09:41:07 +02:00
Gael Guennebaud
8566ef805b remove the aligned bit flag for non vectorizable types 2010-08-16 09:38:49 +02:00
Benoit Jacob
3a30a2bc3e forgot to remove a #endif 2010-08-13 14:03:38 -04:00
Benoit Jacob
b80d9dd42e fix determination of number of registers on sse:
__i386__ was not defined by MSVC 2010.
fixed as (2*sizeof(void*)).
also move that to SSE/ and let the default for unknown arch's be just 8.
2010-08-13 13:55:28 -04:00
Benoit Jacob
8bbe556e35 merge the backout 2010-08-11 00:06:31 -04:00
Benoit Jacob
97ced33b33 Backed out changeset 40f6e26a24
See thread on mailing list: "InnerPanel change mis-detects alignment?"
2010-08-11 00:04:06 -04:00
Jitse Niesen
76fbe94279 Document EIGEN_NO_DEBUG macro.
I needed some doxygen tricks to get this to work, so it may not be worth it.
2010-08-10 11:37:23 +01:00
Hauke Heibel
3dd8225862 Added more detailed docs to the QR decompositions classes. 2010-08-05 08:56:19 +02:00
Benoit Jacob
d90d7a006f fix warnings. The one in Reverse was potentially serious: coeff() methods should return CoeffReturnType, not "Scalar", if the expression is potentially a Lvalue. 2010-08-03 10:38:48 -04:00
Hauke Heibel
cc25edd5de Fixed Affine transform typedef. 2010-08-02 21:33:48 +02:00
Hauke Heibel
7cefa75901 Added static method Identity() to the Translation class. 2010-07-29 17:30:37 +02:00
Hauke Heibel
e92993d7b9 Safeguarded some Transform functions with compile time asserts.
Added missing static Identity() to Rotation2D, AngleAxis.
2010-07-29 16:17:42 +02:00
Hauke Heibel
6b89ee0095 Transform is now per default Projective.
Improved invert() in the Transform class.
RotationBase offers matrix() to be conform with Transform's naming scheme.
Added Translation::translation() to be conform with Transform's naming scheme.
2010-07-29 15:54:32 +02:00
Hauke Heibel
2f0e8904f1 Removed debug outputs. 2010-07-28 10:47:58 +02:00
Kenneth Riddile
b038a4bb71 * added EIGEN_ALIGNED_ALLOCATOR macro to allow specifying a different aligned allocator
* attempted to add support for std::deque by copying and modifying the std::vector implementation...MSVC still fails to compile with the std::deque::resize() "will not be aligned" error...probably missing something simple but I'm not sure how to make it work
2010-07-26 19:06:47 -04:00
Jitse Niesen
1420f8b3a1 Several changes in comments to keep Doxygen happy. 2010-07-25 20:29:07 +01:00
Jitse Niesen
425444428c Add examples for API documentation of block methods in DenseBase. 2010-07-23 22:20:00 +01:00
User Martin Senst
145830e067 Add newline at the end of Dense. 2010-07-23 19:00:02 +02:00
Gael Guennebaud
40f6e26a24 allow vectorization of mat44.col() by adding a InnerPanel boolean
template parameter to Block
2010-07-23 16:29:29 +02:00
Gael Guennebaud
9daa66f262 fix merge conflicts 2010-07-22 17:23:11 +02:00
Gael Guennebaud
7020f30da3 sync with default branch 2010-07-22 16:29:35 +02:00
Gael Guennebaud
b9edd6fb85 oops 2010-07-22 16:24:01 +02:00
Gael Guennebaud
96ba7cd655 add an OpenGL module simplifying the way you can pass Eigen's objects to GL 2010-07-22 16:08:58 +02:00
Gael Guennebaud
fa6d36e0f7 fix SparseView: clean the nested matrix type 2010-07-22 15:57:01 +02:00
Hauke Heibel
734469e43f Unified LinSpaced in order to be conform with other setter methods as e.g. Constant. 2010-07-22 14:04:00 +02:00
Gael Guennebaud
c7f40e522e merge 2010-07-22 13:21:06 +02:00
Gael Guennebaud
bec3f9bfe4 rename indices to a common scheme 2010-07-22 13:17:39 +02:00
Gael Guennebaud
0916d69ca5 fix inner vectorization logic 2010-07-22 13:17:12 +02:00
Gael Guennebaud
0dfc5b296b fix strict aliasing issue 2010-07-22 13:16:53 +02:00
Gael Guennebaud
35f0bc70d8 fix a strict aliasing issue with gcc 4.3 2010-07-20 22:43:55 +02:00
Gael Guennebaud
7dbbc6ffd1 fix static allocation of workspace 2010-07-20 17:06:14 +02:00
Gael Guennebaud
ced1a45f82 add NEON ploaddup and pcplxflip functions 2010-07-20 14:24:01 +02:00
Gael Guennebaud
193eedbfe2 one more fix for openmp 2010-07-20 14:19:00 +02:00
Gael Guennebaud
d7fa09bf05 improve block-size heuristic 2010-07-20 13:23:50 +02:00
Gael Guennebaud
4824ac1363 fix openmp version 2010-07-20 13:23:19 +02:00
Gael Guennebaud
b551a2d77a fix declaration of pack_lhs in trsm 2010-07-20 12:58:22 +02:00
Gael Guennebaud
10a7668035 uncomment commented code for debug 2010-07-20 12:57:46 +02:00
Gael Guennebaud
872523844a fix trmm and symm wrt lhs packing 2010-07-20 10:06:41 +02:00
Gael Guennebaud
76eb9c9fd9 fix compilation by including file in correct order 2010-07-19 23:32:13 +02:00
Gael Guennebaud
70b1ce11c6 * fix SelfCwiseBinaryOp traits and handling of mixed types
* improve compilation error in case of type mismatch
2010-07-19 23:31:08 +02:00
Gael Guennebaud
8b0b121c9e explicitly disable vectorization for mixed coeff based products 2010-07-19 23:28:57 +02:00
Gael Guennebaud
08c841eb87 fix lhs packing in the case of real * complex products 2010-07-19 23:16:03 +02:00
Gael Guennebaud
1ed4233fd2 port Jacobi to new ei_pset1/ei_pload API 2010-07-19 16:51:38 +02:00
Gael Guennebaud
c2ee454df4 * fix compilation of mixed scalar product
* optimize mixed scalar products
2010-07-19 16:49:09 +02:00
Gael Guennebaud
6e157dd7c6 * fix a couple of remaining issues with previous commit,
* merge ei_product_blocking_traits into ei_gepb_traits
2010-07-19 15:45:13 +02:00
Gael Guennebaud
f8aae7a908 * _mm_loaddup_pd is slow
* optimize SSE ei_ploaddup<Packet4f>
2010-07-19 15:43:27 +02:00
Gael Guennebaud
cd0e5dca9b wip: extend the gebp kernel to optimize complex and mixed products 2010-07-19 08:50:59 +02:00
Gael Guennebaud
1dc9aaaf36 add support for mixing type in trsv 2010-07-13 16:03:49 +02:00
Gael Guennebaud
36d9b51a44 optimize non fused MADD, and add a flatten attribute macro to enforce
inlining within a function
2010-07-13 15:16:34 +02:00
Gael Guennebaud
b72b7ab76f matrix product: move the alpha factor to gebp instead of the packing,
clean some temporaries, etc.
2010-07-12 16:31:46 +02:00
Gael Guennebaud
f8678272a4 mixing types step 3:
- improve support of colmajor by vector and matrix - matrix
- now all configurations are well handled, but the perf are not always very good
2010-07-11 23:57:23 +02:00
Gael Guennebaud
8e3c4283f5 make colmaj * vector uses pointers only 2010-07-11 16:01:48 +02:00
Gael Guennebaud
ff96c94043 mixing types in product step 2:
* pload* and pset1 are now templated on the packet type
* gemv routines are now embedded into a structure with
  a consistent API with respect to gemm
* some configurations of vector * matrix and matrix * matrix works fine,
  some need more work...
2010-07-11 15:48:30 +02:00
Gael Guennebaud
4161b8be67 sync 2010-07-10 22:58:51 +02:00
Gael Guennebaud
e5bc9526f1 * generalize rowmajor by vector
* fix weird compilation error when constructing a matrix with a row by matrix product
2010-07-10 22:53:27 +02:00
Gael Guennebaud
c4ef69b5bd fix compilation: make the check_coordinates* functions const 2010-07-10 22:37:16 +02:00
Benoit Jacob
6dcd373b9d let ei_pset1 use _mm_loaddup_pd. Not a significant speed improvement, but also not a speed regression, and replaces 3 instructions by 1 single instruction. 2010-07-09 18:51:17 -04:00
Konstantinos Margaritis
6ad3f1ab1f Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float>
minor fix in AltiVec Complex.h
2010-07-10 00:09:29 +03:00
Gael Guennebaud
96f9015807 disable MSVC optimization when the underlying compiler is ICC 2010-07-09 19:33:43 +02:00
Gael Guennebaud
b2effa2b2c move ei_conj_if to a more appropriate file 2010-07-09 18:05:57 +02:00
Konstantinos Margaritis
642cc27eb1 forgot to commit ei_p4f_FORWARD; 2010-07-09 18:08:18 +03:00
Konstantinos Margaritis
f6bd508351 forgot to add the Complex.h include for AltiVec. 2010-07-09 17:56:53 +03:00
Konstantinos Margaritis
d9e134c73c Altivec port of Complex.h.
Note: For some reason g++ 4.4 is >200% slower than g++ 4.3 on altivec code.
The same benchmark (bench_gemm) was tested, on the same hardware/OS (G4/Debian testing),
with same CFLAGS. With some code reorganizing I managed to get some minor gain
on 4.4, but I just could not reach 4.3 speed. This is most likely a bug, but I'm waiting
to see if it's fixed on 4.5. I'll look into this a bit more.
2010-07-09 17:54:41 +03:00
Gael Guennebaud
b1a17dbfe4 fix a few weird issues with gcc 4.3 32bits and complex<float> 2010-07-09 08:27:58 +02:00
Gael Guennebaud
504d3a3586 fix SliceVectorizedTraversal for packetsize==1 2010-07-08 23:31:14 +02:00
Gael Guennebaud
300a226ffa scalars fitting in a single packet requires more work, step 1
* add an Alignable trait
* update LinearVectorization assignment
2010-07-08 14:27:47 +02:00
Gael Guennebaud
2a1500915a compilation fix 2010-07-08 14:26:00 +02:00
Gael Guennebaud
2066ed91de enabling aligned loads/store for complex<double> is much more tricky,
so the temporary fix is to always perform unaligned load/store
2010-07-07 22:50:19 +02:00
Gael Guennebaud
d89925e6de an attempt to fix wrong unaligned store 2010-07-07 22:35:06 +02:00
Gael Guennebaud
31a36aa9c4 support for real * complex matrix product - step 1 (works for some special cases) 2010-07-07 19:49:09 +02:00
Gael Guennebaud
861962c55f sync 2010-07-07 16:44:05 +02:00
Gael Guennebaud
a2415388ef optimized conjugate products for SSE3 2010-07-07 16:37:20 +02:00
Gael Guennebaud
65257f6b29 optimize for SSE3 => significant speed up !! 2010-07-07 15:34:46 +02:00
Gael Guennebaud
dd18b22f0b optimize pmul for complex<double> 2010-07-07 15:29:04 +02:00
Gael Guennebaud
845994f18f optimize gemv for complex<double> and fix gcc alignment issue in 32bits 2010-07-07 15:28:41 +02:00
Gael Guennebaud
e07c0f6bb5 cleaning 2010-07-07 11:41:29 +02:00
Gael Guennebaud
b0896382a3 s/IsVectorized/Vectorizable 2010-07-07 11:10:46 +02:00
Gael Guennebaud
74cf12cbe0 add a compile time error if someone call packet on Diagonal (instead of infinite runtime loop) 2010-07-07 11:07:12 +02:00
Gael Guennebaud
d5e0efaf69 fix vectorization rule of diagonal-product 2010-07-07 11:06:31 +02:00
Gael Guennebaud
c851044eae fix row cwise-prod column in coeff based products...
I really don't know why this worked so far...
2010-07-07 10:52:59 +02:00