Gael Guennebaud
d551e99644
make HessenbergDecompositionMatrixHReturnType internal
2010-11-26 15:39:01 +01:00
Gael Guennebaud
e06c6553e0
make TridiagonalizationMatrixTReturnType internal and only export a public MatrixTReturnType typedef
2010-11-26 15:36:29 +01:00
Gael Guennebaud
0d63212257
add a TridiagonalizationMatrixTReturnType class to make Tridiagonalization::matrixT() more efficient and future proof.
2010-11-26 15:31:47 +01:00
Gael Guennebaud
421b2b5ff7
fix a couple of issues with TridiagonalMatrix
2010-11-26 13:04:20 +01:00
Gael Guennebaud
d8b26cfeec
s/id/p to avoid name clash
2010-11-26 08:36:16 +01:00
Gael Guennebaud
156a31b0e9
fully implement scalar_fuzzy_impl<bool> as, e.g., the missing isMuchSmallerThan is convenient to filter out false values.
2010-11-25 18:00:30 +01:00
Gael Guennebaud
f1690fb9fa
fix bug #122 : rank 2 update test and scalar multiple extraction were both wrong
2010-11-23 19:19:04 +01:00
Benoit Jacob
0ab9a0a2f7
make UpperBidiagonalization internal: don't want to support it, it's not used.
...
Keeping it because it tests BandMatrix.
2010-11-23 11:12:42 -05:00
Benoit Jacob
ee38dbf1e6
Rework nested<> to be cleaner, see bug #76 .
2010-11-23 11:11:40 -05:00
Frederic Gosselin
4c5932f8f5
Improves the filter for hidden files in "Eigen" and "Eigen/src".
...
This generic solution prevents cmake from raising an error on .svn folders when the source folder is under Subversion.
2010-11-22 10:47:07 -05:00
Gael Guennebaud
7213dd1e6b
this product still wrongly reads the imaginary part on the diagonal
2010-11-22 18:00:47 +01:00
Benoit Jacob
a3f214ade9
holy crap, i had disabled all static asserts in 71f023de3e
2010-11-22 08:21:30 -05:00
Gael Guennebaud
d8396a8da0
fix compilation of product_mmtr
2010-11-21 10:23:06 +01:00
Gael Guennebaud
fb6d9ca951
add missing non const data() method to MapBase
2010-11-21 10:17:25 +01:00
Gael Guennebaud
12bfe5e718
make sure our internal selfadjoint*vector product does not use the imaginary part of the diagonal entries
2010-11-21 10:08:48 +01:00
Gael Guennebaud
1ac9124fac
implements TRMV level 2 blas routine
2010-11-20 23:29:20 +01:00
Gael Guennebaud
d72a8f1e50
make trmv use direct access
2010-11-20 22:42:24 +01:00
Gael Guennebaud
86474115f5
IBM XL C compiler supports __attribute__((aligned(n))) syntax
2010-11-19 17:33:51 +01:00
Thomas Capricelli
94f59a92cb
fix typo
2010-11-19 17:16:28 +01:00
Gael Guennebaud
5ce199b1dd
update rank 2 update doc
2010-11-19 16:50:49 +01:00
Gael Guennebaud
f369b5a711
makes rank 2 update function conformant to BLAS HER2
2010-11-19 16:50:15 +01:00
Gael Guennebaud
3f24dbf6f5
fix compilation of transform * scaling
2010-11-19 14:45:45 +01:00
Gael Guennebaud
1618df55df
Add support for sparse symmetric permutations
2010-11-18 10:28:39 +01:00
Gael Guennebaud
da05b6af0e
fix some remaining issues with the ei_ -> internal change
2010-11-16 15:54:48 +01:00
Gael Guennebaud
9a3ec637ff
new feature: copy from a sparse selfadjoint view to a full sparse matrix
2010-11-15 14:14:05 +01:00
Gael Guennebaud
5a3a229550
fix return type of rightHouseholderSequence()
2010-11-15 11:11:22 +01:00
Jitse Niesen
cad73d9cdc
Correct std::map fix (two commits ago); copy fix to aligned_allocator doc.
2010-11-12 12:06:24 +00:00
Gael Guennebaud
b4fa8261b1
properly use nested types
2010-11-10 19:06:20 +01:00
Gael Guennebaud
05ed9be639
prevent warning
2010-11-10 18:59:16 +01:00
Gael Guennebaud
2577ef90c0
generalize our internal rank K update routine to support more general A*B product while evaluating only one triangular part and make it available via, e.g.:
...
R.triangularView<Lower>() += s * A * B;
2010-11-10 18:58:55 +01:00
Gael Guennebaud
c810d14d4d
add missing specialization
2010-11-09 12:03:20 +01:00
Gael Guennebaud
572b5585e3
fix Eigen's trsv for complexes
2010-11-05 14:36:34 +01:00
Gael Guennebaud
0e30c4ae3f
blas level2: gemv and trsv are green
2010-11-05 14:14:50 +01:00
Gael Guennebaud
3fdea699b8
trsv: simplifications/cleaning
2010-11-05 12:54:32 +01:00
Gael Guennebaud
0e6c1170ab
trsv: add support for inner-stride!=1, reduce code instantiation, move implementation to a new products/XX.h file
2010-11-05 12:43:14 +01:00
Gael Guennebaud
5a4f77716d
fix bug #107 : SelfAdjointEigenSolver and RowMajor (and add unit test)
2010-11-04 09:33:05 +01:00
Gael Guennebaud
1eea88bff7
fix matrix product bug with OpenMP
2010-11-03 16:12:37 +01:00
Gael Guennebaud
8d27f55eb3
rm auto normalization in favor of clamping
2010-11-03 15:32:40 +01:00
Hauke Heibel
3a3f163e31
Fix bug #65 .
...
In order to prevent compilation errors, the default functor "struct func" must not be defined inside the function scope. I just moved it into a private section of SparseMatrix.
2010-11-02 14:32:41 +01:00
Hauke Heibel
b3007db131
Added a comment on why is_arithmetic is used in DenseCoeffsBase.
2010-11-02 10:11:22 +01:00
Hauke Heibel
96e4a4b59c
Fixed compilation due to lacking Transform definitions.
2010-11-01 16:53:39 +01:00
Gael Guennebaud
d2e257cb5d
oops (rm commented code)
2010-11-01 09:40:33 +01:00
Gael Guennebaud
c7eda0d866
Let's be safe: enable auto normalization in the quaternion to angle-axis code since a slight numerical issue may trigger NaN. The overhead is small and I doubt the perf of this function could be critical for any application!
2010-10-31 23:26:01 +01:00
Benoit Jacob
99ccb26cfe
add eigen2support Transform typedefs, add Eigen2To3 section on Transform
2010-10-29 09:00:35 -04:00
Benoit Jacob
868f753d10
document LvalueBit better
2010-10-28 09:40:20 -04:00
Gael Guennebaud
1d4e80f09d
generalize the prune function
2010-10-28 11:39:31 +02:00
Gael Guennebaud
02c8b6af82
fix sparse rankUpdate and triangularView iterator
2010-10-27 15:13:03 +02:00
Gael Guennebaud
241e5ee3e7
add the possibility to solve for sparse rhs with Cholmod
2010-10-27 14:31:23 +02:00
Hauke Heibel
5d4ff3f99c
Fixed bug #95 by changing _M_IX64 to _M_X64 as proposed by Jan Schlicht.
2010-10-27 11:07:38 +02:00
Hauke Heibel
c738cd56eb
Renamed cleantype to remove_all since it is close to remove_{const|pointer|reference}.
2010-10-26 16:47:01 +02:00
Hauke Heibel
7bc8e3ac09
Initial fixes for bug #85 .
...
Renamed meta_{true|false} to {true|false}_type, meta_if to conditional, is_same_type to is_same, un{ref|pointer|const} to remove_{reference|pointer|const} and makeconst to add_const.
Changed boolean type 'ret' member to 'value'.
Changed 'ret' members referring to types to 'type'.
Adapted all code occurrences.
2010-10-25 22:13:49 +02:00
Benoit Jacob
4716040703
bug #86 : use internal:: namespace instead of ei_ prefix
2010-10-25 10:15:22 -04:00
Hauke Heibel
ba86d3ef65
Fixed bug #84 .
2010-10-21 10:13:17 +02:00
Benoit Jacob
8c17fab8f5
renaming: ei_matrix_storage -> DenseStorage
...
DenseStorageBase -> PlainObjectBase
2010-10-20 09:34:13 -04:00
Benoit Jacob
e259f71477
rename PlanarRotation -> JacobiRotation
2010-10-19 21:56:26 -04:00
Benoit Jacob
9044c98cff
work around stupid msvc error when constructing at compile time an expression
...
that involves a division by zero, even if the numeric type has floating point
2010-10-19 21:56:11 -04:00
Hauke Heibel
9f8b6ad43e
Fixed bug #79 .
2010-10-19 09:43:54 +02:00
Benoit Jacob
3481f10e7a
re-fix the broken msvc warning in JacobiSVD
2010-10-18 09:46:22 -04:00
Benoit Jacob
3404d5fb14
improvements in pages 5 and 7 of the tutorial.
2010-10-18 09:09:30 -04:00
Benoit Jacob
597bb61c23
fix stupid msvc warning in jacobisvd
2010-10-18 06:54:11 -04:00
Benoit Jacob
8356bc8d06
add jacobiSvd() method, update test & docs
2010-10-17 09:40:52 -04:00
Benoit Jacob
3f79884f03
bump to 2.92.0
2010-10-15 09:46:20 -04:00
Benoit Jacob
6dc478fd77
doc typo
2010-10-14 10:19:46 -04:00
Benoit Jacob
65c01e2bf7
JacobiSVD doc fix
2010-10-14 10:17:40 -04:00
Benoit Jacob
8f0e80fe30
JacobiSVD:
...
* fix preallocating constructors, allocate U and V of the right size for computation options
* complete documentation and internal comments
* improve unit test, test inf/nan values
2010-10-14 10:14:43 -04:00
Gael Guennebaud
47197065da
compilation fix
2010-10-14 10:19:55 +02:00
Gael Guennebaud
3a2bb7f782
fix compilation and warnings with fcc 4.0.1
2010-10-13 10:21:28 +02:00
Benoit Jacob
8eb0fc1e72
remove SVD class (was bad code taken from elsewhere)
...
Use JacobiSVD for now.
We do plan to reintroduce a bidiagonalizing SVD asap.
2010-10-12 10:19:59 -04:00
Benoit Jacob
dbedc70012
Jacobi improvements:
...
* add fixed-size vectorized path
* add missing restrict keywords
* use innerStride()
* allow vectorization even if innerStride()>1, if PacketSize==1
(think of the case of rows of std::complex<double>)
2010-10-12 09:58:53 -04:00
Benoit Jacob
12a152031d
fix the Jacobi bug, expand unit test
2010-10-12 09:43:40 -04:00
Benoit Jacob
b8bb804007
set ColPivHouseholderQR as default preconditioner for JacobiSVD
2010-10-11 21:00:42 -04:00
Benoit Jacob
5c3d21693b
implement JacobiSVD::solve() and expand the unit test
2010-10-11 15:36:04 -04:00
Benoit Jacob
d229f99ba2
adapt Quaternion to JacobiSVD API changes.
2010-10-08 10:42:41 -04:00
Benoit Jacob
8ba8d90063
add option to compute thin U/V.
...
By default nothing is computed. You have to ask explicitly for thin/full U/V if you want them.
2010-10-08 10:42:40 -04:00
Benoit Jacob
6fad2eb97b
Rework JacobiSVD api / template parameters.
...
There is now an integer QRPreconditioner template parameter, defaulting to full-piv QR.
Since we have to special-case each QR dec anyway, a template template parameter didn't add much value here.
There is an option NoQRPreconditioner if you know your matrices are already square (auto-detected for fixed-size matrices).
2010-10-08 10:42:32 -04:00
Benoit Jacob
58e0cce0f7
merge backout
2010-10-08 10:42:25 -04:00
Benoit Jacob
4a98cada26
Backed out changeset 2334291157
...
Sorry Thomas, these doc fixes are no longer relevant with the JacobiSVD API changes, and they are preventing me from applying my patches cleanly.
2010-10-08 10:42:06 -04:00
Gael Guennebaud
a76ce042e6
MSVC for windows mobile does not have the errno.h file
2010-10-07 18:09:15 +02:00
Gael Guennebaud
af22364988
an attempt to fix compilation on windows mobile
2010-10-07 17:54:46 +02:00
Gael Guennebaud
01fad14d78
mark LLT/LDLT solveInPlace func internal and rm their boolean returned value
2010-10-05 15:56:50 +02:00
Thomas Capricelli
2334291157
fix doc
2010-10-04 04:08:32 +02:00
Benoit Jacob
71f023de3e
fix compilation on ubuntu 9.04's version of gcc 4.3 (yes, wtf)
2010-09-27 09:57:57 -04:00
Radu Bogdan Rusu
94ea1eed9a
fix warning
2010-09-27 09:56:54 -04:00
Hauke Heibel
327ed3d1d3
Added a note to the Gram Schmidt code and improved some formatting.
2010-09-25 14:15:35 +02:00
Hauke Heibel
316dadc8e4
Fixed some SVD issues.
...
Make the SVD's output unitary.
Improved unit tests.
Added an assert to the SVD ctor to check whether rows>=cols.
2010-09-24 17:32:44 +02:00
Hauke Heibel
053261de88
Make the SVD's output unitary and improved unit tests.
2010-09-24 16:28:20 +02:00
Benoit Jacob
1c54514bfc
merge
2010-09-23 09:53:21 -04:00
Benoit Jacob
c253cc3d53
SVD:
...
* fix unit test for rectangular matrices.
* enforce that rows >= cols since various places in the code assume that.
2010-09-23 09:51:08 -04:00
Hauke Heibel
62bf04b339
Fixed bad memory access in the SVD.
2010-09-23 11:15:36 +02:00
Benoit Jacob
77c943670e
add cmakelists for 2 subdirs and make sure all subdirs are installed (GLOB)
2010-09-14 04:11:15 -04:00
Gael Guennebaud
91e9344be9
fix vectorization logic and code of cross3 which was never enabled..
2010-09-08 14:10:01 +02:00
Gael Guennebaud
9bb75937cc
fix += return by value like operations
2010-09-06 11:51:42 +02:00
Gael Guennebaud
62eb4dc99b
noalias was wrongly skipping automatic transposition
2010-09-02 19:18:34 +02:00
Gael Guennebaud
4824db6444
add the possibility to extend QuaternionBase
2010-09-02 17:28:07 +02:00
Eamon Nerbonne
d17bb02ccd
Fixes mingw32 compile issues
2010-09-02 10:38:23 +02:00
Gael Guennebaud
b49dde01dc
fix bad mat * mat * scalar when the implicit conversion operator to a Matrix is used
2010-08-31 09:54:38 +02:00
Gael Guennebaud
dcff9ba785
fix bad "using typename"
2010-08-25 13:34:35 +02:00
Gael Guennebaud
cb7a72d5b0
Fix Sun CC parsing of Eigen/Core. In particular,
...
I moved all the block related methods to a plugin file. This also
significantly reduces code verbosity.
2010-08-25 13:09:56 +02:00
Benoit Jacob
bd8d06033d
make a couple of typedefs public so stuff compiles
2010-08-24 10:53:33 -04:00
Gael Guennebaud
a47bbf664c
fix 4x4 SSE inversion when storage orders don't match
2010-08-24 13:00:59 +02:00
Gael Guennebaud
ad9a7c69bc
fix inversion of 4x4 unaligned matrices
2010-08-24 12:28:42 +02:00
Gael Guennebaud
6261f4629f
add TriangularMatrix::conjugate to be consistent since we have adjoint
2010-08-23 23:38:35 +02:00
Jitse Niesen
d1111d625c
Docs: Typos in ArrayBase doxygen comments
2010-08-23 11:44:51 +01:00
Jitse Niesen
103b9351fd
Docs: Add references to TopicClassHierarchy
2010-08-22 18:28:19 +01:00
Jitse Niesen
a6da803873
Document DenseCoeffsBase
2010-08-22 17:30:31 +01:00
Hauke Heibel
60aad09878
Fixed DiagonalMatrix assignment.
2010-08-21 16:34:46 +02:00
Hauke Heibel
92b1674c79
Fixed typos.
2010-08-19 20:11:06 +02:00
Hauke Heibel
610d79e686
Simplified the product templates to a minimum of template parameters.
...
Removed the ei_is_any_projective helper and added ei_transform_traits.
2010-08-19 20:02:46 +02:00
Hauke Heibel
a64aabf73c
Removed unused code.
2010-08-19 19:33:13 +02:00
Hauke Heibel
55c7848877
Matrix product refactoring (rhs products only).
...
Added strong inlines required for MSVC for proper inlining.
Added specializations for DiagonalMatrix products to RotationBase.
Added left- and right-hand-side products with DiagonalMatrix to Transform.
RHS Transform products now return Matrix objects only.
Split the geo_transformations unit test. Some tests were not made for projectivities.
Removed unused variables from main.h that caused warnings.
2010-08-19 19:25:35 +02:00
Gael Guennebaud
d4b664c4cd
fix ugly conversion from double[2] to complex
2010-08-19 14:47:58 +02:00
Gael Guennebaud
5354ffbb4f
add missing specialization for vector * selfadjoint
2010-08-19 14:05:21 +02:00
Gael Guennebaud
ddbbd7065d
* disable unalignment detection when vectorization is not enabled
...
* revert MapBase unalignment detection
2010-08-18 09:35:55 +02:00
Hauke Heibel
85fdcdf055
Fixed Geometry module failures.
...
Removed default parameter from Transform.
Removed the TransformXX typedefs.
Removed references to TransformXX from unit tests and docs.
Assigning Transforms to a sub-group is now forbidden at compile time.
Products should now properly support the Isometry flag.
Fixed alignment checks in MapBase.
2010-08-17 20:03:50 +02:00
Benoit Jacob
87aafc9169
fix Transform() constructor taking a Transform with other mode.
...
Not really tested as the geometry tests are currently busted.
2010-08-16 12:30:33 -04:00
Benoit Jacob
19d9c835e0
fix warnings
2010-08-16 11:11:43 -04:00
Gael Guennebaud
b37551f62a
further improve compilation error message for array+=matrix
2010-08-16 11:13:02 +02:00
Gael Guennebaud
c625a6a85b
improve compilation error message for array+=matrix and the likes
2010-08-16 11:07:17 +02:00
Gael Guennebaud
453d54325e
fix declaration of AffineTransformType in Translation
2010-08-16 10:44:27 +02:00
Gael Guennebaud
aa2b46aa91
allow vectorization of mat44.col() by adding a InnerPanel boolean
...
template parameter to Block
2010-07-23 16:29:29 +02:00
Gael Guennebaud
853c0e15df
slightly generalize the alignment assert in MapBase
2010-08-16 09:41:07 +02:00
Gael Guennebaud
8566ef805b
remove the aligned bit flag for non vectorizable types
2010-08-16 09:38:49 +02:00
Benoit Jacob
3a30a2bc3e
forgot to remove a #endif
2010-08-13 14:03:38 -04:00
Benoit Jacob
b80d9dd42e
fix determination of number of registers on sse:
...
__i386__ was not defined by MSVC 2010.
fixed as (2*sizeof(void*)).
also move that to SSE/ and let the default for unknown arch's be just 8.
2010-08-13 13:55:28 -04:00
Benoit Jacob
8bbe556e35
merge the backout
2010-08-11 00:06:31 -04:00
Benoit Jacob
97ced33b33
Backed out changeset 40f6e26a24
...
See thread on mailing list: "InnerPanel change mis-detects alignment?"
2010-08-11 00:04:06 -04:00
Jitse Niesen
76fbe94279
Document EIGEN_NO_DEBUG macro.
...
I needed some doxygen tricks to get this to work, so it may not be worth it.
2010-08-10 11:37:23 +01:00
Hauke Heibel
3dd8225862
Added more detailed docs to the QR decompositions classes.
2010-08-05 08:56:19 +02:00
Benoit Jacob
d90d7a006f
fix warnings. The one in Reverse was potentially serious: coeff() methods should return CoeffReturnType, not "Scalar", if the expression is potentially a Lvalue.
2010-08-03 10:38:48 -04:00
Hauke Heibel
cc25edd5de
Fixed Affine transform typedef.
2010-08-02 21:33:48 +02:00
Hauke Heibel
7cefa75901
Added static method Identity() to the Translation class.
2010-07-29 17:30:37 +02:00
Hauke Heibel
e92993d7b9
Safeguarded some Transform functions with compile time asserts.
...
Added missing static Identity() to Rotation2D, AngleAxis.
2010-07-29 16:17:42 +02:00
Hauke Heibel
6b89ee0095
Transform is now per default Projective.
...
Improved invert() in the Transform class.
RotationBase offers matrix() to conform to Transform's naming scheme.
Added Translation::translation() to conform to Transform's naming scheme.
2010-07-29 15:54:32 +02:00
Hauke Heibel
2f0e8904f1
Removed debug outputs.
2010-07-28 10:47:58 +02:00
Kenneth Riddile
b038a4bb71
* added EIGEN_ALIGNED_ALLOCATOR macro to allow specifying a different aligned allocator
...
* attempted to add support for std::deque by copying and modifying the std::vector implementation...MSVC still fails to compile with the std::deque::resize() "will not be aligned" error...probably missing something simple but I'm not sure how to make it work
2010-07-26 19:06:47 -04:00
Jitse Niesen
1420f8b3a1
Several changes in comments to keep Doxygen happy.
2010-07-25 20:29:07 +01:00
Jitse Niesen
425444428c
Add examples for API documentation of block methods in DenseBase.
2010-07-23 22:20:00 +01:00
User Martin Senst
145830e067
Add newline at the end of Dense.
2010-07-23 19:00:02 +02:00
Gael Guennebaud
40f6e26a24
allow vectorization of mat44.col() by adding a InnerPanel boolean
...
template parameter to Block
2010-07-23 16:29:29 +02:00
Gael Guennebaud
9daa66f262
fix merge conflicts
2010-07-22 17:23:11 +02:00
Gael Guennebaud
7020f30da3
sync with default branch
2010-07-22 16:29:35 +02:00
Gael Guennebaud
b9edd6fb85
oops
2010-07-22 16:24:01 +02:00
Gael Guennebaud
96ba7cd655
add an OpenGL module simplifying the way you can pass Eigen's objects to GL
2010-07-22 16:08:58 +02:00
Gael Guennebaud
fa6d36e0f7
fix SparseView: clean the nested matrix type
2010-07-22 15:57:01 +02:00
Hauke Heibel
734469e43f
Unified LinSpaced to conform to other setter methods such as Constant.
2010-07-22 14:04:00 +02:00
Gael Guennebaud
c7f40e522e
merge
2010-07-22 13:21:06 +02:00
Gael Guennebaud
bec3f9bfe4
rename indices to a common scheme
2010-07-22 13:17:39 +02:00
Gael Guennebaud
0916d69ca5
fix inner vectorization logic
2010-07-22 13:17:12 +02:00
Gael Guennebaud
0dfc5b296b
fix strict aliasing issue
2010-07-22 13:16:53 +02:00
Gael Guennebaud
35f0bc70d8
fix a strict aliasing issue with gcc 4.3
2010-07-20 22:43:55 +02:00
Gael Guennebaud
7dbbc6ffd1
fix static allocation of workspace
2010-07-20 17:06:14 +02:00
Gael Guennebaud
ced1a45f82
add NEON ploaddup and pcplxflip functions
2010-07-20 14:24:01 +02:00
Gael Guennebaud
193eedbfe2
one more fix for openmp
2010-07-20 14:19:00 +02:00
Gael Guennebaud
d7fa09bf05
improve block-size heuristic
2010-07-20 13:23:50 +02:00
Gael Guennebaud
4824ac1363
fix openmp version
2010-07-20 13:23:19 +02:00
Gael Guennebaud
b551a2d77a
fix declaration of pack_lhs in trsm
2010-07-20 12:58:22 +02:00
Gael Guennebaud
10a7668035
uncomment commented code for debug
2010-07-20 12:57:46 +02:00
Gael Guennebaud
872523844a
fix trmm and symm wrt lhs packing
2010-07-20 10:06:41 +02:00
Gael Guennebaud
76eb9c9fd9
fix compilation by including file in correct order
2010-07-19 23:32:13 +02:00
Gael Guennebaud
70b1ce11c6
* fix SelfCwiseBinaryOp traits and handling of mixed types
...
* improve compilation error in case of type mismatch
2010-07-19 23:31:08 +02:00
Gael Guennebaud
8b0b121c9e
explicitly disable vectorization for mixed coeff based products
2010-07-19 23:28:57 +02:00
Gael Guennebaud
08c841eb87
fix lhs packing in the case of real * complex products
2010-07-19 23:16:03 +02:00
Gael Guennebaud
1ed4233fd2
port Jacobi to new ei_pset1/ei_pload API
2010-07-19 16:51:38 +02:00
Gael Guennebaud
c2ee454df4
* fix compilation of mixed scalar product
...
* optimize mixed scalar products
2010-07-19 16:49:09 +02:00
Gael Guennebaud
6e157dd7c6
* fix a couple of remaining issues with previous commit,
...
* merge ei_product_blocking_traits into ei_gepb_traits
2010-07-19 15:45:13 +02:00
Gael Guennebaud
f8aae7a908
* _mm_loaddup_pd is slow
...
* optimize SSE ei_ploaddup<Packet4f>
2010-07-19 15:43:27 +02:00
Gael Guennebaud
cd0e5dca9b
wip: extend the gebp kernel to optimize complex and mixed products
2010-07-19 08:50:59 +02:00
Gael Guennebaud
1dc9aaaf36
add support for mixing type in trsv
2010-07-13 16:03:49 +02:00
Gael Guennebaud
36d9b51a44
optimize non fused MADD, and add a flatten attribute macro to enforce
...
inlining within a function
2010-07-13 15:16:34 +02:00
Gael Guennebaud
b72b7ab76f
matrix product: move the alpha factor to gebp instead of the packing,
...
clean some temporaries, etc.
2010-07-12 16:31:46 +02:00
Gael Guennebaud
f8678272a4
mixing types step 3:
...
- improve support of colmajor by vector and matrix - matrix
- now all configurations are well handled, but the performance is not always very good
2010-07-11 23:57:23 +02:00
Gael Guennebaud
8e3c4283f5
make colmaj * vector use pointers only
2010-07-11 16:01:48 +02:00
Gael Guennebaud
ff96c94043
mixing types in product step 2:
...
* pload* and pset1 are now templated on the packet type
* gemv routines are now embedded into a structure with
a consistent API with respect to gemm
* some configurations of vector * matrix and matrix * matrix work fine,
some need more work...
2010-07-11 15:48:30 +02:00
Gael Guennebaud
4161b8be67
sync
2010-07-10 22:58:51 +02:00
Gael Guennebaud
e5bc9526f1
* generalize rowmajor by vector
...
* fix weird compilation error when constructing a matrix with a row by matrix product
2010-07-10 22:53:27 +02:00
Gael Guennebaud
c4ef69b5bd
fix compilation: make the check_coordinates* functions const
2010-07-10 22:37:16 +02:00
Benoit Jacob
6dcd373b9d
let ei_pset1 use _mm_loaddup_pd. Not a significant speed improvement, but also not a speed regression, and replaces 3 instructions by 1 single instruction.
2010-07-09 18:51:17 -04:00
Konstantinos Margaritis
6ad3f1ab1f
Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float>
...
minor fix in AltiVec Complex.h
2010-07-10 00:09:29 +03:00
Gael Guennebaud
96f9015807
disable MSVC optimization when the underlying compiler is ICC
2010-07-09 19:33:43 +02:00
Gael Guennebaud
b2effa2b2c
move ei_conj_if to a more appropriate file
2010-07-09 18:05:57 +02:00
Konstantinos Margaritis
642cc27eb1
forgot to commit ei_p4f_FORWARD;
2010-07-09 18:08:18 +03:00
Konstantinos Margaritis
f6bd508351
forgot to add the Complex.h include for AltiVec.
2010-07-09 17:56:53 +03:00
Konstantinos Margaritis
d9e134c73c
Altivec port of Complex.h.
...
Note: For some reason g++ 4.4 is >200% slower than g++ 4.3 on altivec code.
The same benchmark (bench_gemm) was tested, on the same hardware/OS (G4/Debian testing),
with same CFLAGS. With some code reorganizing I managed to get some minor gain
on 4.4, but I just could not reach 4.3 speed. This is most likely a bug, but I'm waiting
to see if it's fixed on 4.5. I'll look into this a bit more.
2010-07-09 17:54:41 +03:00
Gael Guennebaud
b1a17dbfe4
fix a few weird issues with gcc 4.3 32bits and complex<float>
2010-07-09 08:27:58 +02:00
Gael Guennebaud
504d3a3586
fix SliceVectorizedTraversal for packetsize==1
2010-07-08 23:31:14 +02:00
Gael Guennebaud
300a226ffa
scalars fitting in a single packet require more work, step 1
...
* add an Alignable trait
* update LinearVectorization assignment
2010-07-08 14:27:47 +02:00
Gael Guennebaud
2a1500915a
compilation fix
2010-07-08 14:26:00 +02:00
Gael Guennebaud
2066ed91de
enabling aligned loads/store for complex<double> is much more tricky,
...
so the temporary fix is to always perform unaligned load/store
2010-07-07 22:50:19 +02:00
Gael Guennebaud
d89925e6de
an attempt to fix wrong unaligned store
2010-07-07 22:35:06 +02:00
Gael Guennebaud
31a36aa9c4
support for real * complex matrix product - step 1 (works for some special cases)
2010-07-07 19:49:09 +02:00
Gael Guennebaud
861962c55f
sync
2010-07-07 16:44:05 +02:00
Gael Guennebaud
a2415388ef
optimized conjugate products for SSE3
2010-07-07 16:37:20 +02:00
Gael Guennebaud
65257f6b29
optimize for SSE3 => significant speed up !!
2010-07-07 15:34:46 +02:00
Gael Guennebaud
dd18b22f0b
optimize pmul for complex<double>
2010-07-07 15:29:04 +02:00
Gael Guennebaud
845994f18f
optimize gemv for complex<double> and fix gcc alignment issue in 32bits
2010-07-07 15:28:41 +02:00
Gael Guennebaud
e07c0f6bb5
cleaning
2010-07-07 11:41:29 +02:00
Gael Guennebaud
b0896382a3
s/IsVectorized/Vectorizable
2010-07-07 11:10:46 +02:00
Gael Guennebaud
74cf12cbe0
add a compile time error if someone calls packet on Diagonal (instead of infinite runtime loop)
2010-07-07 11:07:12 +02:00
Gael Guennebaud
d5e0efaf69
fix vectorization rule of diagonal-product
2010-07-07 11:06:31 +02:00
Gael Guennebaud
c851044eae
fix row cwise-prod column in coeff based products...
...
I really don't know why this worked so far...
2010-07-07 10:52:59 +02:00