Commit Graph

286 Commits

Author SHA1 Message Date
Benoit Jacob
953efdbfe7 - introduce Part and Extract classes, splitting and extending the former
Triangular class
- full meta-unrolling in Part
- move inverseProduct() to MatrixBase
- compilation fix in ProductWIP: introduce a meta-selector to only do
  direct access on types that support it.
- phase out the old Product, remove the WIP_DIRTY stuff.
- misc renaming and fixes
2008-05-27 05:47:30 +00:00
Gael Guennebaud
8f1fc80a77 some documentation fixes (Cwise* and Cholesky) 2008-05-22 16:31:00 +00:00
Gael Guennebaud
94e1629a1b * improved product performance:
- fallback to normal product for small dynamic matrices
 - overloaded "c += (a * b).lazy()" to avoid the expensive and useless temporary and setZero()
   in such very common cases.
* fix a couple of issues with the flags
2008-05-22 14:51:25 +00:00
Gael Guennebaud
106a0c1bef restored the product test 2008-05-22 12:35:09 +00:00
Gael Guennebaud
9ab6e186eb remove Like1DArrayBit in Transpose 2008-05-22 12:25:11 +00:00
Gael Guennebaud
522e24f2d7 update of the testing framework:
replaced the QTestLib framework my custom macros
and a (optional) custom script to run the tests from ctest.
2008-05-22 12:18:55 +00:00
Gael Guennebaud
c6789a279c Fix compilation issues with MSVC and NVCC.
Added a few typedef of complex return types in MatrixBase (Needed by MSVC)
2008-05-15 09:40:11 +00:00
Benoit Jacob
5da60897ab Introduce generic Flagged xpr, remove already Lazy.h and Temporary.h
Rename DefaultLostFlagMask --> HerediraryBits
2008-05-14 08:20:15 +00:00
Gael Guennebaud
fd2e9e5c3c * Clean a bit the eigenvalue solver: if the matrix is known to be
selfadjoint at compile time, then it returns real eigenvalues.
* Fix a couple of bugs with the new product.
2008-05-13 07:40:25 +00:00
Benoit Jacob
3eccfd1a78 -fix certain #includes
-fix CMakeLists, public headers weren't getting installed
2008-05-12 21:15:17 +00:00
Gael Guennebaud
4317fad869 * Added several cast to int of the enums (needed for some compilers)
* Fix a mistake in CwiseNullary.
* Added a CoreDeclarions header that declares only the forward declarations
  and related basic stuffs.
2008-05-12 18:09:30 +00:00
Benoit Jacob
678f18fce4 put inline keywords everywhere appropriate. So we don't need anymore to pass
-finline-limit=1000 to gcc to get good performance. By the way some cleanup.
2008-05-12 17:34:46 +00:00
Gael Guennebaud
f0eb3d2d3b updated product test to carefully test all scalar types
and fix an issue in the triangular test
2008-05-12 10:26:10 +00:00
Gael Guennebaud
45cda6704a * Draft of a eigenvalues solver
(does not support complex and does not re-use the QR decomposition)

* Rewrite the cache friendly product to have only one instance per scalar type !
  This significantly speeds up compilation time and reduces executable size.
  The current drawback is that some trivial expressions might be
  evaluated like conjugate or negate.

* Renamed "cache optimal" to "cache friendly"

* Added the ability to directly access matrix data of some expressions via:
  - the stride()/_stride() methods
  - DirectAccessBit flag (replace ReferencableBit)
2008-05-12 10:23:09 +00:00
Benoit Jacob
dca416cace move arch-specific code to arch/SSE and arch/AltiVec subdirs.
rename the noarch PacketMath.h to DummyPacketMath.h
2008-05-12 08:30:42 +00:00
Benoit Jacob
3562b01105 * Give Konstantinos a copyright line
* Fix compilation of Inverse.h with vectorisation
* Introduce EIGEN_GNUC_AT_LEAST(x,y) macro doing future-proof (e.g. gcc v5.0) check
* Only use ProductWIP if vectorisation is enabled
* rename EIGEN_ALWAYS_INLINE -> EIGEN_INLINE with fall-back to inline keyword
* some cleanup/indentation
2008-05-12 08:12:40 +00:00
Benoit Jacob
4f6d7abc87 only include SSE3 headers if compiling with SSE3 support 2008-05-08 09:15:16 +00:00
Gael Guennebaud
4754fa4868 removed "sort brief" in doxygen documentation 2008-05-08 08:13:38 +00:00
Gael Guennebaud
bf5326c3ca * Added ReferencableBit flag to known if coeffRef is available.
(needed by the new product implementation)
* Make the packet* members template to support aligned and unaligned
  access. This makes Block vectorizable. Combined with ReferencableBit,
  we should be able to determine at runtime (in some specific cases) if
  an aligned vectorization is possible or not.
* Improved the new product implementation to robustly handle all cases,
  it now passes all the tests.
* Renamed the packet version ei_predux to ei_preduxp to avoid name collision.
2008-05-08 08:12:52 +00:00
Gael Guennebaud
64c49de7ba * split PacketMath.h to SSE and Altivec specific files
* improved the flexibility of the new product implementation,
  now all sizes seems to be properly handled.
2008-05-05 17:19:47 +00:00
Gael Guennebaud
46fa4c713f * Started support for unaligned vectorization.
* Introduce a new highly optimized matrix-matrix product for large
  matrices. The code is still highly experimental and it is activated
  only if you define EIGEN_WIP_PRODUCT at compile time.
  Currently the third dimension of the product must be a factor of
  the packet size (x4 for floats) and the right handed side matrix
  must be column major.
  Moreover, currently c = a*b; actually computes c += a*b !!
  Therefore, the code is provided for experimentation purpose only !
  These limitations will be fixed soon or later to become the default
  product implementation.
2008-05-05 10:23:29 +00:00
Benoit Jacob
8c6007f80e * Patch by Konstantinos Margaritis: AltiVec vectorization.
* Fix several warnings, temporarily disable determinant test.
2008-05-03 12:21:23 +00:00
Gael Guennebaud
0545df2149 slighly improved the cache friendly product to use mul-add only 2008-05-03 10:01:30 +00:00
Gael Guennebaud
a6655dd91a added packet mul-add function (ei_pmad) and updated Product to use it.
this change nothing for current SSE architecture but might be helpful
for altivec/cell and up comming AMD processors.
2008-05-03 00:45:08 +00:00
Gael Guennebaud
102e029dad Removed ei_pload1, use posix_memalign to allocate aligned memory,
and make Product ok when only one side is vectorizable (and the product
is still vectorized)
2008-05-02 13:30:12 +00:00
Gael Guennebaud
e19f9bc523 added a test for triangular matrices 2008-05-02 11:35:59 +00:00
Benoit Jacob
890a8de962 Make products always eval into expressions. Improves performance
in benchmark. Still not as fasts as explicit eval(), strangely.
2008-05-02 08:53:23 +00:00
Gael Guennebaud
ef5b20bc50 fix flag and cost computations for nested expressions 2008-05-01 18:58:30 +00:00
Gael Guennebaud
5588def0cf nullary xpr are now vectorized 2008-05-01 14:28:53 +00:00
Gael Guennebaud
02f1615d2a Enable vectorization of product with dynamic matrices,
extended cache optimal product to work in any row/column
major situations, and a few bugfixes (forgot to add the
Cholesky header, vectorization of CwiseBinary)
2008-05-01 13:53:05 +00:00
Gael Guennebaud
6486991ac3 some cleaning in Cholesky and removed evil ei_sqrt of complex 2008-04-27 18:57:28 +00:00
Gael Guennebaud
64bacf1c3f * added ei_sqrt for complex
* updated Cholesky to support complex
* correct result_type for abs and abs2 functors
2008-04-27 14:05:40 +00:00
Gael Guennebaud
4ffffa670e added Cholesky module 2008-04-27 10:57:32 +00:00
Gael Guennebaud
1ec2d21ca5 Fixed a couple of issues introduced in previous commits.
Added a test for Triangular.
2008-04-26 20:28:27 +00:00
Gael Guennebaud
b4c974d059 Added triangular assignement, e.g.:
m.upper() = a+b;
only updates the upper triangular part of m.
Note that:
 m = (a+b).upper();
updates all coefficients of m (but half of the additions
will be skiped)

Updated back/forward substitution to better use Eigen's capability.
2008-04-26 19:20:26 +00:00
Gael Guennebaud
4c92150676 Added Triangular expression to extract upper or lower (strictly or not)
part of a matrix. Triangular also provide an optimised method for forward
and backward substitution. Further optimizations regarding assignments and
products might come later.

Updated determinant() to take into account triangular matrices.

Started the QR module with a QR decompostion algorithm.
Help needed to build a QR algorithm (eigen solver) based on it.
2008-04-26 18:26:05 +00:00
Gael Guennebaud
62bf0bbd59 fix a bug in determinant of 4x4 matrices and a small type issue in Inverse 2008-04-26 08:56:52 +00:00
Gael Guennebaud
173e582e3c added a tough test to check the determinant that currently fails 2008-04-25 23:13:20 +00:00
Gael Guennebaud
6f2c72fb53 Various fixes in:
- vector to vector assign
 - PartialRedux
 - Vectorization criteria of Product
 - returned type of normalized
 - SSE integer mul
2008-04-25 23:10:37 +00:00
Gael Guennebaud
a451835bce Make the explicit vectorization much more flexible:
- support dynamic sizes
 - support arbitrary matrix size when the matrix can be seen as a 1D array
   (except for fixed size matrices where the size in Bytes must be a factor of 16,
    this is to allow compact storage of a vector of matrices)
Note that the explict vectorization is still experimental and far to be completely tested.
2008-04-25 15:46:18 +00:00
Gael Guennebaud
30d47b5250 forgot to add a file in the previous commit 2008-04-24 20:25:55 +00:00
Gael Guennebaud
9385793f71 Fix a couple of issue with the vectorization. In particular, default ei_p* functions
are provided to handle not suported types seemlessly.

Added a generic null-ary expression with null-ary functors. They replace
Zero, Ones, Identity and Random.
2008-04-24 18:35:39 +00:00
Benoit Jacob
6ae037dfb5 give up on OpenMP... for now 2008-04-18 07:57:46 +00:00
Benoit Jacob
acfd6f3bda - add _packetCoeff() to Inverse, allowing vectorization.
- let Inverse take template parameter MatrixType instead
  of ExpressionType, in order to reduce executable code size
  when taking inverses of xpr's.
- introduce ei_corrected_matrix_flags : the flags template
  parameter to the Matrix class is only a suggestion. This
  is also useful in ei_eval.
2008-04-16 07:18:27 +00:00
Benoit Jacob
43e2bc14fe +5% optimization in 4x4 inverse:
-only evaluate block expressions for which that is beneficial
-don't check for invertibility unless requested
2008-04-15 20:39:27 +00:00
Benoit Jacob
6747b45ae7 for 4x4 matrices implement the special algorithm that Markos proposed,
falling back to the general algorithm in the bad case.
2008-04-15 20:15:36 +00:00
Benoit Jacob
2a86f052a5 - optimized determinant calculations for small matrices (size <= 4)
(only 30 muls for size 4)
- rework the matrix inversion: now using cofactor technique for size<=3,
  so the ugly unrolling is only used for size 4 anymore, and even there
  I'm looking to get rid of it.
2008-04-14 17:07:12 +00:00
Benoit Jacob
9789c04467 when evaluating an xpr, the result can now be vectorizable
even if the xpr itself wasn't vectorizable.
2008-04-14 08:55:12 +00:00
Benoit Jacob
ea3ccb1e8c * Start of the LU module, with matrix inversion already there and
fully optimized.
* Even if LargeBit is set, only parallelize for large enough objects
  (controlled by EIGEN_PARALLELIZATION_TRESHOLD).
2008-04-14 08:20:24 +00:00
Benoit Jacob
ab4046970b * Add fixed-size template versions of corner(), start(), end().
* Use them to write an unrolled path in echelon.cpp, as an
  experiment before I do this LU module.
* For floating-point types, make ei_random() use an amplitude
  of 1.
2008-04-12 17:37:27 +00:00