Commit Graph

6106 Commits

Author SHA1 Message Date
Benoit Jacob
340b8afb14 bug #936, patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_,
because this is what they are about. "Fused" means "no intermediate rounding
between the mul and the add, only one rounding at the end". Instead,
what we are concerned about here is whether a temporary register is needed,
i.e. whether the MUL and ADD are separate instructions.
Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA.
But a true fused mul-add is only available on VFPv4: VFMA.
2015-01-31 14:15:57 -05:00
Benoit Jacob
9f99f61e69 bug #936, patch 1/3: some cleanup and renaming for consistency. 2015-01-30 17:43:56 -05:00
Benoit Jacob
759bd92a85 bug #935: Add asm comments in GEBP kernels to work around a bug
in both GCC and Clang on ARM/NEON, whereby they spill registers,
severely harming performance. The reason why the asm comments
make a difference is that they prevent the compiler from
reordering code across these boundaries, which has the effect
of extending the lifetime of local variables and increasing
register pressure on this register-tight code.
2015-01-30 17:27:56 -05:00
Gael Guennebaud
f1092d2f73 bug #941: fix accuracy issue in ColPivHouseholderQR, do not stop decomposition on a small pivot 2015-01-30 19:04:04 +01:00
Gael Guennebaud
9d82f7e30d Supernodes was disabled. 2015-01-30 17:24:40 +01:00
Benoit Steiner
e896c0ade7 Marked the contraction operation as read only, since its result can't be assigned. 2015-01-29 10:29:47 -08:00
Benoit Steiner
5a6ea4edf6 Added more tests to cover tensor reductions 2015-01-28 10:02:47 -08:00
Gael Guennebaud
a727a2c4ed bug #933: RealSchur, do not consider the input matrix norm to check negligible sub-diag entries. This also makes this test consistent with the complex and self-adjoint cases. 2015-01-28 16:07:51 +01:00
Benoit Steiner
9dfdbd7e56 mproved the performance of tensor reductions that preserve the inner most dimension(s). 2015-01-27 14:15:31 -08:00
Benoit Steiner
46fc881e4a Added a few benchmarks for the tensor code 2015-01-26 17:46:40 -08:00
Gael Guennebaud
c6eb84aabc Enable vectorization of transposeInPlace for PacketSize x PacketSize matrices 2015-01-26 17:09:01 +01:00
Gael Guennebaud
e1f1091fde Add support for dense ?= diagonal 2015-01-24 10:32:49 +01:00
Gael Guennebaud
b9d314ae19 bug #329: fix typo 2015-01-17 21:55:33 +01:00
Benoit Steiner
14f537c296 gcc doesn't consider that
template<typename OtherDerived> TensorStridingOp& operator = (const OtherDerived& other)
provides a valid assignment operator for the striding operation, and therefore refuses to compile code like:
result.stride(foo) = source.stride(bar);

Added the explicit
   TensorStridingOp& operator = (const TensorStridingOp& other)

as a workaround to get the code to compile, and did the same in all the operations that can be used as lvalues.
2015-01-16 09:09:23 -08:00
Benoit Steiner
641e824c56 Added cube() operation 2015-01-15 11:11:48 -08:00
Benoit Steiner
b5124e7cfd Created many additional tests 2015-01-14 15:46:04 -08:00
Benoit Steiner
54e3633b43 Updated the list of include files 2015-01-14 15:43:38 -08:00
Benoit Steiner
f697df7237 Improved support for RowMajor tensors
Misc fixes and API cleanups.
2015-01-14 15:38:48 -08:00
Benoit Steiner
6559d09c60 Ensured that each thread has it's own copy of the TensorEvaluator: this avoid race conditions when the evaluator calls a non thread safe functor, eg when generating random numbers. 2015-01-14 15:34:50 -08:00
Benoit Steiner
8a382aa119 Improved the resizing of tensors 2015-01-14 15:33:11 -08:00
Benoit Steiner
703c526355 Misc improvements 2015-01-14 15:31:52 -08:00
Benoit Steiner
4cdf3fe427 Misc fixes 2015-01-14 15:30:47 -08:00
Benoit Steiner
0feff6e987 Expanded the functionality of index lists 2015-01-14 15:29:48 -08:00
Gael Guennebaud
cd679f2c47 Fix doc: setConstant does not exist for SparseMatrix. 2015-01-14 22:06:09 +01:00
Benoit Steiner
1ac8600126 Fixed the return type of coefficient wise operations. For example, the abs function returns a floating point value when called on a complex input. 2015-01-14 12:47:46 -08:00
Benoit Steiner
378bdfb7f0 Added missing apis to the TensorMap class 2015-01-14 12:45:20 -08:00
Benoit Steiner
0526dc1bb4 Added missing apis to the tensor class 2015-01-14 12:44:08 -08:00
Benoit Steiner
1a36590e84 Fixed the printing of RowMajor tensors 2015-01-14 12:43:20 -08:00
Benoit Steiner
7e0b6c56b4 Added ability to initialize a tensor using an initializer list 2015-01-14 12:41:30 -08:00
Benoit Steiner
b12dd1ae3c Misc improvements for fixed size tensors 2015-01-14 12:39:34 -08:00
Benoit Steiner
71676eaddd Added support for RowMajor inputs to the contraction code. 2015-01-14 12:36:57 -08:00
Benoit Steiner
0a0ab6dd15 Increased the functionality of the tensor devices 2015-01-14 11:45:17 -08:00
Benoit Steiner
5692723c58 Improved the performance of the contraction code on CUDA 2015-01-14 11:42:52 -08:00
Benoit Steiner
8f4b8d204b Improved the performance of tensor reductions
Added the ability to generate random numbers following a normal distribution
Created a test to validate the ability to generate random numbers.
2015-01-14 10:19:33 -08:00
Benoit Steiner
3bd2b41b2e Created a test for tensor type casting 2015-01-14 10:17:02 -08:00
Benoit Steiner
4928ea1212 Added ability to reverse the order of the coefficients in a tensor 2015-01-14 10:15:58 -08:00
Benoit Steiner
b00fe1590d Added ability to swap the layout of a tensor 2015-01-14 10:14:46 -08:00
Benoit Steiner
c94174b4fe Improved tensor references 2015-01-14 10:13:08 -08:00
Benoit Steiner
91dd53e54d Created some documentation 2015-01-13 16:07:51 -08:00
Gael Guennebaud
279786e987 Fix missing evaluator in outer-product 2015-01-13 10:25:50 +01:00
Gael Guennebaud
ae4644cc68 bug #907, ARM64: workaround ICE in xcode/clang 2015-01-13 10:03:00 +01:00
Gael Guennebaud
36f7c1337f bug #907, ARM64: workaround vreinterpretq_u64_* not defined in xcode/clang 2015-01-13 09:57:37 +01:00
Gael Guennebaud
63974bcb88 Big 907: workaround some missing intrinsics in current NDK's gcc version (ARM64) 2015-01-07 09:44:25 +01:00
Gael Guennebaud
79f4a59ed9 bug #907: fix compilation with ARM64 2015-01-07 09:41:56 +01:00
Benoit Steiner
9f98650d0a Ensured that contractions that can be reduced to a matrix vector product work correctly even when the input coefficients aren't aligned. 2015-01-06 09:29:13 -08:00
Gael Guennebaud
db5b0741b5 Fix bug #925: typo in MatLab versions of middleRows 2015-01-04 21:39:50 +01:00
Gael Guennebaud
f5f6e2c6f4 bug #921: fix utilization of bitwise operation on enums in first_aligned 2014-12-19 14:41:59 +01:00
Gael Guennebaud
25c7d9164f bug #920: fix MSVC 2015 compilation issues 2014-12-18 22:58:15 +01:00
Gael Guennebaud
b8d9eaa19b Use true compile time "if" for Transform::makeAffine 2014-12-13 22:16:39 +01:00
Gael Guennebaud
f806c23012 Fix false negatives in geo_transformations unit tests 2014-12-16 16:50:30 +01:00