Gael Guennebaud
d43b2f01f4
Fix unit testing of predux_downto4 (bad name), and add unit testing of prsqrt
2018-04-03 14:14:00 +02:00
luz.paz
e3912f5e63
MIsc. source and comment typos
...
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
Srinivas Vasudevan
218764ee1f
Added support for expm1 in Eigen.
2016-12-02 14:13:01 -08:00
Konstantinos Margaritis
a1d5c503fa
replace sizeof(Packet) with PacketSize else it breaks for ZVector.Packet4f
2016-11-17 13:27:45 -05:00
Benoit Steiner
c80587c92b
Merged eigen/eigen into default
2016-11-03 03:55:11 -07:00
Gael Guennebaud
598de8b193
Add pinsertfirst function and implement pinsertlast for complex on SSE/AVX.
2016-11-02 10:38:13 +01:00
Gael Guennebaud
13fc18d3a2
Add a pinsertlast function replacing the last entry of a packet by a scalar.
...
(useful to vectorize LinSpaced)
2016-10-25 16:48:49 +02:00
Benoit Steiner
78d2926508
Merged eigen/eigen into default
2016-10-12 13:46:29 -07:00
Benoit Steiner
507b661106
Renamed predux_half into predux_downto4
2016-10-06 17:57:04 -07:00
Benoit Steiner
78b569f685
Merged latest updates from trunk
2016-10-05 18:48:55 -07:00
Rasmus Munk Larsen
3ed67cb0bb
Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
...
Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
SSE AVX
Fast=1 2.529G 4.380G
Fast=0 1.944G 1.898G
Fast=1 fixed 2.214G 3.739G
This table illustrates the worst case in terms speed impact: It was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the different versions almost negligible.
2016-10-04 14:22:56 -07:00
Gael Guennebaud
66cbabafed
Add a note regarding gcc bug #72867
2016-09-22 11:18:52 +02:00
Gael Guennebaud
326320ec7b
Fix compilation in non C++11 mode.
2016-08-23 19:28:57 +02:00
Igor Babuschkin
aee693ac52
Add log1p support for CUDA and half floats
2016-08-08 20:24:59 +01:00
Benoit Steiner
03b71c273e
Made the packetmath test compile again. A better fix would be to move the special function tests to the unsupported directory where the code now resides.
2016-07-11 13:50:24 -07:00
Gael Guennebaud
35df3a32eb
Disabled GCC6's ignored-attributes warning in packetmath unit test.
2016-05-26 17:42:58 +02:00
Christoph Hertzberg
718521d5cf
Silenced several double-promotion warnings
2016-05-22 18:17:04 +02:00
Gael Guennebaud
1395056fc0
Make EIGEN_HAS_C99_MATH user configurable
2016-05-20 14:58:19 +02:00
Benoit Steiner
bf185c3c28
Extended the tests for ptanh
2016-05-10 16:21:43 -07:00
Christoph Hertzberg
dacb469bc9
Enable and fix -Wdouble-conversion warnings
2016-05-05 13:35:45 +02:00
Benoit Steiner
3b8da4be5a
Extended the packetmath test to cover all the alignments made possible by avx512 instructions.
2016-04-29 14:13:43 -07:00
Benoit Steiner
d6e596174d
Pull latest updates from upstream
2016-04-11 17:20:17 -07:00
Konstantinos Margaritis
644d0f91d2
enable all tests again
2016-04-05 05:59:54 -04:00
Konstantinos Margaritis
01e7298fe6
actually include ZVector files, passes most basic tests (float still fails)
2016-03-28 10:58:02 -04:00
Konstantinos Margaritis
ed6b9d08f1
some primitives ported, but missing intrinsics and crash with asm() are a problem
2016-03-27 18:47:49 -04:00
Benoit Steiner
1dfaafe28a
Added a regression test for tanh
2016-02-10 17:41:47 -08:00
Benoit Steiner
d93b71a301
Updated the packetmath test to call predux_half instead of predux4
2016-02-01 15:18:33 -08:00
Gael Guennebaud
ca39b1546e
Merged in ebrevdo/eigen (pull request PR-148)
...
Add special functions to eigen: lgamma, erf, erfc.
2015-12-11 11:52:09 +01:00
Benoit Steiner
6acf2bd472
Fixed compilation error triggered by MSVC 2008
2015-12-10 17:17:42 -08:00
Benoit Steiner
48877a6933
Only implement the lgamma, erf, and erfc functions when using a compiler compliant with the C99 specification.
2015-12-10 13:09:49 -08:00
Gael Guennebaud
46d2f6cd78
Workaround gcc issue with -O3 and the i387 FPU.
2015-12-10 21:33:43 +01:00
Benoit Steiner
b630d10b62
Only disable the erf, erfc, and lgamma tests for older versions of c++.
2015-12-07 17:08:08 -08:00
Benoit Steiner
73b68d4370
Fixed a couple of typos
...
Cleaned up the code a bit.
2015-12-07 16:38:48 -08:00
Eugene Brevdo
fa4f933c0f
Add special functions to Eigen: lgamma, erf, erfc.
...
Includes CUDA support and unit tests.
2015-12-07 15:24:49 -08:00
Gael Guennebaud
90323f1751
Fix AVX round/ceil/floor, and fix respective unit test
2015-11-04 22:15:57 +01:00
Alexandre Avenel
d46e2c10a6
Add round, ceil and floor for SSE4.1/AVX (Bug #70 )
2015-11-01 10:49:27 +01:00
Gael Guennebaud
ea9749fd6c
Fix packetmath unit test for pdiv not being always defined
2015-10-13 09:53:46 +02:00
Gael Guennebaud
14458ec0a0
Fix packetmath unit test for exp and log
2015-09-02 15:47:58 +02:00
Gael Guennebaud
dc2c103b3b
merge
2015-08-16 14:22:02 +02:00
Christoph Hertzberg
d6a4805fdf
Protect further isnan/isfinite/isinf calls
2015-08-16 14:00:02 +02:00
Gael Guennebaud
6245591349
Fix prototype of plset and generalize linspace functor.
2015-08-07 19:27:59 +02:00
Gael Guennebaud
aec4814370
Many files were missing in previous changeset.
2015-07-29 11:11:23 +02:00
Benoit Steiner
3625734bc8
Moved some utilities to TensorMeta.h to make it easier to reuse them accross several tensor operations.
...
Created the TensorDimensionList class to encode the list of all the dimensions of a tensor of rank n. This could be done using TensorIndexList, however TensorIndexList require cxx11 which isn't yet supported as widely as we'd like.
2015-06-29 10:49:55 -07:00
Gael Guennebaud
2a33075aeb
std::isnan is c++11 only
2015-06-24 10:29:17 +02:00
Benoit Steiner
6441befbb3
Added more checks to test the correctness of the pexp implementation
2015-06-23 19:12:46 -07:00
Gael Guennebaud
b0d5aaafcc
Rename free functions isFinite, isInf, isNaN to be compatible with c++11
2015-06-10 16:17:09 +02:00
Deanna Hood
8878e1c1de
Remove ambiguity with recent numext methods isNaN and isInf
2015-03-17 22:39:51 +10:00
Benoit Steiner
c739102ef9
Pulled the latest changes from the trunk
2015-02-06 05:25:03 -08:00
Christoph Hertzberg
84aaa03182
Addendum to bug #859 : pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN.
...
psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0
2014-10-20 13:13:43 +02:00
Gael Guennebaud
aa5f79206f
Fix bug #859 : pexp(NaN) returned Inf instead of NaN
2014-10-20 11:38:51 +02:00
Konstantinos Margaritis
7ff266e3ce
Initial VSX commit
2014-08-29 20:03:49 +00:00
Benoit Steiner
16047c8d4a
Pulled in the latest changes from the Eigen trunk
2014-08-13 22:25:29 -07:00
Gael Guennebaud
62f948c56a
Generalize unit testing of pscatter
2014-07-09 16:01:24 +02:00
Benoit Steiner
4304c73542
Pulled latest updates from the Eigen main trunk.
2014-06-10 10:23:32 -07:00
Benoit Steiner
29aebf96e6
Created the pblend packet primitive and implemented it using SSE and AVX instructions.
2014-06-06 20:18:44 -07:00
Christoph Hertzberg
56de8d3816
Fixed unused variable warnings
2014-05-05 15:03:29 +02:00
Gael Guennebaud
7388fdf560
pbroadcast4/2 assume aligned memory
2014-04-25 02:46:22 -07:00
Gael Guennebaud
ae4d9434e2
Add unit test for pbroadcast4/2
2014-04-25 11:21:18 +02:00
Gael Guennebaud
3d8d0f6269
Enable vectorization of pack_rhs with a column-major RHS.
...
Rename and generalize Kernel<*> to PacketBlock<*,N>.
2014-04-25 10:56:18 +02:00
Gael Guennebaud
45a4aad572
add unit tests for ploadquad and predux4, and split packetmath unit test wrt real/complex
2014-04-17 16:27:22 +02:00
Benoit Steiner
39bfbd43f0
Properly align the input data to prevent false failures of the packetmath.cpp test.
2014-03-28 12:00:08 -07:00
Benoit Steiner
8a94cb3edd
Implemented the SSE version of the gather and scatter packet primitives.
2014-03-27 18:29:01 -07:00
Benoit Steiner
ee86679096
Introduced pscatter/pgather packet primitives. They will be used to optimize the loop peeling code of the block-panel matrix multiplication kernel.
2014-03-27 16:03:03 -07:00
Benoit Steiner
a419cea4a0
Created the ptranspose packet primitive that can transpose an array of N packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions.
...
Implemented the primitive using SSE instructions.
2014-03-26 19:03:07 -07:00
Benoit Steiner
7ed9441ea4
Reverted the definition of the EIGEN_ALIGN to its former meaning (i.e. a boolean)
...
Created a new EIGEN_ALIGN_BYTES define to encode how the data should be aligned
Fixed a few remaining alignment issues exposed when the Eigen code is compiled with avx enabled.
Created a new EIGEN_ALIGN_DEFAULT define, which is set to the minimum alignment value required for the chosen instruction set. Use this value instead of EIGEN_ALIGN32 to preserve the existing alignment on SSE/Altivec/Neon.
2014-02-18 18:06:44 -08:00
Benoit Steiner
64a85800bd
Added support for AVX to Eigen.
2014-01-29 11:43:05 -08:00
Gael Guennebaud
3352b8d873
Extend the magnitude range of tested numbers in packet math unit tests
2013-06-13 18:12:58 +02:00
Gael Guennebaud
62670c83a0
Fix bug #314 : move remaining math functions from internal to numext namespace
2013-06-10 23:40:56 +02:00
Gael Guennebaud
f7e52d22d4
Fix missuse of unitialized values in unit tests
2013-04-10 09:46:16 +02:00
Gael Guennebaud
d63712163c
Add SSE4 min/max for integers
2013-03-20 18:28:40 +01:00
Gael Guennebaud
8745da14d8
Fix SSE plog<float> to return -INF on 0
2013-02-14 23:34:05 +01:00
Gael Guennebaud
a76fbbf397
Fix bug #314 :
...
- remove most of the metaprogramming kung fu in MathFunctions.h (only keep functions that differs from the std)
- remove the overloads for array expression that were in the std namespace
2012-11-06 15:25:50 +01:00
Benoit Jacob
69124cfca2
Automatic relicensing to MPL2 using Keirs script. Manual fixup follows.
2012-07-13 14:42:47 -04:00
Gael Guennebaud
42e2578ef9
the min/max macros to detect unprotected min/max were undefined by some std header,
...
so let's declare them after and do the respective fixes ;)
2011-08-19 14:18:05 +02:00
Gael Guennebaud
8170ef0b2d
add unit test for plset
2011-05-18 21:11:03 +02:00
Gael Guennebaud
4bfe38eda2
extend testing of ploaddup
2011-02-24 00:22:10 +03:00
Gael Guennebaud
0dfea7fce4
improve packetmath unit test
2011-02-23 21:24:26 +03:00
Gael Guennebaud
955c099eb5
implement ploaddup for altivec and add respective unit test
2011-02-23 18:20:55 +03:00
Gael Guennebaud
a00aaf7f7e
fix overflow in packetmath unit test
2011-02-23 17:57:18 +03:00
Gael Guennebaud
59eeb67187
add unit test for pcplxflip
2011-02-23 14:20:33 +01:00
Gael Guennebaud
aea630a98a
factorize implementation of standard real unary math functions, and add acos, asin
2011-02-17 17:37:11 +01:00
Hauke Heibel
7bc8e3ac09
Initial fixes for bug #85 .
...
Renamed meta_{true|false} to {true|false}_type, meta_if to conditional, is_same_type to is_same, un{ref|pointer|const} to remove_{reference|pointer|const} and makeconst to add_const.
Changed boolean type 'ret' member to 'value'.
Changed 'ret' members refering to types to 'type'.
Adapted all code occurences.
2010-10-25 22:13:49 +02:00
Benoit Jacob
4716040703
bug #86 : use internal:: namespace instead of ei_ prefix
2010-10-25 10:15:22 -04:00
Gael Guennebaud
3f532edc6d
update unit test for new API
2010-07-15 08:38:31 +02:00
Gael Guennebaud
2dba4b7ce7
add a unit test for conj_helper and ei_pconj
2010-07-06 20:54:14 +02:00
Gael Guennebaud
e1eccfad3f
add intitial support for the vectorization of complex<float>
2010-07-05 16:18:09 +02:00
Gael Guennebaud
6249d60715
improve packetmath unit test for sum reductions
2010-07-05 10:54:24 +02:00
Gael Guennebaud
28e64b0da3
email change
2010-06-24 23:21:58 +02:00
Konstantinos Margaritis
112c550b4a
Added initial NEON support, most tests pass however we had to use some hackish workarounds
...
as gcc on ARM (both CodeSourcery 4.4.1 used and experimental 4.5) fail to
ensure proper alignment with __attribute__((aligned(16))). This has to be
fixed upstream to remove the workarounds.
2010-03-03 11:25:41 -06:00
Benoit Jacob
2840ac7e94
big huge changes, so i dont remember everything.
...
* renaming, e.g. LU ---> FullPivLU
* split tests framework: more robust, e.g. dont generate empty tests if a number is skipped
* make all remaining tests use that splitting, as needed.
* Fix 4x4 inversion (see stable branch)
* Transform::inverse() and geo_transform test : adapt to new inverse() API, it was also trying to instantiate inverse() for 3x4 matrices.
* CMakeLists: more robust regexp to parse the version number
* misc fixes in unit tests
2009-10-28 18:19:29 -04:00
Benoit Jacob
d41577819b
we were already aligning to 16 byte boundary fixed-size objects that are multiple of 16 bytes;
...
now we also align to 8byte boundary fixed-size objects that are multiple of 8 bytes.
That's only useful for now for double, not e.g. for Vector2f, but that didn't seem to hurt. Am I missing something? Do you prefer that we don't align Vector2f at all?
Also, improvements in test_unalignedassert.
2009-10-05 10:11:11 -04:00
Benoit Jacob
6347b1db5b
remove sentence "Eigen itself is part of the KDE project."
...
it never made very precise sense. but now does it still make any?
2009-05-22 20:25:33 +02:00
Gael Guennebaud
49fc1e3e84
add vectorization of sqrt for float
2009-03-27 14:41:46 +00:00
Gael Guennebaud
17860e578c
add SSE2 versions of sin, cos, log, exp using code from Julien
...
Pommier. They are for float only, and they return exactly the same
result as the standard versions in about 90% of the cases. Otherwise the max error
is below 1e-7. However, for very large values (>1e3) the accuracy of sin and cos
slighlty decrease. They are about 3 or 4 times faster than 4 calls to their respective
standard versions. So, is it ok to enable them by default in their respective functors ?
2009-03-25 12:26:13 +00:00
Gael Guennebaud
fbf415c547
add vectorization of unary operator-() (the AltiVec version is probably
...
broken)
2009-03-20 10:03:24 +00:00
Gael Guennebaud
3f80c68be5
add the vectorization of abs
2009-03-09 18:40:09 +00:00
Gael Guennebaud
0be89a4796
big addons:
...
* add Homogeneous expression for vector and set of vectors (aka matrix)
=> the next step will be to overload operator*
* add homogeneous normalization (again for vector and set of vectors)
* add a Replicate expression (with uni-directional replication
facilities)
=> for all of them I'll add examples once we agree on the API
* fix gcc-4.4 warnings
* rename reverse.cpp array_reverse.cpp
2009-03-05 10:25:22 +00:00
Gael Guennebaud
51c991af45
* exit Sum.h, exit Prod.h, welcome vectorization of redux() !
...
* add vectorization for minCoeff and maxCoeff
2009-02-12 15:18:59 +00:00
Gael Guennebaud
cbbc6d940b
* add ei_predux_mul internal function
...
* apply Ricard Marxer's prod() patch with fixes for the vectorized path
2009-02-10 18:06:05 +00:00
Gael Guennebaud
f5d96df800
Add vectorization of Reverse (was more tricky than I thought) and
...
simplify the index based functions
2009-02-06 12:40:38 +00:00