Gael Guennebaud
9777a5ca60
Various minor fixes in BTL
2014-04-17 21:01:45 +02:00
Gael Guennebaud
9746396d1b
Optimize AVX pset1 for complexes and ploaddup
2014-04-17 20:51:04 +02:00
Benjamin Chretien
e5d0cb54a5
Fix typo in Reductions tutorial.
2014-04-17 18:49:23 +02:00
Gael Guennebaud
1dd015fea6
Reduce block sizes in unit tests.
2014-04-17 16:27:58 +02:00
Gael Guennebaud
45a4aad572
add unit tests for ploadquad and predux4, and split packetmath unit test wrt real/complex
2014-04-17 16:27:22 +02:00
Gael Guennebaud
e1d461352e
Extend mixingtype unit test to check transposed cases.
2014-04-17 16:26:35 +02:00
Gael Guennebaud
11fbdcbc38
Fix and optimize mixed products
2014-04-17 16:04:30 +02:00
Gael Guennebaud
0fa8290366
Optimize ploaddup for AVX
2014-04-17 16:02:27 +02:00
Gael Guennebaud
d936ddc3d1
Fallback to lazy products for very small ones.
2014-04-16 23:15:42 +02:00
Gael Guennebaud
de8336a9bc
Enable alloca on MAC OSX
2014-04-16 23:14:58 +02:00
Gael Guennebaud
d5a795f673
New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speeup on Haswell.
...
This changeset also introduce new vector functions: ploadquad and predux4.
2014-04-16 17:05:11 +02:00
Mark Borgerding
e0dbb68c2f
Check IMKL version for compatibility with Eigen
2014-04-15 13:57:03 -04:00
Gael Guennebaud
20c840be15
Merged in benoitsteiner/eigen-fixes/nvcc_fixes (pull request PR-56)
...
Fixed a typo in CXX11Meta.h
2014-04-15 10:38:25 +02:00
Benoit Steiner
1afd50e0f3
Fixed a typo in CXX11Meta.h
2014-04-14 14:26:30 -07:00
Gael Guennebaud
3c66bb136b
bug #793 : detect NaN and INF in EigenSolver instead of aborting with an assert.
2014-04-14 22:00:27 +02:00
Gael Guennebaud
7098e6d976
Add isfinite overload for complexes.
2014-04-14 21:57:49 +02:00
Benoit Steiner
feaf7c7e6d
Optimized SSE unaligned loads and stores when compiling a 64bit target with a recent version of gcc (ie gcc 4.8).
2014-04-14 10:44:17 -07:00
Gael Guennebaud
d567e3b893
Merged in benoitsteiner/eigen-fixes (pull request PR-55)
...
CUDA fixes
2014-04-14 14:33:50 +02:00
Gael Guennebaud
148acf8e4f
bug #790 : fix overflow in real_2x2_jacobi_svd
2014-04-14 13:52:16 +02:00
Gael Guennebaud
0587db8bf5
bug #793 : fix overflow in EigenSolver and add respective regression unit test
2014-04-14 11:43:08 +02:00
Benoit Steiner
7903d3f27b
Updated the compiler flags to enable nvcc to work with clang.
2014-04-12 23:39:37 -07:00
Benoit Steiner
a803ff18a9
Fixed a typo in cuda_basic.cu
2014-04-12 20:24:05 -07:00
Freddie Witherden
91288e9bf9
Add include LevenbergMarquardt in CMakeLists.txt.
...
This fixes bug #768 .
2014-04-12 12:53:09 +01:00
Jitse Niesen
fbd5eac7cf
Merged in benoitsteiner/eigen-fixes/nvcc_fixes (pull request PR-53)
...
Silenced a compilation warning produced by nvcc.
2014-04-11 14:16:08 +01:00
Benoit Steiner
1b333c89c9
Updated my previous fix to avoid introducing a compilation warning on ARM platforms.
2014-04-10 17:43:13 -07:00
Benoit Steiner
a1fcf599fa
Silenced a compilation warning produced by nvcc.
2014-04-10 11:19:37 -07:00
Jitse Niesen
a91a7a1964
doc: Add references to Cholesky methods in SelfAdjointView.
2014-04-07 14:14:48 +01:00
Benoit Steiner
3b2321e3ab
Updated the geo_parametrizedline_2 test for AVX.
2014-04-04 17:08:47 -07:00
Benoit Steiner
b446ff037e
Deleted some dead code.
2014-04-04 14:12:24 -07:00
Jitse Niesen
5afcb4965c
Remove out-dated comment in cholesky test.
2014-04-04 16:48:13 +01:00
Christoph Hertzberg
096af59799
Fix bug #784 : Assert if assigning a product to a triangularView does not match the size.
2014-04-04 17:48:37 +02:00
Benoit Steiner
8044b00a7f
bug #782 : Workaround for gcc <= 4.4 compilation error on the NEON PacketMath code.
2014-04-03 23:41:47 +02:00
Benoit Steiner
aecc78325a
Pulled the latest updates from the eigen trunk.
2014-04-01 22:07:05 -07:00
Christoph Hertzberg
1cb8de1250
Make some actual verifications inside the autodiff unit test
2014-04-01 17:44:48 +02:00
Florian George
56c4851323
Fixed typo: symmretric -> symmetric
2014-04-01 15:52:25 +02:00
Gael Guennebaud
ceae5b4145
Fix lapack build
2014-04-01 11:52:23 +02:00
Gael Guennebaud
ec65e6648c
bug #775 : propagate generator when workingaround cmake bug #9220
2014-04-01 11:45:43 +02:00
Gael Guennebaud
d992634fbc
Fix bug #776 : it seems that mingw does not support weak linking
2014-04-01 11:31:21 +02:00
Benoit Steiner
5e8622477b
Rename the vector() factories defined in blas/common.h into make_vector() to prevent a possible name conflict with std::vector.
2014-04-01 11:23:28 +02:00
Gael Guennebaud
1221dd90aa
Fix no newline at end of file warning
2014-04-01 11:21:14 +02:00
Gael Guennebaud
93870d95b7
BTL: add blaze
2014-03-31 10:59:55 +02:00
Gael Guennebaud
f603823ef3
BTL: fix warnings and extend to 5k matrices, update GotoBlas to OpenBlas, etc.
2014-03-31 10:58:30 +02:00
Gael Guennebaud
8d0441052e
Finally, prefetching seems to help getting more stable performance
2014-03-31 10:42:19 +02:00
Gael Guennebaud
82c8163067
Enable repetition in mixing type unit test
2014-03-31 10:41:40 +02:00
Gael Guennebaud
1c0728043a
Workaround alignment warnings
2014-03-30 22:43:47 +02:00
Gael Guennebaud
e497a27ddc
Optimize gebp kernel:
...
1 - increase peeling level along the depth dimention (+5% for large matrices, i.e., >1000)
2 - improve pipelining when dealing with latest rows of the lhs
2014-03-30 21:57:05 +02:00
Benoit Steiner
ad59ade116
Vectorized the loop peeling of the inner loop of the block-panel matrix multiplication code. This speeds up the multiplication of matrices which size is not a multiple of the packet size.
2014-03-28 12:11:23 -07:00
Benoit Steiner
39bfbd43f0
Properly align the input data to prevent false failures of the packetmath.cpp test.
2014-03-28 12:00:08 -07:00
Gael Guennebaud
10aa14592a
Add a mechanism to recursively access to half-size packet types
2014-03-28 10:18:04 +01:00
Gael Guennebaud
8d2bb2c20d
merge with default branch
2014-03-28 09:24:18 +01:00