Gael Guennebaud
b08c26aefa
merge
2010-07-15 20:41:33 +02:00
Gael Guennebaud
84fdbded4d
add support for strictly triangular matrix in trmm though it is not really useful
2010-07-15 20:39:20 +02:00
Gael Guennebaud
bfbe61454e
merge
2010-07-15 09:54:31 +02:00
Gael Guennebaud
cf9edd9958
fix compilation for non trivial types
2010-07-14 23:31:38 +02:00
Gael Guennebaud
90d6fc0e28
fix ei_aligned_delete for null pointers and non trivial dtors
2010-07-14 22:49:34 +02:00
Christoph Hertzberg
6ba5d2c90c
Implemented SSE optimized double-precision Quaternion multiplication
2010-07-12 23:30:47 +02:00
Gael Guennebaud
850c6d8a2b
fix unused warning
2010-07-11 10:58:58 +02:00
Gael Guennebaud
e5bc9526f1
* generalize rowmajor by vector
* fix weird compilation error when constructing a matrix with a row by matrix product
2010-07-10 22:53:27 +02:00
Gael Guennebaud
c4ef69b5bd
fix compilation: make the check_coordinates* functions const
2010-07-10 22:37:16 +02:00
Benoit Jacob
6dcd373b9d
let ei_pset1 use _mm_loaddup_pd. Not a significant speed improvement, but also not a speed regression, and replaces 3 instructions by 1 single instruction.
2010-07-09 18:51:17 -04:00
Konstantinos Margaritis
6ad3f1ab1f
Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float>
minor fix in AltiVec Complex.h
2010-07-10 00:09:29 +03:00
Gael Guennebaud
96f9015807
disable MSVC optimization when the underlying compiler is ICC
2010-07-09 19:33:43 +02:00
Gael Guennebaud
b2effa2b2c
move ei_conj_if to a more appropriate file
2010-07-09 18:05:57 +02:00
Konstantinos Margaritis
642cc27eb1
forgot to commit ei_p4f_FORWARD;
2010-07-09 18:08:18 +03:00
Konstantinos Margaritis
f6bd508351
forgot to add the Complex.h include for AltiVec.
2010-07-09 17:56:53 +03:00
Konstantinos Margaritis
d9e134c73c
Altivec port of Complex.h.
Note: For some reason g++ 4.4 is >200% slower than g++ 4.3 on altivec code.
The same benchmark (bench_gemm) was tested, on the same hardware/OS (G4/Debian testing),
with same CFLAGS. With some code reorganizing I managed to get some minor gain
on 4.4, but I just could not reach 4.3 speed. This is most likely a bug, but I'm waiting
to see if it's fixed on 4.5. I'll look into this a bit more.
2010-07-09 17:54:41 +03:00
Gael Guennebaud
2066ed91de
enabling aligned loads/store for complex<double> is much more tricky, so the temporary fix is to always perform unaligned load/store
2010-07-07 22:50:19 +02:00
Gael Guennebaud
861962c55f
sync
2010-07-07 16:44:05 +02:00
Gael Guennebaud
a2415388ef
optimized conjugate products for SSE3
2010-07-07 16:37:20 +02:00
Gael Guennebaud
65257f6b29
optimize for SSE3 => significant speed up !!
2010-07-07 15:34:46 +02:00
Gael Guennebaud
dd18b22f0b
optimize pmul for complex<double>
2010-07-07 15:29:04 +02:00
Gael Guennebaud
845994f18f
optimize gemv for complex<double> and fix gcc alignment issue in 32bits
2010-07-07 15:28:41 +02:00
Gael Guennebaud
e07c0f6bb5
cleaning
2010-07-07 11:41:29 +02:00
Gael Guennebaud
b0896382a3
s/IsVectorized/Vectorizable
2010-07-07 11:10:46 +02:00
Gael Guennebaud
74cf12cbe0
add a compile time error if someone calls packet on Diagonal (instead of an infinite runtime loop)
2010-07-07 11:07:12 +02:00
Gael Guennebaud
d5e0efaf69
fix vectorization rule of diagonal-product
2010-07-07 11:06:31 +02:00
Gael Guennebaud
c851044eae
fix row cwise-prod column in coeff based products...
I really don't know why this worked so far...
2010-07-07 10:52:59 +02:00
Gael Guennebaud
e38fc9692d
add a conj_product functor and optimize dot products
2010-07-07 10:00:08 +02:00
Gael Guennebaud
f8d3b4c060
fix mixing types in DiagonalProduct
2010-07-07 09:43:29 +02:00
Gael Guennebaud
bfa606d16f
* add a IsVectorized mechanism (instead of packet-size>1...)
* vectorize complex<double>
2010-07-06 23:36:00 +02:00
Gael Guennebaud
bc57c68cf5
bug fix: forgot to conjugate the scalar factor when needed
2010-07-06 20:53:48 +02:00
Gael Guennebaud
e04c3f2cc0
reduce code generation and minor speed up
2010-07-06 19:15:02 +02:00
Gael Guennebaud
d6454788d9
add support for vectorized conjugated products
2010-07-06 19:10:24 +02:00
Jitse Niesen
49747fa4a9
Various documentation improvements.
* Add short documentation for Array class
* Put all classes explicitly in Core module (where applicable)
* Section on Modules in Quick Reference Guide
* Put Page 7 after Page 6 in Contents :)
2010-07-06 13:10:08 +01:00
Jens Mueller
d849bc4401
Avoid calling resizeLike, if EIGEN_NO_AUTOMATIC_RESIZING is defined
2010-07-06 10:11:18 +02:00
Gael Guennebaud
7d23e7f9f1
indentation
2010-07-06 11:02:01 +02:00
Gael Guennebaud
c69a226192
* extend the Has* packet traits and make all functors use them
* extend the packing routines to support conjugation
2010-07-05 23:27:54 +02:00
Gael Guennebaud
8db60afb47
oops I did not see that
2010-07-05 21:27:15 +02:00
Gael Guennebaud
e1eccfad3f
add initial support for the vectorization of complex<float>
2010-07-05 16:18:09 +02:00
Konstantinos Margaritis
1505221263
add check for non-x86 platforms; we get a compile error on arm/powerpc without the check
(there is no known method yet to get cpuid without resorting to the kernel /sys interface)
2010-07-05 16:44:41 +03:00
Gael Guennebaud
efb79600b9
fix warning "type qualifiers ignored on function return type" for long long scalar types
2010-07-05 11:23:05 +02:00
Gael Guennebaud
fffaa58ac2
fix unaligned workspace in sybb
2010-07-05 10:12:30 +02:00
Gael Guennebaud
c201aabf3e
comment the EIGEN_EMPTY_STRUCT_CTOR workaround for gcc 4.3
2010-07-04 15:26:58 +02:00
Gael Guennebaud
11329f49f4
suppress warning and add a fixme about this transpose argument
2010-07-03 19:39:29 +02:00
Gael Guennebaud
be1fdbf3af
fix openmp for row major destination
2010-07-03 12:52:39 +02:00
Gael Guennebaud
b4ef323e90
fix bug with openmp
2010-07-03 12:20:13 +02:00
Thomas Capricelli
b212227418
shut one more warning
2010-07-01 04:27:45 +02:00
Thomas Capricelli
1399fd9cbd
fix compilation issue with clang
2010-07-01 04:26:07 +02:00
Thomas Capricelli
d414ab51f0
oops... fix it better
2010-07-01 03:39:19 +02:00
Thomas Capricelli
2874101b62
fix compilation with icc. Anyway, the use of an enum instead of a 'const bool' is more consistent with the code around.
2010-07-01 03:23:47 +02:00