Gael Guennebaud
09a16ba42f
bug #1412 : fix compilation with nvcc+MSVC
2018-01-17 23:13:16 +01:00
Yan Facai (颜发才)
42a8334668
ENH: exp supports complex type for cuda
2018-01-04 16:01:01 +08:00
Eugene Chereshnev
f558ad2955
Fix incorrect ldvt in LAPACKE call from JacobiSVD
2018-01-03 12:55:52 -08:00
Gael Guennebaud
73629f8b68
Fix gcc7 warning
2018-01-09 08:59:27 +01:00
nluehr
f9bdcea022
For cuda 9.1 replace math_functions.hpp with cuda_runtime.h
2017-12-18 16:51:15 -08:00
Gael Guennebaud
06bf1047f9
Fix compilation of stableNorm with some expressions as input
2017-12-15 15:15:37 +01:00
Gael Guennebaud
546ab97d76
Add possibility to overwrite EIGEN_STRONG_INLINE.
2017-12-14 14:47:38 +01:00
Gael Guennebaud
9c3aed9d48
Fix packet and alignment propagation logic of Block<Xpr> expressions. In particular, (A+B).col(j) lost vectorisation.
2017-12-14 14:24:33 +01:00
nluehr
aefd5fd5c4
Replace __float2half_rn with __float2half
...
The latter provides a consistent definition for CUDA 8.0 and 9.0.
2017-11-28 10:15:46 -08:00
Gael Guennebaud
d0b028e173
clarify Pastix requirements
2017-11-27 22:11:57 +01:00
Gael Guennebaud
3587e481fb
silence MSVC warning
2017-11-27 21:53:02 +01:00
nluehr
dd6de618c3
Fix incorrect integer cast in predux<half2>().
...
Bug corrupts results on Maxwell and earlier GPU architectures.
2017-11-21 10:47:00 -08:00
Gael Guennebaud
672bdc126b
bug #1479 : fix failure detection in LDLT
2017-11-16 17:55:24 +01:00
Gael Guennebaud
7cc503f9f5
bug #1485 : fix linking issue of non template functions
2017-11-15 21:33:37 +01:00
Gael Guennebaud
00bc67c374
Move KLU support to official
2017-11-10 14:11:22 +01:00
Gael Guennebaud
1495b98a8e
Merged in spraetor/eigen (pull request PR-305)
...
Issue with mpreal and std::numeric_limits::digits
2017-11-10 10:28:54 +00:00
Gael Guennebaud
d306b96fb7
Merged in carpent/eigen (pull request PR-342)
...
Use col method for column-major matrix
2017-11-10 10:09:53 +00:00
Gael Guennebaud
f86bb89d39
Add EIGEN_MKL_NO_DIRECT_CALL option
2017-11-09 11:07:45 +01:00
Gael Guennebaud
5fa79f96b8
Patch from Konstantin Arturov to enable MKL's direct call by default
2017-11-09 10:58:38 +01:00
Gael Guennebaud
4c03b3511e
Fix issue with boost::multiprec in previous commit
2017-11-08 23:28:01 +01:00
Gael Guennebaud
e9d2888e74
Improve debugging tests and output in BDCSVD
2017-11-08 10:26:03 +01:00
Gael Guennebaud
e8468ea91b
Fix overflow issues in BDCSVD
2017-11-08 10:24:28 +01:00
Christoph Hertzberg
11ddac57e5
Merged in guillaume_michel/eigen (pull request PR-334)
...
- Add support for NEON plog PacketMath function
2017-10-23 13:22:22 +00:00
Benoit Steiner
f16ba2a630
Merged in LaFeuille/eigen-1/LaFeuille/typo-fix-alignmeent-alignment-1505889397887 (pull request PR-335)
...
Typo fix alignmeent ->alignment
2017-10-21 01:59:55 +00:00
Henry Schreiner
9bb26eb8f1
Restore __device__
2017-10-21 00:50:38 +00:00
Henry Schreiner
4245475d22
Fixing missing inlines on device functions for newer CUDA cards
2017-10-20 03:20:13 +00:00
Justin Carpentier
a020d9b134
Use col method for column-major matrix
2017-10-17 21:51:27 +02:00
Konstantinos Margaritis
6c3475f110
remove debugging
2017-10-12 15:34:55 -04:00
Konstantinos Margaritis
df7644aec3
Merged eigen/eigen into default
2017-10-12 22:23:13 +03:00
Konstantinos Margaritis
98e52cc770
rollback 374f750ad4
2017-10-12 15:22:10 -04:00
Konstantinos Margaritis
c4ad358565
explicitly set conjugate mask
2017-10-11 11:05:29 -04:00
Konstantinos Margaritis
380d41fd76
added some extra debugging
2017-10-11 10:40:12 -04:00
Konstantinos Margaritis
d0b7b9d0d3
some Packet2cf pmul fixes
2017-10-11 10:17:22 -04:00
Konstantinos Margaritis
df173f5620
initial pexp() for 32-bit floats, commented out due to vec_cts()
2017-10-11 09:40:49 -04:00
Konstantinos Margaritis
3dcae2a27f
initial pexp() for 32-bit floats, commented out due to vec_cts()
2017-10-11 09:40:45 -04:00
Konstantinos Margaritis
c2a2246489
fix predux_mul for z14/float
2017-10-10 13:38:32 -04:00
Konstantinos Margaritis
374f750ad4
eliminate 'enumeral and non-enumeral type in conditional expression' warning
2017-10-09 16:56:30 -04:00
Konstantinos Margaritis
bc30305d29
complete z14 port
2017-10-09 16:55:10 -04:00
Gael Guennebaud
0e85a677e3
bug #1472 : fix warning
2017-09-26 10:53:33 +02:00
Gael Guennebaud
8579195169
bug #1468 (1/2) : add missing std:: to memcpy
2017-09-22 09:23:24 +02:00
Gael Guennebaud
f92567fecc
Add link to a useful example.
2017-09-20 10:22:23 +02:00
Gael Guennebaud
7ad07fc6f2
Update documentation for aligned_allocator
2017-09-20 10:22:00 +02:00
LaFeuille
7c9b07dc5c
Typo fix alignmeent ->alignment
2017-09-20 06:38:39 +00:00
Christoph Hertzberg
23f8b00bc8
clang provides __has_feature(is_enum) (but not <type_traits>) in C++03 mode
2017-09-14 19:26:03 +02:00
Christoph Hertzberg
0c9ad2f525
std::integral_constant is not C++03 compatible
2017-09-14 19:23:38 +02:00
Gael Guennebaud
6d42309f13
Fix compilation of Vector::operator()(enum) by treating enums as Index
2017-09-07 14:34:30 +02:00
Benoit Steiner
ea4e65bf41
Fixed compilation with cuda_clang.
2017-09-07 09:13:52 +00:00
Gael Guennebaud
9c353dd145
Add C++11 max_digits10 for half.
2017-09-06 10:22:47 +02:00
Gael Guennebaud
b35d1ce4a5
Implement true compile-time "if" for apply_rotation_in_the_plane. This fixes a compilation issue for vectorized real type with missing vectorization for complexes, e.g. AVX512.
2017-09-06 10:02:49 +02:00
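A minimal sketch of the compile-time "if" idea from the commit above, assuming a hypothetical trait `has_fake_packet` and a stand-in primitive `fake_pmul` that exists only for real scalars. None of these names are Eigen internals, and the rotation formula is simplified (no conjugation). Because the branch is chosen by template specialization, the "vectorized" body is never instantiated for complex scalars, so its missing overloads cannot break compilation.

```cpp
#include <complex>

// Stand-in for a SIMD primitive that only exists for real scalars (hypothetical).
inline float  fake_pmul(float a, float b)   { return a * b; }
inline double fake_pmul(double a, double b) { return a * b; }

// Hypothetical trait: real scalars have a "packet" path, complexes do not.
template <typename T> struct has_fake_packet                   { static const bool value = true;  };
template <typename T> struct has_fake_packet<std::complex<T> > { static const bool value = false; };

// Primary template: scalar fallback, valid for any Scalar.
template <typename Scalar, bool Vectorizable>
struct rotation_impl {
  static int run(Scalar* x, Scalar* y, int n, Scalar c, Scalar s) {
    for (int i = 0; i < n; ++i) {
      const Scalar xi = x[i], yi = y[i];
      x[i] = c * xi + s * yi;
      y[i] = c * yi - s * xi;
    }
    return 0;  // 0 = scalar path taken
  }
};

// Specialization: uses fake_pmul, which would not compile for std::complex.
// It is only instantiated when Vectorizable is true.
template <typename Scalar>
struct rotation_impl<Scalar, true> {
  static int run(Scalar* x, Scalar* y, int n, Scalar c, Scalar s) {
    for (int i = 0; i < n; ++i) {
      const Scalar xi = x[i], yi = y[i];
      x[i] = fake_pmul(c, xi) + fake_pmul(s, yi);
      y[i] = fake_pmul(c, yi) - fake_pmul(s, xi);
    }
    return 1;  // 1 = "vectorized" path taken
  }
};

// Dispatcher: the branch is selected at compile time, not by a runtime if.
template <typename Scalar>
int apply_rotation_sketch(Scalar* x, Scalar* y, int n, Scalar c, Scalar s) {
  return rotation_impl<Scalar, has_fake_packet<Scalar>::value>::run(x, y, n, c, s);
}
```

With a plain runtime `if`, both branches would have to compile for every scalar type, which is the failure mode the commit message describes.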
Gael Guennebaud
80142362ac
Fix mixing types in sparse matrix products.
2017-09-02 22:50:20 +02:00
Benoit Steiner
a4089991eb
Added support for CUDA 9.0.
2017-08-31 02:49:39 +00:00
Konstantinos Margaritis
1affe3d8df
Merged eigen/eigen into default
2017-08-24 12:24:01 +03:00
Gael Guennebaud
21633e585b
bug #1462 : remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER
2017-08-24 11:06:47 +02:00
Gael Guennebaud
12249849b5
Make the threshold from gemm to coeff-based-product configurable, and add some explanations.
2017-08-24 10:43:21 +02:00
Gael Guennebaud
39864ebe1e
bug #336 : improve doc for PlainObjectBase::Map
2017-08-22 17:18:43 +02:00
Gael Guennebaud
600e52fc7f
Add missing scalar conversion
2017-08-22 17:06:57 +02:00
Gael Guennebaud
9deee79922
bug #1457 : add setUnit() methods for consistency.
2017-08-22 16:48:07 +02:00
Gael Guennebaud
bc91a2df8b
bug #1461 : fix compilation of Map<const Quaternion>::x()
2017-08-22 15:10:42 +02:00
Gael Guennebaud
fc39d5954b
Merged in dtrebbien/eigen/patch-1 (pull request PR-312)
...
Work around a compilation error seen with nvcc V8.0.61
2017-08-22 12:17:37 +00:00
Gael Guennebaud
b223918ea9
Doc: warn about constness in LLT::solveInPlace
2017-08-22 14:12:47 +02:00
Konstantinos Margaritis
e1e71ca4e4
initial support for z14
2017-08-06 19:53:18 -04:00
Benoit Steiner
c5a241ab9b
Merged in benoitsteiner/opencl (pull request PR-323)
...
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
c92faf9d84
Merged in mehdi_goli/upstr_benoit/HiperbolicOP (pull request PR-13)
...
Adding hyperbolic operations for sycl.
* Adding hyperbolic operations.
* Adding the hyperbolic operations for CPU as well.
2017-07-06 05:05:57 +00:00
Gael Guennebaud
561f777075
Fix a gcc7 warning about bool * bool in abs2 default implementation.
2017-06-27 12:05:17 +02:00
Gael Guennebaud
b651ce0ffa
Fix a gcc7 warning: Wint-in-bool-context
2017-06-26 09:58:28 +02:00
Gael Guennebaud
24fe1de9b4
merge
2017-06-15 10:17:39 +02:00
Gael Guennebaud
b240080e64
bug #1436 : fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.
2017-06-15 10:16:30 +02:00
Benoit Steiner
3baef62b9a
Added missing __device__ qualifier
2017-06-13 12:56:55 -07:00
Benoit Steiner
449936828c
Added missing __device__ qualifier
2017-06-13 12:54:57 -07:00
Gael Guennebaud
9fbdf02059
Enable Array(EigenBase<>) ctor for compatible scalar types only. This prevents nested arrays from appearing convertible from/to simple arrays.
2017-06-12 22:30:32 +02:00
Gael Guennebaud
e43d8fe9d7
Fix compilation of streaming nested Array, i.e., cout << Array<Array<>>
2017-06-12 22:26:26 +02:00
Gael Guennebaud
d9d7bd6d62
Fix 1x1 case in Solve expression with EIGEN_DEFAULT_MATRIX_STORAGE_ORDER_OPTION==RowMajor
2017-06-12 22:25:02 +02:00
Gael Guennebaud
6dcf966558
Avoid implicit scalar conversion with accuracy loss in pow(scalar,array)
2017-06-12 16:47:22 +02:00
Gael Guennebaud
c3e2afce0d
Enable MSVC 2010 workaround from MSVC only
2017-06-09 16:25:18 +02:00
Gael Guennebaud
731c8c704d
bug #1403 : more scalar conversions fixes in BDCSVD
2017-06-09 15:45:49 +02:00
Gael Guennebaud
1bbcf19029
bug #1403 : fix implicit scalar type conversion.
2017-06-09 14:44:02 +02:00
Gael Guennebaud
ba5cab576a
bug #1405 : enable StrictlyLower/StrictlyUpper triangularView as the destination of matrix*matrix products.
2017-06-09 14:38:04 +02:00
Gael Guennebaud
90168c003d
bug #1414 : doxygen, add EigenBase to CoreModule
2017-06-09 14:01:44 +02:00
Gael Guennebaud
26f552c18d
fix compilation of Half in C++98 (issue introduced in previous commit)
2017-06-09 13:36:58 +02:00
Gael Guennebaud
1d59ca2458
Fix compilation with gcc 4.3 and ARM NEON
2017-06-09 13:20:52 +02:00
Gael Guennebaud
fb1ee04087
bug #1410 : fix lvalue propagation of Array/Matrix-Wrapper with a const nested expression.
2017-06-09 13:13:03 +02:00
Gael Guennebaud
a7be4cd1b1
Fix LeastSquareDiagonalPreconditioner for complexes (issue introduced in previous commit)
2017-06-09 11:57:53 +02:00
Gael Guennebaud
498aa95a8b
bug #1424 : add numext::abs specialization for unsigned integer types.
2017-06-09 11:53:49 +02:00
Gael Guennebaud
d588822779
Add missing std::numeric_limits specialization for half, and complete NumTraits<half>
2017-06-09 11:51:53 +02:00
Gael Guennebaud
682b2ef17e
bug #1423 : fix LSCG's Jacobi preconditioner for row-major matrices.
2017-06-08 15:06:27 +02:00
Gael Guennebaud
4bbc320468
bug #1435 : fix aliasing issue in expressions like: A = C - B*A;
2017-06-08 12:55:25 +02:00
Gael Guennebaud
f2a553fb7b
bug #1411 : fix usage of alignment information in vectorization of quaternion product and conjugate.
2017-06-07 10:10:30 +02:00
Christoph Hertzberg
e018142604
Make sure CholmodSupport works when included in multiple compilation units (issue was reported on stackoverflow.com)
2017-06-06 19:23:14 +02:00
Gael Guennebaud
8508db52ab
bug #1417 : make LinSpace compatible with std::complex
2017-06-06 17:25:56 +02:00
Abhijit Kundu
4343db84d8
updated warning number for nvcc release 8 (V8.0.61) for the stupid warning message 'calling a __host__ function from a __host__ __device__ function is not allowed'.
2017-05-01 10:36:27 -04:00
Abhijit Kundu
9bc0a35731
Fixed nested angle brackets >> issue when compiling with cuda 8
2017-04-27 03:09:03 -04:00
Gael Guennebaud
891ac03483
Fix dense * sparse-selfadjoint-view product.
2017-04-25 13:58:10 +02:00
Gael Guennebaud
d9084ac8e1
Improve mixing of complex and real in the vectorized path of apply_rotation_in_the_plane
2017-04-14 11:05:13 +02:00
Gael Guennebaud
f75dfdda7e
Fix unwanted Real to Scalar to Real conversions in column-pivoting QR.
2017-04-14 10:34:30 +02:00
Simon Praetorius
511810797e
Issue with mpreal and std::numeric_limits, i.e. digits is not a constant. Added a digits() traits in NumTraits with fallback to static constant. Specialization for mpreal added in MPRealSupport.
2017-03-24 17:45:56 +01:00
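A minimal sketch of the "digits() trait with fallback to a static constant" described in the commit above. The names `num_traits_sketch` and `RuntimePrecisionReal` are hypothetical stand-ins, not Eigen's `NumTraits` or mpreal: the generic trait forwards to `std::numeric_limits<T>::digits`, and a specialization returns a value that is only known at run time.

```cpp
#include <limits>

// Default: fall back to the compile-time constant from std::numeric_limits.
template <typename T>
struct num_traits_sketch {
  static int digits() { return std::numeric_limits<T>::digits; }
};

// Hypothetical multiprecision type whose precision is a runtime setting.
struct RuntimePrecisionReal {
  static int& default_precision_bits() {
    static int bits = 128;  // arbitrary starting value for this sketch
    return bits;
  }
};

// Specialization: digits() is a function call, so it can reflect runtime state.
template <>
struct num_traits_sketch<RuntimePrecisionReal> {
  static int digits() { return RuntimePrecisionReal::default_precision_bits(); }
};
```

Exposing digits as a function rather than a constant is what lets one interface serve both fixed-precision and runtime-precision types.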
Gael Guennebaud
aae19c70ac
update has_ReturnType to be more consistent with other has_ helpers
2017-03-17 17:33:15 +01:00
Benoit Steiner
7f31bb6822
Merged in ilya-biryukov/eigen/fix_clang_cuda_compilation (pull request PR-304)
...
Fixed compilation with cuda-clang
2017-03-15 16:48:52 +00:00
Gael Guennebaud
89fd0c3881
better check array index before using it
2017-03-15 15:18:03 +01:00
Benoit Jacob
61160a21d2
ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32.
2017-03-15 06:57:25 -04:00
Gael Guennebaud
e5156e4d25
fix typo
2017-03-07 11:25:58 +01:00
Gael Guennebaud
5694315fbb
remove UTF8 symbol
2017-03-07 10:53:47 +01:00
Gael Guennebaud
e958c2baac
remove UTF8 symbols
2017-03-07 10:47:40 +01:00
Gael Guennebaud
d967718525
do not include std header within extern C
2017-03-07 10:16:39 +01:00
Gael Guennebaud
659087b622
bug #1400 : fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY
2017-03-07 10:02:34 +01:00
Ilya Biryukov
1c03d43a5c
Fixed compilation with cuda-clang
2017-03-06 12:01:12 +01:00
Benoit Steiner
09ae0e6586
Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that:
...
* they're used consistently between the declaration and the definition of a function
* we avoid calling host only methods from host device methods.
2017-03-01 11:47:47 -08:00
Benoit Steiner
c1d87ec110
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-03-01 10:08:50 -08:00
Benoit Steiner
3a3f040baa
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-02-28 17:06:15 -08:00
Benoit Steiner
7b61944669
Made most of the packet math primitives usable within CUDA kernel when compiling with clang
2017-02-28 17:05:28 -08:00
Benoit Steiner
857adbbd52
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-02-28 16:42:00 -08:00
Benoit Steiner
c36bc2d445
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-02-28 14:58:45 -08:00
Benoit Steiner
4a7df114c8
Added missing EIGEN_DEVICE_FUNC
2017-02-28 14:00:15 -08:00
Benoit Steiner
765f4cc4b4
Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet.
2017-02-28 11:57:00 -08:00
Benoit Steiner
e993c94f07
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-02-28 09:56:45 -08:00
Benoit Steiner
33443ec2b0
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-02-28 09:50:10 -08:00
Benoit Steiner
f3e9c42876
Added missing EIGEN_DEVICE_FUNC qualifiers
2017-02-28 09:46:30 -08:00
Gael Guennebaud
4e98a7b2f0
bug #1396 : add some missing EIGEN_DEVICE_FUNC
2017-02-28 09:47:38 +01:00
Benoit Steiner
889c606f8f
Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops
2017-02-27 17:17:47 -08:00
Benoit Steiner
193939d6aa
Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods.
2017-02-27 17:11:47 -08:00
Benoit Steiner
ed4dc9d01a
Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel
2017-02-27 16:57:01 -08:00
Benoit Steiner
b1fc7c9a09
Added missing EIGEN_DEVICE_FUNC qualifiers.
2017-02-27 16:48:30 -08:00
Benoit Steiner
554116bec1
Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase
2017-02-27 16:45:31 -08:00
Benoit Steiner
34d9fce93b
Avoid unnecessary float to double conversions.
2017-02-27 16:33:33 -08:00
Gael Guennebaud
76687f385c
bug #1394 : fix compilation of SelfAdjointEigenSolver<Matrix>(sparse*sparse);
2017-02-20 14:27:26 +01:00
Gael Guennebaud
6572825703
bug #1395 : fix the use of compile-time vectors as inputs of JacobiSVD.
2017-02-20 13:44:37 +01:00
Gael Guennebaud
a811a04696
Silence warning.
2017-02-20 10:14:21 +01:00
Gael Guennebaud
63798df038
Fix usage of CUDACC_VER
2017-02-20 08:16:36 +01:00
Gael Guennebaud
deefa54a54
Fix tracking of temporaries in unit tests
2017-02-19 10:32:54 +01:00
Gael Guennebaud
cbbf88c4d7
Use int32_t instead of int in NEON code. Some platforms with 16 bytes int support ARM NEON.
2017-02-17 14:39:02 +01:00
Gael Guennebaud
582b5e39bf
bug #1393 : enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2)
2017-02-17 14:10:57 +01:00
Benoit Steiner
31a25ab226
Merged eigen/eigen into default
2017-02-14 15:36:21 -08:00
Gael Guennebaud
5937c4ae32
Fall back is_integral to std::is_integral in c++11
2017-02-13 17:14:26 +01:00
Jonathan Hseu
3453b00a1e
Fix vector indexing with uint64_t
2017-02-11 21:45:32 -08:00
Gael Guennebaud
e7ebe52bfb
bug #1391 : include IO.h before DenseBase to enable its usage in DenseBase plugins.
2017-02-13 09:46:20 +01:00
Gael Guennebaud
b3750990d5
Workaround some gcc 4.7 warnings
2017-02-11 23:24:06 +01:00
Gael Guennebaud
c16ee72b20
bug #1392 : fix #include <Eigen/Sparse> with mpl2-only
2017-02-11 10:35:01 +01:00
Gael Guennebaud
e43016367a
Forgot to include a file in previous commit
2017-02-11 10:34:18 +01:00
Gael Guennebaud
6486d4fc95
Workaround gcc 4.7 issue in c++11.
2017-02-11 10:29:10 +01:00
Gael Guennebaud
4a4a72951f
Fix previous commits: disable only problematic indexed view methods for old compilers instead of disabling everything.
...
Tested with gcc 4.7 (c++03) and gcc 4.8 (c++03 & c++11)
2017-02-11 10:28:44 +01:00
Benoit Steiner
fad776492f
Merged eigen/eigen into default
2017-02-10 14:27:43 -08:00
Benoit Steiner
1ef30b8090
Fixed bug introduced in previous commit
2017-02-10 13:35:10 -08:00
Benoit Steiner
769208a17f
Pulled latest updates from upstream
2017-02-10 13:11:40 -08:00
Benoit Steiner
8b3cc54c42
Added a new EIGEN_HAS_INDEXED_VIEW define that is set to 0 for older compilers that are known to fail to compile the indexed views (I used the define from the indexed_views.cpp test).
...
Only include the indexed view methods when the compiler supports the code.
This makes it possible to use Eigen again in complex code bases such as TensorFlow and with older compilers such as gcc 4.8
2017-02-10 13:08:49 -08:00
Gael Guennebaud
a1ff24f96a
Fix pruning in (sparse*sparse).pruned() when the result is nearly dense.
2017-02-10 13:59:32 +01:00
Gael Guennebaud
0256c52359
Include clang in the list of non strict MSVC (just to be sure)
2017-02-10 13:41:52 +01:00
Alexander Neumann
dd58462e63
fixed inlining issue with clang-cl on visual studio
...
(grafted from 7962ac1a58)
2017-02-08 23:50:38 +01:00
Gael Guennebaud
fc8fd5fd24
Improve multi-threading heuristic for matrix products with a small number of columns.
2017-02-07 17:19:59 +01:00
Gael Guennebaud
4254b3eda3
bug #1389 : MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX)
2017-02-03 15:22:35 +01:00
Benoit Steiner
2db75c07a6
fixed the ordering of the template and EIGEN_DEVICE_FUNC keywords in a few more places to get more of the Eigen codebase to compile with nvcc again.
2017-02-01 15:41:29 -08:00
Benoit Steiner
fcd257039b
Replaced EIGEN_DEVICE_FUNC template<foo> with template<foo> EIGEN_DEVICE_FUNC to make the code compile with nvcc8.
2017-02-01 15:30:49 -08:00
Gael Guennebaud
0eceea4efd
Define EIGEN_COMP_GNUC to reflect version number: 47, 48, 49, 50, 60, ...
2017-02-01 23:36:40 +01:00
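The values listed in the commit above (47, 48, 49, 50, 60) are consistent with encoding the compiler version as major*10 + minor. A small sketch of that encoding follows, assuming this is the intended formula. `gnuc_version_code` and `SKETCH_COMP_GNUC` are hypothetical names, not Eigen's macro.

```cpp
// Hypothetical helper mirroring the encoding suggested by the listed values:
// gcc 4.7 -> 47, gcc 4.8 -> 48, gcc 6.0 -> 60.
inline int gnuc_version_code(int major, int minor) {
  return major * 10 + minor;
}

// Same encoding as a preprocessor value, usable in #if checks.
// Evaluates to 0 on compilers that do not define __GNUC__.
#if defined(__GNUC__)
#define SKETCH_COMP_GNUC (__GNUC__ * 10 + __GNUC_MINOR__)
#else
#define SKETCH_COMP_GNUC 0
#endif
```

A version-number value allows guards such as `#if SKETCH_COMP_GNUC >= 48` instead of a plain "is this gcc" test.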
Gael Guennebaud
645a8e32a5
Fix compilation of JacobiSVD for vectors type
2017-01-31 16:22:54 +01:00
Gael Guennebaud
53026d29d4
bug #478 : fix regression in the eigen decomposition of zero matrices.
2017-01-31 14:22:42 +01:00
Benoit Steiner
fbc39fd02c
Merge latest changes from upstream
2017-01-30 15:25:57 -08:00
Gael Guennebaud
c86911ac73
bug #1384 : fix evaluation of "sparse/scalar" that used the wrong evaluation path.
2017-01-30 13:38:24 +01:00
Gael Guennebaud
d024e9942d
MSVC 1900 release is not c++14 compatible enough for us. The 1910 update seems to be fine though.
2017-01-27 22:17:59 +01:00
Rasmus Munk Larsen
edaa0fc5d1
Revert PR-292. After further investigation, the memcpy->memmove change was only good for Haswell on older versions of glibc. Adding a switch for small sizes is perhaps useful for string copies, but also has an overhead for larger sizes, making it a poor trade-off for general memcpy.
...
This PR also removes a couple of unnecessary semi-colons in Eigen/src/Core/AssignEvaluator.h that caused compiler warnings everywhere.
2017-01-26 12:46:06 -08:00
Gael Guennebaud
25a1703579
Merged in ggael/eigen-flexidexing (pull request PR-294)
...
generalized operator() for indexed access and slicing
2017-01-26 08:04:23 +00:00
Gael Guennebaud
98dfe0c13f
Fix useless ';' warning
2017-01-25 22:55:04 +01:00
Gael Guennebaud
28351073d8
Fix unnamed type as template argument (ok in c++11 only)
2017-01-25 22:54:51 +01:00
Gael Guennebaud
607be65a03
Fix duplicates of array_size between unsupported and Core
2017-01-25 22:53:58 +01:00
Rasmus Munk Larsen
7d39c6d50a
Merged eigen/eigen into default
2017-01-25 09:22:26 -08:00
Rasmus Munk Larsen
5c9ed4ba0d
Reverse arguments for pmin in AVX.
2017-01-25 09:21:57 -08:00
Gael Guennebaud
850ca961d2
bug #1383 : fix regression in LinSpaced for integers and high<low
2017-01-25 18:13:53 +01:00
Gael Guennebaud
296d24be4d
bug #1381 : fix sparse.diagonal() used as a rvalue.
...
The problem was that if "sparse" is not const, then sparse.diagonal() must have the
LValueBit flag, meaning that sparse.diagonal().coeff(i) must return a const reference,
const Scalar&. However, sparse::coeff() cannot return a reference for a non-existing
zero coefficient. The trick is to return a reference to a local member of
evaluator<SparseMatrix>.
2017-01-25 17:39:01 +01:00
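The trick described in the commit above can be sketched with a toy container. `SparseDiagSketch` is a hypothetical class, not Eigen's `evaluator<SparseMatrix>`: `coeff()` must return `const double&`, so for an entry that is not stored it returns a reference to a zero kept as a member, which stays valid as long as the object lives.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a sparse diagonal evaluator (hypothetical, for illustration).
class SparseDiagSketch {
 public:
  SparseDiagSketch() : m_zero(0.0) {}

  // Store an explicit (nonzero) entry at position i.
  void insert(std::size_t i, double v) {
    m_idx.push_back(i);
    m_val.push_back(v);
  }

  // Always returns a reference: to the stored value if present,
  // otherwise to the member m_zero (never to a temporary).
  const double& coeff(std::size_t i) const {
    for (std::size_t k = 0; k < m_idx.size(); ++k)
      if (m_idx[k] == i) return m_val[k];
    return m_zero;
  }

 private:
  std::vector<std::size_t> m_idx;  // positions of stored entries
  std::vector<double> m_val;       // values of stored entries
  double m_zero;                   // shared storage for implicit zeros
};
```

Returning a reference to a local variable or a temporary would dangle; the member gives every implicit zero a stable address.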
Gael Guennebaud
d06a48959a
bug #1383 : Fix regression from 3.2 with LinSpaced(n,0,n-1) with n==0.
2017-01-25 15:27:13 +01:00
Rasmus Munk Larsen
ae3e43a125
Remove extra space.
2017-01-24 16:16:39 -08:00
Benoit Steiner
e96c77668d
Merged in rmlarsen/eigen2 (pull request PR-292)
...
Adds a fast memcpy function to Eigen.
2017-01-25 00:14:04 +00:00
Rasmus Munk Larsen
3be5ee2352
Update copy helper to use fast_memcpy.
2017-01-24 14:22:49 -08:00
Rasmus Munk Larsen
e6b1020221
Adds a fast memcpy function to Eigen. This takes advantage of the following:
...
1. For small fixed sizes, the compiler generates inline code for memcpy, which is much faster.
2. My colleague eriche at googl dot com discovered that for large sizes, memmove is significantly faster than memcpy (at least on Linux with GCC or Clang). See benchmark numbers measured on a Haswell (HP Z440) workstation here: https://docs.google.com/a/google.com/spreadsheets/d/1jLs5bKzXwhpTySw65MhG1pZpsIwkszZqQTjwrd_n0ic/pubhtml This is of course surprising since memcpy is a less constrained version of memmove. This stackoverflow thread contains some speculation as to the causes: http://stackoverflow.com/questions/22793669/poor-memcpy-performance-on-linux
Below are numbers for copying and slicing tensors using the multithreaded TensorDevice. The numbers show significant improvements for memcpy of very small blocks and for memcpy of large blocks single threaded (we were already able to saturate memory bandwidth for >1 threads before on large blocks). The "slicingSmallPieces" benchmark also shows small consistent improvements, since memcpy cost is a fair portion of that particular computation.
The benchmarks operate on NxN matrices, and the names are of the form BM_$OP_${NUMTHREADS}T/${N}.
Measured improvements in wall clock time:
Run on rmlarsen3.mtv (12 X 3501 MHz CPUs); 2017-01-20T11:26:31.493023454-08:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                      Base (ns)   New (ns)  Improvement
----------------------------------------------------------------
BM_memcpy_1T/2                      3.48       2.39       +31.3%
BM_memcpy_1T/8                      12.3       6.51       +47.0%
BM_memcpy_1T/64                      371        383        -3.2%
BM_memcpy_1T/512                   66922      66720        +0.3%
BM_memcpy_1T/4k                  9892867    6849682       +30.8%
BM_memcpy_1T/5k                 14951099   10332856       +30.9%
BM_memcpy_2T/2                      3.50       2.46       +29.7%
BM_memcpy_2T/8                      12.3       7.66       +37.7%
BM_memcpy_2T/64                      371        376        -1.3%
BM_memcpy_2T/512                   66652      66788        -0.2%
BM_memcpy_2T/4k                  6145012    6117776        +0.4%
BM_memcpy_2T/5k                  9181478    9010942        +1.9%
BM_memcpy_4T/2                      3.47       2.47       +31.0%
BM_memcpy_4T/8                      12.3       6.67       +45.8%
BM_memcpy_4T/64                      374        376        -0.5%
BM_memcpy_4T/512                   67833      68019        -0.3%
BM_memcpy_4T/4k                  5057425    5188253        -2.6%
BM_memcpy_4T/5k                  7555638    7779468        -3.0%
BM_memcpy_6T/2                      3.51       2.50       +28.8%
BM_memcpy_6T/8                      12.3       7.61       +38.1%
BM_memcpy_6T/64                      373        378        -1.3%
BM_memcpy_6T/512                   66871      66774        +0.1%
BM_memcpy_6T/4k                  5112975    5233502        -2.4%
BM_memcpy_6T/5k                  7614180    7772246        -2.1%
BM_memcpy_8T/2                      3.47       2.41       +30.5%
BM_memcpy_8T/8                      12.4       10.5       +15.3%
BM_memcpy_8T/64                      372        388        -4.3%
BM_memcpy_8T/512                   67373      66588        +1.2%
BM_memcpy_8T/4k                  5148462    5254897        -2.1%
BM_memcpy_8T/5k                  7660989    7799058        -1.8%
BM_memcpy_12T/2                     3.50       2.40       +31.4%
BM_memcpy_12T/8                     12.4       7.55       +39.1%
BM_memcpy_12T/64                     374        378        -1.1%
BM_memcpy_12T/512                  67132      66683        +0.7%
BM_memcpy_12T/4k                 5185125    5292920        -2.1%
BM_memcpy_12T/5k                 7717284    7942684        -2.9%
BM_slicingSmallPieces_1T/2          47.3       47.5        +0.4%
BM_slicingSmallPieces_1T/8          53.6       52.3        +2.4%
BM_slicingSmallPieces_1T/64          491        476        +3.1%
BM_slicingSmallPieces_1T/512       21734      18814       +13.4%
BM_slicingSmallPieces_1T/4k       394660     396760        -0.5%
BM_slicingSmallPieces_1T/5k       218722     209244        +4.3%
BM_slicingSmallPieces_2T/2          80.7       79.9        +1.0%
BM_slicingSmallPieces_2T/8          54.2       53.1        +2.0%
BM_slicingSmallPieces_2T/64          497        477        +4.0%
BM_slicingSmallPieces_2T/512       21732      18822       +13.4%
BM_slicingSmallPieces_2T/4k       392885     390490        +0.6%
BM_slicingSmallPieces_2T/5k       221988     208678        +6.0%
BM_slicingSmallPieces_4T/2          80.8       80.1        +0.9%
BM_slicingSmallPieces_4T/8          54.1       53.2        +1.7%
BM_slicingSmallPieces_4T/64          493        476        +3.4%
BM_slicingSmallPieces_4T/512       21702      18758       +13.6%
BM_slicingSmallPieces_4T/4k       393962     404023        -2.6%
BM_slicingSmallPieces_4T/5k       249667     211732       +15.2%
BM_slicingSmallPieces_6T/2          80.5       80.1        +0.5%
BM_slicingSmallPieces_6T/8          54.4       53.4        +1.8%
BM_slicingSmallPieces_6T/64          488        478        +2.0%
BM_slicingSmallPieces_6T/512       21719      18841       +13.3%
BM_slicingSmallPieces_6T/4k       394950     397583        -0.7%
BM_slicingSmallPieces_6T/5k       223080     210148        +5.8%
BM_slicingSmallPieces_8T/2          81.2       80.4        +1.0%
BM_slicingSmallPieces_8T/8          58.1       53.5        +7.9%
BM_slicingSmallPieces_8T/64          489        480        +1.8%
BM_slicingSmallPieces_8T/512       21586      18798       +12.9%
BM_slicingSmallPieces_8T/4k       394592     400165        -1.4%
BM_slicingSmallPieces_8T/5k       219688     208301        +5.2%
BM_slicingSmallPieces_12T/2         80.2       79.8        +0.7%
BM_slicingSmallPieces_12T/8         54.4       53.4        +1.8%
BM_slicingSmallPieces_12T/64         488        476        +2.5%
BM_slicingSmallPieces_12T/512      21931      18831       +14.1%
BM_slicingSmallPieces_12T/4k      393962     396541        -0.7%
BM_slicingSmallPieces_12T/5k      218803     207965        +5.0%
2017-01-24 13:55:18 -08:00
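The two observations in the commit above can be sketched as one function. `fast_memcpy_sketch` is a hypothetical name and the set of special-cased sizes is an assumption; the actual function added by this commit is not shown in the log. When the size argument is a literal constant, compilers can typically expand `memcpy` inline. For all other sizes the sketch calls `memmove`, which the commit reports as faster than `memcpy` for large sizes on the tested setup.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch: dispatch small fixed sizes to memcpy calls with a
// constant length (which compilers can typically inline), and everything
// else to memmove.
inline void fast_memcpy_sketch(void* dst, const void* src, std::size_t n) {
  switch (n) {
    case 1:  std::memcpy(dst, src, 1);  break;
    case 2:  std::memcpy(dst, src, 2);  break;
    case 4:  std::memcpy(dst, src, 4);  break;
    case 8:  std::memcpy(dst, src, 8);  break;
    case 16: std::memcpy(dst, src, 16); break;
    default: std::memmove(dst, src, n); break;  // also safe for overlapping ranges
  }
}
```

The switch adds a branch to every call. A later commit in this log (edaa0fc5d1) reverts the change, citing that overhead for larger sizes.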
Rasmus Munk Larsen
7b6aaa3440
Fix NaN propagation for AVX512.
2017-01-24 13:37:08 -08:00
Rasmus Munk Larsen
5e144bbaa4
Make NaN propagation consistent between the pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op.
...
See #1373 for details.
2017-01-24 13:32:50 -08:00
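To illustrate what "consistent NaN propagation" means in the commit above: `std::max(a, b)` returns `a` unless `a < b`, and every comparison with NaN is false, so a NaN in the first argument is returned while a NaN in the second is dropped. The sketch below emulates a lane-wise max that follows the same rule. `pmax_sketch` is a hypothetical plain loop, not Eigen's `pmax`.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Scalar reference: std::max(a, b) behaves like (a < b) ? b : a.
inline double scalar_max_sketch(double a, double b) {
  return std::max(a, b);
}

// Lane-wise emulation of a packet max using the same comparison form,
// so each lane treats NaN exactly like the scalar path does.
inline void pmax_sketch(const double* a, const double* b, double* out, int n) {
  for (int i = 0; i < n; ++i)
    out[i] = (a[i] < b[i]) ? b[i] : a[i];
}
```

A hardware max instruction with a different operand convention would return the other argument when a NaN is present, which is the kind of scalar/vector mismatch the commit describes fixing.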
Gael Guennebaud
d83db761a2
Add support for std::integral_constant
2017-01-24 16:28:12 +01:00
Gael Guennebaud
bc10201854
Add test for multiple symbols
2017-01-24 16:27:51 +01:00
Gael Guennebaud
c43d254d13
Fix seq().reverse() in c++98
2017-01-24 11:36:43 +01:00
Gael Guennebaud
ddd83f82d8
Add support for "SymbolicExpr op fix<N>" in C++98/11 mode.
2017-01-24 10:54:42 +01:00
Gael Guennebaud
228fef1b3a
Extended the set of arithmetic operators supported by FixedInt (-,+,*,/,%,&,|)
2017-01-24 10:53:51 +01:00
Gael Guennebaud
bb52f74e62
Add internal doc
2017-01-24 10:13:35 +01:00
Gael Guennebaud
41c523a0ab
Rename fix_t to FixedInt
2017-01-24 09:39:49 +01:00
Gael Guennebaud
ba3f977946
bug #1376 : add missing assertion on size mismatch with compound assignment operators (e.g., mat += mat.col(j))
2017-01-23 22:06:08 +01:00
Gael Guennebaud
b0db4eff36
bug #1382 : move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!)
2017-01-23 22:03:57 +01:00
Gael Guennebaud
ca79c1545a
Add std:: namespace prefix to all (hopefully) instances of size_t/ptrdiff_t
2017-01-23 22:02:53 +01:00
Gael Guennebaud
4b607b5692
Use Index instead of size_t
2017-01-23 22:00:33 +01:00
Gael Guennebaud
0fe278f7be
bug #1379 : fix compilation in sparse*diagonal*dense with openmp
2017-01-21 23:27:01 +01:00
Gael Guennebaud
22a172751e
bug #1378 : fix doc (DiagonalIndex vs Diagonal)
2017-01-21 22:09:59 +01:00
Gael Guennebaud
4d302a080c
Recover compile-time size from seq(A,B) when A and B are fixed values. (c++11 only)
2017-01-19 20:34:18 +01:00
Gael Guennebaud
54f3fbee24
Exploit fixed values in seq and reverse with C++98 compatibility
2017-01-19 19:57:32 +01:00
Gael Guennebaud
7691723e34
Add support for fixed-value in symbolic expression, c++11 only for now.
2017-01-19 19:25:29 +01:00
Benoit Steiner
924600a0e8
Made sure that enabling avx2 instructions enables avx and sse instructions as well.
2017-01-19 09:54:48 -08:00
Gael Guennebaud
e84ed7b6ef
Remove dead code
2017-01-18 23:18:28 +01:00
Gael Guennebaud
f3ccbe0419
Add a Symbolic::FixedExpr helper expression to make sure the compiler fully optimizes the usage of last and end.
2017-01-18 23:16:32 +01:00
Gael Guennebaud
15471432fe
Add a .reverse() member to ArithmeticSequence.
2017-01-18 11:35:27 +01:00
Gael Guennebaud
e4f8dd860a
Add missing operator*
2017-01-18 10:49:01 +01:00
Gael Guennebaud
198507141b
Update all block expressions to accept compile-time sizes passed by fix<N> or fix<N>(n)
2017-01-18 09:43:58 +01:00
Gael Guennebaud
5484ddd353
Merge the generic and dynamic overloads of block()
2017-01-17 22:11:46 +01:00
Gael Guennebaud
655ba783f8
Defer set-to-zero in triangular = product so that no aliasing issue occurs in the common:
...
A.triangularView() = B*A.selfadjointView()*B.adjoint()
case that used to work in 3.2.
2017-01-17 18:03:35 +01:00
Gael Guennebaud
5e36ec3b6f
Fix regression when passing enums to operator()
2017-01-17 17:10:16 +01:00
Gael Guennebaud
4f36dcfda8
Add a generic block() method compatible with Eigen::fix
2017-01-17 11:34:28 +01:00
Gael Guennebaud
71e5b71356
Add a get_runtime_value helper to deal with pointer-to-function hack,
...
plus some refactoring to make the internals more consistent.
2017-01-17 11:33:57 +01:00
Gael Guennebaud
23bfcfc15f
Add missing overload of get_compile_time for c++98/11
2017-01-17 10:30:21 +01:00