Gael Guennebaud
24409f3acd
Use fix<> API to specify compile-time reshaped sizes.
2017-01-29 15:20:35 +01:00
Gael Guennebaud
9036cda364
Cleanup intitial reshape implementation:
...
- reshape -> reshaped
- make it compatible with evaluators.
2017-01-29 14:57:45 +01:00
Gael Guennebaud
0e89baa5d8
import yoco xiao's work on reshape
2017-01-29 14:29:31 +01:00
Gael Guennebaud
25a1703579
Merged in ggael/eigen-flexidexing (pull request PR-294)
...
generalized operator() for indexed access and slicing
2017-01-26 08:04:23 +00:00
Gael Guennebaud
b0db4eff36
bug #1382 : move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!)
2017-01-23 22:03:57 +01:00
Gael Guennebaud
7691723e34
Add support for fixed-value in symbolic expression, c++11 only for now.
2017-01-19 19:25:29 +01:00
Benoit Steiner
924600a0e8
Made sure that enabling avx2 instructions enables avx and sse instructions as well.
2017-01-19 09:54:48 -08:00
Gael Guennebaud
4989922be2
Add support for symbolic expressions as arguments of operator()
2017-01-16 22:21:23 +01:00
Gael Guennebaud
752bd92ba5
Large code refactoring:
...
- generalize some utilities and move them to Meta (size(), array_size())
- move handling of all and single indices to IndexedViewHelper.h
- several cleanup changes
2017-01-11 17:24:02 +01:00
Gael Guennebaud
b1dc0fa813
Move fix and symbolic to their own file, and improve doxygen compatibility
2017-01-11 14:28:28 +01:00
Gael Guennebaud
ac7e4ac9c0
Initial commit to add a generic indexed-based view of matrices.
...
This version already works as a read-only expression.
Numerous refactoring, renaming, extension, tuning passes are expected...
2017-01-06 00:01:44 +01:00
Gael Guennebaud
3182bdbae6
Disable vectorization when compiled by nvcc, even is EIGEN_NO_CUDA is defined
2017-07-17 11:01:28 +02:00
Gael Guennebaud
bbd97b4095
Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases
2017-07-17 01:02:51 +02:00
Gael Guennebaud
b240080e64
bug #1436 : fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.
2017-06-15 10:16:30 +02:00
Benoit Steiner
fb1d0138ec
Include SSE packet instructions when compiling with avx512 enabled.
2016-12-19 07:32:48 -08:00
Benoit Steiner
81151bd474
Fixed merge conflicts
2016-11-19 19:12:59 -08:00
Benoit Steiner
7c30078b9f
Merged eigen/eigen into default
2016-11-17 22:53:37 -08:00
Luke Iwanski
c5130dedbe
Specialised basic math functions for SYCL device.
2016-11-17 11:47:13 +00:00
Benoit Steiner
f2e8b73256
Enable the use of AVX512 instruction by default
2016-11-16 21:28:04 -08:00
Benoit Steiner
d46a36cc84
Merged eigen/eigen into default
2016-11-04 18:22:55 -07:00
Mehdi Goli
0ebe3808ca
Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;
2016-11-04 18:18:19 +00:00
Benoit Steiner
5c3995769c
Improved AVX512 configuration
2016-11-03 04:50:28 -07:00
Benoit Steiner
ca0ba0d9a4
Improved AVX512 support
2016-11-03 04:00:49 -07:00
Benoit Steiner
c80587c92b
Merged eigen/eigen into default
2016-11-03 03:55:11 -07:00
Benoit Steiner
0585b2965d
Disable vectorization on device only when compiling for sycl
2016-11-02 11:44:27 -07:00
Benoit Steiner
cf20b30d65
Merge latest updates from trunk
2016-10-20 09:42:05 -07:00
Benoit Steiner
d3943cd50c
Fixed a few typos in the ternary tensor expressions types
2016-10-19 12:56:12 -07:00
Mehdi Goli
8fb162fc85
Fixing the typo regarding missing #if needed for proper handling of exceptions in Eigen/Core.
2016-10-16 12:52:34 +01:00
Mehdi Goli
15380f9a87
Applyiing Benoit's comment to return the missing line back in Eigen/Core
2016-10-14 16:39:41 +01:00
Mehdi Goli
524fa4c46f
Reducing the code by generalising sycl backend functions/structs.
2016-10-14 12:09:55 +01:00
Benoit Steiner
9f3276981c
Enabling AVX512 should also enable AVX2.
2016-10-06 10:29:48 -07:00
Benoit Steiner
78b569f685
Merged latest updates from trunk
2016-10-05 18:48:55 -07:00
Benoit Steiner
ae1385c7e4
Pull the latest updates from trunk
2016-10-05 14:54:36 -07:00
Benoit Steiner
409e887d78
Added support for constand std::complex numbers on GPU
2016-10-03 11:06:24 -07:00
RJ Ryan
b2c6dc48d9
Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op.
2016-09-20 07:18:20 -07:00
Luke Iwanski
b91e021172
Merged with default.
2016-09-19 14:03:54 +01:00
Luke Iwanski
cb81975714
Partial OpenCL support via SYCL compatible with ComputeCpp CE.
2016-09-19 12:44:13 +01:00
Gael Guennebaud
a4c266f827
Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.
2016-08-23 14:23:08 +02:00
Gael Guennebaud
2f7e2614e7
bug #1232 : refactor special functions as a new SpecialFunctions module, currently in unsupported/.
2016-07-08 11:13:55 +02:00
Eugene Brevdo
39baff850c
Add TernaryFunctors and the betainc SpecialFunction.
...
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.
Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.
Added unit tests in array.cpp and cxx11_tensor_cuda.cu
Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe. and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Gael Guennebaud
8d97ba6b22
bug #725 : make move ctor/assignment noexcept.
2016-06-03 14:28:25 +02:00
Benoit Steiner
3d0741f027
Include mmintrin.h to make it possible to use mmx instructions when needed. For example, this will enable the definition of a half packet for the Packet4f type.
2016-05-23 20:43:48 -07:00
Benoit Steiner
7d980d74e5
Started to vectorize the processing of 16bit floats on CPU.
2016-05-23 15:21:40 -07:00
Benoit Steiner
07a247dcf4
Pulled latest updates from upstream
2016-04-29 13:41:26 -07:00
Benoit Steiner
80200a1828
Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instruction.
2016-04-20 12:10:27 -07:00
Benoit Steiner
1d0238375d
Made sure all the required header files are included when trying to use fp16
2016-04-19 17:44:12 -07:00
Rasmus Larsen
6498dadc2f
Merged eigen/eigen into default
2016-04-11 17:42:05 -07:00
Benoit Steiner
d6e596174d
Pull latest updates from upstream
2016-04-11 17:20:17 -07:00
Gael Guennebaud
fec4c334ba
Remove all references to MKL in BLAS wrappers.
2016-04-11 16:04:09 +02:00
Rasmus Larsen
c34e55c62b
Merged eigen/eigen into default
2016-04-07 20:23:03 -07:00
Benoit Steiner
532fdf24cb
Added support for hardware conversion between fp16 and full floats whenever
...
possible.
2016-04-06 17:11:31 -07:00
Konstantinos Margaritis
2bba4ee2cf
Merged kmargar/eigen/tip into default
2016-04-05 22:22:08 +03:00
Konstantinos Margaritis
988344daf1
enable the other includes as well
2016-04-05 05:59:30 -04:00
Rasmus Larsen
30242b7565
Merged eigen/eigen into default
2016-04-01 17:19:36 -07:00
Rasmus Munk Larsen
1aa89fb855
Add matrix condition estimator module that implements the Higham/Hager algorithm from http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf used in LPACK. Add rcond() methods to FullPivLU and PartialPivLU.
2016-04-01 10:27:59 -07:00
Benoit Steiner
4f1a7e51c1
Pull math functions from the global namespace only when compiling cuda code with nvcc. When compiling with clang, we want to use the std namespace.
2016-03-30 17:59:49 -07:00
Konstantinos Margaritis
ed6b9d08f1
some primitives ported, but missing intrinsics and crash with asm() are a problem
2016-03-27 18:47:49 -04:00
Benoit Steiner
048c4d6efd
Made half floats usable on hardware that doesn't support them natively.
2016-03-11 17:21:42 -08:00
Benoit Steiner
ac5d706a94
Added support for simple coefficient wise tensor expression using half floats on CUDA devices
2016-02-19 08:19:12 +00:00
Benoit Steiner
0606a0a39b
FP16 on CUDA are only available starting with cuda 7.5. Disable them when using an older version of CUDA
2016-02-18 23:15:23 -08:00
Benoit Steiner
7151bd8768
Reverted unintended changes introduced by a bad merge
2016-02-19 06:20:50 +00:00
Benoit Steiner
17b9fbed34
Added preliminary support for half floats on CUDA GPU. For now we can simply convert floats into half floats and vice versa
2016-02-19 06:16:07 +00:00
Benoit Steiner
6c9cf117c1
Fixed indentation
2016-02-04 10:34:10 -08:00
Benoit Steiner
9a415fb1e2
Preliminary support for AVX512
2015-12-10 15:34:57 -08:00
Eugene Brevdo
fa4f933c0f
Add special functions to Eigen: lgamma, erf, erfc.
...
Includes CUDA support and unit tests.
2015-12-07 15:24:49 -08:00
Gael Guennebaud
0bb12fa614
Add LU::transpose().solve() and LU::adjoint().solve() API.
2015-12-01 14:38:47 +01:00
Gael Guennebaud
d866279364
Clean a bit the implementation of inverse permutations
2015-10-08 18:36:39 +02:00
Doug Kwan
5c9ee73eb9
Implement plog and pexp for AltiVec.
2015-07-30 11:12:42 -07:00
Gael Guennebaud
b5ad3d2cf7
Remove deprecated Flagged expression.
2015-09-02 14:53:50 +02:00
Gael Guennebaud
e68c7b8368
Include SSE packetmath when AVX is enabled, and enable AVX's sine function only in fast-math mode (as SSE)
2015-08-07 17:40:39 +02:00
Gael Guennebaud
175ed636ea
bug #973 : update macro-level control of alignement by introducing user-controllable EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES macros. This changeset also removes EIGEN_ALIGN (replaced by EIGEN_MAX_ALIGN_BYTES>0), EIGEN_ALIGN_STATICALLY (replaced by EIGEN_MAX_STATIC_ALIGN_BYTES>0), EIGEN_USER_ALIGN*, EIGEN_ALIGN_DEFAULT (replaced by EIGEN_ALIGN_MAX).
2015-07-29 10:22:25 +02:00
Benoit Steiner
6d6e6d0b88
Define EIGEN_VECTORIZE_AVX2 and EIGEN_VECTORIZE_FMA when the corresponding instructions can be used by the compiler
2015-07-22 18:22:16 -07:00
Jonas Adler
815fa0dbf6
Fixed some compiler bugs in NVCC, now compiles with CUDA.
...
(chtz: Manually joined sevaral commits to keep the history clean)
2015-07-22 12:29:18 +02:00
Nicolas Mellado
0d09845562
Revert files to remove EIGEN_USING_NUMEXT_MATH
2015-07-11 20:11:36 +02:00
Nicolas Mellado
5359e5cdb2
Protect against compilation errors with nvcc and numext/complex.
...
Disable functions explicitely involving std::complex when compiling with nvcc.
Improve code compatilibity using the new macro EIGEN_USING_NUMEXT_MATH (same spirit than EIGEN_USING_STD_MATH but for numext functions)
2015-07-06 20:55:01 +02:00
Gael Guennebaud
6fc5438205
Remove a few deprecated internal expressions
2015-06-19 17:06:12 +02:00
Benoit Jacob
051d5325cc
Abandon blocking size lookup table approach. Not performing as well in real world as in microbenchmark.
2015-05-19 11:03:59 -04:00
Pete Warden
140f85bb99
Check for the macro __ARM_NEON__ (with two underscores at the end) as well as __ARM_NEON. The second macro is correct according to the ARM language extensions specification, but historically the first one has been more common. Some older compilers (e.g. gcc v4.6 on a Beaglebone Black) only define the first, so without this patch NEON isn't enabled.
2015-05-12 16:03:43 -07:00
Benoit Steiner
d3f7915aeb
Pulled latest update from the eigen main codebase
2015-03-24 13:12:14 -07:00
Benoit Jacob
dc04f12967
use unsigned short instead of uint16_t which doesn't exist in c++98
2015-03-17 10:31:45 -04:00
Benoit Jacob
577056aa94
Include stdint.h. Not going for cstdint because it is a C++11 addition. Needed for uint16_t at least, in lookup-table code.
2015-03-16 16:21:50 -04:00
Benoit Jacob
02babb9c0f
Provide a empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now.
2015-03-15 18:13:12 -04:00
Benoit Jacob
e56aabf205
Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables
2015-03-15 18:05:12 -04:00
Benoit Steiner
573b377110
Added support for vectorized type casting of tensors
2015-02-27 08:46:04 -08:00
Benoit Steiner
e2cfddf75f
Pulled latest updates from trunk
2015-02-13 16:21:59 -08:00
Benoit Steiner
0927801a84
Optimized version of the sin(), exp(), log() and sqrt() function for AVX
2015-02-13 16:07:08 -08:00
Gael Guennebaud
0918c51e60
merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper
2015-02-12 21:48:41 +01:00
Benoit Steiner
cc5d7ff523
Added vectorized implementation of the exponential function for ARM/NEON
2015-02-10 14:02:38 -08:00
Benoit Steiner
c739102ef9
Pulled the latest changes from the trunk
2015-02-06 05:25:03 -08:00
Benoit Jacob
0f21613698
bug #936 , patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD
2015-01-30 17:44:26 -05:00
Gael Guennebaud
ee06f78679
Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.
2014-11-04 21:58:52 +01:00
Konstantinos Margaritis
79225db0b6
Merged in kmargar/eigen (pull request PR-87)
...
Extend NEON to add ARMv8 64-bit double support
2014-10-28 13:08:53 +02:00
Konstantinos Margaritis
94ed7c81e6
Bug #896 : Swap order of checking __VSX__/__ALTIVEC__
2014-10-22 06:15:18 -04:00
Konstantinos Margaritis
87524922dc
check for __ARM_NEON instead as it's defined in arm64 as well
2014-10-21 18:08:50 +00:00
Benoit Steiner
bbce6fa65d
define EIGEN_VECTORIZE_CUDA when compiling with nvcc
2014-10-03 19:55:35 -07:00
Benoit Steiner
95a430a2ca
Vector primitives for CUDA
2014-10-03 19:45:19 -07:00
Konstantinos Margaritis
60e093a9dc
Merged eigen/eigen into default
2014-09-21 14:02:51 +03:00
Gael Guennebaud
0ca43f7e9a
Remove deprecated code not used by evaluators
2014-09-18 15:15:27 +02:00
Konstantinos Margaritis
470aa15c35
First time it compiles, but fails to pass the tests.
2014-09-09 16:58:48 +00:00
Konstantinos Margaritis
7ff266e3ce
Initial VSX commit
2014-08-29 20:03:49 +00:00