Benoit Steiner
1d0238375d
Made sure all the required header files are included when trying to use fp16
2016-04-19 17:44:12 -07:00
Rasmus Larsen
6498dadc2f
Merged eigen/eigen into default
2016-04-11 17:42:05 -07:00
Benoit Steiner
d6e596174d
Pull latest updates from upstream
2016-04-11 17:20:17 -07:00
Gael Guennebaud
fec4c334ba
Remove all references to MKL in BLAS wrappers.
2016-04-11 16:04:09 +02:00
Rasmus Larsen
c34e55c62b
Merged eigen/eigen into default
2016-04-07 20:23:03 -07:00
Benoit Steiner
532fdf24cb
Added support for hardware conversion between fp16 and full floats whenever possible.
2016-04-06 17:11:31 -07:00
Konstantinos Margaritis
2bba4ee2cf
Merged kmargar/eigen/tip into default
2016-04-05 22:22:08 +03:00
Konstantinos Margaritis
988344daf1
enable the other includes as well
2016-04-05 05:59:30 -04:00
Rasmus Larsen
30242b7565
Merged eigen/eigen into default
2016-04-01 17:19:36 -07:00
Rasmus Munk Larsen
1aa89fb855
Add matrix condition estimator module that implements the Higham/Hager algorithm from http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf used in LAPACK. Add rcond() methods to FullPivLU and PartialPivLU.
2016-04-01 10:27:59 -07:00
Benoit Steiner
4f1a7e51c1
Pull math functions from the global namespace only when compiling cuda code with nvcc. When compiling with clang, we want to use the std namespace.
2016-03-30 17:59:49 -07:00
Konstantinos Margaritis
ed6b9d08f1
some primitives ported, but missing intrinsics and crash with asm() are a problem
2016-03-27 18:47:49 -04:00
Benoit Steiner
048c4d6efd
Made half floats usable on hardware that doesn't support them natively.
2016-03-11 17:21:42 -08:00
Benoit Steiner
ac5d706a94
Added support for simple coefficient-wise tensor expressions using half floats on CUDA devices
2016-02-19 08:19:12 +00:00
Benoit Steiner
0606a0a39b
FP16 on CUDA is only available starting with CUDA 7.5. Disable it when using an older version of CUDA
2016-02-18 23:15:23 -08:00
Benoit Steiner
7151bd8768
Reverted unintended changes introduced by a bad merge
2016-02-19 06:20:50 +00:00
Benoit Steiner
17b9fbed34
Added preliminary support for half floats on CUDA GPU. For now we can simply convert floats into half floats and vice versa
2016-02-19 06:16:07 +00:00
Benoit Steiner
6c9cf117c1
Fixed indentation
2016-02-04 10:34:10 -08:00
Benoit Steiner
9a415fb1e2
Preliminary support for AVX512
2015-12-10 15:34:57 -08:00
Eugene Brevdo
fa4f933c0f
Add special functions to Eigen: lgamma, erf, erfc.
Includes CUDA support and unit tests.
2015-12-07 15:24:49 -08:00
Gael Guennebaud
0bb12fa614
Add LU::transpose().solve() and LU::adjoint().solve() API.
2015-12-01 14:38:47 +01:00
Gael Guennebaud
d866279364
Clean a bit the implementation of inverse permutations
2015-10-08 18:36:39 +02:00
Doug Kwan
5c9ee73eb9
Implement plog and pexp for AltiVec.
2015-07-30 11:12:42 -07:00
Gael Guennebaud
b5ad3d2cf7
Remove deprecated Flagged expression.
2015-09-02 14:53:50 +02:00
Gael Guennebaud
e68c7b8368
Include SSE packetmath when AVX is enabled, and enable AVX's sine function only in fast-math mode (as SSE)
2015-08-07 17:40:39 +02:00
Gael Guennebaud
175ed636ea
bug #973 : update macro-level control of alignment by introducing user-controllable EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES macros. This changeset also removes EIGEN_ALIGN (replaced by EIGEN_MAX_ALIGN_BYTES>0), EIGEN_ALIGN_STATICALLY (replaced by EIGEN_MAX_STATIC_ALIGN_BYTES>0), EIGEN_USER_ALIGN*, EIGEN_ALIGN_DEFAULT (replaced by EIGEN_ALIGN_MAX).
2015-07-29 10:22:25 +02:00
Benoit Steiner
6d6e6d0b88
Define EIGEN_VECTORIZE_AVX2 and EIGEN_VECTORIZE_FMA when the corresponding instructions can be used by the compiler
2015-07-22 18:22:16 -07:00
Jonas Adler
815fa0dbf6
Fixed some compiler bugs in NVCC, now compiles with CUDA.
(chtz: Manually joined several commits to keep the history clean)
2015-07-22 12:29:18 +02:00
Nicolas Mellado
0d09845562
Revert files to remove EIGEN_USING_NUMEXT_MATH
2015-07-11 20:11:36 +02:00
Nicolas Mellado
5359e5cdb2
Protect against compilation errors with nvcc and numext/complex.
Disable functions explicitly involving std::complex when compiling with nvcc.
Improve code compatibility using the new macro EIGEN_USING_NUMEXT_MATH (same spirit as EIGEN_USING_STD_MATH but for numext functions)
2015-07-06 20:55:01 +02:00
Gael Guennebaud
6fc5438205
Remove a few deprecated internal expressions
2015-06-19 17:06:12 +02:00
Benoit Jacob
051d5325cc
Abandon blocking size lookup table approach. Not performing as well in real world as in microbenchmark.
2015-05-19 11:03:59 -04:00
Pete Warden
140f85bb99
Check for the macro __ARM_NEON__ (with two underscores at the end) as well as __ARM_NEON. The second macro is correct according to the ARM language extensions specification, but historically the first one has been more common. Some older compilers (e.g. gcc v4.6 on a Beaglebone Black) only define the first, so without this patch NEON isn't enabled.
2015-05-12 16:03:43 -07:00
Benoit Steiner
d3f7915aeb
Pulled latest update from the eigen main codebase
2015-03-24 13:12:14 -07:00
Benoit Jacob
dc04f12967
use unsigned short instead of uint16_t which doesn't exist in c++98
2015-03-17 10:31:45 -04:00
Benoit Jacob
577056aa94
Include stdint.h. Not going for cstdint because it is a C++11 addition. Needed for uint16_t at least, in lookup-table code.
2015-03-16 16:21:50 -04:00
Benoit Jacob
02babb9c0f
Provide an empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now.
2015-03-15 18:13:12 -04:00
Benoit Jacob
e56aabf205
Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables
2015-03-15 18:05:12 -04:00
Benoit Steiner
573b377110
Added support for vectorized type casting of tensors
2015-02-27 08:46:04 -08:00
Benoit Steiner
e2cfddf75f
Pulled latest updates from trunk
2015-02-13 16:21:59 -08:00
Benoit Steiner
0927801a84
Optimized versions of the sin(), exp(), log() and sqrt() functions for AVX
2015-02-13 16:07:08 -08:00
Gael Guennebaud
0918c51e60
merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper
2015-02-12 21:48:41 +01:00
Benoit Steiner
cc5d7ff523
Added vectorized implementation of the exponential function for ARM/NEON
2015-02-10 14:02:38 -08:00
Benoit Steiner
c739102ef9
Pulled the latest changes from the trunk
2015-02-06 05:25:03 -08:00
Benoit Jacob
0f21613698
bug #936 , patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD
2015-01-30 17:44:26 -05:00
Gael Guennebaud
ee06f78679
Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.
2014-11-04 21:58:52 +01:00
Konstantinos Margaritis
79225db0b6
Merged in kmargar/eigen (pull request PR-87)
Extend NEON to add ARMv8 64-bit double support
2014-10-28 13:08:53 +02:00
Konstantinos Margaritis
94ed7c81e6
Bug #896 : Swap order of checking __VSX__/__ALTIVEC__
2014-10-22 06:15:18 -04:00
Konstantinos Margaritis
87524922dc
check for __ARM_NEON instead as it's defined in arm64 as well
2014-10-21 18:08:50 +00:00
Benoit Steiner
bbce6fa65d
define EIGEN_VECTORIZE_CUDA when compiling with nvcc
2014-10-03 19:55:35 -07:00
Benoit Steiner
95a430a2ca
Vector primitives for CUDA
2014-10-03 19:45:19 -07:00
Konstantinos Margaritis
60e093a9dc
Merged eigen/eigen into default
2014-09-21 14:02:51 +03:00
Gael Guennebaud
0ca43f7e9a
Remove deprecated code not used by evaluators
2014-09-18 15:15:27 +02:00
Konstantinos Margaritis
470aa15c35
First time it compiles, but fails to pass the tests.
2014-09-09 16:58:48 +00:00
Konstantinos Margaritis
7ff266e3ce
Initial VSX commit
2014-08-29 20:03:49 +00:00
Gael Guennebaud
c1d0f15bde
Enable evaluators by default
2014-08-29 15:31:32 +02:00
Gael Guennebaud
124d12a915
merge default branch
2014-08-29 15:20:31 +02:00
Christoph Hertzberg
eeadc06e83
EIGEN_EXCEPTIONS was not defined in test/main.h, therefore all VERIFY_RAISES_ASSERT tests were not enabled
2014-08-20 16:39:25 +02:00
Christoph Hertzberg
a8283e0ed2
Define EIGEN_TRY, EIGEN_CATCH, EIGEN_THROW as suggested by Moritz Klammer.
Make it possible to run unit-tests with exceptions disabled via EIGEN_TEST_NO_EXCEPTIONS flag.
Enhanced ctorleak unit-test
2014-07-22 13:16:44 +02:00
Gael Guennebaud
296cb40161
merge with default branch
2014-07-10 22:04:45 +02:00
Chen-Pang He
7a915f6846
Move Doxygen-only stuff to *.dox
2014-07-05 22:41:58 +08:00
Chen-Pang He
1a817d3b70
Document internal namespace
2014-07-05 21:50:05 +08:00
Gael Guennebaud
0a8e4712d1
Do not attempt to include <intrin.h> on Windows CE
2014-07-02 16:13:05 +02:00
Gael Guennebaud
61b88d2feb
merge with default branch
2014-07-02 09:35:37 +02:00
Christoph Hertzberg
324e7e8fc9
Removed the deprecated EIGEN2_SUPPORT, as previously announced. A compilation error is raised if this compile-switch is defined. The documentation now refers to the corresponding pages from Eigen 3.2. Also, the Eigen2 testsuite has been removed.
2014-07-01 16:58:11 +02:00
Gael Guennebaud
b29b81a1f4
merge with default branch
2014-06-20 15:55:44 +02:00
Gael Guennebaud
8d2bb2c20d
merge with default branch
2014-03-28 09:24:18 +01:00
Gael Guennebaud
052aedd394
Implement pcplflip, palign, predux and the likes for AVX/complexes
2014-03-27 14:47:00 +01:00
Mark Borgerding
9ce0d78513
immintrin.h did not come until intel version 11
2014-03-26 22:26:07 -04:00
Gael Guennebaud
a6be1952f4
Fix a few regressions when moving the flags
2014-03-12 16:18:34 +01:00
Benoit Steiner
db7d49efbb
Added support for FMA instructions
2014-02-24 13:45:32 -08:00
Gael Guennebaud
cbc572caf7
Split LU/Inverse.h to Core/Inverse.h for the generic Inverse expression, and LU/InverseImpl.h for the dense implementation of dense.inverse()
2014-02-24 11:49:30 +01:00
Gael Guennebaud
6c7ab50811
Get rid of GeneralProduct<> for GemmProduct
2014-02-21 16:43:03 +01:00
Gael Guennebaud
61cff28618
Disable Flagged and ForceAlignedAccess
2014-02-19 14:05:56 +01:00
Gael Guennebaud
ccc41128fb
Add a Solve expression for uniform treatment of solve() methods.
2014-02-19 11:33:29 +01:00
Gael Guennebaud
a08cba6b5f
Move is_diagonal to XprHelper, forward declare Ref
2014-02-18 11:03:59 +01:00
Benoit Steiner
64a85800bd
Added support for AVX to Eigen.
2014-01-29 11:43:05 -08:00
Gael Guennebaud
8af1ba5346
Make swap unit test work with evaluators
2013-12-02 15:07:45 +01:00
Gael Guennebaud
cc6dd878ee
Refactor dense product evaluators
2013-11-27 17:32:57 +01:00
Gael Guennebaud
76c230a84d
Add an option to test evaluators globally
2013-11-07 16:38:14 +01:00
Gael Guennebaud
8edc964734
bug #99 : refactor assignment and compound assignment mechanism through "assignment functors" and "assignment kernels".
The former is very low level and generic. The latter abstracts the former for dense expressions. This refactoring makes it possible
to get rid of the very ugly SwapWrapper and SelfCwiseBinaryOp classes.
In the future, this will also make it possible to simplify all these evaluation loops and perhaps to reuse them for reductions.
It will also make it possible to specialize for operations like expr1 += expr2 outside Eigen, and so for any kind
of expressions (dense, sparse, tensor, etc.)
2013-11-06 18:17:59 +01:00
Gael Guennebaud
03de5c2410
Split the huge Functors.h file
2013-11-06 10:36:10 +01:00
Gael Guennebaud
4f572e4c14
Add minimalistic unit tests for NVCC support
2013-11-05 15:41:45 +01:00
Gael Guennebaud
1bb1a57ef7
merge with default branch
2013-11-05 10:31:59 +01:00
Gael Guennebaud
2b15e00106
Make ArrayBase operator+=(scalar) and -=(scalar) use SelfCwiseBinaryOp optimization
2013-08-19 16:40:50 +02:00
Gael Guennebaud
ddf7753631
Add nvcc support for small eigenvalues decompositions and workaround lack of support for std::swap and std::numeric_limits
2013-08-01 16:26:57 +02:00
Gael Guennebaud
2f593ee67c
merge with main branch
2013-07-17 13:21:35 +02:00
Gael Guennebaud
cc03c9d683
bug #556 : workaround mingw bug with -O3 or -fipa-cp-clone
2013-07-05 23:47:40 +02:00
Gael Guennebaud
62670c83a0
Fix bug #314 : move remaining math functions from internal to numext namespace
2013-06-10 23:40:56 +02:00
Gael Guennebaud
9cd2d14005
merge with default branch
2013-04-19 11:21:39 +02:00
Gael Guennebaud
4e2e615a7c
actually assertions are incompatible with nvcc even on host code
2013-04-19 11:14:17 +02:00
Gael Guennebaud
6eaff5a098
Enable SSE with ICC even when it mimics a gcc version lower than 4.2
2013-04-11 19:48:34 +02:00
Gael Guennebaud
12439e1249
Port SelfCwiseBinaryOp and Dot.h to nvcc, fix portability issue with std::min/max
2013-04-05 16:35:49 +02:00
Gael Guennebaud
d93c1c113b
NVCC: EIGEN_NO_DEBUG must be defined before including Macro.h
2013-02-21 19:05:23 +01:00
Gael Guennebaud
968f7591f8
Make it compile without nvcc
2013-02-21 12:51:58 +01:00
Gael Guennebaud
5adcc6c7b4
Add support for NVCC5: most of the Core and part of LU are callable from CUDA code.
Still a lot to do.
2013-02-07 19:06:14 +01:00
Gael Guennebaud
209199a13e
Move the definition of DenseBase::InnerIterator to Core module. (needed to make blueNorm generic)
2013-01-15 22:03:54 +01:00
Gael Guennebaud
93ee82b1fd
Big changes in Eigen documentation:
- Organize the documentation into "chapters".
- Each chapter includes many documentation pages, reference pages organized as modules, and a quick reference page.
- The "Chapters" tree is created using the defgroup/ingroup mechanism, even for the documentation pages (i.e., .dox files for which I added an \eigenManualPage macro that we can switch between \page or \defgroup ).
- Add a "General topics" entry for all pages that do not fit well in the previous "chapters".
- The high-level structure is managed by a new eigendoxy_layout.xml file.
- remove the "index" and quite useless pages (namespace list, class hierarchy, member list, file list, etc.)
- add the javascript search-engine.
- add the "treeview" panel.
- remove \tableofcontents (replace them by a custom \eigenAutoToc macro to be able to easily re-enable if needed).
- add javascript to automatically generate a TOC from the h1/h2 tags of the current page, and put the TOC in the left side panel.
- overload various javascript functions generated by doxygen to:
  - remove the root of the treeview
  - remove links to section/subsection from the treeview
  - automatically expand the "Chapters" section
  - automatically expand the current section
  - adjust the height of the treeview to take into account the TOC
- always use the default .css file, eigendoxy.css now only includes our modifications
- use Doxyfile to specify our logo
- remove cross references to unsupported modules (temporarily)
2013-01-05 16:37:11 +01:00
Gael Guennebaud
87074d97e5
old gcc versions do not have immintrin.h file...
2012-09-27 23:35:54 +02:00
Gael Guennebaud
7e97dd5bd8
we should not directly include the *mmintrin.h headers but include immintrin.h only
2012-09-26 19:28:57 +02:00