Commit Graph

4850 Commits

Author SHA1 Message Date
Benoit Steiner
1f1e0b9e30 Silenced compilation warning 2016-06-05 12:59:11 -07:00
Benoit Steiner
5b95b4daf9 Moved static assertions into the class constructor to make the code more portable 2016-06-05 12:57:48 -07:00
Sean Templeton
bd21243821 Fix compile errors initializing packets on ARM DS-5 5.20
The ARM DS-5 5.20 compiler fails compiling with the following errors:

"src/Core/arch/NEON/PacketMath.h", line 113: Error:  #146: too many initializer values
    Packet4f countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
                         ^
"src/Core/arch/NEON/PacketMath.h", line 118: Error:  #146: too many initializer values
    Packet4i countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
                         ^
"src/Core/arch/NEON/Complex.h", line 30: Error:  #146: too many initializer values
  static uint32x4_t p4ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET4(0x00000000, 0x80000000, 0x00000000, 0x80000000);
                                    ^
"src/Core/arch/NEON/Complex.h", line 31: Error:  #146: too many initializer values
  static uint32x2_t p2ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET2(0x00000000, 0x80000000);
                                    ^

The vectors are implemented as two doubles, hence the too many initializer values error.
Changed the code to use intrinsic load functions which all compilers
implementing NEON should have.
2016-06-03 10:51:35 -05:00
Gael Guennebaud
1fc2746417 Make Arrays's ctor/assignment noexcept 2016-06-09 22:52:37 +02:00
Gael Guennebaud
2bd59b0e0d Take advantage that T is already diagonal in the extraction of generalized complex eigenvalues. 2016-06-09 17:12:03 +02:00
Gael Guennebaud
c1f9ca9254 Update RealQZ to reduce 2x2 diagonal block of T corresponding to non reduced diagonal block of S to positive diagonal form.
This step involve a real 2x2 SVD problem. The respective routine is thus in src/misc/ to be shared by both EVD and AVD modules.
2016-06-09 17:11:03 +02:00
Gael Guennebaud
a20d2ec1c0 Fix shadow variable, and indexing. 2016-06-09 16:16:22 +02:00
Abhijit Kundu
0beabb4776 Fixed type conversion from int 2016-06-08 16:12:04 -04:00
Gael Guennebaud
df095cab10 Fixes for PARDISO: warnings, and defaults to metis+ in-core mode. 2016-06-08 18:31:19 +02:00
Gael Guennebaud
9fc8379328 Fix extraction of complex eigenvalue pairs in real generalized eigenvalue problems. 2016-06-08 16:39:11 +02:00
Benoit Steiner
8fd57a97f2 Enable the vectorization of adds and mults of fp16 2016-06-07 18:22:18 -07:00
Benoit Steiner
ea75dba201 Added missing EIGEN_DEVICE_FUNC qualifiers to the unary array ops 2016-06-06 13:32:28 -07:00
Benoit Steiner
33f0340188 Implement result_of for the new ternary functors 2016-06-06 12:06:42 -07:00
Gael Guennebaud
df24f4a01d bug #1201: improve code generation of affine*vec with MSVC 2016-06-06 16:46:46 +02:00
Eugene Brevdo
39baff850c Add TernaryFunctors and the betainc SpecialFunction.
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.

Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.

Added unit tests in array.cpp and cxx11_tensor_cuda.cu


Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc().  Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe.  and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Gael Guennebaud
8d97ba6b22 bug #725: make move ctor/assignment noexcept. 2016-06-03 14:28:25 +02:00
Gael Guennebaud
fe62c06d9b Fix compilation. 2016-06-03 07:47:38 +02:00
Gael Guennebaud
969b8959a0 Fix compilation: Matrix does not indirectly live in the internal namespace anymore! 2016-06-03 07:44:58 +02:00
Gael Guennebaud
f2c2465acc Fix function dependencies 2016-06-03 07:44:18 +02:00
Gael Guennebaud
53feb73b45 Remove dead code. 2016-06-02 22:19:55 +02:00
Gael Guennebaud
2c00ac0b53 Implement generic scalar*expr and expr*scalar operator based on scalar_product_traits.
This is especially useful for custom scalar types, e.g., to enable float*expr<multi_prec> without conversion.
2016-06-02 22:16:37 +02:00
Gael Guennebaud
8b6f53222b bug #1193: fix lpNorm<Infinity> for empty input. 2016-06-02 15:29:59 +02:00
Gael Guennebaud
360e311b66 Doc: add some cross references (also fix empty macro argument warning) 2016-06-01 23:34:09 +02:00
Gael Guennebaud
3c69afca4c Add missing ArrayBase::log1p 2016-06-01 17:08:47 +02:00
Gael Guennebaud
89099b0cf7 Expose log1p to Array. 2016-06-01 17:00:08 +02:00
Gael Guennebaud
afd33539dd Doc: makes the global unary math functions visible to doxygen (and docuement them) 2016-06-01 15:27:13 +02:00
Gael Guennebaud
77e652d8ad Doc: improve documentation of Map<SparseMatrix> 2016-06-01 10:03:32 +02:00
Gael Guennebaud
da4970ead2 Doc: disable inlining of inherited members, workaround Doxygen's limited C++ parsing abilities, and improve doc of MapBase. 2016-06-01 09:38:49 +02:00
Benoit Steiner
099b354ca7 Pulled latest updates from trunk 2016-05-31 10:34:16 -07:00
Benoit Steiner
b6e306f189 Improved support for CUDA 8.0 2016-05-31 09:47:59 -07:00
Gael Guennebaud
1d3b253329 bug #1181: help MSVC inlining. 2016-05-31 17:23:42 +02:00
Gael Guennebaud
d79eee05ef Fix compilation with old icc 2016-05-31 17:13:51 +02:00
Gael Guennebaud
2c1b56f4c1 bug #1238: fix SparseMatrix::sum() overload for un-compressed mode. 2016-05-31 10:56:53 +02:00
Benoit Steiner
c4bd3b1f21 Silenced some compilation warnings triggered by nvcc 8.0 2016-05-27 14:40:49 -07:00
Benoit Steiner
3a5d6a3c38 Disable the use of MMX instructions since the code is broken on many platforms 2016-05-27 09:13:26 -07:00
Gael Guennebaud
e0cb73b46b Fix compilation with old ICC version (use C99 types instead of C++11 ones) 2016-05-27 10:28:09 +02:00
Benoit Steiner
094f4a56c8 Deleted extra namespace 2016-05-26 14:49:51 -07:00
Gael Guennebaud
7ff5fadcc0 Disable usage of MMX with msvc. 2016-05-26 17:58:46 +02:00
Gael Guennebaud
e8cef383b7 bug #1236: fix possible integer overflow in density estimation. 2016-05-26 17:51:04 +02:00
Gael Guennebaud
30d97c03ce Defer the allocation of the working space:
- it is not always needed,
- and this fixes a long-to-float conversion warning
2016-05-26 17:39:42 +02:00
Gael Guennebaud
e08f54e9eb Fix copy ctor prototype. 2016-05-26 17:37:25 +02:00
Gael Guennebaud
c7f54b11ec linspaced's divisor for integer is better stored as the underlying scalar type. 2016-05-26 17:36:54 +02:00
Gael Guennebaud
bebc5a2147 Fix/handle some int-to-long conversions. 2016-05-26 17:35:53 +02:00
Gael Guennebaud
00c29c2cae Store permutation's determinant as char.
This also fixes some long to float conversion warnings
2016-05-26 17:34:23 +02:00
Gael Guennebaud
2f56d91063 Fix a pointer to integer conversion warning 2016-05-26 17:31:45 +02:00
Gael Guennebaud
2a44a70142 Handle some Index to int conversions in BLAS/LAPACK support. 2016-05-26 17:29:04 +02:00
Gael Guennebaud
f253e19296 Disable some long to float conversion warnings 2016-05-26 17:27:14 +02:00
Gael Guennebaud
37197b602b Remove debuging code. 2016-05-26 11:53:10 +02:00
Gael Guennebaud
27f0434233 Introduce internal's UIntPtr and IntPtr types for pointer to integer conversions.
This fixes "conversion from pointer to same-sized integral type" warnings by ICC.
Ideally, we would use the std::[u]intptr_t types all the time, but since they are C99/C++11 only,
let's be safe.
2016-05-26 10:52:12 +02:00
Gael Guennebaud
40e4637d79 Turn off ICC's conversion warning in is_convertible implementation 2016-05-26 10:48:43 +02:00
Gael Guennebaud
cc1ab64f29 Add missing inclusion of mmintrin.h 2016-05-26 09:51:50 +02:00
Benoit Steiner
3585ff585e Silenced a compilation warning 2016-05-25 22:09:19 -07:00
Benoit Steiner
efeb89dcdb Specify the rounding mode in the correct location 2016-05-25 17:53:24 -07:00
Benoit Steiner
0322c66a3f Explicitly specify the rounding mode when converting floats to fp16 2016-05-25 15:56:15 -07:00
Benoit Steiner
ed783872ab Disable the use of MMX instructions on x86_64 since too many compilers only support them in 32bit mode 2016-05-25 08:27:26 -07:00
Benoit Steiner
bcfff64f9e Use numext:: instead of std:: functions. 2016-05-25 08:08:21 -07:00
Gael Guennebaud
bbf9109e25 Fix compilation with ICC. 2016-05-25 10:00:55 +02:00
Gael Guennebaud
2a1bff67fd Fix static/inline order. 2016-05-25 10:00:11 +02:00
Benoit Steiner
d041a528da Cleaned up the fp16 code a little more 2016-05-24 22:43:26 -07:00
Benoit Steiner
cb26784d07 Pulled latest updates from trunk 2016-05-24 18:51:39 -07:00
Benoit Steiner
ff4a289572 Cleaned up the fp16 code 2016-05-24 18:50:09 -07:00
Gael Guennebaud
e68e165a23 bug #256: enable vectorization with unaligned loads/stores.
This concerns all architectures and all sizes.
This new behavior can be disabled by defining EIGEN_UNALIGNED_VECTORIZE=0
2016-05-24 21:54:03 +02:00
Gael Guennebaud
78390e4189 Block<> should not disable vectorization based on inner-size, this is the responsibilty of the assignment logic. 2016-05-24 17:14:01 +02:00
Gael Guennebaud
64bb7576eb Clean propagation of Dest/Src alignments. 2016-05-24 17:12:12 +02:00
Benoit Jacob
40a16282c7 Remove now-unused protate PacketMath func 2016-05-24 11:01:18 -04:00
Benoit Jacob
6136f4fdd4 Remove the rotating kernel. It was only useful on some ARM CPUs (Qualcomm Krait) that are not as ubiquitous today as they were when I introduced it. 2016-05-24 10:00:32 -04:00
Benoit Steiner
e617711306 Don't attempt to use MMX instructions with visualstudio since they're only partially supported. 2016-05-24 06:43:58 -07:00
Benoit Steiner
334e76537f Worked around missing clang intrinsic 2016-05-24 00:29:28 -07:00
Benoit Steiner
b517ab349b Use the generic ploadquad intrinsics since it does the job 2016-05-24 00:11:17 -07:00
Benoit Steiner
646872cb3b Worked around missing clang intrinsics 2016-05-24 00:07:08 -07:00
Benoit Steiner
3dfc391a61 Added missing EIGEN_DEVICE_FUNC qualifier 2016-05-23 20:56:59 -07:00
Benoit Steiner
3d0741f027 Include mmintrin.h to make it possible to use mmx instructions when needed. For example, this will enable the definition of a half packet for the Packet4f type. 2016-05-23 20:43:48 -07:00
Benoit Steiner
33a94f5dc7 Use the Index type instead of integers to specify the strides in pgather/pscatter 2016-05-23 20:37:30 -07:00
Benoit Steiner
6bc684ab6a Added missing alignment in the fp16 packet traits 2016-05-23 20:32:30 -07:00
Benoit Steiner
283e33dea4 ptranspose is not a template. 2016-05-23 19:55:55 -07:00
Benoit Steiner
a5a3ba2b80 Avoid unnecessary float to double conversions 2016-05-23 17:16:09 -07:00
Benoit Steiner
5ba0ebe7c9 Avoid unnecessary float to double conversion. 2016-05-23 17:14:31 -07:00
Benoit Steiner
7d980d74e5 Started to vectorize the processing of 16bit floats on CPU. 2016-05-23 15:21:40 -07:00
Benoit Steiner
5d51a7f12c Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path. 2016-05-23 15:13:16 -07:00
Christoph Hertzberg
88654762da Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit. 2016-05-23 10:03:03 +02:00
Christoph Hertzberg
718521d5cf Silenced several double-promotion warnings 2016-05-22 18:17:04 +02:00
Gael Guennebaud
ccaace03c9 Make EIGEN_HAS_CONSTEXPR user configurable 2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable 2016-05-20 15:05:38 +02:00
Gael Guennebaud
abd1c1af7a Make EIGEN_HAS_STD_RESULT_OF user configurable 2016-05-20 15:01:27 +02:00
Gael Guennebaud
1395056fc0 Make EIGEN_HAS_C99_MATH user configurable 2016-05-20 14:58:19 +02:00
Gael Guennebaud
48bf5ec216 Make EIGEN_HAS_RVALUE_REFERENCES user configurable 2016-05-20 14:54:20 +02:00
Gael Guennebaud
f43ae88892 Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES 2016-05-20 14:48:51 +02:00
Gael Guennebaud
998f2efc58 Add a EIGEN_MAX_CPP_VER option to limit the C++ version to be used. 2016-05-20 14:44:28 +02:00
Gael Guennebaud
c028d96089 Improve doc of special math functions 2016-05-20 14:18:48 +02:00
Gael Guennebaud
0ba32f99bd Rename UniformRandom to UnitRandom. 2016-05-20 13:21:34 +02:00
Gael Guennebaud
7a9d9cde94 Fix coding practice in Quaternion::UniformRandom 2016-05-20 13:19:52 +02:00
Joseph Mirabel
eb0cc2573a bug #823: add static method to Quaternion for uniform random rotations. 2016-05-20 13:15:40 +02:00
Gael Guennebaud
6761c64d60 zeta and polygamma are not unary functions, but binary ones. 2016-05-19 18:34:16 +02:00
Gael Guennebaud
7a54032408 zeta and digamma do not require C++11/C99 2016-05-19 17:36:47 +02:00
Gael Guennebaud
ce12562710 Add some c++11 flags in documentation 2016-05-19 17:35:30 +02:00
Gael Guennebaud
b6ed8244b4 bug #1201: optimize affine*vector products 2016-05-19 16:09:15 +02:00
Gael Guennebaud
73693b5de6 bug #1221: disable gcc 6 warning: ignoring attributes on template argument 2016-05-19 15:21:53 +02:00
Gael Guennebaud
df9a5e13c6 Fix SelfAdjointEigenSolver for some input expression types, and add new regression unit tests for sparse and selfadjointview inputs. 2016-05-19 13:07:33 +02:00
Gael Guennebaud
6a2916df80 DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag. 2016-05-19 13:06:21 +02:00
Gael Guennebaud
a226f6af6b Add support for SelfAdjointView::diagonal() 2016-05-19 13:05:33 +02:00
Gael Guennebaud
ee7da3c7c5 Fix SelfAdjointView::triangularView for complexes. 2016-05-19 13:01:51 +02:00
Gael Guennebaud
b6b8578a67 bug #1230: add support for SelfadjointView::triangularView. 2016-05-19 11:36:38 +02:00
Gael Guennebaud
84df9142e7 bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests 2016-05-18 23:00:13 +02:00
Gael Guennebaud
21d692d054 Use coeff(i,j) instead of operator(). 2016-05-18 17:09:20 +02:00
Gael Guennebaud
8456bbbadb bug #1224: fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only. 2016-05-18 16:53:28 +02:00
Gael Guennebaud
b507b82326 Use default sorting strategy for square products. 2016-05-18 16:51:54 +02:00
Gael Guennebaud
747e3290c0 bug #1213: rename some enums type for consistency. 2016-05-18 13:26:56 +02:00
Rasmus Munk Larsen
0dbd68145f Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. 2016-05-17 10:25:19 -07:00
Rasmus Munk Larsen
e55deb21c5 Improvements to parallelFor.
Move some scalar functors from TensorFunctors. to Eigen core.
2016-05-12 14:07:22 -07:00
Benoit Steiner
fae0493f98 Fixed a couple of bugs related to the Pascalfamily of GPUs
H: Enter commit message.  Lines beginning with 'HG:' are removed.
2016-05-11 23:02:26 -07:00
Benoit Steiner
b6a517c47d Added the ability to load fp16 using the texture path.
Improved the performance of some reductions on fp16
2016-05-11 21:26:48 -07:00
Benoit Steiner
518149e868 Misc fixes for fp16 2016-05-11 20:11:14 -07:00
Benoit Steiner
56a1757d74 Made predux_min and predux_max on fp16 less noisy 2016-05-11 17:37:34 -07:00
Benoit Steiner
9091351dbe __ldg is only available with cuda architectures >= 3.5 2016-05-11 15:22:13 -07:00
Benoit Steiner
02f76dae2d Fixed a typo 2016-05-11 15:08:38 -07:00
Christoph Hertzberg
131e5a1a4a Do not copy for trivial 1x1 case. This also avoids a "maybe-uninitialized" warning in some situations. 2016-05-11 23:50:13 +02:00
Benoit Steiner
70195a5ff7 Added missing EIGEN_DEVICE_FUNC 2016-05-11 14:10:09 -07:00
Benoit Steiner
09a19c33a8 Added missing EIGEN_DEVICE_FUNC qualifiers 2016-05-11 14:07:43 -07:00
Christoph Hertzberg
33ca7e3c8d bug #1207: Add and fix logical-op warnings 2016-05-11 19:36:34 +02:00
Benoit Steiner
217d984abc Fixed a typo in my previous commit 2016-05-11 10:22:15 -07:00
Christoph Hertzberg
0f61343893 Workaround maybe-uninitialized warning 2016-05-11 09:00:18 +02:00
Christoph Hertzberg
3bfc9b47ca Workaround "misleading-indentation" warnings 2016-05-11 08:41:36 +02:00
Benoit Steiner
0b9e3dcd06 Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%. 2016-05-10 11:05:33 -07:00
Benoit Steiner
8adf5cc70f Added support for packet processing of fp16 on kepler and maxwell gpus 2016-05-06 19:16:43 -07:00
Christoph Hertzberg
a11bd82dc3 bug #1213: Give names to anonymous enums 2016-05-06 11:31:56 +02:00
Benoit Steiner
0451940fa4 Relaxed the dummy precision for fp16 2016-05-05 15:40:01 -07:00
Christoph Hertzberg
dacb469bc9 Enable and fix -Wdouble-conversion warnings 2016-05-05 13:35:45 +02:00
Ola Røer Thorsen
be78aea6b3 fix double-promotion/float-conversion in Core/SpecialFunctions.h 2016-05-04 10:52:08 +02:00
Gael Guennebaud
75a94b9662 Improve documentation of BDCSVD 2016-05-04 12:53:14 +02:00
Gael Guennebaud
e2ca478485 bug #1214: consider denormals as zero in D&C SVD. This also workaround infinite binary search when compiling with ICC's unsafe optimizations. 2016-05-03 23:15:29 +02:00
Benoit Steiner
6c3e5b85bc Fixed compilation error with cuda >= 7.5 2016-05-03 09:38:42 -07:00
Benoit Steiner
da50419df8 Made a cast explicit 2016-05-02 19:50:22 -07:00
Gael Guennebaud
b1bd53aa6b Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignement. 2016-05-01 23:25:06 +02:00
Benoit Steiner
2b890ae618 Fixed compilation errors generated by clang 2016-04-29 18:30:40 -07:00
Benoit Steiner
46bcb70969 Don't turn on const expressions when compiling with gcc >= 4.8 unless the -std=c++11 option has been used 2016-04-29 15:20:59 -07:00
Gael Guennebaud
0f3c4c8ff4 Fix compilation of sparse.cast<>().transpose(). 2016-04-29 18:26:08 +02:00
Benoit Steiner
dacb23277e Fixed the igamma and igammac implementations to make them callable from a gpu kernel. 2016-04-28 18:54:54 -07:00
Benoit Steiner
a5d4545083 Deleted unused variable 2016-04-28 14:14:48 -07:00
Justin Lebar
40d1e2f8c7 Eliminate mutual recursion in igamma{,c}_impl::Run.
Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls
igammac_impl::Run.

This isn't actually mutual recursion; the calls are guarded such that we never
get into a loop.  Nonetheless, it's a stretch for clang to prove this.  As a
result, clang emits a recursive call in both igammac_impl::Run and
igamma_impl::Run.

That this is suboptimal code is bad enough, but it's particularly bad when
compiling for CUDA/nvptx.  nvptx allows recursion, but only begrudgingly: If
you have recursive calls in a kernel, it's on you to manually specify the
kernel's stack size.  Otherwise, ptxas will dump a warning, make a guess, and
who knows if it's right.

This change explicitly eliminates the mutual recursion in igammac_impl::Run and
igamma_impl::Run.
2016-04-28 13:57:08 -07:00
Konstantinos Margaritis
87294c84a6 define Packet2d constants with VSX only 2016-04-28 14:39:56 -03:00
Konstantinos Margaritis
6ed7a7281c remove accidentally pasted code 2016-04-28 14:35:55 -03:00
Konstantinos Margaritis
62f9093b31 improve state of MathFunctions as well 2016-04-28 14:33:09 -03:00
Konstantinos Margaritis
8ed26120c8 bring Altivec/VSX to a better state, implement some of the missing functions 2016-04-28 14:32:42 -03:00
Konstantinos Margaritis
950158f6d1 add name to copyrights 2016-04-28 14:32:11 -03:00
Konstantinos Margaritis
ee0459300b minor fix, add to copyright 2016-04-28 14:31:21 -03:00
Benoit Steiner
2b917291d9 Merged in rmlarsen/eigen2 (pull request PR-183)
Detect cxx_constexpr support when compiling with clang.
2016-04-27 15:19:54 -07:00
Rasmus Munk Larsen
09b9e951e3 Depend on the more extensive support for constexpr in clang:
http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr
2016-04-27 14:59:11 -07:00
Rasmus Munk Larsen
1a325ef71c Detect cxx_constexpr support when compiling with clang. 2016-04-27 14:33:51 -07:00
Benoit Steiner
c61170e87d fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supportes: remove it from Eigen. 2016-04-27 14:22:20 -07:00
Benoit Steiner
f629fe95c8 Made the index type a template parameter to evaluateProductBlockingSizes
Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes.
2016-04-27 13:11:19 -07:00