Benoit Jacob
38fb606052
Avoid global variables with static constructors in NEON/Complex.h
2016-06-28 11:12:49 -04:00
Gael Guennebaud
d937a420a2
Fix compilation with MSVC by using our portable numext::log1p implementation.
2016-08-22 15:44:21 +02:00
Gael Guennebaud
2d5731e40a
bug #1270 : bypass custom asm for pmadd and recent clang version
2016-08-22 15:38:03 +02:00
Gael Guennebaud
49b005181a
Define EIGEN_COMP_CLANG to clang version as major*100+minor (e.g., 307 corresponds to clang 3.7)
2016-08-22 15:37:05 +02:00
Gael Guennebaud
130f891bb0
bug #1278 : ease parsing
2016-08-22 15:00:29 +02:00
Gael Guennebaud
d476cadbb8
bug #1247 : fix regression in compilation of pow(integer,integer), and add respective unit tests.
2016-06-25 10:12:06 +02:00
Gael Guennebaud
c50c73cae2
Fix missing specialization.
2016-06-24 23:10:39 +02:00
Gael Guennebaud
cd577a275c
Relax promote_scalar_arg logic to enable promotion to Expr::Scalar if conversion to Expr::Literal fails.
...
This is useful to cancel expression template at the scalar level, e.g. with AutoDiff<AutoDiff<>>.
This patch also defers calls to NumTraits in cases for which types are not directly compatible.
2016-06-24 11:28:54 +02:00
Gael Guennebaud
deb45ad4bc
bug #1245 : fix compilation with msvc
2016-06-24 09:52:25 +02:00
Gael Guennebaud
55fc04e8b5
Fix operator priority
2016-06-23 15:36:42 +02:00
Gael Guennebaud
bf2d5edecc
Fix warning.
2016-06-23 15:35:17 +02:00
Gael Guennebaud
7c6561485a
merge PR 194
2016-06-23 15:29:57 +02:00
Konstantinos Margaritis
be107e387b
fix compilation with clang 3.9, fix performance with pset1, use vector operators instead of intrinsics in some cases
2016-06-23 10:19:05 -03:00
Gael Guennebaud
76faf4a965
Introduce a NumTraits<T>::Literal type to be used for literals, and
...
improve mixing type support in operations between arrays and scalars:
- 2 * ArrayXcf is now optimized in the sense that the integer 2 is properly promoted to a float instead of a complex<float> (fix a regression)
- 2.1 * ArrayXi is now forbiden (previously, 2.1 was converted to 2)
- This mechanism should be applicable to any custom scalar type, assuming NumTraits<T>::Literal is properly defined (it defaults to T)
2016-06-23 14:27:20 +02:00
Gael Guennebaud
a3f7edf7e7
Biug 1242: fix comma init with empty matrices.
2016-06-23 10:25:04 +02:00
Konstantinos Margaritis
8c34b5a0e3
mostly cleanups and modernizing code
2016-06-19 16:13:17 -03:00
Konstantinos Margaritis
b410d46482
mostly cleanups and modernizing code
2016-06-19 16:12:52 -03:00
Konstantinos Margaritis
b80379bda0
fixed pexp<Packet2d>, was failing tests
2016-06-19 16:11:58 -03:00
Benoit Steiner
b055590e91
Made log1p_impl usable inside a GPU kernel
2016-06-16 11:37:40 -07:00
Gael Guennebaud
67c12531e5
Fix warnings with gcc
2016-06-15 18:11:33 +02:00
Gael Guennebaud
eb91345d64
Move scalar/expr to ArrayBase and fix documentation
2016-06-15 15:22:03 +02:00
Gael Guennebaud
4794834397
Propagate functor to ScalarBinaryOpTraits
2016-06-15 09:58:49 +02:00
Gael Guennebaud
c55035b9c0
Include the cost of stores in unrolling of triangular expressions.
2016-06-15 09:57:33 +02:00
Gael Guennebaud
4e7c3af874
Cleanup useless helper: internal::product_result_scalar
2016-06-15 00:04:10 +02:00
Gael Guennebaud
101ea26f5e
Include the cost of stores in unrolling (also fix infinite unrolling with expression costing 0 like Constant)
2016-06-15 00:01:16 +02:00
Gael Guennebaud
76236cdea4
merge
2016-06-14 15:33:47 +02:00
Gael Guennebaud
1004c4df99
Cleanup unused functors.
2016-06-14 15:27:28 +02:00
Gael Guennebaud
70dad84b73
Generalize expr/expr and scalar/expr wrt scalar types.
2016-06-14 15:26:37 +02:00
Gael Guennebaud
62134082aa
Update AutoDiffScalar wrt to scalar-multiple.
2016-06-14 15:06:35 +02:00
Gael Guennebaud
396d9cfb6e
Generalize expr.pow(scalar), pow(expr,scalar) and pow(scalar,expr).
...
Internal: scalar_pow_op (unary) is removed, and scalar_binary_pow_op is renamed scalar_pow_op.
2016-06-14 14:10:07 +02:00
Gael Guennebaud
a8c08e8b8e
Implement expr+scalar, scalar+expr, expr-scalar, and scalar-expr as binary expressions, and generalize supported scalar types.
...
The following functors are now deprecated: scalar_add_op, scalar_sub_op, and scalar_rsub_op.
2016-06-14 12:06:10 +02:00
Gael Guennebaud
756ac4a93d
Fix doc.
2016-06-14 12:03:39 +02:00
Gael Guennebaud
bcc0f38f98
Add unittesting plugins to scalar_product_op and scalar_quotient_op to help chaking that types are properly propagated.
2016-06-14 11:31:27 +02:00
Gael Guennebaud
f57fd78e30
Generalize coeff-wise sparse products to support different scalar types
2016-06-14 11:29:54 +02:00
Gael Guennebaud
f5b1c73945
Set cost of constant expression to 0 (the cost should be amortized through the expression)
2016-06-14 11:29:06 +02:00
Gael Guennebaud
deb8306e60
Move MatrixBase::operaotr*(UniformScaling) as a free function in Scaling.h, and fix return type.
2016-06-14 11:28:03 +02:00
Gael Guennebaud
64fcfd314f
Implement scalar multiples and division by a scalar as a binary-expression with a constant expression.
...
This slightly complexifies the type of the expressions and implies that we now have to distinguish between scalar*expr and expr*scalar to catch scalar-multiple expression (e.g., see BlasUtil.h), but this brings several advantages:
- it makes it clear on each side the scalar is applied,
- it clearly reflects that we are dealing with a binary-expression,
- the complexity of the type is hidden through macros defined at the end of Macros.h,
- distinguishing between "scalar op expr" and "expr op scalar" is important to support non commutative fields (like quaternions)
- "scalar op expr" is now fully equivalent to "ConstantExpr(scalar) op expr"
- scalar_multiple_op, scalar_quotient1_op and scalar_quotient2_op are not used anymore in officially supported modules (still used in Tensor)
2016-06-14 11:26:57 +02:00
Gael Guennebaud
3c12e24164
Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them.
2016-06-13 16:18:59 +02:00
Gael Guennebaud
7a9ef7bbb4
Add default template parameters for the second scalar type of binary functors.
...
This enhences backward compatibility.
2016-06-13 16:17:23 +02:00
Gael Guennebaud
4c61f00838
Add missing explicit scalar conversion
2016-06-12 22:42:13 +02:00
Gael Guennebaud
83904a21c1
Make sure T(i+1,i)==0 when diagonalizing T(i:i+1,i:i+1)
2016-06-11 14:41:36 +02:00
Gael Guennebaud
fabae6c9a1
Cleanup
2016-06-10 15:58:33 +02:00
Gael Guennebaud
5fdd703629
Enable mixing types in numext::pow
2016-06-10 15:58:04 +02:00
Gael Guennebaud
2e238bafb6
Big 279: enable mixing types for comparisons, min, and max.
2016-06-10 15:05:43 +02:00
Gael Guennebaud
0028049380
bug #1240 : Remove any assumption on NEON vector types.
2016-06-09 23:08:11 +02:00
Gael Guennebaud
2c462f4201
Clean handling for void type in EIGEN_CHECK_BINARY_COMPATIBILIY
2016-06-06 23:11:38 +02:00
Gael Guennebaud
3d71d3918e
Disable shortcuts for res ?= prod when the scalar types do not match exactly.
2016-06-06 23:10:55 +02:00
Gael Guennebaud
66e99ab6a1
Relax mixing-type constraints for binary coefficient-wise operators:
...
- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
- Remove the "functor_is_product_like" helper (was pretty ugly)
- Currently, OP is not used, but it is available to the user for fine grained tuning
- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
- TODO: generalize all other binray operators (comparisons,pow,etc.)
- TODO: handle "scalar op array" operators (currently only * is handled)
- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
2016-06-06 15:11:41 +02:00
Benoit Steiner
1f1e0b9e30
Silenced compilation warning
2016-06-05 12:59:11 -07:00
Benoit Steiner
5b95b4daf9
Moved static assertions into the class constructor to make the code more portable
2016-06-05 12:57:48 -07:00
Sean Templeton
bd21243821
Fix compile errors initializing packets on ARM DS-5 5.20
...
The ARM DS-5 5.20 compiler fails compiling with the following errors:
"src/Core/arch/NEON/PacketMath.h", line 113: Error: #146 : too many initializer values
Packet4f countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
^
"src/Core/arch/NEON/PacketMath.h", line 118: Error: #146 : too many initializer values
Packet4i countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
^
"src/Core/arch/NEON/Complex.h", line 30: Error: #146 : too many initializer values
static uint32x4_t p4ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET4(0x00000000, 0x80000000, 0x00000000, 0x80000000);
^
"src/Core/arch/NEON/Complex.h", line 31: Error: #146 : too many initializer values
static uint32x2_t p2ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET2(0x00000000, 0x80000000);
^
The vectors are implemented as two doubles, hence the too many initializer values error.
Changed the code to use intrinsic load functions which all compilers
implementing NEON should have.
2016-06-03 10:51:35 -05:00
Gael Guennebaud
1fc2746417
Make Arrays's ctor/assignment noexcept
2016-06-09 22:52:37 +02:00
Gael Guennebaud
2bd59b0e0d
Take advantage that T is already diagonal in the extraction of generalized complex eigenvalues.
2016-06-09 17:12:03 +02:00
Gael Guennebaud
c1f9ca9254
Update RealQZ to reduce 2x2 diagonal block of T corresponding to non reduced diagonal block of S to positive diagonal form.
...
This step involve a real 2x2 SVD problem. The respective routine is thus in src/misc/ to be shared by both EVD and AVD modules.
2016-06-09 17:11:03 +02:00
Gael Guennebaud
a20d2ec1c0
Fix shadow variable, and indexing.
2016-06-09 16:16:22 +02:00
Abhijit Kundu
0beabb4776
Fixed type conversion from int
2016-06-08 16:12:04 -04:00
Gael Guennebaud
df095cab10
Fixes for PARDISO: warnings, and defaults to metis+ in-core mode.
2016-06-08 18:31:19 +02:00
Gael Guennebaud
9fc8379328
Fix extraction of complex eigenvalue pairs in real generalized eigenvalue problems.
2016-06-08 16:39:11 +02:00
Benoit Steiner
8fd57a97f2
Enable the vectorization of adds and mults of fp16
2016-06-07 18:22:18 -07:00
Benoit Steiner
ea75dba201
Added missing EIGEN_DEVICE_FUNC qualifiers to the unary array ops
2016-06-06 13:32:28 -07:00
Benoit Steiner
33f0340188
Implement result_of for the new ternary functors
2016-06-06 12:06:42 -07:00
Gael Guennebaud
df24f4a01d
bug #1201 : improve code generation of affine*vec with MSVC
2016-06-06 16:46:46 +02:00
Eugene Brevdo
39baff850c
Add TernaryFunctors and the betainc SpecialFunction.
...
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.
Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.
Added unit tests in array.cpp and cxx11_tensor_cuda.cu
Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe. and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Gael Guennebaud
8d97ba6b22
bug #725 : make move ctor/assignment noexcept.
2016-06-03 14:28:25 +02:00
Gael Guennebaud
fe62c06d9b
Fix compilation.
2016-06-03 07:47:38 +02:00
Gael Guennebaud
969b8959a0
Fix compilation: Matrix does not indirectly live in the internal namespace anymore!
2016-06-03 07:44:58 +02:00
Gael Guennebaud
f2c2465acc
Fix function dependencies
2016-06-03 07:44:18 +02:00
Gael Guennebaud
53feb73b45
Remove dead code.
2016-06-02 22:19:55 +02:00
Gael Guennebaud
2c00ac0b53
Implement generic scalar*expr and expr*scalar operator based on scalar_product_traits.
...
This is especially useful for custom scalar types, e.g., to enable float*expr<multi_prec> without conversion.
2016-06-02 22:16:37 +02:00
Gael Guennebaud
8b6f53222b
bug #1193 : fix lpNorm<Infinity> for empty input.
2016-06-02 15:29:59 +02:00
Gael Guennebaud
360e311b66
Doc: add some cross references (also fix empty macro argument warning)
2016-06-01 23:34:09 +02:00
Gael Guennebaud
3c69afca4c
Add missing ArrayBase::log1p
2016-06-01 17:08:47 +02:00
Gael Guennebaud
89099b0cf7
Expose log1p to Array.
2016-06-01 17:00:08 +02:00
Gael Guennebaud
afd33539dd
Doc: makes the global unary math functions visible to doxygen (and docuement them)
2016-06-01 15:27:13 +02:00
Gael Guennebaud
77e652d8ad
Doc: improve documentation of Map<SparseMatrix>
2016-06-01 10:03:32 +02:00
Gael Guennebaud
da4970ead2
Doc: disable inlining of inherited members, workaround Doxygen's limited C++ parsing abilities, and improve doc of MapBase.
2016-06-01 09:38:49 +02:00
Benoit Steiner
099b354ca7
Pulled latest updates from trunk
2016-05-31 10:34:16 -07:00
Benoit Steiner
b6e306f189
Improved support for CUDA 8.0
2016-05-31 09:47:59 -07:00
Gael Guennebaud
1d3b253329
bug #1181 : help MSVC inlining.
2016-05-31 17:23:42 +02:00
Gael Guennebaud
d79eee05ef
Fix compilation with old icc
2016-05-31 17:13:51 +02:00
Gael Guennebaud
2c1b56f4c1
bug #1238 : fix SparseMatrix::sum() overload for un-compressed mode.
2016-05-31 10:56:53 +02:00
Benoit Steiner
c4bd3b1f21
Silenced some compilation warnings triggered by nvcc 8.0
2016-05-27 14:40:49 -07:00
Benoit Steiner
3a5d6a3c38
Disable the use of MMX instructions since the code is broken on many platforms
2016-05-27 09:13:26 -07:00
Gael Guennebaud
e0cb73b46b
Fix compilation with old ICC version (use C99 types instead of C++11 ones)
2016-05-27 10:28:09 +02:00
Benoit Steiner
094f4a56c8
Deleted extra namespace
2016-05-26 14:49:51 -07:00
Gael Guennebaud
7ff5fadcc0
Disable usage of MMX with msvc.
2016-05-26 17:58:46 +02:00
Gael Guennebaud
e8cef383b7
bug #1236 : fix possible integer overflow in density estimation.
2016-05-26 17:51:04 +02:00
Gael Guennebaud
30d97c03ce
Defer the allocation of the working space:
...
- it is not always needed,
- and this fixes a long-to-float conversion warning
2016-05-26 17:39:42 +02:00
Gael Guennebaud
e08f54e9eb
Fix copy ctor prototype.
2016-05-26 17:37:25 +02:00
Gael Guennebaud
c7f54b11ec
linspaced's divisor for integer is better stored as the underlying scalar type.
2016-05-26 17:36:54 +02:00
Gael Guennebaud
bebc5a2147
Fix/handle some int-to-long conversions.
2016-05-26 17:35:53 +02:00
Gael Guennebaud
00c29c2cae
Store permutation's determinant as char.
...
This also fixes some long to float conversion warnings
2016-05-26 17:34:23 +02:00
Gael Guennebaud
2f56d91063
Fix a pointer to integer conversion warning
2016-05-26 17:31:45 +02:00
Gael Guennebaud
2a44a70142
Handle some Index to int conversions in BLAS/LAPACK support.
2016-05-26 17:29:04 +02:00
Gael Guennebaud
f253e19296
Disable some long to float conversion warnings
2016-05-26 17:27:14 +02:00
Gael Guennebaud
37197b602b
Remove debuging code.
2016-05-26 11:53:10 +02:00
Gael Guennebaud
27f0434233
Introduce internal's UIntPtr and IntPtr types for pointer to integer conversions.
...
This fixes "conversion from pointer to same-sized integral type" warnings by ICC.
Ideally, we would use the std::[u]intptr_t types all the time, but since they are C99/C++11 only,
let's be safe.
2016-05-26 10:52:12 +02:00
Gael Guennebaud
40e4637d79
Turn off ICC's conversion warning in is_convertible implementation
2016-05-26 10:48:43 +02:00
Gael Guennebaud
cc1ab64f29
Add missing inclusion of mmintrin.h
2016-05-26 09:51:50 +02:00
Benoit Steiner
3585ff585e
Silenced a compilation warning
2016-05-25 22:09:19 -07:00
Benoit Steiner
efeb89dcdb
Specify the rounding mode in the correct location
2016-05-25 17:53:24 -07:00
Benoit Steiner
0322c66a3f
Explicitly specify the rounding mode when converting floats to fp16
2016-05-25 15:56:15 -07:00
Benoit Steiner
ed783872ab
Disable the use of MMX instructions on x86_64 since too many compilers only support them in 32bit mode
2016-05-25 08:27:26 -07:00
Benoit Steiner
bcfff64f9e
Use numext:: instead of std:: functions.
2016-05-25 08:08:21 -07:00
Gael Guennebaud
bbf9109e25
Fix compilation with ICC.
2016-05-25 10:00:55 +02:00
Gael Guennebaud
2a1bff67fd
Fix static/inline order.
2016-05-25 10:00:11 +02:00
Benoit Steiner
d041a528da
Cleaned up the fp16 code a little more
2016-05-24 22:43:26 -07:00
Benoit Steiner
cb26784d07
Pulled latest updates from trunk
2016-05-24 18:51:39 -07:00
Benoit Steiner
ff4a289572
Cleaned up the fp16 code
2016-05-24 18:50:09 -07:00
Gael Guennebaud
e68e165a23
bug #256 : enable vectorization with unaligned loads/stores.
...
This concerns all architectures and all sizes.
This new behavior can be disabled by defining EIGEN_UNALIGNED_VECTORIZE=0
2016-05-24 21:54:03 +02:00
Gael Guennebaud
78390e4189
Block<> should not disable vectorization based on inner-size, this is the responsibilty of the assignment logic.
2016-05-24 17:14:01 +02:00
Gael Guennebaud
64bb7576eb
Clean propagation of Dest/Src alignments.
2016-05-24 17:12:12 +02:00
Benoit Jacob
40a16282c7
Remove now-unused protate PacketMath func
2016-05-24 11:01:18 -04:00
Benoit Jacob
6136f4fdd4
Remove the rotating kernel. It was only useful on some ARM CPUs (Qualcomm Krait) that are not as ubiquitous today as they were when I introduced it.
2016-05-24 10:00:32 -04:00
Benoit Steiner
e617711306
Don't attempt to use MMX instructions with visualstudio since they're only partially supported.
2016-05-24 06:43:58 -07:00
Benoit Steiner
334e76537f
Worked around missing clang intrinsic
2016-05-24 00:29:28 -07:00
Benoit Steiner
b517ab349b
Use the generic ploadquad intrinsics since it does the job
2016-05-24 00:11:17 -07:00
Benoit Steiner
646872cb3b
Worked around missing clang intrinsics
2016-05-24 00:07:08 -07:00
Benoit Steiner
3dfc391a61
Added missing EIGEN_DEVICE_FUNC qualifier
2016-05-23 20:56:59 -07:00
Benoit Steiner
3d0741f027
Include mmintrin.h to make it possible to use mmx instructions when needed. For example, this will enable the definition of a half packet for the Packet4f type.
2016-05-23 20:43:48 -07:00
Benoit Steiner
33a94f5dc7
Use the Index type instead of integers to specify the strides in pgather/pscatter
2016-05-23 20:37:30 -07:00
Benoit Steiner
6bc684ab6a
Added missing alignment in the fp16 packet traits
2016-05-23 20:32:30 -07:00
Benoit Steiner
283e33dea4
ptranspose is not a template.
2016-05-23 19:55:55 -07:00
Benoit Steiner
a5a3ba2b80
Avoid unnecessary float to double conversions
2016-05-23 17:16:09 -07:00
Benoit Steiner
5ba0ebe7c9
Avoid unnecessary float to double conversion.
2016-05-23 17:14:31 -07:00
Benoit Steiner
7d980d74e5
Started to vectorize the processing of 16bit floats on CPU.
2016-05-23 15:21:40 -07:00
Benoit Steiner
5d51a7f12c
Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path.
2016-05-23 15:13:16 -07:00
Christoph Hertzberg
88654762da
Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit.
2016-05-23 10:03:03 +02:00
Christoph Hertzberg
718521d5cf
Silenced several double-promotion warnings
2016-05-22 18:17:04 +02:00
Gael Guennebaud
ccaace03c9
Make EIGEN_HAS_CONSTEXPR user configurable
2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd
Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable
2016-05-20 15:05:38 +02:00
Gael Guennebaud
abd1c1af7a
Make EIGEN_HAS_STD_RESULT_OF user configurable
2016-05-20 15:01:27 +02:00
Gael Guennebaud
1395056fc0
Make EIGEN_HAS_C99_MATH user configurable
2016-05-20 14:58:19 +02:00
Gael Guennebaud
48bf5ec216
Make EIGEN_HAS_RVALUE_REFERENCES user configurable
2016-05-20 14:54:20 +02:00
Gael Guennebaud
f43ae88892
Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES
2016-05-20 14:48:51 +02:00
Gael Guennebaud
998f2efc58
Add a EIGEN_MAX_CPP_VER option to limit the C++ version to be used.
2016-05-20 14:44:28 +02:00
Gael Guennebaud
c028d96089
Improve doc of special math functions
2016-05-20 14:18:48 +02:00
Gael Guennebaud
0ba32f99bd
Rename UniformRandom to UnitRandom.
2016-05-20 13:21:34 +02:00
Gael Guennebaud
7a9d9cde94
Fix coding practice in Quaternion::UniformRandom
2016-05-20 13:19:52 +02:00
Joseph Mirabel
eb0cc2573a
bug #823 : add static method to Quaternion for uniform random rotations.
2016-05-20 13:15:40 +02:00
Gael Guennebaud
6761c64d60
zeta and polygamma are not unary functions, but binary ones.
2016-05-19 18:34:16 +02:00
Gael Guennebaud
7a54032408
zeta and digamma do not require C++11/C99
2016-05-19 17:36:47 +02:00
Gael Guennebaud
ce12562710
Add some c++11 flags in documentation
2016-05-19 17:35:30 +02:00
Gael Guennebaud
b6ed8244b4
bug #1201 : optimize affine*vector products
2016-05-19 16:09:15 +02:00
Gael Guennebaud
73693b5de6
bug #1221 : disable gcc 6 warning: ignoring attributes on template argument
2016-05-19 15:21:53 +02:00
Gael Guennebaud
df9a5e13c6
Fix SelfAdjointEigenSolver for some input expression types, and add new regression unit tests for sparse and selfadjointview inputs.
2016-05-19 13:07:33 +02:00
Gael Guennebaud
6a2916df80
DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag.
2016-05-19 13:06:21 +02:00
Gael Guennebaud
a226f6af6b
Add support for SelfAdjointView::diagonal()
2016-05-19 13:05:33 +02:00
Gael Guennebaud
ee7da3c7c5
Fix SelfAdjointView::triangularView for complexes.
2016-05-19 13:01:51 +02:00
Gael Guennebaud
b6b8578a67
bug #1230 : add support for SelfadjointView::triangularView.
2016-05-19 11:36:38 +02:00
Gael Guennebaud
84df9142e7
bug #1231 : fix compilation regression regarding complex_array/=real_array and add respective unit tests
2016-05-18 23:00:13 +02:00
Gael Guennebaud
21d692d054
Use coeff(i,j) instead of operator().
2016-05-18 17:09:20 +02:00
Gael Guennebaud
8456bbbadb
bug #1224 : fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only.
2016-05-18 16:53:28 +02:00
Gael Guennebaud
b507b82326
Use default sorting strategy for square products.
2016-05-18 16:51:54 +02:00
Gael Guennebaud
747e3290c0
bug #1213 : rename some enums type for consistency.
2016-05-18 13:26:56 +02:00
Rasmus Munk Larsen
0dbd68145f
Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h.
2016-05-17 10:25:19 -07:00
Rasmus Munk Larsen
e55deb21c5
Improvements to parallelFor.
...
Move some scalar functors from TensorFunctors. to Eigen core.
2016-05-12 14:07:22 -07:00
Benoit Steiner
fae0493f98
Fixed a couple of bugs related to the Pascalfamily of GPUs
...
H: Enter commit message. Lines beginning with 'HG:' are removed.
2016-05-11 23:02:26 -07:00
Benoit Steiner
b6a517c47d
Added the ability to load fp16 using the texture path.
...
Improved the performance of some reductions on fp16
2016-05-11 21:26:48 -07:00
Benoit Steiner
518149e868
Misc fixes for fp16
2016-05-11 20:11:14 -07:00
Benoit Steiner
56a1757d74
Made predux_min and predux_max on fp16 less noisy
2016-05-11 17:37:34 -07:00
Benoit Steiner
9091351dbe
__ldg is only available with cuda architectures >= 3.5
2016-05-11 15:22:13 -07:00
Benoit Steiner
02f76dae2d
Fixed a typo
2016-05-11 15:08:38 -07:00
Christoph Hertzberg
131e5a1a4a
Do not copy for trivial 1x1 case. This also avoids a "maybe-uninitialized" warning in some situations.
2016-05-11 23:50:13 +02:00
Benoit Steiner
70195a5ff7
Added missing EIGEN_DEVICE_FUNC
2016-05-11 14:10:09 -07:00
Benoit Steiner
09a19c33a8
Added missing EIGEN_DEVICE_FUNC qualifiers
2016-05-11 14:07:43 -07:00
Christoph Hertzberg
33ca7e3c8d
bug #1207 : Add and fix logical-op warnings
2016-05-11 19:36:34 +02:00
Benoit Steiner
217d984abc
Fixed a typo in my previous commit
2016-05-11 10:22:15 -07:00
Christoph Hertzberg
0f61343893
Workaround maybe-uninitialized warning
2016-05-11 09:00:18 +02:00
Christoph Hertzberg
3bfc9b47ca
Workaround "misleading-indentation" warnings
2016-05-11 08:41:36 +02:00
Benoit Steiner
0b9e3dcd06
Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%.
2016-05-10 11:05:33 -07:00
Benoit Steiner
8adf5cc70f
Added support for packet processing of fp16 on kepler and maxwell gpus
2016-05-06 19:16:43 -07:00
Christoph Hertzberg
a11bd82dc3
bug #1213 : Give names to anonymous enums
2016-05-06 11:31:56 +02:00
Benoit Steiner
0451940fa4
Relaxed the dummy precision for fp16
2016-05-05 15:40:01 -07:00
Christoph Hertzberg
dacb469bc9
Enable and fix -Wdouble-conversion warnings
2016-05-05 13:35:45 +02:00
Ola Røer Thorsen
be78aea6b3
fix double-promotion/float-conversion in Core/SpecialFunctions.h
2016-05-04 10:52:08 +02:00
Gael Guennebaud
75a94b9662
Improve documentation of BDCSVD
2016-05-04 12:53:14 +02:00
Gael Guennebaud
e2ca478485
bug #1214 : consider denormals as zero in D&C SVD. This also workaround infinite binary search when compiling with ICC's unsafe optimizations.
2016-05-03 23:15:29 +02:00
Benoit Steiner
4c05fb03a3
Merged eigen/eigen into default
2016-05-03 13:15:00 -07:00
Benoit Steiner
6c3e5b85bc
Fixed compilation error with cuda >= 7.5
2016-05-03 09:38:42 -07:00
Benoit Steiner
da50419df8
Made a cast explicit
2016-05-02 19:50:22 -07:00
Gael Guennebaud
b1bd53aa6b
Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignement.
2016-05-01 23:25:06 +02:00
Benoit Steiner
2b890ae618
Fixed compilation errors generated by clang
2016-04-29 18:30:40 -07:00
Benoit Steiner
46bcb70969
Don't turn on const expressions when compiling with gcc >= 4.8 unless the -std=c++11 option has been used
2016-04-29 15:20:59 -07:00
Benoit Steiner
07a247dcf4
Pulled latest updates from upstream
2016-04-29 13:41:26 -07:00
Benoit Steiner
fa5a8f055a
Implemented palign_impl for AVX512
2016-04-29 13:30:13 -07:00
Benoit Steiner
ef3ac9d05a
Fixed the AVX512 packet traits
2016-04-29 13:28:36 -07:00
Benoit Steiner
d7b75e8d86
Added pdiv packet primitives for avx512
2016-04-29 13:26:47 -07:00
Benoit Steiner
5e89ded685
Implemented preduxp for AVX512
2016-04-29 13:00:33 -07:00
Benoit Steiner
5f85662ad8
Implemented the pabs and preverse primitives for avx512.
2016-04-29 12:53:34 -07:00
Benoit Steiner
d37ee89ca8
Disabled some of the AVX512 primitives on compilers that don't support them
2016-04-29 12:50:29 -07:00
Gael Guennebaud
0f3c4c8ff4
Fix compilation of sparse.cast<>().transpose().
2016-04-29 18:26:08 +02:00
Benoit Steiner
dacb23277e
Fixed the igamma and igammac implementations to make them callable from a gpu kernel.
2016-04-28 18:54:54 -07:00
Benoit Steiner
a5d4545083
Deleted unused variable
2016-04-28 14:14:48 -07:00
Justin Lebar
40d1e2f8c7
Eliminate mutual recursion in igamma{,c}_impl::Run.
...
Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls
igammac_impl::Run.
This isn't actually mutual recursion; the calls are guarded such that we never
get into a loop. Nonetheless, it's a stretch for clang to prove this. As a
result, clang emits a recursive call in both igammac_impl::Run and
igamma_impl::Run.
That this is suboptimal code is bad enough, but it's particularly bad when
compiling for CUDA/nvptx. nvptx allows recursion, but only begrudgingly: If
you have recursive calls in a kernel, it's on you to manually specify the
kernel's stack size. Otherwise, ptxas will dump a warning, make a guess, and
who knows if it's right.
This change explicitly eliminates the mutual recursion in igammac_impl::Run and
igamma_impl::Run.
2016-04-28 13:57:08 -07:00
Konstantinos Margaritis
87294c84a6
define Packet2d constants with VSX only
2016-04-28 14:39:56 -03:00
Konstantinos Margaritis
6ed7a7281c
remove accidentally pasted code
2016-04-28 14:35:55 -03:00
Konstantinos Margaritis
62f9093b31
improve state of MathFunctions as well
2016-04-28 14:33:09 -03:00
Konstantinos Margaritis
8ed26120c8
bring Altivec/VSX to a better state, implement some of the missing functions
2016-04-28 14:32:42 -03:00
Konstantinos Margaritis
950158f6d1
add name to copyrights
2016-04-28 14:32:11 -03:00
Konstantinos Margaritis
ee0459300b
minor fix, add to copyright
2016-04-28 14:31:21 -03:00
Benoit Steiner
2b917291d9
Merged in rmlarsen/eigen2 (pull request PR-183)
...
Detect cxx_constexpr support when compiling with clang.
2016-04-27 15:19:54 -07:00
Rasmus Munk Larsen
09b9e951e3
Depend on the more extensive support for constexpr in clang:
...
http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr
2016-04-27 14:59:11 -07:00
Rasmus Munk Larsen
1a325ef71c
Detect cxx_constexpr support when compiling with clang.
2016-04-27 14:33:51 -07:00
Benoit Steiner
c61170e87d
fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supportes: remove it from Eigen.
2016-04-27 14:22:20 -07:00
Benoit Steiner
f629fe95c8
Made the index type a template parameter to evaluateProductBlockingSizes
...
Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes.
2016-04-27 13:11:19 -07:00
Benoit Steiner
25141b69d4
Improved support for min and max on 16 bit floats when running on recent cuda gpus
2016-04-27 12:57:21 -07:00
Benoit Steiner
6744d776ba
Added support for fpclassify in Eigen::Numext
2016-04-27 12:10:25 -07:00
Konstantinos Margaritis
3f80696ae1
Merged eigen/eigen into default
2016-04-22 15:05:21 +03:00
Benoit Steiner
5c372d19e3
Merged in rmlarsen/eigen (pull request PR-179)
...
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
2016-04-21 18:06:36 -07:00
Rasmus Munk Larsen
a3256d78d8
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
2016-04-21 16:49:28 -07:00
Konstantinos Margaritis
e5b2ef47d5
Merged eigen/eigen into default
2016-04-21 18:03:08 +03:00
Benoit Steiner
80200a1828
Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instruction.
2016-04-20 12:10:27 -07:00
Benoit Steiner
1d0238375d
Made sure all the required header files are included when trying to use fp16
2016-04-19 17:44:12 -07:00
Gael Guennebaud
e4fe611e2c
Enable lazy-coeff-based-product for vector*(1x1) products
2016-04-16 15:17:39 +02:00
Benoit Steiner
1a16fb1532
Deleted extraneous comma.
2016-04-15 15:50:13 -07:00
Gael Guennebaud
2a7115daca
bug #1203 : by-pass large stack-allocation in stableNorm if EIGEN_STACK_ALLOCATION_LIMIT is too small
2016-04-15 22:34:11 +02:00
Benoit Steiner
1d23430628
Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g on Haswell CPUs).
2016-04-15 10:53:31 -07:00
Gael Guennebaud
1e80bddde3
Fix trmv for mixing types.
2016-04-15 17:58:36 +02:00
Konstantinos Margaritis
0e8fc31087
remove pgather/pscatter for std::complex<double> for s390x
2016-04-15 07:08:57 -04:00
Benoit Steiner
a62e924656
Added ability to access the cache sizes from the tensor devices
2016-04-14 21:25:06 -07:00
Benoit Steiner
18e6f67426
Added support for exclusive or
2016-04-14 20:37:46 -07:00
Gael Guennebaud
20f387fafa
Improve numerical robustness of JacoviSVD:
...
- avoid noise amplification in complex to real conversion
- compare off-diagonal entries to the current biggest diagonal entry: no need to bother about a 2x2 block containing ridiculously small entries compared to the rest of the matrix.
2016-04-14 22:46:55 +02:00
Benoit Steiner
7718749fee
Force the inlining of the << operator on half floats
2016-04-14 11:51:54 -07:00
Benoit Steiner
5379d2b594
Inline the << operator on half floats
2016-04-14 11:40:48 -07:00
Benoit Steiner
5c13765ee3
Added ability to printf fp16
2016-04-14 10:24:52 -07:00
Gael Guennebaud
3551dea887
Cleaning pass on rcond estimator.
2016-04-14 16:45:41 +02:00
Gael Guennebaud
d402adc3d7
Better use .data() than &coeffRef(0)
2016-04-14 15:18:08 +02:00
Gael Guennebaud
ea7087ef31
Merged in rmlarsen/eigen (pull request PR-174)
...
Add matrix condition number estimation module.
2016-04-14 15:11:33 +02:00
Benoit Steiner
36f5a10198
Properly gate the definition of the error and gamma functions for fp16
2016-04-13 18:44:48 -07:00
Benoit Steiner
10b69810d1
Improved support for trigonometric functions on GPU
2016-04-13 16:00:51 -07:00
Benoit Steiner
d6105b53b8
Added basic implementation of the lgamma, digamma, igamma, igammac, polygamma, and zeta function for fp16
2016-04-13 15:26:02 -07:00
Gael Guennebaud
703251f10f
merge
2016-04-13 23:45:10 +02:00
Gael Guennebaud
39211ba46b
Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block.
2016-04-13 23:43:26 +02:00
Benoit Steiner
2986253259
Cleaned up the implementation of digamma
2016-04-13 14:24:06 -07:00
Benoit Steiner
d5de1a8220
Pulled latest updates from trunk
2016-04-13 14:17:11 -07:00
Benoit Steiner
87ca15c4e8
Added support for sin, cos, tan, and tanh on fp16
2016-04-13 14:12:38 -07:00
Gael Guennebaud
feef39e2d1
Fix underflow in JacoviSVD's complex to real preconditioner
2016-04-13 22:49:51 +02:00
Benoit Steiner
bf3f6688f0
Added support for computing cos, sin, tan, and tanh on GPU.
2016-04-13 11:55:08 -07:00
Benoit Steiner
473c8380ea
Added constructors to convert unsigned integers into fp16
2016-04-13 11:03:37 -07:00
Gael Guennebaud
42a3352a3b
Workaround a division by zero when outerstride==0
2016-04-13 19:02:02 +02:00
Gael Guennebaud
6f960b83ff
Make use of is_same_dense helper instead of extract_data to detect input/outputs are the same.
2016-04-13 18:47:12 +02:00
Gael Guennebaud
b7716c0328
Fix incomplete previous patch on matrix comparision.
2016-04-13 18:32:56 +02:00
Gael Guennebaud
2630d97c62
Fix detection of same matrices when both matrices are not handled by extract_data.
2016-04-13 18:26:08 +02:00
Gael Guennebaud
06447e0a39
Improve half-packet vectorization logic to distinguish linear versus inner traversal modes.
2016-04-13 18:15:49 +02:00
Gael Guennebaud
bbb8854bf7
Enable half-packet in reduxions.
2016-04-13 13:02:34 +02:00
Benoit Steiner
aa1ba8bbd2
Don't put a command at the end of an enumerator list
2016-04-12 16:28:11 -07:00
Gael Guennebaud
b67c983291
Enable the use of half-packet in coeff-based product.
...
For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
2016-04-12 23:03:03 +02:00
Benoit Steiner
8bfe739cd2
Updated the AVX512 PacketMath to properly leverage the AVX512DQ instructions
2016-04-11 18:40:16 -07:00
Rasmus Larsen
6498dadc2f
Merged eigen/eigen into default
2016-04-11 17:42:05 -07:00
Benoit Steiner
d6e596174d
Pull latest updates from upstream
2016-04-11 17:20:17 -07:00
Benoit Steiner
748c4c4599
More accurate cost estimates for exp, log, tanh, and sqrt.
2016-04-11 13:11:04 -07:00
Benoit Steiner
833efb39bf
Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16
2016-04-11 11:03:56 -07:00
Benoit Steiner
e939b087fe
Pulled latest update from trunk
2016-04-11 11:03:02 -07:00
Gael Guennebaud
0483430283
Move LAPACK declarations from blas.h to lapack.h and fix compatibility with EIGEN_USE_MKL
2016-04-11 17:12:31 +02:00
Gael Guennebaud
097d1e8823
Cleanup obsolete assign_scalar_eig2mkl helper.
2016-04-11 16:09:29 +02:00
Gael Guennebaud
fec4c334ba
Remove all references to MKL in BLAS wrappers.
2016-04-11 16:04:09 +02:00
Gael Guennebaud
ddabc992fa
Fix long to int conversion in BLAS API.
2016-04-11 15:52:01 +02:00
Gael Guennebaud
8191f373be
Silent unused warning.
2016-04-11 15:37:16 +02:00
Gael Guennebaud
6a9ca88e7e
Relax dependency on MKL for EIGEN_USE_BLAS
2016-04-11 15:17:14 +02:00
Gael Guennebaud
4e8e5888d7
Improve constness of blas level-3 interface.
2016-04-11 15:12:44 +02:00
Gael Guennebaud
675e0a2224
Fix static/inline keywords order.
2016-04-11 15:06:20 +02:00
Till Hoffmann
643b697649
Proper handling of domain errors.
2016-04-10 00:37:53 +01:00
Rasmus Munk Larsen
1f70bd4134
Merge.
2016-04-09 15:31:53 -07:00
Rasmus Munk Larsen
096e355f8e
Add short-circuit to avoid calling matrix norm for empty matrix.
2016-04-09 15:29:56 -07:00
Rasmus Larsen
be80fb49fc
Merged default ( 4a92b590a0
...
) into default
2016-04-09 13:13:01 -07:00
Rasmus Larsen
7a8176587b
Merged eigen/eigen into default
2016-04-09 12:47:41 -07:00
Rasmus Munk Larsen
4a92b590a0
Merge.
2016-04-09 12:47:24 -07:00
Rasmus Munk Larsen
ee6c69733a
A few tiny adjustments to short-circuit logic.
2016-04-09 12:45:49 -07:00
Till Hoffmann
7f4826890c
Merge upstream
2016-04-09 20:08:07 +01:00
Till Hoffmann
de057ebe54
Added nans to zeta function.
2016-04-09 20:07:36 +01:00
Benoit Steiner
5da90fc8dd
Use numext::abs instead of std::abs in scalar_fuzzy_default_impl to make it usable inside GPU kernels.
2016-04-08 19:40:48 -07:00
Benoit Steiner
01bd577288
Fixed the implementation of Eigen::numext::isfinite, Eigen::numext::isnan, andEigen::numext::isinf on CUDA devices
2016-04-08 16:40:10 -07:00
Benoit Steiner
89a3dc35a3
Fixed isfinite_impl: NumTraits<T>::highest() and NumTraits<T>::lowest() are finite numbers.
2016-04-08 15:56:16 -07:00
Benoit Steiner
995f202cea
Disabled the use of half2 on cuda devices of compute capability < 5.3
2016-04-08 14:43:36 -07:00
Benoit Steiner
8d22967bd9
Initial support for taking the power of fp16
2016-04-08 14:22:39 -07:00
Benoit Steiner
3394379319
Fixed the packet_traits for half floats.
2016-04-08 13:33:59 -07:00
Rasmus Larsen
0b81a18d12
Merged eigen/eigen into default
2016-04-08 12:58:57 -07:00
Benoit Jacob
cd2b667ac8
Add references to filed LLVM bugs
2016-04-08 08:12:47 -04:00
Benoit Steiner
3bd16457e1
Properly handle complex numbers.
2016-04-07 23:28:04 -07:00
Rasmus Larsen
c34e55c62b
Merged eigen/eigen into default
2016-04-07 20:23:03 -07:00
Rasmus Munk Larsen
283c51cd5e
Widen short-circuiting ReciprocalConditionNumberEstimate so we don't call InverseMatrixL1NormEstimate for dec.rows() <= 1.
2016-04-07 16:45:40 -07:00
Rasmus Munk Larsen
d51803a728
Use Index instead of int for indexing and sizes.
2016-04-07 16:39:48 -07:00
Rasmus Munk Larsen
fd872aefb3
Remove transpose() method from LLT and LDLT classes as it would imply conjugation.
...
Explicitly cast constants to RealScalar in ConditionEstimator.h.
2016-04-07 16:28:44 -07:00
Rasmus Munk Larsen
0b5546d182
Use lpNorm<1>() to compute l1 norms in LLT and LDLT.
2016-04-07 15:49:30 -07:00
parthaEth
2d5bb375b7
Static casting scalar types so as to let chlesky module of eigen work with ceres
2016-04-08 00:14:44 +02:00
Benoit Steiner
74f64838c5
Updated the unary functors to use the numext implementation of typicall functions instead of the one provided in the standard library. The standard library functions aren't supported officially by cuda, so we're better off using the numext implementations.
2016-04-07 11:42:14 -07:00
Benoit Steiner
737644366f
Move the functions operating on fp16 out of the std namespace and into the Eigen::numext namespace
2016-04-07 11:40:15 -07:00
Benoit Steiner
b89d3f78b2
Updated the isnan, isinf and isfinite functions to make compatible with cuda devices.
2016-04-07 10:08:49 -07:00
Benoit Steiner
df838736e2
Fixed compilation warning triggered by msvc
2016-04-06 20:48:55 -07:00
Benoit Steiner
14ea7c7ec7
Fixed packet_traits<half>
2016-04-06 19:30:21 -07:00
Benoit Steiner
532fdf24cb
Added support for hardware conversion between fp16 and full floats whenever
...
possible.
2016-04-06 17:11:31 -07:00
Benoit Steiner
58c1dbff19
Made the fp16 code more portable.
2016-04-06 13:44:08 -07:00
Benoit Steiner
cf7e73addd
Added some missing conversions to the Half class, and fixed the implementation of the < operator on cuda devices.
2016-04-06 09:59:51 -07:00
Benoit Steiner
10bdd8e378
Merged in tillahoffmann/eigen (pull request PR-173)
...
Added zeta function of two arguments and polygamma function
2016-04-06 09:40:17 -07:00
Benoit Steiner
72abfa11dd
Added support for isfinite on fp16
2016-04-06 09:07:30 -07:00
Rasmus Munk Larsen
4d07064a3d
Fix bug in alternate lower bound calculation due to missing parentheses.
...
Make a few expressions more concise.
2016-04-05 16:40:48 -07:00
Konstantinos Margaritis
2bba4ee2cf
Merged kmargar/eigen/tip into default
2016-04-05 22:22:08 +03:00
Konstantinos Margaritis
317384b397
complete the port, remove float support
2016-04-05 14:56:45 -04:00
tillahoffmann
726bd5f077
Merged eigen/eigen into default
2016-04-05 18:21:05 +01:00
Till Hoffmann
a350c25a39
Added accuracy comments.
2016-04-05 18:20:40 +01:00
Konstantinos Margaritis
bc0ad363c6
add remaining includes
2016-04-05 06:01:17 -04:00
Konstantinos Margaritis
2d41dc9622
complete int/double specialized traits for ZVector
2016-04-05 06:00:51 -04:00
Konstantinos Margaritis
988344daf1
enable the other includes as well
2016-04-05 05:59:30 -04:00
Rasmus Larsen
d7eeee0c1d
Merged eigen/eigen into default
2016-04-04 15:58:27 -07:00
Rasmus Munk Larsen
513c372960
Fix docstrings to list all supported decompositions.
2016-04-04 14:34:59 -07:00
Rasmus Munk Larsen
86e0ed81f8
Addresses comments on Eigen pull request PR-174.
...
* Get rid of code-duplication for real vs. complex matrices.
* Fix flipped arguments to select.
* Make the condition estimation functions free functions.
* Use Vector::Unit() to generate canonical unit vectors.
* Misc. cleanup.
2016-04-04 14:20:01 -07:00
Benoit Jacob
158fea0f5e
bug #1190 - Don't trust __ARM_FEATURE_FMA on Clang/ARM
2016-04-04 16:42:40 -04:00
Benoit Jacob
03f2997a11
bug #1191 - Prevent Clang/ARM from rewriting VMLA into VMUL+VADD
2016-04-04 16:41:47 -04:00
Till Hoffmann
b97911dd18
Refactored code into type-specific helper functions.
2016-04-04 19:16:03 +01:00
Benoit Steiner
c4179dd470
Updated the scalar_abs_op struct to make it compatible with cuda devices.
2016-04-04 11:11:51 -07:00
Benoit Steiner
1108b4f218
Fixed the signature of numext::abs to make it compatible with complex numbers
2016-04-04 11:09:25 -07:00
Rasmus Larsen
30242b7565
Merged eigen/eigen into default
2016-04-01 17:19:36 -07:00
Rasmus Munk Larsen
9d51f7c457
Add rcond method to LDLT.
2016-04-01 16:48:38 -07:00
Rasmus Munk Larsen
f54137606e
Add condition estimation to Cholesky (LLT) factorization.
2016-04-01 16:19:45 -07:00
Rasmus Munk Larsen
fb8dccc23e
Replace "inline static" with "static inline" for consistency.
2016-04-01 12:48:18 -07:00
Rasmus Munk Larsen
91414e0042
Fix comments in ConditionEstimator and minor cleanup.
2016-04-01 11:58:17 -07:00
Rasmus Munk Larsen
1aa89fb855
Add matrix condition estimator module that implements the Higham/Hager algorithm from http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf used in LPACK. Add rcond() methods to FullPivLU and PartialPivLU.
2016-04-01 10:27:59 -07:00
Till Hoffmann
80eba21ad0
Merge upstream.
2016-04-01 18:18:49 +01:00
Till Hoffmann
3cb0a237c1
Fixed suggestions by Eugene Brevdo.
2016-04-01 17:51:39 +01:00
tillahoffmann
49960adbdd
Merged eigen/eigen into default
2016-04-01 14:36:15 +01:00
Till Hoffmann
57239f4a81
Added polygamma function.
2016-04-01 14:35:21 +01:00
Till Hoffmann
dd5d390daf
Added zeta function.
2016-04-01 13:32:29 +01:00
Benoit Steiner
0ea7ab4f62
Hashing was only officially introduced in c++11. Therefore only define an implementation of the hash function for float16 if c++11 is enabled.
2016-03-31 14:44:55 -07:00
Benoit Steiner
92b7f7b650
Improved code formating
2016-03-31 13:09:58 -07:00
Benoit Steiner
f197813f37
Added the ability to hash a fp16
2016-03-31 13:09:23 -07:00
Benoit Steiner
4c859181da
Made it possible to use the NumTraits for complex and Array in a cuda kernel.
2016-03-31 12:48:38 -07:00
Benoit Steiner
c36ab19902
Added __ldg primitive for fp16.
2016-03-31 10:55:03 -07:00
Benoit Steiner
b575fb1d02
Added NumTraits for half floats
2016-03-31 10:43:59 -07:00
Benoit Steiner
8c8a79cec1
Fixed a typo
2016-03-31 10:33:32 -07:00
Benoit Steiner
4f1a7e51c1
Pull math functions from the global namespace only when compiling cuda code with nvcc. When compiling with clang, we want to use the std namespace.
2016-03-30 17:59:49 -07:00
Benoit Steiner
bc68fc2fe7
Enable constant expressions when compiling cuda code with clang.
2016-03-30 17:58:32 -07:00
Benoit Jacob
01b5333e44
bug #1186 - vreinterpretq_u64_f64 fails to build on Android/Aarch64/Clang toolchain
2016-03-30 11:02:33 -04:00
Benoit Steiner
1841d6d4c3
Added missing cuda template specializations for numext::ceil
2016-03-29 13:29:34 -07:00
Benoit Steiner
e02b784ec3
Added support for standard mathematical functions and trancendentals(such as exp, log, abs, ...) on fp16
2016-03-29 09:20:36 -07:00
Benoit Steiner
c38295f0a0
Added support for fmod
2016-03-28 15:53:02 -07:00
Konstantinos Margaritis
01e7298fe6
actually include ZVector files, passes most basic tests (float still fails)
2016-03-28 10:58:02 -04:00
Konstantinos Margaritis
f48011119e
Merged eigen/eigen into default
2016-03-28 01:48:45 +03:00
Konstantinos Margaritis
ed6b9d08f1
some primitives ported, but missing intrinsics and crash with asm() are a problem
2016-03-27 18:47:49 -04:00
Benoit Steiner
65716e99a5
Improved the cost estimate of the quotient op
2016-03-25 11:13:53 -07:00
Benoit Steiner
d94f6ba965
Started to model the cost of divisions more accurately.
2016-03-25 11:02:56 -07:00
Benoit Steiner
2e4e4cb74d
Use numext::abs instead of abs to avoid incorrect conversion to integer of the argument
2016-03-23 16:57:12 -07:00
Benoit Steiner
81d340984a
Removed executable bit from header files
2016-03-23 16:15:02 -07:00
Benoit Steiner
bff8cbad06
Removed executable bit from header files
2016-03-23 16:14:23 -07:00
Benoit Steiner
7a570e50ef
Fixed contractions of fp16
2016-03-23 16:00:06 -07:00
Benoit Steiner
fc3660285f
Made type conversion explicit
2016-03-23 09:56:50 -07:00
Benoit Steiner
0e68882604
Added the ability to divide a half float by an index
2016-03-23 09:46:42 -07:00
Benoit Steiner
6971146ca9
Added more conversion operators for half floats
2016-03-23 09:44:52 -07:00
Benoit Steiner
f9ad25e4d8
Fixed contractions of 16 bit floats
2016-03-22 09:30:23 -07:00
Benoit Steiner
134d750eab
Completed the implementation of vectorized type casting of half floats.
2016-03-18 13:36:28 -07:00