Rasmus Munk Larsen
4d07064a3d
Fix bug in alternate lower bound calculation due to missing parentheses.
...
Make a few expressions more concise.
2016-04-05 16:40:48 -07:00
Rasmus Larsen
d7eeee0c1d
Merged eigen/eigen into default
2016-04-04 15:58:27 -07:00
Rasmus Munk Larsen
513c372960
Fix docstrings to list all supported decompositions.
2016-04-04 14:34:59 -07:00
Rasmus Munk Larsen
86e0ed81f8
Addresses comments on Eigen pull request PR-174.
...
* Get rid of code-duplication for real vs. complex matrices.
* Fix flipped arguments to select.
* Make the condition estimation functions free functions.
* Use Vector::Unit() to generate canonical unit vectors.
* Misc. cleanup.
2016-04-04 14:20:01 -07:00
Benoit Jacob
158fea0f5e
bug #1190 - Don't trust __ARM_FEATURE_FMA on Clang/ARM
2016-04-04 16:42:40 -04:00
Benoit Jacob
03f2997a11
bug #1191 - Prevent Clang/ARM from rewriting VMLA into VMUL+VADD
2016-04-04 16:41:47 -04:00
Benoit Steiner
c4179dd470
Updated the scalar_abs_op struct to make it compatible with cuda devices.
2016-04-04 11:11:51 -07:00
Benoit Steiner
1108b4f218
Fixed the signature of numext::abs to make it compatible with complex numbers
2016-04-04 11:09:25 -07:00
Rasmus Larsen
30242b7565
Merged eigen/eigen into default
2016-04-01 17:19:36 -07:00
Rasmus Munk Larsen
9d51f7c457
Add rcond method to LDLT.
2016-04-01 16:48:38 -07:00
Rasmus Munk Larsen
f54137606e
Add condition estimation to Cholesky (LLT) factorization.
2016-04-01 16:19:45 -07:00
Rasmus Munk Larsen
fb8dccc23e
Replace "inline static" with "static inline" for consistency.
2016-04-01 12:48:18 -07:00
Rasmus Munk Larsen
91414e0042
Fix comments in ConditionEstimator and minor cleanup.
2016-04-01 11:58:17 -07:00
Rasmus Munk Larsen
1aa89fb855
Add matrix condition estimator module that implements the Higham/Hager algorithm from http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf used in LPACK. Add rcond() methods to FullPivLU and PartialPivLU.
2016-04-01 10:27:59 -07:00
Benoit Steiner
0ea7ab4f62
Hashing was only officially introduced in c++11. Therefore only define an implementation of the hash function for float16 if c++11 is enabled.
2016-03-31 14:44:55 -07:00
Benoit Steiner
92b7f7b650
Improved code formating
2016-03-31 13:09:58 -07:00
Benoit Steiner
f197813f37
Added the ability to hash a fp16
2016-03-31 13:09:23 -07:00
Benoit Steiner
4c859181da
Made it possible to use the NumTraits for complex and Array in a cuda kernel.
2016-03-31 12:48:38 -07:00
Benoit Steiner
c36ab19902
Added __ldg primitive for fp16.
2016-03-31 10:55:03 -07:00
Benoit Steiner
b575fb1d02
Added NumTraits for half floats
2016-03-31 10:43:59 -07:00
Benoit Steiner
8c8a79cec1
Fixed a typo
2016-03-31 10:33:32 -07:00
Benoit Steiner
4f1a7e51c1
Pull math functions from the global namespace only when compiling cuda code with nvcc. When compiling with clang, we want to use the std namespace.
2016-03-30 17:59:49 -07:00
Benoit Steiner
bc68fc2fe7
Enable constant expressions when compiling cuda code with clang.
2016-03-30 17:58:32 -07:00
Benoit Jacob
01b5333e44
bug #1186 - vreinterpretq_u64_f64 fails to build on Android/Aarch64/Clang toolchain
2016-03-30 11:02:33 -04:00
Benoit Steiner
1841d6d4c3
Added missing cuda template specializations for numext::ceil
2016-03-29 13:29:34 -07:00
Benoit Steiner
e02b784ec3
Added support for standard mathematical functions and trancendentals(such as exp, log, abs, ...) on fp16
2016-03-29 09:20:36 -07:00
Benoit Steiner
c38295f0a0
Added support for fmod
2016-03-28 15:53:02 -07:00
Benoit Steiner
65716e99a5
Improved the cost estimate of the quotient op
2016-03-25 11:13:53 -07:00
Benoit Steiner
d94f6ba965
Started to model the cost of divisions more accurately.
2016-03-25 11:02:56 -07:00
Benoit Steiner
2e4e4cb74d
Use numext::abs instead of abs to avoid incorrect conversion to integer of the argument
2016-03-23 16:57:12 -07:00
Benoit Steiner
81d340984a
Removed executable bit from header files
2016-03-23 16:15:02 -07:00
Benoit Steiner
bff8cbad06
Removed executable bit from header files
2016-03-23 16:14:23 -07:00
Benoit Steiner
7a570e50ef
Fixed contractions of fp16
2016-03-23 16:00:06 -07:00
Benoit Steiner
fc3660285f
Made type conversion explicit
2016-03-23 09:56:50 -07:00
Benoit Steiner
0e68882604
Added the ability to divide a half float by an index
2016-03-23 09:46:42 -07:00
Benoit Steiner
6971146ca9
Added more conversion operators for half floats
2016-03-23 09:44:52 -07:00
Benoit Steiner
f9ad25e4d8
Fixed contractions of 16 bit floats
2016-03-22 09:30:23 -07:00
Benoit Steiner
134d750eab
Completed the implementation of vectorized type casting of half floats.
2016-03-18 13:36:28 -07:00
Benoit Steiner
7bd551b3a9
Make all the conversions explicit
2016-03-18 12:20:08 -07:00
Benoit Steiner
7b98de1f15
Implemented some of the missing type casting for half floats
2016-03-17 21:45:45 -07:00
Christoph Hertzberg
46aa9772fc
Merged in ebrevdo/eigen (pull request PR-169)
...
Bugfixes to cuda tests, igamma & igammac implemented, & tests for digamma, igamma, igammac on CPU & GPU.
2016-03-16 21:59:08 +01:00
Eugene Brevdo
1f69a1b65f
Change the header guard around certain numext functions to be CUDA specific.
2016-03-16 12:44:35 -07:00
Benoit Steiner
5a51366ea5
Fixed a typo.
2016-03-14 09:25:16 -07:00
Benoit Steiner
fcf59e1c37
Properly gate the use of cuda intrinsics in the code
2016-03-14 09:13:44 -07:00
Benoit Steiner
97a1f1c273
Make sure we only use the half float intrinsic when compiling with a version of CUDA that is recent enough to provide them
2016-03-14 08:37:58 -07:00
Benoit Steiner
e29c9676b1
Don't mark the cast operator as explicit, since this is a c++11 feature that's not supported by older compilers.
2016-03-12 00:15:58 -08:00
Benoit Steiner
eecd914864
Also replaced uint32_t with unsigned int to make the code more portable
2016-03-11 19:34:21 -08:00
Benoit Steiner
1ca8c1ec97
Replaced a couple more uint16_t with unsigned short
2016-03-11 19:28:28 -08:00
Benoit Steiner
0423b66187
Use unsigned short instead of uint16_t since they're more portable
2016-03-11 17:53:41 -08:00
Benoit Steiner
048c4d6efd
Made half floats usable on hardware that doesn't support them natively.
2016-03-11 17:21:42 -08:00