Benoit Steiner
|
a81b88bef7
|
Fixed the tensor concatenation code
|
2016-03-08 12:30:19 -08:00 |
|
Benoit Steiner
|
551ff11d0d
|
Fixed the tensor layout swapping code
|
2016-03-08 12:28:10 -08:00 |
|
Benoit Steiner
|
8768c063f5
|
Fixed the tensor chipping code.
|
2016-03-08 12:26:49 -08:00 |
|
Benoit Steiner
|
e09eb835db
|
Decoupled the packet type definition from the definition of the tensor ops. All the vectorization is now defined in the tensor evaluators. This will make it possible to reliably support devices with different packet types in the same compilation unit.
|
2016-03-08 12:07:33 -08:00 |
|
Benoit Steiner
|
3b614a2358
|
Use NumTraits::highest() and NumTraits::lowest() instead of the std::numeric_limits to make the tensor min and max functors more CUDA friendly.
|
2016-03-07 17:53:28 -08:00 |
|
Benoit Steiner
|
769685e74e
|
Added the ability to pad a tensor using a non-zero value
|
2016-03-07 14:45:37 -08:00 |
|
Benoit Steiner
|
7f87cc3a3b
|
Fix a couple of typos in the code.
|
2016-03-07 14:31:27 -08:00 |
|
Benoit Steiner
|
e5f25622e2
|
Added a test to validate the behavior of some of the tensor syntactic sugar.
|
2016-03-07 09:04:27 -08:00 |
|
Benoit Steiner
|
9f5740cbc1
|
Added missing include
|
2016-03-06 22:03:18 -08:00 |
|
Benoit Steiner
|
5238e03fe1
|
Don't try to compile the uint128 test with compilers that don't support uint128
|
2016-03-06 21:59:40 -08:00 |
|
Benoit Steiner
|
9a54c3e32b
|
Don't warn that msvc 2015 isn't c++11 compliant just because it doesn't claim to be.
|
2016-03-06 09:38:56 -08:00 |
|
Benoit Steiner
|
05bbca079a
|
Turn on some of the cxx11 features when compiling with visual studio 2015
|
2016-03-05 10:52:08 -08:00 |
|
Benoit Steiner
|
6093eb9ff5
|
Don't test our 128bit emulation code when compiling with msvc
|
2016-03-05 10:37:11 -08:00 |
|
Benoit Steiner
|
57b263c5b9
|
Avoid using initializer lists in test since not all versions of msvc support them
|
2016-03-05 08:35:26 -08:00 |
|
Benoit Steiner
|
23aed8f2e4
|
Use EIGEN_PI instead of redefining our own constant PI
|
2016-03-05 08:04:45 -08:00 |
|
Benoit Steiner
|
c23e0be18f
|
Use the CMAKE_CXX_STANDARD variable to turn on cxx11
|
2016-03-04 20:18:01 -08:00 |
|
Benoit Steiner
|
ec35068edc
|
Don't rely on the M_PI constant since not all compilers provide it.
|
2016-03-04 16:42:38 -08:00 |
|
Benoit Steiner
|
60d9df11c1
|
Fixed the computation of leading zeros when compiling with msvc.
|
2016-03-04 16:27:02 -08:00 |
|
Benoit Steiner
|
4e49fd5eb9
|
MSVC uses __uint128 while other compilers use __uint128_t to encode 128bit unsigned integers. Make the cxx11_tensor_uint128.cpp test work in both cases.
|
2016-03-04 14:49:18 -08:00 |
|
Benoit Steiner
|
667fcc2b53
|
Fixed syntax error
|
2016-03-04 14:37:51 -08:00 |
|
Benoit Steiner
|
4416a5dcff
|
Added missing include
|
2016-03-04 14:35:43 -08:00 |
|
Benoit Steiner
|
c561eeb7bf
|
Don't use implicit type conversions in initializer lists since not all compilers support them.
|
2016-03-04 14:12:45 -08:00 |
|
Benoit Steiner
|
174edf976b
|
Made the contraction test more portable
|
2016-03-04 14:11:13 -08:00 |
|
Benoit Steiner
|
2c50fc878e
|
Fixed a typo
|
2016-03-04 14:09:38 -08:00 |
|
Benoit Steiner
|
deea866bbd
|
Added tests to cover the new rounding, flooring and ceiling tensor operations.
|
2016-03-03 12:38:02 -08:00 |
|
Benoit Steiner
|
5cf4558c0a
|
Added support for rounding, flooring, and ceiling to the tensor api
|
2016-03-03 12:36:55 -08:00 |
|
Benoit Steiner
|
dac58d7c35
|
Added a test to validate the conversion of half floats into floats on Kepler GPUs.
Restricted the testing of the random number generation code to GPU architecture greater than or equal to 3.5.
|
2016-03-03 10:37:25 -08:00 |
|
Benoit Steiner
|
1032441c6f
|
Enable partial support for half floats on Kepler GPUs.
|
2016-03-03 10:34:20 -08:00 |
|
Benoit Steiner
|
1da10a7358
|
Enable the conversion between floats and half floats on older GPUs that support it.
|
2016-03-03 10:33:20 -08:00 |
|
Benoit Steiner
|
2de8cc9122
|
Merged in ebrevdo/eigen (pull request PR-167)
Add infinity() support to numext::numeric_limits, use it in lgamma.
I tested the code on my gtx-titan-black gpu, and it appears to work as expected.
|
2016-03-03 09:42:12 -08:00 |
|
Eugene Brevdo
|
ab3dc0b0fe
|
Small bugfix to numeric_limits for CUDA.
|
2016-03-02 21:48:46 -08:00 |
|
Eugene Brevdo
|
6afea46838
|
Add infinity() support to numext::numeric_limits, use it in lgamma.
This makes the infinity access a __device__ function, removing
nvcc warnings.
|
2016-03-02 21:35:48 -08:00 |
|
Gael Guennebaud
|
3fccef6f50
|
bug #537: fix compilation with Apple's compiler
|
2016-03-02 13:22:46 +01:00 |
|
Benoit Steiner
|
fedaf19262
|
Pulled latest updates from trunk
|
2016-03-01 06:15:44 -08:00 |
|
Gael Guennebaud
|
dfa80b2060
|
Compilation fix
|
2016-03-01 12:48:56 +01:00 |
|
Gael Guennebaud
|
bee9efc203
|
Compilation fix
|
2016-03-01 12:47:27 +01:00 |
|
Benoit Steiner
|
68ac5c1738
|
Improved the performance of large outer reductions on cuda
|
2016-02-29 18:11:58 -08:00 |
|
Benoit Steiner
|
56a3ada670
|
Added benchmarks for full reduction
|
2016-02-29 14:57:52 -08:00 |
|
Benoit Steiner
|
b2075cb7a2
|
Made the signature of the inner and outer reducers consistent
|
2016-02-29 10:53:38 -08:00 |
|
Benoit Steiner
|
3284842045
|
Optimized the performance of narrow reductions on CUDA devices
|
2016-02-29 10:48:16 -08:00 |
|
Gael Guennebaud
|
e9bea614ec
|
Fix shortcoming in fixed-value deduction of startRow/startCol
|
2016-02-29 10:31:27 +01:00 |
|
Benoit Steiner
|
609b3337a7
|
Print some information to stderr when a CUDA kernel fails
|
2016-02-27 20:42:57 +00:00 |
|
Benoit Steiner
|
1031b31571
|
Improved the README
|
2016-02-27 20:22:04 +00:00 |
|
Gael Guennebaud
|
8e6faab51e
|
bug #1172: make valuePtr and innerIndexPtr properly return null for empty matrices.
|
2016-02-27 14:55:40 +01:00 |
|
Benoit Steiner
|
ac2e6e0d03
|
Properly vectorized the random number generators
|
2016-02-26 13:52:24 -08:00 |
|
Benoit Steiner
|
caa54d888f
|
Made the TensorIndexList usable on GPU without having to use the -relaxed-constexpr compilation flag
|
2016-02-26 12:38:18 -08:00 |
|
Benoit Steiner
|
93485d86bc
|
Added benchmarks for type casting of float16
|
2016-02-26 12:24:58 -08:00 |
|
Benoit Steiner
|
002824e32d
|
Added benchmarks for fp16
|
2016-02-26 12:21:25 -08:00 |
|
Benoit Steiner
|
2cd32cad27
|
Reverted previous commit since it caused more problems than it solved
|
2016-02-26 13:21:44 +00:00 |
|
Benoit Steiner
|
d9d05dd96e
|
Fixed handling of long doubles on aarch64
|
2016-02-26 04:13:58 -08:00 |
|