Benoit Steiner
95b8961a9b
Allocate the Mersenne twister used by the random number generators on the heap instead of on the stack, since it tends to keep a lot of state (about 5k) around.
2016-03-17 15:23:51 -07:00
Benoit Steiner
f7329619da
Fix bug in tensor contraction. The code assumes that the contraction axis indices for the LHS (after possibly swapping to ColMajor!) are increasing. Explicitly sort the contraction axis pairs to make it so.
2016-03-17 15:08:02 -07:00
Christoph Hertzberg
46aa9772fc
Merged in ebrevdo/eigen (pull request PR-169)
Bugfixes to cuda tests, igamma & igammac implemented, & tests for digamma, igamma, igammac on CPU & GPU.
2016-03-16 21:59:08 +01:00
Benoit Steiner
b72ffcb05e
Made the comparison of Eigen::array GPU friendly
2016-03-11 16:37:59 -08:00
Benoit Steiner
25f69cb932
Added a comparison operator for Eigen::array
Alias Eigen::array to std::array when compiling with Visual Studio 2015
2016-03-11 15:20:37 -08:00
Benoit Steiner
86d45a3c83
Worked around visual studio compilation warnings.
2016-03-09 21:29:39 -08:00
Benoit Steiner
8fd4241377
Fixed a typo.
2016-03-10 02:28:46 +00:00
Benoit Steiner
a685a6beed
Made the list reductions less ambiguous.
2016-03-09 17:41:52 -08:00
Benoit Steiner
3149b5b148
Avoid implicit cast
2016-03-09 17:35:17 -08:00
Benoit Steiner
b2100b83ad
Made sure to include the <random> header file when compiling with visual studio
2016-03-09 16:03:16 -08:00
Benoit Steiner
f05fb449b8
Avoid unnecessary conversion from 32bit int to 64bit unsigned int
2016-03-09 15:27:45 -08:00
Benoit Steiner
1d566417d2
Enable the random number generators when compiling with visual studio
2016-03-09 10:55:11 -08:00
Benoit Steiner
b084133dbf
Fixed the integer division code on windows
2016-03-09 07:06:36 -08:00
Benoit Steiner
6d30683113
Fixed static assertion
2016-03-08 21:02:51 -08:00
Eugene Brevdo
5e7de771e3
Properly fix merge issues.
2016-03-08 17:35:05 -08:00
Benoit Steiner
46177c8d64
Replace std::vector with our own implementation, as using the stl when compiling with nvcc and avx enabled leads to many issues.
2016-03-08 16:37:27 -08:00
Benoit Steiner
6d6413f768
Simplified the full reduction code
2016-03-08 16:02:00 -08:00
Benoit Steiner
5a427a94a9
Fixed the tensor generator code
2016-03-08 13:28:06 -08:00
Benoit Steiner
a81b88bef7
Fixed the tensor concatenation code
2016-03-08 12:30:19 -08:00
Benoit Steiner
551ff11d0d
Fixed the tensor layout swapping code
2016-03-08 12:28:10 -08:00
Benoit Steiner
8768c063f5
Fixed the tensor chipping code.
2016-03-08 12:26:49 -08:00
Benoit Steiner
e09eb835db
Decoupled the packet type definition from the definition of the tensor ops. All the vectorization is now defined in the tensor evaluators. This will make it possible to reliably support devices with different packet types in the same compilation unit.
2016-03-08 12:07:33 -08:00
Benoit Steiner
3b614a2358
Use NumTraits::highest() and NumTraits::lowest() instead of the std::numeric_limits to make the tensor min and max functors more CUDA friendly.
2016-03-07 17:53:28 -08:00
Benoit Steiner
769685e74e
Added the ability to pad a tensor using a non-zero value
2016-03-07 14:45:37 -08:00
Benoit Steiner
7f87cc3a3b
Fix a couple of typos in the code.
2016-03-07 14:31:27 -08:00
Eugene Brevdo
5707004d6b
Fix Eigen's building of sharded tests that use CUDA & more igamma/igammac bugfixes.
0. Prior to this PR, not a single sharded CUDA test was actually being *run*.
Fixed that.
GPU tests are still failing for igamma/igammac.
1. Add calls for igamma/igammac to TensorBase
2. Fix up CUDA-specific calls of igamma/igammac
3. Add unit tests for digamma, igamma, igammac in CUDA.
2016-03-07 14:08:56 -08:00
Benoit Steiner
9a54c3e32b
Don't warn that msvc 2015 isn't c++11 compliant just because it doesn't claim to be.
2016-03-06 09:38:56 -08:00
Benoit Steiner
05bbca079a
Turn on some of the cxx11 features when compiling with visual studio 2015
2016-03-05 10:52:08 -08:00
Benoit Steiner
23aed8f2e4
Use EIGEN_PI instead of redefining our own constant PI
2016-03-05 08:04:45 -08:00
Benoit Steiner
ec35068edc
Don't rely on the M_PI constant since not all compilers provide it.
2016-03-04 16:42:38 -08:00
Benoit Steiner
60d9df11c1
Fixed the computation of leading zeros when compiling with msvc.
2016-03-04 16:27:02 -08:00
Benoit Steiner
c561eeb7bf
Don't use implicit type conversions in initializer lists since not all compilers support them.
2016-03-04 14:12:45 -08:00
Benoit Steiner
2c50fc878e
Fixed a typo
2016-03-04 14:09:38 -08:00
Benoit Steiner
5cf4558c0a
Added support for rounding, flooring, and ceiling to the tensor api
2016-03-03 12:36:55 -08:00
Benoit Steiner
68ac5c1738
Improved the performance of large outer reductions on cuda
2016-02-29 18:11:58 -08:00
Benoit Steiner
b2075cb7a2
Made the signature of the inner and outer reducers consistent
2016-02-29 10:53:38 -08:00
Benoit Steiner
3284842045
Optimized the performance of narrow reductions on CUDA devices
2016-02-29 10:48:16 -08:00
Benoit Steiner
609b3337a7
Print some information to stderr when a CUDA kernel fails
2016-02-27 20:42:57 +00:00
Benoit Steiner
ac2e6e0d03
Properly vectorized the random number generators
2016-02-26 13:52:24 -08:00
Benoit Steiner
caa54d888f
Made the TensorIndexList usable on GPU without having to use the -relaxed-constexpr compilation flag
2016-02-26 12:38:18 -08:00
Benoit Steiner
2cd32cad27
Reverted previous commit since it caused more problems than it solved
2016-02-26 13:21:44 +00:00
Benoit Steiner
d9d05dd96e
Fixed handling of long doubles on aarch64
2016-02-26 04:13:58 -08:00
Benoit Steiner
c36c09169e
Fixed a typo in the reduction code that could prevent large full reductions from running properly on old cuda devices.
2016-02-24 17:07:25 -08:00
Benoit Steiner
7a01cb8e4b
Marked the And and Or reducers as stateless.
2016-02-24 16:43:01 -08:00
Benoit Steiner
1d9256f7db
Updated the padding code to work with half floats
2016-02-23 05:51:22 +00:00
Benoit Steiner
72d2cf642e
Deleted the coordinate based evaluation of tensor expressions, since it's hardly ever used and started to cause some issues with some versions of xcode.
2016-02-22 15:29:41 -08:00
Benoit Steiner
5cd00068c0
include <iostream> in the tensor header since we now use it to better report cuda initialization errors
2016-02-22 13:59:03 -08:00
Benoit Steiner
257b640463
Fixed compilation warning generated by clang
2016-02-21 22:43:37 -08:00
Benoit Steiner
96a24b05cc
Optimized casting of tensors in the case where the casting happens to be a no-op
2016-02-21 11:16:15 -08:00
Benoit Steiner
203490017f
Prevent unnecessary Index to int conversions
2016-02-21 08:49:36 -08:00