Gael Guennebaud
|
978c379ed7
|
Add missing ctor from uint
|
2015-12-30 12:52:38 +01:00 |
|
Benoit Steiner
|
bdcbc66a5c
|
Don't attempt to vectorize mean reductions of integers since we can't use
SSE or AVX instructions to divide 2 integers.
|
2015-12-22 17:51:55 -08:00 |
|
Benoit Steiner
|
a1e08fb2a5
|
Optimized the configuration of the outer reduction cuda kernel
|
2015-12-22 16:30:10 -08:00 |
|
Benoit Steiner
|
9c7d96697b
|
Added missing define
|
2015-12-22 16:11:07 -08:00 |
|
Benoit Steiner
|
e7e6d01810
|
Made sure the optimized gpu reduction code is actually compiled.
|
2015-12-22 15:07:33 -08:00 |
|
Benoit Steiner
|
b5d2078c4a
|
Optimized outer reduction on GPUs.
|
2015-12-22 15:06:17 -08:00 |
|
Benoit Steiner
|
1c3e78319d
|
Added missing const
|
2015-12-21 15:05:01 -08:00 |
|
Benoit Steiner
|
1b82969559
|
Add alignment requirement for local buffer used by the slicing op.
|
2015-12-18 14:36:35 -08:00 |
|
Benoit Steiner
|
75a7fa1919
|
Doubled the speed of full reductions on GPUs.
|
2015-12-18 14:07:31 -08:00 |
|
Benoit Steiner
|
8dd17cbe80
|
Fixed a clang compilation warning triggered by the use of arrays of size 0.
|
2015-12-17 14:00:33 -08:00 |
|
Benoit Steiner
|
4aac55f684
|
Silenced some compilation warnings triggered by nvcc
|
2015-12-17 13:39:01 -08:00 |
|
Benoit Steiner
|
40e6250fc3
|
Made it possible to run tensor chipping operations on CUDA devices
|
2015-12-17 13:29:08 -08:00 |
|
Benoit Steiner
|
2ca55a3ae4
|
Fixed some compilation errors triggered by the tensor code with msvc 2008
|
2015-12-16 20:45:58 -08:00 |
|
Benoit Steiner
|
17352e2792
|
Made the entire TensorFixedSize api callable from a CUDA kernel.
|
2015-12-14 15:20:31 -08:00 |
|
Benoit Steiner
|
75e19fc7ca
|
Marked the tensor constructors as EIGEN_DEVICE_FUNC: This makes it possible to call them from a CUDA kernel.
|
2015-12-14 15:12:55 -08:00 |
|
Gael Guennebaud
|
ca39b1546e
|
Merged in ebrevdo/eigen (pull request PR-148)
Add special functions to eigen: lgamma, erf, erfc.
|
2015-12-11 11:52:09 +01:00 |
|
Benoit Steiner
|
6af52a1227
|
Fixed a typo in the constructor of tensors of rank 5.
|
2015-12-10 23:31:12 -08:00 |
|
Benoit Steiner
|
8e00ea9a92
|
Fixed the coefficient accessors used for the 2d and 3d cases when compiling without cxx11 support.
|
2015-12-10 22:45:10 -08:00 |
|
Eugene Brevdo
|
fa4f933c0f
|
Add special functions to Eigen: lgamma, erf, erfc.
Includes CUDA support and unit tests.
|
2015-12-07 15:24:49 -08:00 |
|
Benoit Steiner
|
7dfe75f445
|
Fixed compilation warnings
|
2015-12-07 08:12:30 -08:00 |
|
Benoit Steiner
|
f4ca8ad917
|
Use signed integers instead of unsigned ones more consistently in the codebase.
|
2015-12-04 18:14:16 -08:00 |
|
Benoit Steiner
|
490d26e4c1
|
Use integers instead of std::size_t to encode the number of dimensions in the Tensor class since most of the code already uses integers.
|
2015-12-04 10:15:11 -08:00 |
|
Benoit Steiner
|
d20efc974d
|
Made it possible to use the sigmoid functor within a CUDA kernel.
|
2015-12-04 09:38:15 -08:00 |
|
Benoit Steiner
|
029052d276
|
Deleted redundant code
|
2015-12-03 17:08:47 -08:00 |
|
Mark Borgerding
|
7ddcf97da7
|
added scalar_sign_op (both real,complex)
|
2015-11-24 17:15:07 -05:00 |
|
Benoit Steiner
|
44848ac39b
|
Fixed a bug in TensorArgMax.h
|
2015-11-23 15:58:47 -08:00 |
|
Benoit Steiner
|
547a8608e5
|
Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC.
Also updated the code to silence bogus warnings generated by nvcc when compiling this function.
|
2015-11-23 12:17:45 -08:00 |
|
Benoit Steiner
|
562078780a
|
Don't create more cuda blocks than necessary
|
2015-11-23 11:00:10 -08:00 |
|
Benoit Steiner
|
df31ca3b9e
|
Made it possible to refer to a GPUDevice from code compiled with a regular C++ compiler
|
2015-11-23 10:03:53 -08:00 |
|
Benoit Steiner
|
1e04059012
|
Deleted unused variable.
|
2015-11-23 08:36:54 -08:00 |
|
Benoit Steiner
|
9fa65d3838
|
Split TensorDeviceType.h in 3 files to make it more manageable
|
2015-11-20 17:42:50 -08:00 |
|
Benoit Steiner
|
a367804856
|
Added option to force the usage of the Eigen array class instead of the std::array class.
|
2015-11-20 12:41:40 -08:00 |
|
Benoit Steiner
|
383d1cc2ed
|
Added proper support for fast 64bit integer division on CUDA
|
2015-11-20 11:09:46 -08:00 |
|
Benoit Steiner
|
f37a5f1c53
|
Fixed compilation error triggered by nvcc
|
2015-11-19 14:34:26 -08:00 |
|
Benoit Steiner
|
f8df393165
|
Added support for 128bit integers on CUDA devices.
|
2015-11-19 13:57:27 -08:00 |
|
Benoit Steiner
|
1dd444ea71
|
Avoid using the version of TensorIntDiv optimized for 32-bit integers when the divisor can be equal to one since it isn't supported.
|
2015-11-18 11:37:58 -08:00 |
|
Benoit Steiner
|
f1fbd74db9
|
Added sanity check
|
2015-11-13 09:07:27 -08:00 |
|
Benoit Steiner
|
7815b84be4
|
Fixed a compilation warning
|
2015-11-12 20:16:59 -08:00 |
|
Benoit Steiner
|
10a91930cc
|
Fixed a compilation warning triggered by nvcc
|
2015-11-12 20:10:52 -08:00 |
|
Benoit Steiner
|
ed4b37de02
|
Fixed a few compilation warnings
|
2015-11-12 20:08:01 -08:00 |
|
Benoit Steiner
|
b69248fa2a
|
Added a couple of missing EIGEN_DEVICE_FUNC
|
2015-11-12 20:01:50 -08:00 |
|
Benoit Steiner
|
0aaa5941df
|
Silenced some compilation warnings triggered by nvcc
|
2015-11-12 19:11:43 -08:00 |
|
Benoit Steiner
|
2c73633b28
|
Fixed a few more typos
|
2015-11-12 18:39:19 -08:00 |
|
Benoit Steiner
|
be08e82953
|
Fixed typos
|
2015-11-12 18:37:40 -08:00 |
|
Benoit Steiner
|
150c12e138
|
Completed the IndexList rewrite
|
2015-11-12 18:11:56 -08:00 |
|
Benoit Steiner
|
8037826367
|
Simplified more of the IndexList code.
|
2015-11-12 17:19:45 -08:00 |
|
Benoit Steiner
|
e9ecfad796
|
Started to make the IndexList code compile by more compilers
|
2015-11-12 16:41:14 -08:00 |
|
Benoit Steiner
|
7a1316fcc5
|
Fixed compilation error with xcode.
|
2015-11-12 11:05:54 -08:00 |
|
Benoit Steiner
|
737d237722
|
Made it possible to run some of the CXXMeta functions on a CUDA device.
|
2015-11-12 09:02:59 -08:00 |
|
Benoit Steiner
|
1e072424e8
|
Moved the array code into its own file.
|
2015-11-12 08:57:04 -08:00 |
|