Mark Borgerding
|
7ddcf97da7
|
added scalar_sign_op (both real,complex)
|
2015-11-24 17:15:07 -05:00 |
|
Benoit Steiner
|
44848ac39b
|
Fixed a bug in TensorArgMax.h
|
2015-11-23 15:58:47 -08:00 |
|
Benoit Steiner
|
547a8608e5
|
Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC.
Also updated the code to silence bogus warnings generated by nvcc when compiling this function.
|
2015-11-23 12:17:45 -08:00 |
|
Benoit Steiner
|
562078780a
|
Don't create more cuda blocks than necessary
|
2015-11-23 11:00:10 -08:00 |
|
Benoit Steiner
|
df31ca3b9e
|
Made it possible to refer to a GPUDevice from code compiled with a regular C++ compiler
|
2015-11-23 10:03:53 -08:00 |
|
Benoit Steiner
|
1e04059012
|
Deleted unused variable.
|
2015-11-23 08:36:54 -08:00 |
|
Benoit Steiner
|
9fa65d3838
|
Split TensorDeviceType.h into 3 files to make it more manageable
|
2015-11-20 17:42:50 -08:00 |
|
Benoit Steiner
|
a367804856
|
Added option to force the usage of the Eigen array class instead of the std::array class.
|
2015-11-20 12:41:40 -08:00 |
|
Benoit Steiner
|
383d1cc2ed
|
Added proper support for fast 64bit integer division on CUDA
|
2015-11-20 11:09:46 -08:00 |
|
Benoit Steiner
|
f37a5f1c53
|
Fixed compilation error triggered by nvcc
|
2015-11-19 14:34:26 -08:00 |
|
Benoit Steiner
|
f8df393165
|
Added support for 128bit integers on CUDA devices.
|
2015-11-19 13:57:27 -08:00 |
|
Benoit Steiner
|
1dd444ea71
|
Avoid using the version of TensorIntDiv optimized for 32-bit integers when the divisor can be equal to one since it isn't supported.
|
2015-11-18 11:37:58 -08:00 |
|
Benoit Steiner
|
f1fbd74db9
|
Added sanity check
|
2015-11-13 09:07:27 -08:00 |
|
Benoit Steiner
|
7815b84be4
|
Fixed a compilation warning
|
2015-11-12 20:16:59 -08:00 |
|
Benoit Steiner
|
10a91930cc
|
Fixed a compilation warning triggered by nvcc
|
2015-11-12 20:10:52 -08:00 |
|
Benoit Steiner
|
ed4b37de02
|
Fixed a few compilation warnings
|
2015-11-12 20:08:01 -08:00 |
|
Benoit Steiner
|
b69248fa2a
|
Added a couple of missing EIGEN_DEVICE_FUNC
|
2015-11-12 20:01:50 -08:00 |
|
Benoit Steiner
|
0aaa5941df
|
Silenced some compilation warnings triggered by nvcc
|
2015-11-12 19:11:43 -08:00 |
|
Benoit Steiner
|
2c73633b28
|
Fixed a few more typos
|
2015-11-12 18:39:19 -08:00 |
|
Benoit Steiner
|
be08e82953
|
Fixed typos
|
2015-11-12 18:37:40 -08:00 |
|
Benoit Steiner
|
150c12e138
|
Completed the IndexList rewrite
|
2015-11-12 18:11:56 -08:00 |
|
Benoit Steiner
|
8037826367
|
Simplified more of the IndexList code.
|
2015-11-12 17:19:45 -08:00 |
|
Benoit Steiner
|
e9ecfad796
|
Started to make the IndexList code compile with more compilers
|
2015-11-12 16:41:14 -08:00 |
|
Benoit Steiner
|
7a1316fcc5
|
Fixed compilation error with xcode.
|
2015-11-12 11:05:54 -08:00 |
|
Benoit Steiner
|
737d237722
|
Made it possible to run some of the CXXMeta functions on a CUDA device.
|
2015-11-12 09:02:59 -08:00 |
|
Benoit Steiner
|
1e072424e8
|
Moved the array code into its own file.
|
2015-11-12 08:57:04 -08:00 |
|
Benoit Steiner
|
aa5f1ca714
|
gen_numeric_list takes a size_t, not an int
|
2015-11-12 08:30:10 -08:00 |
|
Benoit Steiner
|
9fa10fe52d
|
Don't use std::array when compiling with nvcc since nvidia doesn't support the use of STL containers on GPU.
|
2015-11-11 15:38:30 -08:00 |
|
Benoit Steiner
|
c587293e48
|
Fixed a compilation warning
|
2015-11-11 15:35:12 -08:00 |
|
Benoit Steiner
|
7f1c29fb0c
|
Make it possible for a vectorized tensor expression to be executed in a CUDA kernel.
|
2015-11-11 15:22:50 -08:00 |
|
Benoit Steiner
|
99f4778506
|
Disable SFINAE when compiling with nvcc
|
2015-11-11 15:04:58 -08:00 |
|
Benoit Steiner
|
5cb18e5b5e
|
Fixed CUDA compilation errors
|
2015-11-11 14:36:33 -08:00 |
|
Benoit Steiner
|
228edfe616
|
Use Eigen::NumTraits instead of std::numeric_limits
|
2015-11-11 09:26:23 -08:00 |
|
Benoit Steiner
|
d573efe303
|
Code cleanup
|
2015-11-06 14:54:28 -08:00 |
|
Benoit Steiner
|
9fa283339f
|
Silenced a compilation warning
|
2015-11-06 11:44:22 -08:00 |
|
Benoit Steiner
|
53432a17b2
|
Added static assertions to avoid misuses of padding, broadcasting and concatenation ops.
|
2015-11-06 10:26:19 -08:00 |
|
Benoit Steiner
|
6857a35a11
|
Fixed typos
|
2015-11-06 09:42:05 -08:00 |
|
Benoit Steiner
|
33cbdc2d15
|
Added more missing EIGEN_DEVICE_FUNC
|
2015-11-06 09:29:59 -08:00 |
|
Benoit Steiner
|
ed1962b464
|
Reimplement the tensor comparison operators by using the scalar_cmp_op functors. This makes them more cuda friendly.
|
2015-11-06 09:18:43 -08:00 |
|
Benoit Steiner
|
29038b982d
|
Added support for modulo operation
|
2015-11-05 19:39:48 -08:00 |
|
Benoit Steiner
|
c75a19f815
|
Misc fixes to full reductions
|
2015-11-05 14:21:20 -08:00 |
|
Benoit Steiner
|
ec5a81b45a
|
Fixed a bug in the extraction of sizes of fixed sized tensors of rank 0
|
2015-11-05 13:39:48 -08:00 |
|
Benoit Steiner
|
beedd9630d
|
Updated the reduction code so that full reductions now return a tensor of rank 0.
|
2015-11-04 13:57:36 -08:00 |
|
Benoit Steiner
|
6a02c2a85d
|
Fixed a compilation warning
|
2015-10-29 20:21:29 -07:00 |
|
Benoit Steiner
|
ca12d4c3b3
|
Pulled latest updates from trunk
|
2015-10-29 17:57:48 -07:00 |
|
Benoit Steiner
|
ce19e38c1f
|
Added support for tensor maps of rank 0.
|
2015-10-29 17:49:04 -07:00 |
|
Benoit Steiner
|
3785c69294
|
Added support for fixed sized tensors of rank 0
|
2015-10-29 17:31:03 -07:00 |
|
Benoit Steiner
|
0d7a23d34e
|
Extended the reduction code so that reducing an empty set returns the neutral element for the operation
|
2015-10-29 17:29:49 -07:00 |
|
Benoit Steiner
|
1b0685d09a
|
Added support for rank-0 tensors
|
2015-10-29 17:27:38 -07:00 |
|
Benoit Steiner
|
c444a0a8c3
|
Consistently use the same index type in the fft codebase.
|
2015-10-29 16:39:47 -07:00 |
|