Benoit Steiner
67fcf47ecb
Merged from trunk
2014-10-30 21:59:22 -07:00
Benoit Steiner
fcecafde3a
Fixed a compilation error with clang
2014-10-30 21:58:14 -07:00
Benoit Steiner
d62bfe73a9
Use the proper index type in the padding code
2014-10-30 18:15:05 -07:00
Benoit Steiner
bc99c5f7db
fixed some potential alignment issues.
2014-10-30 18:09:53 -07:00
Benoit Steiner
1946cc4478
Added missing packet primitives for CUDA.
2014-10-30 17:52:32 -07:00
Benoit Steiner
5e62427e22
Use the proper index type
2014-10-30 17:49:39 -07:00
Benoit Steiner
debc97821c
Added support for tensor references
2014-10-28 23:10:13 -07:00
Benoit Steiner
f786897e4b
Added access to the unerlying raw data of a tnsor slice/chip whenever possible
2014-10-17 15:33:27 -07:00
Benoit Steiner
7acd38d19e
Created some benchmarks for the tensor code
2014-10-17 09:49:03 -07:00
Benoit Steiner
65af852b54
Silenced one last warning
2014-10-16 15:02:30 -07:00
Benoit Steiner
ae697b471c
Silenced a few compilation warnings
...
Generalized a TensorMap constructor
2014-10-16 14:52:50 -07:00
Benoit Steiner
94e47798f4
Fixed the return types of unary and binary expressions to properly handle the case where it is different from the input type (e.g. abs(complex<float>))
2014-10-16 10:41:07 -07:00
Benoit Steiner
d853adffdb
Avoid calling get_future() more than once on a given promise.
2014-10-16 10:10:04 -07:00
Benoit Steiner
bfdd9f3ac9
Made the blocking computation aware of the l3 cache
...
Also optimized the blocking parameters to take into account the number of threads used for a computation
2014-10-15 15:32:59 -07:00
Benoit Steiner
dba55041ab
Added support for promises
...
Started to improve multithreaded contractions
2014-10-15 11:20:36 -07:00
Benoit Steiner
99d75235a9
Misc improvements and cleanups
2014-10-13 17:02:09 -07:00
Benoit Steiner
4c70b0a762
Added support for patch extraction
2014-10-13 10:04:04 -07:00
Benoit Steiner
0219f8aed4
Added ability to print a tensor using an iostream.
2014-10-10 16:17:26 -07:00
Benoit Steiner
2ed1838aeb
Added support for tensor chips
2014-10-10 16:11:27 -07:00
Benoit Steiner
4b36c3591f
Fixed the tensor shuffling test
2014-10-10 15:43:21 -07:00
Benoit Steiner
a991f94c0e
Fixed the thread pool test
2014-10-10 15:20:37 -07:00
Benoit Steiner
498b7eed25
Rewrote the TensorBase::random method to support the generation of random number on gpu.
2014-10-09 15:39:13 -07:00
Benoit Steiner
767424af18
Improved the functors defined for standard reductions
...
Added a functor to encapsulate the generation of random numbers on cpu and gpu.
2014-10-09 15:36:23 -07:00
Benoit Steiner
44beee9d68
Removed dead code
2014-10-08 14:14:20 -07:00
Benoit Steiner
0a07ac574e
Added support for the *= and /* operators to TensorBase
2014-10-08 13:32:41 -07:00
Benoit Steiner
6c047d398d
Fixed a comment
2014-10-08 13:29:36 -07:00
Benoit Steiner
bbce6fa65d
define EIGEN_VECTORIZE_CUDA when compiling with nvcc
2014-10-03 19:55:35 -07:00
Benoit Steiner
95a430a2ca
Vector primitives for CUDA
2014-10-03 19:45:19 -07:00
Benoit Steiner
152f3218ac
Improved contraction test
2014-10-03 19:33:44 -07:00
Benoit Steiner
af2e5995e2
Improved support for CUDA devices.
...
Improved contractions on GPU
2014-10-03 19:18:07 -07:00
Benoit Steiner
1269392822
Created the IndexPair type to store pair of tensor indices. CUDA doesn't support std::pair so we can't use them when targeting GPUs.
...
Improved the performance on tensor contractions
2014-10-03 10:16:59 -07:00
Benoit Steiner
b7271dffb5
Generalized the gebp apis
2014-10-02 16:51:57 -07:00
Benoit Steiner
8b2afe33a1
Fixes for the forced evaluation of tensor expressions
...
More tests
2014-10-02 10:39:36 -07:00
Benoit Steiner
5cc23199be
More tests to validate the const-correctness of the tensor code.
2014-10-02 10:30:44 -07:00
Benoit Steiner
7caaf6453b
Added support for tensor reductions and concatenations
2014-10-01 20:38:22 -07:00
Benoit Steiner
1c236f4c9a
Added tests for tensors of const values and tensors of stringswwq::
2014-10-01 20:21:42 -07:00
Benoit Steiner
10a79ca3a3
Merged latest updates from the Eigen trunk.
2014-09-15 09:18:16 -07:00
Jitse Niesen
9452eb38f8
Make UpperBidiagonalization accept row-major matrices (bug #769 )
...
* Give temporary workspace the same storage order as original matrix
* Take storage order into account when determining inner stride
of rows and columns
* Change one test to use a row-major matrix.
2014-09-12 14:52:35 +01:00
Benoit Steiner
efdff15749
Fixed a typo in the contraction code
2014-09-06 13:28:24 -07:00
Benoit Steiner
74db22455a
Misc fixes.
2014-09-05 07:47:43 -07:00
Benoit Steiner
1abe4ed14c
Created more regression tests
2014-09-04 20:27:28 -07:00
Benoit Steiner
d43f737b4a
Added support for evaluation of tensor shuffling operations as lvalues
2014-09-04 20:02:28 -07:00
Benoit Steiner
f50548e86a
Added missing tensor copy constructors. As a result it is now possible to declare and initialize a tensor on the same line, as in:
...
Tensor<bla> T = A + B; or
Tensor<bla> T(A.reshape(new_shape));
2014-09-04 19:50:27 -07:00
Benoit Steiner
b24fe22b1a
Improved the performance of the tensor convolution code by a factor of about 4.
2014-09-03 11:38:13 -07:00
Benoit Steiner
2959045f2f
Optimized the tensor padding code.
2014-08-26 09:47:18 -07:00
Benoit Steiner
36fffe48f7
Misc api improvements and cleanups
2014-08-23 14:35:41 -07:00
Benoit Steiner
fb5c1e9097
Optimized and cleaned up the tensor morphing code
2014-08-23 13:18:30 -07:00
Benoit Steiner
3d298da269
Added support for broadcasting
2014-08-20 17:00:50 -07:00
Benoit Steiner
9ac3c821ea
Improved the speed of convolutions when running on cuda devices
2014-08-19 16:57:10 -07:00
Benoit Steiner
33c702c79f
Added support for fast integer divisions by a constant
...
Sped up tensor slicing by a factor of 3 by using these fast integer divisions.
2014-08-14 22:13:21 -07:00