Rasmus Munk Larsen
|
463738ccbe
|
Use computeProductBlockingSizes to compute blocking for both ShardByCol and ShardByRow cases.
|
2016-04-27 12:26:18 -07:00 |
|
Gael Guennebaud
|
3dddd34133
|
Refactor the unsupported CXX11/Core module to internal headers only.
|
2016-04-26 11:20:25 +02:00 |
|
Benoit Steiner
|
4a164d2c46
|
Fixed the partial evaluation of non vectorizable tensor subexpressions
|
2016-04-25 10:43:03 -07:00 |
|
Benoit Steiner
|
fd9401f260
|
Refined the cost of the striding operation.
|
2016-04-25 09:16:08 -07:00 |
|
Benoit Steiner
|
4bbc97be5e
|
Provide access to the base threadpool classes
|
2016-04-21 17:59:33 -07:00 |
|
Benoit Steiner
|
33adce5c3a
|
Added the ability to switch to the new thread pool with a #define
|
2016-04-21 11:59:58 -07:00 |
|
Benoit Steiner
|
f670613e4b
|
Fixed several compilation warnings
|
2016-04-21 11:03:02 -07:00 |
|
Benoit Steiner
|
32ffce04fc
|
Use EIGEN_THREAD_YIELD instead of std::this_thread::yield to make the code more portable.
|
2016-04-21 08:47:28 -07:00 |
|
Benoit Steiner
|
2dde1b1028
|
Don't crash when attempting to reduce empty tensors.
|
2016-04-20 18:08:20 -07:00 |
|
Benoit Steiner
|
a792cd357d
|
Added more tests
|
2016-04-20 17:33:58 -07:00 |
|
Benoit Steiner
|
c7c2054bb5
|
Started to implement a portable way to yield.
|
2016-04-19 17:59:58 -07:00 |
|
Benoit Steiner
|
2b72163028
|
Implemented a more portable version of thread local variables
|
2016-04-19 15:56:02 -07:00 |
|
Benoit Steiner
|
04f954956d
|
Fixed a few typos
|
2016-04-19 15:27:09 -07:00 |
|
Benoit Steiner
|
5b1106c56b
|
Fixed a compilation error with nvcc 7.
|
2016-04-19 14:57:57 -07:00 |
|
Benoit Steiner
|
7129d998db
|
Simplified the code that launches CUDA kernels.
|
2016-04-19 14:55:21 -07:00 |
|
Benoit Steiner
|
b9ea40c30d
|
Don't take the address of a kernel on CUDA devices that don't support this feature.
|
2016-04-19 14:35:11 -07:00 |
|
Benoit Steiner
|
884c075058
|
Use numext::ceil instead of std::ceil
|
2016-04-19 14:33:30 -07:00 |
|
Benoit Steiner
|
a278414d1b
|
Avoid an unnecessary copy of the evaluator.
|
2016-04-19 13:54:28 -07:00 |
|
Benoit Steiner
|
f953c60705
|
Fixed 2 recent regression tests
|
2016-04-19 12:57:39 -07:00 |
|
Benoit Steiner
|
50968a0a3e
|
Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors.
|
2016-04-19 11:53:58 -07:00 |
|
Benoit Steiner
|
84543c8be2
|
Worked around the lack of a rand_r function on Windows systems
|
2016-04-17 19:29:27 -07:00 |
|
Benoit Steiner
|
5fbcfe5eb4
|
Worked around the lack of a rand_r function on Windows systems
|
2016-04-17 18:42:31 -07:00 |
|
Benoit Steiner
|
c8e8f93d6c
|
Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators.
|
2016-04-15 16:48:10 -07:00 |
|
Benoit Steiner
|
7cff898e0a
|
Deleted unnecessary variable
|
2016-04-15 15:46:14 -07:00 |
|
Benoit Steiner
|
6c43c49e4a
|
Fixed a few compilation warnings
|
2016-04-15 15:34:34 -07:00 |
|
Benoit Steiner
|
eb669f989f
|
Merged in rmlarsen/eigen (pull request PR-178)
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions.
|
2016-04-15 14:53:15 -07:00 |
|
Rasmus Munk Larsen
|
3718bf654b
|
Get rid of void* casting when calling EvalRange::run.
|
2016-04-15 12:51:33 -07:00 |
|
Benoit Steiner
|
40c9923a8a
|
Fixed compilation errors with msvc
|
2016-04-15 11:27:52 -07:00 |
|
Benoit Steiner
|
a62e924656
|
Added ability to access the cache sizes from the tensor devices
|
2016-04-14 21:25:06 -07:00 |
|
Benoit Steiner
|
18e6f67426
|
Added support for exclusive or
|
2016-04-14 20:37:46 -07:00 |
|
Rasmus Munk Larsen
|
07ac4f7e02
|
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default.
|
2016-04-14 18:28:23 -07:00 |
|
Benoit Steiner
|
9624a1ea3d
|
Added missing definition of PacketSize in the gpu evaluator of convolution
|
2016-04-14 17:16:58 -07:00 |
|
Benoit Steiner
|
6fbedf5a4e
|
Merged in rmlarsen/eigen (pull request PR-177)
Eigen Tensor cost model part 1.
|
2016-04-14 17:13:19 -07:00 |
|
Benoit Steiner
|
bebb89acfa
|
Enabled the new threadpool tests
|
2016-04-14 16:44:10 -07:00 |
|
Benoit Steiner
|
9c064b5a97
|
Cleanup
|
2016-04-14 16:41:31 -07:00 |
|
Benoit Steiner
|
1372156c41
|
Prepared the migration to the new non blocking thread pool
|
2016-04-14 16:16:42 -07:00 |
|
Rasmus Munk Larsen
|
aeb5494a0b
|
Improvements to cost model.
|
2016-04-14 15:52:58 -07:00 |
|
Benoit Steiner
|
a8e8837ba7
|
Added tests for the non blocking thread pool
|
2016-04-14 15:23:49 -07:00 |
|
Benoit Steiner
|
78a51abc12
|
Added a more scalable non blocking thread pool
|
2016-04-14 15:23:10 -07:00 |
|
Rasmus Munk Larsen
|
d2e95492e7
|
Merge upstream updates.
|
2016-04-14 13:59:50 -07:00 |
|
Rasmus Munk Larsen
|
235e83aba6
|
Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions.
|
2016-04-14 13:57:35 -07:00 |
|
Benoit Steiner
|
5912ad877c
|
Silenced a compilation warning
|
2016-04-14 11:40:14 -07:00 |
|
Benoit Steiner
|
2b6e3de02f
|
Added tests to validate flooring and ceiling of fp16
|
2016-04-14 11:39:18 -07:00 |
|
Benoit Steiner
|
6f23e945f6
|
Added simple test for numext::sqrt and numext::pow on fp16
|
2016-04-14 10:32:52 -07:00 |
|
Benoit Steiner
|
72510c80e1
|
Added basic test for trigonometric functions on fp16
|
2016-04-14 10:27:24 -07:00 |
|
Benoit Steiner
|
c7167fee0e
|
Added support for fp16 to the sigmoid function
|
2016-04-14 10:08:33 -07:00 |
|
Benoit Steiner
|
f6003f0873
|
Made the test msvc friendly
|
2016-04-14 09:47:26 -07:00 |
|
Gael Guennebaud
|
7d1391d049
|
Turn a convergence check into a warning
|
2016-04-13 22:50:54 +02:00 |
|
Benoit Steiner
|
e9b12cc1f7
|
Fixed compilation warnings generated by clang
|
2016-04-12 20:53:18 -07:00 |
|
Benoit Steiner
|
e3a184785c
|
Fixed the zeta test
|
2016-04-12 11:12:36 -07:00 |
|