Benoit Steiner
|
5c372d19e3
|
Merged in rmlarsen/eigen (pull request PR-179)
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
|
2016-04-21 18:06:36 -07:00 |
|
Benoit Steiner
|
4bbc97be5e
|
Provide access to the base threadpool classes
|
2016-04-21 17:59:33 -07:00 |
|
Rasmus Munk Larsen
|
a3256d78d8
|
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
|
2016-04-21 16:49:28 -07:00 |
|
Benoit Steiner
|
33adce5c3a
|
Added the ability to switch to the new thread pool with a #define
|
2016-04-21 11:59:58 -07:00 |
|
Benoit Steiner
|
79b900375f
|
Use index list for the striding benchmarks
|
2016-04-21 11:58:27 -07:00 |
|
Benoit Steiner
|
f670613e4b
|
Fixed several compilation warnings
|
2016-04-21 11:03:02 -07:00 |
|
Benoit Steiner
|
6015422ee6
|
Added an option to enable the use of the F16C instruction set
|
2016-04-21 10:30:29 -07:00 |
|
Benoit Steiner
|
32ffce04fc
|
Use EIGEN_THREAD_YIELD instead of std::this_thread::yield to make the code more portable.
|
2016-04-21 08:47:28 -07:00 |
|
Benoit Steiner
|
2dde1b1028
|
Don't crash when attempting to reduce empty tensors.
|
2016-04-20 18:08:20 -07:00 |
|
Benoit Steiner
|
a792cd357d
|
Added more tests
|
2016-04-20 17:33:58 -07:00 |
|
Benoit Steiner
|
80200a1828
|
Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instructions.
|
2016-04-20 12:10:27 -07:00 |
|
Benoit Steiner
|
c7c2054bb5
|
Started to implement a portable way to yield.
|
2016-04-19 17:59:58 -07:00 |
|
Benoit Steiner
|
1d0238375d
|
Made sure all the required header files are included when trying to use fp16
|
2016-04-19 17:44:12 -07:00 |
|
Benoit Steiner
|
2b72163028
|
Implemented a more portable version of thread local variables
|
2016-04-19 15:56:02 -07:00 |
|
Benoit Steiner
|
04f954956d
|
Fixed a few typos
|
2016-04-19 15:27:09 -07:00 |
|
Benoit Steiner
|
5b1106c56b
|
Fixed a compilation error with nvcc 7.
|
2016-04-19 14:57:57 -07:00 |
|
Benoit Steiner
|
7129d998db
|
Simplified the code that launches cuda kernels.
|
2016-04-19 14:55:21 -07:00 |
|
Benoit Steiner
|
b9ea40c30d
|
Don't take the address of a kernel on CUDA devices that don't support this feature.
|
2016-04-19 14:35:11 -07:00 |
|
Benoit Steiner
|
884c075058
|
Use numext::ceil instead of std::ceil
|
2016-04-19 14:33:30 -07:00 |
|
Benoit Steiner
|
a278414d1b
|
Avoid an unnecessary copy of the evaluator.
|
2016-04-19 13:54:28 -07:00 |
|
Benoit Steiner
|
f953c60705
|
Fixed 2 recent regression tests
|
2016-04-19 12:57:39 -07:00 |
|
Benoit Steiner
|
50968a0a3e
|
Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors.
|
2016-04-19 11:53:58 -07:00 |
|
Benoit Steiner
|
84543c8be2
|
Worked around the lack of a rand_r function on windows systems
|
2016-04-17 19:29:27 -07:00 |
|
Benoit Steiner
|
5fbcfe5eb4
|
Worked around the lack of a rand_r function on windows systems
|
2016-04-17 18:42:31 -07:00 |
|
Gael Guennebaud
|
e4fe611e2c
|
Enable lazy-coeff-based-product for vector*(1x1) products
|
2016-04-16 15:17:39 +02:00 |
|
Benoit Steiner
|
c8e8f93d6c
|
Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators.
|
2016-04-15 16:48:10 -07:00 |
|
Benoit Steiner
|
1a16fb1532
|
Deleted extraneous comma.
|
2016-04-15 15:50:13 -07:00 |
|
Benoit Steiner
|
7cff898e0a
|
Deleted unnecessary variable
|
2016-04-15 15:46:14 -07:00 |
|
Benoit Steiner
|
6c43c49e4a
|
Fixed a few compilation warnings
|
2016-04-15 15:34:34 -07:00 |
|
Benoit Steiner
|
eb669f989f
|
Merged in rmlarsen/eigen (pull request PR-178)
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions.
|
2016-04-15 14:53:15 -07:00 |
|
Gael Guennebaud
|
2a7115daca
|
bug #1203: by-pass large stack-allocation in stableNorm if EIGEN_STACK_ALLOCATION_LIMIT is too small
|
2016-04-15 22:34:11 +02:00 |
|
Rasmus Munk Larsen
|
3718bf654b
|
Get rid of void* casting when calling EvalRange::run.
|
2016-04-15 12:51:33 -07:00 |
|
Benoit Steiner
|
40c9923a8a
|
Fixed compilation errors with msvc
|
2016-04-15 11:27:52 -07:00 |
|
Benoit Steiner
|
1d23430628
|
Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g. on Haswell CPUs).
|
2016-04-15 10:53:31 -07:00 |
|
Gael Guennebaud
|
1e80bddde3
|
Fix trmv for mixing types.
|
2016-04-15 17:58:36 +02:00 |
|
Benoit Steiner
|
a62e924656
|
Added ability to access the cache sizes from the tensor devices
|
2016-04-14 21:25:06 -07:00 |
|
Benoit Steiner
|
18e6f67426
|
Added support for exclusive or
|
2016-04-14 20:37:46 -07:00 |
|
Rasmus Munk Larsen
|
07ac4f7e02
|
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default.
|
2016-04-14 18:28:23 -07:00 |
|
Benoit Steiner
|
9624a1ea3d
|
Added missing definition of PacketSize in the gpu evaluator of convolution
|
2016-04-14 17:16:58 -07:00 |
|
Benoit Steiner
|
6fbedf5a4e
|
Merged in rmlarsen/eigen (pull request PR-177)
Eigen Tensor cost model part 1.
|
2016-04-14 17:13:19 -07:00 |
|
Benoit Steiner
|
bebb89acfa
|
Enabled the new threadpool tests
|
2016-04-14 16:44:10 -07:00 |
|
Benoit Steiner
|
9c064b5a97
|
Cleanup
|
2016-04-14 16:41:31 -07:00 |
|
Benoit Steiner
|
1372156c41
|
Prepared the migration to the new non blocking thread pool
|
2016-04-14 16:16:42 -07:00 |
|
Rasmus Munk Larsen
|
aeb5494a0b
|
Improvements to cost model.
|
2016-04-14 15:52:58 -07:00 |
|
Benoit Steiner
|
00dfe18487
|
Merged latest updates from trunk
|
2016-04-14 15:25:20 -07:00 |
|
Benoit Steiner
|
a8e8837ba7
|
Added tests for the non blocking thread pool
|
2016-04-14 15:23:49 -07:00 |
|
Benoit Steiner
|
78a51abc12
|
Added a more scalable non blocking thread pool
|
2016-04-14 15:23:10 -07:00 |
|
Rasmus Munk Larsen
|
d2e95492e7
|
Merge upstream updates.
|
2016-04-14 13:59:50 -07:00 |
|
Rasmus Munk Larsen
|
235e83aba6
|
Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions.
|
2016-04-14 13:57:35 -07:00 |
|
Gael Guennebaud
|
68897c52f3
|
Add extreme values to the imaginary part for SVD unit tests.
|
2016-04-14 22:47:30 +02:00 |
|