eigen/unsupported/Eigen/CXX11
Artem Belevich 25230d1862 Improve performance of contraction kernels
* Force-inline the kernel implementations. They pass around pointers to shared
  memory blocks. Without inlining, the compiler must operate via generic pointers.
  Inlining lets the compiler detect that we're operating on shared memory,
  which allows it to generate substantially faster code.

* Fixed a long-standing typo which resulted in launching 8x more kernels
  than we needed (.z dimension of the block is unused by the kernel).
2019-12-05 12:48:34 -08:00
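
The 8x figure in the second bullet can be checked with simple arithmetic. The sketch below is only an illustration: `Dim3` is a hypothetical stand-in for CUDA's `dim3`, and the 8x8x8 and 8x8x1 block shapes are assumptions chosen because they reproduce the stated factor, not values taken from the commit itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for CUDA's dim3; this is not an Eigen type.
struct Dim3 {
  std::uint32_t x, y, z;
};

// Number of threads launched per block for a given block shape.
constexpr std::uint64_t threads_per_block(Dim3 b) {
  return static_cast<std::uint64_t>(b.x) * b.y * b.z;
}

// Assumed shapes: a block whose .z extent is never read by the kernel,
// and the same block with .z collapsed to 1.
constexpr Dim3 kBlockWithUnusedZ{8, 8, 8};
constexpr Dim3 kBlockFixed{8, 8, 1};

// How many times more threads the first shape launches than the second.
constexpr std::uint64_t launch_overhead_factor(Dim3 before, Dim3 after) {
  return threads_per_block(before) / threads_per_block(after);
}
```

If the kernel indexes its work only through the .x and .y coordinates, every extra .z slice repeats work that is already covered, so collapsing .z to 1 removes that redundancy without changing the result.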
..
src Improve performance of contraction kernels 2019-12-05 12:48:34 -08:00
CMakeLists.txt
Tensor [SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. 2019-11-28 10:08:54 +00:00
TensorSymmetry
ThreadPool ThreadLocal container that does not rely on thread local storage 2019-09-09 15:18:14 -07:00