Commit Graph

9249 Commits

Author SHA1 Message Date
Benoit Steiner
f0f3591118 Made the reduction code compile with cuda-clang 2017-03-14 14:16:53 -07:00
Rasmus Munk Larsen
bfd7bf9c5b Get rid of Init(). 2017-03-10 08:48:20 -08:00
Rasmus Munk Larsen
d56ab01094 Use C++11 ctor forwarding to simplify code a bit. 2017-03-10 08:30:22 -08:00
Rasmus Munk Larsen
344c2694a6 Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases.
* Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O.

* This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time.

* Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single thread pools tend to be used for.

* Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().
2017-03-09 15:41:03 -08:00
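The spin-count rule described in commit 344c2694a6 can be illustrated with a minimal sketch. This is not the code from the commit: the names `kSpinBudget`, `SpinIterations`, and `ApproxSpinCost` and the budget of 1000 are hypothetical, chosen only to show why dividing the iteration count by the number of threads keeps the spin time roughly constant when each iteration costs time proportional to the number of threads.

```cpp
#include <cassert>

// Hypothetical total spin budget; the value is an assumption for illustration.
constexpr int kSpinBudget = 1000;

// Number of passes through the steal loop before a worker stops spinning.
// Returns 0 when the caller hints that spinning should be turned off
// (for example, a pool that mostly waits on I/O).
inline int SpinIterations(bool allow_spinning, int num_threads,
                          int budget = kSpinBudget) {
  if (!allow_spinning || num_threads <= 0) return 0;
  return budget / num_threads;  // inversely proportional to the thread count
}

// Rough cost model: one pass over the steal loop inspects every thread's
// queue, so the cost of a pass is taken to be proportional to num_threads.
inline int ApproxSpinCost(bool allow_spinning, int num_threads) {
  return SpinIterations(allow_spinning, num_threads) * num_threads;
}
```

Under this model, `ApproxSpinCost(true, 4)` and `ApproxSpinCost(true, 8)` give the same value, while a pool created with spinning disabled does not spin at all.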
Gael Guennebaud
970ff78294 bug #1401: fix compilation of "cond ? x : -x" with x an AutoDiffScalar 2017-03-08 16:16:53 +01:00
Gael Guennebaud
e5156e4d25 fix typo 2017-03-07 11:25:58 +01:00
Gael Guennebaud
5694315fbb remove UTF8 symbol 2017-03-07 10:53:47 +01:00
Gael Guennebaud
e958c2baac remove UTF8 symbols 2017-03-07 10:47:40 +01:00
Gael Guennebaud
d967718525 do not include std header within extern C 2017-03-07 10:16:39 +01:00
Gael Guennebaud
659087b622 bug #1400: fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY 2017-03-07 10:02:34 +01:00
Benoit Steiner
a71943b9a4 Made the Tensor code compile with clang 3.9 2017-03-02 10:47:29 -08:00
Benoit Steiner
09ae0e6586 Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that:
  * they're used consistently between the declaration and the definition of a function
  * we avoid calling host only methods from host device methods.
2017-03-01 11:47:47 -08:00
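The first point of commit 09ae0e6586 (same qualifier on the declaration and on the definition) can be sketched as follows. `SKETCH_DEVICE_FUNC` and `Scaler` are hypothetical stand-ins and are not taken from Eigen; the sketch assumes the qualifier macro expands to `__host__ __device__` under a CUDA compiler and to nothing otherwise, so it also builds with a plain host compiler.

```cpp
#include <cassert>

// Hypothetical stand-in for a qualifier macro in the style of
// EIGEN_DEVICE_FUNC. Assumption: host+device under CUDA, empty elsewhere.
#if defined(__CUDACC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

struct Scaler {
  float factor;
  // Declaration carries the qualifier...
  SKETCH_DEVICE_FUNC float apply(float x) const;
};

// ...and the out-of-class definition repeats exactly the same qualifier,
// so both agree on where the function may run.
SKETCH_DEVICE_FUNC inline float Scaler::apply(float x) const {
  return factor * x;
}
```

The second point follows from the same idea: a function marked for both host and device should only call functions that carry the same marking.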
Benoit Steiner
1e2d046651 Silenced a couple of compilation warnings 2017-03-01 10:13:42 -08:00
Benoit Steiner
c1d87ec110 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-03-01 10:08:50 -08:00
Benoit Steiner
3a3f040baa Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 17:06:15 -08:00
Benoit Steiner
7b61944669 Made most of the packet math primitives usable within CUDA kernel when compiling with clang 2017-02-28 17:05:28 -08:00
Benoit Steiner
c92406d613 Silenced clang compilation warning. 2017-02-28 17:03:11 -08:00
Benoit Steiner
857adbbd52 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 16:42:00 -08:00
Benoit Steiner
c36bc2d445 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 14:58:45 -08:00
Benoit Steiner
4a7df114c8 Added missing EIGEN_DEVICE_FUNC 2017-02-28 14:00:15 -08:00
Benoit Steiner
de7b0fdea9 Made the TensorStorage class compile with clang 3.9 2017-02-28 13:52:22 -08:00
Benoit Steiner
765f4cc4b4 Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet. 2017-02-28 11:57:00 -08:00
Benoit Steiner
e993c94f07 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 09:56:45 -08:00
Benoit Steiner
33443ec2b0 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 09:50:10 -08:00
Benoit Steiner
f3e9c42876 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 09:46:30 -08:00
Gael Guennebaud
4e98a7b2f0 bug #1396: add some missing EIGEN_DEVICE_FUNC 2017-02-28 09:47:38 +01:00
Gael Guennebaud
478a9f53be Fix typo. 2017-02-28 09:32:45 +01:00
Benoit Steiner
889c606f8f Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops 2017-02-27 17:17:47 -08:00
Benoit Steiner
193939d6aa Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods. 2017-02-27 17:11:47 -08:00
Benoit Steiner
ed4dc9d01a Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel 2017-02-27 16:57:01 -08:00
Benoit Steiner
b1fc7c9a09 Added missing EIGEN_DEVICE_FUNC qualifiers. 2017-02-27 16:48:30 -08:00
Benoit Steiner
554116bec1 Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase 2017-02-27 16:45:31 -08:00
Benoit Steiner
34d9fce93b Avoid unnecessary float to double conversions. 2017-02-27 16:33:33 -08:00
Gael Guennebaud
76687f385c bug #1394: fix compilation of SelfAdjointEigenSolver<Matrix>(sparse*sparse); 2017-02-20 14:27:26 +01:00
Gael Guennebaud
d8b1f6cebd bug #1380: for Map<> as input of matrix exponential 2017-02-20 14:06:06 +01:00
Gael Guennebaud
6572825703 bug #1395: fix the use of compile-time vectors as inputs of JacobiSVD. 2017-02-20 13:44:37 +01:00
Gael Guennebaud
a811a04696 Silent warning. 2017-02-20 10:14:21 +01:00
Gael Guennebaud
63798df038 Fix usage of CUDACC_VER 2017-02-20 08:16:36 +01:00
Gael Guennebaud
deefa54a54 Fix tracking of temporaries in unit tests 2017-02-19 10:32:54 +01:00
Gael Guennebaud
f8a55cc062 Fix compilation. 2017-02-18 10:08:13 +01:00
Gael Guennebaud
cbbf88c4d7 Use int32_t instead of int in NEON code. Some platforms with 16-bit int support ARM NEON. 2017-02-17 14:39:02 +01:00
Gael Guennebaud
582b5e39bf bug #1393: enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2) 2017-02-17 14:10:57 +01:00
Benoit Steiner
cfa0568ef7 Size indices are signed. 2017-02-16 10:13:34 -08:00
Benoit Steiner
31a25ab226 Merged eigen/eigen into default 2017-02-14 15:36:21 -08:00
Mehdi Goli
0d153ded29 Adding TensorChippingOP for sycl backend; fixing the index value in the verification operation for cxx11_tensorChipping.cpp test 2017-02-13 17:25:12 +00:00
Gael Guennebaud
5937c4ae32 Fall back is_integral to std::is_integral in c++11 2017-02-13 17:14:26 +01:00
Gael Guennebaud
7073430946 Fix overflow and make use of long long in c++11 only. 2017-02-13 17:14:04 +01:00
Jonathan Hseu
3453b00a1e Fix vector indexing with uint64_t 2017-02-11 21:45:32 -08:00
Gael Guennebaud
e7ebe52bfb bug #1391: include IO.h before DenseBase to enable its usage in DenseBase plugins. 2017-02-13 09:46:20 +01:00
Gael Guennebaud
b3750990d5 Workaround some gcc 4.7 warnings 2017-02-11 23:24:06 +01:00