Commit Graph

1648 Commits

Author SHA1 Message Date
Luke Iwanski
c5130dedbe Specialised basic math functions for SYCL device. 2016-11-17 11:47:13 +00:00
Benoit Steiner
b5c75351e3 Merged eigen/eigen into default 2016-11-14 15:54:44 -08:00
Rasmus Munk Larsen
32df1b1046 Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the scedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics). 2016-11-14 14:18:16 -08:00
Mehdi Goli
05e8c2a1d9 Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl. 2016-11-14 18:13:53 +00:00
Mehdi Goli
f8ca893976 Adding TensorFixsize; adding sycl device memcpy; adding insial stage of slicing. 2016-11-14 17:51:57 +00:00
Mehdi Goli
a5c3f15682 Adding comment to TensorDeviceSycl.h and cleaning the code. 2016-11-11 19:06:34 +00:00
Mehdi Goli
3be3963021 Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor. 2016-11-10 19:16:31 +00:00
Mehdi Goli
12387abad5 adding the missing in eigen_assert! 2016-11-10 18:58:08 +00:00
Mehdi Goli
2e704d4257 Adding Memset; optimising MecopyDeviceToHost by removing double copying; 2016-11-10 18:45:12 +00:00
Benoit Steiner
dcc14bee64 Fixed the formatting of the code 2016-11-08 14:24:46 -08:00
Luke Iwanski
912cb3d660 #if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS. 2016-11-08 22:01:14 +00:00
Luke Iwanski
1b345b0895 Fix for SYCL queue initialisation. 2016-11-08 21:56:31 +00:00
Luke Iwanski
1b95717358 Use try/catch only when exceptions are enabled. 2016-11-08 21:08:53 +00:00
Mehdi Goli
d57430dd73 Converting all sycl buffers to uninitialised device only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creating to the device based on Benoit suggestion; removing the sycl specefic condition for returning m_result in TensorReduction.h according to Benoit suggestion. 2016-11-08 17:08:02 +00:00
Benoit Steiner
dad177be01 Added missing includes 2016-11-05 10:04:42 -07:00
Benoit Steiner
d46a36cc84 Merged eigen/eigen into default 2016-11-04 18:22:55 -07:00
Mehdi Goli
0ebe3808ca Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size; 2016-11-04 18:18:19 +00:00
Benoit Steiner
3e37166d0b Merged in benoitsteiner/opencl (pull request PR-244)
Disable vectorization on device only when compiling for sycl
2016-11-02 22:01:03 +00:00
Benoit Steiner
0585b2965d Disable vectorization on device only when compiling for sycl 2016-11-02 11:44:27 -07:00
Benoit Steiner
e6e77ed08b Don't call lgamma_r when compiling for an Apple device, since the function isn't available on MacOS 2016-11-02 09:55:39 -07:00
Benoit Steiner
b238f387b4 Pulled latest updates from trunk 2016-11-02 08:53:13 -07:00
Benoit Steiner
c8db17301e Special functions require math.h: make sure it is included. 2016-11-02 08:51:52 -07:00
Benoit Steiner
e44519744e Merged in benoitsteiner/opencl (pull request PR-243)
Fixed the ambiguity in callig make_tuple for sycl backend.
2016-11-02 02:56:58 +00:00
Rasmus Munk Larsen
0a6ae41555 Merged eigen/eigen into default 2016-11-01 15:37:00 -07:00
Rasmus Munk Larsen
b730952414 Don't attempts to use lgamma_r for CUDA devices.
Fix type in lgamma_impl<double>.
2016-11-01 15:34:19 -07:00
Mehdi Goli
51af6ae971 Fixed the ambiguity in callig make_tuple for sycl backend. 2016-10-31 16:35:51 +00:00
Benoit Steiner
0a9ad6fc72 Worked around Visual Studio compilation errors 2016-10-28 07:54:27 -07:00
Benoit Steiner
b0c5bfdf78 Added missing template parameters 2016-10-28 03:43:41 +00:00
Rasmus Munk Larsen
2ebb314fa7 Use threadsafe versions of lgamma and lgammaf if possible. 2016-10-27 16:17:12 -07:00
Gael Guennebaud
530f20c21a Workaround MSVC issue. 2016-10-27 21:51:37 +02:00
Benoit Steiner
0a4c4d40b4 Removed a template parameter for fixed sized tensors 2016-10-26 18:47:37 -07:00
Benoit Steiner
5f2dd503ff Replaced tabs with spaces 2016-10-25 20:40:58 -07:00
Benoit Steiner
1644bafe29 Code cleanup 2016-10-25 20:36:14 -07:00
Benoit Steiner
cf20b30d65 Merge latest updates from trunk 2016-10-20 09:42:05 -07:00
Luke Iwanski
03b63e182c Added SYCL include in Tensor. 2016-10-20 15:32:44 +01:00
Benoit Steiner
d3943cd50c Fixed a few typos in the ternary tensor expressions types 2016-10-19 12:56:12 -07:00
Mehdi Goli
e36cb91c99 Fixing the code indentation in the TensorReduction.h file. 2016-10-14 18:03:00 +01:00
Luke Iwanski
e742da8b28 Merged ComputeCpp into default. 2016-10-14 13:36:51 +01:00
Mehdi Goli
524fa4c46f Reducing the code by generalising sycl backend functions/structs. 2016-10-14 12:09:55 +01:00
Benoit Steiner
7e4a6754b2 Merged eigen/eigen into default 2016-10-12 22:42:33 -07:00
Gael Guennebaud
091d373ee9 Fix outer-stride. 2016-10-12 21:47:52 +02:00
Benoit Steiner
7f0599b6eb Manually define int16_t and uint16_t when compiling with Visual Studio 2016-10-08 22:56:32 -07:00
Benoit Steiner
5c68051cd7 Merge the content of the ComputeCpp branch into the default branch 2016-10-07 11:04:16 -07:00
RJ Ryan
e2e9cdd169 Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*. 2016-10-06 10:49:48 -07:00
Benoit Steiner
ae1385c7e4 Pull the latest updates from trunk 2016-10-05 14:54:36 -07:00
Benoit Steiner
c84084c0c0 Fixed compilation warning 2016-10-05 14:15:41 -07:00
Benoit Steiner
8b69d5d730 ::rand() returns a signed integer on win32 2016-10-05 08:55:02 -07:00
Benoit Steiner
ed7a220b04 Fixed a typo that impacts windows builds 2016-10-05 08:51:31 -07:00
Benoit Steiner
ceee1c008b Silenced compilation warning 2016-10-04 18:47:53 -07:00
Benoit Steiner
6af5ac7e27 Cleanup the cuda executor code. 2016-10-04 08:52:13 -07:00