eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Mehdi Goli	622805a0c5	Modifying TensorDeviceSycl.h to always create buffers of type uint8_t and convert them to the actual type at execution on the device; adding the queue interface class to separate the lifespan of the sycl queue and the buffers created for that queue from Eigen::SyclDevice; modifying sycl tests to support the evaluation of the results for both row major and column major data layouts on all the devices that are supported by Sycl {CPU, GPU, and Host}.	2016-11-18 16:20:42 +00:00
Benoit Steiner	7c30078b9f	Merged eigen/eigen into default	2016-11-17 22:53:37 -08:00
Benoit Steiner	553f50b246	Added a way to detect errors generated by the opencl device from the host	2016-11-17 21:51:48 -08:00
Benoit Steiner	72a45d32e9	Cleanup	2016-11-17 21:29:15 -08:00
Benoit Steiner	4349fc640e	Created a test to check that the sycl runtime can successfully report errors (like division by 0). Small cleanup	2016-11-17 20:27:54 -08:00
Benoit Steiner	a6a3fd0703	Made TensorDeviceCuda.h compile on windows	2016-11-17 16:15:27 -08:00
Luke Iwanski	c5130dedbe	Specialised basic math functions for SYCL device.	2016-11-17 11:47:13 +00:00
Benoit Steiner	b5c75351e3	Merged eigen/eigen into default	2016-11-14 15:54:44 -08:00
Rasmus Munk Larsen	32df1b1046	Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the schedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics).	2016-11-14 14:18:16 -08:00
Mehdi Goli	05e8c2a1d9	Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl.	2016-11-14 18:13:53 +00:00
Mehdi Goli	f8ca893976	Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing.	2016-11-14 17:51:57 +00:00
Mehdi Goli	a5c3f15682	Adding comment to TensorDeviceSycl.h and cleaning the code.	2016-11-11 19:06:34 +00:00
Mehdi Goli	3be3963021	Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor.	2016-11-10 19:16:31 +00:00
Mehdi Goli	12387abad5	adding the missing in eigen_assert!	2016-11-10 18:58:08 +00:00
Mehdi Goli	2e704d4257	Adding Memset; optimising memcpyDeviceToHost by removing double copying.	2016-11-10 18:45:12 +00:00
Benoit Steiner	dcc14bee64	Fixed the formatting of the code	2016-11-08 14:24:46 -08:00
Luke Iwanski	912cb3d660	#if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS.	2016-11-08 22:01:14 +00:00
Luke Iwanski	1b345b0895	Fix for SYCL queue initialisation.	2016-11-08 21:56:31 +00:00
Luke Iwanski	1b95717358	Use try/catch only when exceptions are enabled.	2016-11-08 21:08:53 +00:00
Mehdi Goli	d57430dd73	Converting all sycl buffers to uninitialised device only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creation to the device based on Benoit's suggestion; removing the sycl specific condition for returning m_result in TensorReduction.h according to Benoit's suggestion.	2016-11-08 17:08:02 +00:00
Benoit Steiner	dad177be01	Added missing includes	2016-11-05 10:04:42 -07:00
Mehdi Goli	0ebe3808ca	Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;	2016-11-04 18:18:19 +00:00
Benoit Steiner	0585b2965d	Disable vectorization on device only when compiling for sycl	2016-11-02 11:44:27 -07:00
Mehdi Goli	51af6ae971	Fixed the ambiguity in calling make_tuple for sycl backend.	2016-10-31 16:35:51 +00:00
Benoit Steiner	0a9ad6fc72	Worked around Visual Studio compilation errors	2016-10-28 07:54:27 -07:00
Benoit Steiner	b0c5bfdf78	Added missing template parameters	2016-10-28 03:43:41 +00:00
Gael Guennebaud	530f20c21a	Workaround MSVC issue.	2016-10-27 21:51:37 +02:00
Benoit Steiner	0a4c4d40b4	Removed a template parameter for fixed sized tensors	2016-10-26 18:47:37 -07:00
Benoit Steiner	5f2dd503ff	Replaced tabs with spaces	2016-10-25 20:40:58 -07:00
Benoit Steiner	1644bafe29	Code cleanup	2016-10-25 20:36:14 -07:00
Benoit Steiner	cf20b30d65	Merge latest updates from trunk	2016-10-20 09:42:05 -07:00
Luke Iwanski	03b63e182c	Added SYCL include in Tensor.	2016-10-20 15:32:44 +01:00
Benoit Steiner	d3943cd50c	Fixed a few typos in the ternary tensor expressions types	2016-10-19 12:56:12 -07:00
Mehdi Goli	e36cb91c99	Fixing the code indentation in the TensorReduction.h file.	2016-10-14 18:03:00 +01:00
Luke Iwanski	e742da8b28	Merged ComputeCpp into default.	2016-10-14 13:36:51 +01:00
Mehdi Goli	524fa4c46f	Reducing the code by generalising sycl backend functions/structs.	2016-10-14 12:09:55 +01:00
Benoit Steiner	7e4a6754b2	Merged eigen/eigen into default	2016-10-12 22:42:33 -07:00
Benoit Steiner	7f0599b6eb	Manually define int16_t and uint16_t when compiling with Visual Studio	2016-10-08 22:56:32 -07:00
Benoit Steiner	5c68051cd7	Merge the content of the ComputeCpp branch into the default branch	2016-10-07 11:04:16 -07:00
RJ Ryan	e2e9cdd169	Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.	2016-10-06 10:49:48 -07:00
Benoit Steiner	ae1385c7e4	Pull the latest updates from trunk	2016-10-05 14:54:36 -07:00
Benoit Steiner	c84084c0c0	Fixed compilation warning	2016-10-05 14:15:41 -07:00
Benoit Steiner	8b69d5d730	::rand() returns a signed integer on win32	2016-10-05 08:55:02 -07:00
Benoit Steiner	ed7a220b04	Fixed a typo that impacts windows builds	2016-10-05 08:51:31 -07:00
Benoit Steiner	ceee1c008b	Silenced compilation warning	2016-10-04 18:47:53 -07:00
Benoit Steiner	6af5ac7e27	Cleanup the cuda executor code.	2016-10-04 08:52:13 -07:00
Benoit Steiner	2f6d1607c8	Cleaned up the random number generation code.	2016-10-04 08:38:23 -07:00
Benoit Steiner	2bda1b0d93	Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus.	2016-09-28 17:08:41 -07:00
Mehdi Goli	dd602e62c8	Converting alias template to nested struct in order to be compatible with CXX-03	2016-09-27 16:21:19 +01:00
Benoit Steiner	6565f8d60f	Made the initialization of a CUDA device thread safe.	2016-09-26 11:00:32 -07:00


808 Commits