eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	7f31bb6822	Merged in ilya-biryukov/eigen/fix_clang_cuda_compilation (pull request PR-304) Fixed compilation with cuda-clang	2017-03-15 16:48:52 +00:00
Gael Guennebaud	89fd0c3881	better check array index before using it	2017-03-15 15:18:03 +01:00
Benoit Jacob	61160a21d2	ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32.	2017-03-15 06:57:25 -04:00
Benoit Steiner	f0f3591118	Made the reduction code compile with cuda-clang	2017-03-14 14:16:53 -07:00
Mehdi Goli	f499fe9496	Adding synchronisation to convolution kernel for sycl backend.	2017-03-13 09:18:37 +00:00
Rasmus Munk Larsen	bfd7bf9c5b	Get rid of Init().	2017-03-10 08:48:20 -08:00
Rasmus Munk Larsen	d56ab01094	Use C++11 ctor forwarding to simplify code a bit.	2017-03-10 08:30:22 -08:00
Rasmus Munk Larsen	344c2694a6	Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases. * Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O. * This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time. * Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single thread pools tend to be used for. * Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().	2017-03-09 15:41:03 -08:00
Luke Iwanski	1b32a10053	Use name to distinguish name instead of the vendor	2017-03-08 18:26:34 +00:00
Mehdi Goli	aadb7405a7	Fixing typo in sycl Benchmark.	2017-03-08 18:20:06 +00:00
Gael Guennebaud	970ff78294	bug #1401 : fix compilation of "cond ? x : -x" with x an AutoDiffScalar	2017-03-08 16:16:53 +01:00
Mehdi Goli	5e9a1e7a7a	Adding sycl Benchmarks.	2017-03-08 14:17:48 +00:00
Mehdi Goli	e2e3f78533	Fixing potential race condition on sycl device.	2017-03-07 17:48:15 +00:00
Mehdi Goli	f84963ed95	Adding TensorIndexTuple and TensorTupleReduceOP backend (ArgMax/Min) for sycl; fixing the address space issue for const TensorMap; converting all discard_write to write due to data mismatch.	2017-03-07 14:27:10 +00:00
Gael Guennebaud	e5156e4d25	fix typo	2017-03-07 11:25:58 +01:00
Gael Guennebaud	5694315fbb	remove UTF8 symbol	2017-03-07 10:53:47 +01:00
Gael Guennebaud	e958c2baac	remove UTF8 symbols	2017-03-07 10:47:40 +01:00
Gael Guennebaud	d967718525	do not include std header within extern C	2017-03-07 10:16:39 +01:00
Gael Guennebaud	659087b622	bug #1400 : fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY	2017-03-07 10:02:34 +01:00
Ilya Biryukov	1c03d43a5c	Fixed compilation with cuda-clang	2017-03-06 12:01:12 +01:00
Julian Kent	bbe717fa2f	Make scaling work with non-square matrices	2017-03-03 12:58:51 +01:00
Benoit Steiner	a71943b9a4	Made the Tensor code compile with clang 3.9	2017-03-02 10:47:29 -08:00
Benoit Steiner	09ae0e6586	Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that: * they're used consistently between the declaration and the definition of a function * we avoid calling host only methods from host device methods.	2017-03-01 11:47:47 -08:00
Benoit Steiner	1e2d046651	Silenced a couple of compilation warnings	2017-03-01 10:13:42 -08:00
Benoit Steiner	c1d87ec110	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-03-01 10:08:50 -08:00
Benoit Steiner	3a3f040baa	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-02-28 17:06:15 -08:00
Benoit Steiner	7b61944669	Made most of the packet math primitives usable within CUDA kernel when compiling with clang	2017-02-28 17:05:28 -08:00
Benoit Steiner	c92406d613	Silenced clang compilation warning.	2017-02-28 17:03:11 -08:00
Benoit Steiner	857adbbd52	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-02-28 16:42:00 -08:00
Benoit Steiner	c36bc2d445	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-02-28 14:58:45 -08:00
Benoit Steiner	4a7df114c8	Added missing EIGEN_DEVICE_FUNC	2017-02-28 14:00:15 -08:00
Benoit Steiner	de7b0fdea9	Made the TensorStorage class compile with clang 3.9	2017-02-28 13:52:22 -08:00
Benoit Steiner	765f4cc4b4	Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet.	2017-02-28 11:57:00 -08:00
Benoit Steiner	e993c94f07	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-02-28 09:56:45 -08:00
Benoit Steiner	33443ec2b0	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-02-28 09:50:10 -08:00
Benoit Steiner	f3e9c42876	Added missing EIGEN_DEVICE_FUNC qualifiers	2017-02-28 09:46:30 -08:00
Mehdi Goli	8296b87d7b	Adding sycl backend for TensorCustomOp; fixing the partial lhs modification issue on sycl when the rhs is TensorContraction, reduction or convolution; Fixing the partial modification for memset when sycl backend is used.	2017-02-28 17:16:14 +00:00
Gael Guennebaud	4e98a7b2f0	bug #1396 : add some missing EIGEN_DEVICE_FUNC	2017-02-28 09:47:38 +01:00
Gael Guennebaud	478a9f53be	Fix typo.	2017-02-28 09:32:45 +01:00
Benoit Steiner	889c606f8f	Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops	2017-02-27 17:17:47 -08:00
Benoit Steiner	193939d6aa	Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods.	2017-02-27 17:11:47 -08:00
Benoit Steiner	ed4dc9d01a	Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel	2017-02-27 16:57:01 -08:00
Benoit Steiner	b1fc7c9a09	Added missing EIGEN_DEVICE_FUNC qualifiers.	2017-02-27 16:48:30 -08:00
Benoit Steiner	554116bec1	Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase	2017-02-27 16:45:31 -08:00
Benoit Steiner	34d9fce93b	Avoid unnecessary float to double conversions.	2017-02-27 16:33:33 -08:00
Benoit Steiner	e0bd6f5738	Merged eigen/eigen into default	2017-02-26 10:02:14 -08:00
Mehdi Goli	2fa2b617a9	Adding TensorVolumePatchOP.h for sycl	2017-02-24 19:16:24 +00:00
Mehdi Goli	0b7875f137	Converting fixed float type into template type for TensorContraction.	2017-02-24 18:13:30 +00:00
Mehdi Goli	89dfd51fae	Adding Sycl Backend for TensorGenerator.h.	2017-02-22 16:36:24 +00:00
Mehdi Goli	4f07ac16b0	Reducing the number of warnings.	2017-02-21 10:09:47 +00:00

9591 Commits