eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	1e6c6c1576	Replace memset with fill to work for non-trivial scalars. For custom scalars, zero is not necessarily represented by a zeroed-out memory block (e.g. gnu MPFR). We therefore cannot rely on `memset` if we want to fill a matrix or tensor with zeroes. Instead, we should rely on `fill`, which for trivial types does end up getting converted to a `memset` under-the-hood (at least with gcc/clang). Requires adding a `fill(begin, end, v)` to `TensorDevice`. Replaced all potentially bad instances of memset with fill. Fixes #2245.	2021-07-08 18:34:41 +00:00
mehdi-goli	d3e81db6c5	Eigen moved the `scanLauncher` function inside the internal namespace. This commit applies the following changes: - Moving the `scanLauncher` specialization inside the internal namespace to fix a compiler crash on TensorScan for the SYCL backend. - Replacing `SYCL/sycl.hpp` with `CL/sycl.hpp` in order to follow the SYCL 1.2.1 standard. - Minor fixes: commenting out an unused variable to avoid compiler warnings.	2020-05-11 16:10:33 +01:00
Mehdi Goli	00f32752f7	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. * Unifying all loadLocalTile from lhs and rhs to an extract_block function. * Adding get_tensor operation which was missing in TensorContractionMapper. * Adding the -D method missing from cmake for Disable_Skinny Contraction operation. * Wrapping all the indices in TensorScanSycl into Scan parameter struct. * Fixing typo in Device SYCL * Unifying load to private register for tall/skinny no shared * Unifying load to vector tile for tensor-vector/vector-tensor operation * Removing all the LHS/RHS class for extracting data from global * Removing Outputfunction from TensorContractionSkinnyNoshared. * Combining the local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining General Tensor-Vector and VectorTensor contraction into one kernel. * Making double buffering optional for Tensor contraction when the local memory version is used. * Modifying benchmark to accept custom Reduction Sizes * Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host * Adding Test for SYCL * Modifying SYCL CMake	2019-11-28 10:08:54 +00:00
Mehdi Goli	f499fe9496	Adding synchronisation to convolution kernel for sycl backend.	2017-03-13 09:18:37 +00:00
Mehdi Goli	aadb7405a7	Fixing typo in sycl Benchmark.	2017-03-08 18:20:06 +00:00
Mehdi Goli	5e9a1e7a7a	Adding sycl Benchmarks.	2017-03-08 14:17:48 +00:00
Benoit Steiner	3eda02d78d	Fixed the sycl benchmarking code	2016-12-22 10:37:05 -08:00
Luke Iwanski	cb81975714	Partial OpenCL support via SYCL compatible with ComputeCpp CE.	2016-09-19 12:44:13 +01:00
Benoit Steiner	457204cb83	Updated the README file for the tensor benchmarks	2016-05-25 16:13:41 -07:00
Benoit Steiner	034aa3b2c0	Improved the performance of tensor padding	2016-05-25 11:43:08 -07:00
Benoit Steiner	069a0b04d7	Added benchmarks for contraction on CPU.	2016-05-13 14:32:17 -07:00
Benoit Steiner	f81e413180	Added a benchmark to measure the performance of full reductions of 16 bit floats	2016-05-05 14:15:11 -07:00
Benoit Steiner	79b900375f	Use index list for the striding benchmarks	2016-04-21 11:58:27 -07:00
Benoit Steiner	eaeb6ca93a	Enable the benchmarks for algebraic and transcendental functions on fp16.	2016-04-12 16:29:00 -07:00
Benoit Steiner	53121c0119	Turned on the contraction benchmarks for fp16	2016-04-12 14:11:52 -07:00
Benoit Steiner	63102ee43d	Turn on the coeffWise benchmarks on fp16	2016-04-07 23:05:20 -07:00
Benoit Steiner	7c47d3e663	Fixed the type casting benchmarks for fp16	2016-04-07 22:50:25 -07:00
Benoit Steiner	a6d08be9b2	Fixed the benchmarking of fp16 coefficient wise operations	2016-04-07 17:13:44 -07:00
Benoit Steiner	0968e925a0	Updated the benchmarking code to use Eigen::half instead of half	2016-03-24 18:00:33 -07:00
Benoit Steiner	7168afde5e	Made the tensor benchmarks compile on MacOS	2016-03-23 14:21:04 -07:00
Benoit Steiner	56a3ada670	Added benchmarks for full reduction	2016-02-29 14:57:52 -08:00
Benoit Steiner	1031b31571	Improved the README	2016-02-27 20:22:04 +00:00
Benoit Steiner	93485d86bc	Added benchmarks for type casting of float16	2016-02-26 12:24:58 -08:00
Benoit Steiner	002824e32d	Added benchmarks for fp16	2016-02-26 12:21:25 -08:00
Benoit Steiner	8cb9bfab87	Extended the tensor benchmark suite to support types other than floats	2016-02-23 05:28:02 +00:00
Benoit Steiner	f442a5a5b3	Updated the tensor benchmarking code to work with compilers that don't support cxx11.	2016-02-23 04:15:48 +00:00
Benoit Steiner	4281eb1e2c	Added 2 benchmarks to the suite of tensor benchmarks running on GPU	2016-01-30 10:20:43 -08:00
Benoit Steiner	e4f83bae5d	Fixed the tensor benchmarks on apple devices	2016-01-28 21:08:07 -08:00
Benoit Steiner	10bea90c4a	Fixed clang related compilation error	2016-01-28 20:52:08 -08:00
Benoit Steiner	211d350fc3	Fixed a typo	2016-01-28 17:13:04 -08:00
Benoit Steiner	bd2e5a788a	Made sure the number of floating point operations done by a benchmark is computed using 64 bit integers to avoid overflows.	2016-01-28 17:10:40 -08:00
Benoit Steiner	120e13b1b6	Added a readme to explain how to compile the tensor benchmarks.	2016-01-28 17:06:00 -08:00
Benoit Steiner	a68864b6bc	Updated the benchmarking code to print the number of flops processed instead of the number of bytes.	2016-01-28 16:51:40 -08:00
Benoit Steiner	c8d5f21941	Added extra tensor benchmarks	2016-01-28 16:20:36 -08:00
Yangqing Jia	270c4e1ecd	bugfix	2016-01-28 11:11:45 -08:00
Yangqing Jia	c4e47630b1	benchmark modifications to make it compilable in a standalone fashion.	2016-01-28 10:35:14 -08:00
Benoit Steiner	46fc881e4a	Added a few benchmarks for the tensor code	2015-01-26 17:46:40 -08:00

37 Commits