eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	b5c75351e3	Merged eigen/eigen into default	2016-11-14 15:54:44 -08:00
Rasmus Munk Larsen	32df1b1046	Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the schedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics).	2016-11-14 14:18:16 -08:00
Mehdi Goli	05e8c2a1d9	Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl.	2016-11-14 18:13:53 +00:00
Mehdi Goli	f8ca893976	Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing.	2016-11-14 17:51:57 +00:00
Mehdi Goli	a5c3f15682	Adding comment to TensorDeviceSycl.h and cleaning the code.	2016-11-11 19:06:34 +00:00
Mehdi Goli	3be3963021	Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor.	2016-11-10 19:16:31 +00:00
Mehdi Goli	12387abad5	adding the missing in eigen_assert!	2016-11-10 18:58:08 +00:00
Mehdi Goli	2e704d4257	Adding Memset; optimising memcpyDeviceToHost by removing double copying;	2016-11-10 18:45:12 +00:00
Benoit Steiner	75c080b176	Added a test to validate memory transfers between host and sycl device	2016-11-09 06:23:42 -08:00
Benoit Steiner	db3903498d	Merged in benoitsteiner/opencl (pull request PR-246) Improved support for OpenCL	2016-11-08 22:28:44 +00:00
Benoit Steiner	dcc14bee64	Fixed the formatting of the code	2016-11-08 14:24:46 -08:00
Luke Iwanski	912cb3d660	#if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS.	2016-11-08 22:01:14 +00:00
Luke Iwanski	1b345b0895	Fix for SYCL queue initialisation.	2016-11-08 21:56:31 +00:00
Luke Iwanski	1b95717358	Use try/catch only when exceptions are enabled.	2016-11-08 21:08:53 +00:00
Mehdi Goli	d57430dd73	Converting all sycl buffers to uninitialised device only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creation to the device based on Benoit's suggestion; removing the sycl specific condition for returning m_result in TensorReduction.h according to Benoit's suggestion.	2016-11-08 17:08:02 +00:00
Benoit Steiner	ad086b03e4	Removed unnecessary statement	2016-11-05 12:43:27 -07:00
Benoit Steiner	dad177be01	Added missing includes	2016-11-05 10:04:42 -07:00
Gael Guennebaud	55b4fd1d40	Extend mpreal unit test to check LLT with complexes.	2016-11-05 11:28:53 +01:00
Benoit Steiner	d46a36cc84	Merged eigen/eigen into default	2016-11-04 18:22:55 -07:00
Mehdi Goli	0ebe3808ca	Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;	2016-11-04 18:18:19 +00:00
Benoit Steiner	3e37166d0b	Merged in benoitsteiner/opencl (pull request PR-244) Disable vectorization on device only when compiling for sycl	2016-11-02 22:01:03 +00:00
Benoit Steiner	0585b2965d	Disable vectorization on device only when compiling for sycl	2016-11-02 11:44:27 -07:00
Benoit Steiner	e6e77ed08b	Don't call lgamma_r when compiling for an Apple device, since the function isn't available on MacOS	2016-11-02 09:55:39 -07:00
Benoit Steiner	b238f387b4	Pulled latest updates from trunk	2016-11-02 08:53:13 -07:00
Benoit Steiner	c8db17301e	Special functions require math.h: make sure it is included.	2016-11-02 08:51:52 -07:00
Benoit Steiner	e44519744e	Merged in benoitsteiner/opencl (pull request PR-243) Fixed the ambiguity in calling make_tuple for sycl backend.	2016-11-02 02:56:58 +00:00
Rasmus Munk Larsen	0a6ae41555	Merged eigen/eigen into default	2016-11-01 15:37:00 -07:00
Rasmus Munk Larsen	b730952414	Don't attempt to use lgamma_r for CUDA devices. Fix type in lgamma_impl<double>.	2016-11-01 15:34:19 -07:00
Mehdi Goli	51af6ae971	Fixed the ambiguity in calling make_tuple for sycl backend.	2016-10-31 16:35:51 +00:00
Benoit Steiner	0a9ad6fc72	Worked around Visual Studio compilation errors	2016-10-28 07:54:27 -07:00
Benoit Steiner	d5f88e2357	Sharded the tensor_image_patch test to help it run on low power devices	2016-10-27 21:48:21 -07:00
Benoit Steiner	0b4b0f11e8	Fixed a few more compilation warnings	2016-10-28 04:01:01 +00:00
Benoit Steiner	306daa24a3	Fixed a compilation warning	2016-10-28 03:50:31 +00:00
Benoit Steiner	8471cf1996	Fixed compilation warning	2016-10-28 03:46:08 +00:00
Benoit Steiner	b0c5bfdf78	Added missing template parameters	2016-10-28 03:43:41 +00:00
Rasmus Munk Larsen	2ebb314fa7	Use threadsafe versions of lgamma and lgammaf if possible.	2016-10-27 16:17:12 -07:00
Gael Guennebaud	530f20c21a	Workaround MSVC issue.	2016-10-27 21:51:37 +02:00
Benoit Steiner	0a4c4d40b4	Removed a template parameter for fixed sized tensors	2016-10-26 18:47:37 -07:00
Benoit Steiner	5f2dd503ff	Replaced tabs with spaces	2016-10-25 20:40:58 -07:00
Benoit Steiner	1644bafe29	Code cleanup	2016-10-25 20:36:14 -07:00
Benoit Steiner	cf20b30d65	Merge latest updates from trunk	2016-10-20 09:42:05 -07:00
Luke Iwanski	03b63e182c	Added SYCL include in Tensor.	2016-10-20 15:32:44 +01:00
Benoit Steiner	d3943cd50c	Fixed a few typos in the ternary tensor expressions types	2016-10-19 12:56:12 -07:00
Tal Hadad	15eca2432a	Euler tests: Tighter precision when no roll exists and clean code.	2016-10-18 23:24:57 +03:00
Tal Hadad	6f4f12d1ed	Add isApprox() and cast() functions. test cases included	2016-10-17 22:23:47 +03:00
Tal Hadad	7402cfd4cc	Add safety for near pole cases and test them better.	2016-10-17 20:42:08 +03:00
Tal Hadad	58f5d7d058	Fix calc bug, docs and better testing. Test code changes: * better coded * rand and manual numbers * singularity checking	2016-10-16 14:39:26 +03:00
Mehdi Goli	e36cb91c99	Fixing the code indentation in the TensorReduction.h file.	2016-10-14 18:03:00 +01:00
Tal Hadad	078a202621	Merge Hongkai Dai correct range calculation, and remove ranges from API. Docs updated.	2016-10-14 16:03:28 +03:00
Luke Iwanski	e742da8b28	Merged ComputeCpp into default.	2016-10-14 13:36:51 +01:00
Mehdi Goli	524fa4c46f	Reducing the code by generalising sycl backend functions/structs.	2016-10-14 12:09:55 +01:00
Hongkai Dai	014d9f1d9b	implement euler angles with the right ranges	2016-10-13 14:45:51 -07:00
Benoit Steiner	d0ee2267d6	Relaxed the resizing checks so that they don't fail with gcc >= 5.3	2016-10-13 10:59:46 -07:00
Benoit Steiner	7e4a6754b2	Merged eigen/eigen into default	2016-10-12 22:42:33 -07:00
Gael Guennebaud	091d373ee9	Fix outer-stride.	2016-10-12 21:47:52 +02:00
Benoit Steiner	7f0599b6eb	Manually define int16_t and uint16_t when compiling with Visual Studio	2016-10-08 22:56:32 -07:00
Benoit Steiner	5266ff8966	Cleaned up a regression test	2016-10-08 19:12:44 +00:00
Benoit Steiner	5c68051cd7	Merge the content of the ComputeCpp branch into the default branch	2016-10-07 11:04:16 -07:00
RJ Ryan	bfc264abe8	Add a test that GPU complex product reductions match CPU reductions.	2016-10-06 11:10:14 -07:00
RJ Ryan	e2e9cdd169	Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.	2016-10-06 10:49:48 -07:00
Benoit Steiner	d7f9679a34	Fixed a couple of compilation warnings	2016-10-05 15:00:32 -07:00
Benoit Steiner	ae1385c7e4	Pull the latest updates from trunk	2016-10-05 14:54:36 -07:00
Benoit Steiner	73b0012945	Fixed compilation warnings	2016-10-05 14:24:24 -07:00
Benoit Steiner	c84084c0c0	Fixed compilation warning	2016-10-05 14:15:41 -07:00
Benoit Steiner	4387433acf	Increased the robustness of the reduction tests on fp16	2016-10-05 10:42:41 -07:00
Benoit Steiner	aad20d700d	Increase the tolerance to numerical noise.	2016-10-05 10:39:24 -07:00
Benoit Steiner	8b69d5d730	::rand() returns a signed integer on win32	2016-10-05 08:55:02 -07:00
Benoit Steiner	ed7a220b04	Fixed a typo that impacts windows builds	2016-10-05 08:51:31 -07:00
Benoit Steiner	ceee1c008b	Silenced compilation warning	2016-10-04 18:47:53 -07:00
Benoit Steiner	6af5ac7e27	Cleanup the cuda executor code.	2016-10-04 08:52:13 -07:00
Benoit Steiner	2f6d1607c8	Cleaned up the random number generation code.	2016-10-04 08:38:23 -07:00
Benoit Steiner	616a7a1912	Improved support for compiling CUDA code with clang as the host compiler	2016-10-03 17:09:33 -07:00
Benoit Steiner	422530946f	Renamed the SYCL tests to follow the standard naming convention.	2016-09-30 08:22:10 -07:00
Benoit Steiner	2bda1b0d93	Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus.	2016-09-28 17:08:41 -07:00
Mehdi Goli	dd602e62c8	Converting alias template to nested struct in order to be compatible with CXX-03	2016-09-27 16:21:19 +01:00
Benoit Steiner	6565f8d60f	Made the initialization of a CUDA device thread safe.	2016-09-26 11:00:32 -07:00
Benoit Steiner	f6ac51a054	Made TensorEvalTo compatible with c++0x again.	2016-09-23 16:45:17 -07:00
Benoit Steiner	00d4e65f00	Deleted unused TensorMap data member	2016-09-23 16:44:45 -07:00
Benoit Steiner	1301d744f8	Made the gaussian generator usable on GPU	2016-09-22 19:04:44 -07:00
RJ Ryan	608b1acd6d	Don't use c++11 features and fix include.	2016-09-20 07:49:05 -07:00
RJ Ryan	b2c6dc48d9	Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op.	2016-09-20 07:18:20 -07:00
Gael Guennebaud	3ada6e4bed	Merged hongkai-dai/eigen/tip into default (bug #1298 )	2016-09-19 22:08:06 +02:00
Benoit Steiner	c3ca9b1e76	Deleted some unnecessary and confusing EIGEN_DEVICE_FUNC	2016-09-19 11:33:39 -07:00
Hongkai Dai	5dcc6d301a	remove ternary operator in euler angles	2016-09-19 10:30:30 -07:00
Luke Iwanski	c771df6bc3	Updated the owners of the file.	2016-09-19 14:09:25 +01:00
Luke Iwanski	b91e021172	Merged with default.	2016-09-19 14:03:54 +01:00
Luke Iwanski	cb81975714	Partial OpenCL support via SYCL compatible with ComputeCpp CE.	2016-09-19 12:44:13 +01:00
Emil Fresk	6edd2e2851	Made AutoDiffJacobian more intuitive to use and updated for C++11 Changes: * Removed unnecessary types from the Functor by inferring from its types * Removed inputs() function reference, replaced with .rows() * Updated the forward constructor to use variadic templates * Added optional parameters to the Functor for passing parameters, control signals, etc * Has been tested with fixed size and dynamic matrices Amendment by chtz: overload operator() for compatibility with not fully conforming compilers	2016-09-16 14:03:55 +02:00
Gael Guennebaud	18f6e47815	Fix order of "static inline".	2016-09-16 11:32:54 +02:00
Benoit Steiner	488ad7dd1b	Added missing EIGEN_DEVICE_FUNC qualifiers	2016-09-14 13:35:00 -07:00
Benoit Steiner	e4d4d15588	Register the cxx11_tensor_device only for recent cuda architectures (i.e. >= 3.0) since the test instantiate contractions that require a modern gpu.	2016-09-12 19:01:52 -07:00
Benoit Steiner	4dfd888c92	CUDA contractions require arch >= 3.0: don't compile the cuda contraction tests on older architectures.	2016-09-12 18:49:01 -07:00
Benoit Steiner	028e299577	Fixed a bug impacting some outer reductions on GPU	2016-09-12 18:36:52 -07:00
Benoit Steiner	5f50f12d2c	Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem.	2016-09-12 13:46:13 -07:00
Benoit Steiner	8321dcce76	Merged latest updates from trunk	2016-09-12 10:33:05 -07:00
Benoit Steiner	eb6ba00cc8	Properly size the list of waiters	2016-09-12 10:31:55 -07:00
Benoit Steiner	a618094b62	Added a resize method to MaxSizeVector	2016-09-12 10:30:53 -07:00
Gael Guennebaud	471eac5399	bug #1195 : move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)	2016-09-08 08:36:27 +02:00
Gael Guennebaud	e1642f485c	bug #1288 : fix memory leak in arpack wrapper.	2016-09-05 18:01:30 +02:00
Gael Guennebaud	dabc81751f	Fix compilation when cuda_fp16.h does not exist.	2016-09-05 17:14:20 +02:00

2151 Commits