Mehdi Goli
77cc4d06c7
Removing unused variables
2017-01-19 17:06:21 +00:00
Mehdi Goli
837fdbdcb2
Merging with Benoit's upstream.
2017-01-19 11:34:34 +00:00
Mehdi Goli
6bdd15f572
Adding non-dereferenceable pointer tracking for ComputeCpp backend; adding TensorConvolutionOp for ComputeCpp; fixing typos; modifying TensorDeviceSycl to use the LegacyPointer class.
2017-01-19 11:30:59 +00:00
Mehdi Goli
c6f7b33834
Applying Benoit's comment. Embedding synchronisation inside device memcpy so there is no need to externally call synchronise() for device memcpy.
2017-01-18 10:45:28 +00:00
Mehdi Goli
e46e722381
Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree.
2017-01-16 13:58:49 +00:00
Gael Guennebaud
bbd97b4095
Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases
2017-07-17 01:02:51 +02:00
Luke Iwanski
90c5bc8d64
Fixes auto appearance in functor template argument for reduction.
2017-01-04 22:18:44 +00:00
Mehdi Goli
8b1c2108ba
Reverting asynchronous exec to synchronous exec because of a random race condition.
2016-12-22 16:45:38 +00:00
Benoit Steiner
660da83e18
Pulled latest update from trunk
2016-12-21 16:43:27 -08:00
Benoit Steiner
4236aebe10
Simplified the contraction code
2016-12-21 16:42:56 -08:00
Benoit Steiner
3cfa16f41d
Merged in benoitsteiner/opencl (pull request PR-279)
...
Fix for auto appearing in functor template argument.
2016-12-21 15:08:54 -08:00
Benoit Steiner
519d63d350
Added support for libxsmm kernel in multithreaded contractions
2016-12-21 15:06:06 -08:00
Benoit Steiner
f9eff17e91
Leverage libxsmm kernels within single threaded contractions
2016-12-21 12:32:06 -08:00
Luke Iwanski
c55ecfd820
Fix for auto appearing in functor template argument.
2016-12-21 15:42:51 +00:00
Benoit Steiner
0f577d4744
Merged eigen/eigen into default
2016-12-20 17:02:06 -08:00
Luke Iwanski
29186f766f
Fixed order of initialisation in ExecExprFunctorKernel functor.
2016-12-20 21:32:42 +00:00
Gael Guennebaud
e8d6862f14
Properly adjust precision when saving to Market format.
2016-12-20 22:10:33 +01:00
Gael Guennebaud
e2f4ee1c2b
Speed up parsing of sparse Market file.
2016-12-20 21:56:21 +01:00
Luke Iwanski
8245851d1b
Matching parameters order between lambda and the functor.
2016-12-20 16:18:15 +00:00
Benoit Steiner
70d0172f0c
Merged eigen/eigen into default
2016-12-16 17:37:04 -08:00
Benoit Steiner
8910442e19
Fixed memcpy, memcpyHostToDevice and memcpyDeviceToHost for Sycl.
2016-12-16 15:45:04 -08:00
Luke Iwanski
54db66c5df
struct -> class in order to silence compilation warning.
2016-12-16 20:25:20 +00:00
Mehdi Goli
35bae513a0
Converting all parallel for lambdas to functors in order to prevent duplicate kernel name errors; adding tensorConcatinationOp backend for sycl.
2016-12-16 19:46:45 +00:00
Mehdi Goli
c5e8546306
Adding an async handler to the sycl queue, as the lack of one can cause undefined behaviour.
2016-12-15 16:59:57 +00:00
Benoit Steiner
2c2e218471
Avoid using #define since they can conflict with user code
2016-12-14 19:49:15 -08:00
Benoit Steiner
3beb180ee5
Don't call EnvThread::OnCancel by default since it doesn't do anything.
2016-12-14 18:33:39 -08:00
Benoit Steiner
9ff5d0f821
Merged eigen/eigen into default
2016-12-14 17:32:16 -08:00
Mehdi Goli
730eb9fe1c
Adding asynchronous execution as it improves the performance.
2016-12-14 17:38:53 +00:00
Mehdi Goli
2d4a091beb
Adding tensor contraction operation backend for Sycl; adding test for contractionOp sycl backend; adding temporary solution to prevent memory leak in buffer; cleaning up cxx11_tensor_buildins_sycl.h
2016-12-14 15:30:37 +00:00
Benoit Steiner
a432fc102d
Moved the choice of ThreadPool to unsupported/Eigen/CXX11/ThreadPool
2016-12-12 15:24:16 -08:00
Benoit Steiner
8ae68924ed
Made ThreadPoolInterface::Cancel() an optional functionality
2016-12-12 11:58:38 -08:00
Benoit Steiner
76fca22134
Use a more accurate timer to sleep on Linux systems.
2016-12-09 15:12:24 -08:00
Benoit Steiner
4deafd35b7
Introduce a portable EIGEN_SLEEP macro.
2016-12-09 14:52:15 -08:00
Benoit Steiner
aafa97f4d2
Fixed build error with MSVC
2016-12-09 14:42:32 -08:00
Benoit Steiner
2f5b7a199b
Reworked the threadpool cancellation mechanism to not depend on pthread_cancel since it turns out that pthread_cancel doesn't work properly on numerous platforms.
2016-12-09 13:05:14 -08:00
Benoit Steiner
28ee8f42b2
Added a Flush method to the RunQueue
2016-12-08 14:07:56 -08:00
Benoit Steiner
69ef267a77
Added the new threadpool cancel method to the threadpool interface base class.
2016-12-08 14:03:25 -08:00
Benoit Steiner
7bfff85355
Added support for thread cancellation on Linux
2016-12-08 08:12:49 -08:00
Benoit Steiner
462c28e77a
Merged in srvasude/eigen (pull request PR-265)
...
Add Expm1 support to Eigen.
2016-12-05 02:31:11 +00:00
Gael Guennebaud
4465d20403
Add missing generic load methods.
2016-12-03 21:25:04 +01:00
Srinivas Vasudevan
218764ee1f
Added support for expm1 in Eigen.
2016-12-02 14:13:01 -08:00
Mehdi Goli
592acc5bfa
Making default numeric_list work with sycl.
2016-12-02 17:58:30 +00:00
Mehdi Goli
79aa2b784e
Adding sycl backend for TensorPadding.h; disabling __uint128 for sycl in TensorIntDiv.h; disabling cache size for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code.
2016-12-01 13:02:27 +00:00
Benoit Steiner
a70393fd02
Cleaned up forward declarations
2016-11-30 21:59:07 -08:00
Benoit Steiner
e073de96dc
Moved the MemCopyFunctor back to TensorSyclDevice since it's the only caller and it makes TensorFlow compile again
2016-11-30 21:36:52 -08:00
Benoit Steiner
fca27350eb
Added the deallocate_all() method back
2016-11-30 20:45:20 -08:00
Benoit Steiner
e633a8371f
Simplified includes
2016-11-30 20:21:18 -08:00
Benoit Steiner
7cd33df4ce
Improved formatting
2016-11-30 20:20:44 -08:00
Benoit Steiner
f5107010ee
Updated the Sizes class to work on AMD gpus without requiring a separate implementation
2016-11-30 19:57:28 -08:00
Benoit Steiner
e37c2c52d3
Added an implementation of numeric_list that works with sycl
2016-11-30 19:55:15 -08:00
Luke Iwanski
26fff1c5b1
Added EIGEN_STRONG_INLINE to get_sycl_supported_device().
2016-11-30 16:55:22 +00:00
Mehdi Goli
577ce78085
Adding TensorShuffling backend for sycl; adding TensorReshaping backend for sycl; cleaning up the sycl backend.
2016-11-29 15:30:42 +00:00
Benoit Steiner
02080e2b67
Merged eigen/eigen into default
2016-11-27 07:27:30 -08:00
Benoit Steiner
9fd081cddc
Fixed compilation warnings
2016-11-26 20:22:25 -08:00
Benoit Steiner
9f8fbd9434
Merged eigen/eigen into default
2016-11-26 11:28:25 -08:00
Benoit Steiner
67b2c41f30
Avoided unnecessary type conversion
2016-11-26 11:27:29 -08:00
Benoit Steiner
7fe704596a
Added missing array_get method for numeric_list
2016-11-26 11:26:07 -08:00
Mehdi Goli
7318daf887
Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h.
2016-11-25 16:19:07 +00:00
Benoit Steiner
7ad37606dd
Fixed the documentation of Scalar Tensors
2016-11-24 12:31:43 -08:00
Gael Guennebaud
308961c05e
Fix compilation.
2016-11-23 22:17:52 +01:00
Mehdi Goli
b8cc5635d5
Removing unsupported device from test case; cleaning the tensor device sycl.
2016-11-23 16:30:41 +00:00
Gael Guennebaud
7f6333c32b
Merged in tal500/eigen-eulerangles (pull request PR-237)
...
Euler angles
2016-11-23 15:17:38 +00:00
Gael Guennebaud
56e5ec07c6
Automatically switch between EigenSolver and ComplexEigenSolver, and fix a few Real versus Scalar issues.
2016-11-23 16:05:10 +01:00
Gael Guennebaud
9246587122
Patch from Oleg Shirokobrod to extend polynomial solver to complexes
2016-11-23 15:42:26 +01:00
Benoit Steiner
f11da1d83b
Made the QueueInterface thread safe
2016-11-20 13:17:08 -08:00
Benoit Steiner
6d781e3e52
Merged eigen/eigen into default
2016-11-20 10:12:54 -08:00
Benoit Steiner
79a07b891b
Fixed a typo
2016-11-20 07:07:41 -08:00
Benoit Steiner
81151bd474
Fixed merge conflicts
2016-11-19 19:12:59 -08:00
Benoit Steiner
9265ca707e
Made it possible to check the state of a sycl device without synchronization
2016-11-19 10:56:24 -08:00
Benoit Steiner
2d1aec15a7
Added missing include
2016-11-19 08:09:54 -08:00
Benoit Steiner
1bdf1b9ce0
Merged in benoitsteiner/opencl (pull request PR-253)
...
OpenCL improvements
2016-11-19 04:44:43 +00:00
Benoit Steiner
dc601d79d1
Added the ability to run tests exclusively on OpenCL devices that are listed by sycl::device::get_devices().
2016-11-18 16:26:50 -08:00
Benoit Steiner
110b7f8d9f
Deleted unnecessary semicolons
2016-11-18 14:06:17 -08:00
Benoit Steiner
37c2c516a6
Cleaned up the sycl device code
2016-11-18 12:38:06 -08:00
Mehdi Goli
15e226d7d3
Adding Benoit's changes to TensorDeviceSycl.h
2016-11-18 16:34:54 +00:00
Mehdi Goli
622805a0c5
Modifying TensorDeviceSycl.h to always create buffers of type uint8_t and convert them to the actual type at execution on the device; adding the queue interface class to separate the lifespan of the sycl queue, and of the buffers created for that queue, from Eigen::SyclDevice; modifying sycl tests to support the evaluation of the results for both row major and column major data layouts on all the devices that are supported by Sycl (CPU, GPU, and Host).
2016-11-18 16:20:42 +00:00
Tal Hadad
76b2a3e6e7
Allow to construct EulerAngles from 3D vector directly.
...
Using assignment template struct to distinguish between 3D vector and 3D rotation matrix.
2016-11-18 15:01:06 +02:00
Benoit Steiner
7c30078b9f
Merged eigen/eigen into default
2016-11-17 22:53:37 -08:00
Benoit Steiner
553f50b246
Added a way to detect errors generated by the opencl device from the host
2016-11-17 21:51:48 -08:00
Benoit Steiner
72a45d32e9
Cleanup
2016-11-17 21:29:15 -08:00
Benoit Steiner
4349fc640e
Created a test to check that the sycl runtime can successfully report errors (like division by 0).
...
Small cleanup
2016-11-17 20:27:54 -08:00
Benoit Steiner
a6a3fd0703
Made TensorDeviceCuda.h compile on windows
2016-11-17 16:15:27 -08:00
Luke Iwanski
c5130dedbe
Specialised basic math functions for SYCL device.
2016-11-17 11:47:13 +00:00
Benoit Steiner
b5c75351e3
Merged eigen/eigen into default
2016-11-14 15:54:44 -08:00
Rasmus Munk Larsen
32df1b1046
Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the schedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics).
2016-11-14 14:18:16 -08:00
Mehdi Goli
05e8c2a1d9
Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl.
2016-11-14 18:13:53 +00:00
Mehdi Goli
f8ca893976
Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing.
2016-11-14 17:51:57 +00:00
Mehdi Goli
a5c3f15682
Adding comment to TensorDeviceSycl.h and cleaning the code.
2016-11-11 19:06:34 +00:00
Mehdi Goli
3be3963021
Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor.
2016-11-10 19:16:31 +00:00
Mehdi Goli
12387abad5
adding the missing in eigen_assert!
2016-11-10 18:58:08 +00:00
Mehdi Goli
2e704d4257
Adding Memset; optimising memcpyDeviceToHost by removing double copying.
2016-11-10 18:45:12 +00:00
Benoit Steiner
dcc14bee64
Fixed the formatting of the code
2016-11-08 14:24:46 -08:00
Luke Iwanski
912cb3d660
#if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS.
2016-11-08 22:01:14 +00:00
Luke Iwanski
1b345b0895
Fix for SYCL queue initialisation.
2016-11-08 21:56:31 +00:00
Luke Iwanski
1b95717358
Use try/catch only when exceptions are enabled.
2016-11-08 21:08:53 +00:00
Mehdi Goli
d57430dd73
Converting all sycl buffers to uninitialised device only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creation to the device based on Benoit's suggestion; removing the sycl specific condition for returning m_result in TensorReduction.h according to Benoit's suggestion.
2016-11-08 17:08:02 +00:00
Benoit Steiner
dad177be01
Added missing includes
2016-11-05 10:04:42 -07:00
Benoit Steiner
d46a36cc84
Merged eigen/eigen into default
2016-11-04 18:22:55 -07:00
Mehdi Goli
0ebe3808ca
Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;
2016-11-04 18:18:19 +00:00
Benoit Steiner
3e37166d0b
Merged in benoitsteiner/opencl (pull request PR-244)
...
Disable vectorization on device only when compiling for sycl
2016-11-02 22:01:03 +00:00
Benoit Steiner
0585b2965d
Disable vectorization on device only when compiling for sycl
2016-11-02 11:44:27 -07:00
Benoit Steiner
e6e77ed08b
Don't call lgamma_r when compiling for an Apple device, since the function isn't available on MacOS
2016-11-02 09:55:39 -07:00
Benoit Steiner
b238f387b4
Pulled latest updates from trunk
2016-11-02 08:53:13 -07:00
Benoit Steiner
c8db17301e
Special functions require math.h: make sure it is included.
2016-11-02 08:51:52 -07:00
Benoit Steiner
e44519744e
Merged in benoitsteiner/opencl (pull request PR-243)
...
Fixed the ambiguity in calling make_tuple for sycl backend.
2016-11-02 02:56:58 +00:00
Rasmus Munk Larsen
0a6ae41555
Merged eigen/eigen into default
2016-11-01 15:37:00 -07:00
Rasmus Munk Larsen
b730952414
Don't attempt to use lgamma_r for CUDA devices.
...
Fix type in lgamma_impl<double>.
2016-11-01 15:34:19 -07:00
Mehdi Goli
51af6ae971
Fixed the ambiguity in calling make_tuple for sycl backend.
2016-10-31 16:35:51 +00:00
Benoit Steiner
0a9ad6fc72
Worked around Visual Studio compilation errors
2016-10-28 07:54:27 -07:00
Benoit Steiner
b0c5bfdf78
Added missing template parameters
2016-10-28 03:43:41 +00:00
Rasmus Munk Larsen
2ebb314fa7
Use threadsafe versions of lgamma and lgammaf if possible.
2016-10-27 16:17:12 -07:00
Gael Guennebaud
530f20c21a
Workaround MSVC issue.
2016-10-27 21:51:37 +02:00
Benoit Steiner
0a4c4d40b4
Removed a template parameter for fixed sized tensors
2016-10-26 18:47:37 -07:00
Benoit Steiner
5f2dd503ff
Replaced tabs with spaces
2016-10-25 20:40:58 -07:00
Benoit Steiner
1644bafe29
Code cleanup
2016-10-25 20:36:14 -07:00
Benoit Steiner
cf20b30d65
Merge latest updates from trunk
2016-10-20 09:42:05 -07:00
Luke Iwanski
03b63e182c
Added SYCL include in Tensor.
2016-10-20 15:32:44 +01:00
Benoit Steiner
d3943cd50c
Fixed a few typos in the ternary tensor expressions types
2016-10-19 12:56:12 -07:00
Tal Hadad
6f4f12d1ed
Add isApprox() and cast() functions.
...
test cases included
2016-10-17 22:23:47 +03:00
Tal Hadad
58f5d7d058
Fix calc bug, docs and better testing.
...
Test code changes:
* better coded
* rand and manual numbers
* singularity checking
2016-10-16 14:39:26 +03:00
Mehdi Goli
e36cb91c99
Fixing the code indentation in the TensorReduction.h file.
2016-10-14 18:03:00 +01:00
Tal Hadad
078a202621
Merge Hongkai Dai correct range calculation, and remove ranges from API.
...
Docs updated.
2016-10-14 16:03:28 +03:00
Luke Iwanski
e742da8b28
Merged ComputeCpp into default.
2016-10-14 13:36:51 +01:00
Mehdi Goli
524fa4c46f
Reducing the code by generalising sycl backend functions/structs.
2016-10-14 12:09:55 +01:00
Hongkai Dai
014d9f1d9b
implement euler angles with the right ranges
2016-10-13 14:45:51 -07:00
Benoit Steiner
7e4a6754b2
Merged eigen/eigen into default
2016-10-12 22:42:33 -07:00
Gael Guennebaud
091d373ee9
Fix outer-stride.
2016-10-12 21:47:52 +02:00
Benoit Steiner
7f0599b6eb
Manually define int16_t and uint16_t when compiling with Visual Studio
2016-10-08 22:56:32 -07:00
Benoit Steiner
5c68051cd7
Merge the content of the ComputeCpp branch into the default branch
2016-10-07 11:04:16 -07:00
RJ Ryan
e2e9cdd169
Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.
2016-10-06 10:49:48 -07:00
Benoit Steiner
ae1385c7e4
Pull the latest updates from trunk
2016-10-05 14:54:36 -07:00
Benoit Steiner
c84084c0c0
Fixed compilation warning
2016-10-05 14:15:41 -07:00
Benoit Steiner
8b69d5d730
::rand() returns a signed integer on win32
2016-10-05 08:55:02 -07:00
Benoit Steiner
ed7a220b04
Fixed a typo that impacts windows builds
2016-10-05 08:51:31 -07:00
Benoit Steiner
ceee1c008b
Silenced compilation warning
2016-10-04 18:47:53 -07:00
Benoit Steiner
6af5ac7e27
Cleanup the cuda executor code.
2016-10-04 08:52:13 -07:00
Benoit Steiner
2f6d1607c8
Cleaned up the random number generation code.
2016-10-04 08:38:23 -07:00
Benoit Steiner
2bda1b0d93
Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus.
2016-09-28 17:08:41 -07:00
Mehdi Goli
dd602e62c8
Converting alias template to nested struct in order to be compatible with CXX-03
2016-09-27 16:21:19 +01:00
Benoit Steiner
6565f8d60f
Made the initialization of a CUDA device thread safe.
2016-09-26 11:00:32 -07:00
Benoit Steiner
f6ac51a054
Made TensorEvalTo compatible with c++0x again.
2016-09-23 16:45:17 -07:00
Benoit Steiner
00d4e65f00
Deleted unused TensorMap data member
2016-09-23 16:44:45 -07:00
Benoit Steiner
1301d744f8
Made the gaussian generator usable on GPU
2016-09-22 19:04:44 -07:00
Gael Guennebaud
3ada6e4bed
Merged hongkai-dai/eigen/tip into default (bug #1298 )
2016-09-19 22:08:06 +02:00
Benoit Steiner
c3ca9b1e76
Deleted some unnecessary and confusing EIGEN_DEVICE_FUNC
2016-09-19 11:33:39 -07:00
Hongkai Dai
5dcc6d301a
remove ternary operator in euler angles
2016-09-19 10:30:30 -07:00
Luke Iwanski
b91e021172
Merged with default.
2016-09-19 14:03:54 +01:00
Luke Iwanski
cb81975714
Partial OpenCL support via SYCL compatible with ComputeCpp CE.
2016-09-19 12:44:13 +01:00
Emil Fresk
6edd2e2851
Made AutoDiffJacobian more intuitive to use and updated for C++11
...
Changes:
* Removed unnecessary types from the Functor by inferring from its types
* Removed inputs() function reference, replaced with .rows()
* Updated the forward constructor to use variadic templates
* Added optional parameters to the Functor for passing parameters,
control signals, etc
* Has been tested with fixed size and dynamic matrices
Amendment by chtz: overload operator() for compatibility with not fully conforming compilers
2016-09-16 14:03:55 +02:00
Gael Guennebaud
18f6e47815
Fix order of "static inline".
2016-09-16 11:32:54 +02:00
Benoit Steiner
488ad7dd1b
Added missing EIGEN_DEVICE_FUNC qualifiers
2016-09-14 13:35:00 -07:00
Benoit Steiner
028e299577
Fixed a bug impacting some outer reductions on GPU
2016-09-12 18:36:52 -07:00
Benoit Steiner
8321dcce76
Merged latest updates from trunk
2016-09-12 10:33:05 -07:00
Benoit Steiner
eb6ba00cc8
Properly size the list of waiters
2016-09-12 10:31:55 -07:00
Benoit Steiner
a618094b62
Added a resize method to MaxSizeVector
2016-09-12 10:30:53 -07:00
Gael Guennebaud
471eac5399
bug #1195 : move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)
2016-09-08 08:36:27 +02:00
Gael Guennebaud
e1642f485c
bug #1288 : fix memory leak in arpack wrapper.
2016-09-05 18:01:30 +02:00
Benoit Steiner
13df3441ae
Use MaxSizeVector instead of std::vector: xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instructions to initialize it. This can result in random crashes when compiling with AVX instructions enabled.
2016-09-02 19:25:47 -07:00
Benoit Steiner
cadd124d73
Pulled latest update from trunk
2016-09-02 15:30:02 -07:00
Benoit Steiner
05b0518077
Made the index type an explicit template parameter to help some compilers compile the code.
2016-09-02 15:29:34 -07:00
Benoit Steiner
adf864fec0
Merged in rmlarsen/eigen (pull request PR-222)
...
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 14:11:20 -07:00
Rasmus Munk Larsen
13e93ca8b7
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 13:41:36 -07:00
Benoit Steiner
c53f783705
Updated the contraction code to support constant inputs.
2016-09-01 11:41:27 -07:00
Gael Guennebaud
46475eff9a
Adjust Tensor module wrt recent change in nullary functor
2016-09-01 13:40:45 +02:00
Rasmus Munk Larsen
a1e092d1e8
Fix bugs to make min- and max reducers work correctly with IEEE infinities.
2016-08-31 15:04:16 -07:00
Gael Guennebaud
1f84f0d33a
merge EulerAngles module
2016-08-30 10:01:53 +02:00
Gael Guennebaud
e074f720c7
Include missing forward declaration of SparseMatrix
2016-08-29 18:56:46 +02:00
Gael Guennebaud
35a8e94577
bug #1167 : simplify installation of header files using cmake's install(DIRECTORY ...) command.
2016-08-29 10:59:37 +02:00
Gael Guennebaud
965e595f02
Add missing log1p method
2016-08-26 14:55:00 +02:00
Benoit Steiner
34ae80179a
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
2016-08-15 10:29:14 -07:00
Benoit Steiner
fe73648c98
Fixed a bug in the documentation.
2016-08-12 10:00:43 -07:00
Benoit Steiner
e3a8dfb02f
std::erfcf doesn't exist: use numext::erfc instead
2016-08-11 15:24:06 -07:00
Benoit Steiner
64e68cbe87
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
2016-08-08 19:29:59 -07:00
Benoit Steiner
ca2cee2739
Merged in ibab/eigen (pull request PR-206)
...
Expose real and imag methods on Tensors
2016-08-03 11:53:04 -07:00
Benoit Steiner
a20b58845f
CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.
2016-08-03 10:00:43 -07:00
Benoit Steiner
fd220dd8b0
Use numext::conj instead of std::conj
2016-08-01 18:16:16 -07:00
Benoit Steiner
e256acec7c
Avoid unnecessary object copies
2016-08-01 17:03:39 -07:00
Benoit Steiner
2693fd54bf
bug #1266 : half implementation has been moved to half_impl namespace
2016-07-29 13:45:56 -07:00
Gael Guennebaud
cc2f6d68b1
bug #1264 : fix compilation
2016-07-27 23:30:47 +02:00
Gael Guennebaud
8972323c08
Bug 1261: add missing max(ADS,ADS) overload (same for min)
2016-07-27 14:52:48 +02:00
Gael Guennebaud
0d7039319c
bug #1260 : remove doubtful specializations of ScalarBinaryOpTraits
2016-07-27 14:35:52 +02:00
Benoit Steiner
3d3d34e442
Deleted dead code.
2016-07-25 08:53:37 -07:00
Gael Guennebaud
6d5daf32f5
bug #1255 : comment out broken and unused line.
2016-07-25 14:48:30 +02:00
Gael Guennebaud
f9598d73b5
bug #1250 : fix pow() for AutoDiffScalar with custom nested scalar type.
2016-07-25 14:42:19 +02:00
Gael Guennebaud
fd1117f2be
Implement digits10 for mpreal
2016-07-25 14:38:55 +02:00
Gael Guennebaud
9908020d36
Add minimal support for Array<string>, and fix Tensor<string>
2016-07-25 14:25:56 +02:00
Benoit Steiner
c6b0de2c21
Improved partial reductions in more cases
2016-07-22 17:18:20 -07:00
Gael Guennebaud
0f350a8b7e
Fix CUDA compilation
2016-07-21 18:47:07 +02:00
Yi Lin
7b4abc2b1d
Fixed a code comment error
2016-07-20 22:28:54 +08:00
Benoit Steiner
20f7ef2f89
An evalTo expression is aligned iff both the lhs and the rhs are aligned.
2016-07-12 10:56:42 -07:00
Benoit Steiner
3a2dd352ae
Improved the contraction mapper to properly support tensor products
2016-07-11 13:43:41 -07:00
Benoit Steiner
0bc020be9d
Improved the detection of packet size in the tensor scan evaluator.
2016-07-11 12:14:56 -07:00
Gael Guennebaud
a96a7ce3f7
Move CUDA's special functions to SpecialFunctions module.
2016-07-11 18:39:11 +02:00
Gael Guennebaud
fd60966310
merge
2016-07-11 18:11:47 +02:00
Gael Guennebaud
194daa3048
Fix assertion (it did not make sense for static_val types)
2016-07-11 11:39:27 +02:00
Gael Guennebaud
18c35747ce
Emulate _BitScanReverse64 for 32 bits builds
2016-07-11 11:38:04 +02:00
Gael Guennebaud
599f8ba617
Change runtime to compile-time conditional.
2016-07-08 11:39:43 +02:00
Gael Guennebaud
544935101a
Fix warnings
2016-07-08 11:38:52 +02:00
Gael Guennebaud
2f7e2614e7
bug #1232 : refactor special functions as a new SpecialFunctions module, currently in unsupported/.
2016-07-08 11:13:55 +02:00
Gael Guennebaud
179ebb88f9
Fix warning
2016-07-07 09:16:40 +02:00
Gael Guennebaud
ce9fc0ce14
fix clang compilation
2016-07-04 12:59:02 +02:00
Gael Guennebaud
440020474c
Workaround compilation issue with msvc
2016-07-04 12:49:19 +02:00
Igor Babuschkin
78f37ca03c
Expose real and imag methods on Tensors
2016-07-01 17:34:31 +01:00
Benoit Steiner
cb2d8b8fa6
Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.
2016-06-29 15:42:01 -07:00
Benoit Steiner
b2a47641ce
Made the code compile when using CUDA architecture < 300
2016-06-29 15:32:47 -07:00
Igor Babuschkin
85699850d9
Add missing CUDA kernel to tensor scan op
...
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
2016-06-29 11:54:35 +01:00
Benoit Steiner
75c333f94c
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
...
Also avoid taking references to values that may become stale after a copy construction.
2016-06-27 10:32:38 -07:00
Benoit Steiner
7944d4431f
Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC cuda compiler compile this code.
2016-08-18 13:46:36 -07:00
Benoit Steiner
647a51b426
Force the inlining of a simple accessor.
2016-08-18 12:31:02 -07:00
Benoit Steiner
a452dedb4f
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
...
Enable efficient Tensor reduction for doubles on the GPU (continued)
2016-08-18 12:29:54 -07:00
Igor Babuschkin
18c67df31c
Fix remaining CUDA >= 300 checks
2016-08-18 17:18:30 +01:00
Igor Babuschkin
1569a7d7ab
Add the necessary CUDA >= 300 checks back
2016-08-18 17:15:12 +01:00
Benoit Steiner
2b17f34574
Properly detect the type of the result of a contraction.
2016-08-16 16:00:30 -07:00
Igor Babuschkin
841e075154
Remove CUDA >= 300 checks and enable outer reduction for doubles
2016-08-06 18:07:50 +01:00
Igor Babuschkin
0425118e2a
Merge upstream changes
2016-08-05 14:34:57 +01:00
Igor Babuschkin
9537e8b118
Make use of atomicExch for atomicExchCustom
2016-08-05 14:29:58 +01:00
Igor Babuschkin
eeb0d880ee
Enable efficient Tensor reduction for doubles
2016-07-01 19:08:26 +01:00
Gael Guennebaud
cfff370549
Fix hyperbolic functions for autodiff.
2016-06-24 23:21:35 +02:00
Gael Guennebaud
3852351793
merge pull request 198
2016-06-24 11:48:17 +02:00
Gael Guennebaud
6dd9077070
Fix some unused typedef warnings.
2016-06-24 11:34:21 +02:00
Gael Guennebaud
ce90647fa5
Fix NumTraits<AutoDiff>
2016-06-24 11:34:02 +02:00
Gael Guennebaud
fa39f81b48
Fix instantiation of ScalarBinaryOpTraits for AutoDiff.
2016-06-24 11:33:30 +02:00
Rasmus Munk Larsen
a9c1e4d7b7
Return -1 from CurrentThreadId when called by thread outside the pool.
2016-06-23 16:40:07 -07:00
Rasmus Munk Larsen
d39df320d2
Resolve merge.
2016-06-23 15:08:03 -07:00
Gael Guennebaud
360a743a10
bug #1241 : does not emit anything for empty tensors
2016-06-23 18:47:31 +02:00
Gael Guennebaud
7c6561485a
merge PR 194
2016-06-23 15:29:57 +02:00
Benoit Steiner
a29a2cb4ff
Silenced a couple of compilation warnings generated by xcode
2016-06-22 16:43:02 -07:00
Benoit Steiner
f8fcd6b32d
Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers
2016-06-22 16:03:11 -07:00
Benoit Steiner
c58df31747
Handle empty tensors in the print functions
2016-06-21 09:22:43 -07:00
Benoit Steiner
de32f8d656
Fixed the printing of rank-0 tensors
2016-06-20 10:46:45 -07:00
Tal Hadad
8e198d6835
Complete docs and add ostream operator for EulerAngles.
2016-06-19 20:42:45 +03:00
Geoffrey Lalonde
72c95383e0
Add autodiff coverage for standard library hyperbolic functions, and tests.
...
* * *
Corrected tanh derivative, moved test definitions.
* * *
Added more test cases, removed lingering lines
2016-06-15 23:33:19 -07:00
Benoit Steiner
7d495d890a
Merged in ibab/eigen (pull request PR-197)
...
Implement exclusive scan option for Tensor library
2016-06-14 17:54:59 -07:00
Benoit Steiner
aedc5be1d6
Avoid generating pseudo random numbers that are multiples of 5: this helps
...
spread the load over multiple cpus without having to rely on work stealing.
2016-06-14 17:51:47 -07:00
Igor Babuschkin
c4d10e921f
Implement exclusive scan option
2016-06-14 19:44:07 +01:00
Gael Guennebaud
76236cdea4
merge
2016-06-14 15:33:47 +02:00
Gael Guennebaud
62134082aa
Update AutoDiffScalar wrt scalar-multiple.
2016-06-14 15:06:35 +02:00
Gael Guennebaud
5d38203735
Update Tensor module to use bind1st_op and bind2nd_op
2016-06-14 15:06:03 +02:00
Tal Hadad
6edfe8771b
Little bit docs
2016-06-13 22:03:19 +03:00
Tal Hadad
6e1c086593
Add static assertion
2016-06-13 21:55:17 +03:00
Gael Guennebaud
3c12e24164
Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them.
2016-06-13 16:18:59 +02:00
Tal Hadad
06206482d9
More docs, and minor code fixes
2016-06-12 23:40:17 +03:00
Benoit Steiner
65d33e5898
Merged in ibab/eigen (pull request PR-195)
...
Add small fixes to TensorScanOp
2016-06-10 19:31:17 -07:00
Benoit Steiner
a05607875a
Don't refer to the half2 type unless it's been defined
2016-06-10 11:53:56 -07:00
Igor Babuschkin
86aedc9282
Add small fixes to TensorScanOp
2016-06-07 20:06:38 +01:00
Benoit Steiner
84b2060a9e
Fixed compilation error with gcc 4.4
2016-06-06 17:16:19 -07:00
Benoit Steiner
7ef9f47b58
Misc small improvements to the reduction code.
2016-06-06 14:09:46 -07:00
Tal Hadad
e30133e439
Doc EulerAngles class, and minor fixes.
2016-06-06 22:01:40 +03:00
Benoit Steiner
9137f560f0
Moved assertions to the constructor to make the code more portable
2016-06-06 07:26:48 -07:00
Gael Guennebaud
66e99ab6a1
Relax mixing-type constraints for binary coefficient-wise operators:
...
- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
- Remove the "functor_is_product_like" helper (was pretty ugly)
- Currently, OP is not used, but it is available to the user for fine grained tuning
- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
- TODO: generalize all other binary operators (comparisons,pow,etc.)
- TODO: handle "scalar op array" operators (currently only * is handled)
- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
2016-06-06 15:11:41 +02:00