Commit Graph

1600 Commits

Author SHA1 Message Date
Gael Guennebaud
091d373ee9 Fix outer-stride. 2016-10-12 21:47:52 +02:00
Benoit Steiner
7f0599b6eb Manually define int16_t and uint16_t when compiling with Visual Studio 2016-10-08 22:56:32 -07:00
RJ Ryan
e2e9cdd169 Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*. 2016-10-06 10:49:48 -07:00
Benoit Steiner
8b69d5d730 ::rand() returns a signed integer on win32 2016-10-05 08:55:02 -07:00
Benoit Steiner
ed7a220b04 Fixed a typo that impacts windows builds 2016-10-05 08:51:31 -07:00
Benoit Steiner
ceee1c008b Silenced compilation warning 2016-10-04 18:47:53 -07:00
Benoit Steiner
6af5ac7e27 Cleanup the cuda executor code. 2016-10-04 08:52:13 -07:00
Benoit Steiner
2f6d1607c8 Cleaned up the random number generation code. 2016-10-04 08:38:23 -07:00
Benoit Steiner
2bda1b0d93 Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus. 2016-09-28 17:08:41 -07:00
Benoit Steiner
6565f8d60f Made the initialization of a CUDA device thread safe. 2016-09-26 11:00:32 -07:00
Benoit Steiner
1301d744f8 Made the gaussian generator usable on GPU 2016-09-22 19:04:44 -07:00
Gael Guennebaud
3ada6e4bed Merged hongkai-dai/eigen/tip into default (bug #1298) 2016-09-19 22:08:06 +02:00
Benoit Steiner
c3ca9b1e76 Deleted some unecessary and confusing EIGEN_DEVICE_FUNC 2016-09-19 11:33:39 -07:00
Hongkai Dai
5dcc6d301a remove ternary operator in euler angles 2016-09-19 10:30:30 -07:00
Emil Fresk
6edd2e2851 Made AutoDiffJacobian more intuitive to use and updated for C++11
Changes:
* Removed unnecessary types from the Functor by inferring from its types
* Removed inputs() function reference, replaced with .rows()
* Updated the forward constructor to use variadic templates
* Added optional parameters to the Fuctor for passing parameters,
  control signals, etc
* Has been tested with fixed size and dynamic matricies

Ammendment by chtz: overload operator() for compatibility with not fully conforming compilers
2016-09-16 14:03:55 +02:00
Gael Guennebaud
18f6e47815 Fix order of "static inline". 2016-09-16 11:32:54 +02:00
Benoit Steiner
488ad7dd1b Added missing EIGEN_DEVICE_FUNC qualifiers 2016-09-14 13:35:00 -07:00
Benoit Steiner
028e299577 Fixed a bug impacting some outer reductions on GPU 2016-09-12 18:36:52 -07:00
Benoit Steiner
8321dcce76 Merged latest updates from trunk 2016-09-12 10:33:05 -07:00
Benoit Steiner
eb6ba00cc8 Properly size the list of waiters 2016-09-12 10:31:55 -07:00
Benoit Steiner
a618094b62 Added a resize method to MaxSizeVector 2016-09-12 10:30:53 -07:00
Gael Guennebaud
471eac5399 bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX) 2016-09-08 08:36:27 +02:00
Gael Guennebaud
e1642f485c bug #1288: fix memory leak in arpack wrapper. 2016-09-05 18:01:30 +02:00
Benoit Steiner
13df3441ae Use MaxSizeVector instead of std::vector: xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instruction to initialize it. This can result in random crashes when compiling with AVX instructions enabled. 2016-09-02 19:25:47 -07:00
Benoit Steiner
cadd124d73 Pulled latest update from trunk 2016-09-02 15:30:02 -07:00
Benoit Steiner
05b0518077 Made the index type an explicit template parameter to help some compilers compile the code. 2016-09-02 15:29:34 -07:00
Benoit Steiner
adf864fec0 Merged in rmlarsen/eigen (pull request PR-222)
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 14:11:20 -07:00
Rasmus Munk Larsen
13e93ca8b7 Fix CUDA build broken by changes to min and max reduction. 2016-09-02 13:41:36 -07:00
Benoit Steiner
c53f783705 Updated the contraction code to support constant inputs. 2016-09-01 11:41:27 -07:00
Gael Guennebaud
46475eff9a Adjust Tensor module wrt recent change in nullary functor 2016-09-01 13:40:45 +02:00
Rasmus Munk Larsen
a1e092d1e8 Fix bugs to make min- and max reducers with correctly with IEEE infinities. 2016-08-31 15:04:16 -07:00
Gael Guennebaud
1f84f0d33a merge EulerAngles module 2016-08-30 10:01:53 +02:00
Gael Guennebaud
e074f720c7 Include missing forward declaration of SparseMatrix 2016-08-29 18:56:46 +02:00
Gael Guennebaud
35a8e94577 bug #1167: simplify installation of header files using cmake's install(DIRECTORY ...) command. 2016-08-29 10:59:37 +02:00
Gael Guennebaud
965e595f02 Add missing log1p method 2016-08-26 14:55:00 +02:00
Benoit Steiner
7944d4431f Made the cost model cwiseMax and cwiseMin methods consts to help the PowerPC cuda compiler compile this code. 2016-08-18 13:46:36 -07:00
Benoit Steiner
647a51b426 Force the inlining of a simple accessor. 2016-08-18 12:31:02 -07:00
Benoit Steiner
a452dedb4f Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
2016-08-18 12:29:54 -07:00
Igor Babuschkin
18c67df31c Fix remaining CUDA >= 300 checks 2016-08-18 17:18:30 +01:00
Igor Babuschkin
1569a7d7ab Add the necessary CUDA >= 300 checks back 2016-08-18 17:15:12 +01:00
Benoit Steiner
2b17f34574 Properly detect the type of the result of a contraction. 2016-08-16 16:00:30 -07:00
Benoit Steiner
34ae80179a Use array_prod instead of calling TotalSize since TotalSize is only available on DSize. 2016-08-15 10:29:14 -07:00
Benoit Steiner
fe73648c98 Fixed a bug in the documentation. 2016-08-12 10:00:43 -07:00
Benoit Steiner
e3a8dfb02f std::erfcf doesn't exist: use numext::erfc instead 2016-08-11 15:24:06 -07:00
Benoit Steiner
64e68cbe87 Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything. 2016-08-08 19:29:59 -07:00
Igor Babuschkin
841e075154 Remove CUDA >= 300 checks and enable outer reductin for doubles 2016-08-06 18:07:50 +01:00
Igor Babuschkin
0425118e2a Merge upstream changes 2016-08-05 14:34:57 +01:00
Igor Babuschkin
9537e8b118 Make use of atomicExch for atomicExchCustom 2016-08-05 14:29:58 +01:00
Benoit Steiner
ca2cee2739 Merged in ibab/eigen (pull request PR-206)
Expose real and imag methods on Tensors
2016-08-03 11:53:04 -07:00
Benoit Steiner
a20b58845f CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running. 2016-08-03 10:00:43 -07:00