Gael Guennebaud
471eac5399
bug #1195 : move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)
2016-09-08 08:36:27 +02:00
Gael Guennebaud
e1642f485c
bug #1288 : fix memory leak in arpack wrapper.
2016-09-05 18:01:30 +02:00
Gael Guennebaud
dabc81751f
Fix compilation when cuda_fp16.h does not exist.
2016-09-05 17:14:20 +02:00
Benoit Steiner
87a8a1975e
Fixed a regression test
2016-09-02 19:29:33 -07:00
Benoit Steiner
13df3441ae
Use MaxSizeVector instead of std::vector: Xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instructions to initialize it. This can result in random crashes when compiling with AVX instructions enabled.
2016-09-02 19:25:47 -07:00
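A minimal sketch of the alignment hazard described above, assuming a pre-C++17 toolchain with AVX enabled (`Block` is a hypothetical over-aligned element type, not Eigen code). std::allocator only guarantees alignof(max_align_t) alignment, so a fixed-capacity container that manages its own aligned storage, such as MaxSizeVector, avoids relying on that guarantee.

```cpp
#include <vector>

// Hypothetical over-aligned element type, for illustration only.
struct alignas(32) Block {
  float data[8];
};

int main() {
  // Pre-C++17, std::vector's heap storage is only guaranteed to be
  // alignof(max_align_t)-aligned (typically 16 bytes). Value-initializing
  // the elements below may therefore be compiled into 32-byte aligned AVX
  // stores into insufficiently aligned memory and crash at runtime.
  std::vector<Block> blocks(64);
  return blocks.empty() ? 1 : 0;
}
```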
Benoit Steiner
cadd124d73
Pulled latest update from trunk
2016-09-02 15:30:02 -07:00
Benoit Steiner
05b0518077
Made the index type an explicit template parameter to help some compilers compile the code.
2016-09-02 15:29:34 -07:00
Benoit Steiner
adf864fec0
Merged in rmlarsen/eigen (pull request PR-222)
...
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 14:11:20 -07:00
Rasmus Munk Larsen
13e93ca8b7
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 13:41:36 -07:00
Benoit Steiner
6c05c3dd49
Fix the cxx11_tensor_cuda.cu test on 32bit platforms.
2016-09-02 11:12:16 -07:00
Benoit Steiner
039e225f7f
Added a test for nullary expressions on CUDA
...
Also check that we can mix 64 and 32 bit indices in the same compilation unit
2016-09-01 13:28:12 -07:00
Benoit Steiner
c53f783705
Updated the contraction code to support constant inputs.
2016-09-01 11:41:27 -07:00
Gael Guennebaud
46475eff9a
Adjust Tensor module wrt recent change in nullary functor
2016-09-01 13:40:45 +02:00
Gael Guennebaud
72a4d49315
Fix compilation with CUDA 8
2016-09-01 13:39:33 +02:00
Rasmus Munk Larsen
a1e092d1e8
Fix bugs to make min and max reducers work correctly with IEEE infinities.
2016-08-31 15:04:16 -07:00
Gael Guennebaud
1f84f0d33a
merge EulerAngles module
2016-08-30 10:01:53 +02:00
Gael Guennebaud
e074f720c7
Include missing forward declaration of SparseMatrix
2016-08-29 18:56:46 +02:00
Gael Guennebaud
6cd7b9ea6b
Fix compilation with cuda 8
2016-08-29 11:06:08 +02:00
Gael Guennebaud
35a8e94577
bug #1167 : simplify installation of header files using cmake's install(DIRECTORY ...) command.
2016-08-29 10:59:37 +02:00
Gael Guennebaud
0f56b5a6de
enable vectorization path when testing half on cuda, and add test for log1p
2016-08-26 14:55:51 +02:00
Gael Guennebaud
965e595f02
Add missing log1p method
2016-08-26 14:55:00 +02:00
Benoit Steiner
34ae80179a
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
2016-08-15 10:29:14 -07:00
Benoit Steiner
fe73648c98
Fixed a bug in the documentation.
2016-08-12 10:00:43 -07:00
Benoit Steiner
e3a8dfb02f
std::erfcf doesn't exist: use numext::erfc instead
2016-08-11 15:24:06 -07:00
Benoit Steiner
64e68cbe87
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
2016-08-08 19:29:59 -07:00
Benoit Steiner
5eea1c7f97
Fixed cut-and-paste bug in debug message
2016-08-04 17:34:13 -07:00
Benoit Steiner
b50d8f8c4a
Extended a regression test to validate that basic fp16 support works with CUDA 7.0
2016-08-03 16:50:13 -07:00
Benoit Steiner
fad9828769
Deleted redundant regression test.
2016-08-03 16:08:37 -07:00
Benoit Steiner
ca2cee2739
Merged in ibab/eigen (pull request PR-206)
...
Expose real and imag methods on Tensors
2016-08-03 11:53:04 -07:00
Benoit Steiner
d92df04ce8
Cleaned up the new float16 test a bit
2016-08-03 11:50:07 -07:00
Benoit Steiner
81099ef482
Added a test for fp16
2016-08-03 11:41:17 -07:00
Benoit Steiner
a20b58845f
CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.
2016-08-03 10:00:43 -07:00
Benoit Steiner
fd220dd8b0
Use numext::conj instead of std::conj
2016-08-01 18:16:16 -07:00
Benoit Steiner
e256acec7c
Avoid unnecessary object copies
2016-08-01 17:03:39 -07:00
Benoit Steiner
2693fd54bf
bug #1266 : half implementation has been moved to half_impl namespace
2016-07-29 13:45:56 -07:00
Gael Guennebaud
cc2f6d68b1
bug #1264 : fix compilation
2016-07-27 23:30:47 +02:00
Gael Guennebaud
8972323c08
bug #1261 : add missing max(ADS,ADS) overload (same for min)
2016-07-27 14:52:48 +02:00
Gael Guennebaud
5d94dc85e5
bug #1260 : add regression test
2016-07-27 14:38:30 +02:00
Gael Guennebaud
0d7039319c
bug #1260 : remove doubtful specializations of ScalarBinaryOpTraits
2016-07-27 14:35:52 +02:00
Benoit Steiner
3d3d34e442
Deleted dead code.
2016-07-25 08:53:37 -07:00
Gael Guennebaud
6d5daf32f5
bug #1255 : comment out broken and unused line.
2016-07-25 14:48:30 +02:00
Gael Guennebaud
f9598d73b5
bug #1250 : fix pow() for AutoDiffScalar with custom nested scalar type.
2016-07-25 14:42:19 +02:00
Gael Guennebaud
fd1117f2be
Implement digits10 for mpreal
2016-07-25 14:38:55 +02:00
Gael Guennebaud
9908020d36
Add minimal support for Array<string>, and fix Tensor<string>
2016-07-25 14:25:56 +02:00
Benoit Steiner
c6b0de2c21
Improved partial reductions in more cases
2016-07-22 17:18:20 -07:00
Gael Guennebaud
32d95e86c9
merge
2016-07-22 16:43:12 +02:00
Gael Guennebaud
d7a0e52478
Fix testing of log near 1
2016-07-22 15:44:26 +02:00
Gael Guennebaud
7acf23c14c
Truly split unit test.
2016-07-22 15:41:23 +02:00
Gael Guennebaud
d075d122ea
Move half unit test from unsupported to main tests
2016-07-22 14:34:19 +02:00
Gael Guennebaud
0f350a8b7e
Fix CUDA compilation
2016-07-21 18:47:07 +02:00
Gael Guennebaud
82798162c0
Extend unit testing of half with ADL and arrays.
2016-07-21 15:47:21 +02:00
Yi Lin
7b4abc2b1d
Fixed a code comment error
2016-07-20 22:28:54 +08:00
Benoit Steiner
20f7ef2f89
An evalTo expression is only aligned iff both the lhs and the rhs are aligned.
2016-07-12 10:56:42 -07:00
Gael Guennebaud
c98bac2966
Manually add -std=c++11 to nvcc for old cmake versions
2016-07-12 09:29:18 +02:00
Benoit Steiner
40eb97516c
reverted unintended change.
2016-07-11 14:28:03 -07:00
Benoit Steiner
03b71c273e
Made the packetmath test compile again. A better fix would be to move the special function tests to the unsupported directory where the code now resides.
2016-07-11 13:50:24 -07:00
Benoit Steiner
3a2dd352ae
Improved the contraction mapper to properly support tensor products
2016-07-11 13:43:41 -07:00
Benoit Steiner
0bc020be9d
Improved the detection of packet size in the tensor scan evaluator.
2016-07-11 12:14:56 -07:00
Gael Guennebaud
a96a7ce3f7
Move CUDA's special functions to SpecialFunctions module.
2016-07-11 18:39:11 +02:00
Gael Guennebaud
fd60966310
merge
2016-07-11 18:11:47 +02:00
Gael Guennebaud
7d636349dc
Fix configuration of CUDA:
...
- preserve user defined CUDA_NVCC_FLAGS
- remove the -ansi flag that conflicts with -std=c++11
- do not add -std=c++11 if already there
2016-07-11 18:09:04 +02:00
Gael Guennebaud
131ee4bb8e
Split test_slice_in_expr which seems to be huge for visual
2016-07-11 11:46:55 +02:00
Gael Guennebaud
194daa3048
Fix assertion (it did not make sense for static_val types)
2016-07-11 11:39:27 +02:00
Gael Guennebaud
18c35747ce
Emulate _BitScanReverse64 for 32 bits builds
2016-07-11 11:38:04 +02:00
Gael Guennebaud
599f8ba617
Change runtime to compile-time conditional.
2016-07-08 11:39:43 +02:00
Gael Guennebaud
544935101a
Fix warnings
2016-07-08 11:38:52 +02:00
Gael Guennebaud
59bf2774a3
Fix warnings
2016-07-08 11:38:11 +02:00
Gael Guennebaud
2f7e2614e7
bug #1232 : refactor special functions as a new SpecialFunctions module, currently in unsupported/.
2016-07-08 11:13:55 +02:00
Gael Guennebaud
8b7431d8fd
fix compilation with c++11
2016-07-07 15:18:23 +02:00
Gael Guennebaud
69378eed0b
Split huge unit test
2016-07-07 15:18:04 +02:00
Gael Guennebaud
179ebb88f9
Fix warning
2016-07-07 09:16:40 +02:00
Gael Guennebaud
5d2dada197
Fix warnings
2016-07-07 09:05:15 +02:00
Gael Guennebaud
f5e780fb05
split huge unit test
2016-07-07 08:59:59 +02:00
Gael Guennebaud
ce9fc0ce14
fix clang compilation
2016-07-04 12:59:02 +02:00
Gael Guennebaud
440020474c
Workaround compilation issue with msvc
2016-07-04 12:49:19 +02:00
Igor Babuschkin
78f37ca03c
Expose real and imag methods on Tensors
2016-07-01 17:34:31 +01:00
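A short usage sketch of the newly exposed accessors, assuming the unsupported CXX11 Tensor module:

```cpp
#include <complex>
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  Eigen::Tensor<std::complex<float>, 2> t(2, 3);
  t.setConstant(std::complex<float>(1.0f, 2.0f));
  // Extract the real and imaginary parts element-wise as float tensors.
  Eigen::Tensor<float, 2> re = t.real();  // all 1.0f
  Eigen::Tensor<float, 2> im = t.imag();  // all 2.0f
  return (re(0, 0) == 1.0f && im(0, 0) == 2.0f) ? 0 : 1;
}
```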
Benoit Steiner
cb2d8b8fa6
Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.
2016-06-29 15:42:01 -07:00
Benoit Steiner
b2a47641ce
Made the code compile when using CUDA architecture < 300
2016-06-29 15:32:47 -07:00
Igor Babuschkin
85699850d9
Add missing CUDA kernel to tensor scan op
...
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
2016-06-29 11:54:35 +01:00
Benoit Steiner
1a9f92e781
Added a test to validate the tensor scan evaluation on GPU. The test is currently disabled since the code segfaults.
2016-06-27 16:02:52 -07:00
Benoit Steiner
75c333f94c
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
...
Also avoid taking references to values that may become stale after a copy construction.
2016-06-27 10:32:38 -07:00
Benoit Steiner
7944d4431f
Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC cuda compiler compile this code.
2016-08-18 13:46:36 -07:00
Benoit Steiner
647a51b426
Force the inlining of a simple accessor.
2016-08-18 12:31:02 -07:00
Benoit Steiner
a452dedb4f
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
...
Enable efficient Tensor reduction for doubles on the GPU (continued)
2016-08-18 12:29:54 -07:00
Igor Babuschkin
18c67df31c
Fix remaining CUDA >= 300 checks
2016-08-18 17:18:30 +01:00
Igor Babuschkin
1569a7d7ab
Add the necessary CUDA >= 300 checks back
2016-08-18 17:15:12 +01:00
Benoit Steiner
2b17f34574
Properly detect the type of the result of a contraction.
2016-08-16 16:00:30 -07:00
Igor Babuschkin
841e075154
Remove CUDA >= 300 checks and enable outer reduction for doubles
2016-08-06 18:07:50 +01:00
Igor Babuschkin
0425118e2a
Merge upstream changes
2016-08-05 14:34:57 +01:00
Igor Babuschkin
9537e8b118
Make use of atomicExch for atomicExchCustom
2016-08-05 14:29:58 +01:00
Igor Babuschkin
eeb0d880ee
Enable efficient Tensor reduction for doubles
2016-07-01 19:08:26 +01:00
Gael Guennebaud
cfff370549
Fix hyperbolic functions for autodiff.
2016-06-24 23:21:35 +02:00
Gael Guennebaud
3852351793
merge pull request 198
2016-06-24 11:48:17 +02:00
Gael Guennebaud
6dd9077070
Fix some unused typedef warnings.
2016-06-24 11:34:21 +02:00
Gael Guennebaud
ce90647fa5
Fix NumTraits<AutoDiff>
2016-06-24 11:34:02 +02:00
Gael Guennebaud
fa39f81b48
Fix instantiation of ScalarBinaryOpTraits for AutoDiff.
2016-06-24 11:33:30 +02:00
Rasmus Munk Larsen
a9c1e4d7b7
Return -1 from CurrentThreadId when called by a thread outside the pool.
2016-06-23 16:40:07 -07:00
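A small sketch of how the thread-pool id methods behave, assuming the ThreadPool typedef and Schedule() interface from the unsupported CXX11 ThreadPool module:

```cpp
#include <cstdio>
#include <unsupported/Eigen/CXX11/ThreadPool>

int main() {
  Eigen::ThreadPool pool(4);
  // The calling thread is not part of the pool, so CurrentThreadId() is -1.
  std::printf("outside pool: id=%d, NumThreads=%d\n",
              pool.CurrentThreadId(), pool.NumThreads());
  // Inside a scheduled task, the id is the worker's index in [0, NumThreads()).
  pool.Schedule([&pool] {
    std::printf("inside pool: id=%d\n", pool.CurrentThreadId());
  });
  return 0;  // the pool destructor joins the worker threads
}
```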
Rasmus Munk Larsen
d39df320d2
Resolve merge.
2016-06-23 15:08:03 -07:00
Gael Guennebaud
361dbd246d
Add unit test for printing empty tensors
2016-06-23 18:54:30 +02:00
Gael Guennebaud
360a743a10
bug #1241 : does not emit anything for empty tensors
2016-06-23 18:47:31 +02:00
Gael Guennebaud
7c6561485a
merge PR 194
2016-06-23 15:29:57 +02:00
Benoit Steiner
a29a2cb4ff
Silenced a couple of compilation warnings generated by xcode
2016-06-22 16:43:02 -07:00
Benoit Steiner
f8fcd6b32d
Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers
2016-06-22 16:03:11 -07:00
Benoit Steiner
c58df31747
Handle empty tensors in the print functions
2016-06-21 09:22:43 -07:00
Benoit Steiner
de32f8d656
Fixed the printing of rank-0 tensors
2016-06-20 10:46:45 -07:00
Tal Hadad
8e198d6835
Complete docs and add ostream operator for EulerAngles.
2016-06-19 20:42:45 +03:00
Geoffrey Lalonde
72c95383e0
Add autodiff coverage for standard library hyperbolic functions, and tests.
...
* * *
Corrected tanh derivative, moved test definitions.
* * *
Added more test cases, removed lingering lines
2016-06-15 23:33:19 -07:00
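A minimal sketch of exercising the new hyperbolic overloads through AutoDiffScalar, assuming the (value, number-of-derivatives, index) seeding constructor:

```cpp
#include <cmath>
#include <unsupported/Eigen/AutoDiff>

int main() {
  typedef Eigen::AutoDiffScalar<Eigen::VectorXd> AD;
  AD x(0.5, 1, 0);   // value 0.5, one derivative, seeded at index 0
  AD y = tanh(x);    // AutoDiffScalar overload found via ADL
  // d/dx tanh(x) = 1 - tanh(x)^2
  double expected = 1.0 - std::tanh(0.5) * std::tanh(0.5);
  return std::abs(y.derivatives()(0) - expected) < 1e-12 ? 0 : 1;
}
```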
Benoit Steiner
7d495d890a
Merged in ibab/eigen (pull request PR-197)
...
Implement exclusive scan option for Tensor library
2016-06-14 17:54:59 -07:00
Benoit Steiner
aedc5be1d6
Avoid generating pseudo random numbers that are multiples of 5: this helps
...
spread the load over multiple cpus without having to rely on work stealing.
2016-06-14 17:51:47 -07:00
Igor Babuschkin
c4d10e921f
Implement exclusive scan option
2016-06-14 19:44:07 +01:00
Gael Guennebaud
76236cdea4
merge
2016-06-14 15:33:47 +02:00
Gael Guennebaud
62134082aa
Update AutoDiffScalar wrt scalar-multiple.
2016-06-14 15:06:35 +02:00
Gael Guennebaud
5d38203735
Update Tensor module to use bind1st_op and bind2nd_op
2016-06-14 15:06:03 +02:00
Gael Guennebaud
f925dba3d9
Fix compilation of BVH example
2016-06-14 11:32:09 +02:00
Tal Hadad
6edfe8771b
Little bit docs
2016-06-13 22:03:19 +03:00
Tal Hadad
6e1c086593
Add static assertion
2016-06-13 21:55:17 +03:00
Gael Guennebaud
3c12e24164
Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them.
2016-06-13 16:18:59 +02:00
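A conceptual sketch of the binder idea, not Eigen's actual internal::bind1st_op implementation: store one operand alongside a binary functor so the result behaves like a unary functor, which is how a scalar-times-expression can reuse the binary machinery.

```cpp
#include <iostream>
#include <utility>

// Bind the left-hand operand of a binary functor to obtain a unary functor.
template <typename BinaryOp, typename Scalar>
struct bind1st_sketch {
  BinaryOp op;
  Scalar lhs;  // the bound left-hand operand
  bind1st_sketch(const BinaryOp& o, const Scalar& l) : op(o), lhs(l) {}
  template <typename Rhs>
  auto operator()(const Rhs& rhs) const
      -> decltype(std::declval<const BinaryOp&>()(std::declval<const Scalar&>(), rhs)) {
    return op(lhs, rhs);
  }
};

struct multiply_op {
  template <typename A, typename B>
  auto operator()(const A& a, const B& b) const -> decltype(a * b) { return a * b; }
};

int main() {
  // "3 * x" expressed as a unary functor built from the binary multiply_op.
  bind1st_sketch<multiply_op, int> times3(multiply_op(), 3);
  std::cout << times3(14) << "\n";  // prints 42
  return 0;
}
```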
Tal Hadad
06206482d9
More docs, and minor code fixes
2016-06-12 23:40:17 +03:00
Benoit Steiner
65d33e5898
Merged in ibab/eigen (pull request PR-195)
...
Add small fixes to TensorScanOp
2016-06-10 19:31:17 -07:00
Benoit Steiner
a05607875a
Don't refer to the half2 type unless it's been defined
2016-06-10 11:53:56 -07:00
Igor Babuschkin
86aedc9282
Add small fixes to TensorScanOp
2016-06-07 20:06:38 +01:00
Christoph Hertzberg
db0118342c
Fixed compilation of BVH_Example (required for make doc)
2016-06-07 19:17:18 +02:00
Benoit Steiner
84b2060a9e
Fixed compilation error with gcc 4.4
2016-06-06 17:16:19 -07:00
Benoit Steiner
7ef9f47b58
Misc small improvements to the reduction code.
2016-06-06 14:09:46 -07:00
Tal Hadad
e30133e439
Doc EulerAngles class, and minor fixes.
2016-06-06 22:01:40 +03:00
Benoit Steiner
9137f560f0
Moved assertions to the constructor to make the code more portable
2016-06-06 07:26:48 -07:00
Gael Guennebaud
66e99ab6a1
Relax mixing-type constraints for binary coefficient-wise operators:
...
- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
- Remove the "functor_is_product_like" helper (was pretty ugly)
- Currently, OP is not used, but it is available to the user for fine grained tuning
- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
- TODO: generalize all other binary operators (comparisons,pow,etc.)
- TODO: handle "scalar op array" operators (currently only * is handled)
- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
2016-06-06 15:11:41 +02:00
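Under this scheme, allowing a mixed-type coefficient-wise operation boils down to specializing ScalarBinaryOpTraits. A minimal sketch with a hypothetical MyScalar type (a complete setup would typically also provide a NumTraits specialization):

```cpp
#include <Eigen/Core>

// Hypothetical user-defined scalar type, for illustration only.
struct MyScalar {
  double v;
  MyScalar(double x = 0) : v(x) {}
};
inline MyScalar operator*(const MyScalar& a, double b) { return MyScalar(a.v * b); }
inline MyScalar operator*(double a, const MyScalar& b) { return MyScalar(a * b.v); }

namespace Eigen {
// Declare the result type of MyScalar-double binary operations so that
// mixed-type coefficient-wise expressions are accepted.
template <typename BinaryOp>
struct ScalarBinaryOpTraits<MyScalar, double, BinaryOp> { typedef MyScalar ReturnType; };
template <typename BinaryOp>
struct ScalarBinaryOpTraits<double, MyScalar, BinaryOp> { typedef MyScalar ReturnType; };
}  // namespace Eigen

int main() { return 0; }
```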
Rasmus Munk Larsen
f1f2ff8208
size_t -> int
2016-06-03 18:06:37 -07:00
Rasmus Munk Larsen
76308e7fd2
Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool.
2016-06-03 16:28:58 -07:00
Benoit Steiner
37638dafd7
Simplified the code that dispatches vectorized reductions on GPU
2016-06-09 10:29:52 -07:00
Benoit Steiner
66796e843d
Fixed definition of some of the reducer_traits
2016-06-09 08:50:01 -07:00
Benoit Steiner
14a112ee15
Use signed integers more consistently to encode the number of threads to use to evaluate a tensor expression.
2016-06-09 08:25:22 -07:00
Benoit Steiner
8f92c26319
Improved code formatting
2016-06-09 08:23:42 -07:00
Benoit Steiner
aa33446dac
Improved support for vectorization of 16-bit floats
2016-06-09 08:22:27 -07:00
Benoit Steiner
d6d39c7ddb
Added missing EIGEN_DEVICE_FUNC
2016-06-07 14:35:08 -07:00
Gael Guennebaud
e8b922ca63
Fix MatrixFunctions module.
2016-06-03 09:21:35 +02:00
Benoit Steiner
c3c8ad8046
Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing
2016-06-02 21:17:41 -07:00
Eugene Brevdo
39baff850c
Add TernaryFunctors and the betainc SpecialFunction.
...
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.
Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.
Added unit tests in array.cpp and cxx11_tensor_cuda.cu
Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe. and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
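A usage sketch of the array-level entry point, assuming the betainc() free function exposed by the unsupported SpecialFunctions module:

```cpp
#include <iostream>
#include <unsupported/Eigen/SpecialFunctions>

int main() {
  Eigen::ArrayXd a(3), b(3), x(3);
  a << 0.5, 2.0, 8.0;
  b << 0.5, 3.0, 2.0;
  x << 0.25, 0.5, 0.9;
  // Regularized incomplete beta function I_x(a, b), evaluated element-wise.
  Eigen::ArrayXd r = Eigen::betainc(a, b, x);
  std::cout << r.transpose() << "\n";
  return 0;
}
```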
Benoit Steiner
02db4e1a82
Disable the tensor tests when using msvc since older versions of the compiler fail to handle this code
2016-06-04 08:21:17 -07:00
Benoit Steiner
c21eaedce6
Use array_prod to compute the number of elements contained in the input tensor expression
2016-06-04 07:47:04 -07:00
Benoit Steiner
36a4500822
Merged in ibab/eigen (pull request PR-192)
...
Add generic scan method
2016-06-03 17:28:33 -07:00
Benoit Steiner
c2a102345f
Improved the performance of full reductions.
...
AFTER:
BM_fullReduction/10 4541 4543 154017 21.0M items/s
BM_fullReduction/64 5191 5193 100000 752.5M items/s
BM_fullReduction/512 9588 9588 71361 25.5G items/s
BM_fullReduction/4k 244314 244281 2863 64.0G items/s
BM_fullReduction/5k 359382 359363 1946 64.8G items/s
BEFORE:
BM_fullReduction/10 9085 9087 74395 10.5M items/s
BM_fullReduction/64 9478 9478 72014 412.1M items/s
BM_fullReduction/512 14643 14646 46902 16.7G items/s
BM_fullReduction/4k 260338 260384 2678 60.0G items/s
BM_fullReduction/5k 385076 385178 1818 60.5G items/s
2016-06-03 17:27:08 -07:00
Igor Babuschkin
dc03b8f3a1
Add generic scan method
2016-06-03 17:37:04 +01:00
Rasmus Munk Larsen
811aadbe00
Add syntactic sugar to Eigen tensors to allow more natural syntax.
...
Specifically, this enables expressions involving:
scalar + tensor
scalar * tensor
scalar / tensor
scalar - tensor
2016-06-02 12:41:28 -07:00
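A short sketch of the expressions this enables, assuming the unsupported CXX11 Tensor module:

```cpp
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  Eigen::Tensor<float, 2> t(2, 3);
  t.setConstant(2.0f);
  // With the added sugar, the scalar can appear on the left-hand side:
  Eigen::Tensor<float, 2> a = 1.0f + t;  // 3.0
  Eigen::Tensor<float, 2> b = 2.0f * t;  // 4.0
  Eigen::Tensor<float, 2> c = 6.0f / t;  // 3.0
  Eigen::Tensor<float, 2> d = 5.0f - t;  // 3.0
  return (a(0, 0) == 3.0f && b(0, 0) == 4.0f &&
          c(0, 0) == 3.0f && d(0, 0) == 3.0f) ? 0 : 1;
}
```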
Tal Hadad
52e4cbf539
Merged eigen/eigen into default
2016-06-02 22:15:20 +03:00
Tal Hadad
2aaaf22623
Fix Gael's reports (except documentation)
...
- "Scalar angle(int) const" should be "const Vector& angles() const"
- then method "coeffs" could be removed.
- avoid one letter names like h, p, r -> use alpha(), beta(), gamma() ;)
- about the "fromRotation" methods:
- replace the ones which are not static by operator= (as in Quaternion)
- the others are actually static methods: use a capital F: FromRotation
- method "invert" should be removed.
- use a macro to define both float and double EulerAnglesXYZ* typedefs
- AddConstIf -> not used
- no need for NegateIfXor, compilers are extremely good at optimizing away branches based on compile time constants:
if(IsHeadingOpposite-=IsEven) res.alpha() = -res.alpha();
2016-06-02 22:12:57 +03:00
Igor Babuschkin
fbd7ed6ff7
Add tensor scan op
...
This is the initial implementation of a generic scan operation.
Based on this, cumsum and cumprod methods have been added to TensorBase.
2016-06-02 13:35:47 +01:00
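A usage sketch of the resulting TensorBase methods, assuming the unsupported CXX11 Tensor module:

```cpp
#include <iostream>
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  Eigen::Tensor<int, 1> t(4);
  for (int i = 0; i < 4; ++i) t(i) = i + 1;  // 1 2 3 4
  // Inclusive scans along axis 0 via the new TensorScanOp-based methods.
  Eigen::Tensor<int, 1> sums  = t.cumsum(0);   // 1 3 6 10
  Eigen::Tensor<int, 1> prods = t.cumprod(0);  // 1 2 6 24
  std::cout << sums(3) << " " << prods(3) << "\n";  // prints "10 24"
  return 0;
}
```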
Benoit Steiner
0ed08fd281
Use a single PacketSize variable
2016-06-01 21:19:05 -07:00
Benoit Steiner
8f6fedc55f
Fixed compilation warning
2016-06-01 21:14:46 -07:00
Benoit Steiner
c3cada38e2
Speedup a test
2016-06-01 21:13:00 -07:00
Benoit Steiner
873e6ac54b
Silenced compilation warning generated by nvcc.
2016-06-01 14:20:50 -07:00
Benoit Steiner
d27b0ad4c8
Added support for mean reductions on fp16
2016-06-01 11:12:07 -07:00
Benoit Steiner
5aeb3687c4
Only enable optimized reductions of fp16 if the reduction functor supports them
2016-05-31 10:33:40 -07:00
Benoit Steiner
e2946d962d
Reimplement clamp as a static function.
2016-05-27 12:58:43 -07:00
Benoit Steiner
e96d36d4cd
Use NULL instead of nullptr to preserve the compatibility with cxx03
2016-05-27 12:54:06 -07:00
Benoit Steiner
abc815798b
Added a new operation to enable more powerful tensor indexing.
2016-05-27 12:22:25 -07:00
Benoit Steiner
5707537592
Fixed the "option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'" warning generated by nvcc 7.5
2016-05-27 10:47:53 -07:00
Gael Guennebaud
22a035db95
Fix compilation when defaulting to row-major
2016-05-27 10:31:11 +02:00
Benoit Steiner
1ae2567861
Fixed some compilation warnings
2016-05-26 15:57:19 -07:00
Benoit Steiner
1a47844529
Preserve the ability to vectorize the evaluation of an expression even when it involves a cast that isn't vectorized (e.g. fp16 to float)
2016-05-26 14:37:09 -07:00
Benoit Steiner
36369ab63c
Resolved merge conflicts
2016-05-26 13:39:39 -07:00
Benoit Steiner
28fcb5ca2a
Merged latest reduction improvements
2016-05-26 12:19:33 -07:00
Benoit Steiner
c1c7f06c35
Improved the performance of inner reductions.
2016-05-26 11:53:59 -07:00
Benoit Steiner
22d02c9855
Improved the coverage of the fp16 reduction tests
2016-05-26 11:12:16 -07:00
Benoit Steiner
8288b0aec2
Code cleanup.
2016-05-26 09:00:04 -07:00
Benoit Steiner
2d7ed54ba2
Made the static storage class qualifier come first.
2016-05-25 22:16:15 -07:00
Benoit Steiner
e1fca8866e
Deleted unnecessary explicit qualifiers.
2016-05-25 22:15:26 -07:00
Benoit Steiner
9b0aaf5113
Don't mark inline functions as static since it confuses the ICC compiler
2016-05-25 22:10:11 -07:00
Benoit Steiner
037a463fd5
Marked unused variables as such
2016-05-25 22:07:48 -07:00
Benoit Steiner
3ac4045272
Made the IndexPair code compile in non cxx11 mode
2016-05-25 15:15:12 -07:00
Benoit Steiner
66556d0e05
Made the index pair list code more portable across various compilers
2016-05-25 14:34:27 -07:00
Benoit Steiner
034aa3b2c0
Improved the performance of tensor padding
2016-05-25 11:43:08 -07:00
Benoit Steiner
58026905ae
Added support for statically known lists of pairs of indices
2016-05-25 11:04:14 -07:00
Benoit Steiner
0835667329
There is no need to make the fp16 full reduction kernel a static function.
2016-05-24 23:11:56 -07:00
Benoit Steiner
b5d6b52a4d
Fixed compilation warning
2016-05-24 23:10:57 -07:00
Benoit Steiner
a09cbf9905
Merged in rmlarsen/eigen (pull request PR-188)
...
Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
2016-05-23 12:55:12 -07:00
Christoph Hertzberg
718521d5cf
Silenced several double-promotion warnings
2016-05-22 18:17:04 +02:00
Christoph Hertzberg
b5a7603822
fixed macro name
2016-05-22 16:49:29 +02:00
Christoph Hertzberg
25a03c02d6
Fix some sign-compare warnings
2016-05-22 16:42:27 +02:00
Gael Guennebaud
ccaace03c9
Make EIGEN_HAS_CONSTEXPR user configurable
2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd
Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable
2016-05-20 15:05:38 +02:00
Gael Guennebaud
48bf5ec216
Make EIGEN_HAS_RVALUE_REFERENCES user configurable
2016-05-20 14:54:20 +02:00
Gael Guennebaud
f43ae88892
Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES
2016-05-20 14:48:51 +02:00
Gael Guennebaud
2f656ce447
Remove std:: to enable custom scalar types.
2016-05-19 23:13:47 +02:00
Rasmus Larsen
b1e080c752
Merged eigen/eigen into default
2016-05-18 15:21:50 -07:00
Rasmus Munk Larsen
5624219b6b
Merge.
2016-05-18 15:16:06 -07:00
Rasmus Munk Larsen
7df811cfe5
Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
2016-05-18 15:09:48 -07:00
Benoit Steiner
bb3ff8e9d9
Advertise the packet API of the tensor reducers iff the corresponding packet primitives are available.
2016-05-18 14:52:49 -07:00
Gael Guennebaud
548a487800
bug #1229 : bypass usage of Derived::Options which is available for plain matrix types only. Better use column-major storage anyway.
2016-05-18 16:44:05 +02:00
Gael Guennebaud
43790e009b
Pass argument by const ref instead of by value in pow(AutoDiffScalar...)
2016-05-18 16:28:02 +02:00
Gael Guennebaud
1fbfab27a9
bug #1223 : fix compilation of AutoDiffScalar's min/max operators, and add regression unit test.
2016-05-18 16:26:26 +02:00
Gael Guennebaud
448d9d943c
bug #1222 : fix compilation in AutoDiffScalar and add respective unit test
2016-05-18 16:00:11 +02:00
Rasmus Munk Larsen
f519fca72b
Reduce overhead for small tensors and cheap ops by short-circuiting the cost computation and block size calculation in parallelFor.
2016-05-17 16:06:00 -07:00
Benoit Steiner
86ae94462e
#if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly.
2016-05-17 14:06:15 -07:00
Benoit Steiner
997c335970
Fixed compilation error
2016-05-17 12:54:18 -07:00
Benoit Steiner
ebf6ada5ee
Fixed compilation error in the tensor thread pool
2016-05-17 12:33:46 -07:00
Rasmus Munk Larsen
0bb61b04ca
Merge upstream.
2016-05-17 10:26:10 -07:00
Rasmus Munk Larsen
0dbd68145f
Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h.
2016-05-17 10:25:19 -07:00
Rasmus Larsen
00228f2506
Merged eigen/eigen into default
2016-05-17 09:49:31 -07:00
Benoit Steiner
e7e64c3277
Enable the use of the packet API to evaluate tensor broadcasts. This speeds things up quite a bit:
...
Before"
M_broadcasting/10 500000 3690 27.10 MFlops/s
BM_broadcasting/80 500000 4014 1594.24 MFlops/s
BM_broadcasting/640 100000 14770 27731.35 MFlops/s
BM_broadcasting/4K 5000 632711 39512.48 MFlops/s
After:
BM_broadcasting/10 500000 4287 23.33 MFlops/s
BM_broadcasting/80 500000 4455 1436.41 MFlops/s
BM_broadcasting/640 200000 10195 40173.01 MFlops/s
BM_broadcasting/4K 5000 423746 58997.57 MFlops/s
2016-05-17 09:24:35 -07:00