We were getting a lot of warnings due to nested `find_package` calls
within `Find***.cmake` files. The recommended approach is to use
[`find_dependency`](https://cmake.org/cmake/help/latest/module/CMakeFindDependencyMacro.html)
in package configuration files. I made this change for all instances.
Case mismatches between `Find<Package>.cmake` and calling
`find_package(<PACKAGE>`) also lead to warnings. Fixed for
`FindPASTIX.cmake` and `FindSCOTCH.cmake`.
`FindBLASEXT.cmake` was broken due to calling `find_package_handle_standard_args(BLAS ...)`.
The package name must match, otherwise the `find_package(BLASEXT)` falsely thinks
the package wasn't found. I changed to `BLASEXT`, but then also copied that value
to `BLAS_FOUND` for compatibility.
`FindPastix.cmake` had a typo that incorrectly added `PTSCOTCH` when looking for
the `SCOTCH` component.
`FindPTSCOTCH` incorrectly added `***-NOTFOUND` to include/library lists,
corrupting them. This led to cmake errors down-the-line.
Fixes#2288.
- Use standard types in SYCL/PacketMath.h to avoid compilation problems on Windows
- Add EIGEN_HAS_CONSTEXPR to cxx11_tensor_argmax_sycl.cpp to fix build problems on Windows
* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
* Adding get_tensor operation which was missing in TensorContractionMapper.
* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
* Fixing typo in Device SYCL
* Unifying load to private register for tall/skinny no shared
* Unifying load to vector tile for tensor-vector/vector-tensor operation
* Removing all the LHS/RHS class for extracting data from global
* Removing Outputfunction from TensorContractionSkinnyNoshared.
* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
* Making double buffering optional for Tensor contraction when local memory is version is used.
* Modifying benchmark to accept custom Reduction Sizes
* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
* Adding Test for SYCL
* Modifying SYCL CMake
Ancient versions of CMake required else(), endif(), and similar block
termination commands to have arguments matching the command starting the block.
This is no longer the preferred style.