* CPU: Refactored code to be generic across vector CPU platforms.
Ewald and non-Ewald interactions now share a common code base, templated on
their interaction type.
The vec4 and vec8 implementations have been replaced by a single generic implementation
class which is templated on the SIMD type. It currently works for SIMD4 and SIMD8 types, but
can be extended in future to support wider types (e.g., for AVX-512).
Modified runtime CPU support to lay groundwork for future SIMD types.
Pulled out some vector utility functions (gather pair, reduce),
and refactored the AVX CPU code to make use of them.
* CPU: Fixed coding-standard violations and an incorrect header include.
* CPU: Addressed code review comments from PR #2661.
* CPU: Fixed CI build issues.
* CPU: Further CI fixes.
* CPU: Fix for unit test failure on macOS.
Reverted the optimised code to an earlier version which is believed to work
on macOS. The optimisation will be reapplied in future once the failure is
better understood.
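
For readers unfamiliar with the pattern described in the first entry, the following is a minimal sketch of a kernel templated on both the SIMD type and the interaction type, using shared gather and reduce helpers. It is not the code from this change: `SimdVec`, `gather`, `reduceAdd`, `NoEwald`, `Ewald`, `blockEnergy` and the value of `alpha` are all hypothetical names chosen for illustration, and the "SIMD" type is a portable array stand-in rather than real intrinsics.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical portable stand-in for a SIMD vector of N floats.
// A real implementation would wrap SSE/AVX registers instead.
template <std::size_t N>
struct SimdVec {
    static constexpr std::size_t width = N;
    std::array<float, N> lane{};
};

using Simd4 = SimdVec<4>;  // stand-in for a 4-wide type
using Simd8 = SimdVec<8>;  // stand-in for an 8-wide type

// Utility: load one value per lane from arbitrary table positions.
template <std::size_t N>
SimdVec<N> gather(const float* table, const std::array<int, N>& index) {
    SimdVec<N> out;
    for (std::size_t i = 0; i < N; ++i)
        out.lane[i] = table[index[i]];
    return out;
}

// Utility: horizontal sum of all lanes.
template <std::size_t N>
float reduceAdd(const SimdVec<N>& v) {
    float sum = 0.0f;
    for (float x : v.lane)
        sum += x;
    return sum;
}

// Interaction policies: the only place where the two variants differ.
struct NoEwald {
    static float screen(float) { return 1.0f; }  // plain 1/r term
};
struct Ewald {
    static constexpr float alpha = 0.3f;  // hypothetical screening parameter
    static float screen(float r) { return std::erfc(alpha * r); }
};

// One kernel body shared by every (SIMD width, interaction type) pair.
// dist[j] is assumed to hold the distance from the current atom to atom j.
template <typename Simd, typename Interaction>
float blockEnergy(float qi, const float* charge, const float* dist,
                  const std::array<int, Simd::width>& neighbor) {
    Simd q = gather<Simd::width>(charge, neighbor);
    Simd r = gather<Simd::width>(dist, neighbor);
    Simd e;
    for (std::size_t k = 0; k < Simd::width; ++k)
        e.lane[k] = qi * q.lane[k] * Interaction::screen(r.lane[k]) / r.lane[k];
    return reduceAdd(e);
}
```

With this shape, supporting a wider vector type would only require a new `SimdVec`-like type plus its gather and reduce helpers; the kernel body itself stays unchanged, e.g. `blockEnergy<Simd8, Ewald>(qi, charge, dist, neighbors)`.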