* Support kernel files with extensions of any length (like .hip)
* Do not replace symbols in single-line comments
* Add OPENMM_BUILD_COMMON CMake option
This allows the common platform files to be built and installed even if
the CUDA or OpenCL platforms are not built.
This is required for the HIP platform (openmm-hip) if the ROCm OpenCL
packages are not installed.
* Add an option to install the Python wrapper into user packages
OPENMM_PYTHON_USER_INSTALL is OFF by default.
* Support FFT backends in Amoeba plugin
The HIP platform supports FFT backends. This commit moves
findLegalFFTDimension to ComputeContext so that platforms can provide
their own implementations.
* Make the common platform compatible with the new HIP platform
* Do not use volatile with private and local AtomData parameters on HIP
With volatile, the generated code is not optimal: for example, the
compiler generates flat_load instructions instead of ds_read.
* Tune launch bounds for PME grid-related kernels and add workaround for RDNA
Force the compiler to use all registers for gridSpreadCharge and
gridInterpolateForce by limiting max waves per EU to 1 on CDNA GPUs.
RDNA GPUs work better without this limit.
* Optimize atom data structs in GBSA and Amoeba on HIP
Manually rearrange fields, add padding, and force alignment for
faster shared-memory access: ds_read and ds_write may be slower
if addresses are not aligned to 16 bytes.
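
The padding and alignment change for atom data structs can be illustrated
with a minimal sketch. The struct names and fields below are hypothetical
and are not the actual OpenMM AtomData definitions. The sketch assumes a
4-byte float and only shows why explicit padding plus forced alignment keeps
every array element on a 16-byte boundary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical atom record with five floats: 20 bytes, 4-byte alignment.
// In an array, most elements start at addresses that are not multiples of 16.
struct AtomDataUnpadded {
    float x, y, z;  // position
    float q;        // charge
    float radius;   // per-atom parameter (illustrative only)
};

// Same fields, explicitly padded to 32 bytes and forced to 16-byte alignment,
// so every element of an array starts on a 16-byte boundary.
struct alignas(16) AtomDataPadded {
    float x, y, z;
    float q;
    float radius;
    float padding[3];  // explicit padding up to a multiple of 16 bytes
};

// Byte offset of element i in a contiguous array of T.
template <typename T>
constexpr std::size_t elementOffset(std::size_t i) {
    return i * sizeof(T);
}

static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");
static_assert(alignof(AtomDataPadded) == 16, "forced alignment");
static_assert(sizeof(AtomDataPadded) % 16 == 0, "size is a multiple of 16");
```

In this sketch the unpadded layout places element 1 at byte offset 20, which
is not 16-byte aligned, while every padded element starts at a multiple of 32.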
Co-authored-by: Anton Gorenko <anton@streamhpc.com>
Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
* Began creating common compute framework to unify code between CUDA and OpenCL
* Began OpenCL implementation of common compute framework
* Common implementation of CMMotionRemover
* CUDA implementation of common compute interface
* Converted HarmonicBondForce to common compute API
* Converted standard bonded forces to common compute API
* Converted ExpressionUtilities to common compute API
* Created ComputeParameterSet
* Converted custom bonded forces to common compute API
* Converted CustomCentroidBondForce to common compute API
* Converted CustomManyParticleForce to common compute API
* Moved lots of duplicate code from CudaContext and OpenCLContext to ComputeContext
* Converted GayBerneForce to common compute API
* Removed obsolete kernels
* Converted verlet integrators to common compute API
* Converted Langevin and Brownian integrators to common compute API
* Converted CustomIntegrator to common compute API
* Converted CustomNonbondedForce to common compute API
* Removed uses of a deprecated API
* Fixed failing test cases
* Converted GBSAOBCForce to common compute API
* Began converting CustomGBForce to common compute API
* Finished converting CustomGBForce to common compute API
* Merged duplicated code in CudaIntegrationUtilities and OpenCLIntegrationUtilities
* Converted RMSDForce and AndersenThermostat to common compute API
* Converted CustomHbondForce to common compute API
* Merged scripts for encoding kernel sources
* Converted Drude plugin to common compute API
* Fixed errors in CMake scripts
* Attempt at fixing errors on Windows
* Added discussion of common compute API to developer guide
* Added Windows export macro for common classes
* Fixed error in CMMotionRemover
* Updated Travis to newer Ubuntu version
* Fixed errors on CPU OpenCL
* Fixed Windows linking errors
* Added missing pragma for 32 bit atomics
* Replaced long long with mm_long
* More fixes to Windows linking
* Bug fix