Force-Field Development

The accuracy of molecular dynamics (MD) simulations is determined by the underlying empirical force field [1]. A classical force field involves hundreds of parameters, which must be chosen so that the force field approximates the true Hamiltonian of the system. Parametrization is typically based on properties calculated with higher-level theoretical approaches and/or on fitting to experimentally accessible properties. For most biomolecules, such as proteins, DNA, lipids, and sugars, only a relatively small number of building blocks is required, which makes parametrization and validation straightforward. The situation is different for small organic molecules. Because of the sheer size of chemical space and the lack of experimental data for individual compounds, the fast and accurate parametrization of interaction functions for small organic molecules is a long-standing problem in MD, in particular with respect to partial charges. Improving the accuracy and efficiency of the parametrization of organic molecules is therefore crucial for exploiting the strengths of MD simulations in computer-aided drug design, cheminformatics, and environmental chemistry. Our research aims at fundamental improvements in the speed and accuracy of the parametrization process.

[1] Riniker, J. Chem. Inf. Model. (2018), 58, 565.

Partial Charges and Higher Multipoles

In a first step, we developed machine-learning (ML) models (random forest regression) to predict atomic partial charges of organic molecules [2]. The models are based on a large database of high-quality electron densities for approx. 130'000 diverse lead-like compounds. To describe the atomic environment, we used atom-centered atom-pair fingerprints, i.e. a 2D description of the environment. By construction, the resulting partial charges therefore represent average charges for the molecule, independent of its conformation, which is important for use in a fixed-charge force field. When an implicit solvent model with a dielectric permittivity of 4 was used in the calculation of the electron densities, the partial charges extracted with the DDEC method were most compatible with existing force fields such as GAFF and OPLS.

[2] Bleiziffer et al., J. Chem. Inf. Model. (2018), 58, 579.
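To illustrate why a 2D, atom-centered description yields conformation-independent charges, the following minimal sketch builds a simplified atom-pair descriptor from the bond graph alone. It is an illustration under stated assumptions, not the fingerprint implementation used in [2]: the function names, the `max_dist` cutoff, and the encoding as (partner element, number of bonds) counts are all hypothetical choices made for this example.

```python
from collections import Counter, deque


def topological_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if nbr not in dist:
                dist[nbr] = dist[atom] + 1
                queue.append(nbr)
    return dist


def atom_centered_pairs(elements, bonds, max_dist=3):
    """Per-atom counts of (partner element, bond distance) pairs.

    Only the bond graph enters, never 3D coordinates, so the descriptor
    is identical for every conformation of the molecule.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    fingerprints = []
    for center in range(len(elements)):
        dist = topological_distances(adjacency, center)
        fingerprints.append(Counter(
            (elements[j], d) for j, d in dist.items() if 0 < d <= max_dist
        ))
    return fingerprints


# Ethanol, CH3-CH2-OH: atoms 0-2 are C, C, O; atoms 3-8 are hydrogens.
elements = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]
fps = atom_centered_pairs(elements, bonds)
```

In this sketch the three methyl hydrogens (atoms 3 to 5) receive identical descriptors, so any regression model trained on such features would assign them the same charge regardless of the geometry, which is the behavior a fixed-charge force field needs.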

One way to improve the accuracy of classical force fields is to include higher multipoles. However, these are expensive to calculate. We developed an equivariant graph neural network to predict multipoles up to quadrupoles based on the 3D conformation of the molecule [3]. The results for predicting electrostatic potentials showed that the error introduced by the ML model is much smaller than the error that comes from neglecting higher multipoles.

[3] Thürlemann et al., J. Chem. Theory Comput. (2022), 18, 1701.
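The effect of neglecting higher multipoles can be sketched with a toy calculation. The example below compares the exact electrostatic potential of three point charges with its multipole expansion truncated after the monopole, dipole, and quadrupole term. It is only an illustration: the water-like charges and geometry are invented values, the expansion is a single molecular expansion about the origin rather than the atomic multipoles predicted in [3], and units are chosen such that the Coulomb prefactor equals 1.

```python
import math


def exact_potential(charges, point):
    """Coulomb potential of point charges (prefactor 1/(4*pi*eps0) set to 1)."""
    return sum(q / math.dist(pos, point) for q, pos in charges)


def multipole_moments(charges):
    """Total charge, dipole vector and traceless quadrupole about the origin."""
    q_tot = sum(q for q, _ in charges)
    dipole = [sum(q * pos[i] for q, pos in charges) for i in range(3)]
    quad = [[sum(q * (3 * pos[i] * pos[j] - (i == j) * sum(c * c for c in pos))
                 for q, pos in charges)
             for j in range(3)] for i in range(3)]
    return q_tot, dipole, quad


def expansion_terms(charges, point):
    """Monopole, dipole and quadrupole contributions to the potential."""
    q_tot, dipole, quad = multipole_moments(charges)
    r = math.hypot(*point)
    v0 = q_tot / r
    v1 = sum(p * x for p, x in zip(dipole, point)) / r**3
    v2 = 0.5 * sum(quad[i][j] * point[i] * point[j]
                   for i in range(3) for j in range(3)) / r**5
    return v0, v1, v2


# Hypothetical water-like charge distribution (illustrative values only).
charges = [(-0.8, (0.0, 0.0, 0.0)),
           (0.4, (0.76, 0.59, 0.0)),
           (0.4, (-0.76, 0.59, 0.0))]
point = (6.0, 4.0, 2.0)

v_exact = exact_potential(charges, point)
v0, v1, v2 = expansion_terms(charges, point)
# Absolute error when truncating after the monopole, dipole, quadrupole.
errors = [abs(v_exact - v0), abs(v_exact - (v0 + v1)), abs(v_exact - (v0 + v1 + v2))]
```

For this neutral toy system the monopole term vanishes, and each additional multipole order reduces the error at the chosen point, which lies well outside the charge distribution.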

Regularization by Physics

Some of the limitations of existing force fields stem simply from the chosen functional form (e.g. fixed-charge versus polarizable force fields), but many inaccuracies are also introduced by the parametrization procedure. Because traditional parametrization requires considerable time and resources, it may be incomplete, and the choice of atom types and combination rules may introduce noise. Recent years have therefore seen increasing efforts to automate force-field parametrization or to replace force fields with ML potentials. We propose an alternative strategy to parametrize force fields [4], which makes use of ML and gradient-descent-based optimization while retaining a physics-based functional form. By using a predefined functional form, interpretability is retained, robustness is increased, and efficient simulations of large systems over long time scales remain possible. The approach has been demonstrated for a fixed-charge and a polarizable functional form, where the ML models were trained on ab initio potential-energy surfaces.

[4] Thürlemann et al., J. Chem. Theory Comput. (2023), 19, 562.
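The idea of optimizing the parameters of a fixed, physics-based functional form by gradient descent can be sketched in a few lines. The example below is a deliberately reduced stand-in, not the method of [4]: it fits the two parameters of a single harmonic bond term directly, whereas in [4] ML models are trained on ab initio potential-energy surfaces. The reference energies here are synthetic, generated from the same functional form with assumed values k = 2.0 and r0 = 1.0 in reduced units, and the learning rate and step count are arbitrary choices.

```python
def bond_energy(r, k, r0):
    """Harmonic bond term: E = 0.5 * k * (r - r0)**2 (reduced units)."""
    return 0.5 * k * (r - r0) ** 2


def loss_and_grad(params, distances, e_ref):
    """Mean-squared energy error and its analytic gradient w.r.t. (k, r0)."""
    k, r0 = params
    n = len(distances)
    loss = grad_k = grad_r0 = 0.0
    for r, e in zip(distances, e_ref):
        d = r - r0
        res = 0.5 * k * d * d - e                # model minus reference
        loss += res * res / n
        grad_k += 2.0 * res * 0.5 * d * d / n    # dE/dk  = 0.5 * d**2
        grad_r0 += 2.0 * res * (-k * d) / n      # dE/dr0 = -k * d
    return loss, (grad_k, grad_r0)


def fit(distances, e_ref, params=(1.0, 0.9), lr=1.0, steps=20000):
    """Plain gradient descent on the two force-field parameters."""
    for _ in range(steps):
        _, (gk, gr0) = loss_and_grad(params, distances, e_ref)
        params = (params[0] - lr * gk, params[1] - lr * gr0)
    return params


# Synthetic "reference" energies standing in for an ab initio energy scan.
distances = [0.7 + 0.05 * i for i in range(13)]
e_ref = [bond_energy(r, 2.0, 1.0) for r in distances]

initial_loss, _ = loss_and_grad((1.0, 0.9), distances, e_ref)
k_fit, r0_fit = fit(distances, e_ref)
final_loss, _ = loss_and_grad((k_fit, r0_fit), distances, e_ref)
```

Because the functional form is fixed, the fitted quantities remain physically interpretable as a force constant and an equilibrium distance, and the resulting model is as cheap to evaluate as any classical force-field term.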

 
