Introduction to molecular dynamics
In order to propagate a molecular system using the above equations there are three typical stages
For more information on these the DISCOVER manual is also recommended reading. The section on minimization is particularly helpful with a concrete example that illustrates the methods.
Minimization
Using the forcefield that has been assigned to the atoms in the system it is essential to find a stable point or a minimum on the potential energy surface in order to begin dynamics. At a minimum on the potential energy surface the net force on each atom vanishes. There will be more than one minimum for a polymer, biopolymer, or a liquid under periodic boundary conditions. In principle there may be a global minimum, but this will not likely be found without a conformational search.
Minimization provides information that is complementary to molecular dynamics. Ensembles of structures are useful for calculating thermodynamic averages and estimating entropy, but the large number of structures makes detailed microscopic analysis difficult. Minimized structures represent the underlying configurations about which fluctuations occur during dynamics. The use of a forcefield to define structure is often called molecular mechanics.
Constraints may be imposed during minimization, as well as during dynamics. These constraints may be based on data such as NOEs from an NMR experiment or they may be imposed by a template such that we force a ligand to find the minimum closest in structure to a target molecule. The area of template forcing is also important for homology modeling. Since it is not possible at present to fold a protein by energy minimization, one can approach the question of determining the fold of a protein by comparing with a structure that has significant amino acid sequence homology. Given the amount of data generated by the Human Genome Project as well as the extensive databases that contain protein sequences (e.g. SwissProt) this is a growing area for research.
To minimize we need a function (provided by the forcefield) and a starting guess or set of coordinates. The magnitude of the first derivative can be used to determine the direction and magnitude of a step (i.e. change in the coordinates) required to approach a minimum configuration. The magnitude of the first derivative is also a rigorous way to characterize convergence. A minimum has converged when the derivatives are close to zero. The typical tolerance in the MSI program suite is 0.001 kcal/mole. To reach the minimum the structure must be successively updated by changing the coordinates (taking a step) and checking for convergence. Each complete cycle of differentiation and stepping is known as a minimization iteration. Typically thousands of minimiation iterations are required for large macromolecules to reach convergence.
There are three major protocols for minimization:
The efficiency of a minimize can be judged by both the number of iteractions required to converge and the number function evaluations needed per iteration.
Steepest descent
The steepest descent method uses the first derivative to determine the direction towards the minimum. It is not particularly efficient because it must be combined with a line search to determine the step size. The line search uses the direction vector obtained from the first derivative of the potential function to find the optimium step size along this vector direction. Once this local minimum along the direction of the derivative is found the step can be taken. The next derivative will be orthogonal to the first. A line search is a requires several function evaluations, however, in order to determine the optimum step size. This technique is robust and is used to minimize initially when the structure is far from the minimum configuration.
Conjugate gradient and Newton-Raphson
More efficient minimization can be obtained using conjugate gradients or the Newton-Raphson algorithms. The conjugate gradient technique uses information from previous first derivatives to determine the optimum direction for a line search. The Newton-Raphson method uses the second derivatives as well as the first derivatives. In addition to using the gradient information, Newton-Raphson uses the curvature to predict where along the gradient the function will change direction. Storing and manipulating the second derivative matrix is prohibitive for large systems (DISCOVER has a maximum of 200 atoms allowed for Newton-Raphson).
Equilibration
Molecular dynamics solves the equations of motion for a system of atoms. The solution for the equations of motion of a molecule represents the time evolution of the molecular motions, the trajectory. Depending on the temperature at which a simulation is run MD allows barrier crossing and exploration of multiple configurations. In order to initiate MD we need to assign velocities initially. This is done using a random number generator using the constraint of the Maxwell-Boltzmann distribution. The temperature is defined by the average kinetic energy of the system according to the kinetic theory of gases. The internal energy of the system is U = 3/2 NkT. The kinetic energy is U = 1/2 Nmv2. By averaging over the velocities of all of the atoms in the system the temperature can be estimated. It is assumed that once an initial set of velocities has been generated the Maxwell-Boltzmann distribution will be maintained throughout the simulation.
Following minimization we can consider the temperature as being essential zero Kelvin. To initialize dynamics the system must be brought up to the temperature of interest. This is done by assigning velocities at some low temperature and then running dynamics according to the equations of motion (see below and finite methods). After a number of iterations of dynamics the temperature is scaled upwards. The most common means of temperature scaling is velocity scaling. Since the velocity for each atom is distributed about an average of v = (3kT/m)1/2 we can multiply all of the velocities by a common factor to obtain a new temperature. This is done systematically during the equilibration (initialization) stage. Given a typical time step of 1 fs equilibration is run for at least 5 ps (5000 time steps) and often for 10 or 20 ps.
Dynamics
The dynamics stage is the stage of interest for determining thermodynamic averages or sampling new configurations. The stage used for these applications is sometimes known as production dynamics. Molecular dynamics solves Newton's equations of motion Fi = miai, where Fi is the force, mi is the mass, and ai is the acceleration of atom i. The force on atom I can be computed directly from the derivative of the potential energy V with respect to the coordinates ri, Fi = -
dV/dri.mi
d2ri/dt2 = -dV/driAnalytical solutions of the equations of motion are possible only for two particles. Large systems require numerical methods (see Finite difference methods).