​Abstract:

Recent new interest in modelling the protein-solvent interactions by continuum methods lead to several improvements for calculating the solvent accessible surface area and its derivative [1]. One of the most efficient analytical method is a parametric approach based on the Gauss-Bonnet theorem. This approach has been implemented as molecular surface routine PARAREA [2] in the energy minimization and Monte Carlo simulation package FANTOM [3].

In this study we demonstrate that this approach can be further enhanced by avoiding the calculation of a relatively large number of potential intersection points in the Gauss-Bonnet path. PARAREA finds them by a straightforward test of all atom pairs in the neighbor list of each atom. Numerical tests showed that this part of the routine consumes a significant portion of the execution time. The new surface routine GETAREA efficiently finds the solvent exposed vertices of the Gauss-Bonnet path by calculating the intersection of half-spaces (IHS). These half-spaces are defined by the planes of two-sphere intersection. Geometric inversion conveniently transforms intersection planes into a set points in dual space. Convex hull of these points corresponds to the desired IHS [4]. Finally, the vertices of Gauss-Bonnet path are found by intersecting edges of IHS with the central atom sphere. Additionally, the list of neighbor atoms is obtained with the aid of a cubic lattice algorithm, as opposed to direct search in PARAREA.

Numerical tests with several conformations of the peptide Met-enkephalin and the medium-size protein tendamistat show the correctness of the area and gradient calculation. The CPU time spent in GETAREA has been significantly reduced by factors of 0.4 (Met-enkephalin) and 0.6 (tendamistat) as compared to PARAREA. With the improved performance of the method we are currently performing search for low energy conformations of peptides in solution and apply the improved version of FANTOM in studies of protein folding.



​References

1
W. Braun, ''Incorporation of Solvation Energy Contributions for Energy Refinement and Folding of Proteins'' in Computer Simulation of Biomolecular Systems, Vol.3 (van Gunsteren, W.; Weiner, P.; Wilkinson, T. Eds.), ESCOM, Leiden, 1996 (in press).
2
Ch. Mumenthaler and W. Braun, J. Mol. Mod., 1, 1 (1995).
3
B. von Freyberg and W. Braun, J. Comp. Chem., 14, 510 (1993).
4
F.P. Preparata and M.I. Shamos, Computational Geometry: An Introduction, Springer-Verlag, New York, 1985.




​Protein-Solvent Interaction Energy:

i
: atomic solvation parameter;
Ai
: solvent accessible surface area (ASA);

ASA and its derivatives can be calculated analytically from Gauss-Bonnet theorem:

p
: number of intersecting arcs defining ASA;
ri
: radius of the i-th sphere;

i n,n+1 : angle between tangential vectors;

ikn
: angle defining arc length;
in
: polar angle of the ik-th intersection circle;
aik
: radius of the ik-th intersection circle.



​New Vector Parametrization of Accessible Arcs

​Definitions:

xi : Cartesian coordinates of atom i;

xik = xk-xi ; dik = |xik| : interatomic distance vector and its length;

µik = xik/dik : unit vector along xik ;

gik = [(dik)2 + ri2 + rk2]/[2dik]: distance to the center of the ik-th intersection circle;

nikj = [µik×Pikj]/aik ; mikl = [µik×Qikl]/aik : tangential unit vectors;

Sikjl = sign[µik· (nikj×mikl)] : orientation of the tangential vectors.







































​Flowchart illustrating algorithmic calculation of accessible surface area and its partial derivatives

( denotes partial derivative with respect to the relative Cartesian coordinates)








Problem : find all accessible arcs = find all relevant intersection points P and Q

Solution implemented in routine PARAREA: find all vectors P, Q and eliminate those that are buried by other neighbor atoms.

Trouble : this algorithm is highly CPU-time consuming:




















New solution implemented in routine GETAREA: relevant P, Q can be determined from 3D half-space intersection. Simple 2D example illustrates this concept:

Two intersecting spheres, S and Ki, define a half-space Hi. Surface of S, S, intersected with Hi gives accessible surface, Ai (marked in blue), i.e., part of S that is not buried in Ki:

If sphere S has N neighbors, then its accessible surface is given by S crossed with intersection of all N half-spaces:

Half-space intersection is a polyhedron (called here the "IHS polyhedron") whose edges cross S at exactly the desired points P and Q.











In 3D the vertices of Gauss-Bonnet path are points where edges of the IHS polyhedron cross the sphere surface:









Finding the IHS polyhedron of N half-spaces from a convex hull using the power of computational geometry.

Some useful definitions:

Geometric inversion
in 3D Euclidean space is a point-to-point transformation of E3 which maps a point with spherical polar coordinates (R, , ) to the point with coordinates (1/R, , ).
Convex hull.
Given a finite subset L of points of E3, the convex hull of L, is the smallest convex polyhedron containing L.

Step 1: find distance vectors (red) to every halfspace boundary and transform them by geometric inversion (blue).

Step 2: find convex hull (blue) of transformed points. Find distance vectors (green) to the faces of convex hull.

Step 3: geometric inversion of green vectors yields vertices (black) of the IHS polyhedron (red).





​Performance of Surface Routines

Systems studied:
Met-enkephalin; 5 amino acid residues.
Tendamistat; 74 amino acid residues.
Method:
elapsed CPU time (as reported by SGI FORTRAN function DTIME) spent in conjugate gradient energy minimization.
Hardware:
Silicon Graphics Indigo2 with 195 MHz MIPS R10000 CPU.

Molecule

Met-enkephalin

Tendamistat

Routine

PARAREA

GETAREA

PARAREA

GETAREA

Total CPU time [s]

33.4

17.8

823

437

CPU time spent in surface routine [s]

26.0

11.1

573

182

Number of routine invocations

1302

1215

1349

1382

CPU time per invocation [s]

0.020

0.009

0.425

0.132

Ratio (P/G)

2.22

3.22






Application: folding studies of the pheromone Er-10 from Euplotes raikovi (38 amino acid residues).

  • Ten tertiary structures of Er-10 were solved by NMR spectroscopy.

  • Three-helix bundle (residues 2-8, 12-18 and 24-32) stabilized by three disulfide bridges.

  • Energy minimization calculations were performed with and without protein-solvent interactions. Apolar solvation parameters were used ( i = 1 kcal/mol/Å2 for C and S; = 0 kcal/mol/Å2 for other atoms).

  • Helical residues constrained to -72° < < -42° and -62° < < -32°. S-S bridges not included.

  • All 10 unrefined NMR structures and 25 random structures used as starting geometries.

  • RMSD values calculated for backbone atoms. Unrefined NMR structure 1 used as reference.






​Conclusions:

  • Analytical calculation of the solvent accessible surface areas and their derivatives (ASAD) has been simplified due to the new vector parametrization.

  • CPU time spent on ASAD calculations can be greatly reduced by employing new algorithm of finding the vertices of Gauss-Bonnet path by calculating the intersection of half-spaces.

  • Protein-solvent interaction is an extraordinarily important factor in computational studies of protein folding. Energy minimization without solvent is successful in mathematical sense, but it does not allow one to find the native structure. On the other hand, inclusion of solvent smoothes low energy local minima for unfolded structures and therefore drives structure minimization towards a folded state.