Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2880819

Formats

Article sections

- Abstract
- 1. Introduction
- 2. Theoretical Background
- 3. Code Structure and Implementation
- 4. Test Run Description
- 5. Conclusion
- References

Authors

Related links

Comput Phys Commun. Author manuscript; available in PMC 2010 December 1.

Published in final edited form as:

Comput Phys Commun. 2010 June 1; 181(6): 1150–1160.

doi: 10.1016/j.cpc.2010.02.015PMCID: PMC2880819

NIHMSID: NIHMS200589

See other articles in PMC that cite the published article.

A Fortran program package is introduced for rapid evaluation of the electrostatic potentials and forces in biomolecular systems modeled by the linearized Poisson-Boltzmann equation. The numerical solver utilizes a well-conditioned boundary integral equation (BIE) formulation, a node-patch discretization scheme, a Krylov subspace iterative solver package with reverse communication protocols, and an adaptive new version of fast multipole method in which the exponential expansions are used to diagonalize the multipole to local translations. The program and its full description, as well as several closely related libraries and utility tools are available at http://lsec.cc.ac.cn/lubz/afmpb.html and a mirror site at http://mccammon.ucsd.edu/. This paper is a brief summary of the program: the algorithms, the implementation and the usage.

In the past thirty years, the Poisson-Boltzmann (PB) continuum electrostatic model has been widely accepted as a tool in theoretical studies of interactions of biomolecules such as proteins and DNAs in aqueous solutions. Recent work includes the introduction of more physical boundary conditions so that PB and its linearized version become valid for a wider class of molecular systems [38].

In this paper, we describe an adaptive fast multipole Poisson-Boltzmann (AFMPB) solver for solving the linearized PB equation, and thus for elucidating the electrostatic role in many biological processes, such as enzymatic catalysis, molecular recognition and bio-regulation. Indeed, quite a few PB solvers are available in biochemistry and biophysical communities, including DelPhi, GRASP, MEAD, UHBD and PBEQ based on the finite difference formulation [10], and APBS (Adaptive Poisson-Boltzmann Solver) based on the finite volume/multigrid framework [11, 22]. In the linearized PB regime, algorithms using the boundary integral equation (BIE) approach have shown great promise for their efficiency on scaling and memory requirements [13, 25, 28]: when Green’s theorem and potential theory are applied, the linearized PB equation can be recasted into a set of boundary integral equations where the unknowns are only de-fined on the surface of the molecule. Therefore, the number of unknowns is reduced when compared with the volumetric discretization in finite difference and finite element methods.

This AFMPB solver reflects our effort on developing more efficient codes using the BIE approach for the linearized PB equation, currently on single processor computing architectures. Several techniques are used in AFMPB to improve its efficiency over existing BIE-based solvers. First, a well-conditioned BIE formulation is used so that the number of iterations in the Krylov subspace methods is bounded, independent of the number of unknowns in the system. Second, a node-patch scheme is applied to discretize the resulting BIE, and the node based scheme reduces the number of unknowns defined on the molecular surface compared with commonly used “constant element” discretizations. Further, the iterative Krylov subspace methods from the SPARSKIT package [7] are applied to the resulting linear systems with simplified calling interface with the use of the reverse communication protocols. Fourth, the adaptive new versions of the fast multipole methods (FMMs) from FMMSuite [4] are applied to the convolution type discretized integrals. Finally, interface computer programs are provided to couple AFMPB with many existing mesh generating (e.g. *MSMS* [6] or other programs that can generate *OFF* format type of mesh) and visualization tools (e.g. VMD [5]). Preliminary numerical experiments show that the new AFMPB solver achieves significant speedup and memory savings for calculating the electrostatics of large-scale biomolecular systems on single-processor personal computers. We are currently parallelizing the codes on multi-core/multi-processor computer architectures toward simulating the dynamics of complex biomolecular systems under the influence of electrostatic forces derived from the PB calculation.

This paper is organized as follows. In Section 2, we describe various numerical techniques used in the AFMPB solver, including the boundary integral equation formulation for the linearized PB equation, the node-patch discretization, the Krylov subspace methods, and the adaptive new version of FMM. In Section 3, the overall structure of the codes is discussed, in particular, how it can be coupled with existing tools for pre- and post-processing. In Section 4, test runs are described to illustrate the performance of the solver. Finally, in the Appendix, we briefly describe the units and several important parameters used in our codes, and provide a sample shell script file for running the package.

The electrostatic force is considered an important factor in understanding the interactions and dynamics of molecular systems in solution. One commonly used continuum model for describing the electrostatic effects of the solvent outside the molecules is the Poisson-Boltzmann (PB) equation

$$-\nabla \xb7(\epsilon \nabla \phi )+{k}^{2}sinh(\phi )=\sum _{i=1}^{M}\delta (r-{r}_{i}),$$

(1)

and the Poisson equation is used for the inside of the molecules. When electrostatic potentials are small, the linearized PB (LPB) equation

$$-\nabla \xb7(\epsilon \nabla \phi )+{k}^{2}\phi =\sum _{i=1}^{M}\delta (r-{r}_{i})$$

(2)

is valid. The LPB equation can be generalized to a wider class of problems when proper interface boundary conditions are given [38].

In numerical simulations of large molecular systems, efficient solution of the Poisson-Boltzmann equation is required, for direct visualization of electrostatic potentials, for calculating the energetics of molecular interactions in solution, and more importantly, for understanding the dynamics of molecular systems in which the PBE is coupled with time evolution equations including the continuum diffusion equations and the Newton’s law of motion.

There are several ways to reformulate the LPB equation as “boundary integral equations”. When Green’s second identity is applied directly, the traditional boundary integral equations for the linearized PB equation for a single domain (molecule) takes the form

$$\begin{array}{l}\frac{1}{2}{\phi}_{p}^{\mathit{int}}={\oint}_{S}[{G}_{pt}{\scriptstyle \frac{\partial {\phi}_{t}^{\mathit{int}}}{\partial n}}-{\scriptstyle \frac{\partial {G}_{pt}}{\partial n}}{\phi}_{t}^{\mathit{int}}]d{S}_{t}+{\scriptstyle \frac{1}{{D}_{\mathit{int}}}}{\sum}_{k}{q}_{k}{G}_{pk},\phantom{\rule{0.38889em}{0ex}}p\in S,\\ \frac{1}{2}{\phi}_{p}^{\mathit{ext}}={\oint}_{S}[-{u}_{pt}{\scriptstyle \frac{\partial {\phi}_{t}^{\mathit{ext}}}{\partial n}}-{\scriptstyle \frac{\partial {u}_{pt}}{\partial n}}{\phi}_{t}^{\mathit{ext}}]d{S}_{t},\phantom{\rule{0.38889em}{0ex}}p\in S,\end{array}$$

(3)

with the interface conditions ^{int} = ^{ext} and
${D}_{int}{\scriptstyle \frac{\partial {\phi}^{int}}{\partial n}}={D}_{\text{ext}}{\scriptstyle \frac{\partial {\phi}^{\text{ext}}}{\partial n}}$. In the formulas,
${\phi}_{p}^{\mathit{int}}$ is the interior potential at surface position *p* of the molecular domain Ω, *S* = Ω is its boundary, i.e., solvent-accessible surface,
${\phi}_{p}^{\mathit{ext}}$ is the exterior potential at position *p*, *D _{int}* is the interior dielectric constant,

In our AFMPB solver, instead of simply accepting this natural BIE formulation, we use the technique in [33] and combine the single and double layer potentials to derive a second kind Fredholm integral equation as in (with notations *f* = * ^{ext}*,
$h={\scriptstyle \frac{\partial {\phi}^{\mathit{ext}}}{\partial n}}$)

$$\begin{array}{l}({\scriptstyle \frac{1}{2\epsilon}}+{\scriptstyle \frac{1}{2}}){f}_{p}={\oint}_{S}[({G}_{pt}-{u}_{pt}){h}_{t}-({\scriptstyle \frac{1}{\epsilon}}{\scriptstyle \frac{\partial {G}_{pt}}{\partial n}}-{\scriptstyle \frac{\partial {u}_{pt}}{\partial n}}){f}_{t}]d{S}_{t}+{\scriptstyle \frac{1}{{D}_{\mathit{ext}}}}{\sum}_{k}{q}_{k}{G}_{pk},\\ ({\scriptstyle \frac{1}{2}}+{\scriptstyle \frac{1}{2\epsilon}}){h}_{p}={\oint}_{S}[({\scriptstyle \frac{\partial {G}_{pt}}{\partial {n}_{0}}}-{\scriptstyle \frac{1}{\epsilon}}{\scriptstyle \frac{\partial {u}_{pt}}{\partial {n}_{0}}}){h}_{t}-{\scriptstyle \frac{1}{\epsilon}}({\scriptstyle \frac{{\partial}^{2}{G}_{pt}}{\partial {n}_{0}\partial n}}-{\scriptstyle \frac{{\partial}^{2}{u}_{pt}}{\partial {n}_{0}\partial n}}){f}_{t}]d{S}_{t}+{\scriptstyle \frac{1}{{D}_{\mathit{ext}}}}{\sum}_{k}{q}_{k}{\scriptstyle \frac{\partial {G}_{pk}}{\partial {n}_{0}}},\end{array}$$

(4)

where *n*_{0} is the unit normal vector at point *p* *S*. This is a well-conditioned (Fredholm second kind) system of equations. When Krylov subspace methods are applied to such systems, the number of iterations remains bounded even for a large number of unknowns. Similar formulations are also used in [24] where a complete boundary integral equation formulation is derived for the linearized PB equation using a limiting process to avoid the singularity problem, and in later boundary element method (BEM) based PB work [13, 27, 36].

To discretize Eqs. (4), one commonly used technique in engineering community is the “constant element” approach that triangularizes the molecular surface and treats the unknowns as constants on each triangular element. In this approach, the number of unknowns equals to the number of elements. Alternatively, the “linear element” method linearly interpolates the solution using the three constituent nodes, therefore the number of unknowns equals to the number of nodes. As there are less number of nodes than surface elements, the linear element approach leads to a reduction of the total number of unknowns by approximately a factor of 2. However, the node-based linear element approach introduces additional complexity in its implementation. In the AFMPB solver, we develop a “node-patch” approach to take advantage of the easy implementation of the constant element method and the reduced number of unknowns in the linear element method.

The idea of the node-patch approach is to construct a “working” patch around each node instead of directly using the facet patch (element), and assume that the unknowns are constants on each new “node-patch”. These new patches are illustrated in Figure 1, in which a “node-patch” is constructed around the *i _{th}* node that has five neighboring elements. The new patch is defined by the area encircled by a set of points {

“Node” patch constructed on a triangular mesh. *O* and *n* are the centroid and normal vector of an element respectively, and *C* is the middle point of an edge.

There are several advantages of this “node-patch” approach. First, because of the reduction of the total number of unknowns when compared to the constant element method, the computational cost of solving the resulting linear system is accordingly reduced. The “node-patch” method does involve an additional preprocessing step for deriving the average charge on each patch. This, however, only constitutes a negligible portion of the total computation time, and the processed geometric coefficients can also be saved for repeated use in iterative solving procedures. Second, relative to the linear element method, the “node-patch” method is significantly more efficient in searching and indexing the local list when used with practical matrix storage formats such as the Harwell-Boeing sparse matrix format or modified sparse column (row) format. Finally, in the “node-patch” method, as in the constant element method, the source and target are the same - the nodes, making the application of FMM from the FMMSuite package straightforward.

Given an initial iterate *x*_{0}, the Krylov subspace method solves the linear system *Ax* = *b* by iteratively minimizing some measure of error over the space *x*_{0} + *K _{k}*, where

An interesting feature of the iterative solvers in SPARSKIT is the so-called “reverse communication protocol” (check the ITSOL directory in SPARSKIT [7]), which avoids calling the matrix vector product subroutine from inside the Krylov solver. Instead, the Krylov solver provides a vector and asks for the matrix vector multiplication result as a further input. Therefore, it is unnecessary to pass the parameters in the FMM subroutines to the Krylov solvers, leading to easier interface between FMM and SPARSKIT, and easier memory management.

As the total number of operations required to solve Eq. (4) is a constant (representing the number of iterations) times the amount of work required for a matrix vector multiplication, when the new versions of fast multipole methods (FMMs) are applied, the matrix vector product only requires O(N) operations, therefore the linear equation system can be solved in asymptotically optimal O(N) time.

The fundamental observation in the multipole expansion based methods is that the numerical rank of the far field interactions is relatively low and hence can be approximated by *P* terms (depending on the prescribed accuracy) of the so-called “multipole expansion”. For the Coulomb interactions (Poisson Equation),

$$\mathrm{\Phi}(R,\theta ,\phi )=\sum _{i=1}^{N}{q}_{i}\xb7\frac{1}{\mid \overrightarrow{R}-{\overrightarrow{\rho}}_{i}\mid}\approx \sum _{n=0}^{P}\sum _{m=-n}^{m=n}{M}_{n}^{m}\frac{{Y}_{n}^{m}(\theta ,\phi )}{{R}^{n+1}}$$

(5)

where the multipole coefficients are computed by

$${M}_{n}^{m}=8\sum _{i=1}^{N}{q}_{i}\xb7{Y}_{n}^{-m}({\alpha}_{i},{\beta}_{i}).$$

(6)

The spherical harmonic function of order *n* and degree *m* is defined according to the formula[8]

$${Y}_{n}^{m}(\theta ,\phi )=\sqrt{\frac{(2n+1)(n-\mid m\mid )!}{4\pi (n+\mid m\mid )!}}\xb7{P}_{n}^{\mid m\mid}(\mathit{cos}\theta ){e}^{im\phi},$$

(7)

and ${P}_{n}^{m}$ is the associated Legendre polynomial. For the Debye-Hückel (screened Coulombic) interaction, a similar expansion can be written as

$$\mathrm{\Phi}(R,\theta ,\phi )=\sum _{i=1}^{N}{q}_{i}\xb7\frac{{e}^{-\kappa \mid \overrightarrow{R}-{\overrightarrow{\rho}}_{i}\mid}}{\mid \overrightarrow{R}-{\overrightarrow{\rho}}_{i}\mid}\approx \sum _{n=0}^{P}\sum _{m=-n}^{m=n}{M}_{n}^{m}\xb7{k}_{n}(\kappa R)\xb7{Y}_{n}^{m}(\theta ,\phi ).$$

(8)

The multipole coefficients are given by

$${M}_{n}^{m}=8\kappa \sum _{i=1}^{N}{q}_{i}\xb7{i}_{n}(\kappa {\rho}_{i})\xb7{Y}_{n}^{-m}({\alpha}_{i},{\beta}_{i}),$$

(9)

where *i _{n}*(

For arbitrary distributions of particles, a hierarchical oct-tree (in 3D) is generated so each particle is associated with boxes at different levels, and a divide-and-conquer strategy is applied to account for the far field interactions at each level in the tree structure. In the “tree code” developed by Appel[9], and Barnes and Hut [12], as each particle interacts with 189 boxes in its “interaction list” through *P* terms of multipole expansions at each level and there are *O*(log*N*) levels, the total amount of operations is approximately 189*P*^{2}*N* log*N*. The tree code was later improved by Greengard and Rokhlin in 1987 [19]. In their original FMM, local expansions (under a different coordinate system) were introduced to accumulate information from the multipole expansions in the interaction list,

$$\mathrm{\Phi}(R,\theta ,\phi )=\sum _{i=1}^{N}{q}_{i}\xb7\frac{1}{\mid \overrightarrow{R}-{\overrightarrow{\rho}}_{i}\mid}\approx \sum _{n=0}^{P}\sum _{m=-n}^{m=n}{L}_{n}^{m}\xb7{R}^{n}{Y}_{n}^{m}(\theta ,\phi )$$

(10)

where ${L}_{n}^{m}$ are the local expansion coefficients, and for the screened Coulombic interaction, a similar expansion can be written as

$$\mathrm{\Phi}(R,\theta ,\phi )=\sum _{i=1}^{N}{q}_{i}\xb7\frac{{e}^{-\kappa \mid \overrightarrow{R}-{\overrightarrow{\rho}}_{i}\mid}}{\mid \overrightarrow{R}-{\overrightarrow{\rho}}_{i}\mid}\approx \sum _{n=0}^{P}\sum _{m=-n}^{m=n}{L}_{n}^{m}\xb7{i}_{n}(\kappa R)\xb7{Y}_{n}^{m}(\theta ,\phi ).$$

(11)

As the particles only interact with boxes and other particles at the finest level, and information at higher levels is transferred using a combination of multipole and local expansions, the original FMM is asymptotically optimal *O*(*N*). However, because the multipole to local translation requires prohibitive 189*P*^{4} operations for each box, the huge prefactor makes the original FMM less competitive with the tree code and other FFT based methods. In 1997, a new version of FMM was introduced by Greengard and Rokhlin [20] for the Laplace equation, in which exponential expansions are introduced to diagonalize the multipole-to-local translations, and a merge-and-shift technique is used to further reduce the number of such translations. Compared with the original FMM, the original 189*P*^{4} operations were reduced to 40*P*^{2} + 6*P*^{3} in the new version. Numerical experiments show that the new version of FMM breaks even with direct calculation when the number of particles *n* = 500 for three digit accuracy, and *n* = 1000 for six digits for the screened Coulomb interactions. In our AFMPB solver, the new version of FMM was implemented for the Laplace and linearized PB equations for the efficient calculation of electrostatic interactions. As far as we know, this is the only LPB solver using the new versions of FMMs.

Instead of discussing technical details of the adaptive new version of FMM, we refer the readers to existing literature. The new version of FMM was introduced in [20] for the Laplace equation; the corresponding adaptive version was discussed in [14]; the new version of FMM for the linearized PB equation (also called Yukawa or modified Helmholtz equation) was discussed in [18, 23].

For an interacting molecular system, AFMPB can also calculate the total force and torque acting on each molecule. The force is obtained through computing the stress tensor in ionic solution[17]

$${T}_{ij}={D}_{\text{ext}}{E}_{i}{E}_{j}-\frac{1}{2}{D}_{\text{ext}}{E}^{2}{\delta}_{ij}-\frac{1}{2}{D}_{\text{ext}}{\kappa}^{2}{\phi}^{2}{\delta}_{ij},$$

(12)

where *E* is the electrostatic field, δ* _{i j}* is the Kronecker delta function.

To obtain the boundary stress tensor, the gradient of the potential (the negative of *E* on the boundary) should be computed. In our current implementation, we first use an interpolation method to construct a continuous potential function *f* (*x*,*y*,*z*) in the vicinity of the molecular surface based on the PB solutions, then compute the gradient of *f* for any positions on the molecular surface. For more details on the interpolation method and the force calculations, we refer interested readers to [30].

Finally, the PB force *F* and torque *M* acting on any constituent molecule are calculated by integrations

$$F={\int}_{S}T(x)\xb7dS(x),$$

(13)

$$M={\int}_{S}{r}_{c}(x)\times [T(x)\xb7dS(x)],$$

(14)

where *r _{c}*(

In this section, we describe the package portability and installation, flow chart, job running, file format and I/O layers, pre- and post-processing, and several related tools.

The software package was implemented mostly in Fortran77. Memory allocation features from Fortran 90 and above are used in a few subroutines for memory management.

After the user downloads the package and extracts it to a local computer, the following directories can be found:

- Doc: contains the references and this manual.
- job-examples: contains a README file, samples of the c-shell job files
*job.csh*to run the package, and the generated input/output files. A subdirectory ./Two-proteins contains similar input/output files but is for the test case of a two-molecule system. - FBEM: the driver and the boundary integral equation codes of the AFMPB package. It also contains the subroutine iters.f, the Krylov subspace iterative solvers from SPARSKIT [7]. The default Krylov solver is GMRES, though the user can also use the restarted GMRES, TFQMR, or BiCGStab.
- fmm3d_node: the fast multipole method library for the Laplace and Yukawa potentials.
- tools: pre- and post- processing utility programs. Additional tools will be added to this directory for calculation of different parameters.

For compiling, check the file **Makefile** for further information. The package has been successfully compiled using the Intel©compiler for Linux, and the GNU©F95 compiler.

The program diagram is shown in Fig. 2.

AFMPB can be considered as a main driver for calculating the electrostatics in biomolecular systems. It requires molecular structure, charge and mesh information from the pre-processing tools as inputs, and outputs potential, various energies and force results for visualization and/or further processing. To generate the “boundary elements”, AFMPB currently accepts mesh data from the package *MSMS* [6] and other programs that can generate molecular surface mesh in the *OFF* format. For postprocessing, we use VMD for visualization. The solver can also be coupled with multi-scale time stepping schemes to simulate the dynamics of molecular systems under the influence of electrostatic interactions.

A command line style is adopted to run AFMPB. A simple way to run AFMPB is ”*./afmpb*”, with all the default input/output files in the working directory. A typical job with specified options can be like

afmpb -i inp.dat -o out.log -s surfpot.dat.

*-i* means to be followed by the input file, *-o* an output log file, and *-s* is followed by an output file that contains surface nodal potential. More options see Appendix C.

To run AFMPB, users are required to prepare input files including (a) a control file (*inp.dat* by default), (b) *PQR* file(s) containing atomic charges and their locations, (c) molecular surface mesh file(s), and an optional (d) off-surface points location file where the potentials need to be evaluated. For the output, a log file (*out.log* by default) will be generated automatically after every AFMPB run, and users may also request a surface potential file (default *surfpot.dat*) by setting a write-control parameter in *inp.dat*. All the input files and output log file (*out.log*) are in free format. The job-control input file can also be generated by the job shell script as described in Appendix C, in which a brief explanation for each variable is provided.

The PQR format is a modification of the PDB format which adds charge and radius parameters to existing PDB file. The surface mesh file contains node coordinates and how they connect to form triangles. The current version of AFMPB allows a few slightly different mesh formats, including the standard *OFF* format and the so-called *MSMS* format. A comment line in *inp.dat*

“# meshfmt: 0 icosohedron, 1 raw, 2 msms, 3 matlab sphere, 4 mc, 5 off”

indicates to choose one of the optional mesh formats to be read. The *OFF* format is used by many mesh generating tools, and can be viewed using visualization software such as “GEOMVIEW”, while the *MSMS* format can be conveniently generated by using a script tool provided in our package (see the section “Utility Tools”). The script will call the software MSMS[6] that is widely used in biomolecular studies. A sample *MSMS* format file is as follows:

5358 10712 −18.162 1.138 29.523 −0.309 −0.779 0.546 −17.185 −0.033 30.761 −0.960 0.001 −0.280 … 16.334 2.571 23.760 −0.000 1.000 0.000 1 5 3645 3645 5 3646 … 3614 3620 3613

In this file, the first line describes the total numbers of nodes and triangular elements. Starting from the second line, the *xyz* coordinates and optional *xyz* components of the normal direction for each node are given, followed by a list indicating the indices of three nodes for each triangular element.

The AFMPB solver requires that the normal vectors point outward (as defined in the current *OFF* and *MSMS* formats) and the nodes are ordered counter-clockwisely. In case the normal vectors are not available in the input surface mesh files, which happens when other file formats are used as input files, functions are provided in the solver to calculate these quantities.

The output log file *out.log* records useful information during code execution, including the number of iterations of the iterative Krylov solver, the CPU time and memory usage information, the total/solvation/interaction/Coulombic energies, and so on. The surface potentials and forces are recorded in a formatted file (*surfpot.dat*), which can be extracted for further analysis, and can be visualized with VMD using a provided TCL script file (see the subsection “Utility Tools”).

The *PQR* file can be generated from a standard *PDB* file using the software *pdb2pqr* [3]. *pdb2pqr* also adds missing hydrogen atoms in the *PDB* file, optimizes the hydrogen bond network, and assigns atomic charges and radii based on various force field parameter sets.

The molecular surface mesh file is traditionally generated from the *PQR* file. There exist at least three types of “molecular surface” to define the boundary between the low (interior) and high dielectric (exterior) regions: the *van der Waals* surface, the *solvent-accessible surface*[32], and the *solvent-excluded surface*. The *van der Waals* surface is the union of the surface area formed by placing van der Waals spheres at the center of each atom in a molecule. When rolling a solvent (or a probe) sphere over the van der Waals surface, the trajectory of the solvent sphere center defines the *solvent-accessible surface*, while the trajectory of the boundary of the solvent sphere in closest contact with the van der Waals surface defines the *solvent-excluded surface*. The solvent-excluded surface is also referred to as the molecular surface in bimolecular studies. In the AFMPB solver, a triangular mesh of the molecular surface can be generated using the software *MSMS* as described in [35]. Two important parameters in *MSMS* are the node density and probe radius that control the resolution of the output mesh, and the typical values are set to 1.0/*Å*^{2} and 1.5 Å, respectively. The mesh from *MSMS* can be further improved by removing the elements of extremely small area using a mesh checking procedure provided in the AFMPB package, though this step is usually not necessary. We want to mention that mesh generation is in general a fast step and takes only a few seconds for medium-sized molecules. A typical mesh of a molecule with 8362 atoms is shown in Fig 3, which contains 32975 vertices and 65982 triangles and is generated within 3 seconds of CPU time. For larger molecules or molecular assemblies, however, our numerical experiments show that almost all existing mesh generating tools fail to generate reasonable quality molecular surface meshes required for numerical stability. The mesh generation therefore remains a bottleneck for large-scale PB calculation and full scale dynamics simulations with PB forces.

The calculation results for energies and forces (between molecules) can be extracted from *out.log*. The surface and off-surface potential files can be used for further analysis. The solvation energies for individual molecules are also stored in *out.log* if the corresponding output options are switched on in the input file *inp.dat*. The solvation energy is computed by solving the PBE only once, which is different from most finite difference based methods. For multi-molecule systems, the binding energy, the total forces and torques acting on individual molecules due to the other molecules (excluding itself) can also be calculated and outputed to *out.log* by setting the corresponding control variables in *inp.dat*.

The directory *./tools* contains tools for file format conversion, mesh generation and refinement, and data analysis and visualization. In the current release of the solver, two scripts are provided: *pqrmsms.csh* and *showSmoothMesh.tcl*. The c-shell script *pqrmsms.csh* generates a *MSMS* molecular surface mesh from a *PQR* file. Before using the script file, the program *MSMS* should be installed and the path should be correctly set. Given a *PQR* file (e.g. *fas2.pqr* in the directory), the surface mesh can be generated by running the following command:

./pqrmsms.csh fas2 1.2 1.5.

The last two variables specify the node density (in unit of 1 per *Å*^{2}) and probe radius in unit of *Å*, respectively.

The second tool, a TCL script *showSmoothMesh.tcl*, is used together with the visualization program *VMD* to display the surface potential data file. *VMD* should be installed before running the script. A sample execution to visualize the *fas*2 structure and the resulting surface potentials follows:

vmd fas2.pqr -e showSmoothMesh.tcl -args ../job-examples/surfpot.dat.

To assess the performance of the AFMPB solver, we first calculate the Born solvation energy of a point charge +50e located at the center of a spherical cavity with a radius of 50 Å. The exact Born solvation energy *E _{solvation}* of the cavity is −4046.0 (energy is in kcal/mol). The surface is discretized at various resolution levels by recursively subdividing an icosahedron. Table 1 summarizes the timing results (on a Dell dual 2.0 GHz P4 desktop with 2 GB memory) along with some critical control parameters using the AFMPB solver and a direct BEM without invoking any fast algorithms (denoted by direct BEM). We noticed that regardless of the surface resolution, all the GMRES iteration steps are below 5, which numerically confirms that our formulation is well-conditioned. The CPU time for the new version of FMM increases linearly with the number of elements, while increases quadratically for the direct BEM method.

The computational accuracy of the solver improves linearly upon the refinement of the mesh scale. As shown in Table 1, when the mesh is refined to a higher level, the number of elements is quadrupled, and the size of each element (for instance, measured by the length of edges of the triangular elements) is reduced by half, the relative errors of the calculated Born solvation energy are also reduced by approximately half when compared with the analytical result.

To further illustrate the AFMPB’s performance on the electrostatic calculation for proteins, we computed the electrostatic solvation energies of fasciculinII, a 68 residue protein at various mesh resolutions. The results are shown in Table 2. It shows again that the mesh size has only a small impact on the number of iteration steps, and the memory usage and CPU time increase almost linearly with the increase of surface meshes.

AFMPB is also tested for a two-molecule system, the acetylcholinesterase and its inhibitor fasciculin. They are separated by 42.2 *Å* away by shifting the coordinates along the center-to-center direction in the original structure. The molecular surface mesh for fasciculin contains 5198 nodes and 10400 triangles, and the mesh for acetyl-cholinesterase contains 34111 nodes and 68266 triangles. In order to extract various quantities characterizing the interaction between the two molecules, AFMPB actually solves the PBE three times: two for each isolated molecule, and one for the combined two-molecule system. The total solvation energy of the system is −3663.8 kcal/mol. The electrostatic interaction energy between fasciculin and acetylcholinesterase is 0.30 kcal/mol, and the total force (in kcal/mol.*Å*) and torque (in kcal/mol) acting on fasciculin are (0.08, 0.14, −0.32) and (12.53, 9.95, −7.42), respectively. The total CPU time is 73 seconds on a DELL PRECISION T7400 desktop computer.

In this paper, we describe an adaptive fast multipole Poisson-Boltzmann solver for computing the electrostatics in biomolecules. The solver uses the new version of fast multipole methods and Krylov subspace methods to solve a well-conditioned boundary integral equation formulation of the linearized PB equation. Numerical experiments show that the solver is very efficient for relatively large molecules on current desktop computers. We anticipate that the AFMPB solver, when parallelized on multi-core/multi-processor large-scale computers, can be applied to the efficient simulation of the dynamics of large molecular systems.

Finally in this paper, we want to thank many of our colleagues and collaborators for their contributions and suggestions. In particular, our code uses many existing open source codes, including the SPARSKIT by Saad and collaborators [7], *MSMS* [6] for mesh generation, VMD [5] for visualization, and several important subroutines in the new version of FMM from Profs. Greengard and Rokhlin’s groups. This work was supported by HHMI, NIH, NBCR, NSF (JH: NSF0811130 and NSF0411920, JAM: MCB0506593), the NSF Center of Theoretical Biological Physics (CTBP), and the Institute for Mathematics and Its Applications (IMA). In addition, BL is partially funded by the Academy of Mathematics and Systems Science of Chinese Academy of Sciences, the State Key Laboratory of Scientific/Engineering Computing, and NSFC (NSFC10971218), and XC is partially supported by the U.S. Department of Energy Field Work Proposal ERKJE84. Their support is thankfully acknowledged.

The following table (Table 3) lists the units used in AFMPB. The conversion factors to transform to SI units are also given in the last column for reference.

We recommend advanced users to check the following header files for adjustable parameters used in the program.

- defgeom.h: defines the variables describing the geometry of the molecules.
- files.h: defines the units for various input and output files.
- fmmtree.h: in the fast multipole method, the size of many variables and vectors can be determined before one knows the adaptive tree structure for an input geometry. In this file, all the “fixed” length variables are defined, including many variables for storing the precomputed data. One common block is defined in this file for the adaptive tree structure.
- membem.h: defines the geometry and input/output variables in the BEM subroutines. The common block geomol defines the numbers of molecules, node points, elements, and singular charges. The common block membem defines the pointer to a huge working array where different geometry and input/output variables are stored.
- memfmm.h: defines the pointers for the variables in the FMM subroutines. All adjustable variables are allocated once the adaptive tree structure is determined, and these pointers are generated for integer, real, and complex variables, respectively.
- parm.h: defines important physical and code parameters. In particular, the following parameters are used for code control.
- lw: as the tree structure is adaptive, one can not determine the size of the required memory where the oct-tree structure will be stored. Hence initially a large amount of memory space is allocated to generate the adaptive tree. If this number is too small, error message will be provided asking the user to increase this number.
- nbox: in our adaptive strategy, we ask that the maximum number of particles in each childless box is less than nbox. Reducing this parameter will generate more levels (which means more boxes but less direct interactions).
- nterms, nlambs: the number of multipole and exponential expansions in the FMM code. Currently only 3 and 6 digits accuracy are allowed, corresponding to nterms=nlambs=9 and nterms=nlambs=18, respectively.

In the following, we provide a simple shell file for executing the AFMPB package.

#!/bin/csh # xxx, 20xx # Set up directories. set DIR=../ set OUT=. # Set up molecule data and mesh. set pqr1=fas2.pqr # mol1 pqr set mesh1=fas2-mod.dat-d1.2-r1.5 # mol1 msms mesh # Prepare input data file. cat END > inp.dat # nmol, number of molecules 1 # di, de, ion concentration(mM), temperature 2.0 80. 150 300.0 # meshfmt: 0 icosohedron, 1 raw, 2 msms, 3 matlab sphere, 4 mc, 5 off 2 # output key: ipotw(surf pot),iforce,iinterE,iselfE (solvation),ipotdx(vol pot) 1 0 0 1 0 # pqr and mesh files (repeat nmol lines for multi-molecules) $pqr1 $mesh1 _END # Start the code. $DIR/afmpb -i inp.dat -o $OUT/out.log -s $OUT/surfpot.dat

AFMPB also supports potential calculations at exterior points given in *space mesh.m* file. A sample command line for this option follows, where the computed potentials are stored in *evo.dat*.

afmpb -i inp.dat -vm ./space mesh.m -o ./out.log -s ./pot.dat -vp ./evo.dat.

Finally, a simple command line “${PROG PATH}/afmpb” can be used with all the default options, and the output files will be generated in the working directory.

PACS: 02.30.Rz; 02.70.Ns; 24.10.Cn.

**Publisher's Disclaimer: **This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

8. Abramowitz M, Stegun IA. Handbook of Mathematical Functions. Dover; 1970.

9. Appel AW, Siam J Sci Stat Comput. 1985;6:85–103.

10. Baker NA. Methods in Enzymology. 2004;383:94–118. [PubMed]

11. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Proc Natl Acad Sci USA. 2001;98:10037. [PubMed]

12. Barnes J, Hut P. Nature. 1986;324:446–449.

13. Boschitsch AH, Fenley MO, Zhou HX. J Phys Chem B. 2002;106:2741–2754.

14. Cheng H, Greengard L, Rokhlin V. J of Comp Phys. 1999;155:468–498.

15. Chung FRK. Spectral Graph Theory: Published for the Conference Board of the mathematical sciences by the American Mathematical Society; Providence, R.I.. 1997.

16. Darden T, York D, Pedersen L. J Chem Phys. 1993;98:10089–10092.

17. Gilson MK, Davis ME, Luty BA, McCammon JA. J Phys Chem. 1993;97:3591–3600.

18. Greengard L, Huang JF. J of Comp Phys. 2002;180:642–658.

19. Greengard L, Rokhlin V. J of Comp Phys. 1987;73:325–348.

20. Greengard L, Rokhlin V. Acta numerica. 1997;6:229–269.

21. Greengard L, Wandzura S. IEEE Computational Science & Engineering. Vol. 5. 1998. pp. 16–18.

22. Holst M, Baker NA, Wang F. J Comput Chem. 2000;21:1319.

23. Huang J, Zhang B, Jia J. Computer Physics Communications. 2009. in press.

24. Juffer AH, Botta EFF, Vankeulen BAM, Vanderploeg A, Berendsen HJC. J Comput Phys. 1991;97(1):144–171.

25. Kuo S, Altman M, Bardhan J, Tidor B, White J. Proc. IEEE/ACM Int. Conf. Computer Aided-Design; 2002. pp. 466–473.

26. Lee H, Darden T, Pedersen L. Chemical Physics Letters. 1995;243:229–235.

27. Liang J, Subramaniam S. Biophys J. 1997;73(4):1830–1841. [PubMed]

28. Lu B, Cheng X, Huang J, McCammon JA. Proc Natl Acad Sci USA. 2006;103:19314–19319. [PubMed]

29. Lu B, Cheng X, McCammon JA. J Comput Phys. 2007;226:1348–1366. [PMC free article] [PubMed]

30. Lu B, McCammon JA. J Chem Theory Comput. 2007;3:1134. [PubMed]

31. Phillips JR, White J. Int. Conf. On Computer-Aided Design; Santa Clara, CA. 1994.

32. Richards FM. Ann Rev in Biophysics and Bioengineering. 1977;6:151–176. [PubMed]

33. Rokhlin V. Wave Motion. 1983;5(3):257–272.

34. Saad Y, Schultz M. SIAM J Sci Statist Comput. 1986;7:856–869.

35. Sanner MF, Olson AJ, Spehner JC. Biopolymers. 1996;38:305–320. [PubMed]

36. Tanaka M, Sladek V, Sladek J. AMSE Appl Mech Rev. 1994;47:457–499.

37. White J, Phillips JR, Korsmeyer T. Proceedings of the Colorado Conference on Iterative Methods; Breckenridge, Colorado. 1994.

38. Lee CC, Lee HJ, Hyon YK, Lin TC, Liu C. personal communication.