It is not always computationally feasible to undertake protein structure studies using full atom representations. The challenge is to reduce complexity while maintaining detail [1–3]. Lattice protein models are often used to achieve this but in general only the protein backbone or the amino acid centre of mass is represented [4–12]. A huge variety of lattices and energy functions have previously been developed and applied [4, 13, 14].
In order to evaluate the applicability of different lattices and to enable the transformation of real protein structures into lattice models, a representative lattice protein structure has to be calculated. Maňuch and Gaur have shown the NP-completeness of this problem for backbone-only models in the 3D-cubic lattice and named it the protein chain lattice fitting (PCLF) problem [15].
The PCLF problem has been widely studied for backbone-only models [13, 16–24]. The most important aspects in producing lattice protein models with a low root mean squared deviation (RMSD) are the lattice coordination number and the neighbourhood vector angles [18, 23]. Lattices with intermediate coordination numbers, such as the face-centred cubic (FCC) lattice, can produce high resolution backbone models [18] and have been used in many protein structure studies (e.g., [3, 25, 26]). However, the use of backbone models is limited since they do not account for the space required for side chain packing.
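To make the terms coordination number and neighbourhood vector angles concrete, the following minimal sketch enumerates the neighbourhood vectors of the 3D-cubic and FCC lattices and lists the angles that occur between them. The function names are illustrative and not taken from any published tool, and the vectors are unscaled integer vectors rather than vectors scaled to a particular bond length.

```python
import itertools
import math


def cubic_neighbours():
    """Unit vectors along the three axes: coordination number 6."""
    vecs = []
    for axis in range(3):
        for sign in (1, -1):
            v = [0, 0, 0]
            v[axis] = sign
            vecs.append(tuple(v))
    return vecs


def fcc_neighbours():
    """Vectors with exactly two entries of +-1 and one 0: coordination number 12."""
    return [v for v in itertools.product((-1, 0, 1), repeat=3)
            if sum(abs(c) for c in v) == 2]


def angles_deg(vecs):
    """Sorted set of angles (rounded to whole degrees) between distinct vectors."""
    angles = set()
    for a, b in itertools.combinations(vecs, 2):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
        angles.add(round(math.degrees(math.acos(cos))))
    return sorted(angles)


if __name__ == "__main__":
    for name, vecs in (("cubic", cubic_neighbours()), ("FCC", fcc_neighbours())):
        print(name, "coordination number:", len(vecs), "angles:", angles_deg(vecs))
```

The FCC lattice offers twice as many neighbourhood vectors as the cubic lattice and a richer set of angles between them, which is consistent with the observation above that it allows closer fits to real backbones.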
To overcome this restriction lattice protein models that include side chains have been introduced [27–33]. Reva et al. [32] have, to our knowledge, developed the only previous approach to solve the PCLF problem including side chains. They apply dynamic programming to find an optimal solution according to their error function. Unfortunately, the approach is shown to often yield no solution in the 3D cubic lattice. The CABS tools by Kolinski and coworkers utilize a hybrid on-lattice (backbone) and off-lattice (side chain) protein representation to study folding dynamics but do not attempt to answer the PCLF problem [31, 34].
In this paper we use the side chain model definition of Bromberg and Dill [28], where each amino acid is represented by two on-lattice monomers: one represents the side chain and one the Cα atom. This explicit representation of side chains prevents unnatural collapse during structural studies [35] and enables the reconstruction of full atom protein data [36]. Full on-lattice protein models are constrained in their possible side chain placement but enable exhaustive studies of folding kinetics and structure space [11, 37, 38] not applicable within off-lattice side chain models like the CABS approach.
To the best of our knowledge, despite the large number of published methods, there is only one other publicly available implementation for deriving lattice protein models from real proteins, namely LocalMove. LocalMove is a web interface introduced by Ponty et al. [22] for backbone-only models in the 3D-cubic and FCC lattices; it applies a Monte Carlo search to find lattice protein models.
We present our tool LatFit to address this lack of available implementations. The program is freely available for academic download and as a web server: http://cpsp.informatik.uni-freiburg.de/LatFit/. LatFit solves the PCLF problem, that is, it transforms a protein from full atom coordinate data to a lattice model, and is available as both a stand-alone tool for high-throughput pipelines and a web interface for
ad hoc usage. A new fitting procedure that optimises distance RMSD enables rotation-independent lattice model creation of protein structures. The method is applicable to arbitrary lattices and handles both backbone and side chain representations with equivalent accuracy. A depiction of the workflow is given in .
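Distance RMSD is rotation-independent because it compares only the distances within each structure, never the coordinates themselves. The following minimal sketch assumes the commonly used definition of dRMSD as the root mean square difference over all pairwise intra-structure distances; the exact normalisation used by LatFit is not stated in this section, and the function names and example coordinates are illustrative only.

```python
import itertools
import math


def drmsd(a, b):
    """Distance RMSD between two equally long lists of 3D points.

    Assumed definition: root mean square, over all index pairs i < j,
    of the difference between |a_i - a_j| and |b_i - b_j|.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two structures of equal length >= 2")
    sq_sum = 0.0
    pairs = 0
    for i, j in itertools.combinations(range(len(a)), 2):
        diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
        sq_sum += diff * diff
        pairs += 1
    return math.sqrt(sq_sum / pairs)


def rotate_z(points, degrees):
    """Rotate points about the z-axis by the given angle."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


if __name__ == "__main__":
    # Hypothetical four-point chain with 3.8 Angstrom steps.
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8)]
    rotated = rotate_z(chain, 37.0)
    stretched = [(1.1 * x, 1.1 * y, 1.1 * z) for x, y, z in chain]
    print("dRMSD to rotated copy:  ", drmsd(chain, rotated))
    print("dRMSD to stretched copy:", drmsd(chain, stretched))
```

A rotated copy of a structure yields a dRMSD of zero up to floating point error, whereas a coordinate RMSD would first require an optimal superposition of the two structures.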
Utilising LatFit we present the first comprehensive study of lattice quality for protein models including side chains. In our test, LatFit fitted the majority of models on an FCC lattice within 1.5 Å RMSD.