The method was implemented as a standalone tool that automatically processes a protein–DNA complex, invokes third-party programs and returns the computed PWM. Before calculating the energies, 3DTF analyzes the protein–DNA interface and can automatically identify the binding site on the DNA in the complex. Automatic detection works when the binding interface can be determined unambiguously and spans fewer than 30 nt positions. The scripts for the server are heavily optimized, allowing a typical matrix of 10–12 positions to be calculated in less than a minute. Convergence of the full model (which includes an additional weight for each position) requires longer calculations, which can be requested through a dedicated option on the submission page.
3DTF has an easy-to-use interface that offers three task modes related to PWM calculation. For all three task modes, the required input is a structure model in PDB format containing one protein chain and two complementary DNA chains. Additional parameters can be set in Task mode 3, described further below. An example file in the proper format is available on the website.
In Task mode 1, the user can check whether the provided structure model is suitable for PWM calculation. Here, 3DTF parses the necessary information, such as chains and types of chains, from the PDB file. It also automatically determines the segment of DNA in close contact with the protein component. The Task mode 1 output is a plain text page that shows the detected protein and DNA chains as well as the parsed-out sequence of the binding site. This is followed by a detailed description of the bases contacted by the protein, with reference to the corresponding chains, chain IDs and individual base identifiers. Task mode 1 fails when chains are absent or corrupted (e.g. the file does not contain a DNA chain), when the binding site contains unpaired bases, or when the base numbering in either strand is unconventional.
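The kind of check performed in Task mode 1 can be illustrated with a minimal sketch. It is not the 3DTF implementation: the function names, the PDB v3 residue naming (`DA`, `DC`, `DG`, `DT`) and the 4.5 Å contact cutoff are assumptions made here for illustration only.

```python
from collections import defaultdict
from math import dist

# Assumption: DNA residues follow PDB v3 naming; older files may use A/C/G/T.
DNA_RESIDUES = {"DA", "DC", "DG", "DT"}

def parse_atoms(pdb_text):
    """Return a list of (chain_id, res_name, res_num, (x, y, z)) from ATOM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # Fixed-column fields of the PDB ATOM record.
            atoms.append((line[21], line[17:20].strip(), int(line[22:26]),
                          (float(line[30:38]), float(line[38:46]), float(line[46:54]))))
    return atoms

def classify_chains(atoms):
    """Label a chain 'DNA' if all its residues are deoxynucleotides, else 'protein'."""
    names = defaultdict(set)
    for chain, res_name, _, _ in atoms:
        names[chain].add(res_name)
    return {c: "DNA" if n <= DNA_RESIDUES else "protein" for c, n in names.items()}

def contacted_bases(atoms, kinds, cutoff=4.5):
    """List DNA residues with any atom within `cutoff` angstroms of a protein atom.

    The cutoff value is a hypothetical choice, not one taken from 3DTF.
    """
    protein = [xyz for c, _, _, xyz in atoms if kinds[c] == "protein"]
    hits = set()
    for chain, res_name, num, xyz in atoms:
        if kinds[chain] == "DNA" and any(dist(xyz, p) <= cutoff for p in protein):
            hits.add((chain, num, res_name))
    return sorted(hits)
```

A structure would pass this simplified check if `classify_chains` reports exactly one protein chain and two DNA chains and `contacted_bases` returns a contiguous, paired stretch of bases.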
Aside from testing the PDB file, Task mode 1 is also of interest for users who wish to anticipate how the provided information is going to be processed. The information compiled in the Task mode 1 output is also incorporated in the output of Task mode 2, but the latter requires more complex calculations.
Task modes 2 and 3 both calculate a PWM for a given protein–DNA complex. In Task mode 2, 3DTF computes the binding profile on the basis of the automatically defined DNA segment (see Task mode 1). In Task mode 3, the user can specify the chains, the start base numbers and the desired length of the binding site in order to enforce a particular binding site to be modeled. This provides greater flexibility to obtain custom PWMs based on prior knowledge.
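How a user-defined site might be resolved from chains, start base numbers and a length can be sketched as follows. This is an illustration under stated assumptions, not the server's code: each strand is represented as a hypothetical mapping from base number to base letter, each start number is taken to be the 5' end of the segment on its own strand, and the function names are invented here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def extract_segment(strand, start, length):
    """Return the bases numbered start .. start+length-1, failing on numbering gaps."""
    wanted = range(start, start + length)
    missing = [n for n in wanted if n not in strand]
    if missing:
        raise ValueError(f"base numbers missing from strand: {missing}")
    return "".join(strand[n] for n in wanted)

def user_defined_site(strand1, start1, strand2, start2, length):
    """Extract the requested segment from both strands and verify base pairing.

    Assumption: both segments are read 5'->3', so the second must equal the
    reverse complement of the first.
    """
    seq1 = extract_segment(strand1, start1, length)
    seq2 = extract_segment(strand2, start2, length)
    expected = "".join(COMPLEMENT[b] for b in reversed(seq1))
    if seq2 != expected:
        raise ValueError("requested site contains unpaired bases")
    return seq1, seq2
```

Rejecting gaps and mismatched pairs up front mirrors the failing conditions described for Task mode 1 (unpaired bases, unconventional numbering).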
Task mode 3 is important when the binding interface cannot be defined unambiguously from the structure (for example, when there is more than one binding site), which disqualifies the structure from being used in Task mode 2. Another possible application of this user-defined mode is to calculate matrices for long binding sites. A long site can be divided into shorter segments to be handled by 3DTF, and the PWMs from the segments can then be concatenated into a larger model. This yields the same result as calculating the whole matrix, since within the applied model of protein/DNA interactions the energy contributions of positions are independent of one another. An example output of 3DTF is shown in Figure 1.
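The equivalence between concatenated segment matrices and a whole-site matrix follows from additivity, and a short sketch makes this concrete. The representation (one dictionary of per-base values per position) and the function names are assumptions for illustration, and the numbers are arbitrary, not 3DTF output.

```python
def score(matrix, seq):
    """Sum the per-position contributions of `seq` under an additive model."""
    if len(matrix) != len(seq):
        raise ValueError("sequence length must match the number of positions")
    return sum(column[base] for column, base in zip(matrix, seq))

def concatenate(*segments):
    """Join segment matrices, in order, into one matrix for the full site."""
    return [column for segment in segments for column in segment]

# Arbitrary illustrative values for two adjacent, non-overlapping segments.
left = [{"A": 0.0, "C": 1.5, "G": 2.0, "T": 0.5},
        {"A": 1.0, "C": 0.0, "G": 0.25, "T": 2.5}]
right = [{"A": 0.75, "C": 0.5, "G": 0.0, "T": 1.25}]

whole = concatenate(left, right)
# Because positions contribute independently, scoring the full site equals
# the sum of the scores of its segments.
assert score(whole, "CGA") == score(left, "CG") + score(right, "A")
```

The segments must be adjacent and non-overlapping and must be concatenated in site order; otherwise positions would be duplicated or skipped in the combined matrix.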
Figure 1. Outputs of Task modes 2 and 3 are plain text pages that feature the sequences for which binding energies have been computed, a tabular description of the derived PWM as well as a PWM logo. The PWM can further be downloaded in a TRANSFAC-like format.
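For readers unfamiliar with TRANSFAC-style matrices, the sketch below writes a PWM in that general layout: an `ID` line, a `P0` header naming the four bases, one numbered row per position and a closing `//`. The exact fields and number formatting emitted by 3DTF are not specified here, so the function name, column widths and example values are assumptions.

```python
BASES = "ACGT"

def to_transfac_like(name, pwm):
    """Render `pwm` (a list of per-position dicts keyed by A, C, G, T) as text."""
    lines = [f"ID  {name}",
             "P0" + "".join(f"{b:>7}" for b in BASES)]
    for i, column in enumerate(pwm, start=1):
        # Two-digit position index followed by one value per base.
        lines.append(f"{i:02d}" + "".join(f"{column[b]:7.2f}" for b in BASES))
    lines += ["XX", "//"]
    return "\n".join(lines)

# Hypothetical two-position matrix, for illustration only.
demo = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0}]
text = to_transfac_like("demo", demo)
```

Printing `text` gives a small block that tools accepting TRANSFAC-style input can usually parse after minor adjustment of the header fields.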