When I began my PhD, GenBank came in the post on several CDs (and a colleague first received GenBank on a DAT tape, which he printed out to look at all the sequences). Now, with GenBank holding 85 billion bases of sequence from nearly one-third of a million species, the amount of DNA sequence data is virtually unlimited (or soon will be). Analysis methods and computational capacity have had to grow rapidly to accommodate the vast amount of data. But as analytical techniques become more sophisticated, they also tend to incorporate more assumptions about the evolutionary processes that produced the data. So progress in bioinformatics relies not only on advances in laboratory and computational techniques, but also on increased understanding of the patterns and processes of molecular evolution.
It is critical that advances in computation do not come at the expense of biological veracity. For example, the increasing size of sequence datasets has brought a growing reliance on automatic alignment programs that, although constantly improving in sophistication, sometimes produce biologically unrealistic arrangements of some sequences, because of the complexity of patterns of sequence change. Some researchers claim that it is simply too time-consuming to inspect alignments to detect these errors, yet if these poorly aligned regions are included in an analysis, any inference drawn from them is spurious. Just as we would be reluctant to accept sloppy laboratory techniques for the sake of expedience, we should be equally unhappy about cutting corners on the analysis. If our methods do not reflect real biological processes, we risk leading ourselves up a garden path of our own making.
Understanding patterns of genome change requires an evolutionary perspective that regards the genome as part of a whole organism. This issue includes papers that take an evolutionary approach to measuring or estimating the mutation rate; detecting and explaining differences in the rate of molecular evolution across the genome and between species; exploring how the interplay between positive selection, negative selection and drift creates complex patterns of molecular evolution; and developing realistic evolutionary models for analysing DNA to uncover evolutionary history and contemporary patterns of biodiversity. These papers share a common theme: that we need to combine molecular evolutionary theory with empirical measurement of the patterns and rates of molecular evolution to fully appreciate the complexity of the information contained in the genome.