We describe an allele frequency database that records gene frequency data on a diverse set of population samples and DNA polymorphisms. In constructing the database, we emphasized the following:
• Data quality. To make our data useful for meaningful scientific study, we try to ensure that each population sample analyzed is well documented. We also make an effort to describe, in a structured fashion, the details of a polymorphic site, including its typing method and the protocol used.
• User-friendly interface. We adopted Web technology so that our data can be broadly disseminated to the scientific community. Hypertext links connect our data to other Web sites such as PubMed. As demonstrated in the search examples included in the Supplementary material accompanying this paper, our Web interface allows flexible retrieval of frequency data on any combination of populations and loci with little learning effort. We also provide useful summary reports, including one that indicates which populations have been typed at which loci, as well as the number of individuals typed.
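The retrieval and summary-report behavior described in the list above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not a description of how ALFRED is implemented: the record layout, the function names `select` and `coverage_report`, and all population names, locus names, and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrequencyRecord:
    """One allele frequency observation (hypothetical layout)."""
    population: str
    locus: str
    allele: str
    frequency: float
    individuals_typed: int


def select(records, populations=None, loci=None):
    """Return records for any combination of populations and loci.

    Passing None for a filter means "do not restrict on this field".
    """
    return [
        r for r in records
        if (populations is None or r.population in populations)
        and (loci is None or r.locus in loci)
    ]


def coverage_report(records):
    """Map each population to {locus: number of individuals typed}."""
    report = {}
    for r in records:
        report.setdefault(r.population, {})[r.locus] = r.individuals_typed
    return report


# Illustrative values only; these are not data from the database.
RECORDS = [
    FrequencyRecord("PopA", "Locus1", "A1", 0.60, 50),
    FrequencyRecord("PopA", "Locus1", "A2", 0.40, 50),
    FrequencyRecord("PopB", "Locus1", "A1", 0.25, 40),
    FrequencyRecord("PopB", "Locus1", "A2", 0.75, 40),
    FrequencyRecord("PopA", "Locus2", "B1", 0.10, 48),
    FrequencyRecord("PopA", "Locus2", "B2", 0.90, 48),
]

if __name__ == "__main__":
    for rec in select(RECORDS, populations={"PopA"}, loci={"Locus1"}):
        print(rec)
    print(coverage_report(RECORDS))
```

In this sketch, a population that has not been typed at a locus is simply absent from that population's entry in the report, which makes gaps in coverage easy to spot.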
By addressing these issues, we have made the resulting Web site informative and easy to use. We expect that its main users will include medical genetics researchers and anthropologists who need high-quality, population-specific genetic variation data. We also envision, however, that the site will be accessed by other types of users, including educators and students, who represent the future of genetics-based research. We are currently incorporating more illustrative materials, such as gene frequency histograms (in color), into the Kidd Lab Web site to help achieve this broader goal.
At the moment, ALFRED serves primarily to make our data readily available to others and to act as a testbed for database design. If it proves generally useful, ALFRED could be expanded to cover a much broader scope of data. Any large expansion would require developing additional procedures to ensure the integrity of information extracted from the literature or submitted by other researchers. At a minimum, we expect that what we and the community learn from our experience with ALFRED will help in the development of better databases to meet future needs in this domain, whether those databases are designed by us or by others.