Our study aimed to identify, via a systematic search, the readily identifiable databases that have been set up to disseminate genetic epidemiology information to the scientific community over and above that available via MEDLINE. While many databases have been set up to house information on the prevalence of genetic variation, with some notable exceptions little progress has been made in the field of gene-disease association data. Of the 13 databases we identified on gene-disease association, all but one provided at least some extra information unavailable via a MEDLINE search alone. However, the seven databases among these that gave access to previously unavailable data (i.e. a utility grade of 3) clearly include only a small minority of the genetic association studies that exist (for example, Lin et al [13] found over 15,000 articles). The most useful of the databases, i.e. those providing the most previously unavailable information, were considered excellent examples of resources potentially useful in systematic reviews and meta-analyses, but were targeted to particular fields, such as Type 1 diabetes, Hereditary Fever, Alzheimer's disease or pharmacogenetics. The utility of one such database for meta-analyses is demonstrated by a recent paper on Alzheimer's disease [14].
Many of the genetic epidemiology databases cited in the 2000 paper [8] are no longer updated or no longer exist, due to a lack of financial support. Efforts and funding are needed to facilitate the further development of online repositories that enable the dissemination of all findings into the public domain. Any new repositories will need to provide some assurance of suitable quality control. The Human Genome Epidemiology Network (HuGENet) maintains the Published Literature Database [13], which is currently based on MEDLINE records alone. We would be keen to see this developed into a more comprehensive resource in the way that the Cochrane Central Register of Controlled Trials attempts to include all clinical trials [15]. Neither database is currently structured to link together reports from the same study.
In the wake of the Human Genome Project, with the advent of high throughput genotyping technology, the HapMap project, and now in the era of whole genome association studies, many thousands of genotypes and other data will be generated from epidemiological studies. Only a small minority of these will be reported in traditional journals, and the published literature will continue to provide a potentially biased resource of only the most exciting findings [16]. The Human Genome Epidemiology Network (HuGENet) is committed to encouraging the dissemination of negative findings into the public domain by collaborating with existing journals and setting up on-line journals that will make this process easier. The 'Journal of Negative Results in Biomedicine', published online by BioMed Central [17], has already published several sets of null results of genetic associations, and other journals have dedicated subsections for the reporting of null results [18].
We would strongly encourage individual study investigators, and especially consortia of investigators such as those in the HuGENet network of networks [6], to assemble and maintain lists of studies and data repositories. To enable the latter, an approach similar to that of the microarray research community could be adopted for gene-disease association studies: the MIAME (Minimum Information About a Microarray Experiment) guidelines encourage provision of sufficient detail about a microarray experiment for it to be replicated, and offer a format for data to be held in public repositories. Until such developments, it will continue to be difficult to interpret findings from genetic epidemiological studies easily and to fully include them in rigorous and regularly updated meta-analyses.
Since the completion of this study, the National Center for Biotechnology Information (NCBI) have announced a new database called dbGaP specifically to host genotype-phenotype studies [19]. This database appears to be an ideal example of the sort of database for which we were searching and will hopefully, in time if adequately utilised, form an essential resource for those preparing systematic reviews and meta-analyses of gene-disease association studies.