One of the recommendations of the 2010 Leon Thal Symposium, organized to develop strategies to prevent Alzheimer’s disease, was to build a global database of longitudinal aging studies. While several databases of longitudinal aging studies exist, none of these are comprehensive or complete. In this paper we review selected databases of longitudinal aging studies. We make recommendations on future steps to create a comprehensive database. Additionally, we discuss issues related to data harmonization.
One of the four major recommendations from the 2010 Leon Thal Symposium, which brought together experts in brain aging and dementia in order to examine national strategies to prevent Alzheimer disease (AD), was: “… initiation of a Global Database that extends the concept of the National Database for Longitudinal Studies for longitudinal studies beyond the United States;” 1.
As implied in the above recommendation, creation of a global database of longitudinal aging studies is crucial to facilitating large collaborative studies and providing the research community with the greatest opportunity to conduct efficient, optimally informed research. Multiple longitudinal aging studies exist around the world that could contribute valuable data to such a database 2, 3. Currently, a few databases have systematically collected and stored data or information from two or more longitudinal aging studies. These databases were created for different purposes, and the amount and type of data they provide vary. Additionally, how useful these databases are to the research community in general is unclear. This review aims to answer several questions related to these databases: Which are the main databases that include data on or information about longitudinal aging studies? Are these databases utilized in their current format? What future steps should be taken to improve the utilization of such databases? To answer these questions, this paper reviews selected databases and makes recommendations for future attempts to create a database of longitudinal aging studies.
We aimed to identify all web-based databases that provide either individual-level data from longitudinal aging studies or information on which longitudinal aging studies include which types of variables. Because the majority of web-based databases do not have a related publication, we used internet searches and relied on discussions with experts in the field to obtain a comprehensive list. Existing databases were identified through web searches using the internet search engines “Google” and “Google Scholar”, conducted on December 15, 2010 and June 27, 2011. The search terms “aging database”, “Alzheimer disease”, “Alzheimer’s disease”, “longitudinal aging study”, “aging study network”, and “longitudinal study on aging” were used, combined with “or”. Sites not related to human aging research were excluded. We also repeated our search using Medline for years 1946 through the last 3 weeks of June 2011 using the same search phrases. The Medline search did not identify any additional web sites. In fact, most of the databases and other aging-relevant web sites were not captured using this method. The list was further compiled from discussions with experts in the field.
From this extensive list we selected for review only databases that: 1) focused on brain aging, 2) included studies with biomarker measures, and 3) were publicly available as a web-based database.
Our search yielded over one hundred thousand sites. From this list, we first compiled a table including all web sites/networks that were related to human aging research, available in English, and had a searchable web-based database (Table 1). The sites/networks/databases without a web-based searchable database but relevant to human aging research are presented separately (Supplementary Table 1). The focus and collection methods of these databases/networks vary widely. Some gather aging-related survey data with a focus on qualitative measures and policy issues, while others focus on publications and resources relevant to aging. Others include information on cognitive or biomarker variables from longitudinal aging studies. Several of the databases focus on genetics of aging. Some of these databases provide information about what measures are available from different longitudinal aging studies, while others provide data at the individual subject level. A few of these databases are private, with data not available to non-affiliated investigators (Supplementary Table 1). In the end, from the list of sites in Table 1, the following databases were selected for further review based on their aging brain focus, biomarker annotation, and public availability as a searchable web-based database: the National Institute on Aging (NIA) Database of Longitudinal Studies (DLS) 4, the Cognitive and Emotional Health Project (CEHP) 5, the Integrative Analysis of Longitudinal Studies on Aging (IALSA) 6, 7, the National Alzheimer Disease Coordinating Center Database (NACC) 8, and the Alzheimer Disease Neuroimaging Initiative Database (ADNI) 9. NACC and ADNI differ from the other databases because they provide data sets at the individual subject level and focus on AD. Additionally, ADNI is not a “database” in the true sense. However, because both include cognitive and biomarker data obtained from multiple longitudinal aging studies, we felt they were relevant to this review.
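The three selection criteria described above can be expressed as a simple filter over a candidate list. The following is only an illustrative sketch, not the procedure the authors actually ran: the `Resource` record, its field names, and the example entries are all hypothetical placeholders, and in practice each criterion was judged by manual review rather than read from a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical record for one candidate site/network/database."""
    name: str
    brain_aging_focus: bool        # criterion 1: focused on brain aging
    has_biomarker_measures: bool   # criterion 2: includes studies with biomarker measures
    public_web_database: bool      # criterion 3: publicly available, web-based database


def meets_review_criteria(resource: Resource) -> bool:
    """A resource is kept only if all three criteria hold."""
    return (
        resource.brain_aging_focus
        and resource.has_biomarker_measures
        and resource.public_web_database
    )


def select_for_review(resources: list[Resource]) -> list[Resource]:
    """Return the candidates that satisfy every criterion, in input order."""
    return [r for r in resources if meets_review_criteria(r)]


# Placeholder entries for illustration only; these are not real databases.
candidates = [
    Resource("Resource A", True, True, True),
    Resource("Resource B", True, True, False),   # private, so excluded
    Resource("Resource C", False, True, True),   # no brain aging focus, so excluded
]

selected_names = [r.name for r in select_for_review(candidates)]
print(selected_names)
```

Recording each criterion as an explicit field, even when the judgment behind it is manual, makes it possible to report why a given resource was excluded.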
The NIA-DLS resulted from the recommendations of the 2003 NIA Longitudinal Data on Aging working group 4. This group was assembled to facilitate research initiatives to identify risks and protective factors for diseases associated with brain aging. As a first step, the working group recommended establishing a database of existing sources of longitudinal aging-related data. The primary purpose of the database was to establish a resource for investigators applying for NIA grants. The NIA-DLS includes a total of 55 longitudinal studies. Data from the Canadian Institutes of Health Research (CIHR) review of Longitudinal Studies on Aging were used in the development of the NIA database 10. The CIHR review resulted from a review of longitudinal studies on aging undertaken by the Division of Aging and Seniors, Health Canada. The CIHR review does not have a web-based database, but resulted in a document that includes information on the design and current status of the studies and on study variable domains. The NIA database, on the other hand, is a web-based searchable database and provides information on which studies have collected which variables. How studies were selected for inclusion in the NIA-DLS is not described. While most of the included studies focus on brain aging, some enrolled younger subjects and did not have brain aging as their main focus (such as the Canadian Multicentre Osteoporosis Study or the Bogalusa Heart Study (Supplementary Table 2)). Two reviews of longitudinal studies on aging have concluded that some studies with valuable findings were not included in the NIA-DLS 3, 11. One such study, for example, is the well-known Framingham study.
The CEHP, initiated in 2001, is another web-based searchable database supported by the NIA, the National Institute of Mental Health, and the National Institute of Neurological Disorders and Stroke. Its aim is to “…assess the state of longitudinal and epidemiological research on demographic, social and biologic determinants of cognitive and emotional health in aging adults” 5. Unlike other databases, CEHP has well-defined selection criteria: studies were included if they had a sample size over 500 subjects and studied a broad range of demographic, biological and psychosocial risk factors. However, the CEHP database is not limited to studies focusing on middle-aged or elderly individuals. Longitudinal studies with a focus on brain health enrolling young adults, for example, have also been included. Examples of such studies are the San Antonio Lupus Study of Neuropsychiatric Disease, the Work and Iron Status Evaluation, and the Neurobiological studies of Huntington’s disease (Supplementary Table 2). A web-based searchable questionnaire database was created based on the responses to a questionnaire sent to 80 studies, not all of which had a focus on age-related cognitive changes. Additionally, not all of the studies had a sample size of over 500 subjects, as stated in the inclusion criteria. The CEHP database also provides information on constituent variables of the participating studies.
The IALSA network, whose meta-data tool development started in 2005, aims to create “…a collaborative research infrastructure for coordinated interdisciplinary, cross-national research aimed at the integrative understanding of within-person aging-related changes in health and cognition” 6, 7. It is an open and growing network, and over twenty-five longitudinal aging studies from around the world have joined 12. The IALSA database provides summary information regarding constituent variables of the participating studies. Like the NIA-DLS, the IALSA network does not have strictly defined criteria for study inclusion, as it began as a grass-roots effort among an initially smaller group of collaborating programs with strong interests in starting the network. Together, the NIA-DLS, CEHP and IALSA databases cover 132 unique studies, with some overlap among them.
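Counting unique studies across overlapping databases amounts to taking the union of their study lists after reconciling name variants. The sketch below illustrates the idea under stated assumptions: the database keys and study names are hypothetical placeholders (not the actual inventories), and the `normalize` helper handles only case and whitespace, whereas real study names would likely also need manual reconciliation of abbreviations and aliases.

```python
def normalize(name: str) -> str:
    """Lower-case a study name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def count_unique_studies(catalogs: dict[str, list[str]]) -> int:
    """Size of the union of all study lists after name normalization."""
    unique: set[str] = set()
    for studies in catalogs.values():
        unique.update(normalize(study) for study in studies)
    return len(unique)


# Placeholder inventories for illustration only.
catalogs = {
    "database_1": ["Study A", "Study B", "Study C"],
    "database_2": ["study b", "Study D"],        # "study b" overlaps with database_1
    "database_3": ["Study  A", "Study E"],       # extra space, same study as "Study A"
}

total = count_unique_studies(catalogs)
print(total)  # 5 unique studies among 7 listed entries
```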
One large-scale database with a specific focus on AD, whose data collection spans from 1984 to the present, is the NACC database 8. NACC states one of its aims as: “… to promote collaborative research among the Alzheimer’s Disease Centers through research funding and technical support”. It provides a web-based query site for data from 29 NIA-funded Alzheimer Disease Centers around the U.S. Data include standardized cognitive, clinical and biomarker measures collected since 1999. Both summary data and data at the individual subject level are available from the NACC site 13.
Another network with a focus on AD is ADNI 9. Started in 2005, it is a multi-site longitudinal cohort study of biomarkers in brain aging and cognitive impairment. While ADNI is not an aggregating database in the true sense, it was designed from inception to readily provide relevant clinical and biomarker (including imaging) data to the research community worldwide. It currently includes data for over 600 participants who come from 59 clinical centers across the U.S. and Canada. Data are available to nonparticipating investigators through the ADNI website. Similar initiatives are underway in Japan 14, Europe 15, and Australia 16.
In contrast to the NIA, CEHP and IALSA study information databases, NACC and ADNI provide data at the individual subject level. Both of these databases were initiated as part of a large collaborative effort by multiple institutions with standardized data collection methods. These types of databases, because of the availability of standardized variables across large numbers of subjects, are able to facilitate a large number of research projects and publications.
The NACC alone has provided data to 180 funded projects resulting in over 125 publications 13. Similarly, use of the ADNI database has generated 213 publications since 2005 17. Identifying studies and publications resulting from the other databases is more difficult, since there is no formal mechanism that tracks use of their web sites. The IALSA network, funded by NIH with further funding from CIHR, has generated a number of ongoing projects involving cross-cohort and cross-national comparisons (S. Hofer, personal communication, June 27, 2011). We were not able to determine the number of studies or publications resulting from the NIA or CEHP databases.
Relevant clinical and biomarker data from aging studies are also available through other databases that do not specifically focus on brain aging. While these are out of the scope of this review, examples worth mentioning are the Public Population Project in Genomics 18 and the Database of Genotypes and Phenotypes 19. We anticipate that there likely are many more aging related databases/resources inadvertently left out of this review. We hope this review will serve as a starting point to build an inclusive inventory of aging relevant databases/resources.
Our review of databases providing longitudinal data relevant to brain aging found several existing useful and evolving resources. For a number of reasons these databases overlap. They vary in the data aggregated, as well as in the level of detail contained within the database itself. Determining how widely used some of these databases are is challenging. However, we believe that many researchers around the world do access these sites for a number of reasons, not the least of which is simply to obtain a sense of where the field is and whom they might consult or collaborate with on planned or subsequent projects. Although databases of longitudinal aging studies can be important resources in the development of large-scale collaborative studies or the identification of confirmation samples, we suspect that in their current form there are several limitations to their use and further development. One such limitation is related to the completeness of these databases. We found over a dozen longitudinal aging studies, some NIA-funded, which were not included in some of the databases. Examples of studies that have not been included in the NIA database are the Framingham study and the Nurses’ Health Study. The CEHP database, on the other hand, includes some studies that did not necessarily have an aging brain focus. Documenting all the studies worldwide not belonging to any known database is an arduous endeavor due to the lack of a priori knowledge of study name, location, or study-specific search terms, and may be a nearly impossible goal to achieve. However, moving forward, a well-described account of study selection methods and of how many studies did not agree to participate should be provided as part of the database description. Equally important is the regular update of such a database. The ability to achieve the above will depend on the individual study’s participation, which in most cases has been voluntary.
It is possible that in the future, similar to clinical trials registries, consideration might be given to requiring longitudinal studies to register as part of their funding process. Even if this were the case, a site would still need to be maintained that aggregates and curates the studies for public access.
A second limitation to the use of such a database is the fact that cognitive and biomarker variables have been measured using different methodologies across different studies. Data harmonization and the implementation of methods to control for these variations enable comparability and are crucial for data use across sites and subjects 5. Data harmonization may be retrospective or prospective. There are many challenges associated with retrospective data harmonization, and for a global comprehensive database with hundreds of variables this may simply not be feasible. We suggest that retrospective data harmonization be considered for hypothesis-driven efforts on selected variables. On the other hand, funding agencies and investigators of aging cohort studies should increasingly consider using agreed-upon guidelines for collection and measurement of at least a core set of important biomarker variables to ensure prospective data harmonization. An example of such an effort in AD-related research is the standardization of CSF marker and neuroimaging measurements across studies as part of the U.S. ADNI and the European ADNI 15.
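To make the idea of retrospective harmonization of a single selected variable concrete, the sketch below standardizes a measure within each study so that scores recorded on different scales become comparable in units of within-study standard deviations. This is only one simple and commonly used approach, chosen here for illustration; it is not a method described in this review, and the study labels and values are invented placeholders.

```python
from statistics import mean, pstdev


def zscore_within_study(records: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Convert (study, value) pairs to (study, z) using each study's own mean and SD.

    Uses the population standard deviation; a study with no spread maps to 0.0.
    """
    values_by_study: dict[str, list[float]] = {}
    for study, value in records:
        values_by_study.setdefault(study, []).append(value)

    stats = {s: (mean(v), pstdev(v)) for s, v in values_by_study.items()}

    harmonized = []
    for study, value in records:
        mu, sigma = stats[study]
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        harmonized.append((study, z))
    return harmonized


# Invented example: the same construct scored on a 0-30 scale in one
# hypothetical study and on a 0-300 scale in another.
records = [
    ("study_x", 10.0), ("study_x", 20.0),
    ("study_y", 100.0), ("study_y", 300.0),
]

harmonized = zscore_within_study(records)
print(harmonized)
```

A limitation worth noting is that within-study standardization removes any true difference in average level between cohorts, which is one reason retrospective harmonization is better suited to hypothesis-driven work on selected variables than to blanket application across hundreds of variables.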
We identified 132 longitudinal studies, mostly aging related, belonging to the NIA-DLS, IALSA, or CEHP databases (Supplementary Table 2). The challenging task of creating an inclusive database will require collaboration among experts from multiple disciplines within the aging field, bioinformatics, and funding agencies. Data harmonization, both retrospective, so as to optimize the continued use of what is likely to be terabytes of existing data, and prospective, with a keen eye toward which outcome measures are of greatest use to the research community, should be considered key to these valued efforts.
This publication was made possible with support from the Oregon Clinical and Translational Research Institute (OCTRI), grant number UL1 RR024140 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH), and the NIH Roadmap for Medical Research, and by the Department of Veterans Affairs and National Institutes of Health grant number AG08017.