Bioinformation. 2006; 1(6): 214–219.
Published online 2006 October 7.
PMCID: PMC1891685

GCSDB: an integrated database system for the Georgia Centenarian Study


GCSDB is a web-oriented integrated database system for the Georgia Centenarian Study, a phase III, population-based, multidisciplinary study of centenarians. The Study recruited 244 centenarians and near-centenarians (age 98 and older), 80 octogenarians and 400 young controls in Northern Georgia. GCSDB incorporates more than 40 relational tables containing data about the participants including demographics, family longevity, physical health, cognition, neuropsychology, mental health, neuropathology, functional capacity, and genetics. The GCSDB web site includes detailed information about these tables and functions for genetic and other kinds of data analysis. More data and functions will be added as the study progresses. GCSDB provides a resource that could be used to identify what biological, psychological, and social factors as well as their epistatic interactions help these centenarians achieve long life.

Availability (login information can be obtained from authors)

Keywords: centenarian, cognition, database, longevity, mental health, neuropathology, neuropsychology, functional capacity, Single Nucleotide Polymorphisms (SNPs)


Over the past two decades, interest in the study of centenarians has increased steadily [1,2,3], and several countries have initiated their own centenarian studies, including the United States [4,5,6], Japan [7], Italy [8], Hungary [9], France [10], Sweden [11], Finland, and Denmark. [12] The fundamental question is how centenarians live longer and what specific biological, psychological, and sociological factors help them survive to become the oldest of the old. [1,2,13] Bringing together data from multiple disciplines in ways that are simple and useful is a common problem that must be solved for research to progress.

Phase 3 of the Georgia Centenarian Study (2001-2007) is a multi-disciplinary program project designed to further identify the roles of biological, psychological, and social factors, as well as their epistatic interactions, in contributing to successful late-life aging. [2] Four specific projects within the study examine: 1) the genetic structure of the Georgia population with a focus on the oldest-old; 2) relations between dementia and neuropathology among centenarians, with additional neuropsychological data collected for persons who give permission for autopsy; 3) determinants of functional capacity from neuropsychological, cognitive, health, and demographic variables; and 4) predictors differentiating centenarians who are independent, healthy, and experience a sense of well-being from those who are dependent, frail, and do not experience a sense of well-being.

A population-based sample of individuals was recruited from 44 counties in Northeast Georgia, consisting of 244 centenarians and near-centenarians (age 98 and older), 80 octogenarians primarily for the cognitive and adaptation studies, and 400 controls in equal numbers from the 2nd to 5th decades of life for the genetic studies. Data were collected using standardized data forms and protocols by interviewers from the Data Acquisition Core, then scanned, checked, corrected, verified, and saved as separate electronic files according to the source of the data using the Teleform software package. [14]

For ease of data management and data sharing in a multidisciplinary and multi-institutional setting, a web-oriented, integrated database (GCSDB) was established to serve as a resource for effective collaborative research. This changes the paradigm for most social science studies. In this paper, we report on the status of GCSDB and highlight the database table features and web interface functions that were incorporated or developed for genetic data analysis as well as for other data analyses. The database and its interface are designed to promote cross-project questions and analyses of the data.


Database Implementation

GCSDB was developed on a Sun Microsystems SunBlade with the Solaris 5.8 operating system; the database management system was Oracle version 10g. GCSDB was constructed following the relational database schema from the data collection instruments and booklets. All tables in GCSDB share a common field, the unique participant ID number (GCSID), which serves as the primary key for linking tables generated by multiple projects and/or cores. The web interface to the database is produced from JavaServer Pages (JSP) technology and the Struts framework under the SQL query tool. Web pages are served using the Apache Tomcat web server version 5.0.30. In addition to the relational database, the underlying statistical analyses were implemented in Java or FORTRAN.
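The role of the shared GCSID key can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the Oracle 10g system described above, and the table names, column names, and values are hypothetical, invented only to show how two project tables link through one participant ID.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables keyed by GCSID."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE demographics (gcsid INTEGER PRIMARY KEY, age INTEGER, gender TEXT);
        CREATE TABLE cognition    (gcsid INTEGER PRIMARY KEY, mmse INTEGER);
        """
    )
    # Invented rows; these are not study data.
    conn.executemany(
        "INSERT INTO demographics VALUES (?, ?, ?)",
        [(1, 101, "F"), (2, 99, "M"), (3, 84, "F")],
    )
    conn.executemany("INSERT INTO cognition VALUES (?, ?)", [(1, 22), (3, 28)])
    return conn


def linked_rows(conn):
    """Join the two tables on GCSID, keeping participants with no cognition row."""
    query = """
        SELECT d.gcsid, d.age, c.mmse
        FROM demographics AS d
        LEFT JOIN cognition AS c ON c.gcsid = d.gcsid
        ORDER BY d.gcsid
    """
    return conn.execute(query).fetchall()
```

A LEFT JOIN is used here because a participant's presence in a given table depends on their level of involvement in each project; an inner join would silently drop participants who lack a row in the second table.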

Database Content

GCSDB contains more than 40 relational tables for 4 interrelated projects. The number of columns in these tables ranges from 23 to 519. Column attributes are scale (double), ordinal (integer), or nominal (String). Each data entry in every table corresponds to one study participant (centenarian, octogenarian, or young control). Depending upon a participant's level of involvement in the project, these tables include the participant's information in the following tests or domain areas:

(1) Demographics and Family Longevity. Data include age, date of birth, gender, racial and ethnic backgrounds, handedness, years of education, county and place of residence, and occupational history. Family longevity includes ages and causes of death of mother, father, grandparents, siblings, number of siblings, lifetime residential locations, and military services.

(2) Physical Health. The protocol is composed of 3 domains: (a) Physical Examination and Health History, including vital signs, presence/absence of certain diseases, anthropometrics, neurological/musculoskeletal measures, and current medications; (b) Blood Chemistry Profile, including glucose, BUN, creatinine, albumin, complete blood count, C-reactive protein, hemoglobin A1c, ferritin, thyroxine, thyroid stimulating hormone, vitamin B12, and vitamin D; and (c) Physical Function Assessment, including measures of bed mobility, bed to chair transfer skills, standing balance, walking, step-up, and chair standing abilities.

(3) Cognition, Neuropsychology, Mental Health, and Neuropathology. Tests and assessments include the Mini-Mental State Examination (MMSE) [15], Global Deterioration Scale [16], Severe Impairment Battery [17], Fuld Object Memory Evaluation (FOME) [18], Wechsler Adult Intelligence Scale-III Similarities, Letter Number Sequencing, and Matrix Reasoning sub-tests [19], Behavioral Dyscontrol Scale (BDS) [20], ILS Health & Safety Scale [21], COWAT [22], Clock Drawing Test [23,24], Geriatric Depression Scale [25], CERAD battery [26], and Clinical Dementia Rating Scale. [27] Magnetic Resonance Imaging data of the voluntarily donated brain tissue were also available for some centenarians who passed away before the completion of the project.

(4) Functional Capacity and Independence. Basic functional capacity was assessed using both self-report and performance-based measures of basic/physical and instrumental activities of daily living (BADL and IADL). Physical functional capacity was measured with the NIA Short Physical Performance Battery (SPPB) [28] and the Physical Performance and Mobility Examination (PPME). [29] The performance-based measures include selected subtests of the Direct Assessment of Functional Status-Revised (DAFS-R). [30] The measures and Performance Rating Scale were taken from OARS. [31] In addition, there are other tables summarizing personality, life events, social support, and economic resources.

(5) Genetics. The Single Nucleotide Polymorphisms (SNPs) table includes 15 SNPs from three candidate longevity genes, i.e., APOE [32,33], HRAS1 [34], and LASS1/LAG1 [35,36], for all of the participants. The SNPs are located in exon or promoter regions of these genes.

Database Interface

The GCSDB homepage contains three sections (Figure 1): an introduction to the Georgia Centenarian Study; links to web pages for the four individual projects (Figure 2); and ways to cross-link among the tables. In the third section, users may retrieve data either by selecting table names and column names or directly by SQL. A plot function provides a preliminary plot for any two column variables of interest from any table in the study, to help investigators choose variables for further statistical analysis. Regression, principal component analysis, and canonical correlation analysis tools are available to identify relationships among sets of variables of interest.

Figure 1
Composite of screen displays demonstrating the homepage of GCSDB. (1) A brief introduction of The Georgia Centenarian Study Project with a link to a detailed introduction. (2) Four links to the web pages of four individual projects (see ...
Figure 2
Screen display showing the web page for project 3. The page consists of the proposed project 3 research model which connects the potentially interrelated domain areas together. Clicking the domain area picture in the model ...

For each table, the web page provides the following information and functions:

  1. a dictionary listing the data-type, data length, column name, full name, and each category value for ordinal columns;
  2. a PDF file showing frequencies and descriptive statistics for most columns;
  3. a raw data table and a function giving values for the user-selected column for a given participant;
  4. missing value counts for every column and a function that shows which columns are missing for a given participant;
  5. frequency counts for a user-selected ordinal column and for all ordinal columns;
  6. maximum, minimum, and mean values, and the number of participants with values less than or greater than the mean, for all scale columns;
  7. number of participants with values greater than or less than a given cut-off value for the user-selected scale column.
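Several of the per-table functions listed above (items 4 to 7) are simple column summaries. The sketch below shows one way they might be computed; it is not the GCSDB implementation, which was written in Java or FORTRAN. Rows are represented as dictionaries with None for a missing value, and all function and column names are hypothetical.

```python
from collections import Counter
from statistics import mean


def present_values(rows, column):
    """Return the non-missing values of one column."""
    return [r[column] for r in rows if r.get(column) is not None]


def missing_counts(rows, columns):
    """Item 4: number of missing values in each column."""
    return {c: sum(1 for r in rows if r.get(c) is None) for c in columns}


def frequency_counts(rows, column):
    """Item 5: frequency of each category in an ordinal column."""
    return Counter(present_values(rows, column))


def scale_summary(rows, column):
    """Item 6: min, max, mean, and counts below/above the mean for a scale column."""
    values = present_values(rows, column)
    m = mean(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": m,
        "n_below_mean": sum(v < m for v in values),
        "n_above_mean": sum(v > m for v in values),
    }


def count_by_cutoff(rows, column, cutoff):
    """Item 7: number of participants below and above a user-supplied cut-off."""
    values = present_values(rows, column)
    return {
        "below": sum(v < cutoff for v in values),
        "above": sum(v > cutoff for v in values),
    }
```

Missing values are excluded before any statistic is computed, so the counts in items 6 and 7 refer only to participants who have a recorded value in the selected column.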

For the SNPs table (Figure 3), the following important information and genetic analyses are available:

  1. SNPs descriptor includes the name, identity, cytogenetic map, chromosomal and physical location for every SNP;
  2. the 2 x 2 x 2 contingency table (age, race, SNP allele) and log odds ratios for each SNP [37] ;
  3. exact tests of association for each SNP [37] ;
  4. exact test of Hardy-Weinberg Law for each SNP;
  5. Linkage Disequilibrium Test for pair-wise SNPs;
  6. haplotype analysis results for SNPs from APOE, HRAS1 and LASS1, respectively and for all the SNPs. [38]
Figure 3
A portion of the web page for the SNPs and Demographics and Family Longevity data. (A) Information and functions provided for SNPs data analyses. (B) Information and functions provided for preliminary analyses of Demographics ...
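Two of the SNP analyses listed above can be sketched in a few lines. The code below is an illustration only, not the authors' Java or FORTRAN implementation and not the method of reference [37]. It computes a plain log odds ratio for one 2 x 2 slice of a contingency table, and a textbook exact test of Hardy-Weinberg proportions for a biallelic SNP, which conditions on the observed allele counts and sums the probabilities of all heterozygote counts that are no more likely than the observed one. Function names and any genotype counts passed to them are hypothetical.

```python
from fractions import Fraction
from math import factorial, log


def log_odds_ratio(a, b, c, d):
    """Log odds ratio of the 2 x 2 table [[a, b], [c, d]]; all cells must be positive."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    return log((a * d) / (b * c))


def het_probability(n_ab, n_a, n_b):
    """P(n_ab heterozygotes | allele counts n_a and n_b) under Hardy-Weinberg."""
    n = (n_a + n_b) // 2          # number of genotyped participants
    n_aa = (n_a - n_ab) // 2      # implied homozygote counts
    n_bb = (n_b - n_ab) // 2
    numerator = 2 ** n_ab * factorial(n) * factorial(n_a) * factorial(n_b)
    denominator = (
        factorial(n_aa) * factorial(n_ab) * factorial(n_bb) * factorial(2 * n)
    )
    return Fraction(numerator, denominator)  # exact rational arithmetic


def hwe_exact_test(n_aa, n_ab, n_bb):
    """Exact p-value for departure from Hardy-Weinberg proportions at one SNP."""
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_bb + n_ab
    observed = het_probability(n_ab, n_a, n_b)
    p_value = Fraction(0)
    # The heterozygote count must have the same parity as the allele count n_a.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        p = het_probability(het, n_a, n_b)
        if p <= observed:
            p_value += p
    return float(p_value)
```

Exact fractions are used so that the comparison `p <= observed` is not affected by floating-point rounding; for large samples a log-factorial formulation would be faster but needs a tolerance in that comparison.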

Data dictionaries, forms, code books, questionnaires, and other relevant metadata are on the Web site or are being added as requested. The database is password-protected to preserve the anonymity of centenarians, and centenarian names have been replaced with a GCSID.

Future Development

Data acquisition for the functional capacity and the adaptation and resource projects has been completed; genetic data analysis of the collected blood samples is still under way. Neuropathology collection is expected to continue until 2007. GCSDB is updated as data become available. Functions and tools are added as data reduction and analysis begin. The database/Web interface constitutes a virtual resource for interested researchers. At present, GCSDB is only available to our project personnel, but it will be released to the research community one year following completion of our project according to the data sharing plan on the Web site. It is our obligation and desire to share data, cell lines, and brain tissues with qualified researchers at that time. This data archive is part of making results available online. We expect that these rare resources can be used to test new hypotheses and gain new knowledge on contributors to the extreme longevity of these centenarians, and that our system can serve as a model for other cross-disciplinary projects.


The Georgia Centenarian Study (Leonard W. Poon, PI) is funded by 1P01-AG17553 from the National Institute on Aging, a collaboration among The University of Georgia, Louisiana State University Health Sciences Center in New Orleans, Boston University, University of Kentucky, Emory University, Duke University, Rosalind Franklin University of Medicine and Science, Iowa State University, and University of Michigan. We also thank the UGA College of Agricultural and Environmental Sciences for their support. The authors acknowledge the contributions of the Study's project and core leaders to this paper: L. W. Poon, S. M. Jazwinski, R. C. Green, M. Gearing, W. R. Markesbery, J. L. Woodard, M. A. Johnson, J. S. Tenover, I. C. Siegler, P. Martin, M. MacDonald, C. Rott, W. L. Rodgers, D. Hausman, J. Arnold, and A. Davey. We also acknowledge M. A. Batzer, E. Cress, and L. S. Miller for their contributions. The authors acknowledge the valuable recruitment and data acquisition effort from M. Burgess, K. Grier, E. Jackson, E. McCarthy, K. Shaw, L. Strong, and S. Reynolds, data acquisition team manager; S. Anderson, E. Cassidy, M. Janke, and T. Savla, data management; and M. Durden for project fiscal management.


Citation: Taylor et al., Bioinformation 1(6): 208-213 (2006)


1. Lehr U. Zeitschrift für Gerontologie. 1991;24:227.
3. Vaupel JW, et al. Science. 1998;280:855.
4. Perls TT. Medical Hypotheses. 1997;49:405.
5. Poon LW, et al. International Journal of Aging & Human Development. 1992;34:1.
6. Poon LW, et al. International Journal of Aging & Human Development. 1992;34:31.
7. Chan YC, et al. Journal of Nutritional Science and Vitaminology. 1997;43:73.
8. Capurso A, et al. Archives of Gerontology and Geriatrics. 1997;25:149.
9. Regius O, et al. Zeitschrift für Gerontologie. 1994;27:456.
10. Allard M, et al. Les 120 ans de Jeanne Calment Doyenne de l'humanite. Paris: Le Cherche Midi Editeur; 1994.
11. Samuelsson SM, et al. International Journal of Aging & Human Development. 1997;45:223.
12. Jeune B. Population studies of aging Number 15. Odense, Denmark: Odense University; 1994.
13. Vaillant GE, Mukamal K. American Journal of Psychiatry. 2001;158:839.
14. Inc Cardiff Teleform. Cardiff Software. Vista, CA: 1993. Computer Software.
15. Folstein MF, et al. Journal of Psychiatric Research. 1975;12:189.
16. Reisberg B, et al. American Journal of Psychiatry. 1982;139:1136.
17. Saxton J, et al. Psychological Assessment: A Journal of Consulting and Clinical Psychology. 1990;2:298.
18. Fuld PA. The Fuld Object-Memory Evaluation. Chicago: Stoelting Instrument Company; 1981.
19. Wechsler D. Wechsler Adult Intelligence Scale (WAIS-III). Third ed. San Antonio, TX: The Psychological Corporation; 1992.
20. Grigsby J, et al. Perceptual and Motor Skills. 1992;74:883.
21. Loeb PA. Independent Living Scales manual. San Antonio, TX: The Psychological Corporation; 1992.
22. Benton A, Hamsher K. Multilingual Aphasia Examination. Iowa City: University of Iowa; 1997.
23. Freedman M, et al. Clock drawing: A neuropsychological analysis. New York: Oxford; 1994.
24. Rouleau A, et al. Brain and Cognition. 1992;18:70.
25. Yesavage JA. Journal of Psychiatric Research. 1983;17:37.
26. Morris JC, et al. Neurology. 1989;39:1159.
27. Morris JC. Neurology. 1993;43:2412.
28. Guralnik JM. Journal of Gerontology: Series A. 1994;49:85.
29. Winograd CH, et al. Journal of the American Geriatrics Society. 1994;42:743.
30. Loewenstein DA. Journal of Gerontology. 1989;4:114.
31. Fillenbaum G. Multidimensional functional assessment of older adults. Hillsdale, New Jersey: Erlbaum; 1988.
32. Kervinen K, et al. Atherosclerosis. 1994;105:89.
33. Schachter F, et al. Nature Genetics. 1994;6:29.
34. Bonafè M, et al. Gene. 2002;286:121.
35. D'mello NP, et al. Journal of Biological Chemistry. 1994;269:15451.
36. Egilmez NK. Journal of Biological Chemistry. 1989;264:14312.
37. Dai J, et al. Exact sample sizes needed to detect dependence in 2 x 2 x 2 tables. Biometrics. submitted.
38. Fallin D, Schork NJ. Am J Hum Genet. 2000;67:947.

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group