J Gerontol A Biol Sci Med Sci. 2012 April; 67A(4): 433–439.
Published online 2011 September 20. doi:  10.1093/gerona/glr172
PMCID: PMC3309871

Reliability and Validity of the Pittsburgh Sleep Quality Index and the Epworth Sleepiness Scale in Older Men

Background

The Pittsburgh Sleep Quality Index (PSQI) and the Epworth Sleepiness Scale (ESS) are commonly used to quantify sleep and excessive daytime sleepiness in older adults. These measures, however, have not been comprehensively evaluated for their psychometrics in older men. We determined the internal consistency reliability and construct validity of the PSQI and ESS in a sample of older men.

Methods

Participants were 3,059 men (mean age = 76.4 years) in the Osteoporotic Fractures in Men Study (MrOS) who completed the two questionnaires, wrist actigraphy, and a range of additional psychosocial and health measures.

Results

Internal consistency was adequate for the PSQI (Cronbach’s α = .69) and the ESS (α = .70) total scores. PSQI daytime dysfunction and sleep medications components were weakly associated with the total score, but their removal did not notably improve internal consistency. PSQI and ESS totals were associated with each other and with theoretically related variables (ie, actigraphic variables, depressive symptoms, mobility/instrumental activities of daily living, health-related quality of life) in expected directions. The PSQI differentiated participants reporting no sleep disorder from those reporting particular disorders more reliably than the ESS.

Conclusions

In general, we found evidence of the internal consistency reliability and construct validity of the PSQI and ESS in older men. Despite low correlation with the PSQI global score, the PSQI daytime dysfunction and sleep medications components do not appreciably reduce the PSQI total score’s reliability or validity in older men.

Keywords: Validity, Reliability, Sleep, Men, Psychometrics

Sleep complaints and sleep-disordered breathing (SDB) are common in older adults. Over half of elders report some difficulty falling or staying asleep, early waking, requiring a nap, or nonrestorative sleep (1). Estimates of SDB prevalence range as high as 62% in community-dwelling elders (2). The elevated prevalence of sleep disturbance in older adults is particularly troubling in light of studies linking poor sleep to adverse outcomes in this population, including cognitive (3,4) and functional impairment (1).

Given the prevalence of sleep disturbance among elders and evidence for adverse consequences of poor sleep, reliable and valid measures are needed to maximize the rigor of late-life sleep assessment. Questionnaires commonly used to quantify sleep disturbance and its consequences include the Pittsburgh Sleep Quality Index (PSQI (5)) for sleep quality and the Epworth Sleepiness Scale (ESS (6)) for daytime sleepiness. The original PSQI validation was performed with mixed-age healthy controls, individuals with major depression, and sleep clinic patients (5). Additional research supports the reliability and validity of the PSQI in various populations, including patients with cancer and other medical conditions (7). The ESS was first shown to be reliable in medical students and patients with a range of sleep disorders (8); its validity was initially demonstrated in patients with sleep disorders and controls (6).

We know little, however, about the reliability and validity of the PSQI and ESS in the general population of older adults. We recently studied the internal consistency reliability and construct validity of the PSQI and ESS in older women in the Study of Osteoporotic Fractures (SOF (9)). Total scores for both measures had good internal consistency, but multiple PSQI items and two subscales had low correlations with the total score (9). Advancement of knowledge regarding late-life sleep quality and daytime sleepiness also requires validation of these measures in older men.

We evaluated the internal consistency reliability and construct validity of the PSQI and the ESS in a cohort of older men. We hypothesized that greater disturbance on the questionnaires would be associated with (a) poorer objective sleep, indexed by greater actigraphically measured sleep fragmentation and daytime napping; (b) more depressive symptoms, greater mobility/instrumental activity of daily living (IADL) difficulty, and lower health-related quality of life; and (c) the presence of self-reported sleep disorders.

Methods

Participants

Participants were older men from the Osteoporotic Fractures in Men Study (MrOS), a cohort study of aging. The 5,994 men who participated in the initial MrOS study visit enrolled between 2000 and 2002 and were residents of Birmingham, AL; Minneapolis, MN; Palo Alto, CA; Pittsburgh/Monongahela Valley, PA; Portland, OR; and San Diego, CA. Additional detail about study design is described elsewhere (10,11). Participants had to be at least 65 years old, live near a study site, be able to provide informed consent, ambulate unassisted, and complete questionnaires. Men with bilateral hip replacement or an imminently terminal medical condition were excluded.

Data were collected during the MrOS Sleep Visit (2003–2005), from which men were excluded if they had an open tracheotomy or if over the prior 3 months, they used continuous positive airway pressure or bi-level positive airway pressure masks more than twice weekly, an oral appliance for snoring/SDB, or nocturnal oxygen therapy. Men could participate, however, if they discontinued use of devices during the Sleep Visit. Of the 5,994 men in the original cohort, 344 died, 36 terminated their participation, 150 were ineligible, and 1,997 were unwilling to participate. Of the remaining 3,467 men, 332 were not enrolled because recruitment goals had been met, leaving 3,135 men who participated in the Sleep Visit. Of these, 76 had someone else complete their questionnaires and were excluded. Our analysis sample contained 3,059 men, including 46 who reported using one of the above devices.
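The participant flow described above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the variable names are illustrative and not part of the study.

```python
# Participant flow for the analysis sample, using the counts reported in the text.
original_cohort = 5994
died, terminated, ineligible, unwilling = 344, 36, 150, 1997

# Men still available after deaths, terminations, ineligibility, and refusals.
remaining = original_cohort - (died + terminated + ineligible + unwilling)

# 332 men were not enrolled because recruitment goals had already been met.
sleep_visit = remaining - 332

# 76 men had someone else complete their questionnaires and were excluded.
analysis_sample = sleep_visit - 76

# Men who were alive and had not terminated participation but are not in the
# analysis sample (the comparison group described in the Results).
excluded_alive = original_cohort - died - terminated - analysis_sample

print(remaining, sleep_visit, analysis_sample, excluded_alive)
```

The four values printed correspond to the 3,467 remaining men, the 3,135 Sleep Visit participants, the 3,059 men in the analysis sample, and the 2,555 excluded men who were alive and had not terminated participation.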

Measures

Pittsburgh Sleep Quality Index.—

The PSQI, a self-report measure of sleep quality (5), queries about multiple sleep-related variables over the preceding month, using Likert and open-ended response formats. Respondents complete 19 items about themselves, of which 18 are used to calculate scores. Five additional items are completed by a bed partner or roommate but are not used to calculate scores. The PSQI yields seven component (ie, subscale) scores: subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, sleep medication, and daytime dysfunction. Component scores range from 0 to 3 and are summed to obtain a global score, which ranges from 0 to 21. Higher scores suggest greater sleep disturbance; a global score greater than 5 suggests a significant disturbance (5).

Epworth Sleepiness Scale.—

The ESS is an eight-item measure of daytime sleepiness. Respondents report their likelihood of falling asleep in particular situations using a 4-point Likert scale (6). Responses are summed, and higher scores indicate greater sleepiness; scores greater than 10 suggest excessive daytime sleepiness (6).
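The total-score rules for the two questionnaires can be sketched as follows. This is an illustration only: it assumes the seven PSQI component scores have already been derived from the individual items (the item-to-component rules are given in the original PSQI publication and are not reproduced here), and it assumes each ESS item is coded 0 to 3, as in the standard ESS. The function names, dictionary keys, and the example respondent are hypothetical.

```python
PSQI_COMPONENTS = (
    "subjective_sleep_quality", "sleep_latency", "sleep_duration",
    "habitual_sleep_efficiency", "sleep_disturbances",
    "sleep_medication", "daytime_dysfunction",
)
PSQI_CUTOFF = 5   # global score > 5 suggests a significant disturbance
ESS_CUTOFF = 10   # total score > 10 suggests excessive daytime sleepiness


def _check_range(value, label):
    """Each component or item score must be an integer from 0 to 3."""
    if value not in (0, 1, 2, 3):
        raise ValueError(f"{label} must be 0, 1, 2, or 3; got {value!r}")


def psqi_global(components):
    """Sum the seven precomputed PSQI component scores (range 0 to 21)."""
    missing = [name for name in PSQI_COMPONENTS if name not in components]
    if missing:
        raise ValueError(f"missing PSQI components: {missing}")
    for name in PSQI_COMPONENTS:
        _check_range(components[name], name)
    return sum(components[name] for name in PSQI_COMPONENTS)


def ess_total(items):
    """Sum the eight ESS item responses (range 0 to 24 with 0-3 coding)."""
    if len(items) != 8:
        raise ValueError("the ESS has eight items")
    for position, value in enumerate(items, start=1):
        _check_range(value, f"ESS item {position}")
    return sum(items)


# Hypothetical respondent, for illustration only.
example_psqi = {
    "subjective_sleep_quality": 1, "sleep_latency": 2, "sleep_duration": 1,
    "habitual_sleep_efficiency": 1, "sleep_disturbances": 1,
    "sleep_medication": 0, "daytime_dysfunction": 1,
}
example_ess = [1, 2, 0, 1, 3, 0, 2, 0]

psqi_score = psqi_global(example_psqi)
ess_score = ess_total(example_ess)
print(psqi_score, psqi_score > PSQI_CUTOFF)
print(ess_score, ess_score > ESS_CUTOFF)
```

Both cutoffs are applied as strict inequalities, matching the "greater than 5" and "greater than 10" thresholds described above.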


Sleep Visit participants completed three or more 24-hour periods of wrist actigraphy (Octagonal Sleep Watch; Ambulatory Monitoring, Inc., Ardsley, NY). Data were edited and scored using Action W software (Ambulatory Monitoring, Inc.). To facilitate this, participants indicated in a sleep diary when they got into and out of bed or removed the actigraph. We studied one nighttime variable (wake after sleep onset or WASO; total time spent awake after initial sleep onset) and one daytime variable (napping; mean time spent inactive while out of bed). Averages were calculated across 24-hour periods.

Other measures.—

Participants provided demographic information upon enrollment. At the Sleep Visit, participants indicated whether a health care provider had told them that they have sleep apnea, insomnia, restless legs, periodic leg movements, or narcolepsy. They also reported on history of medical conditions and difficulty with five mobility/IADL items (ie, walking, climbing 10 steps, preparing meals, doing heavy housework, shopping (12,13)) and reported alcohol and caffeine use. Participants also completed the SF-12, a measure of physical– and mental health–related quality of life (14) and the 15-item Geriatric Depression Scale (GDS (15)). Medication use was recorded in an electronic inventory based on the Iowa Drug Information Service Drug Vocabulary (College of Pharmacy, University of Iowa, Iowa City, IA (16)).

Statistical Analyses

We compared characteristics of our sample and men excluded from this study using t tests for continuous variables, Wilcoxon tests for skewed variables, and chi-square tests for categorical variables. We then compared PSQI and ESS scores by age, educational level, and race/ethnic group (white vs minority). To evaluate the questionnaires’ internal consistency, we calculated raw Cronbach’s αs for total and subscale scores and raw corrected item-total, item-component, and component-total Spearman’s rho (rs) correlations. We selected Cronbach’s α ≥ .70 and corrected correlations (rs) ≥ .30 to indicate adequate internal consistency. Scores of participants reporting particular sleep disorders were compared with those of men reporting no such history using t tests or Wilcoxon tests for skewed variables. Analyses were conducted using SAS (SAS Institute, Cary, NC). We selected α < .05 for statistical significance.
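The internal consistency statistics named above can be illustrated with a short sketch. The study's analyses were run in SAS; the Python below is not the authors' code but a minimal illustration, assuming complete data arranged as one row per respondent and one column per item (or component). All function names and the toy data are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Raw Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def corrected_item_total(rows, j):
    """Spearman correlation of item j with the sum of the remaining items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return spearman(item, rest)


def alpha_if_deleted(rows, drop):
    """Cronbach's alpha after removing the columns whose indexes are in `drop`."""
    kept = [[v for j, v in enumerate(r) if j not in drop] for r in rows]
    return cronbach_alpha(kept)


# Toy data for illustration only: 6 respondents, 4 items scored 0 to 3.
toy = [
    [0, 1, 1, 0],
    [1, 1, 2, 0],
    [2, 2, 2, 1],
    [3, 2, 3, 0],
    [1, 0, 1, 3],
    [2, 3, 3, 1],
]
print(round(cronbach_alpha(toy), 2))
print([round(corrected_item_total(toy, j), 2) for j in range(4)])
print(round(alpha_if_deleted(toy, {3}), 2))
```

The "corrected" correlation excludes the item from its own total, which avoids inflating the correlation through shared variance. The alpha-if-deleted helper mirrors the kind of comparison reported in the Results, where alpha is recomputed without two PSQI components.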

Results

We excluded 2,555 participants, who were alive and had not terminated participation at the time of the Sleep Visit, for reasons explained above. At baseline, these participants were older (mean age 73.9 vs 73.0 years), had less education (73.2% vs 79.5% had at least some college education), and were less likely to have had 12 alcoholic drinks in the past 12 months (61.6% vs 68.2%) than the 3,059 men in our analysis sample (all p < .001). Those excluded were more likely to take antidepressants (6.9% vs 5.4%) and have diabetes (11.2% vs 9.4%), hypertension (45.1% vs 40.9%), congestive heart failure (5.5% vs 3.7%), and chronic obstructive pulmonary disease (COPD; 11.3% vs 9.3%) and have lower SF-12 physical component summaries (mean 48.4 vs 49.9; all p < .05). There was a trend toward a higher prevalence of coronary artery disease (ie, myocardial infarction or angina) in excluded participants compared with our analysis sample (21.8% vs 19.7%; p = .05).

At the Sleep Visit, participants’ mean age ± standard deviation was 76.4 ± 5.5 years; 90.1% were white (Table 1). Other race/ethnic groups in our sample were black (3.6%), Asian (3.2%), Hispanic (1.9%), and “other” (1.2%). Most participants (79.5%) had at least some college education. Approximately 50% had hypertension, 24% coronary artery disease, 13% diabetes, 6% congestive heart failure, and 5% COPD. Participants had 0.4 ± 0.8 mobility/IADL impairments. Approximately 6% had elevated depressive symptoms (GDS ≥ 6). Mean SF-12 physical and mental component summaries were 48.7 ± 10.1 and 55.0 ± 7.5, respectively, indicating good physical– and mental health–related quality of life (14).

Table 1.
Participant Characteristics (M ± SD or n (%))

Descriptive Statistics for Questionnaires

Participants’ mean PSQI global score was 5.6 ± 3.2 (Table 2), indicating that, on average, participants met a minimal threshold definition for poor sleep quality (ie, PSQI > 5); 44.2% had a global score greater than 5. PSQI global scores differed by age group (p = .04), but the proportion of men with PSQI greater than 5 did not. Higher educational attainment was associated with lower global PSQI scores and decreased likelihood of PSQI greater than 5 (ps < .001). Men from race/ethnic minorities had higher global scores and a greater likelihood of PSQI greater than 5 than white men (ps = .02).

Table 2.
Descriptive Statistics (M ± SD or n (%)) for PSQI Scores Across Demographic Variables

On the ESS, participants’ mean score was 6.1 ± 3.6; 12.7% had an ESS score greater than 10 (Table 3). ESS scores did not differ by age or education. Men from minority groups had higher scores than white men (p = .02) but were not more likely to have ESS greater than 10 (p = .24).

Table 3.
Descriptive Statistics (M ± SD or n (%)) for ESS Scores Across Demographic Variables

Internal Consistency

Internal consistency was adequate for both questionnaires’ total scores. The PSQI global score had α = .69 (Table 4). Corrected component-total correlations ranged from .25 for the daytime dysfunction component and .28 for the sleep medications component to .57 for the sleep quality component. Removal of the daytime dysfunction and sleep medications components increased the PSQI global score’s α to .72.

Table 4.
PSQI Internal Consistency Data

The ESS had an α = .70. Item-total correlations ranged from .30 for the item assessing dozing while stopped in traffic to .51 for the item regarding dozing while sitting after lunch (Table 5).

Table 5.
ESS Internal Consistency Data

Construct Validity

Worse sleep on the PSQI was associated with greater sleepiness on the ESS (r = .13; Table 6; all p values < .001). On both questionnaires, greater disturbance was modestly associated with longer actigraphic measurements of napping and greater WASO. The largest correlation was between WASO and the PSQI global score (r = .18).

Table 6.
Pearson Correlations Between Sleep Measures and Relevant Variables

Greater disturbance on the PSQI was moderately associated with a greater number of depressive symptoms on the GDS (r = .34), and lower mental health–related quality of life on the SF-12 (r = −.30). Greater sleepiness on the ESS was modestly associated with a greater number of depressive symptoms (r = .17) and lower SF-12 mental component scores (r = −.12).

Worse PSQI scores were modestly associated with greater impairment on the mobility/IADL questionnaire (r = .23) and moderately associated with lower SF-12 physical component summaries (r = −.31). Worse ESS scores were more weakly yet significantly associated with greater mobility/IADL impairment (r = .11) and lower SF-12 physical component scores (r = −.17). Removal of the daytime dysfunction and sleep medication components from the PSQI did not improve the correlations between the global score and theoretically relevant variables (data not shown).

Men reporting an insomnia history or periodic leg movements had worse PSQI scores than those reporting no sleep disorder history (Table 7; p < .001); they did not differ significantly on the ESS. Men reporting a history of restless legs, sleep apnea, or narcolepsy had more sleep problems on both questionnaires than those reporting no sleep disorder history (p < .05 for all). With the daytime dysfunction and sleep medication components removed, all of these associations with the PSQI, except that with narcolepsy history, remained significant (data not shown).

Table 7.
Scores on Questionnaires by Diagnostic Category

Discussion

We studied the internal consistency and validity of the PSQI and ESS in 3,059 older men in the Osteoporotic Fractures in Men Study (MrOS). Results generally suggest that these measures are reliable and valid for use in this population.

The PSQI global score’s internal consistency was on the border of the adequate range (α = .69). The sleep medication and daytime dysfunction components had low correlations with the global score; their removal increased internal consistency to α = .72. Although this value is lower than the high internal consistency (α = .83) reported in the original PSQI publication (5), our findings in older men are consistent with those in older women in SOF, which included an overall α = .72 and low correlations of the same two components with the global score (9). These two studies suggest that the PSQI is internally consistent in older adults, but slightly less so in older men than in older women, and that the medication and daytime dysfunction components might not co-vary with other PSQI components in older adults.

The ESS had adequate psychometrics in older men, with an α = .70, and corrected item-total correlations ≥.30. A lower corrected item-total correlation for the ESS “traffic” item was consistent with findings in SOF (9). Results from these two studies suggest that the ESS is internally consistent when used in older adults.

Modest to moderate correlations with theoretically relevant variables suggest that the PSQI and ESS are valid for use in older men; these results are similar to those from older women in SOF (9). Specifically, we found that correlations between questionnaires and actigraphic WASO and napping were modest but statistically significant in older men. Another study in the MrOS cohort reported that greater WASO, measured by polysomnography (PSG), was independently associated with greater disturbance on the PSQI and the ESS, after adjustment for potential confounders (17). We found that removing the daytime dysfunction and sleep medication components did not improve the association between the PSQI global score and theoretically relevant variables.

Although we observed a moderate correlation between depressive symptoms and the PSQI, depressive symptoms were more modestly associated with the ESS in older men in the present study and in older women in SOF (9). Correlations between depressive symptoms and the PSQI and ESS are consistent with results from another MrOS study, which used a categorical GDS variable to investigate multivariable-adjusted associations with the PSQI and ESS (18), rather than the continuous GDS score and bivariate correlations typically used for validation, reported here.

We found that the PSQI and, to a lesser degree, the ESS were more strongly associated with subjective psychosocial and functional measures (eg, GDS, SF-12 indices, mobility/IADLs) than with actigraphic variables. Similarly, Buysse and colleagues (19) categorized a mixed-age community sample on the basis of their PSQI and ESS scores and found that these groups differed on self-report psychosocial measures and sleep diaries but not on objective sleep measures. They suggested that the PSQI and ESS measure aspects of subjective sleep and wake that differ from the signals recorded by objective sleep measures (19). Although we detected correlations between these questionnaires and actigraphic measures, their modest magnitude supports this conclusion, particularly with regard to the PSQI.

Similar to findings in women (9), we found that men reporting a range of sleep disorders had worse scores on the PSQI than men reporting no sleep diagnoses. Removing the daytime dysfunction and sleep medication components did not improve the measure’s ability to differentiate between men reporting different sleep disorders. ESS scores also differentiated men reporting several different sleep diagnoses, except insomnia or periodic leg movements, from those reporting no sleep disorders. Among the self-reported diagnoses assessed in SOF, however, ESS scores differed between older women with and without insomnia (9).

Our study has limitations. Our sample included only men, although comparisons with women suggest comparable psychometrics (9), and only 10% of participants were from race/ethnic minorities. Also, we omitted objective measures of SDB and nocturnal sleep duration because a prior MrOS paper reported unadjusted associations of PSG-measured SDB and actigraphic sleep duration with the questionnaires (17). Furthermore, in the present study, napping was measured by actigraphy. Although actigraphy might have utility in assessing daytime sleepiness when other methods are not feasible (20), actigraphy might incorrectly identify reduced activity as naps (21). This study also excluded men routinely using continuous positive airway pressure who would not discontinue use; it is unclear whether results will generalize to men with treated SDB. Finally, participants tended to be healthier than those who were excluded, so selection bias might have affected results.

Conclusions

Findings generally support the internal consistency reliability and construct validity of the PSQI and ESS in older men. Although two PSQI components had low correlations with the global score, their removal only slightly improved overall internal consistency and did not affect correlations with theoretically relevant variables, suggesting that these components can be retained when using the PSQI with older men.

Funding

The Osteoporotic Fractures in Men (MrOS) Study is supported by National Institutes of Health funding. The following Institutes provide support: the National Institute of Arthritis and Musculoskeletal and Skin Diseases, the National Institute on Aging (NIA), the National Center for Research Resources (NCRR), and National Institutes of Health Roadmap for Medical Research under the following grant numbers: U01 AR45580, U01 AR45614, U01 AR45632, U01 AR45647, U01 AR45654, U01 AR45583, U01 AG18197, U01-AG027810, and UL1 RR024140. The National Heart, Lung, and Blood Institute provides funding for the MrOS Sleep ancillary study “Outcomes of Sleep Disorders in Older Men” under the following grant numbers: R01 HL071194, R01 HL070848, R01 HL070847, R01 HL070842, R01 HL070841, R01 HL070837, R01 HL070838, and R01 HL070839. A.P.S. is supported by K01AG033195 and S.A-I. by R01AG008415 from the NIA. E.J.K. is supported by a career development award from the NCRR of the National Institutes of Health and a Triological Society Research Career Development Award of the American Laryngological, Rhinological, and Otological Society (KL2RR024130).

Conflict of Interest

A.P.S.: received honoraria as a clinical editor for the International Journal of Sleep and Wakefulness—Primary Care, which receives pharmaceutical company support. S.A-I.: consultant/scientific advisory board for Ferring Pharmaceuticals Inc., GlaxoSmithKline, Merck, NeuroVigil, Inc., Neurocrine Biosciences, Pfizer, Philips Respironics, sanofi-aventis, Sepracor, Inc. E.J.K.: Apnex Medical (medical advisory board, consultant), ArthroCare (consultant), Medtronic (consultant), Pavad Medical (consultant), ReVENT Medical (medical advisory board).

Acknowledgments

We thank Katherine Lou and Kaycee Rashid for their assistance with editing the manuscript. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

1. Foley DJ, Monjan AA, Brown SL, Simonsick EM, Wallace RB, Blazer DG. Sleep complaints among elderly persons: an epidemiologic study of three communities. Sleep. 1995;18:425–432.
2. Ancoli-Israel S, Kripke DF, Klauber MR, Mason WJ, Fell R, Kaplan O. Sleep-disordered breathing in community-dwelling elderly. Sleep. 1991;14:486–495.
3. Cohen-Zion M, Stepnowsky C, Marler M, Shochat T, Kripke D, Ancoli-Israel S. Changes in cognitive function associated with sleep disordered breathing in older people. J Am Geriatr Soc. 2001;49:1622–1627.
4. Faubel R, Lopez-Garcia E, Guallar-Castillon P, Graciani A, Banegas JR, Rodriguez-Artalejo F. Usual sleep duration and cognitive function in older adults in Spain. J Sleep Res. 2009;18:427–435.
5. Buysse DJ, Reynolds CF, Monk TH, Berman SR, et al. The Pittsburgh Sleep Quality Index: a new instrument for psychiatric practice and research. Psychiatry Res. 1989;28:193–213.
6. Johns M. A new method for measuring daytime sleepiness: the Epworth Sleepiness Scale. Sleep. 1991;14:540–545.
7. Carpenter JS, Andrykowski MA. Psychometric evaluation of the Pittsburgh Sleep Quality Index. J Psychosom Res. 1998;45:5–13.
8. Johns MW. Reliability and factor analysis of the Epworth Sleepiness Scale. Sleep. 1992;15:376–381.
9. Beaudreau SA, Spira AP, Stewart A, et al. Validation of the Pittsburgh Sleep Quality Index and the Epworth Sleepiness Scale in older black and white women. Sleep Med. In press.
10. Orwoll E, Blank JB, Barrett-Connor E, et al. Design and baseline characteristics of the osteoporotic fractures in men (MrOS) study—a large observational study of the determinants of fracture in older men. Contemp Clin Trials. 2005;26:569–585.
11. Blank JB, Cawthon PM, Carrion-Petersen ML, et al. Overview of recruitment for the osteoporotic fractures in men study (MrOS). Contemp Clin Trials. 2005;26:557–568.
12. Fitti J, Kovar M. The supplement on aging to the 1984 National Health Interview Survey. Vital Health Stat 1. 1987;21:1–115.
13. Pincus T, Summey JA, Soraci SA, Jr, Wallston KA, Hummon NP. Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum. 1983;26:1346–1353.
14. Ware J, Jr, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34:220–233.
15. Sheikh JI, Yesavage JA. Geriatric Depression Scale: recent evidence and development of a shorter version. Clin Gerontol. 1986;5:165–173.
16. Pahor M, Chrischilles EA, Guralnik JM, Brown SL, Wallace RB, Carbonin P. Drug data coding and analysis in epidemiologic studies. Eur J Epidemiol. 1994;10:405–411.
17. Kezirian EJ, Harrison SL, Ancoli-Israel S, et al. Behavioral correlates of sleep-disordered breathing in older men. Sleep. 2009;32:253–261.
18. Paudel ML, Taylor BC, Diem SJ, et al. Association between depressive symptoms and sleep disturbances in community-dwelling older men. J Am Geriatr Soc. 2008;56:1228–1235.
19. Buysse DJ, Hall ML, Strollo PJ, et al. Relationships between the Pittsburgh Sleep Quality Index (PSQI), Epworth Sleepiness Scale (ESS), and clinical/polysomnographic measures in a community sample. J Clin Sleep Med. 2008;4:563–571.
20. Littner M, Kushida CA, Anderson WM, et al. Practice parameters for the role of actigraphy in the study of sleep and circadian rhythms: an update for 2002. Sleep. 2003;26:337–341.
21. Ancoli-Israel S, Cole R, Alessi C, Chambers M, Moorcroft W, Pollak CP. The role of actigraphy in the study of sleep and circadian rhythms. Sleep. 2003;26:342–392.

Articles from The Journals of Gerontology Series A: Biological Sciences and Medical Sciences are provided here courtesy of Oxford University Press