Women with a history of gestational diabetes mellitus are at high risk for type 2 diabetes mellitus. We systematically reviewed and synthesized the literature on the sensitivity, specificity, and reproducibility of postpartum screening tests for type 2 diabetes in women with prior gestational diabetes to inform screening guidelines.
We searched electronic databases through October 1, 2008. Two investigators independently reviewed titles, abstracts, and articles, performed serial data abstraction, and independently assessed quality. We calculated standard errors and confidence intervals for sensitivity and specificity using the exact binomial formula.
Eleven studies contained 13 evaluations of a comparison screening test with the 2-h 75-g oral glucose tolerance test (OGTT) reference. All studies used a cross-sectional study design. There were ten comparisons of a single fasting blood glucose (FBG)≥7.0mmol/L (≥126mg/dL) with the OGTT. The sensitivity ranged from 14%–100% in five studies using the 1985 World Health Organization's (WHO) criteria as the reference and from 16%–89% in five studies using the 1999 WHO criteria as the reference. Variation in the sensitivities may be due to the limited number of comparisons, differences in populations, and timing of screening. There were high losses to follow-up, limiting generalizability.
When compared with the OGTT, the single FBG alone was not consistently reported to be a sensitive screening test for type 2 diabetes in women with a history of gestational diabetes. Longitudinal studies are needed to address the natural history of glucose metabolism in women with a history of gestational diabetes, the optimal approach to diagnostic testing for type 2 diabetes in this population, and the short-term and long-term outcomes of testing.
Gestational diabetes mellitus complicates 1%–14% of pregnancies, depending on the population studied.1,2 Its prevalence reflects the population's frequency of risk factors for glucose intolerance and type 2 diabetes, including obesity, older age, and race/ethnicity.3 Women with a history of gestational diabetes are at high risk for type 2 diabetes mellitus, with 3.7% diagnosed with type 2 diabetes at 9 months after delivery and 18.9% at 9 years after delivery in a large Canadian cohort.4 Women with type 2 diabetes continue to be at higher risk for cardiovascular disease (CVD) and overall mortality compared with men with diabetes.5 Interval assessment of women with a history of gestational diabetes in the obstetrical and primary care settings might effectively reduce this excess morbidity and mortality through earlier diagnosis, risk stratification, and primary prevention of type 2 diabetes and subsequent CVD.
Postpartum screening for type 2 diabetes among women with a history of gestational diabetes is recommended at 6–12 weeks postpartum by several national and international organizations and conferences,6–12 but there is no consensus about whether the complete oral glucose tolerance test (OGTT) or the single fasting blood glucose (FBG) screening test is more effective. The frequency of repeated screenings in women with a history of gestational diabetes was addressed using a simulated cost-effectiveness model, which showed that the OGTT every 3 years and a FBG yearly were both cost-effective options.13 Despite recommendations for screening, studies show that few women receive any glucose testing to screen for type 2 diabetes in the postpartum period.14,15
In 1997 the American Diabetes Association (ADA) recommended diagnosis of type 2 diabetes using the FBG over routine use of the OGTT.7 Their recommendation has been debated because of concern about missed diagnoses in individuals who would have been diagnosed only with the 2-h plasma glucose determination.16 Knowledge of the comparative performance of these available tests may inform screening guidelines for postpartum testing and improve patient and provider adherence with recommendations.
No prior studies have systematically reviewed and analyzed the effectiveness of screening tests for type 2 diabetes in women with a history of gestational diabetes. Our objective was to conduct a systematic review of the sensitivity, specificity, and reproducibility of screening tests for type 2 diabetes in women with a history of gestational diabetes.
We conducted a systematic review of the literature as part of a comprehensive evidence report for the Agency for Healthcare Research and Quality (AHRQ) on intrapartum and postpartum management of gestational diabetes mellitus.17 In this article, we go beyond the information in the report17 by presenting additional qualitative data and the results of three new studies found when updating the search.
We searched MEDLINE® (1950 through October 1, 2008), EMBASE® (1974 through October 1, 2008), The Cochrane Central Register of Controlled Trials (CENTRAL, 2006 through October 1, 2008), and the Cumulative Index to Nursing & Allied Health Literature (CINAHL®, 1982 through October 1, 2008) for original articles. Our search included MeSH and text terms related to gestational diabetes mellitus and type 2 diabetes mellitus and was limited to studies published in the English language and conducted in human subjects. We also reviewed reference lists of related reviews and included articles and conducted a hand search of recent issues of 13 medical journals that we had identified as likely to publish on the topic of gestational diabetes.
Two reviewers conducted independent title reviews. All titles identified as potentially addressing the research question were promoted to the abstract review phase. Abstracts were reviewed independently by two investigators and were excluded if both investigators agreed that the article met one or more of the following exclusion criteria: (1) contained no original data (i.e., was a meeting abstract, editorial, commentary, or letter), (2) did not evaluate women with gestational diabetes, (3) was a case report or case series, (4) did not base the diagnosis of gestational diabetes on either a 3-h 100-g OGTT or a 2-h 75-g OGTT, and (5) did not examine postpartum screening tests for type 2 diabetes. After abstract review, articles were independently reviewed by two investigators to determine if they should be included for full data abstraction. Disagreements were resolved by consensus.
We developed forms to abstract data about study population, study design, comparison test and reference standard, and comparison test outcomes (i.e., sensitivity, specificity, and reproducibility). A second investigator confirmed the data abstracted by the first investigator. We developed a quality assessment checklist based on the Standards for Reporting Diagnostics Accuracy (STARD) criteria to evaluate the results of diagnostic accuracy studies.18
For each eligible study, two investigators abstracted data serially to create a 2×2 table for each comparison test, which contained data for the number of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). We then calculated the sensitivity (number of TP divided by the sum of TP and FN) and specificity (number of TN divided by the sum of the TN and FP) for each comparison test using the structured 2×2 tables. As some cells contained small numbers or zero, we calculated standard errors (SE) and confidence intervals (CI) using the exact binomial formula.19 All analyses were performed in Stata (Stata, version 9.2, StataCorp, College Station, TX). We were unable to combine these studies in quantitative meta-analyses because of clinical diversity in the sampling methods, study populations, and time intervals between delivery and screening test.
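The calculation described above can be illustrated with a minimal Python sketch. It is not the Stata code used for the review: the function names are hypothetical, the exact binomial interval is assumed to be the Clopper-Pearson interval (solved here by bisection with the standard library only), and the 2×2 counts are invented for illustration rather than taken from any included study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(successes, n, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for a proportion."""

    def solve(f, target):
        # f is monotonically decreasing in p on [0, 1]; bisect for f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= successes) equals alpha / 2.
    if successes == 0:
        lower = 0.0
    else:
        lower = solve(lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= successes) equals alpha / 2.
    if successes == n:
        upper = 1.0
    else:
        upper = solve(lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, for illustration only (note the zero FP cell).
tp, fp, fn, tn = 7, 0, 3, 90
sens, spec = sensitivity_specificity(tp, fp, fn, tn)
sens_ci = exact_binomial_ci(tp, tp + fn)
spec_ci = exact_binomial_ci(tn, tn + fp)
print(f"sensitivity {sens:.2f}, 95% CI {sens_ci[0]:.2f} to {sens_ci[1]:.2f}")
print(f"specificity {spec:.2f}, 95% CI {spec_ci[0]:.2f} to {spec_ci[1]:.2f}")
```

An exact interval stays within 0 and 1 and remains defined when a cell is zero, which is the situation the small-cell comment above describes.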
Of 17,203 unique citations, we identified 11 articles with 13 evaluations of a screening test with a reference standard that met criteria for this review (Appendix 1). There were three general comparisons: (1) two different diagnostic threshold values applied to the FBG component of the 2-h 75-g OGTT,6,20 (2) a single FBG level≥7.0mmol/L (≥126mg/dL) compared with the 2-h 75-g OGTT using the 1999 World Health Organization (WHO) criteria,6,7 and (3) a single FBG level≥7.0mmol/L (≥126mg/dL) compared with the 2-h 75-g OGTT using the 1985 WHO criteria7,20 (Tables 1 and 2).
Table 2 shows a summary of the key characteristics of the 11 studies. All were published between 1999 and 2008 and had a cross-sectional design. Seven studies21,22,25,26,27,29,31 applied the threshold glucose values of the comparison test to previously collected postpartum 2-h 75-g OGTT results. These studies used a clinic convenience sample including all consecutive patients who returned for postpartum testing within a specified time period. Four studies23,24,28,30 recruited patients with a history of gestational diabetes for subsequent type 2 diabetes screening. Cypryk et al.24 independently performed an FBG and a 2-h 75-g OGTT as two separate tests for comparison, and the study by Hunt and Conway26 was the only one that reported confirmatory testing for all type 2 diabetes diagnoses on a subsequent day (Tables 2 and 3).
The racial and ethnic composition of the study populations differed, and 3 studies did not directly report on race or ethnicity.22,30,31 In 4 studies, the populations were more than 75% Caucasian. The other 4 studies had mainly Arab,21 Mexican American,26 or a mixture of racial/ethnic groups represented27 or a high prevalence of nonwhite participants, as reported in the parent study30,32 (Table 2).
The majority of the studies screened for type 2 diabetes within 1 year following delivery.21–23,25–27,29,31 Two studies reported wide ranges of postpartum testing intervals, 1–86 months (mean 3.1 years) and 6–72 months (median 28 months), respectively.24,28 Only 1 study conducted very late screening of all subjects (4–8 years postpartum) and did not report the mean age of the participants at the time of testing30 (Table 2).
No study fulfilled all the methodological standards for diagnostic test evaluations (Table 3). All the studies had notably high losses to follow-up (range 20%–82%). The rates were highest in those that did not recruit subjects specifically for the study but instead used a clinic convenience sample, as the clinics experienced high rates of postpartum loss to follow-up.21,22,25–27,29,31 Three studies compared subjects who were lost to follow-up with subjects who returned for testing,22,25,26 and 2 of these studies reported that women who failed to return to postpartum testing were at higher risk for type 2 diabetes because of higher glucose values on their antepartum gestational diabetes diagnostic OGTT, insulin requirement during pregnancy, and positive family history for diabetes.26,31
Two studies excluded women with known type 1 or 2 diabetes that had been diagnosed postpartum prior to study start. Cypryk et al.24 recruited 193 women 6 months to 8 years after a delivery complicated by gestational diabetes and excluded 45 women (23.3%) because of new diagnoses of type 1 or 2 diabetes. Kousta et al.28 recruited 192 women 1–86 months after delivery and then excluded 27 women (15%) who had been diagnosed with type 2 diabetes since delivery.
Three studies compared different FBG thresholds as part of the 2-h 75-g OGTT.22,29,30 They reported a specificity of 98%–99% for the OGTT using FBG≥7.0mmol/L (≥126mg/dL)6 compared with FBG≥7.8mmol/L (≥140mg/dL)20 as the reference (Tables 1 and 2). For this comparison, the sensitivity of the comparison test, with the lower FBG threshold (FBG≥7.0mmol/L), was fixed at 100% because, by definition, all values at or above the higher FBG threshold will also meet the lower glucose threshold (i.e., it is not possible to have a false negative test).
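The reason the sensitivity is fixed can be shown with a short Python sketch. It is a simplified illustration that looks only at the fasting component of the OGTT, and the fasting values are invented rather than drawn from any study.

```python
LOWER_FBG = 7.0   # mmol/L, threshold of the comparison test
HIGHER_FBG = 7.8  # mmol/L, threshold of the reference


def two_by_two(fbg_values):
    """Cross-classify fasting values: lower threshold (test) vs higher (reference)."""
    tp = fp = fn = tn = 0
    for fbg in fbg_values:
        test_positive = fbg >= LOWER_FBG
        reference_positive = fbg >= HIGHER_FBG
        if test_positive and reference_positive:
            tp += 1
        elif test_positive:
            fp += 1
        elif reference_positive:
            fn += 1  # unreachable: any value >= 7.8 is also >= 7.0
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical fasting glucose values in mmol/L, for illustration only.
hypothetical_fbg = [4.8, 5.2, 6.1, 7.0, 7.3, 7.8, 8.4, 9.9]
tp, fp, fn, tn = two_by_two(hypothetical_fbg)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(tp, fp, fn, tn, sensitivity, specificity)
```

Because the two thresholds are nested, the false negative cell is always empty. Only the specificity can vary, driven by values that fall between 7.0 and 7.8 mmol/L.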
Five studies21,26,27,30,31 compared a single FBG≥7.0mmol/L (≥126mg/dL)7 to a 2-h 75-g OGTT using the same FBG threshold and 2-h plasma glucose>11.1mmol/L (>200mg/dL)6 (Tables 1 and 2). The sensitivity for the single FBG ranged from 16% to 89% (Fig. 1). For these comparisons, the specificity was fixed at 100%, as all the 2-h 75-g OGTTs with negative results (FBG<7.0mmol/L and 2-h plasma glucose<11.1mmol/L) will necessarily have an FBG<7.0mmol/L, which means it is not possible to have a false positive. The study populations in these 5 studies had different levels of risk for type 2 diabetes.21,26,27,30,31 The highest sensitivity of the FBG (89%) was reported in a study with the longest delay in postpartum testing (4–8 years after delivery), a relatively low loss to follow-up (26%) for this group of studies, and a mainly nonwhite population.30 However, the remaining studies, which conducted postpartum testing at<6 months after delivery, still had high variability in the sensitivity (range 16%–72%).21,26,27,31 The 2 studies with the lowest reported sensitivity of the FBG, 16% by Kitzmiller et al.27 and 31% by Hunt and Conway,26 had high Asian and Latino populations, respectively. Hunt and Conway26 also reported 59% loss to follow-up, which may have contributed to the low sensitivity; Kitzmiller et al.27 did not report the follow-up rate.
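The fixed specificity in this comparison can be illustrated with a minimal Python sketch. The thresholds are those stated in the text, with 11.1 mmol/L taken as the 2-h value equivalent to 200 mg/dL. The function names and the paired glucose values are hypothetical.

```python
FBG_THRESHOLD = 7.0     # mmol/L, fasting threshold used by both tests
TWO_H_THRESHOLD = 11.1  # mmol/L, 2-h threshold used by the OGTT only


def fbg_alone_positive(fbg):
    """Comparison test: single fasting blood glucose."""
    return fbg >= FBG_THRESHOLD


def ogtt_positive(fbg, two_h):
    """Reference: positive if either the fasting or the 2-h value is elevated."""
    return fbg >= FBG_THRESHOLD or two_h > TWO_H_THRESHOLD


# Hypothetical (fasting, 2-h) glucose pairs in mmol/L, for illustration only.
hypothetical_pairs = [
    (5.0, 6.5), (5.4, 12.0), (6.8, 11.5), (7.2, 9.0), (7.5, 13.1), (4.9, 7.8),
]

tp = fp = fn = tn = 0
for fbg, two_h in hypothetical_pairs:
    test, reference = fbg_alone_positive(fbg), ogtt_positive(fbg, two_h)
    if test and reference:
        tp += 1
    elif test:
        fp += 1  # unreachable: a positive FBG makes the OGTT positive too
    elif reference:
        fn += 1  # diabetes detected only by the 2-h value
    else:
        tn += 1

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(tp, fp, fn, tn, sensitivity, specificity)
```

Every false negative in this setup is a woman whose fasting value is below 7.0 mmol/L but whose 2-h value is elevated, so the sensitivity of the FBG directly reflects how often diabetes shows up only on the 2-h measurement.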
Reinblatt et al.31 performed subgroup analyses of higher-risk subgroups based on family history of type 2 diabetes and an insulin requirement during pregnancy. The FBG had similar sensitivities in these subgroups to the entire cohort, where the sensitivity was 46% (95% CI 27–67). In 169 subjects with a family history of type 2 diabetes, the sensitivity of the FBG was 47% (95% CI 24–71), and in 168 subjects who required insulin during pregnancy, the sensitivity of the FBG was 55% (95% CI 32–76).31
Five studies21,23–25,28 compared the FBG≥7.0mmol/L (≥126mg/dL)7 to a 2-h 75-g OGTT with a fasting threshold≥7.8mmol/L (≥140mg/dL) and a 2-h plasma glucose>11.1mmol/L (>200mg/dL)21 (Tables 1 and 2). These studies consistently reported high specificity of the FBG (range 94%–99%) with very few “false positives” (Fig. 2). However, the sensitivities of the FBG alone ranged from 14% to 100% (Fig. 2).
Cypryk et al.24 reported the lowest sensitivity of the single FBG at 14%. The study population differed from that of the other studies, as 23% of the subjects were excluded from screening because of a new diagnosis of type 1 or 2 diabetes postpartum. Also, this population was 100% Polish and most likely Caucasian, unlike the other studies' populations.24 Without this study, the range in sensitivities of the FBG was more moderate (range 69%–100%). We were unable to determine if the variability in sensitivities was related to timing of testing, study population differences, or other factors because of the limited number of studies included in this comparison.
Based on the STARD initiative recommendations,18 we identified those studies that reported imprecision of glucose measures as the coefficient of variation (CV). The CV is calculated by repeating the test and comparing results over several specified days.18 The study by Holt et al.25 was the only study that reported a measure of reproducibility. The CV for the plasma glucose assay was 1.2% at 3.3mmol/L (59.4mg/dL) and 1.5% at 16.5mmol/L (297mg/dL), which reflects low variability.25 No included studies compared the reproducibility of the FBG with the OGTT, which included a 2-h plasma glucose value. Six studies reported the type of laboratory equipment used to test samples as an indicator of quality control.21–23,25,26,28
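As an illustration of the CV, the following Python sketch computes it as the standard deviation of repeated measurements divided by their mean, expressed as a percentage. The replicate values are invented, and the use of the sample standard deviation is an assumption, since the included studies did not describe their computation. The conversion helper uses a factor of 18 mg/dL per mmol/L, which is an assumption chosen to be consistent with the paired values quoted above (3.3 mmol/L and 59.4 mg/dL).

```python
from statistics import mean, stdev

MGDL_PER_MMOLL = 18  # assumed conversion factor for glucose


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100


def mmol_to_mgdl(mmol_per_l):
    """Convert a glucose concentration from mmol/L to mg/dL."""
    return mmol_per_l * MGDL_PER_MMOLL


# Hypothetical repeated measurements of one control sample (mmol/L).
replicates = [5.0, 5.1, 4.9, 5.0, 5.0]
cv_percent = coefficient_of_variation(replicates)
print(f"CV = {cv_percent:.1f}%")
print(f"3.3 mmol/L = {mmol_to_mgdl(3.3):.1f} mg/dL")
```

A CV near 1% to 2%, as reported by Holt et al., means the repeated results scatter by only about that fraction of their mean. It describes assay imprecision only, not day-to-day biological variation in a woman's glucose level.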
To our knowledge, this is the first systematic review assessing the sensitivity, specificity, and reproducibility of postpartum screening tests for type 2 diabetes in women with a history of gestational diabetes. Ideally, a screening test for type 2 diabetes should be both sensitive and specific, as well as reproducible. We found that the FBG≥7.0mmol/L (≥126mg/dL) had a wide range of sensitivities compared with the 2-h 75-g OGTT reference standard. The range of sensitivities was 14%–100% in 5 studies using the 1985 WHO criteria20 as the reference standard and 16%–89% in 5 studies using the 1999 WHO criteria6 as the reference standard. Overall, the FBG had high specificity (>90%) in all potential comparisons. No included study reported the reproducibility of the FBG or OGTT in their population. All the studies in our review were cross-sectional and unable to address the question of postdelivery timing and frequency of screening for type 2 diabetes.
The ADA suggests screening for type 2 diabetes in people over the age of 45 and in higher-risk groups, such as women with prior gestational diabetes, using the FBG over the 2-h OGTT.33 In addition, the National Institute for Health and Clinical Excellence in the United Kingdom recently revised its recommendation and advocates performing an FBG instead of a postpartum OGTT in women with recent gestational diabetes.10 Although debated in the literature, these recommendations to use the FBG were based on the association of an FBG≥7.0mmol/L with microvascular disease and the practical advantage of the FBG over the OGTT, as it is less invasive and less expensive and requires less time to perform.16
In the general adult population, the FBG as a screening test for type 2 diabetes has sensitivities ranging from 32% to 95% and specificities ranging from 84% to 99%.34 We identified a much greater range in the reported sensitivities of the FBG for the diagnosis of type 2 diabetes in women with recent gestational diabetes, however. There are several possible reasons for the high variability. First, it is possible that the pathophysiology of gestational diabetes leading to the development of type 2 diabetes may differ from the development of type 2 diabetes in an older population. There may be additional diagnostic value to the 2-h plasma glucose test in women with recent gestational diabetes. Further studies are needed to understand variations in glucose levels from delivery forward, as well as the consequences of a missed diagnosis if a FBG is used rather than the complete OGTT. Second, even among women who met the criteria for gestational diabetes, there were great differences in the study populations' racial/ethnic composition, timing of testing in relation to delivery, and loss to follow-up. Only 2 studies24,28 excluded women who had been diagnosed with type 1 or 2 diabetes postpartum prior to the screening test. Also, the majority of studies performed testing less than 6 months postpartum, and the sensitivities of the FBG still ranged from 14% to 72%. However, the 3 studies24,28,30 that had mean or median follow-up intervals>2 years postpartum also had older populations, a known risk factor for type 2 diabetes.17
Differences in the study populations as a result of exclusion of known diabetes, timing of testing, and racial/ethnic composition may influence the populations' underlying risk for type 2 diabetes and the current prevalence of hyperglycemia. The impact of these differences on a test's performance characteristics has been referred to as the “spectrum effect.” It may result in lower reported sensitivity of the FBG in lower-risk populations and, depending on the clinical setting, reduce the applicability of these studies to clinical practice.35 For example, in screening studies for type 2 diabetes performed in the general population, both the sensitivity and specificity of the FBG were shown to be higher when subjects with diagnosed diabetes were included.34
Reproducibility, the ability of a test to have consistent results when repeated, is an important criterion for a screening test. A single study25 included in our review reported the CV for the plasma glucose assay, but no study reported the reproducibility of the FBG or OGTT in their population. A study by Corcoy et al.,37 which did not meet inclusion criteria for this review, focused on examining the reproducibility of the 2-h 75-g OGTT according to the 1999 WHO criteria6 for diagnosing type 2 diabetes in women with a history of gestational diabetes. The OGTT was only moderately reproducible, confirming a diagnosis of type 2 diabetes in 60% of women who were retested 3 months after a positive test.37 In addition, the Hoorn Study, performed in the general population, showed that the FBG test is more reproducible than the 2-h plasma glucose test.38
There are several key limitations to our review. First, our research question focused on the diagnostic criteria used in screening for type 2 diabetes, and we did not address the performance characteristics of testing for impaired glucose tolerance or impaired fasting glucose, which may be evidence of a prediabetes state. Clinically, it is not yet clear if identification of this asymptomatic prediabetes state will alter provider recommendations or change outcomes.16 Second, there were 7 studies21–25,28,29 that used an FBG threshold value≥7.8mmol/L (≥140mg/dL) as part of the 2-h 75-g OGTT. This test may no longer be clinically useful, as current guidelines recommend a threshold≥7.0mmol/L (≥126mg/dL) as part of the 2-h 75-g OGTT.6,7 These articles met our inclusion criteria, however, and the comparisons with the FBG still provide information about the additive value of the 2-h plasma glucose.21–24,28 Third, overall study quality was limited by sampling methods involving the use of convenience samples that had high losses to follow-up. Based on Hunt and Conway,26 higher-risk patients in San Antonio, Texas, were less likely to follow up and receive type 2 diabetes screening, although this pattern may have geographic differences. In any case, the high loss to follow-up clearly limited the generalizability of the results.
The primary implication of our study is the identification of an urgent need for future research to determine optimal care for women with a history of gestational diabetes, including which screening test to use for type 2 diabetes, as well as the timing and frequency of testing. Clearly, diagnostic accuracy is an important criterion for providers to choose an appropriate screening test. However, the studies included in this systematic review uniformly had poor adherence to recommended postpartum screening, with high losses to follow-up, consistent with other studies.14,15 As noted by the ADA and the National Institute for Health and Clinical Excellence in the United Kingdom,7,10 the FBG may improve both provider and patient adherence with postpartum screening such that more high-risk women would receive screening. Longitudinal studies are needed to determine the optimal timing of testing, short-term and long-term outcomes of testing, and the frequency of surveillance.
The FBG was not consistently reported as a sensitive screening test for type 2 diabetes in women with a history of gestational diabetes. Although diagnostic accuracy is an important criterion, the FBG may be a more acceptable test for patients and providers. Longitudinal studies are needed to address the natural history of glucose metabolism in women with gestational diabetes, the optimal approach to diagnostic testing for type 2 diabetes in this population, and the short-term and long-term outcomes of testing.
This article is based on research conducted by the Johns Hopkins University Evidence-Based Practice Center under contract to the Agency for Healthcare Research and Quality (Contract No. 290020018), Rockville, MD. The authors of this article are responsible for its contents, including any clinical or treatment recommendations. No statement in this article should be construed as an official position of the Agency for Healthcare Research and Quality or of the U.S. Department of Health and Human Services. We thank Dr. Shilpa Amin for her support as the Task Order Officer for this project.
The authors have no conflicts of interest to report.