Although oral contraception (OC) misuse is presumed to play an important role in unwanted pregnancy, research findings have often been equivocal, perhaps reflecting unaddressed inconsistencies in methodological approaches.
Using established databases, we performed a systematic review of measurement methods for OC use using primary research reports published from January 1965 to December 2009.
Terminology used to describe OC use, which included “continuation,” “compliance,” and “adherence,” differed across studies and was rarely defined. The majority of studies (n=27 of 38, 71%) relied solely on self-report measures of OC use. Only two reports described survey or interview questions, and reliability and validity data were seldom described. More rigorous measurement methods, such as pill counts (electronic or manual), serum and urinary biomarkers, and pharmacy records, were infrequently employed. Nineteen studies simultaneously used more than one method, but only three studies compared direct and indirect methods.
The lack of a consistent, well-defined measurement of OC use limits our understanding of contraceptive misuse and related negative outcomes. Future research should clarify terminology, develop standardized measures, incorporate multimethod approaches with innovative methods, and publish details of measurement methods.
The oral contraceptive pill (OC) is the most popular form of hormonal contraception in the United States1–3 and is highly effective when used perfectly.4,5 Perfect use, however, is seldom achieved.6 As many as 30% of women report missing one or more pills per month,7 and approximately half of new OC users will discontinue use within the first year.8 Recent data from the National Survey of Family Growth indicate that typical users report a 9% failure rate,5 which may be due in part to pill-taking mistakes. Women who misuse or discontinue the pill are three times as likely to have an unintended pregnancy as those who continue the method.7 Over 1 million unintended pregnancies in the United States are believed to result from OC method failure, misuse, or discontinuation, with more than half due to discontinuation alone.7
Although a considerable body of work on OC use patterns exists, researchers rarely use consistent terminology. Terms such as “compliance,” “adherence,” and “continuation” have been used interchangeably,9 and terms such as “misuse,” “nonuse,” and “correct use” are not well defined. Variable terminology contributes to imprecise measurement of OC use and impedes the ability to interpret and apply existing research findings to clinical practice. Additionally, there are no methods for measuring OC use that are accepted as standard.9–17 The majority of OC investigations have relied on interviews and questionnaires, although self-report is the least rigorous way to assess contraceptive behavior, given the strong potential for reporting bias.18,19
We conducted a review to summarize and evaluate the literature on measurement of OC use with the following questions in mind: What terminology has been used to describe OC use? What methods have been used to measure OC use? To what extent have the reliability and validity of these approaches been tested? Finally, is there evidence to support a particular method for evaluating OC use?
We performed a computerized search for articles published from January 1966 through December 2009 using databases of MEDLINE, Cumulative Index of Nursing and Allied Health Literature (CINAHL), PubMed, Google Scholar, and PsychInfo. The following key words were searched: oral contraception or contraceptives, combined contraceptive pills, birth control pills, use, misuse, compliance, noncompliance, adherence, nonadherence, continuation, discontinuation, behavior. Reference lists of key articles obtained from the database search were also examined for relevant citations; unpublished articles, dissertation reports, or conference abstracts were not searched.
We considered articles for inclusion if they were written in English, published in peer-reviewed journals, and focused on human females. Article titles and then abstracts were screened to identify primary research reports in which OC use was characterized in some identifiable way. Editorials and studies that did not specifically address combined OC or measurement were excluded.
Information on the following study characteristics was collected: study purpose, design, level of evidence,20 population, sample size, intervention, primary and secondary outcomes, and outcome variable terminology. We examined each report for descriptions of the measurement method(s) used to assess OC use, including features of reliability and validity. Methods were classified according to the categories described by Osterberg and Blaschke11 for assessing patient adherence to oral medication. This classification system includes both direct methods, such as serum or urinary measures of medication or hormone metabolite levels, and indirect methods, such as electronic monitoring devices. Descriptions of each category, along with their advantages and disadvantages, are given in Table 1.
Thirty-eight articles14,15,21–56 met inclusion criteria. Selected study characteristics from the articles reviewed are presented in Table 2. Six studies were level I randomized controlled trials with OC use as a primary outcome,29,36,38,43,44,49 and the remaining 32 studies met criteria for level II quality of evidence. Few of these studies tested interventions involving OC use measures themselves.
Purposes for measuring OC use varied among studies reviewed (Table 3). Seventeen reports described OC use in defined populations,15,21,25–27,29,31,33,35,37,39,42,47,53–56 15 reports identified predictors of OC use,14,23,24,30,32,40,41,45,46,48,50,51,54–56 7 were randomized trials that evaluated some aspect of OC use,29,34,36,38,43,44,49 and 2 were validation studies that evaluated approaches to measuring OC use.22,52
All studies except 2 were conducted in clinic-based settings; the remaining 2 used population-based samples.48,56 The most common setting was reproductive health practices.21–24,26–29,49–52 All but 5 studies were carried out in the United States.25,34,39,44,14 The reported race/ethnicity of samples was mixed14,32,33,36,37,39–42,43,44,53–56 or Caucasian.21–24,26,35,48 Sample sizes ranged from 11 subjects52 to 237,242 subjects.54
Terminology used to describe OC use differed among studies. Continuation/discontinuation14,24,29–32,36,37,39,33,40,45,54–56 and compliance/noncompliance14,33–38,43,44,50–53 were the most commonly used terms, followed by use/misuse,14,15,23,25–28,41,46–48 behavior,15,21,26,46,47 and adherence/nonadherence.27,42 No study clearly defined the outcome term used.
The most common measurement method for OC use was self-report. Interviewer-administered questionnaires were used by 26 research teams.14,15,26,27,29–31,34–37,39–41,43–52,55,56 Self-administered surveys were used by 11 research groups25,26,28,35–39,42,44,45 and daily diaries by 6 teams.21–24,38,42 Nine studies used a combination of two or more self-report methods;26,35–38,42–45 only 6 studies combined a self-report method with a direct method or more rigorous indirect measure.21–23,49,50,52
Few researchers provided information on reliability and validity of measures from studies using self-report methods. Only 2 reports described survey or interview questions or diary formats.55,56 Moreau et al.56 used data from the 2002 cycle of the National Survey of Family Growth to investigate discontinuation due to method dissatisfaction and described the standardized procedures used for the interview-collected data in this population-based survey. Oakley et al.26 used computer-assisted interviews, provided training for interviewers, and performed repeat interviews 1 month after initial assessments in order to improve reliability of the self-report by family planning patients. They reported a Cohen's kappa of 0.8 for most items as well as a coefficient of 0.6 or higher for test-retest reliability. Other studies used pilot-tested or previously used questionnaires27,35–37,15,47,55 and training for interviewers26,27,29,40,55 to improve reliability and validity.
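As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two sets of categorical answers, such as an initial interview and a repeat interview. The function name and the example answers are hypothetical and are not drawn from any of the reviewed studies.

```python
from collections import Counter

def cohens_kappa(first, second):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    # Proportion of items on which the two ratings agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each rating's marginal frequencies.
    first_counts = Counter(first)
    second_counts = Counter(second)
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both ratings used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical answers to "missed a pill this cycle?" from ten women,
# at the initial interview and at a repeat interview.
initial = ["yes", "no", "no", "yes", "no", "no", "yes", "no", "no", "no"]
repeat = ["yes", "no", "no", "no", "no", "no", "yes", "no", "no", "no"]
kappa = cohens_kappa(initial, repeat)
print(round(kappa, 2))
```

In this invented example the two interviews agree on 9 of 10 answers, but kappa is lower than 0.9 because some of that agreement would be expected by chance alone.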
Two reports examined patient charts but did not describe the specific chart information extracted or how outcomes, such as pregnancy, were determined. Blumenthal et al.39 used chart reviews and the physiological outcome of pregnancy to inform a clinical judgment of OC use. Lara Torre and Schoeder33 compared chart records of women who initiated OC by a Quick Start intervention vs. traditional Sunday start to determine compliance over time. Westhoff et al.29 collected chart information as a secondary measure of misuse and resulting pregnancy only if they were unable to contact women for telephone follow-up interviews.
Zink et al.53 used Medicaid-paid claims to determine the number of pill packs claimed during the 12-month study period as a measure of months of contraceptive coverage. Foster et al.32 used pharmacy-paid claims, examining the number of packs dispensed, timing of dispensation, gaps in dispensation, and pack dispensation or method change before finishing previous packs to identify pill cost as a correlate of OC use. Similarly, Murphy and Brixner54 used a large claims database to examine number of filled contraceptive prescriptions over 3 months. None of these studies were able to determine if patients actually ingested prescribed pills or obtained pill packs from other sources.
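The claims-based approach described above can be sketched as follows. This is an illustrative calculation only, not the procedure used in any of the cited studies: it assumes each dispensed pack covers 28 days, that all fills fall within the observation window, and that supply from an early refill is carried forward. The function name and the fill dates are hypothetical.

```python
from datetime import date, timedelta

PACK_DAYS = 28  # assumption: one pack covers one 28-day cycle

def coverage_and_gaps(fills, start, end):
    """Return (covered_days, gaps) for the window [start, end).

    fills: list of (dispense_date, number_of_packs) tuples.
    gaps: list of (gap_start, gap_end) date pairs with no supply on hand.
    """
    covered_days = 0
    gaps = []
    cursor = start  # first day not yet accounted for
    for dispensed, packs in sorted(fills):
        if cursor >= end:
            break
        if dispensed > cursor:
            # No supply between the end of the last pack and this fill.
            gap_end = min(dispensed, end)
            gaps.append((cursor, gap_end))
            cursor = gap_end
            if cursor >= end:
                break
        # Early refills are stacked onto the supply already on hand.
        supply_end = min(cursor + timedelta(days=PACK_DAYS * packs), end)
        covered_days += (supply_end - cursor).days
        cursor = supply_end
    if cursor < end:
        gaps.append((cursor, end))
    return covered_days, gaps

# Hypothetical fills over a 90-day window.
fills = [(date(2009, 1, 1), 1), (date(2009, 2, 5), 1), (date(2009, 3, 1), 1)]
covered, gaps = coverage_and_gaps(fills, date(2009, 1, 1), date(2009, 4, 1))
print(covered, gaps)
```

As the paragraph above notes, a calculation of this kind describes supply on hand, not ingestion, and it cannot account for packs obtained from other sources.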
Gilliam et al.43 used manual pill counts and pregnancy outcome after written surveys and interviews. To increase reliability, researchers trained pill counters and concealed treatment assignments. The method of pregnancy confirmation was not described, and data were not compared by measurement method. Oakley et al.21,23 tested a daily diary card method against a pill-counting electronic monitoring device in which a microchip recorded the time and date when a pill was pushed out of the pill pack. Reliability and validity were not addressed. The authors reported a decline in missed pill self-reporting over time and poor record taking during the third month, resulting in an overestimation of appropriate pill use with diaries compared with the electronic monitors.
In a separate report, Potter et al.22 compared the validity of the same electronic device with that of a self-report diary. The investigators found diaries significantly overestimated daily pill use rates when compared with rates obtained from the device. By the third month of monitoring, measurement agreement between methods was 38%, and the rate of women missing pills as determined by the device was triple the rate reported by diaries. The authors concluded that the electronic device was more accurate in measuring OC pill use, although reliability and validity features were not provided.
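A comparison of this kind can be sketched for a single pill pack as below. The day-by-day definition of agreement is an assumption made for illustration; the source does not state how agreement between the diary and the device was calculated, and the records shown are invented.

```python
def compare_methods(diary, device):
    """Compare two daily pill records (True = pill recorded as taken).

    Returns (agreement, diary_missed_rate, device_missed_rate).
    """
    if len(diary) != len(device) or not diary:
        raise ValueError("need two non-empty records of equal length")
    n = len(diary)
    # Share of days on which both methods give the same answer.
    agreement = sum(d == e for d, e in zip(diary, device)) / n
    diary_missed = diary.count(False) / n
    device_missed = device.count(False) / n
    return agreement, diary_missed, device_missed

# Hypothetical 28-day cycle: the diary records one missed pill,
# the electronic device records three.
diary = [True] * 28
diary[9] = False
device = [True] * 28
for day in (9, 16, 23):
    device[day] = False

agreement, diary_missed, device_missed = compare_methods(diary, device)
print(agreement, diary_missed, device_missed)
```

Note that overall day-level agreement can stay high even when one method records several times as many missed pills as the other, which is one reason to report missed-pill rates alongside agreement.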
DuRant et al.50 used the four-factor Guttman scale to identify factors related to OC compliance in participants in a randomized clinical trial (RCT). Participants received a low-dose combined OC, with 28mg riboflavin added as a urinary metabolite marker for pill ingestion. The Guttman scale, which was evaluated in pilot work, assessed avoidance of pregnancy (not described), appointment adherence (three visits), interview-assessed self-reported missed OCs (three or more during a month), and fluorescence intensity of urinary concentrations of riboflavin.50,51 At follow-up, the presence of urinary riboflavin was assessed by ultraviolet light and was determined in a double-blind fashion by three independent observers. The Guttman scale yielded strong coefficients of reproducibility (0.96) and scalability (0.84). Additionally, self-report had good agreement with the urinary metabolite assessment and other indirect measures, although tests for statistical agreement were not given.
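For readers unfamiliar with Guttman scaling, the coefficient of reproducibility can be sketched as follows. This sketch assumes binary item scores and counts errors as deviations from the ideal cumulative pattern for each respondent's total score (the Goodenough-Edwards convention); the source does not say which error-counting convention was used, and the data below are invented. The coefficient of scalability is not computed here.

```python
def coefficient_of_reproducibility(responses):
    """Guttman coefficient of reproducibility for binary item scores.

    responses: list of equal-length lists of 0/1 scores, one per respondent.
    """
    n_items = len(responses[0])
    # Order items from most to least frequently endorsed ("easiest" first).
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])
    errors = 0
    for row in responses:
        score = sum(row)
        for rank, j in enumerate(order):
            # In a perfect scale, a total score of s means the s easiest
            # items are endorsed and no others.
            ideal = 1 if rank < score else 0
            errors += row[j] != ideal
    return 1 - errors / (len(responses) * n_items)

# Hypothetical binary codings of four compliance indicators per participant:
# [pregnancy avoided, appointments kept, fewer than 3 missed pills reported,
#  urinary riboflavin detected]
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],  # deviates from the cumulative pattern
]
cr = coefficient_of_reproducibility(responses)
print(round(cr, 2))
```

Values close to 1 indicate that total scores reproduce the individual item responses almost perfectly, which is the sense in which a coefficient such as 0.96 is described as strong.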
In another report from this study, Jay et al.52 compared results from their clinical trial with findings from a pilot study of urinary fluorescence of 31 urine samples from 11 subjects. Urinary fluorescence determinations and self-reports from both study samples were significantly associated, and when agreement was achieved between measures, compliance was confirmed by evaluation of serum norethindrone (a synthetic progestin) concentrations in 90% of the cases. Jay et al.49 also conducted an analysis on 26 randomly selected participants in a clinical trial who underwent random serum testing for hormone metabolites as a confirmatory measure of the Guttman scale. Serum norethindrone samples were measured using a radioimmunoassay method. A high degree of association between the serum and urinary tests was found (p<0.02).52
This literature review highlights several important weaknesses in measurement approaches for OC use. Language to describe OC use varies and, moreover, is not always defined. As the outcomes of OC use studies are dependent on the definition of the main outcome, it is difficult to compare results across studies or to use results to create interventions to improve OC use. Of the 38 reports reviewed, >70% relied exclusively on self-reports from written survey, interview, or diary. Most studies provided no information on specific survey or interview questions or diary formats, making it difficult to assess how outcomes might differ depending on type and number of questions asked. Self-report remains the most common OC use measure, likely for its convenience, ease of administration, and noninvasiveness.11 The likelihood of social desirability bias in OC-using populations,18 however, diminishes the accuracy of such studies.
Based on 3 studies by the same authors that used designs that have not been recently replicated, serum and urinary biomarkers appear to be a reliable measure for OC use.49,50,52 However, these investigators did not address the potential errors in direct methods resulting from white-coat adherence, that is, improved compliance immediately preceding clinic visits.57,58 Scheduled appointments remind patients to take their medications in the days just before the visit, resulting in compliance overestimates from temporarily elevated serum concentrations of the medication that are not reflective of actual drug use during the time between appointments.57,58
Studies of other types of pill use have shown electronic monitoring devices to be superior to self-report for their ability to characterize pill-taking patterns between appointments, particularly the timing between successive doses.58–60 Timing of OC doses is critical, given the relatively short half-life of low-dose combined estrogen-progestin steroids in blocking ovulation.60–65 Missed pills with “typical use” can result in a 50-fold lower effectiveness rate than with “perfect use.”66 Although electronic monitoring has been evaluated as a successful approach for detecting patterns of medication misuse,58–60 only 3 studies by the same authors have evaluated a single electronic monitoring commercial product for OC use, which is no longer available.21–23
Based on a paucity of rigorous contraceptive studies, we were unable to identify a single superior measure of OC use. The World Health Organization has recommended standardizing terminology and the measurement approach used for describing and evaluating menstrual bleeding patterns67; OC use measurement studies could benefit from such standardization. Inconsistent use of the terms compliance, adherence, continuation, and pill-taking behavior has contributed to conflicting and equivocal findings in contraceptive research. Clarification of terminology used to describe optimal and suboptimal pill use will improve measurement quality. Consensus on a single term and standardized applications of it will increase reliability and validity across studies and improve our ability to synthesize findings and evaluate approaches for measurement.12,13 Such terms as misuse and noncompliance place blame on the research participant and fail to account for those who may have every intention of complying but may not understand how to use the pill correctly or may not have continuous access.
Given these shortcomings in research methodologies, we recommend the following terms, which may attribute less intent to the pill user and more objectively describe OC-taking behavior:
Self-report measures will likely remain the mainstay of contraceptive research. We recommend the development of standardized measures for OC use, as have been developed for general medication use68–70 and sexual risk behavior, particularly with sexually transmitted infection acquisition.18 To strengthen self-reports, techniques that have been used in other health behavior research fields, such as anonymity in self-administration of questionnaires, audio computer-assisted self-interviews, and telephone-administered methods, can reduce threats of socially desirable answers.18
Multidimensional measurement permits reliability and validity checks to verify findings and provides more robust assessment.11,49–52 The value of self-reports is enhanced when complemented with more objective methods.18 Direct methods provide quantitative data but are most reliable when used in conjunction with electronic monitoring devices.58,59 We recommend reevaluation of the accuracy of electronic monitoring devices and serum/urine tests in current populations and settings, specifically to confirm time intervals between successive pill doses. Electronic monitoring devices are especially useful in contraceptive intervention studies because they provide information that can be used for both analysis of pill use and creation of strategies for cognitive behavioral modification.6
We also recommend the evaluation of alternative approaches, such as visual analog scales (VAS),71 and pharmacoeconomic estimates of pill use patterns, such as medication adherence rate (MRA),72,73 that have been tested in studies of other health-related conditions and offer additional quality assessments for measuring medication use.
In order to gain a more accurate assessment of pill misuse and its negative sequelae, researchers need to speak the same language. Using consistent terminology across studies will allow for better comparison of results. Additional research using standardized psychometric evaluations of indirect and direct methods and improved measurement reporting are also needed to provide more reliable findings and facilitate improved understanding of OC use patterns. With a more accurate and comprehensive assessment, researchers can better develop and evaluate strategies to promote successful contraception and improved family planning outcomes.
This work was supported in part by NRSA individual training grant 1F31NR011119-01A1 (K.S.H.), HRSA grant D09HP14667 (N.E.R.), and the NIH Center for Evidence-Based Practice P30NR010677 (N.E.R., co-investigator; S. Bakken, PI).
The authors have no conflicts of interest to report.