Planning study visits during specific menstrual cycle phases is important if the exposure or outcome is influenced by hormonal variation. However, hormone profiles differ across cycles and across women. The value of using fertility monitors to time clinic visits was evaluated in the BioCycle Study (2005–2007). Women aged 18–44 years (mean, 27.4) with self-reported menstrual cycle lengths of 21–35 days were recruited in Buffalo, New York, for 2 cycles (n = 250). Participants were provided with home fertility monitors that measured urinary estrone-3-glucuronide and luteinizing hormone (LH). The women were instructed to visit the clinic for a blood draw when the monitor indicated an LH surge. The monitor recorded a surge during 76% of the first cycles and 78% of the second cycles. Scheduling visits by using set cycle days or algorithms based on cycle length, such as a midcycle window or a window determined by assuming a fixed luteal phase length, would be simpler. However, even with perfect attendance in a 3-day window, these methods would have performed poorly, capturing the monitor-detected LH surge only 37%–57% of the time. Fertility monitors appear to be useful in timing clinic visits in a compliant population with flexible schedules.
Hormone levels in premenopausal women may be of interest as exposures, outcomes, or confounders in many areas of research related to women's health (e.g., fertility, cancer, cardiovascular disease). However, measuring hormone levels is complicated by the fact that they change across the menstrual cycle (Figure 1). In addition, hormone levels and the timing of key phases of the menstrual cycle vary both across women and across a given woman's cycles (1). Therefore, it is difficult to anticipate when to measure hormone levels in order to capture key events such as the estrogen rise during the follicular phase, the luteinizing hormone (LH) surge, or the increase in progesterone during the luteal phase.
Researchers have used a variety of approaches in an attempt to measure hormone levels in a meaningful way given the inherent variability of these markers. Nevertheless, many studies rely on a single serum sample drawn on an arbitrary day of each woman's menstrual cycle, which seems inadequate, even with adjustment for cycle day or for cycle phase status (determined by the sample's progesterone level) (2–4). Other studies have attempted to time specimen collection to key hormonal events based on self-reported cycle length or cycle day or the assumption that the luteal phase is 14 days (5–7), but these methods assume that women can reliably report their cycle length or that the timing of hormonal fluctuations is the same across women, both of which are unlikely to be true (1, 8–11). The “gold standard” is daily collection of first-morning urine specimens, which ensures that critical hormone windows are captured if compliance is adequate, but the protocol is burdensome, which affects recruitment and compliance. Ideally, it would be possible to characterize hormone levels across the menstrual cycle by collecting fewer, well-timed measurements, but it is unclear what algorithm should be applied to prospectively assess when the samples should be collected.
In this paper, we examine the utility of providing study participants with a home fertility monitor that can detect an LH surge and compare the timing of the LH surge detected by the monitor with the timing assumed by algorithms based on cycle length.
The BioCycle Study was carried out at the University at Buffalo under an Intramural Research Program contract from the Eunice Kennedy Shriver National Institute of Child Health and Human Development. The study is described in greater detail elsewhere (12). In brief, 259 healthy women were recruited from the community and enrolled for 2 menstrual cycles. Nine women contributed only 1 cycle. Women had to be aged 18–44 years, to have a self-reported cycle length from 21 to 35 days, and to have no known conditions that might affect their menstrual cycle function (e.g., being underweight, current use of hormonal contraception, current breastfeeding). Table 1 contains the complete list of exclusion criteria (12).
Eligible women who consented to participate in the study were asked to come to the clinic 8 times per cycle with 3 visits planned around the time of the expected LH surge. At each cycle visit, the women provided fasting blood and spot urine specimens. Estradiol, follicle-stimulating hormone, LH, and progesterone were measured in all available serum samples by using IMMULITE 2000 chemiluminescent enzymatic immunoassays (Kaleida Health, Buffalo, New York). Participants were also provided with and trained to use the Clearblue Easy Fertility Monitor (Inverness Medical Innovations, Inc., Waltham, Massachusetts) (www.clearblueeasy.com). In addition, they were asked to complete a series of questionnaires and brief daily diaries at home.
The fertility monitor was originally developed to assist women in becoming pregnant by helping them to identify their fertile window through measurement of both estrone-3-glucuronide (E3G) and LH in urine as described in greater detail elsewhere (13). Briefly, the monitor is synchronized to a woman's cycle and then is checked daily to see whether a test is requested. Between the sixth and the ninth days of a woman's cycle (depending on her cycle length history), the monitor begins to request daily tests for 10 days. On test days, the woman briefly submerges a test stick in her first-morning urine and then inserts the test stick into the monitor. The test stick has a nitrocellulose strip with an anti-LH antibody zone and an E3G conjugate zone. The monitor optically reads the level of E3G and LH in the urine by the intensity of the lines in the corresponding zones (13).
Each day, the monitor assigns the woman to low, high, or peak fertility on the basis of her E3G and LH levels. Thus, unlike home LH test sticks, the monitor provides information to help anticipate the LH surge. High fertility is determined by the level of E3G and correlates with the rise in estrogen during the follicular phase (Figure 1). Peak fertility is assessed by the level of LH and correlates with the LH surge prior to ovulation. The monitor first requests 10 consecutive days of testing for all women. If the woman does not reach peak in those first 10 days (whether from poor compliance with testing or low levels of LH), the monitor requests an additional 10 tests for a total of 20 days. The monitor initially determines high and peak fertility levels on the basis of predetermined cutpoints (peak corresponds with approximately 30 IU/L for LH), but if the woman does not reach those cutpoints, the monitor adjusts the cutpoint criteria according to the woman's specific hormone levels.
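The testing schedule described above can be sketched in a few lines of Python. This is only an illustration of the description in the text, not the monitor's actual firmware logic: the function names are hypothetical, and the sketch assumes that requested tests fall on consecutive days and that the second block of 10 tests follows the first without a gap.

```python
from typing import Tuple


def requested_test_window(first_test_day: int, extended: bool = False) -> Tuple[int, int]:
    """Return the (first, last) cycle day on which a test would be requested.

    Assumption: requests fall on consecutive cycle days, and the optional
    second block of 10 tests follows the first block without a gap.
    """
    if not 6 <= first_test_day <= 9:
        raise ValueError("the text places the first requested test on cycle days 6-9")
    n_tests = 20 if extended else 10
    return first_test_day, first_test_day + n_tests - 1


def surge_in_window(surge_day: int, first_test_day: int, extended: bool = False) -> bool:
    """True if a surge on `surge_day` falls on a day when a test is requested."""
    first, last = requested_test_window(first_test_day, extended)
    return first <= surge_day <= last
```

Under these assumptions, a monitor that starts requesting tests on cycle day 9 and extends to 20 tests would cover cycle days 9 through 28, so a surge on day 30 of a long cycle would fall outside the requested tests.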
Although monitor users see only their fertility status and the day of their cycle in the display screen, more detailed data are stored by the monitor and can be downloaded, which is another advantage over home LH test sticks. The data include the following: the date and time the monitor was turned on, the day of the cycle, and whether a test was performed. For days when a test was completed, the fertility level (low, high, peak) and the actual levels of E3G and LH (in percentage transmission units) are also available.
In this paper, we examine summary data from the monitor to see whether the monitor was useful in identifying the LH surge and in helping to time the clinic visits. In addition, we compare 3 alternative methods of prospectively anticipating the timing of the LH surge to see whether they actually capture the surge according to the monitor. The first alternative (fixed cycle days method) schedules all women for a blood draw on cycle days 13–15, which should capture the LH surge in an idealized 28-day cycle (Figure 1). The second method (the luteal-phase method) is based on the assumption that variability in the cycle length is predominately due to variability in the follicular phase and that the luteal phase is more stable across women, averaging approximately 14 days. For this method, a 3-day window around the estimated LH surge is created by subtracting 15 days and 13 days from the woman's self-reported usual cycle length. For example, for a 30-day cycle, this would correspond to cycle days 15–17. The final method (the midpoint method) assumes that the LH surge occurs around the midpoint of the cycle, and therefore, the 3-day window is created around the midpoint of the self-reported cycle length. For a 30-day cycle, this would correspond to days 14–16. We also examined the latter 2 methods using the women's actual cycle lengths even though they would not be known prospectively and therefore could not be used to schedule visits in an actual study.
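The three scheduling windows can be written out explicitly. The following Python sketch reproduces the worked examples in the text (a 30-day cycle gives days 15–17 for the luteal-phase method and days 14–16 for the midpoint method). The function names are hypothetical, and the handling of odd cycle lengths in the midpoint method (rounding the midpoint up) is an assumption, because the text gives only an even-length example.

```python
from typing import Tuple

Window = Tuple[int, int]  # (first cycle day, last cycle day), inclusive


def fixed_days_window() -> Window:
    """Fixed cycle days method: cycle days 13-15 for every woman."""
    return (13, 15)


def luteal_phase_window(cycle_length: int) -> Window:
    """Luteal-phase method: subtract 15 and 13 days from the cycle length."""
    return (cycle_length - 15, cycle_length - 13)


def midpoint_window(cycle_length: int) -> Window:
    """Midpoint method: a 3-day window centered on the cycle midpoint.

    Assumption: for odd cycle lengths the midpoint is rounded up.
    """
    midpoint = (cycle_length + 1) // 2
    return (midpoint - 1, midpoint + 1)
```

For an idealized 28-day cycle, all three functions return the same window, cycle days 13–15.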
Participants in the BioCycle Study tended to be young, non-Hispanic white, highly educated, not living as married, and to never have been pregnant (Table 2). A range of income levels was represented in this study. Of the 259 participants who contributed at least 1 study cycle, 249 (96%) brought their monitors in to have the data downloaded at least once during the first cycle, and 232 (93%) of the 250 women who participated for 2 cycles had data downloaded at least once for the second cycle. We considered adherence to testing for the first 10 tests that are requested of all women (Table 3). Of the women who brought in their monitors at least once in a cycle, 82% missed 2 tests or fewer in cycle 1, and 75% missed 2 tests or fewer in cycle 2. Approximately 85% of the women reached high fertility, and over three-quarters reached peak fertility during each cycle. At least three-quarters of the women who reached peak came into the clinic for a blood test on the day the monitor read peak.
Among the women who did not reach peak fertility, 72% in cycle 1 and 64% in cycle 2 missed 2 tests or fewer out of the first 10 requested tests. An increase in progesterone during the luteal phase can be considered a marker for ovulation (Figure 1), so we examined the maximum detected level of serum progesterone. Among the women who did not reach peak on the monitor, 65% in cycle 1 and 85% in cycle 2 had at least 1 serum progesterone measurement that was greater than or equal to 5 ng/mL, which suggests that they did ovulate (Table 3).
In order to further evaluate why women did not reach peak on the monitor, we examined the serum hormone data in conjunction with the detailed monitor hormone data for each woman who did not reach peak. Classification of the reasons for not reaching peak was subjective but is intended to help characterize the most likely challenges to using the monitor in a study. The most common reasons for not reaching peak varied by cycle (Table 3). Failure to comply with the monitor protocol, whether due to missed tests or failing to bring the monitor in to have the data downloaded, was a primary reason for not detecting peak in both cycles. Peak was also not observed when there was a monitor error in reading the test stick, which can occur if the stick is too wet or too dry. In addition, the monitor does not recognize LH surges prior to day 9 as peak and requests no more than 20 tests regardless of cycle length. Some women did not reach peak because they had an early or late surge in LH (not necessarily followed by an increase in progesterone) or a long cycle. Approximately one-fifth of the women who did not reach peak had evidence of a small LH surge followed by elevated progesterone, but many of their cycles looked atypical (e.g., estrogen remained flat throughout the cycle). Finally, some women truly appeared not to have evidence of an LH surge.
For both cycles, the mean day of peak was cycle day 15 (standard deviation (SD), 3.3) among women whose monitor reached peak (Table 4). As expected, the mean day of peak was earlier for women with shorter cycles and later for women with longer cycles. In addition, the mean day of peak decreased with increasing age among women who reached peak. The mean day of peak occurred 13 days (SD, 2.9) before the end of the cycle on average for both cycles. Peak was closer to the end of the cycle for shorter cycles but did not change across age groups. The mean number of days between reaching high fertility (E3G surge) and peak fertility was 5 days (SD, 3.2 for cycle 1 and 2.9 for cycle 2) among women whose monitor reading captured both. Of the women who reached peak, 27 (14%) in cycle 1 and 31 (17%) in cycle 2 went to peak without reaching high first.
Among women who had a monitor peak, the peak day fell between days 13 and 15 (fixed-cycle-day method) 41% of the time in cycle 1 and 37% of the time in cycle 2 (Table 5). The luteal-phase and midpoint methods performed similarly when self-reported cycle length was used to determine the windows, but they improved slightly when the actual cycle length was used, with the luteal-phase method performing the best.
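The comparison reported here amounts to counting how often the monitor peak day falls inside each woman's scheduled 3-day window. A minimal Python sketch of that calculation follows. The function name and the example values are hypothetical and are not study data.

```python
from typing import Sequence, Tuple


def capture_rate(peak_days: Sequence[int], windows: Sequence[Tuple[int, int]]) -> float:
    """Fraction of cycles whose monitor peak day lies inside the scheduled window.

    `peak_days[i]` is the cycle day of the monitor peak for cycle i, and
    `windows[i]` is the inclusive (first, last) window scheduled for that cycle.
    """
    if len(peak_days) != len(windows):
        raise ValueError("need one window per peak day")
    if not peak_days:
        raise ValueError("no cycles with a monitor peak")
    hits = sum(first <= day <= last for day, (first, last) in zip(peak_days, windows))
    return hits / len(peak_days)


# Hypothetical peak days for five cycles, all scheduled by the fixed cycle days method.
example_peaks = [12, 14, 15, 17, 19]
example_windows = [(13, 15)] * len(example_peaks)
example_rate = capture_rate(example_peaks, example_windows)  # 2 of 5 cycles captured
```

Only cycles with an observed monitor peak enter the denominator, which matches the restriction to "women who had a monitor peak" in the text.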
Cycle length self-reported at baseline differed by 3 or more days compared with observed study cycle length over 40% of the time (Table 6). In addition, the actual study cycle length for the first cycle differed from that of the second cycle by 3 days or more for almost half the participants who participated for 2 cycles. When peak fertility was observed in both cycles, the cycle day of occurrence differed from one cycle to the next by 3 or more days 37% of the time. However, only one-quarter of the women had the day of peak fertility differ by 3 or more days when the timing of the peak day was assessed from the actual end of each cycle.
For many studies, the timing of biospecimen collection within a woman's menstrual cycle is important because the exposure or outcome is affected by hormone levels that change across the menstrual cycle. However, it is difficult to prospectively anticipate the timing of critical phases of the cycle without daily hormonal measurements. In the BioCycle Study, the fertility monitor was useful in scheduling clinic visits around the time of the LH surge.
Participants in the BioCycle Study did not provide daily serum or urine specimens at the clinic, so it is not possible to validate the LH peak as detected by the monitor in this study. Nevertheless, we felt confident that, with proper usage, the fertility monitor would be able to detect the LH surge on the basis of the findings of Behre et al. (13). They compared fertility monitor data with serum hormone levels and transvaginal ultrasound scans in a study of 53 women. In that study, the monitor recorded peak fertility in 135 of 149 ovulatory cycles as determined by ultrasound (13). Of the 149 ovulatory cycles, a serum LH surge was detected in only 139 cycles, which means that the monitor detected an LH surge when there was a detectable surge 97% of the time (13). In addition, the monitor did not reach peak in the 1 anovulatory cycle (13).
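The 97% figure follows from changing the denominator from all ovulatory cycles to cycles with a detectable serum LH surge. The short Python calculation below makes the arithmetic explicit, using the counts quoted from Behre et al. (13). The variable names are illustrative, and the calculation assumes, as the text implies, that all 135 monitor peaks occurred in cycles with a detectable serum surge.

```python
# Counts quoted in the text from Behre et al. (13).
ovulatory_cycles = 149        # ovulatory cycles as determined by ultrasound
serum_surge_cycles = 139      # ovulatory cycles with a detectable serum LH surge
monitor_peak_cycles = 135     # ovulatory cycles in which the monitor recorded peak

# Peak recorded as a share of all ovulatory cycles.
rate_all_ovulatory = monitor_peak_cycles / ovulatory_cycles

# Peak recorded as a share of cycles with a detectable serum surge
# (assumes every monitor peak occurred in one of those cycles).
rate_detectable_surge = monitor_peak_cycles / serum_surge_cycles

print(f"{rate_all_ovulatory:.1%} of ovulatory cycles")
print(f"{rate_detectable_surge:.1%} of cycles with a detectable serum surge")
```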
Although the work of Behre et al. (13) suggests that the monitor can reliably detect the LH surge, the monitor is helpful as a research tool only if the women use it properly. In the BioCycle Study, the participants had high but not perfect adherence to the monitor-testing protocol. Because the LH surge occurs during a very small window, even missing a few tests could be problematic, depending on when in the cycle the tests were missed. Nevertheless, peak was observed for over 75% of the cycles, and high fertility, which is not confined to a single day and therefore is less dependent on consistent testing, was observed in approximately 85% of the cycles.
There are several factors that contribute to not observing peak fertility. The logistic factors include incomplete data because the woman did not bring her monitor in to have the data downloaded, poor compliance with testing, good compliance with testing combined with an unfortunately timed missed test, and test-stick reading errors. In some cases though, the lack of observed peak fertility may be due to anovulatory cycles or atypical hormone profiles that may or may not have resulted in ovulation. Our results are similar to those of Robinson et al. (14), who observed peak in approximately 80% of their study cycles, which also reflects a combination of underlying cycle characteristics and compliance.
Timing blood draws at the clinic around the LH surge depended on both the monitor results and the ability of the participants to come to the clinic at the appropriate time. The study protocol instructed participants whose monitors read peak to go to the clinic that morning or to call to schedule a visit as soon as possible. Over three-quarters of the women who reached peak fertility on the monitor were able to attend the clinic for a blood draw on that day. In general, the BioCycle Study participants were an adherent group with schedules that allowed them to come to the clinic on short notice. The need for flexibility in attending the clinic was emphasized at enrollment to increase the likelihood of compliant participants. Once enrolled, participants were given individualized calendars with a projected clinic visit schedule that was revised as needed. These calendars were designed to help the women to anticipate their visit schedule as much as possible. To further encourage adherence to the visit schedule, study staff were sensitive to the time commitment of the participants and followed procedures to minimize the duration of each visit. This flexibility might not be possible in all study settings.
The standardized approaches to scheduling clinic visits (fixed-cycle days, luteal-phase method, and midpoint method) would be easier to implement logistically but would also be less likely to capture the day of peak fertility. The fact that women with longer cycles reached peak fertility later than did women with shorter cycles on average supports the concept behind the standardized methods that cycle length is correlated with the timing of ovulation. In addition, the mean cycle day of peak fertility changed across age categories, but the mean day from the end of the cycle of peak fertility did not, which provides some support specifically for the luteal-phase method. However, the day of peak fertility was farther from the end of the cycle for longer cycles and closer to the end for shorter cycles on average. Despite the compatibility of the mean peak fertility day with general expectations, use of any of the standardized methods would have missed the peak fertility day for the majority of women, even assuming perfect attendance at the clinic during the predetermined 3-day window. The luteal-phase and midpoint methods are based on cycle length, but only self-reported cycle length can be known prospectively, and it performed poorly for both algorithms. Even if the actual cycle length could be used in a prospective study, it would be only marginally better than self-reported cycle length. The luteal-phase method slightly improved on the midpoint method, but only if the actual cycle length was used.
Several factors may have contributed to the poor performance of the standardized methods. Natural biologic variability in individual women is complicated by the fact that the event (LH surge) occurs in a small window (approximately 1 day), so that even modest errors in the timing of observation can miss the event completely. In addition, both we and others (9, 11) have found that women's self-reported cycle lengths, which are used to schedule visits in the standardized methods, do not correlate well with their actual cycle lengths. This may be due in part to the fact that women's cycle lengths vary from one cycle to the next, so that even if they correctly characterize length on average, it may be inaccurate for the study cycle. Thus, even prospectively observing a woman's cycle length prior to any clinic visits in order to appropriately schedule clinic visits during a subsequent cycle does not necessarily improve visit timing due to within-woman variability.
Using a fertility monitor is less burdensome and less expensive than collecting, storing, and analyzing daily urine or blood samples. However, the fertility monitor is not suitable for use in populations with extremely short (<20 days) or long (>42 days) cycles. In the BioCycle Study, the women were fairly homogeneous and highly educated, and they were selected to have “normal” menstrual cycles. Their adherence to the fertility monitor protocol was good despite the fact that they were not trying to become pregnant, one of the traditional motivating factors in reproductive studies. Nevertheless, achieving this level of protocol adherence in other study populations might be difficult. Poor compliance would undermine the utility of the monitor, but the monitor does not need to be used in isolation. In the BioCycle Study, women who did not reach peak fertility, whether because of logistic issues related to testing or because they were anovulatory, still had clinic visits that were timed on the basis of their individualized calendars. The visits for these women may not have been timed correctly, but presumably, the timing was no worse than if we had not used a monitor at all.
Although the fertility monitor does not provide as much information as collecting daily biospecimens, it does seem to be useful for timing clinic visits, particularly compared with standardized methods of scheduling based on cycle length. In fact, the monitor gives instant results, whereas daily biospecimens are generally analyzed at the end of a study, so the monitor is advantageous if other timed measurements (e.g., test of muscle strength) are desired. In addition to identifying the LH surge, the fertility monitor could be used in studies where the initial estrogen surge in the follicular phase (corresponding to the first day of high fertility on the monitor) or a specific day of the luteal phase (corresponding to a set number of days after the monitor peak) is of interest. The monitor would be most useful for researchers who have limited funds but need to time clinic visits to particular phases of the menstrual cycle, especially if the study population includes women with flexible schedules and “regular” menstrual cycles.
Author affiliations: Division of Epidemiology, Statistics, and Prevention Research, the Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland (Penelope P. Howards, Enrique F. Schisterman); Department of Social and Preventive Medicine, School of Public Health and Health Professions, University at Buffalo, Buffalo, New York (Jean Wactawski-Wende, Jennifer E. Reschke, Andrea A. Frazer, Kathleen M. Hovey); and Department of Gynecology-Obstetrics, School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, New York (Jean Wactawski-Wende).
This work was supported by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health.
The authors would like to thank Scott Littlejohn of Inverness Medical Innovations, Inc., for his assistance in improving their understanding of all aspects of the fertility monitor and Dr. James Kesner of the National Institute for Occupational Safety and Health for generously creating an idealized menstrual cycle figure.
Conflict of interest: none declared.