It is unknown whether data obtained from maternal self-report for assisted reproductive technology treatment parameters and reproductive history are accurate for use in research studies.
We evaluated the accuracy of self-reported assisted reproductive technology treatment and reproductive history from the Upstate KIDS study in comparison with clinical data reported to the Society for Assisted Reproductive Technology Clinic Outcome Reporting System.
Upstate KIDS maternal questionnaire data from deliveries between 2008 and 2010 were linked to data reported to the Society for Assisted Reproductive Technology Clinic Outcome Reporting System. The 617 index deliveries were compared as to treatment type (frozen embryo transfer, and donor egg or sperm) and use of intracytoplasmic sperm injection and assisted hatching. Use of injectable medications, assisted reproductive technology, or frozen embryo transfer prior to the index deliveries was also compared. We report agreement (A), in which both sources recorded Yes or both recorded No, and sensitivity (S) of maternal report using the Society for Assisted Reproductive Technology Clinic Outcome Reporting System as the gold standard. Significance was determined using the chi-square test at P<0.05.
Universal agreement was not reached on any parameter but was best for treatment type of frozen embryo transfer (agreement = 96%; sensitivity = 93%) and use of donor eggs (agreement = 97%; sensitivity = 82%) or sperm (agreement = 98%; sensitivity = 82%). Use of intracytoplasmic sperm injection (agreement = 78%; sensitivity = 78%) and assisted hatching (agreement = 57%; sensitivity = 38%) agreed less well with self-reported use (P<0.0001). In vitro fertilization (agreement = 82%) and frozen embryo transfer (agreement = 90%) prior to the index delivery were more consistently reported than was use of injectable medication (agreement = 76%) (P<0.0001).
Women accurately report in vitro fertilization treatment but are less accurate about procedures handled in the laboratory (intracytoplasmic sperm injection or assisted hatching). Clinics might better communicate with patients on use of these procedures and researchers should use caution when using self-reported treatment data.
More than five million babies have been born worldwide from assisted reproductive technology (ART), close to three million of these within the past six years.1 Numerous studies suggest that there is an increase in adverse outcomes in pregnancies resulting from ART.2–12 Not only is there a higher rate of multiple birth from these pregnancies,13 but increases in low birthweight, prematurity, small for gestation babies, and malformations are found in ART deliveries even in singletons.14–17 Multiple authors have called for outcome studies evaluating the long-term health of these children and their mothers and have outlined the difficulties in getting these studies accomplished.18–20
Research on infertility can be performed using clinical diagnostic and treatment data, vital records data, or self-reported survey data and there are relative strengths and weaknesses to each of these data sources. With regard to self-reported data, we have previously evaluated the accuracy of report of IVF treatment in Upstate KIDS surveys and found it to be accurate.21 However, we have looked at treatment parameters in a very small group of 77 survey participants who underwent IVF treatment in Massachusetts and found mixed reporting accuracy.22
In order to use maternal self-reported data for research purposes we must have confidence that treatment information is recalled and reported accurately. This study compared self-reported parameters of ART treatment in the maternal survey of the Upstate KIDS study with clinical data in the Society for Assisted Reproductive Technology Clinic Outcome Reporting System (SART CORS) database. Accuracy of maternal self-report of treatment type and treatment parameters on the index pregnancy was assessed. The secondary objective was to investigate whether time to survey, age of the mother, previous ART use, or presence of male factor infertility (as reported in SART CORS and known to be associated with increased use of intracytoplasmic sperm injection [ICSI]) affect reporting accuracy.
Data were obtained from two sources, the Upstate KIDS maternal questionnaires and the SART CORS clinical ART data.
The Upstate KIDS Study used New York State’s Perinatal Data System to identify all live births occurring to resident mothers of Upstate New York (57 NY counties excluding the 5 boroughs of New York City) between 2008 and 2010.23 Upstate KIDS was designed to obtain a population-based cohort of infants conceived with and without infertility treatment, including ART, for the assessment of children’s growth and development. All infants for whom the infertility treatment box was checked on their birth certificates were recruited, as were all infants of multiple births irrespective of treatment status. Women delivering singletons conceived without infertility treatment were recruited based on a paradigm including frequency matching at a 3:1 ratio to women delivering singletons conceived with treatment within the perinatal region of delivery. The majority (93%) of Upstate KIDS mothers returned a self-administered questionnaire within 4–6 months of delivery. An incentive of $30 was provided to participants along with reminder calls and emails to achieve a high response rate. For this study we evaluated questions about ART treatment for the index delivery (Question 23) as well as questions about use of ART in previous pregnancies (Question 25). The New York State Department of Health (NYSDOH) and the University at Albany (State University of New York) Institutional Review Boards approved the study and served as the Institutional Review Boards under a formal Reliance Agreement with the Eunice Kennedy Shriver National Institutes of Health. All participants provided written informed consent prior to data collection.
The SART CORS database is used by SART to collect national ART data under the Fertility Clinic Success Rate and Certification Act of 1992 (Public Law 102–493) and to report these data to the Centers for Disease Control and Prevention (CDC). SART CORS collects data from more than 90% of US ART clinics and includes over 95% of the US ART cycles. The data collected include patient demographic information (age, race, height and weight), reproductive history (prior cycles of ART and intrauterine insemination, and female infertility diagnosis), cycle-specific treatment data (fresh vs. frozen cycle, use of autologous or donor oocytes or embryos, use of ICSI, assisted hatching (AH) and other laboratory techniques, numbers of embryos transferred and quality of embryos transferred) and outcome data (cancellation, treatment outcome, pregnancy outcome, birthweight, gestational age). Data are validated annually through review by SART and CDC with yearly site visits to a random selection of clinics to check records for completeness and accuracy of data collection and data entry (http://www.cdc.gov/art/ART2011/NationalSummary_appixa.htm). SART CORS data for this study included fields related to use of donor gametes, micromanipulation and prior treatment.
Upstate KIDS deliveries were linked to ART cycles containing a birth outcome reported to SART CORS as previously described.21 Briefly, deliveries were linked using identifiers for the mothers, the infants, and the delivery information. Approximately 89% of the women linking to SART CORS had been invited to participate. Overall participation in the study was 27%.
We analyzed how closely the two data sources agree and the rates of reports of each treatment type by each data source. For clinical treatment parameters SART CORS was used as the gold standard; for prior treatment, however, we used maternal self-report as the more accurate measure. The process included the evaluation of the percentage agreement between the two data sources for each of the parameters: donor gametes (sperm or oocytes), use of intracytoplasmic sperm injection (ICSI, listed as some or all oocytes within SART CORS), assisted hatching (AH, listed as some or all embryos in SART CORS), and use of fresh or frozen embryos for ART transfer. We determined in which data source the reported use of each of these parameters was greater. We also evaluated use of gamete intrafallopian transfer (GIFT), zygote intrafallopian transfer (ZIFT), and a gestational carrier; for the index delivery, however, there was no SART CORS reporting of these procedures. In addition, we evaluated self-report of procedural treatments that are a part of ART treatment, such as vaginal ultrasound and administration of medications. Because fresh and frozen ART treatment may make greater or lesser use of vaginal ultrasound, we evaluated this treatment in all cycles and separately in cycles using fresh oocytes and embryos only.
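As an illustration of the agreement measure described above, the following minimal Python sketch counts concordant pairs (both sources Yes or both No) and tallies how often each source reported use. The function names and the paired records are hypothetical and are not taken from the study data.

```python
def percent_agreement(pairs):
    """Proportion of records where self-report and registry give the same answer.

    pairs: iterable of (self_report, registry) booleans, one per delivery.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no paired records supplied")
    concordant = sum(1 for self_report, registry in pairs if self_report == registry)
    return concordant / len(pairs)


def reported_use_counts(pairs):
    """Count 'Yes' answers in each source to see which source reports more use."""
    pairs = list(pairs)
    self_yes = sum(1 for self_report, _ in pairs if self_report)
    registry_yes = sum(1 for _, registry in pairs if registry)
    return self_yes, registry_yes


# Hypothetical paired answers for one parameter (illustrative values only).
records = [(True, True), (False, False), (True, False), (False, True), (True, True)]
agreement = percent_agreement(records)
self_yes, registry_yes = reported_use_counts(records)
print(f"agreement = {agreement:.0%}, self-report Yes = {self_yes}, registry Yes = {registry_yes}")
```

Agreement counts Yes/Yes and No/No pairs equally, so it can remain high for a rarely used procedure even when few true users report it; this is one reason sensitivity is reported alongside it.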
Sensitivity was determined for each of these parameters treating SART CORS information from the index cycle as the gold standard. SART CORS is considered the gold standard for these items since these are clinical data and are validated as described above. Sensitivity was defined as the proportion of women with a certain ART parameter in SART CORS who correctly reported it in the maternal questionnaire; 95% confidence intervals were calculated using the Agresti-Coull method.24 Logistic regression was used to estimate unadjusted odds ratios (OR) and their 95% confidence intervals (95% CI) to identify participant characteristics (i.e., maternal age, time to report, male factor, previous ART) associated with sensitivity (on parameters that had less than perfect agreement). The analytic sample included all women who had the procedures according to the gold standard, allowing assessment of sensitivity. It did not, however, include all women who did not have ART, and thus specificity was not estimated. Also, specificity would most likely be very high (>99%) given the relatively rare occurrence of ART compared to non-ART deliveries. In sensitivity analysis, we also evaluated the effect on the results of weighting the analyses by twins, who are oversampled in this cohort.
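The sensitivity definition and the Agresti-Coull interval can be sketched in Python as follows. This is a minimal illustration under the standard textbook form of the Agresti-Coull adjustment, not the authors' analysis code; the function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sensitivity_agresti_coull(true_pos, gold_pos, level=0.95):
    """Sensitivity with an Agresti-Coull confidence interval.

    true_pos: women positive in the gold standard who also self-reported Yes.
    gold_pos: all women positive in the gold standard.
    Returns (sensitivity, lower, upper).
    """
    if gold_pos <= 0 or not 0 <= true_pos <= gold_pos:
        raise ValueError("need 0 <= true_pos <= gold_pos and gold_pos > 0")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
    n_adj = gold_pos + z ** 2                      # adjusted sample size
    p_adj = (true_pos + z ** 2 / 2) / n_adj        # adjusted proportion (interval centre)
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    lower = max(0.0, p_adj - half_width)
    upper = min(1.0, p_adj + half_width)
    return true_pos / gold_pos, lower, upper


# Hypothetical counts: 38 of 100 gold-standard positives self-reported the procedure.
sens, lower, upper = sensitivity_agresti_coull(38, 100)
print(f"sensitivity = {sens:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The interval is centred on the adjusted proportion rather than the raw sensitivity, which keeps its coverage reasonable when the denominator is small or the proportion is close to 0 or 1.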
For comparing information on fertility treatment received to achieve a delivery prior to the index delivery, sensitivity estimates were based on maternal report as the preferred standard rather than SART CORS. This decision is based on previous observations by two of the authors (Luke and Stern) that the prior ART cycle fields do not agree with information on prior cycles recorded for these women in longitudinally linked cycles within SART CORS. For example, national linked data show that for the second and third treatment cycles (cycles #2 and #3) 18.1% and 14.3% respectively are reported as having no prior fresh cycles and the number entered in that field may also have been reported as being much higher than 2 or 3 (up to 12).
We analyzed 617 Upstate KIDS deliveries that were linked to ART cycles reported to SART CORS. Mothers completed questionnaires upon enrolling their infant in the Upstate KIDS study at approximately 4 months following delivery with a median time to report of 147 days. Demographics of the linked cycles as reported in the Upstate KIDS maternal exposure questionnaire are presented in Table 1. The majority of mothers were age 35 years or older at delivery, of white non-Hispanic race/ethnicity, with a college or higher education and covered by private health insurance.
Table 2 shows the results of comparisons of treatments used for the index delivery as reported to SART CORS compared to self-reported maternal treatment data. No parameter had one hundred percent agreement. Strongest agreement (the sum of both sources indicating “Yes” and both sources indicating “No”) was found for medication use (83%), use of donor gametes (97–99%), and use of frozen embryos (96%). The greatest differences between self-report and clinical data were found for the laboratory-performed micromanipulation procedures of ICSI and AH. For calculations of sensitivity, SART CORS was used as the gold standard.
We evaluated the adjusted odds of whether maternal age, male factor infertility or prior ART services affected sensitivity estimates for each of the parameters in Table 2. Adjusted sensitivities were all estimated for median maternal age (36 years) and median time-to-report (147 days), male factor infertility (yes/no) and prior ART services (yes/no). Prior ART had an effect on the sensitivity for several parameters. For example, using SART CORS as the gold standard, women with prior ART had higher odds than women with no prior ART of correctly reporting medication administration for the index birth (odds ratio 3.73, P=0.003), ICSI (1.93, P=0.013), and AH (1.66, P=0.030). Diagnosis of male factor infertility was positively associated with the reporting of ICSI, with an odds ratio of 1.81 (P=0.036). No other parameters were significant (not shown).
Maternal self-report of fertility treatment received to achieve a delivery prior to the index delivery is shown in Table 3. For this analysis the maternal self-report was considered to be more likely correct than the clinical data in SART CORS and this was used as the standard for calculation of sensitivity. For prior ART history, maternal reported ART agreed poorly with SART CORS reported information for prior treatments of ART, frozen embryo transfer (FET) and use of gonadotropins.
This study demonstrates that while the ART treatment parameters of medication use, use of donor gametes, and fresh versus frozen embryos are accurately reported by mothers, self-report is less valid for more specific laboratory procedures such as ICSI and AH. The study results agree with a previous study by members of this study team in a different state but involving smaller numbers of participants.22 The data suggest that although ART is reported accurately, other treatment parameters, particularly those performed within the laboratory, are less well understood and reported.
Studies of ART can be performed using clinical data, vital records data, and survey data. Clinical data would be expected to accurately reflect the clinical treatment history, but collection of these data can be difficult and expensive and may not provide information on sufficient numbers of individuals to achieve adequate sample size. Vital records, such as birth certificates, are another potential source of information on infertility treatment, and many states collect this information at delivery. Nevertheless, prior studies using national ART data have demonstrated these records, at least in some states, to be less than accurate reflections of the use of ART. In particular, studies in Massachusetts and Florida, when compared to national ART data, have shown high specificity (>99%) but low sensitivity (<42%) for birth certificate information.25–26 The low sensitivities may reflect the fact that the questions on the birth certificates have historically not clearly differentiated ART from other assisted reproduction treatments. In New York State, where Upstate KIDS was able to do a comparison with birth certificate data, the sensitivity was higher (55%).23
Some studies are most feasibly and inexpensively performed through surveys of large numbers of consented participants who self-report their own diagnostic and treatment data. The Upstate KIDS Study obtained such diagnostic and treatment history using standardized questionnaires completed by mothers with live births who delivered pregnancies in NY State (excluding New York City). In a prior study21 we compared the accuracy of maternally reported data on the use of ART treatment with clinically collected data in the SART CORS database and found excellent agreement on whether ART was used versus not used. Other attempts to validate maternal self-reported ART parameters in a variety of locations have met with mixed success,27–29 but specifics of ART treatment self-reported by women who have had ART have rarely been studied. In a recently published research letter we looked at self-reported parameters for ART treatment in a small group (N=77) of Massachusetts women who reported receiving ART in the National Birth Defects Prevention Study (NBDPS) and compared these to treatment parameters reported in SART CORS.22 That study demonstrated mixed results, with some treatment parameters agreeing with high sensitivity and others having much lower sensitivity.
Our findings underscore that many aspects of ART are accurately reported by mothers, though technical laboratory aspects, such as assisted hatching, are reported less accurately. To minimize information bias, questionnaires should be designed to capture information at the maximum sensitivity. Some strategies for designing questionnaires include testing survey items in focus groups prior to their implementation or doing small validation studies against medical reports. However, there are no perfect instruments and some level of misclassification is expected. In fact, there are statistical methods for correcting misclassification or examining its effect in sensitivity analyses so that the data may be retained for analysis. However, such analyses have revealed that bias would be too great to correct when sensitivity is less than 0.6.30 From our findings, this would mean that maternal report of assisted hatching, given a sensitivity of 38%, should not be used but that the reports for ICSI and other factors, though imperfect, are potentially correctable. Our sensitivities, if replicable, could potentially be used as a reference for the research community to apply correction methods to their own data when maternal report is the only data available. The higher sensitivities of certain techniques (e.g., donor gametes, frozen cycle) suggest that maternal report could be relied on for future follow-up of offspring, although further studies may be needed to evaluate whether recall accuracy would decrease over greater length of time (such as years after treatment).
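One simple way a reference sensitivity could be applied is sketched below in Python. It uses the standard Rogan-Gladen estimator for a misclassified proportion, which is a swapped-in textbook method and not necessarily the approach of reference 30. The 0.6 cut-off comes from the text above; the function name, the observed proportion, and the assumed specificity of 1.0 are hypothetical.

```python
def corrected_proportion(observed, sensitivity, specificity=1.0, min_sensitivity=0.6):
    """Rogan-Gladen correction of an observed (self-reported) proportion.

    observed: proportion of women self-reporting the procedure.
    sensitivity, specificity: assumed accuracy of self-report versus the gold standard.
    min_sensitivity: below this value correction is refused, following the 0.6
        threshold cited in the text (the check itself is an illustrative choice).
    """
    if sensitivity < min_sensitivity:
        raise ValueError("sensitivity too low for a reliable correction")
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (observed + specificity - 1) / denominator
    return min(1.0, max(0.0, estimate))  # keep the estimate inside [0, 1]


# ICSI sensitivity of 0.78 is from the abstract; the observed proportion is made up.
observed_icsi = 0.50
print(round(corrected_proportion(observed_icsi, sensitivity=0.78), 3))

# Assisted hatching (sensitivity 0.38) falls below the threshold and is rejected.
try:
    corrected_proportion(0.20, sensitivity=0.38)
except ValueError as err:
    print("not corrected:", err)
```

With specificity fixed at 1.0 the correction reduces to dividing the observed proportion by the sensitivity, so any uncertainty in the sensitivity estimate carries directly into the corrected value.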
We found that maternal age and time to survey had no effect on reporting of any parameters; however, it should be noted that all women were surveyed at least 8–9 months after their ART cycle (which established the pregnancy that was then delivered) and that a survey done immediately after treatment might yield better agreement. Prior ART did influence the accuracy of the self-reported parameters suggesting that women who had more experience with ART treatment were more likely to understand the procedures being performed, and the terminology that might be used. Report of ICSI use was also enhanced by the knowledge that their cause of infertility was male factor. For the very small number of women who, in SART CORS, were reported to have used donor gametes but who did not report them in the maternal survey, it is possible that these women were hesitant to reveal the use of these donations as would be consistent with the known tendency to secrecy in couples using gamete donation.31
The strengths of this study are in the use of accurate clinical data from the SART CORS database and the large numbers of births used for analysis; however, there are some limitations. One limitation is the possibility of error in the linkage between the two data sources, although standard statistically robust methods were used with high accuracy as previously described.21 In addition, SART CORS data are known to be less accurate with regard to prior treatment than they are with regard to the index ART cycle, which is why we used maternal report as the standard for data in Table 3. Another potential limitation of the maternal questionnaire was that we did not have enough space to provide thorough explanations of procedures such as assisted hatching or ICSI. Perhaps, if given more information as to what these procedures entail rather than having to recognize them from their terminology alone (for example, explaining that the term “intracytoplasmic sperm injection” refers to a procedure where the sperm is injected into the egg by an embryologist), maternal recall may prove closer to the data from SART CORS. With regard to questions about use of medications and vaginal ultrasound, accuracy may also be affected by interpretation of the question. IVF includes use of both of these treatments, but women could have reasonably interpreted the question to relate to the treatment as a separate entity as opposed to part of a comprehensive IVF procedure. Using New York State birth certificate information as our sampling base may also have increased selection of women whose information is correct and, thereby, women who might have been more knowledgeable about their treatment. Furthermore, the low response to participation may have further increased self-selection of women who were more willing to share information about their ART treatment or were more familiar with the procedures. However, response was within what has been reported in contemporary birth cohorts.32
The findings in this study suggest that mothers more accurately self-report some ART treatment parameters than others. In particular, those aspects of ART that are not directly performed on the couple and take place in the laboratory are less often correctly reported. The data suggest a role for enhanced communication between clinicians and couples undergoing treatment, including more complete conveyance of procedures carried out in the laboratory. Caution is needed when considering the use of self-reported data pertaining to very specific and less common aspects of treatment such as assisted hatching. Researchers may also consider providing more detailed descriptions in questionnaires rather than relying on participant recognition of the technical terminology. Nevertheless, dichotomized ART (yes/no) obtained from mothers appears acceptable for use in research.
SART thanks all of its members for providing clinical information to the SART CORS database for use by patients and researchers. Without the efforts of SART members, this research would not have been possible.
Supported by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD; contracts #HHSN275201200005C, #HHSN267200700019C). The authors thank all the Upstate KIDS participants and staff for their important contributions.
Conflict of Interest: The authors report no conflict of interest.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.