To investigate if strict infertility diagnoses correlate with clinical judgment and how adjudicated diagnosis may improve accuracy in predicting IVF success.
Current criteria for infertility diagnoses were determined by literature review. Charts of IVF patients treated between 2004 and 2006 were adjudicated according to these strict criteria. Agreement with each patient’s clinical diagnosis as entered into SART was quantified using Kappa statistics. Pregnancy rates were calculated for each diagnosis by clinical and strict criteria. Success rates for diagnoses based upon each criterion were compared using multivariable logistic regression with adjustment for repeated measures.
432 women underwent 590 IVF cycles. Kappa statistics showed only moderate agreement between strict and clinical diagnosis of endometriosis, male, and tubal factor. PCOS diagnosis was less correlated. Uterine, unexplained and diminished ovarian reserve (DOR) diagnoses showed the poorest agreement between diagnostic criteria. There are considerable differences based on the two criteria for diagnosis. There was poor agreement between diagnostic criteria in patients with multiple diagnoses. By strict criteria, these patients were significantly less likely to have a live birth than those with a single diagnosis (OR=0.61, p=0.019). This finding was similar with clinical criteria (OR=0.68, p=0.06).
There is poor correlation between clinical infertility diagnoses and strict criteria. Diagnoses with objective criteria showed higher correlation than those with subjective criteria indicating variability in a clinician’s diagnosis. Success rates in some diagnostic categories changed markedly when strict criteria were applied. Patients with multiple diagnoses may have lower success. Accurate infertility diagnoses are important to provide patients with accurate prognosis. Moreover, lack of precision in underlying diagnosis may affect the validity of past and future research using administrative datasets including SART.
A clinical diagnosis of infertility may not agree with strict criteria based on recent review of medical literature. Standardized definitions of diagnostic categories are essential for accurate patient prognosis and future research.
Prior to attempting in vitro fertilization (IVF), patients almost always express a desire to know their chance of success. The Society for Assisted Reproductive Technology (SART) makes publicly available the self-reported data of participating IVF clinics throughout the United States. Patients are able to access these data via the SART website (www.sart.org) and find specific success rates for their age group and diagnostic category. Clinicians often refer to these rates when counseling patients on their prognosis for pregnancy. Previous authors have demonstrated that age and infertility diagnosis are strong predictors of ultimate success (1, 2). In one population-based study, older patients were found to be more likely to have “unexplained” and tubal factor infertility, while younger women were more likely to have ovulatory dysfunction or endometriosis (3). Secondary infertility has also been associated with an increased chance of becoming pregnant with IVF (4).
Each participating SART clinic defines the specific criteria for infertility diagnoses given to its patients. In clinical practice, patients may be given one diagnosis when in fact they do not meet the strict criteria for a specific condition. Correct characterization of a patient’s etiology is essential to provide them with their true prognosis for achieving pregnancy. Furthermore, some authors have questioned whether different clinics can be appropriately compared to each other because of differences in populations, number of cycles and methods to determine diagnoses (5). We hypothesize that the clinical criteria by which many of these diagnoses are made affect the prognostic value of the success rates quoted to patients.
Current criteria for specific infertility diagnoses that form SART diagnostic categories were reviewed in recent medical literature. Special emphasis was given to position statements from ASRM and ESHRE as well as systematic reviews which analyze the breadth of studies available. Objective criteria for each diagnostic category were determined. IVF patients enrolled for other studies at the University of Pennsylvania between December 2003 and June 2006 had their clinical records reviewed and abstracted, and their infertility diagnoses adjudicated according to these strict criteria, by trained personnel. Couples were permitted to have multiple diagnoses as long as they met the minimum criteria in each category. Institutional Review Board approval was obtained prior to chart abstraction.
Adjudicated “strict” diagnoses were then compared with the clinical diagnoses as entered into SART. The degree of agreement between clinical and “strict” diagnoses was calculated using Kappa statistics for specific diagnostic categories and evaluated according to the method of Landis et al. (6). Clinical pregnancy rates per transfer were calculated for each stratum and compared using generalized estimating equations models, an extension of logistic regression that adjusts for repeated measures per subject. All calculations were performed using Stata v.10, College Station, Texas. This study was designed to assess agreement between diagnostic criteria and was not powered to assess for differences between pregnancy rates of the two groups.
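The agreement measure described above can be illustrated with a minimal sketch. It computes Cohen's kappa, (p_o - p_e) / (1 - p_e), for one diagnostic category coded as present (1) or absent (0) under each set of criteria, and maps the result to the descriptive bands of Landis and Koch. The function names and the ten example cycles are hypothetical and are not study data; the study itself used Stata rather than this code.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of marginal proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one label is used
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Descriptive strength of agreement per Landis and Koch."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical example: tubal factor coded per cycle (1 = present).
clinical = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # diagnosis as entered clinically
strict = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]    # diagnosis after adjudication
kappa = cohen_kappa(clinical, strict)
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)})")
```

In this made-up example the two criteria agree on 8 of 10 cycles, yet kappa is only about 0.58 because roughly half of that agreement would be expected by chance alone.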
Charts for 432 women who underwent 590 IVF cycles were adjudicated according to strict criteria. Live birth rates for each diagnostic criterion are presented in Table 1. The degree of agreement, represented by Kappa coefficients, between clinical and strict diagnoses was poorest among patients with a diagnosis of uterine factor or diminished ovarian reserve. Strict criteria for unexplained infertility and PCOS showed slightly improved agreement with clinical criteria. While there was moderate agreement for diagnoses of endometriosis, tubal factor and male factor, there remained 20% or greater discordance between clinical and strict diagnoses.
There was at least a 3% absolute change in pregnancy rate for every diagnostic criterion when strict and clinical criteria were compared. When pregnancy rates were calculated for each diagnostic category, success rates changed by more than 15 percent for patients with uterine factor, unexplained infertility and diminished ovarian reserve. Pregnancy rates decreased when strict criteria were applied for most diagnostic categories with the exception of diminished ovarian reserve. Patients with multiple factors were less likely to achieve a pregnancy regardless of which criteria were applied; however, their likelihood of pregnancy was even lower with adjudicated diagnoses. By strict criteria, these patients were significantly less likely to have a live birth than those with a single diagnosis (OR=0.61, p=0.019). This finding was similar with clinical criteria (OR=0.68, p=0.06).
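To make the odds ratios above easier to read, the following sketch shows how an unadjusted odds ratio and an approximate 95% Wald confidence interval are obtained from a two-by-two table. The counts are hypothetical and chosen only for illustration. The reported estimates came from generalized estimating equations that account for repeated cycles per woman, which this simple calculation does not do.

```python
import math


def odds_ratio(events_group, total_group, events_ref, total_ref):
    """Unadjusted odds ratio with an approximate 95% Wald interval."""
    a = events_group                # live births, multiple diagnoses
    b = total_group - events_group  # no live birth, multiple diagnoses
    c = events_ref                  # live births, single diagnosis
    d = total_ref - events_ref      # no live birth, single diagnosis
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Hypothetical counts, not study data: 30 live births in 120 cycles with
# multiple diagnoses versus 100 live births in 300 cycles with one diagnosis.
or_est, ci_low, ci_high = odds_ratio(30, 120, 100, 300)
print(f"OR = {or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An odds ratio below 1, such as the 0.61 reported under strict criteria, means the odds of live birth were lower in the multiple-diagnosis group than in the single-diagnosis group.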
These data provide evidence that there is poor agreement between clinical infertility diagnoses and evidence-based, strict infertility diagnosis. Diagnoses with objective criteria showed higher agreement than those with subjective criteria indicating variability in a clinician’s diagnosis. Furthermore, success rates in some diagnostic categories changed markedly when strict criteria were applied. With the exception of diminished ovarian reserve, success rates dropped in all other categories. This discrepancy with DOR patients might reflect that isolated diminished ovarian reserve is actually rare and that it may not carry the same implications as DOR associated with increased age or endometriosis. Furthermore, it is important to note that patients with multiple diagnoses may have lower success than those with a single diagnosis. Given such wide variation in pregnancy rates between clinical and adjudicated diagnoses, we feel it is therefore imperative that clinicians make the most accurate diagnosis when providing their patients with an estimate of their probability of achieving pregnancy.
Previous studies have examined the prognosis for pregnancy associated with specific infertility diagnoses such as tubal factor, endometriosis and PCOS (7-10). Our results may help explain apparent inconsistencies among studies in the literature. According to SART and previous studies, endometriosis patients have no difference in IVF success compared with other groups (12). However, there are conflicting studies which suggest that endometriosis may be associated with a lower chance of success. A meta-analysis published by our group confirmed that these patients have a lower chance of pregnancy in IVF and that more severe forms of endometriosis resulted in lower success (9). The discrepancy between previous studies may be due to differences in how endometriosis was diagnosed or coded in the SART database. While our own small sample did not allow for subdivision of endometriosis into minimal, mild, moderate and severe subcategories, instituting these criteria into SART may prove useful in determining patient-specific prognosis. Furthermore, lack of precision in underlying diagnosis may affect the validity of past and future research. It is essential that investigators work towards a standardization of diagnostic criteria for all infertility diagnoses in the manner that the Rotterdam Conferences standardized the diagnosis of PCOS (13).
When examining SART clinic-specific success rates, it is important to examine diagnosis-specific rates (11). The current SART database does not establish specific criteria for each of these diagnoses, but merely offers diagnostic guidelines to participating clinics. In order to accurately compare success rates between clinics, standardization of these criteria is necessary. Careful and critical inspection of published studies in a systematic review of the literature is called for in determining which criteria have the most evidence for affecting outcome. Consensus statements such as the Rotterdam Criteria for PCOS are particularly helpful in bringing together experts in the field to establish definitive criteria for specific diagnoses. We have demonstrated how merely standardizing these criteria in our own practice affected diagnosis-specific prognosis. As patients also look at these success rates in order to choose a clinic, standardized and accurate reporting becomes more important. Accurate infertility diagnoses are important to provide patients with accurate prognosis and help them in deciding how and where to best pursue fertility.
There is poor agreement between clinical infertility diagnoses entered in the SART registry and strict criteria which may affect the accuracy of success rates reported.