In this manuscript, we address the fact that an increasing number of published studies in reproductive medicine selectively report outcomes only for favorably selected patients while failing to note that outcome data so reported cannot be applied to unselected patient populations. Almost all favorable patient selection methods, starting with prolonged embryo culture to blastocyst stage, have thus been widely misrepresented in the literature, since they almost universally report outcomes only in reference to embryo transfer. These outcome reports, however, do not include outcomes for poorer prognosis patients who do not reach embryo transfer. Study outcomes are universally applicable only if obtained in unselected patient populations and reported with reference point cycle start (intent to treat). All other studies greatly exaggerate clinical pregnancy and live birth rates if applied to general populations, unless they specifically note that their results can be extrapolated only to women who reach embryo transfer.
A recent PubMed search under the term in vitro fertilization (IVF) currently reveals approximately 38,500 entries, with many unfortunately reporting misleading results because investigators generalized outcomes, which had been obtained in highly preselected patient populations.
Treatment outcomes should, of course, never be generalized, and most investigators are aware of that. Unfortunately, it nevertheless happens only too often. When investigating utilization of elective single embryo transfer (eSET) in 2004, Thurin et al., for example, offered a very good example of doing it right and wrong at the same time: They correctly concluded that eSET should only be applied to women under age 36 years. They, however, failed to recognize that eSET in most IVF laboratories is predicated on blastocyst stage embryo culture (BEC), and that BEC, as will be discussed in detail below, results in favorable patient preselection, preferentially excluding women with low functional ovarian reserve (LFOR). LFOR is, however, as predictive of poor IVF outcomes as advanced female age.
Consequently, women with LFOR, even if under age 36, are still poorly advised when pursuing eSET (based on BEC). Thurin et al., thus, correctly recognized that advanced female age preselects patients with poor prognosis but failed to recognize positive selection biases associated with normal FOR. Their recommendation to restrict eSET to young women under age 36, therefore, also should have included, “with normal functional ovarian reserve (NFOR).” In other words, they incorrectly generalized their correct findings about age to all women under age 36. Unfortunately, such generalizations appear to have become more common in the published reproductive medicine literature in recent years.
Generalization of treatment outcomes was popularized in reproductive medicine by the pharma industry, which was more interested in maximizing market size for their products than in establishing specific preselected target populations, where products might work best. This is one important reason why infertility treatment protocols, including IVF stimulations, over decades have remained surprisingly static, and individualization of IVF care is only slowly increasing in popularity.
Other medical specialty areas, especially medical oncology and the treatment of genetic diseases, better than reproductive medicine, now demonstrate treatment paradigms characterized by individualization of treatment protocols, often called “personalized” or “precision” medicine. We are, thus, witnessing a retreat from universal medical treatment protocols toward recognition that best outcomes will be obtained with individualized care, which is based on different effectiveness of different therapies in different individuals. Such a paradigm shift requires that studied patients be well defined, and that treatment outcomes be interpreted strictly for the therapeutic benefit of specific populations.
Examples of inappropriate generalizations of outcomes abound in IVF. Likely, the first time such generalizations significantly affected worldwide IVF practice was when the concept of BEC was born, first proposed by Gardner et al. in 1998 [3, 4]. Increasing popularity of BEC has since led to additional BEC-dependent modifications in IVF practice, including the above noted eSET [3, 5, 6], preimplantation genetic screening (PGS) based on trophectoderm biopsy, embryo banking and, most recently, closed embryo incubation systems with time lapse imaging [9, 10]. Since all of these newly added procedures, alleged to enhance IVF outcomes, further encourage preselection of favorable prognosis patients, such patient selection has to be acknowledged by authors, and interpretations of reported outcomes have to be restricted to similarly preselected patients.
This is, however, not what happened following the original BEC studies, which had been performed in very favorable patient populations [3, 4, 12–14]. Those studies, unfortunately, were rather unabashedly generalized. As a consequence, BEC was increasingly utilized not only in good prognosis but also in intermediate and poor prognosis patients. Follow up studies in more general IVF populations, not surprisingly, then failed to confirm significant outcome advantage for BEC. The literature, indeed, convincingly demonstrates that outcome benefits from BEC were restricted only to good prognosis patients, and even in that patient population, BEC improved implantation rates for surviving embryos only marginally (i.e., immediate clinical pregnancy rates), as two Cochrane reviews convincingly demonstrate [15, 16].
Since IVF outcomes in good prognosis patients are also excellent with cleavage stage embryo transfers, cost-effectiveness of BEC, even in highly favorably selected patients, therefore, still remains to be established.
Both Cochrane reviews also demonstrate that in average, and especially in poor prognosis patients, BEC outcomes are even less favorable: In average prognosis patients, cleavage stage and BEC achieve similar IVF outcomes [15, 16]. BEC, therefore, does not appear indicated in average prognosis women. In poor prognosis patients, BEC actually reduces clinical pregnancy and live birth rates and, therefore, should be considered contraindicated.
Correct analyses of BEC outcomes, however, have to go beyond those conclusions. One also has to take into account that some embryos arrest between cleavage (day 3) and blastocyst stages (days 5/6), and that only embryos that survive BEC reach embryo transfer, and are given the chance of “better” implantation, pregnancy and live birth rates. Since the rate of such embryo loss increases from good, over intermediate to poor prognosis patients, BEC is incrementally deleterious in these three patient populations. This is also supported by the observation that cryopreservation of embryos at blastocyst stage effectively reduces the number of embryos available for cryopreservation. Any analysis of IVF cycle outcome with reference point embryo transfer is, therefore, inherently biased because it only includes preselected relative good prognosis patients, characterized by embryos that made it through BEC.
The patient populations in the original BEC studies (and in other studies reporting outcomes only with reference point embryo transfer), therefore, were actually favorably preselected twice: once by traditional selection criteria, such as age and FOR (FSH and AMH) [3, 4, 12–14], and a second time based on whether patients actually produced transferable blastocyst stage embryos.
Such serial preselection biases raise the question of what would happen to reported IVF outcomes if they were correctly calculated with reference point cycle start rather than embryo transfer. In the statistical literature such outcome assessments are called assessments “by intent to treat,” and are universally considered the most transparent and desirable way to report IVF outcomes [18, 19]. As the two Cochrane reviews referenced above convincingly demonstrate, BEC outcome assessments by “intent to treat” are unfortunately extremely rare [15, 16]. In practical terms this means that the vast majority of published BEC studies, including studies of other IVF treatment protocols that included BEC, have to be viewed with considerable suspicion.
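The difference between the two reference points can be made concrete with a small calculation. The following is only a minimal sketch: the cohort counts are hypothetical numbers chosen for illustration, not data from any cited study, and the names `CohortCounts`, `rate_per_transfer` and `rate_per_cycle_start` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class CohortCounts:
    """Hypothetical counts for one IVF cohort (illustrative only, not study data)."""
    cycles_started: int     # all patients who started a cycle
    embryo_transfers: int   # patients who actually reached embryo transfer
    live_births: int        # live births observed in the cohort


def rate_per_transfer(c: CohortCounts) -> float:
    """Live birth rate with reference point embryo transfer."""
    return c.live_births / c.embryo_transfers


def rate_per_cycle_start(c: CohortCounts) -> float:
    """Live birth rate with reference point cycle start (intent to treat)."""
    return c.live_births / c.cycles_started


# Assumed example: 100 cycles started, 60 reached transfer, 24 live births.
cohort = CohortCounts(cycles_started=100, embryo_transfers=60, live_births=24)

per_transfer = rate_per_transfer(cohort)
itt = rate_per_cycle_start(cohort)
inflation = per_transfer / itt  # how much the per-transfer figure overstates the rate

print(f"per embryo transfer: {per_transfer:.0%}")
print(f"per cycle start:     {itt:.0%}")
print(f"inflation factor:    {inflation:.2f}")
```

In this assumed cohort the same 24 live births yield two quite different rates, solely because patients who never reached embryo transfer are dropped from the first denominator.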
Quite a number of more recently introduced BEC-dependent practice changes to IVF have also been based almost exclusively on outcome reports with reference point embryo transfer. Those include embryo banking, PGS, closed incubation systems with time lapse imaging and others. These BEC-dependent embryo selection steps, therefore, represent a third level of favorable patient preselection after age/FOR and BEC, and their alleged respective effects on IVF outcomes should also be reassessed. Since they depend on BEC, it would not be surprising if, as with BEC, statistically correct assessments failed to confirm currently widely circulating claims of clinical benefits (for further detail, see below).
Considering above outlined difficulties in assessing how best immediate outcomes in IVF are to be determined, it may be more logical to assess IVF cycle outcomes not based on immediate pregnancy/live birth rates from one embryo transfer but on cumulative pregnancy and delivery chances from a single IVF cycle cohort of oocytes. This was, indeed, done in previously noted two Cochrane meta-analyses [15, 16], which demonstrate that all IVF patients, including good prognosis patients, achieve significantly higher cumulative pregnancy rates with cleavage than blastocyst stage embryo transfers.
Understanding the varying effects of BEC in different patient populations is, therefore, essential: In good prognosis patients, who in most IVF centers represent ca. 20% of women, implantation (and immediate pregnancy) rates increase statistically marginally with BEC. Due to the presumed loss during BEC of potentially healthy embryos, capable of producing normal pregnancies and live births if transferred at cleavage stage (day 3), cumulative pregnancy chances, however, decline (graphically demonstrated in reference 11). In average prognosis patients, who in most IVF centers represent ca. 60% of women, BEC is ineffective in increasing immediate IVF outcomes but likely causes no significant harm to them. In such patients, cumulative pregnancy chances are, however, also reduced by BEC. Finally, in poor prognosis patients, again representing on average ca. 20% of women, immediate as well as cumulative pregnancy chances are significantly reduced by BEC.
Accepting this analysis leads to the unavoidable conclusion that a considerable proportion of widely accepted BEC studies in the IVF literature are actually misleading in suggesting that BEC should be the routinely applied embryo culture method in IVF centers. Reevaluation of BEC utilization, therefore, appears overdue.
As already noted, a number of recent additions to routine IVF are co-dependent on BEC: They are PGS with trophectoderm biopsy at blastocyst stage, the concept of embryo selection via closed incubation systems with use of time lapse imaging and, at least to a degree, the recently increasingly popular practice of embryo banking. Finally, the concept of eSET is based on BEC because BEC is assumed to offer highest pregnancy and live birth chances at lowest twinning risk [1, 5].
As already noted briefly before, especially PGS introduces an additional patient preselection step among patients already favorably selected by age/FOR and with BEC because the risk of not reaching embryo transfer for lack of euploid embryos increases from good over intermediate to poor prognosis patients [15, 16]. This has been one reason (among a good number of others) why other investigators [22, 23] and we [21, 24] have criticized the increasing utilization of PGS in routine IVF cycles. Since this is not a primary subject of this commentary, we will here not be repetitive in our arguments. Only so much: Two prominent IVF centers [25–36] over the last few years transitioned their IVF cycles almost exclusively toward BEC+PGS.
Failing to demonstrate outcome improvements in pregnancy and live birth rates using this protocol, the latter group, since BEC facilitates eSET, suggested a new rationale for BEC+PGS: the reduction of twin pregnancies [32, 34]. eSET, indeed, reduces twin pregnancies but does so at the expense of pregnancy chances; whether reduction of twin pregnancies should be viewed as an indication for PGS is, in our opinion, therefore questionable [21, 24].
Since both of these groups included only patients who reached embryo transfer in their studies’ outcome reports, they completely excluded from consideration the impacts of the previously noted triple patient preselection biases based on (i) patient age and FOR, (ii) BEC, and finally (iii) PGS. They, thus, eliminated from consideration all patients who (i) did not qualify for treatments because of advanced age and/or LFOR, (ii) had no surviving embryos after BEC, and (iii) had no euploid embryos left for transfer after PGS.
Adding up the combined statistical effects of these three consecutive patient preselection steps, of course, greatly inflates reported clinical pregnancy and live birth rates. An unselected patient population, evaluated by intent to treat (i.e., with reference point cycle start), unquestionably, would demonstrate significantly poorer outcomes.
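How three consecutive preselection steps compound can be sketched numerically. Everything in the example below is an assumption made for illustration: the pass fractions for each step and the reported per-transfer rate are hypothetical, the steps are treated as independent multiplicative filters, and patients who do not reach embryo transfer are assumed to contribute no live births under this protocol.

```python
from math import prod
from typing import Iterable


def fraction_reaching_transfer(step_pass_fractions: Iterable[float]) -> float:
    """Share of an unselected population that survives every selection step."""
    return prod(step_pass_fractions)


def intent_to_treat_rate(rate_per_transfer: float,
                         step_pass_fractions: Iterable[float]) -> float:
    """Rate referred back to the unselected population.

    Assumes patients removed at any step have no live birth in this protocol.
    """
    return rate_per_transfer * fraction_reaching_transfer(step_pass_fractions)


# Hypothetical pass fractions for the three preselection steps (not study data).
steps = {
    "age/FOR": 0.80,  # (i) qualify by age and functional ovarian reserve
    "BEC": 0.75,      # (ii) have at least one embryo surviving to blastocyst
    "PGS": 0.50,      # (iii) have at least one embryo reported as euploid
}
reported = 0.50       # hypothetical live birth rate per embryo transfer

reach = fraction_reaching_transfer(steps.values())
itt_rate = intent_to_treat_rate(reported, steps.values())

print(f"reach embryo transfer:        {reach:.0%}")
print(f"reported rate (per transfer): {reported:.0%}")
print(f"rate by intent to treat:      {itt_rate:.0%}")
```

Under these assumed figures fewer than one in three unselected patients reaches embryo transfer, so the reported per-transfer rate is more than three times the rate that applies to the population as a whole.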
Reporting PGS in this way, however, also misleads in other ways: Especially poorer prognosis patients may be deprived of pregnancy and delivery chances by preventing them from reaching embryo transfer. If transferred at cleavage stage, such women might still have chances of conceiving and delivering healthy offspring [11, 15, 16].
Here voiced criticism of BEC+PGS does not even take into account recently published data, which suggest that PGS is unable to accurately assess embryo ploidy via a single trophectoderm biopsy [37–39]. Indeed, excellent clinical pregnancy and delivery rates of healthy offspring have been reported in highly unfavorable patients after transfer of embryos, previously reported as aneuploid [38, 39], strongly indicating that PGS represents a highly unreliable tool in determining embryo ploidy, and leads to the discarding of large numbers of entirely normal embryos.
Embryo banking, which has also greatly gained in popularity in recent years, is another newly introduced addendum to standard IVF that is co-dependent on BEC. It also suffers from outcome inflation under current reporting standards. U.S. national data demonstrate that, in association with embryo banking, poorer prognosis patients disproportionately do not reach embryo transfer. Patients who do reach embryo transfer, therefore, are once again preselected for better prognosis, and reported outcomes will again be characterized by inflated pregnancy and live birth rates.
That studies in favorably preselected patient populations bias outcomes was recently also demonstrated in association with closed incubation systems with time lapse imaging. The one prospectively randomized study of reasonable statistical power of such a system in the literature was performed in a highly preselected favorable patient population, including a large percentage of young oocyte donors. The authors must be given credit for noting this fact in their manuscript; yet the study’s data have nevertheless been inappropriately generalized. How these systems would affect poorer prognosis patients is, therefore, currently still unknown and certainly deserves exploration before such systems further enter general clinical IVF practice.
Patient biases affect not only clinical studies. We recently had the opportunity to see a basic genetic study in manuscript peer review, in which very obvious patient selection led to significant distortion in population distribution of the investigated gene mutation. Not recognizing this selection bias, the authors completely misinterpreted their own outcome data.
Finally, Dale et al. in an opinion piece in this journal recently pointed out the obvious limitations of all embryo markers in predicting IVF outcome since “the fate of each embryo depends on the orchestrated management of many physiological activation events that progress independently of the maternal or zygotic genome.” These authors also emphasized that, after almost 35 years of IVF practice, no evidence exists that the IVF laboratory can improve the “intrinsic” quality of gametes or embryos. We concur with their conclusions; significant progress in IVF outcomes is, therefore, unlikely to come from attempts at improving mature gametes or embryos (or their selection), as attempted by BEC, PGS, or closed incubation systems with time lapse imaging. If such improvements are to be achieved, it appears more likely that they will be the result of earlier interventions, primarily into follicle and oocyte maturation.
All study outcomes are only applicable to studied patient populations. If one accepts this indisputable premise, the problem with interpretation of the current IVF literature is well defined, and the solution obvious: Only unselected patient data, based on intent to treat (i.e., in reference to cycle start), should be considered acceptable as evidence that is applicable to all IVF patients. Since patient populations vary between IVF centers, this means that every IVF outcome study has to contain a description of the investigated patient population, which at minimum has to define in detail the age and FOR of studied patients. Every such study should also note in the manuscript that reported findings cannot be applied indiscriminately to patients who do not meet the study population’s characteristics.
These two considerations alone would beneficially impact all IVF-related research. Even more importantly, they would also quickly correct inappropriate and ineffective clinical IVF practices that have become popular over the last decade without appropriate prior vetting. Poorer prognosis patients, especially, will be the primary beneficiaries; but IVF patients in general can be expected to benefit from the recognition that individualization of patient care should become a major paradigm change in IVF practice.
None of the authors perceives any potential conflict with respect to the manuscript presented here. N.G. and D.H.B. are co-inventors on a number of pending and already awarded U.S. patents claiming therapeutic benefits from androgen supplementation in women with low functional ovarian reserve (LFOR) and relating to the FMR1 gene in a diagnostic function in female fertility. Both receive royalties from Fertility Nutraceuticals, LLC, in which N.G. also holds shares. N.G., D.H.B. and V.A.K. are also co-inventors on a pending AMH-related patent application. They report no other potential conflicts with the manuscript reported here.
An increasing number of IVF outcome reports in the literature exaggerate pregnancy and live birth rates because they do not assess them by “intent to treat.”