Of a total of 5171 records in the CDSR (Issue 1, 2008), 3385 were reviews rather than protocols, and 2492 of these met our eligibility criteria and contained at least one eligible meta-analysis of two or more studies. Following the removal of ineligible meta-analyses (Figure ), we were left with 2321 reviews, which contained 22,453 meta-analyses. These meta-analyses incorporate data from 112,600 studies.
Types of medical specialty
The medical specialties of the 22,453 meta-analyses are given in Table . The category "Gynaecology, pregnancy and birth" occurs most frequently, accounting for over a fifth (21%) of all meta-analyses. This reflects the fact that the Pregnancy and Childbirth Group and the Neonatal Group are two of the four Cochrane Review Groups that have produced the highest numbers of reviews. The next two largest categories are "Respiratory diseases" (13%) and "Mental health and behavioural conditions" (13%). The variation across medical areas partly reflects differences in longevity among the Cochrane Review Groups, as well as differences in research activity.
Types of interventions
Each meta-analysis compares two interventions, and each intervention was categorised separately, with the first intervention in the pair considered the "active" intervention and the second the "comparator". The distribution of the comparisons made in the 22,453 meta-analyses is summarised in Table . As one might expect in a database of reviews of the effects of health care, the CDSR is dominated by pharmacological interventions. The first intervention listed is pharmacological in just under two-thirds (63%) of all pair-wise comparisons examined. In addition to its primary importance as an active intervention, the pharmacological category also comprises over a quarter (26%) of all comparator interventions in the CDSR. The most common pair-wise comparison is of a pharmacological versus a pharmacological intervention (25%), followed by pharmacological versus placebo (19%) and pharmacological versus control (18%). These three pair-wise comparisons account for over 60% of all comparisons in the CDSR. The distinction between placebo and control might not be accurate, since review authors may have coded placebo as 'control' when specifying their interventions. Combining the placebo and control comparators suggests that 37% of meta-analyses evaluated the fundamental efficacy of a pharmacological intervention.
Overall, the most common comparator intervention is control (which we defined as a control intervention which is not explicitly labelled or described as placebo; examples include "usual care" and "no treatment"). Control groups are present in over one third (36%) of all pair-wise comparisons. Placebos account for over one fifth (21%) of comparators and are predominantly paired with active pharmacological interventions: 4321/4763 (91%) of all placebo-controlled comparisons relate to pharmacological rather than non-pharmacological interventions.
Comparisons of a non-pharmacological intervention with an intervention from the same category are reasonably common in the CDSR (13%). These are principally comparisons between surgical procedures (5%) or medical devices (4%). Surgical procedures in particular are frequently compared with another procedure from the same category: 1204 of the 1562 surgical meta-analyses (77%) compared two surgical procedures. Medical devices (9%) and surgery (7%) are the two most frequently occurring initial intervention types after pharmacological interventions.
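The percentages quoted for placebo and surgical comparisons can be re-derived from the counts given in the text. The following is a minimal sketch of that arithmetic; the helper name `pct` and the variable names are illustrative assumptions and are not part of the original analysis.

```python
# Counts are taken from the text; names such as `pct` are illustrative only.
TOTAL_META_ANALYSES = 22453


def pct(numerator, denominator):
    """Return numerator/denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Placebo-controlled comparisons whose active arm is pharmacological.
placebo_pharm = pct(4321, 4763)
# Placebo comparators as a share of all meta-analyses.
placebo_share = pct(4763, TOTAL_META_ANALYSES)
# Surgical meta-analyses that compare two surgical procedures.
surgery_same = pct(1204, 1562)
# Surgical initial interventions as a share of all meta-analyses.
surgery_share = pct(1562, TOTAL_META_ANALYSES)

print(placebo_pharm, placebo_share, surgery_same, surgery_share)
```

Rounding to whole percentages matches the precision used in the text.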
Types of outcome
Table presents the types of outcomes assessed in the 22,453 meta-analyses. The largest outcome category is signs or symptoms reflecting the continuation or end of a medical condition (16%). This is a broad category, which includes commonly recorded outcomes such as "presence/absence of disease" (in a non-preventative review) and "clinical improvement". These types of outcome are routinely measured in a large number of healthcare areas. The next largest category is adverse events (11%). Many outcomes included in this category were simply labelled "adverse events", while some had a more specific label such as "adverse events: weight gain", and other outcomes such as "headache" were assigned to this category if the review title and objectives suggested they related to side effects rather than the main effect of the intervention. Another large category is infection or onset of a new acute or chronic disease (10%), which consists largely of binary outcomes from reviews that investigated whether treatments aimed at preventing a particular illness had succeeded or failed.
Biological markers (9%) include quantifiable biological parameters, typically measured in a laboratory, such as blood components (e.g. CD4 count). General physical health measures (9%) include all physical health measurements manually assessed by the clinician, including routine measures such as heart rate or BMI. Obstetric outcomes (7%) are heavily represented by the two aforementioned large Cochrane Review Groups covering pregnancy, childbirth and neonatal care. This category includes binary events such as becoming pregnant or having a miscarriage, as well as continuous outcomes such as fetal measurements or information on gestational age. The most objectively defined category is all-cause mortality, which comprises 6% of the outcomes. These outcomes were usually labelled as "All-cause mortality", but we also assigned phrases such as "Survival" and "Neo-natal mortality" to this category.
The seven categories discussed above represent two-thirds of all the outcomes in the CDSR, with the remaining third split between the other 16 outcome categories (Table ).
Composition of systematic reviews
The majority (61%) of the 2321 reviews that contained at least one eligible meta-analysis included only one pair-wise comparison of interventions (Table ). Most reviews (86%) measured outcomes for three or fewer comparisons and only 4% looked at seven or more. On closer examination of the review that appeared to report meta-analyses for the largest number of comparisons, some of the 23 comparisons were found to relate to different methods of combining data rather than separate pair-wise comparisons of interventions. The largest number of genuinely different pair-wise comparisons was 20, in a review comparing the effectiveness of interventions for preventing hypotension in women having Caesarean section under spinal anaesthesia [13].
Number of comparisons per review, outcomes per comparison and meta-analyses per review in the CDSR.
The median number of outcomes per comparison was three (inter-quartile range 1 to 6; Table ). In 25% of all 4755 comparisons, only one outcome was reported. Several reviews included large numbers of outcomes relating to the same comparison, with 17% of all reviews including at least one comparison that looked at more than ten outcomes.
Of the 2321 reviews in the data set (which all included at least one meta-analysis of two or more studies), just under 10% contained only one meta-analysis. At the other end of the spectrum, one in ten reviews contained 22 or more meta-analyses. We note again that forest plots reporting results for two or more studies were regarded as meta-analyses, irrespective of whether review authors had elected to display the meta-analysis results. The median number of meta-analyses included in a review was six (inter-quartile range 3 to 12). The distribution exhibits positive skew, with five reviews examining more than 100 meta-analyses, the maximum being 128, in a review comparing immunosuppressive regimens for treating kidney transplant recipients [14].
Number of studies per meta-analysis
In our sample of 22,453 meta-analyses from the CDSR, each of which had to contain at least two studies to be eligible, the median number of included studies was three (inter-quartile range 2 to 6; see Table ). Over a third (36%) of the meta-analyses included only the minimum of two studies, and just under three quarters (75%) contained five or fewer studies.
Number of studies per meta-analysis, overall and broken down by outcome type, intervention comparison type and medical area
Some of the more widely studied medical specialty areas in the CDSR include meta-analyses that are able to draw upon a wealth of studies, the largest containing 294 [15], whilst 1% of meta-analyses contain 28 studies or more. Among the 11 specialty categories we used, cancer had a slightly higher median number of included studies (5) than any of the other categories.
There is no clear evidence to suggest that the number of studies per meta-analysis is strongly related to the outcome data type, or to the types of interventions being compared. Meta-analyses of all-cause mortality appear to contain slightly larger numbers of studies than other types of outcome. A Wilcoxon rank-sum test comparing the numbers of studies in meta-analyses of all-cause mortality vs. all other outcome types gave a P-value of 0.001; however, this analysis was not pre-specified and should be interpreted with caution.
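For readers unfamiliar with the Wilcoxon rank-sum test, the sketch below shows one way such a comparison could be computed, using the normal approximation with a correction for ties and no continuity correction. It is not the code used for the analysis reported here: the function names and the two small samples of study counts are hypothetical, and the approximation assumes the pooled data contain at least two distinct values.

```python
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided rank-sum test; returns (W, z, p) for the first sample."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    w = sum(ranks[:n1])  # sum of ranks belonging to the first sample
    mean_w = n1 * (n + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / var_w ** 0.5
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical numbers of studies per meta-analysis (not data from the CDSR).
mortality = [2, 3, 4, 5, 6, 8, 9, 12]
other = [2, 2, 2, 3, 3, 4, 5, 6]

w, z, p = rank_sum_test(mortality, other)
print(f"W = {w}, z = {z:.2f}, two-sided P = {p:.3f}")
```

With samples as small as these, an exact test would be preferable to the normal approximation; the approximation is better suited to data sets of the size analysed here.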
Study sample size
Sample size of individual studies varies considerably across reviews and meta-analyses in the CDSR (Table ): from very small studies containing only two individuals, up to some very large studies aiming to investigate the efficacy of vaccines or the impact of screening, which contained hundreds of thousands or even millions of individuals. The overall mean sample size for studies in the CDSR is 513. However, the distribution of sample sizes is better summarised by the median of 91 and the inter-quartile range of 44 to 210.
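The gap between the mean and the median arises because a few very large studies pull the mean upwards while leaving the median almost unchanged. A minimal sketch with hypothetical sample sizes (not data from the CDSR) illustrates the effect:

```python
from statistics import mean, median, quantiles

# Hypothetical study sample sizes, including one very large study.
sizes = [12, 30, 44, 60, 91, 150, 210, 800, 120000]

mean_size = mean(sizes)
median_size = median(sizes)
# quantiles(..., n=4) returns the three quartile cut points.
q1, q2, q3 = quantiles(sizes, n=4)

print(f"mean = {mean_size:.0f}, median = {median_size}, IQR = {q1} to {q3}")
```

Here the single study of 120,000 participants dominates the mean, whereas the median and quartiles still describe a typical study.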
Study sample size, overall and broken down by outcome type, intervention comparison type and medical area
Studies reporting dichotomous data have a median size of 102 and an inter-quartile range of 50 to 243, whereas studies reporting continuous data have a lower median size (62) and inter-quartile range (33 to 142). This may be because continuous outcomes (e.g. blood pressure or change in peak expiratory flow) tend to be more complex and labour-intensive to measure, making them unsuitable as outcomes in very large studies. Dichotomous outcomes such as presence or absence of disease, on the other hand, can be collected more quickly and efficiently for large numbers of individuals. In addition, statistical power tends to be higher for continuous outcomes than for dichotomous outcomes, so sample size calculations will generally lead to lower sample sizes when the primary outcome is continuous.
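The point about statistical power can be illustrated with standard textbook sample size formulae for comparing two means and two proportions. The sketch below is an illustration under stated assumptions rather than a calculation from the CDSR data: it assumes a two-sided 5% significance level, 80% power, a normally distributed outcome with a true difference of half a standard deviation, and a dichotomised version of the same outcome obtained by splitting at the control-group mean. All function and variable names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

ALPHA = 0.05  # two-sided significance level (assumed)
POWER = 0.80  # target power (assumed)

std_normal = NormalDist()
z_alpha = std_normal.inv_cdf(1 - ALPHA / 2)
z_beta = std_normal.inv_cdf(POWER)


def n_per_group_continuous(delta, sigma):
    """Per-group sample size to detect a difference in means of delta."""
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


def n_per_group_binary(p1, p2):
    """Per-group sample size to detect a difference in proportions p1 vs p2."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Continuous outcome: difference of 0.5 standard deviations.
n_continuous = n_per_group_continuous(delta=0.5, sigma=1.0)

# Same outcome dichotomised at the control mean: 50% of controls exceed the
# cut-off, and the treated proportion follows from the shifted normal curve.
p_control = 0.5
p_treated = std_normal.cdf(0.5)
n_binary = n_per_group_binary(p_control, p_treated)

print(n_continuous, n_binary)
```

Under these assumptions the dichotomised outcome requires more participants per group than the continuous outcome to detect the same underlying effect, which is consistent with the explanation given above.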
Study sizes show notable variation across medical specialties. The medians and quartiles are highest in cancer, and high also for meta-analyses in the areas of infectious diseases and gynaecology, pregnancy and birth. Study sizes tend to be lower in the areas of mental health and behavioural conditions and pathological conditions, symptoms and signs.
There is relatively little variation in medians and inter-quartile ranges of sample size across different types of intervention comparison. However, the maximum sample size for pair-wise comparisons of non-pharmacological interventions against control or placebo is substantially larger than for the other categories. This is because the largest studies, such as those involving vaccines or screening, fall into this category.
Across outcome types, sample sizes are highest for the category including cause-specific mortality, major morbidity events and composite mortality/morbidity events, and are lowest for biological markers and general physical health measures.