Our primary objective was to develop a free program for meta-analysis of causal research (therapeutic trials as well as etiologic cohorts and case-control studies) that could be applied in both analytical and educational settings. Our secondary aim was to validate the analytical tests in the program with output from established reference standards.
Before the actual development, we made an inventory of the most important meta-analytical tests and approaches and brainstormed ideas for an interface. Since causal meta-analysis methods are relatively well established (in contrast to diagnostic or prognostic approaches to meta-analysis), we focused on meta-analysis of controlled trials and cohort or case-control studies. In these studies, outcome differences between exposed or treated and non-exposed or untreated groups are compared to assess a causal relationship between the determinant (treatment or exposure) and an outcome (mortality or morbidity). As far as the program structure was concerned, our a priori idea was to create an add-in for Excel. Although this is a rather unorthodox approach in this area (all existing meta-analysis programs are stand-alone programs and work independently of Microsoft Office), Excel provides a sophisticated calculation and graphics platform that is well suited to many meta-analytical methods and is at the programmer's disposal before any programming is done. Consequently, development and maintenance are relatively easy and costs can be kept to a minimum (one of the main aims in our program development). Furthermore, the spreadsheet environment of Microsoft Excel is familiar to almost all researchers in the medical, social, and economic sciences, which was very much in line with our attempt to develop a package that is suitable for beginning researchers. Although we realized that even recent versions of Excel can be inaccurate with regard to some statistical calculations [21], we were confident that we could program around these difficulties if necessary.
Since we wanted to move beyond the occasional spreadsheet that can perform meta-analytical calculations, we started by designing a programming structure in which the already existing Excel functionality could be exploited to its maximum. Sophisticated procedures were custom-programmed with Visual Basic in the Visual Basic for Applications (VBA) editor of Excel 2003 (and tested in Excel 2000 and onward). The so-called front-loader (a start-up program initiated with an icon) and some small assistant programs, all being non-Excel entities, were developed with Visual Basic 6.0 (VB6).
Program architecture and operation
The current version of the program (version 1.5) is compatible only with Windows operating systems running Excel 2000 or later, but versions for use with Excel on Macintosh and Linux are in preparation. The descriptions below apply to the Windows version, though most of them can be extended to future versions for other operating systems.
Installation is made easy with a set-up program that installs the necessary files in a folder that can be specified by the user (default is C:\Program Files\MIX). It also creates a MIX item in the Windows Start Menu (installing additional start-up icons on the Desktop or in the Quick-Launch bar is optional) and provides the option to start a Flash®-based program introduction. The MIX menu item contains an icon for starting up the MIX program, a folder with a shortcut to the uninstall program, a folder with shortcuts to programs for loading and unloading the Excel add-in, and a folder with educational programs and information. Loading the small MIX add-in that is supplied with the main program (typically loaded automatically during installation) results in a MIX menu item under the Tools menu in Excel. This MIX menu contains several functions that can be accessed when the MIX program itself is not running. The files that form the core of the program are recognizable by their Mix file extension (*.mix) and currently contain approximately 16,000 lines of command code in 26 code modules and 17 custom user forms. These core files take up approximately 22 Mb of space on a hard disk and their primary functions are (A) running interface procedures, (B) showing and manipulating output, (C) performing analyses, and finally (D) exporting and communicating with external files and programs. One of the core files is a large Excel workbook with 23 worksheets that forms the calculation engine of the program. It contains 6 sheets with primarily worksheet formulas and 10 sheets with various kinds of pre-calculated graphical and numerical results from meta-analytical tests. The remaining sheets contain information for help functions or programming purposes. This Excel workbook remains hidden from the users at all times. Figure 2 gives a graphical representation of the full program structure.
Figure 2. Structure of the MIX program. The MIX program is started by simply clicking the MIX icon on the desktop or in the Windows Start Menu. The program uses a number of Excel workbooks, of which only the output file (*) is directly accessible by the user.
At start-up, a dedicated instance (an independent fully functional running program) of Excel is created and becomes visible once all regular Excel menus and toolbars are hidden and replaced by the MIX graphical interface. The Excel instance used by MIX is secured for exclusive use by the MIX program and does not interfere with existing Excel windows or settings.
The interface consists of a menu bar, two toolbars, and several shortcut menus. The menu bar and toolbars are directly accessible and the shortcut menus pop up with a right click of the mouse. The MIX menu bar has eight main menus (File, Edit, View, Numerical Output, Graphical Output, Analysis, and Help) via which all functions of the MIX program can be executed. Most of the common functions require only a single click on the toolbars. Double-clicking a graph item skips the shortcut menu and directly provides options for changing that item's format. Figure 3 shows the MIX program's user interface with a forest plot and a format box to change the graph's format.
Figure 3. The MIX program's graphical interface with a forest plot. The standard Excel menu and toolbars have been replaced by the MIX interface through which graphical and numerical output can be created and manipulated. Custom shortcut menus are available via ...
The MIX program provides several options for importing or creating data sets for meta-analysis. The most convenient option is to create an Excel or CSV file with data (a standard output option in Excel) and import this file into the MIX program. The variable ranges are then selected in the usual Excel manner to create a data set (see figure 4), which is subsequently loaded for analysis and optionally saved as a MIX data set file (*.mxd). The program accepts descriptive data from studies with continuous outcomes (e.g. sample size, mean, and standard deviation) and dichotomous outcomes (e.g. group sizes and event numbers, i.e. two-by-two table data). Comparative data can also be loaded by means of association measures with their standard errors. Initially, however, it is not necessary to make a data set, since 19 data sets from the most authoritative books on the subject ("Methods for Meta-analysis in Medical Research" by Sutton et al [10], "Systematic Reviews in Health Care, Meta-Analysis in Context" by Egger et al [6], and "Systematic Reviews in Health Care, A Practical Guide" by Glasziou et al [7]) have been included in the program. Most analyses and graphs presented in these books can be reproduced with a few clicks, and the program can be used as a learning or teaching companion to these books. We hope to support more books in this way in the future. In addition, the MIX website contains a data set repository where users can contribute and download MIX data sets.
Figure 4. Creation of a data set with the MIX program. Data sets can be created from Excel files, Comma Separated Value (CSV) files, or via manual input. Once the data are prepared on a spreadsheet within the program, the user can select the cell ranges that correspond ...
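To make the descriptive dichotomous input format concrete, the following minimal Python sketch reads group sizes and event numbers from CSV text and derives the two-by-two table cells. It is an illustration only and not part of MIX: the column names (study, events_t, n_t, events_c, n_c), the class DichotomousStudy, and the function read_dichotomous_csv are all hypothetical, and MIX itself selects cell ranges interactively rather than relying on fixed headers.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DichotomousStudy:
    """One study with descriptive dichotomous data (hypothetical layout)."""
    label: str
    events_treated: int
    n_treated: int
    events_control: int
    n_control: int

    def two_by_two(self):
        """Return (a, b, c, d): events and non-events per group."""
        return (
            self.events_treated,
            self.n_treated - self.events_treated,
            self.events_control,
            self.n_control - self.events_control,
        )


def read_dichotomous_csv(text):
    """Parse CSV text with the assumed header row into study records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        DichotomousStudy(
            label=row["study"],
            events_treated=int(row["events_t"]),
            n_treated=int(row["n_t"]),
            events_control=int(row["events_c"]),
            n_control=int(row["n_c"]),
        )
        for row in reader
    ]


# Made-up example data, not taken from any of the bundled data sets.
sample = "study,events_t,n_t,events_c,n_c\nTrial A,5,50,10,50\nTrial B,12,100,20,100\n"
for study in read_dichotomous_csv(sample):
    print(study.label, study.two_by_two())
```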
A large variety of numerical and graphical output can be produced by the program. Besides the association measure values from the meta-analysis, several formal tests for heterogeneity, small study effects (publication bias), single study influence, and cumulative trends are also available in MIX. The graphical output is particularly comprehensive, with no fewer than eighteen informative plots that can be formatted in detail.
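As an illustration of what a formal heterogeneity assessment involves, the sketch below computes Cochran's Q and the I² statistic from study effects and their variances. These are standard textbook formulas; which heterogeneity statistics MIX reports and how it computes them internally is not specified here, so the function names and the choice of statistics are assumptions.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q, n_studies):
    """I² in percent: share of total variation attributed to heterogeneity."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Made-up effect estimates (e.g. log odds ratios) and variances.
effects = [0.10, 0.35, -0.05, 0.60]
variances = [0.04, 0.09, 0.02, 0.12]
q = cochran_q(effects, variances)
print(q, i_squared(q, len(effects)))
```

Under the null hypothesis of homogeneity, Q is compared with a chi-squared distribution with k - 1 degrees of freedom, where k is the number of studies.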
Possible association measures from continuous outcome data input are mean difference (MD), Hedges' g (HG), and Cohen's d (CD), analyzed by inverse variance fixed or random effects models. Data from studies with dichotomous outcomes can be analyzed with a risk difference (RD), risk ratio (RR), or odds ratio (OR), weighted by inverse variance, Mantel-Haenszel, Peto (only odds ratio), or DerSimonian-Laird approaches. Analyses based on correlation coefficients or Fisher's Z are also possible, though only if the data are provided as comparative input, i.e. the association measures themselves with their standard errors. If correlation or effect size data are not in this format, they can be transformed via the MIX Statistics Converter that comes with the program. The table "Overview of the MIX program's features" gives an overview of the general features and the numerical and graphical methods in version 1.5 of the MIX program.
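The inverse variance and DerSimonian-Laird approaches mentioned above can be sketched in a few lines of Python. This is a minimal illustration using the standard formulas for log odds ratios, not the MIX implementation (which is written in VBA and worksheet formulas). The function names are hypothetical, all table cells are assumed to be non-zero, and no continuity correction is applied.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its variance for one two-by-two table.

    a, b: events and non-events in the treated (exposed) group
    c, d: events and non-events in the control (non-exposed) group
    """
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return estimate, variance


def pool_fixed(effects, variances):
    """Inverse variance fixed-effect estimate and its standard error."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1 / total)


def pool_dersimonian_laird(effects, variances):
    """Random-effects estimate, standard error, and tau² (moment estimator)."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool_fixed(effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    scale = total - sum(w * w for w in weights) / total
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return pooled, math.sqrt(1 / re_total), tau2


# Made-up two-by-two tables (a, b, c, d), for illustration only.
tables = [(12, 88, 20, 80), (5, 45, 9, 41), (30, 170, 42, 158)]
effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
print(pool_fixed(effects, variances))
print(pool_dersimonian_laird(effects, variances))
```

Pooling is done on the log scale; exponentiating the pooled estimate and its confidence limits returns them to the odds ratio scale.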
Overview of the MIX program's features
The most important educational features are the program's Output Tutor and Concept Tutor. Both are interactive dialog boxes that provide information about epidemiological and statistical concepts and tests. The Output Tutor changes with each analysis and always explains the tests and results that are displayed or changed at that moment. Additional teaching material includes a Flash®-based Theory Tour that explains the fundamentals of systematic reviews and meta-analyses and a Program Tour that shows the basics of how to use the program. The educational materials take up approximately 25 Mb and can also be downloaded separately.
To increase program stability and prevent users from accidentally altering the Visual Basic procedures, the source code cannot be accessed while the program is running. Codes to unlock the VBA modules are provided by the first author upon request.
Version 9.2 of STATA [24], and more specifically version 1.81 of the metan, version 1.2.4 of the metabias, and version 1.0.5 of the metatrim programs, were used as the general reference standards for most tests. Details on the development of these user-written programs can be found in the STATA Technical Bulletins [25]. The meta-analysis software Comprehensive Meta-Analysis (CMA) version 2 [28] was used to validate the Fail-safe N output and to double-check the results of the other tests. Two investigators (LB, LMY) performed the validation independently with the MIX program (version 1.5 running in Excel 2003) and the reference standard(s) by analyzing eight data sets from meta-analyses that have been published in major journals [4
The data sets represent three of the most often used types of data for meta-analysis in health care research: 1) descriptive data for dichotomous outcomes, 2) descriptive data for continuous outcomes, and 3) comparative (association measure) data. For each of the three data types we chose a relatively small data set (fewer than 10 studies) and a large data set (more than 20 studies), and we used two extra data sets in the 'descriptive dichotomous' category (one representing a meta-analysis of substantially heterogeneous studies and one with a rare event). The data sets are summarized in the table "Overview of the data sets used in the validation". The tests that were subject to the validation procedures are shown in the table "Meta-analytical tests that were part of the validation". The items include individual study association measures, combined association measures, and several heterogeneity and small study effect assessments. Whenever applicable, p-values and/or confidence intervals were also compared.
Overview of the data sets used in the validation
Meta-analytical tests that were part of the validation
Results from the analyses of the eight data sets with MIX and the reference software were entered independently in identical custom-made spreadsheets. These spreadsheets were later compared in separate analysis sheets that used a cell-based formula to check for discrepancies between results to four decimal places.
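The logic of such a cell-by-cell check can be sketched as follows. The actual comparison was done with spreadsheet formulas whose exact form is not given here, so this Python version is only an analogous illustration: the function name, the dictionary layout, and the choice to round both values before comparing are assumptions.

```python
def find_discrepancies(mix_results, reference_results, decimals=4):
    """Return items whose values differ after rounding to `decimals` places.

    Both arguments map a result label (e.g. "pooled OR") to a number.
    Labels missing from either side are reported as discrepancies too.
    """
    discrepancies = {}
    for label in sorted(set(mix_results) | set(reference_results)):
        mix_value = mix_results.get(label)
        ref_value = reference_results.get(label)
        if mix_value is None or ref_value is None:
            discrepancies[label] = (mix_value, ref_value)
        elif round(mix_value, decimals) != round(ref_value, decimals):
            discrepancies[label] = (mix_value, ref_value)
    return discrepancies


# Made-up numbers for illustration; not results from the validation.
mix = {"pooled OR": 1.23456, "Q": 5.0}
reference = {"pooled OR": 1.23457, "Q": 5.1}
print(find_discrepancies(mix, reference))
```

Rounding both values and testing for equality mirrors a comparison "to four decimal places"; an alternative would be to flag absolute differences above a fixed tolerance, which treats values near a rounding boundary more leniently.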