The advent of expression profiling and other high throughput technologies has enabled us to systematically study complex human diseases by simultaneously measuring tens of thousands of molecular species in any given cell-based system 
. It is now routine to organize such large-scale gene expression data into co-expression networks to shed light on the functional relationships among genes, and between genes and disease traits 
. Analysis of co-expression networks can be used to study any tissue or organ (such as liver, which plays a key role in the metabolism of glucose, lipids and toxic compounds), as long as the samples from such organs are collected in a population setting. Given that mouse and rat populations are commonly used to study human diseases in this manner, it is important to understand the mechanisms that are conserved between human and the rodent species, especially as we seek better predictions of the efficacy of drug targets identified from mouse or rat in human populations. In addition, identifying mechanisms that differ between humans and rodents can help to improve the design and interpretation of toxicity studies that involve rodent models.
Meta-analysis is the statistical synthesis of data by aggregating results from a set of comparable studies 
. It can be used to systematically examine similarities and differences between molecular profiling studies carried out in populations from different species 
. In a gene co-expression network, relationship between gene pairs is usually measured by correlation coefficients of different forms, such as Pearson correlation, Spearman correlation, or Mutual Information. Therefore, the problem of combining or comparing co-expression relationships across multiple datasets can be framed in the context of a meta-analysis of correlation coefficients, for which various methods have already been introduced. One method is Fisher's Inverse
test, which computes a combined statistic (S
) from the p
-values of the correlation coefficients obtained from (k
) individual datasets as,
. Under fairly general conditions this statistic follows a
degrees of freedom under the joint null hypothesis of no correlation, making it possible to compute p
-values of the combined statistic.
Another widely used meta-analysis method involves computing a weighted average of a common metric (i.e. effect size) derived from correlation coefficients in the individual datasets. Such statistic can then be used to test for homogeneity over the individual measures and for statistical significance. Datasets in this type of meta-analysis are typically weighted by the accuracy of the effect size they provide, which is a function of the individual sample sizes. Once the mean effect size is calculated, its statistical significance can be assessed by estimating the pooled variance of the mean effect size. In defining the effect size, Hedges and Olkin 
and Rosenthal and Rubin 
both advocated converting the correlation coefficient into a standard normal metric using Fisher's Z-transformation and then calculating a weighted average of these transformed scores. Depending on whether the effect sizes are assumed to be equal or not in the multiple datasets, fixed effect as well as random effect models can be employed. In the fixed effect models, the effect size in the population is a fixed but unknown constant and therefore is assumed to be the same for all datasets included in the meta-analysis. For random effect models, effect sizes may vary from dataset to dataset, and are assumed to be a random sample of all population effect sizes. Hunter and Schmidt 
introduced a single random-effects method based on untransformed correlation coefficients. One important feature of this type of method is that heterogeneity of the effect sizes can be estimated, which provides a way to assess the difference in correlation coefficients across multiple datasets. Schulze 
provided a thorough review of these meta-analysis methods and their applications.
For a meta-analysis of co-expression networks from diverse datasets, such as those constructed from different species, one central issue is that it is often unreasonable to assume that every gene pair has a unique, true effect size across evolutionarily diverse species. Although random effect models provide a more realistic way to accommodate cross species variation, it still assumes a parametric distribution on the population effect sizes. To circumvent this problem, a non-parametric meta-analysis method was introduced for the identification of conserved co-expression modules from human, fly, worm and yeast 
. In this method, Pearson correlation coefficients of expression profiles between every gene pair were computed in each organism and then rank-transformed according to their correlations with all other genes. A probabilistic test based on order statistics was then applied to evaluate the probability of observing a particular configuration of ranks across the different organisms by chance. The advantage of this method is two-fold: 1) because the method is based on non-parametric statistics, it makes no assumption on the underlying distribution of correlation coefficients across multiple datasets; and 2) the effect size (i.e. the rank ratio statistic for every gene pair) is defined in a gene-centric fashion such that for any given gene, correlations with all other genes are considered. However, the method also has several limitations including 1) the loss of power in general given the non-parametric formulization 
, and 2) the meta-analysis results cannot be represented in the same format as the individual datasets given there is no concept of a mean effect size. The details of individual methods are presented in the Methods
section. Their pros and cons are summarized in Supplementary Table S1
In this paper, we develop a method for the meta-analysis of diverse datasets generated across multiple species. Our method is semi-parametric in nature, requiring fewer assumptions on the distribution of the effect size than a purely parametric approach while retaining better statistical power than a fully non-parametric method. It also 1) defines an effect size that is gene centric, 2) allows for the computation of a mean effect size, and 3) leads to a heterogeneity statistic to test for differences in correlation structures among distinct datasets. Unlike most network alignment algorithms 
(with the exception of 
) or connectivity-based approaches 
, our method does not rely on the networks inferred a-priori from individual datasets, but instead focuses on the development of rigorous statistics to test directly the relationship between every gene pair. The simulation results showed that our method is robust against noises. When applied to a human, mouse and rat cross species meta-analysis of liver co-expression networks, we demonstrate that our method out-performs existing methods in identifying functionally coherent gene pairs that are conserved among the three species. Our method also leads to the identification of modules of co-expressed genes that represent core functions of the liver that have been conserved throughout evolution. Both highly replicated and less confident genome-wide association study (GWAS) candidate genes for blood lipid levels are found to be enriched in the conserved modules, providing a systematic way to elucidate the mechanisms affecting blood lipid levels. Application of our test for homogeneity leads to the identification of a single sub-network driven by ApoE
that distinguishes two nearly identical experimental cross populations whose genetic backgrounds only vary with respect to the gene ApoE
. We further demonstrate that genes involved in human- or rodent- specific liver interactions tend to be under positive selection throughout evolution. Finally, we identified a human-specific sub-network regulated by RXRG
, which has been validated to play a different role in hyperlipidemia and Type 2 diabetes between human and mouse. Taken together, our approach represents a novel step forward in integrating gene co-expression networks from multiple large scale datasets to leverage not only conserved information but also differences that are dataset-specific.