In this paper we use a linear mixed-effects model in a unique twin design with duplicated repeated measurements to apportion the total variation of molecular phenotypes (protein profiles) into biological and experimental (technical) variation. Understanding the sources of variation (e.g., familial, individual environmental, and experimental) inherent in the measurement of a molecular phenotype is a key step in assessing the potential for stable, informative biomarkers. We observed that across the 69 antibodies the median proportion of total variation attributable to familial sources was 12%. This familial component is consistent with protein profiles having the potential to reflect the polygenic basis of complex disease. Further, across the 66 HPA antibodies, the median proportion of total variation explained by stable sources (familiality and individual environment) was 25%. Stable variation in the current setting is the part of a protein profile that remains constant within an individual over the course of the sampling period. A small proportion of variation originated from the short-term biological component, individual visit (median proportion 6%). Common visit accounted for a negligible amount of variation. Most of the variation originated from experimental sources (median proportion 63%). To our knowledge, the present study is the first to investigate sources of variation in data generated by exploratory antibody microarrays. It should provide important information for designing and using such assays, and be valuable for the multiplexed and quality-controlled assays [25] that will become more widely used and accepted for clinical testing.
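The apportioning described above can be written compactly as follows. This is a sketch only: the symbols are introduced here for illustration and are not the notation of the fitted model, and the random effects are assumed to be mutually independent so that their variances add.

```latex
% Hypothetical notation: F = familial, I = individual environment,
% CV = common visit, IV = individual visit, E = experimental (residual).
\begin{align}
  \sigma^2_{\mathrm{total}}
    &= \sigma^2_{F} + \sigma^2_{I} + \sigma^2_{CV} + \sigma^2_{IV} + \sigma^2_{E}, \\
  p_k &= \frac{\sigma^2_{k}}{\sigma^2_{\mathrm{total}}},
    \qquad k \in \{F, I, CV, IV, E\}, \\
  p_{\mathrm{stable}} &= p_{F} + p_{I}.
\end{align}
```

For any single antibody the proportions sum to one. The medians reported across antibodies need not do so, because the median of a sum is in general not the sum of the medians.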
Ultimately, diagnostic tools built on markers discovered via these screening approaches could become valuable means of predicting disease state and progression. As an example, antibodies that allow the comparison of individuals who are discordant for diabetes and diabetes-related clinical traits would be useful for identifying individuals likely to suffer from diabetes in the future, long before conventional diagnostic techniques can prove effective. In this way these affinity-based proteomics discoveries would become useful in clinical settings.
However, most antibodies used in this screening method have a large residual variance, suggesting that a large proportion of the variation in the data is experimentally derived. Potential (and non-separable) sources of this experimental variation, which exclude sample collection, preparation and storage, are the:
(i) complexity and composition of the serum samples, which affect the assays;
(ii) biotin modification of samples with regard to the numbers and variability of modifications introduced per molecule and sample;
(iii) sample treatment in terms of liquid transfers, heating, and assay buffer dilution;
(iv) assay procedure with immobilized antibodies selectively capturing aggregated or free molecules from the surrounding solution;
(v) fluorescence-based read-out being influenced by bleaching and dye incorporation onto the target molecules; and
(vi) specificity of antibody binding events.
Addressing these issues would reduce the proportion of phenotypic variation arising from experimental sources. This would in turn reduce the sample sizes (or degree of technical replication) required to detect epidemiological effects of interest. Reducing the experimental variability, for example by using two antibodies to detect a target protein, would help ensure that experimental noise does not swamp biological signals of interest. The technical precision of such a measurement, and of measurements with other antibodies, can be improved either by addressing the issues outlined above or, in the short term, by assaying samples in technical replicate.
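The effect of technical replication can be illustrated with a short calculation. The sketch below is not part of the analysis reported here. It assumes that experimental errors are independent between replicates, so that averaging r replicates divides the experimental variance by r while leaving the biological variance unchanged. The function name `experimental_share` is hypothetical, and the 63% median is used purely as an illustrative input.

```python
def experimental_share(exp_prop: float, n_replicates: int) -> float:
    """Share of the variance of a replicate mean that is experimental.

    exp_prop     -- proportion of single-measurement variance that is
                    experimental (between 0 and 1)
    n_replicates -- number of technical replicates averaged (>= 1)

    Assumes independent experimental errors across replicates.
    """
    if not 0.0 <= exp_prop <= 1.0:
        raise ValueError("exp_prop must lie between 0 and 1")
    if n_replicates < 1:
        raise ValueError("n_replicates must be at least 1")
    biological = 1.0 - exp_prop              # unaffected by averaging
    experimental = exp_prop / n_replicates   # shrinks as 1/r
    return experimental / (biological + experimental)


if __name__ == "__main__":
    # Illustrative input: the median experimental proportion of 63%.
    for r in (1, 2, 4, 8):
        print(f"replicates={r}: experimental share = "
              f"{experimental_share(0.63, r):.2f}")
```

Under these assumptions the experimental share falls with each added replicate but with diminishing returns, which is why reducing the underlying sources of experimental variation remains the preferable long-term route.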
Research in the field of proteomics is advancing, with affinity-based approaches emerging alongside classical mass spectrometric approaches. With array-based proteomics becoming a promising area of biomedical research, decomposing the underlying variation in protein profiles into biological (both stable and longitudinally fluctuating) and experimental components is an important and useful step in assessing the applicability of antibody arrays for exploring the proteome. Ultimately, such proteomic strategies may lead to the identification of new disease markers and drug targets, benefitting from the versatility of both the employed affinity reagents and multiplexed techniques.