This is the first study to assess the relative importance of technical and biological variables affecting multicenter measurement of proinflammatory cytokines in the context of the vaginal mucosal fluid matrix. The study compared complex biological fluids (e.g., human serum or surrogate vaginal fluid) with simple protein-free matrices (e.g., buffered or nonbuffered saline, both widely used to sample vaginal cytokines by lavage). Matrix complexity affected cytokine measurement unequally, depending on the individual cytokine preparation as well as the detection platform or assay. In the case of IL-1β, the matrix effect varied by detection platform. In contrast, matrix complexity affected IL-6 measurements regardless of assay or detection platform, with saline and PBS consistently returning lower IL-6 values. In general, the analytic recovery of spiked cytokine assay calibrators was less sensitive to matrix variation than that of recombinant biologically active NIBSC cytokine standards, which should more closely mimic native endogenous cytokines. These results suggest that biologically active or endogenous cytokines, rather than assay calibrators, should be spiked into relevant biological fluids to estimate possible effects of fluid matrix on cytokine measurements and to establish a standard procedure for future interlaboratory measurement of cytokines as biomarkers in clinical trials. Our results also suggest that the media used for vaginal washing or dilution of samples may play a critical role in the recovery of cytokine concentrations and that, therefore, uniform vaginal sampling should be applied across clinical trials. The possibility of underestimating concentrations of some cytokines should be considered, and errors should be estimated by assessing the recovery of each cytokine in the medium of choice.
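The recovery assessment recommended above can be illustrated with a minimal sketch. It assumes the conventional definition of analytic recovery (measured concentration divided by nominal spiked concentration, as a percentage); the function name and all concentration values are hypothetical and are not data from this study.

```python
def percent_recovery(measured_pg_ml: float, nominal_pg_ml: float) -> float:
    """Analytic recovery of a spiked cytokine, as a percentage of the nominal value."""
    if nominal_pg_ml <= 0:
        raise ValueError("nominal concentration must be positive")
    return 100.0 * measured_pg_ml / nominal_pg_ml


# Hypothetical example: the same nominal spike measured in four matrices.
nominal_pg_ml = 500.0
measured_pg_ml = {
    "serum": 480.0,
    "vaginal fluid simulant": 455.0,
    "PBS": 310.0,
    "saline": 295.0,
}

recovery = {
    matrix: percent_recovery(value, nominal_pg_ml)
    for matrix, value in measured_pg_ml.items()
}

for matrix, pct in recovery.items():
    print(f"{matrix}: {pct:.1f}% recovery")
```

Tabulating recovery per matrix in this way makes any systematic underestimation in a given sampling medium visible before samples from different media are compared.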
Differences in measured cytokines across assay platforms could not be explained solely by different nominal values of assay calibrators provided by each manufacturer. Similarly to our study, previous studies of human blood have reported significant interassay variations.(23,24) In those studies, various assays generated similar patterns of IL-6 blood plasma levels; however, absolute concentrations differed between ELISA kits produced by different manufacturers.(23) Neither in those nor in our study did conversion of cytokine concentrations from pg/mL to NIBSC IU/mL eliminate interassay differences. Our study confirms the need for use of internal calibrator controls along with matrix-specific conversions to allow interassay and interlaboratory meta-analysis.
Antibody-based detection of cytokines in complex biological fluids can be complicated by denaturation of epitopes by fluid components as well as by the physiological variance of cytokine polymerization, glycosylation, soluble receptor binding, or degradation, which may not be equally recognized by the detection antibodies used in different assays. Recombinant proteins used for calibration and generation of antibodies for various immunoassays may have various degrees of glycosylation and other posttranslational modifications depending on bacterial, yeast, or mammalian rDNA expression systems,(19) which may explain some differences found between assay calibrators and NIBSC standards. For example, IL-1β is initially translated as a biologically inactive precursor molecule that is subsequently processed into a mature protein associated with IL-1 bioactivity.(25) Both forms may be present in biological fluids. In the case of vaginal fluid, the presence of mucus can introduce another layer of variability, since it can bind some cytokines and thus affect their recognition by capture antibodies, or it may nonspecifically attach to immunobeads or assay plates, thus affecting downstream assay events.
In this study, the two biologically active international reference preparations of IL-1β differed in terms of sensitivity to matrix complexity. IL-1β calibrators were also sensitive to matrix complexity but only in two out of seven assays. In contrast, the recovery of the NIBSC IL-6 variants, but not that of the IL-6 calibrators, differed significantly by matrix type, regardless of assay type. Although the recovery of NIBSC IL-6 was lower in PBS and saline as compared to serum and the defined vaginal simulant, assay calibrators were equally recoverable from all matrices across all assays. In analogy to results obtained with natural IL-6 present in blood, different assays provided different recovery of IL-6 89/548 in the vaginal fluid simulant. This could be because recombinant IL-6 89/548 is glycosylated similarly to the natural cytokine,(19) so that its detection would depend on the level of recognition of protein or carbohydrate epitopes by the antibody pairs of each assay.
The interlaboratory differences observed in this study were not due to sample handling or technical skills since variation of sample handling was minimized by following a standardized operational procedure developed for each assay listed in Table and since good intralaboratory and intra-assay reproducibility was achieved within each matrix type and within each assay platform. The small intra-assay variation was confirmed using multiple indexes (i.e., % CV, log10 SD, and estimation of detectable fold differences).
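The two simplest reproducibility indexes named above, % CV and log10 SD, can be computed from replicate measurements as in the following sketch. It assumes the usual definitions (sample standard deviation divided by the mean, and sample standard deviation of log10-transformed values); the function names and replicate values are hypothetical.

```python
import math
import statistics


def percent_cv(values: list[float]) -> float:
    """Coefficient of variation of replicate measurements, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def log10_sd(values: list[float]) -> float:
    """Sample standard deviation of log10-transformed replicate measurements."""
    return statistics.stdev([math.log10(v) for v in values])


# Hypothetical triplicate measurements (pg/mL) of one sample in one assay.
replicates = [102.0, 97.0, 105.0]

print(f"%CV      = {percent_cv(replicates):.2f}")
print(f"log10 SD = {log10_sd(replicates):.4f}")
```

The log10 SD has the advantage of being directly interpretable on a fold-change scale, which is why it is often preferred when concentrations span several orders of magnitude.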
The interlaboratory variation was most significantly associated with assay type and platform as demonstrated by the bar sizes in Figures and and by the fold differences reliably detectable in each assay (Table ). Thus, the sources of interlaboratory variation appear inherent in cytokine−platform or matrix−platform interactions. The complex matrices (serum and vaginal fluid simulant) were consistently associated with higher interlaboratory variation in all assays except the QuantiGlo and MSD (Figures and ), which detect a luminescent signal at 620 nm, thus eliminating the problem with color quenching by matrix components that may occur in regular absorbance ELISA or fluorimetric Luminex assays. In addition, MSD uses labels that emit light when electrochemically stimulated, where the stimulation event (electricity) is decoupled from the signal (light), thus minimizing background. Nonuniform matrix interactions with immunobeads in the Luminex platform may add an additional layer of variation, which is complicated by the fluidics nature and dependence on efficient washing of the beads between immunoassay steps. Interlaboratory variability has been found previously for noncommercial assays for quantification of HIV-1 RNA in seminal plasma where commercial assays were found to demonstrate highly reproducible (intra-assay log10 SD 0.11−0.32) and noncommercial (in-house) assays less reproducible results (intra-assay log10 SD). A multicenter study suggested that the lack of reproducibility found for noncommercial assay measurements could be due to the fact that reagents for commercial kits are produced in “large and well-characterized” lots.(26) Thus, variation of immunobead and antibody lots obtained from different manufacturers that have not been as rigorously subjected to quality control and cross-calibration as commercially assembled kits may be an additional explanation for high interlaboratory variation in in-house assembled Luminex assays.
In our study, the commercial assay kits as well as the assays compiled in-house from different commercially available reagents showed high inter-replicate reproducibility, allowing reliable detection of a 1.5-fold intra-assay difference in concentration levels regardless of cytokine type. However, interlaboratory variation significantly decreased the level of confidence in the detectable difference. Our data indicate that IL-6 levels, for example, would have to vary by at least 6-fold in order for multiple laboratories using the Luminex platform to detect reliable differences, provided a uniform biological fluid (matrix) was used. With matrix variation, the threshold for reliable detection of IL-6 differences increased to >26-fold in Luminex Beadlyte and >8-fold in several other assays. Matrix variation had less effect on the statistical power to detect differences in IL-1β levels. These data emphasize the importance of uniform biological sampling techniques and careful analysis of matrix interferences. Because of their high interlaboratory reproducibility regardless of the specific cytokine tested, the chemiluminescence-based assays (QuantiGlo and MSD) appear more suitable for multicenter study comparisons. Our study suggests that if uniform platform use is not feasible for financial or technical reasons, each multicenter study should appoint a reference laboratory to perform interassay comparisons to determine the confidence of fold-difference detection in each specific biological fluid and for each cytokine assessed.
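To illustrate why added interlaboratory or matrix variance raises the detectable fold difference, the following sketch uses a simple normal approximation. This is an assumption made for illustration, not the multiple comparisons procedure used in this study: two single measurements with a common log10 SD s are treated as distinguishable when their log10 difference exceeds z·√2·s. The function names and SD values are hypothetical.

```python
import math
from statistics import NormalDist


def detectable_fold(log10_sd: float, confidence: float = 0.95) -> float:
    """Smallest fold difference distinguishable between two single measurements.

    Assumes normally distributed log10 values with a common SD (illustrative only).
    """
    if log10_sd < 0:
        raise ValueError("log10 SD must be non-negative")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided critical value
    return 10.0 ** (z * math.sqrt(2.0) * log10_sd)


def combined_log10_sd(intra_sd: float, inter_sd: float) -> float:
    """Total SD when independent intra- and interlaboratory variances add."""
    return math.hypot(intra_sd, inter_sd)


# Hypothetical SDs on the log10 scale.
intra = 0.064
inter = 0.30

print(f"intra-assay only: {detectable_fold(intra):.2f}-fold")
print(f"with interlab SD: {detectable_fold(combined_log10_sd(intra, inter)):.2f}-fold")
```

Under these assumptions, a small intra-assay SD corresponds to a detectable difference near 1.5-fold, whereas adding a larger interlaboratory component pushes the threshold to several-fold, which mirrors the qualitative pattern described above.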
In summary, the cytokine measurement methods evaluated here were found to be reproducible for laboratories using commercial assay platforms; however, the measured values of identical samples varied across laboratories and assay platforms. Our data indicate that variability introduced by the sampling fluid and assay type or detection platform may confound interlaboratory or interstudy comparisons. The discrepancy between matrix recovery of assay calibrators and biologically active cytokine variants, which was especially obvious for IL-6 assays, may lead to a significant under- or overestimation of cytokine concentrations in complex biological fluids such as vaginal fluids and blood. It is important to understand that these effects may occur in everyday analytical practice and are not limited to cytokine immunoassays. To limit their impact, each laboratory should establish its own normal ranges in the relevant types of clinical samples. Our study emphasizes the need for assay choices based on criteria for reliable interlaboratory comparisons, a need that goes beyond the scope of vaginal cytokine measurement and microbicide research. Along with parameters commonly provided by manufacturers (e.g., assay sensitivity, lower limit of detection, and assay precision determined by reproducibility of replicate measurements), immunoassay utility should also be evaluated by the matrix-specific fold change in cytokine concentration reliably detectable by multiple comparisons methods. Of all the assays evaluated in this study, the (electro)chemiluminescent platform appeared to be the most reliable platform for detection of a ≤1.5-fold change in IL-6 and IL-1β concentrations. However, future validation studies are needed to determine the biological significance of the magnitude of change for each biomarker candidate.
It is recommended that prior to adoption of a standardized methodology, cytokine measurements from clinical trials be reported not only in terms of absolute ranges but also in terms of fold differences between samples with defined clinical characteristics.
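One way to report a fold difference between two groups of samples is sketched below. It assumes the fold difference is taken as the ratio of geometric means, a common choice for roughly log-normally distributed concentrations but not one prescribed by this study; the function name, group labels, and values are hypothetical.

```python
from statistics import geometric_mean


def fold_difference(group_a: list[float], group_b: list[float]) -> float:
    """Ratio of the geometric mean of group A to that of group B."""
    return geometric_mean(group_a) / geometric_mean(group_b)


# Hypothetical IL-6 concentrations (pg/mL) in two clinically defined groups.
group_with_condition = [120.0, 340.0, 95.0, 210.0]
group_without_condition = [40.0, 75.0, 22.0, 58.0]

fold = fold_difference(group_with_condition, group_without_condition)
print(f"fold difference (geometric means): {fold:.2f}")
```

Because a ratio of geometric means is unchanged when all values are multiplied by a common factor, it is less sensitive than absolute ranges to platform-specific scaling of measured concentrations.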