Type 1 diabetes results from an immune-mediated destruction of β-cells, likely to be mediated by T lymphocytes, but the sensitivity, specificity, and other measures of validity of existing assays for islet autoreactive T-cells are not well established. Such assays are vital for monitoring responses to interventions that may modulate disease progression.
We studied the ability of cellular assays to discriminate responses in patients with type 1 diabetes and normal control subjects in a randomized blinded study in the U.S. and U.K. We evaluated the reproducibility of these measurements overall and to individual analytes from repeat collections.
Responses in the cellular immunoblot, U.K.-ELISPOT, and T-cell proliferation assays could differentiate patients from control subjects with odds ratios of 21.7, 3.44, and 3.36, respectively, with sensitivity and specificity as high as 74 and 88%. The class II tetramer and U.S. ELISPOT assays performed less well. Despite the significant association of the responses with type 1 diabetes, the reproducibility of the measured responses, both overall and individual analytes, was relatively low. Positive samples from normal control subjects (i.e., false positives) were generally isolated to single assays.
The cellular immunoblot, U.K.-ELISPOT, and T-cell proliferation assays can distinguish responses from patients with type 1 diabetes and healthy control subjects. The limited reproducibility of the measurements overall and of responses to individual analytes may reflect the difficulty in detection of low frequency of antigen-specific T-cells or variability in their appearance in peripheral blood.
Type 1 diabetes is caused by T-cell–mediated destruction of β-cells (1). Despite this understanding, there are few tools to identify and track cells that mediate the disease in humans. Several assays that can distinguish antigen-specific responses in patients from normal control subjects have been reported (2–7); however, some of these have not performed well in larger blinded studies (8–10), and their reproducibility has not been systematically studied in a masked workshop.
A means of monitoring cellular responses involved in type 1 diabetes is needed to understand the action of immune therapies and to effectively apply therapies in the clinical setting (11). A biomarker that responds to an effective therapy could be used to rapidly screen candidate therapies for use in a further study to examine effects on longer-term clinical outcomes, such as preservation of β-cell function. A highly sensitive and specific biomarker might also potentially function as a surrogate for clinical outcomes that take a longer time to observe and require a larger number of subjects to study.
In a previous blinded study, Seyfert-Margolis et al. (8) reported that two different assays that measured T-cell proliferative responses to antigens were able to distinguish responses in subjects with type 1 diabetes from those in healthy normal control subjects. However, that study used a single collection from each subject and could not evaluate the reproducibility of the measurements that is needed to assess their utility in clinical trials. Moreover, only the two assays that used fresh cells showed significant discriminant validity (i.e., ability to distinguish participants with type 1 diabetes from normal control subjects). Therefore, the present study was conducted to assess the discriminant ability of five T-cell assays with fresh blood samples. We also assessed the reproducibility of the measurements from repeat collections in individual subjects, both qualitatively with respect to the classification of each subject (positive vs. negative) and quantitatively for the different analytes used in each assay.
Sixty-eight subjects with type 1 diabetes were enrolled, 35 in North America and 33 in the U.K., along with 96 control subjects without type 1 diabetes, 63 in North America and 33 in the U.K. (Table 1). Collections from North American sites were split and distributed among the North American laboratories; those from the U.K. were assayed by the U.K. laboratory.
A greater number of control subjects were studied to provide adequate numbers of subjects with HLA-DR3 and/or DR4 genotypes (12). Two collections were obtained from each subject on different days, the second within the subsequent 2–28 days.
Participants ranged from 8–35 years of age, weighed at least 40 kg (88 lbs), and were free of conditions or treatments that would affect the immune system. Women were not pregnant or lactating. Subjects with type 1 diabetes had been diagnosed within the 12 months before the first collection; control subjects did not have a first- or second-degree relative with type 1 diabetes.
The specimens were collected between 8:00–10:00 a.m. after an overnight fast, with fasting plasma glucose level between 70–180 mg/dl (3.9–10 mmol/l), and without an injection of short-acting insulin. Fresh blood samples were collected and air-shipped by overnight courier (North American sites) or same-day courier (U.K. site) at ambient temperature to each T-cell laboratory, volume permitting. Before the first visit, control participants were screened for their HLA genotypes at a central laboratory from a buccal swab. Biochemical autoantibodies (GAD-65, ICA-512, microinsulin autoantibody) were measured centrally using a radio-immunobinding assay and islet cell antibodies (ICA) using indirect immunofluorescence from frozen serum collected at the first visit (13–16). Microinsulin autoantibody was not used in this analysis because all subjects were treated with exogenous insulin. The responses in the T-cell assays were classified as positive (diagnostic of type 1 diabetes), negative (diagnostic of control status), or indeterminate.
The laboratories were masked to the status of the subject, the identity of each subject, and the sequence of visits. Detailed laboratory methods and analytes for each assay are presented in the online appendix available at http://diabetes.diabetesjournals.org/cgi/content/full/db09-0249/DC1. Herein acronyms are used to refer to specific analytes, such as “MDR4GAD274” to refer to the DR4-GAD274-286 tetramer stimulated with the DR4 binding peptide pool. The description of each analyte is presented in the online appendix, and brief methods follow.
Human islets were subjected to preparative 10% SDS-PAGE, the gels were electroblotted onto nitrocellulose, nitrocellulose particles prepared, and the nitrocellulose particles used to stimulate peripheral blood mononuclear cells (PBMNCs) in vitro (7). Eighteen blot sections of decreasing size were analyzed. A stimulation index (Si) > 2.0 indicated positive proliferation for a blot, and a sample with four or more positive blots was designated as positive (diagnostic of type 1 diabetes).
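The positivity rule for the cellular immunoblot can be written compactly. The following is a minimal sketch, not the laboratory's actual software: the function name and the input layout (one stimulation index per blot section) are assumptions made for illustration, and only the two thresholds come from the description above.

```python
def immunoblot_sample_positive(stimulation_indexes, si_cutoff=2.0, min_positive_blots=4):
    """Classify one sample from its per-blot stimulation indexes.

    A blot section counts as positive when its index exceeds si_cutoff;
    the sample is positive when at least min_positive_blots sections are.
    The input layout is hypothetical.
    """
    positive_blots = sum(1 for si in stimulation_indexes if si > si_cutoff)
    return positive_blots >= min_positive_blots


# Example with 18 made-up blot sections, four of which exceed the cutoff.
example = [1.0] * 14 + [2.5, 3.1, 2.2, 4.0]
print(immunoblot_sample_positive(example))
```

Note that the cutoff is strict (Si > 2.0), so a section with an index of exactly 2.0 does not count as positive in this sketch.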
PBMNCs producing interferon-γ (IFN-γ) in response to stimulation with synthetic peptides representing naturally processed and presented epitopes of IA-2, GAD65, and proinsulin were detected by cytokine ELISPOT as described (5). Only samples from HLA-DR3 or DR4+ subjects were used. Single peptides were tested at 10 μmol/l, and the pool was tested at 3 μmol/l and 10 μmol/l (final concentration of individual peptide components). A stimulation index (Si, derived as number of spots in test analyte wells/number in negative control wells) of ≥3.0 was designated as a positive response to an analyte. Samples showing a response to one or more analytes were classified as positive.
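The ELISPOT classification described above can be illustrated in the same way. This sketch assumes spot counts are supplied per analyte together with a single negative-control count; the analyte labels are placeholders, and the handling of a zero negative-control count is an assumption, since the text does not say how that case was treated.

```python
def stimulation_index(test_spots, negative_control_spots):
    """Spots in test analyte wells divided by spots in negative control wells."""
    if negative_control_spots <= 0:
        # Assumption: the source does not describe how zero background was handled.
        raise ValueError("negative control count must be positive")
    return test_spots / negative_control_spots


def elispot_sample_positive(spots_by_analyte, negative_control_spots, si_cutoff=3.0):
    """A sample is positive if any analyte reaches the stimulation index cutoff."""
    return any(
        stimulation_index(spots, negative_control_spots) >= si_cutoff
        for spots in spots_by_analyte.values()
    )


# Hypothetical spot counts; the analyte labels are placeholders.
sample = {"peptide_A": 4, "peptide_B": 12, "peptide_pool": 6}
print(elispot_sample_positive(sample, negative_control_spots=4))
```

Passing `si_cutoff=2.0` would give the lower threshold that the text reports for the shipped U.S.-ELISPOT samples.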
This assay measured antigen-induced proliferation of PBMNCs in microcultures by up to 20 individual test antigens including type 1 diabetes-relevant and -irrelevant peptides and proteins (3,8). Antigen responses were normalized as stimulation indexes, and samples with three or more T-cell pools that target type 1 diabetes-relevant antigens/epitopes (17) were designated as positive.
The construction of the expression vectors for generation of soluble DR0401 (DRA1*0101/DRB1*0401), DR0404 (DRA1*0101/DRB1*0404), or DR0301 (DRA1*0101/DRB1*0301) molecules has been described previously (18). CD4+ T-cells from PBMNC from HLA-DR3 and/or 4+ control subjects were expanded with peptides. On day 14, the cells were stained using 10 μg/ml of PE-labeled HLA-DR0401/04 or 0301 tetramers containing a specific or a negative control peptide and then with anti–CD3-FITC and anti–CD4-PerCP antibodies. Cells were analyzed on a FACSCalibur flow cytometer with FlowJo software.
The methods described above for the U.K.-ELISPOT assay were followed. The assays differed as follows: 1) Samples in the U.K. were processed on the day of blood draw, whereas those in the U.S.-ELISPOT were air-shipped overnight. 2) Because of lower overall responses noted in the shipped samples compared to fresh control samples, a Si of ≥2.0 was designated as a positive response to an analyte. 3) Test analytes did not include the DR4-restricted GAD65 554–575 peptide. 4) The test peptides were tested only at the 10 μmol/l concentration.
Only evaluable specimen results were used. A sample was declared nonevaluable if the number of viable cells was inadequate or the sample was discarded because of poor quality as indicated by hemolysis before processing or dye exclusion after processing, if the yield of PBMNCs was <0.5 × 10^6/ml blood, or there was no response (Si < 3.0) to Pediacel (for U.K.-ELISPOT).
Standard measures of diagnostic accuracy were used (19). The assay results are either + (diagnostic of type 1 diabetes) or − for each analyte or for the completed assay. The sensitivity (proportion of patients with type 1 diabetes who test positive) and specificity (proportion of control subjects who test negative) are presented. Because it was planned that about half the subjects with DR3/4 would have diabetes (D), the positive predictive value (PPV) = P(D | +) = sensitivity/(sensitivity + α) and the negative predictive value (NPV) = P(not D | −) = specificity/(specificity + β), where α = 1 − specificity is the false-positive rate and β = 1 − sensitivity is the false-negative rate. The proportion correctly classified (PCC) = (sensitivity + specificity)/2. A generalized estimating equation (GEE) logistic model (20) provided estimates of sensitivity and specificity, allowing for multiple collections for each subject, and of the odds ratio of a specimen being from a subject with diabetes given a positive classification from the assay.
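As an illustration of these summary measures, the sketch below computes PPV, NPV, and PCC from a sensitivity and specificity under the stated design assumption that about half the subjects have diabetes. It reads α as the false-positive rate (1 − specificity) and β as the false-negative rate (1 − sensitivity); that reading is an interpretation, but it reproduces the values reported for the cellular immunoblot assay. The function name is illustrative.

```python
def predictive_values(sensitivity, specificity):
    """Return (PPV, NPV, PCC) assuming equal numbers of cases and controls."""
    alpha = 1.0 - specificity   # false-positive rate
    beta = 1.0 - sensitivity    # false-negative rate
    ppv = sensitivity / (sensitivity + alpha)
    npv = specificity / (specificity + beta)
    pcc = (sensitivity + specificity) / 2.0
    return ppv, npv, pcc


# Sensitivity 74% and specificity 88%, as reported for the cellular immunoblot.
ppv, npv, pcc = predictive_values(0.74, 0.88)
print(round(ppv, 2), round(npv, 2), round(pcc, 2))  # 0.86 0.77 0.81
```

These simplified formulas hold only at 50% prevalence; with a different case-to-control ratio the general form of Bayes' rule would be needed.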
The κ agreement statistic (21) assessed the reproducibility of the repeated qualitative assays (positive or negative) from the pair of collections from each subject. A constant of 0.5 was added to all frequencies when there was a zero cell in the corresponding 2 × 2 table. The Cochran test of homogeneity tested the differences between the κ values among subgroups (22). An entropy R² measure of association (the uncertainty coefficient) described the correlation of the repeated assays (23).
For each assay, 30 subjects with type 1 diabetes and 30 or more control subjects provide 85% power with a one-sided test at the 0.05 level to detect a difference in the proportions positive of 0.675 among those with diabetes versus 0.325 (1 − specificity) among those without, or an odds ratio of 4.31.
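The odds ratio quoted in the power statement follows directly from the two proportions. A minimal check, with an illustrative function name:

```python
def odds_ratio(p_cases, p_controls):
    """Odds of a positive result among cases divided by the odds among controls."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


print(round(odds_ratio(0.675, 0.325), 2))  # 4.31
```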
Table 1 describes the characteristics of the 94 subjects from North America and the 60 from the U.K. who contributed an evaluable specimen as defined in the online appendix. Of these, 9 in North America and 15 in U.K. withdrew after collecting the first sample. North American patients were pediatric, whereas U.K. patients were adult with somewhat lower prevalence of autoantibodies. Although sensitivity for any single autoantibody ranged from 59–67%, 61% of subjects with type 1 diabetes were positive for two or more autoantibodies versus 0% of the control subjects (Table 1, Fig. 1). These observations place the test subjects well within the expected serologic ranges (15).
Table 2 presents the fraction of specimens that were evaluable for autoantibodies and each assay. Antibodies were evaluable in virtually all of the specimens. The cellular immunoblot was evaluable in 68% and T-cell proliferation (TCP) in 84% of all specimens. The other assays were only conducted in specimens from DR3 or DR4 subjects, and of these the tetramer was evaluable in 77%, U.S.-ELISPOT in 57%, and the U.K.-ELISPOT in 100%. Indeterminate specimens for which the assay did not provide a clear positive or negative response were frequent with the tetramer assay (27% of those otherwise evaluable), less so for the cellular immunoblot (5%) and U.K.-ELISPOT (7%), and none for the other two laboratories.
For the analysis herein, indeterminate specimens are counted as a negative response (see the online appendix). Table 2 presents the summary measures of diagnostic accuracy for each assay. The summary measures for each analyte within each assay are presented in the online appendix.
The cellular immunoblot assay provided sensitivity of 74%, specificity of 88%, and 81% correctly classified (Table 2), somewhat lower than provided by autoantibodies alone. Of those classified positive, 86% (the PPV) actually had type 1 diabetes, and of those classified as negative, 77% (NPV) were actually control subjects. The odds ratio (21.7) was highly statistically significant (P < 0.0001). Figure 2A shows the distribution of the number of positive blot sections within each group. Among control subjects, the percent with 0–3 blots positive was higher than among those with diabetes. Most of the false-positive control samples had four positive responses and were borderline for overall positivity.
The U.K.-ELISPOT assay also showed a statistically significant odds ratio (3.44, P = 0.0026), with a sensitivity and specificity of 61 and 69%, respectively (Table 2). Figure 2B shows the distribution of the number of positive responses in each group. Among the control subjects, over 80% of the collections had zero or one analyte positive. Among those with diabetes, two-thirds were positive for one or more analytes.
The TCP assay provided sensitivity of 60% and specificity of 69% and a statistically significant odds ratio (3.36, P = 0.0041, Table 2). Figure 2C shows the distribution of the number of positive responses within each group. A higher fraction of control subjects was positive for 1–3 antigens than were subjects with diabetes, whereas a higher fraction of those with diabetes was positive to 10 or more analytes than were control subjects.
The odds ratios for the tetramer (2.1) and U.S.-ELISPOT (1.09) assays were not statistically significant (Table 2). In this analysis, the sensitivity was <50% for both assays, whereas the specificity was comparable to the other assays (Table 2).
The major histocompatibility complex genotype likely affects the performance of assays and, by design, the tetramer and ELISPOT assays. The tetramer assay relies upon HLA-restricted presentation of selected autoantigenic epitopes presented by HLA-DR3 or DR4, and the ELISPOT was optimized using epitopes presented by HLA-DR4 (DRB1*0401) with the exception of GAD65 335–352, a sequence known to be HLA-DR3 restricted (24). Therefore, we compared responses in type 1 diabetic patients and control subjects among these genotypes (Table 3). The sensitivity and specificity of each assay varied among HLA classes, but there was no significant relationship between HLA genotype and the results of any of the assays or autoantibodies. The cellular immunoblot assay had a higher specificity (100%) among samples from subjects with DR3 alone than those with DR4 (79% alone, 81% heterozygous). The U.K.-ELISPOT assay showed somewhat better specificity (78%) among samples from subjects with DR4 alone than subjects with DR3 (57% alone, 60% heterozygous) as previously reported (5). There was a low sensitivity (17%) but a high specificity (93%) of the tetramer assay in HLA-DR4 individuals. The small number of non-DR3 or -DR4 subjects with diabetes (only three with type 1 diabetes in North America) precluded a comparison to those with HLA-DR3 or DR4 in the cellular immunoblot, TCP, and autoantibody assays.
Table 4 presents the qualitative reproducibility of the assay classifications from the repeat collections in each subject. Negative values indicate agreement less frequent than expected by chance. For example, the cellular immunoblot had 82.22% agreement for the 45 subjects with evaluable specimens on the two collections, both type 1 diabetes and normal together. The proportion positive on average for the two collections is 40%, and the corresponding level of agreement by chance alone is 51.6%. Thus, κ is (0.8222 – 0.516)/(1 – 0.516) = 0.633, a modest improvement over chance agreement alone but somewhat below a desirable level of 0.8 or more. The level of agreement and κ for autoantibodies was only slightly greater than that of the cellular immunoblot assay.
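The κ calculation can be sketched as follows. The 2 × 2 table used here is hypothetical: it was chosen only because it is consistent with the summary figures quoted above (45 subjects, 82.22% observed agreement, 40% positive on average, 51.6% chance agreement) and is not the study's actual data. The 0.5 adjustment for zero cells follows the rule stated in the statistical methods.

```python
def cohen_kappa(table):
    """Cohen's kappa for a 2 x 2 table [[both+, +/-], [-/+, both-]] of counts.

    Rows index the first collection, columns the second. If any cell is
    zero, 0.5 is added to every cell before computing kappa.
    """
    (a, b), (c, d) = table
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the marginal totals of the two collections.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts consistent with the reported summary figures.
illustrative_table = [[14, 6], [2, 23]]
print(round(cohen_kappa(illustrative_table), 3))  # 0.633
```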
It is possible that these κ values represent biological variation over the interval 2–28 days between collections, rather than laboratory reproducibility. Thus, for each laboratory, separately among those with diabetes and control subjects, κ was calculated for those retested within 2–7 days, 8–14 days, and 15–28 days, and a test of homogeneity conducted. The detailed results are presented in the online appendix. Although the κ values showed nominally significant heterogeneity for four of the laboratories, there was a trend toward decreasing levels of κ as the elapsed time increased only for the TCP and tetramer labs. Those for autoantibodies did not vary.
κ is not a measure of correlation between the two collections. Thus, Table 4 also presents the entropy R² between the two collections for each laboratory. This is a measure of the proportion of variation in values from one collection that is explained by variation in the other collection, directly analogous to the square of a correlation coefficient for quantitative variables. As for κ, the R² is highest for the cellular immunoblot lab followed by TCP and the others.
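For readers unfamiliar with the entropy measure, the sketch below computes an uncertainty coefficient from a table of counts. It assumes the asymmetric form, in which the uncertainty in the second collection is explained by the first; the text does not state which form was used, so this is an illustration of the general idea rather than a reproduction of the published values.

```python
import math


def entropy(probabilities):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def uncertainty_coefficient(table):
    """Fraction of the column variable's entropy explained by the row variable.

    table is a list of rows of counts; rows index the first collection
    and columns the second (an assumed layout).
    """
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    h_columns = entropy([c / n for c in column_totals])
    if h_columns == 0:
        raise ValueError("column variable has no variation")
    # Conditional entropy of the columns given the row, weighted by row size.
    h_conditional = 0.0
    for row in table:
        row_total = sum(row)
        if row_total > 0:
            h_conditional += (row_total / n) * entropy([x / row_total for x in row])
    return (h_columns - h_conditional) / h_columns


print(uncertainty_coefficient([[10, 0], [0, 10]]))  # perfect agreement gives 1.0
```

A value of 0 corresponds to independent collections and 1 to one collection fully determining the other.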
Table 5 presents the κ and the R² values for agreement between laboratories. The cellular immunoblot lab had modest κ agreement and some correlation with the presence of autoantibodies but not the other laboratories. There was weak agreement/correlation between the other labs and autoantibodies and each other, the strongest being between the cellular immunoblot and TCP laboratories but with a κ of 0.25.
We also assessed the concordance of false-positive and false-negative results among the four North American assays from split samples. Of the 113 samples from control subjects judged evaluable by at least one laboratory, 54 were assayed by three or four of the laboratories. None of these were positive in more than two of the laboratories. Likewise, among the 66 samples from those with diabetes, 47 were assayed in three or four of the laboratories, and 8 (17%) were negative in three of the labs, none in all four.
The online appendix presents the discriminant validity and quantitative reproducibility of individual analytes within each laboratory assay. The quantitative discriminant ability of each analyte's numerical values was essentially the same as that of the analyte positivity. Figure 3A presents the sensitivity and specificity for each blot section in the cellular immunoblot assay. Many of the blot sections showed a specificity greater than 80%, some greater than 90%, but the maximum sensitivity was 50% and the maximum reliability coefficient was 60% (both for blot 16). For the U.K.-ELISPOT assay (Fig. 3B), no analytes had a sensitivity >30%, whereas all the analytes had a specificity >80%. The reproducibility of responses to analytes was also limited, the highest being 21%. For the TCP assay (Fig. 3C), the maximum sensitivity, specificity, and reliability of any analytes were 64, 75, and 42%, respectively, all for different peptides or protein antigens.
For the tetramer assay, the maximum sensitivity was 89% but in an assay with low specificity (7%). The maximum specificity for one analyte was 100% with 32% sensitivity, both greater per individual analyte than any of the other HLA-dependent assays. The maximum reliability was 74% but in an assay with lower sensitivity (50%). For the U.S.-ELISPOT assay, all of the analytes showed specificity of at least 77%, but the maximum sensitivity was 18% and the maximal reliability 44%.
The reliable detection of T-cell responses associated with the autoimmune processes that lead to type 1 diabetes has been a major research goal for nearly 2 decades. In a blinded controlled study, we tested the ability of five T-cell assays to distinguish responses in participants with type 1 diabetes from normal healthy control subjects. We also assessed the qualitative reproducibility of the overall measurement of positivity and the quantitative reproducibility of measured responses to individual assay analytes to understand the nature of the responses to diabetes antigens. The cellular immunoblot, TCP, and U.K.-ELISPOT assays could distinguish persons with type 1 diabetes from control subjects with sensitivity and specificity ranging from 60 to 74% and from 69 to 88%, respectively. We did not find a significant effect of HLA genotypes on responses in the cellular immunoblot and TCP assays. The ELISPOT assay utilized DR4-restricted peptides predominantly and showed a trend for greater specificity among DR4+ individuals. The tetramer assay was less discriminatory but showed high specificity in DR4+ individuals, possibly reflecting the selection of peptides for binding to this genotype.
Several explanations could account for the differences between the ELISPOT assay results in the U.K. and Denver, Colorado. The U.K. subjects were older and had shorter disease duration. Samples were transported to the U.K. lab on the day they were drawn, whereas the Denver samples were air-shipped, exposed to temperature variation, and more laboratory personnel were involved. These technical details highlight the challenges in replicating bioassays at different laboratory sites.
The performance characteristics of these biologic assays are similar to those of other biologic assays and, interestingly, not markedly different from those of the biochemical autoantibody assays (25,26). Of note, the study design tested the reproducibility of assay results from the same subjects and not the reproducibility of measurements in the same samples. In this regard, the qualitative and quantitative reproducibility of assays in individual subjects between the two samplings and the individual analyte reproducibility were lower than desirable, with the maximum κ agreement statistic of 0.63 and maximum R² of 0.34, both for the cellular immunoblot assay. The reasons for this variation are not clear; they could reflect variation in the assay or in procedures used to ship or process samples, but they could also reflect biological variation in individuals over time. The latter is further suggested by the κ of 0.7 for the autoantibodies between collections, whereas in split duplicates, the κ values were 0.89 and 0.93 for anti-GAD65 and ICA512. Nonetheless, a systematic decline in the κ statistics as a function of elapsed duration was only seen in two laboratories. Thus, if biological mechanisms explain the variation, they must take place within days.
In a previous study, Seyfert-Margolis et al. suggested that “false”-positive results in cellular assays may not be false but a true measure of diabetes antigen–specific T-cells in normal individuals because 2 of 4 positive samples from normal control subjects in the cellular immunoblot were also positive in the TCP assay (8). However, in the present study, the low concordance among laboratories in false-positive responses for control subjects suggests that false positives are assay specific rather than a true biologic difference in these healthy individuals. Likewise, the low concordance of false negatives among laboratories suggests that they are assay specific.
Each of the assays measures the T-cell responses in ways that differ with respect to assay conditions and antigenic complexity. The cellular immunoblot measures the proliferative responses of peripheral blood cells to all islet antigens, whereas the ELISPOT, TCP, and tetramer assays measure responses to a limited repertoire; only a fraction of the cells may be present in the peripheral blood at any time (3–5,7,27,28). The responding cell subpopulations (CD4+, CD8+, or both) also differ between the assays. Therefore, it is not surprising that there was not complete agreement between results with each assay. The relatively higher sensitivity of the cellular immunoblot compared to the T-cell proliferative, ELISPOT, or tetramer assays most likely reflects its detection of responses to the widest array of antigens. A caveat of these studies is that we have not compared responses in patients with type 1 diabetes with those in patients with other autoimmune diseases. In this exercise, therefore, we cannot distinguish diabetes-specific responses from an “autoimmune” phenotype.
There are limitations of these assays. First, although the highest sensitivity and specificity were with the cellular immunoblot assay, it is not clear which antigens are recognized. The cellular immunoblot and TCP assays identify the number of positive analytes rather than the responses to each analyte. In this regard, the ELISPOT and tetramer assays may be more useful for following the responses of particular antigen-specific cells. Moreover, the limited reproducibility of the cellular immunoblot and TCP assays to specific analytes, whether this is because of biological shifts in particular T-cell populations or technical variability, may present a problem for tracking responses to any group of antigen-specific T-cells over the course of a clinical study. On a practical level, the volume of blood needed for each of the assays (ranging from 10 to 30 ml) may present a problem for repeated sampling on smaller subjects.
These assays all used fresh rather than frozen PBMNC. In a previous analysis, assays that used frozen cells did not perform well, possibly because of variability in cryopreservation procedures, whereas both the cellular immunoblot and TCP assays that used fresh cells showed good discriminant ability, particularly when combined (8). Because using fresh cells limits performance of the assay to the time of the sampling, the reproducibility of the assay over time, particularly of the individual analytes, could complicate the use of these assays in clinical trials. An analysis of the many thousands of (strictly blinded) TCP samples processed over years in the TRIGR diabetes prevention trial may provide new answers to these questions (29).
A number of improvements might be made in the future to enhance the use of these assays in clinical studies. Clearly optimizing methods for freezing cells and adapting the assays for use with frozen cells is important since it would allow the simultaneous measurement of samples from individual subjects collected at different points in time. With selection of the most informative analytes, the volumes of blood needed to run the assays could be reduced. Importantly, repeated studies over time would help to understand the appearance and disappearance of antigen-specific T-cells in the peripheral circulation and help to understand whether the changes that were seen in the assay results reflect differences in the handling of the specimens or reflect a biologic change that may be because of trafficking of antigen-specific cells through various compartments. Nonetheless, our findings provide highly encouraging results regarding the ability of cellular assays to identify responses to multiple targets that can discriminate patients from normal control subjects. Although the discriminatory ability is still superior with a combination of autoantibody measurements, the cellular assays provide insights into cells that are thought to be involved in the disease pathogenesis and are likely to be affected by new interventions that target immune responses.
J.M.L. has consulted with the following companies: TolerRx, GlaxoSmithKline, Bayhill Therapeutics, and Andromeda Biotech. No other potential conflicts of interest relevant to this article were reported.
Clinical trial reg. no. NCT00212329, clinicaltrials.gov.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.