The therapy of lupus nephritis (LN) is mainly based on clinical data because serial biopsies are not routinely used to evaluate kidney involvement as the disease is treated and evolves. A non-invasive, real-time method to assess renal pathology could be used to titrate treatment and improve outcome. This work was undertaken to develop a urine biomarker of tubulointerstitial inflammation (TI) in LN, an injury that predisposes to renal fibrosis and chronic kidney disease.
Urine samples collected at or close to the time of biopsy for LN (n=64) were used to identify potential biomarkers of TI. TI was scored by a renal pathologist using a semi-quantitative scale. Urine monocyte chemotactic protein-1 (uMCP-1), urine hepcidin (uHepcidin), and urine liver-fatty acid binding protein (uLFABP) were measured by immunoassays. Linear discriminant analysis was used to weight variables and derive composite biomarkers that identified the level of TI.
The discriminant function that described the most accurate biomarker included uMCP-1 and serum creatinine as the independent variables. This composite biomarker had a sensitivity of 100%, specificity of 81%, positive predictive value of 67%, negative predictive value of 100%, and misclassified only 14% of the biopsies.
In conclusion, specific renal pathologic lesions can be modeled by composite biomarkers. These biomarkers can be used to non-invasively follow and adjust the treatment of LN based on renal injury.
When lupus nephritis (LN) is first diagnosed, treatment is usually based on kidney biopsy findings. After initial treatment, however, flares of LN are often treated without the benefit of kidney pathology because repeat biopsies are infrequently done. Furthermore, there is presently no way of knowing how a treatment is affecting kidney injury on a real-time basis, other than by using the surrogate markers of proteinuria and kidney function, which have limited clinical utility 1–4. Repeat biopsies done as a component of clinical studies have often demonstrated continuing active inflammation and/or the progression of chronic changes such as glomerulosclerosis and interstitial fibrosis despite the use of presumably effective therapies 1–6. Although clinically silent, these lesions predispose to chronic kidney disease and, later, end-stage kidney disease.
A continuous read-out of kidney pathology would therefore be theoretically helpful in planning and following therapy for LN, and may allow standard regimens to be tailored for individual patients based on how their disease is responding. To this end, the identification of urine biomarkers that accurately reflect kidney pathology during LN flare cycles is a clinically relevant goal. Some investigations have addressed this by looking for urine biomarkers that distinguish LN from other forms of glomerular disease, or that can differentiate between classes of LN 7, 8. Such biomarkers, if successfully developed, will reflect an integrated sum of kidney injuries, and thus a broad picture of renal pathology.
In contrast, our group has approached the discovery of biomarkers of kidney pathology in SLE by focusing on biomarkers that reflect distinct pathologic lesions that are potentially important treatment targets to prevent chronic kidney disease. This approach is more specific and flexible than correlating biomarkers to ISN/RPS classes of LN, and potentially more useful, in that there can be broad variations in the histology within the same LN class, and combinations of classes are not infrequent 9–11. Additionally, the ISN/RPS schema does not address all the compartments of the kidney equally, but mainly considers glomerular changes. In this context pathology of the renal interstitium is relevant. Inflammatory injury to the renal interstitium can result in interstitial fibrosis, and often determines the fate of the kidneys in LN 1, 5, 6, 12. This report describes the development of a biomarker of interstitial inflammation in LN.
Kidney biopsies were done for the clinical diagnosis of glomerular disease in 61 patients. All biopsies showed immune-complex glomerulonephritis consistent with LN. The entire biopsy population is described in Table 1. The patients with moderate-severe interstitial inflammation were directly compared to the patients with no or mild interstitial inflammation (Table 2). African Americans and other non-Caucasians were over-represented in the moderate-severe inflammation group. Patients with moderate-severe inflammation had significantly more proteinuria at biopsy than patients with none-mild interstitial inflammation. Serum creatinine was numerically higher at the time of biopsy in patients with moderate-severe inflammation, but this did not reach significance. Similarly, there was a higher, but non-significant proportion of patients with Class IV or IV + V LN in the moderate-severe group.
In our initial approach, three candidate biomarkers were selected and examined for correlation to interstitial inflammation. These biomarkers were urine monocyte chemoattractant protein-1 (uMCP-1), urine hepcidin (uHepcidin), and urine liver-type fatty acid binding protein (uLFABP).
uMCP-1 is a biomarker of active LN 13, and MCP-1 is made by infiltrating interstitial leukocytes in a number of glomerular diseases 14. As shown in Table 2, uMCP-1 was significantly greater in patients with moderate-severe interstitial inflammation than patients with no or mild interstitial inflammation. When used to classify the severity of interstitial inflammation in this test set of biopsies, uMCP-1 misclassified 10 of 64 biopsies (Table 3).
uHepcidin was selected because a non-biased proteomic approach showed that it was differentially expressed in the urine during the evolution of LN flares 15. Immunohistochemical staining demonstrated that infiltrating interstitial leukocytes expressed hepcidin in LN kidney biopsies, and human monocytes were shown to produce hepcidin in response to treatment with interleukin-6 and interferon-α 16. uHepcidin was also significantly increased in patients with moderate-severe interstitial inflammation as compared to patients with mild or no inflammation (Table 2). When used alone to classify the severity of interstitial inflammation, uHepcidin misclassified 22 of 64 biopsies (Table 3).
LFABP is made by the proximal tubule in response to injury 17, and was postulated to be responsive to interstitial inflammation. It was significantly increased in the urine of patients with moderate-severe interstitial inflammation (Table 2). Used alone, uLFABP misclassified 14 of 64 biopsies (Table 3).
Table 3 lists the sensitivity, specificity, positive and negative predictive values of each of these individual biomarkers as predictors of the degree of interstitial inflammation. uMCP-1 performs fairly well, but uHepcidin and uLFABP do not. When individual data for each biomarker were examined there was considerable overlap of values in patients with no or mild inflammation and patients with moderate-severe inflammation (data not shown). This contributes to misclassifications and poor performance characteristics.
We next determined whether the ability to differentiate interstitial inflammation status could be improved, and misclassifications reduced, by combining candidate urine biomarkers and clinical biomarkers. All combinations of uMCP-1, uHepcidin, uLFABP, serum creatinine (SCr) and proteinuria (expressed as urine protein:creatinine ratio, uPCR) were tested by linear discriminant analysis, a procedure that produces optimal weights for the log-transformed variables involved. The linear discriminant analysis was based on the 49 urine samples collected at the time of biopsy and did not include any repeat biopsies. The best resulting combination is given below:
Here Y1 is the linear discriminant score, and the Y1 value that gave the maximum sum of sensitivity and specificity is 1. At and above this cut-off biopsies were assigned to moderate-severe interstitial inflammation; below this cut-off biopsies were assigned to no-mild interstitial inflammation. The same threshold value of 1 gave the best sum of sensitivity and specificity when applied to all 64 subjects and had the least misclassification probability (Table 3). With this linear discriminant score only 9 of 64 biopsies (14%) were misclassified, specificity was 81% and sensitivity was 100% (Table 3). This means all misclassifications were from no-mild to moderate-severe inflammation. No cases of moderate-severe interstitial disease were misclassified. The positive predictive value was 67% and the negative predictive value was 100%. The receiver-operating characteristic (ROC) curve for this composite biomarker is shown in Figure 1. The area under the curve (AUC) was 0.92.
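The performance characteristics quoted above all follow from the four cells of a two-by-two classification table. The sketch below shows the arithmetic. The function name `diagnostic_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's underlying cell counts, which are not given in the text.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard performance characteristics from a 2x2 table.

    tp/fn: moderate-severe biopsies classified correctly/incorrectly.
    tn/fp: none-mild biopsies classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "misclassified": (fp + fn) / total,
    }


# Hypothetical counts (NOT the study data): no false negatives, some false positives.
example = diagnostic_metrics(tp=20, fp=10, tn=40, fn=0)

# The only rate that can be recomputed directly from the text: 9 of 64 biopsies.
reported_misclassification = 9 / 64
```

When there are no false negatives, sensitivity and negative predictive value are both 100% regardless of the other cells, which is the pattern reported for Y1. The reported misclassification rate of 9 of 64 biopsies works out to roughly 14%.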
For comparison all 5 variables were used to derive a biomarker from the same 49 cases. This composite biomarker had lower specificity and a higher number (11) of misclassified cases than Y1.
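To give a sense of the size of the search over marker combinations, the following sketch enumerates every non-empty subset of the five candidate variables. The helper name `all_marker_subsets` is hypothetical, and including single-variable subsets in the enumeration is an assumption; the text does not state exactly which subsets were fitted.

```python
from itertools import combinations

# The five candidate variables named in the text.
MARKERS = ["uMCP-1", "uHepcidin", "uLFABP", "SCr", "uPCR"]


def all_marker_subsets(markers):
    """Return every non-empty subset of the marker list, smallest subsets first."""
    return [
        combo
        for size in range(1, len(markers) + 1)
        for combo in combinations(markers, size)
    ]


subsets = all_marker_subsets(MARKERS)
```

With five variables there are 2^5 − 1 = 31 non-empty subsets, so an exhaustive comparison of discriminant functions is feasible. The two-variable subset of uMCP-1 and SCr and the full five-variable subset are both members of this list.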
The misclassified patients could not be differentiated from correctly classified patients by the use of medications at the time of biopsy, including pulse methylprednisolone, oral corticosteroids or immunosuppressive drugs. Two misclassified patients received pulse methylprednisolone (22%), while 10 correctly classified patients received pulse corticosteroids (18%). The median dose of prednisone in the misclassified patients was 3 mg/d (range 0–60), and in the correctly classified patients it was 20 mg/d (range 0–60). Additionally, the misclassified patients could not be differentiated from correctly classified patients by the timing of their urine samples, as only 2 gave urine samples after their biopsies. Interestingly, the misclassified patients all had an elevated serum creatinine, and their average creatinine was significantly greater than that of the correctly classified patients (2.24±0.28 vs. 1.37±0.16 mg/dl, p=0.003). This finding is unlikely to be useful for identifying patients who are likely to be misclassified, because several correctly classified patients had serum creatinine values in this range or higher.
We next determined if the urine and clinical biomarkers could be combined to yield a linear discriminant equation for interstitial fibrosis. This seemed possible, as interstitial inflammation leads to injury that may result in interstitial fibrosis and chronic kidney disease 1, 5, 6, 12. The ability to classify interstitial fibrosis as moderate-severe or none-mild was examined using discriminators based on urines obtained at biopsy (no repeat biopsies). For fibrosis, the best discriminant function is given by equation 2 with a threshold value of −1:
The performance characteristics of individual biomarkers and the combined biomarker Y2 are given in Table 4 for all of the biopsies.
The combined biomarker Y2 threshold of −1 based on 46 biopsies did not produce the best sum of sensitivity and specificity for all 60 biopsies, but yielded the lowest misclassification proportion. The best sum of sensitivity and specificity was achieved with a threshold value of −2.94 (sensitivity 80%; specificity 62%), but this threshold misclassified 20 out of 61 cases (or 33%). The difference in the sum of sensitivity and specificity, however, is just 2%. The threshold Y2 value of −1 was thus favored given the lower rate of misclassification. The receiver-operating characteristic (ROC) curve for this composite biomarker is shown in Figure 2. The area under the curve (AUC) was 0.74.
The interstitial inflammation biomarker equation 1 was applied to 10 biopsies that were not LN. These biopsies included an idiopathic immune-complex glomerulonephritis (1), pauci-immune necrotizing and crescentic glomerulonephritis (1), membranous glomerulopathy (1), diabetic glomerulosclerosis (1), IgA nephropathy (1), advanced chronic kidney disease (1), glomerular basement membrane abnormalities (2), and non-specific findings (2). Only one of these biopsies had moderate-severe interstitial inflammation; the rest had none-mild. Equation 1 correctly classified 8 of the 10 biopsies, including the biopsy with severe interstitial inflammation. Two biopsies with no-mild interstitial inflammation were misclassified as moderate-severe, and like the misclassified LN patients described previously, these patients had elevated serum creatinine levels.
The ability to non-invasively follow changes in kidney pathology during the treatment of LN would be an important step forward in improving disease management and outcome. Here we have demonstrated that a composite biomarker, uMCP-1 + SCr, accurately reflects renal interstitial inflammation in a moderately sized cohort of SLE patients. Although individual candidate urine biomarkers were, on average, differentially expressed relative to the level of interstitial inflammation in a population, there was significant overlap among cases with and without interstitial inflammation, and this attenuated the performance of single urine proteins as biomarkers.
Equation (1) misclassified 14% of the biopsies. With additional training data this rate may become lower, but will likely not go to zero. Although the kidney biopsy is the gold-standard comparator for Equation (1), there is a finite rate of misclassification with tissue readings. The accuracy of a kidney biopsy depends on the size of the tissue sample obtained. For example, the correct diagnosis of glomerular disease or kidney allograft rejection requires an adequate biopsy defined by a minimum number of glomeruli and blood vessels 18, 19. There is no information on correct classification of tubulointerstitial lesions by biopsy in SLE; however, in a study of paired kidney transplant biopsies, interstitial fibrosis identified on the first biopsy was not seen in 12% of second biopsies 20. Because it was not felt that regression of fibrosis had occurred, this was thought to be an estimate of misclassification of tubulointerstitial disease by biopsy, and it is close to the misclassification rate of our composite biomarker. It is conceivable that urine biomarkers could be less likely to misclassify kidney pathology because they reflect the total renal environment and are not subject to biopsy sampling errors and size variations.
Although this biomarker of interstitial inflammation was developed for the evaluation of LN, it is likely that it can be used to describe the renal interstitium in other types of kidney disease. While this will need to be tested in a larger group of non-lupus patients than included here, it is relevant because tubulointerstitial injury, including interstitial inflammation and fibrosis, is a risk factor for renal functional decline and poor response to therapy in a variety of disorders. These include membranous nephropathy, focal segmental glomerulosclerosis, IgA nephropathy, diabetic nephropathy, and renal transplant failure 21–27. Similar to LN, interstitial inflammation appears to be a precursor to interstitial fibrosis in these diseases 27.
Given the relationship between interstitial inflammation and interstitial fibrosis, we used uMCP-1, uLFABP, uHepcidin, uPCR, and serum creatinine to derive a composite biomarker of renal interstitial fibrosis. Here uPCR and uHepcidin were informative, but the composite biomarker did not perform as well as Equation (1), suggesting the addition of other component markers is needed for a final composite fibrosis biomarker.
Our markers of interstitial inflammation and fibrosis were derived from biopsies read for clinical use. Thus, the markers were based on a semi-quantitative evaluation of interstitial inflammation and fibrosis, rather than precise morphometric measurements and enumeration of individual sub-types of interstitial leukocytes. Reassessment of interstitial inflammation and fibrosis in a more quantitative fashion may further improve biomarker models.
The misclassification of biopsies did not appear to be due to increased use of corticosteroids or immunosuppressive medications, or a delay in the collection of urine after biopsy in the misclassified patients.
In summary, this pilot work demonstrates that combinations of urine proteins and clinical variables can be used to derive potentially useful composite biomarkers that reflect specific pathologic lesions in the kidneys of patients with LN. A limitation of our study is the relatively modest sample size of LN biopsies used for linear discriminant analysis. With larger test sets and more precise (quantitative) measurements of pathology, we expect these biomarker equations can be more finely tuned to increase accuracy. Another important limitation is that we do not have an independent set of LN biopsies to use as a validation cohort. This will be necessary before the biomarkers can be used clinically. Despite these limitations, it is envisioned that eventually these biomarkers will be used to follow patients with LN and possibly other forms of kidney disease over time, and individualize treatment decisions.
The cohort was comprised of 64 kidney biopsies from 61 patients, all of whom had at least 4 American College of Rheumatology criteria for systemic lupus erythematosus (SLE), including immune-complex glomerulonephritis, and many of whom participated in the Ohio SLE Study 28. Three patients had repeat biopsies. Urine was collected on the day of biopsy or within 24 hours, except in 12 cases where urine was collected within 2 (n=4), 3 (n=2), 4, 6, 7 (n=2), 12, and 13 days of kidney biopsy. After urine was collected it was centrifuged to remove sediment and stored in preservative-free aliquots at −80°C until use.
Interstitial inflammation and interstitial fibrosis were semi-quantitatively graded as none, mild, moderate, or severe on light-microscopic sections for clinical biopsy reports by a nephropathologist blinded to urine biomarker data. The stains used to estimate the percentage of involved cortex were hematoxylin and eosin, periodic acid-Schiff, and trichrome. None was considered to be up to 5% of the renal interstitium, mild between 6 and 25%, moderate between 26 and 50%, and severe greater than 50% 1, 29. For analysis, biopsies graded none or mild were combined, and biopsies graded moderate or severe were combined. The rationale for this grouping was to model and distinguish clinically significant interstitial disease, as has been shown in previous studies 1, 30.
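The grading scale and the two-group comparison described above can be written as a small lookup, sketched below. The function names are hypothetical. Treating the cut-offs as "up to and including" 5%, 25%, and 50% is an assumption made so that fractional percentages are handled; the text only gives whole-number ranges.

```python
def grade_interstitium(percent_involved):
    """Map percent of involved cortex to the four-level semi-quantitative grade."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percent_involved must be between 0 and 100")
    if percent_involved <= 5:
        return "none"
    if percent_involved <= 25:
        return "mild"
    if percent_involved <= 50:
        return "moderate"
    return "severe"


def is_moderate_severe(grade):
    """Collapse the four grades into the two groups used for analysis."""
    return grade in ("moderate", "severe")
```

For example, a biopsy with 30% of the cortex involved would be graded moderate and would fall in the moderate-severe group.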
Urine MCP-1 levels were measured using the Quantikine Human CCL2/MCP-1 ELISA kit from R&D Systems (Minneapolis, MN) as before 13. uMCP-1 was normalized to urine creatinine. Creatinine was measured with a Creatinine Detection Kit (Assay Designs, Ann Arbor, MI). The final values were expressed as ng MCP-1/mg creatinine.
Urine L-FABP level was measured using the Human L-FABP ELISA kit from CMIC Ltd. (Tokyo, Japan) following the manufacturer’s protocol, and uLFABP was normalized to urine creatinine. The final values were expressed as ng L-FABP/mg creatinine.
Hepcidin-25 was measured by EIA (Bachem Group, Torrance, CA) as before 16. The hepcidin-25 standard Liver-Expressed Antimicrobial Peptide 1 (LEAP1) from Peptides International Inc (Louisville, KY) was used to validate this EIA. The R-squared value was 0.9967 for LEAP1 from 0–50 ng/ml using sigmoid regression. The coefficient of variation (CV) for a fixed hepcidin-25 concentration of 1.56 ng/ml was 3.49% intra-assay and 3.43% inter-assay. Urine hepcidin values were then normalized to urine creatinine, with the final value expressed as ng hepcidin/mg creatinine.
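All three urine proteins are reported per mg of urine creatinine. The sketch below shows one way that normalization could be computed. The function name and the input units (marker in ng/ml, creatinine in mg/dl) are assumptions for illustration; the text specifies only the final units of ng marker per mg creatinine.

```python
def normalize_to_creatinine(marker_ng_per_ml, creatinine_mg_per_dl):
    """Return a urine marker concentration as ng per mg of urine creatinine.

    Assumed input units: marker in ng/ml, creatinine in mg/dl.
    """
    if creatinine_mg_per_dl <= 0:
        raise ValueError("urine creatinine must be positive")
    # Convert creatinine to mg/ml so the volume units cancel.
    creatinine_mg_per_ml = creatinine_mg_per_dl / 100.0
    return marker_ng_per_ml / creatinine_mg_per_ml
```

Under these assumed units, a marker concentration of 2 ng/ml in urine with a creatinine of 50 mg/dl corresponds to 4 ng/mg creatinine.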
Fisher’s linear discriminant analysis was used to determine the discriminant score function based on one or more normally distributed components. This procedure produces an optimally weighted linear function of the chosen log-transformed markers, and the discriminating threshold value minimizes the expected number of misclassifications under the normal model. This does not necessarily maximize the sum of sensitivity and specificity. We therefore modified the threshold value to be the one that maximizes this sum for the observed data. The data were log-transformed because examination of the Box-Cox family of transformations for a good fit to normality led to the choice of the log transformation. The Shapiro-Wilk test was used to formally test for normality of the log-transformed variables. It showed an excellent fit for uMCP-1, uHepcidin, and uPCR, and a moderate fit for SCr, where a slight positive skewness was noticed. For uLFABP, the data were bimodal for the no or mild inflammation group and normal for the moderate-severe group. The software used for analysis was SAS JMP 9.0 (Cary, North Carolina). Comparisons of two groups were done by the Mann-Whitney test. A two-tailed P < 0.05 was considered significant.
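The two steps described above, deriving discriminant weights from log-transformed markers and then choosing the cut-off that maximizes sensitivity plus specificity, can be sketched for the two-variable case as follows. This is a minimal illustration in plain Python, not the JMP procedure used in the study. All function names are hypothetical, the data values are invented, and the score omits any constant term, so its scale and thresholds are not comparable to Y1 or Y2.

```python
import math


def log_rows(rows):
    """Log-transform every value in a list of (marker_a, marker_b) pairs."""
    return [tuple(math.log(v) for v in row) for row in rows]


def fisher_weights_2d(class0, class1):
    """Fisher discriminant weights for two features: w = Sw^-1 (mean1 - mean0)."""
    def mean(rows):
        n = len(rows)
        return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

    def scatter(rows, m):
        sxx = sum((r[0] - m[0]) ** 2 for r in rows)
        sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
        syy = sum((r[1] - m[1]) ** 2 for r in rows)
        return sxx, sxy, syy

    m0, m1 = mean(class0), mean(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    # Pooled within-class scatter matrix [[a, b], [b, c]].
    a, b, c = a0 + a1, b0 + b1, c0 + c1
    det = a * c - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Closed-form 2x2 inverse applied to the difference in class means.
    return ((c * dx - b * dy) / det, (a * dy - b * dx) / det)


def best_threshold(scores, labels):
    """Return (cut, sens + spec) for the cut-off maximizing sensitivity + specificity.

    A score at or above the cut-off is classified as positive (label 1).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_sum = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        total = tp / n_pos + tn / n_neg
        if total > best_sum:
            best_cut, best_sum = cut, total
    return best_cut, best_sum


# Invented (uMCP-1, SCr) pairs for illustration only; NOT study data.
none_mild = [(0.4, 0.8), (0.6, 1.0), (0.5, 0.9), (0.9, 1.1)]
moderate_severe = [(2.0, 1.4), (3.5, 2.0), (2.8, 1.2), (4.0, 2.5)]

x0, x1 = log_rows(none_mild), log_rows(moderate_severe)
weights = fisher_weights_2d(x0, x1)
scores = [weights[0] * u + weights[1] * v for u, v in x0 + x1]
labels = [0] * len(x0) + [1] * len(x1)
cut, sens_plus_spec = best_threshold(scores, labels)
```

Searching only over observed score values is sufficient here because sensitivity and specificity change only when the cut-off crosses an observed score. A threshold chosen this way is tuned to the observed data, which is one reason an independent validation set matters.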
This work was supported in part by NIDDK R01 DK074661 (BHR) and R21 DK077331 (BHR).