|Home | About | Journals | Submit | Contact Us | Français|
The rate of decline of glomerular filtration rate (GFR) in children with chronic kidney disease (CKD) can vary, even among those with similar diagnoses. Classic regression methods applied to the log-transformed GFR (i.e., lognormal) quantify only rigid shifts in a given outcome. The generalized gamma distribution offers an alternative approach for characterizing the heterogeneity of effect of an exposure on a positive, continuous outcome. Using directly measured GFR longitudinally assessed between 2005 and 2010 in 529 children enrolled in the Chronic Kidney Disease in Children Study, the authors characterized the effect of glomerular CKD versus nonglomerular CKD diagnoses on the outcome, measured as the annualized GFR ratio. Relative percentiles were used to characterize the heterogeneity of effect of CKD diagnosis across the distribution of the outcome. The rigid shift assumed by the classic mixed models failed to capture the fact that the greatest difference between the glomerular and nonglomerular diagnosis’ annualized GFR ratios was in children who exhibited the fastest GFR declines. Although this difference was enhanced in children with an initial GFR level of 45 mL/minute/1.73 m2 or less, the effect of diagnosis on outcome was not significantly modified by level. Generalized gamma models captured heterogeneity of effect more richly and provided a better fit to the data than did conventional lognormal models.
Chronic kidney disease (CKD) is progressive in nature and characterized by decreasing kidney function over time (1–3). The natural history of CKD has been studied predominantly in adult populations (4); substantially less is known about the progression of CKD in children (5). The Chronic Kidney Disease in Children (CKiD) cohort study assembled a large cohort of children with CKD for prospective, longitudinal measurement of glomerular filtration rate (GFR) to specifically address the lack of knowledge in this area. CKD resulting from glomerular disease or injury has been associated with an accelerated decline in kidney function compared with CKD resulting from nonglomerular (predominantly structural urologic disease) CKD (5). Even among individuals with a specific diagnosis, little to no progression may occur over extended periods, whereas marked progression may occur in the most susceptible individuals (6–9).
Analyses of GFR using traditional regression methods (e.g., lognormal regression) aimed at detecting rigid shifts of the entire distribution of values associated with exposures are often overly simplistic and could fail to detect meaningful differences. Specifically, classical regression analysis of a log-transformed outcome under the standard assumption of a common scale yields a single estimate for the relative effect of an exposure and might mask effect heterogeneity. Here, we show that parametric methods based on the generalized gamma (GG) distribution offer an attractive, alternative approach for characterizing the effect of an exposure on GFR decline. GG models, which are characterized by 3 parameters—location (β), scale (σ), and shape (κ)—allow for determination of heterogeneous effects of an exposure on the percentiles of the distributions of different groups (10). Heterogeneous effects of an exposure across different percentiles of the distribution can be characterized by allowing the exposure variable to modify not only β, but also σ and κ. Furthermore, the GG distribution incorporates as a particular case the classic lognormal regression, in which β depends on covariates, σ is common across exposure groups, and κ takes the fixed value of 0. Relative percentiles (RPs) (i.e., ratios of the values of the outcome below which a certain percentage of individuals are observed in different groups) derived from the GG model are useful for fully characterizing differences between exposure groups; in particular, effects of exposures can be different at different percentiles of the distribution of values. To illustrate the use of GG models and RPs as measures of heterogeneous effects of exposures, we used data collected by the CKiD cohort study to characterize the longitudinal changes in GFR in children with glomerular versus nonglomerular CKD.
The present analysis draws upon data collected from subjects enrolled in the CKiD cohort study, which was conducted at 47 pediatric nephrology centers in North America. Details of the CKiD study protocol and study design have been previously published (11); the study protocol was approved by the institutional review board of each participating center. Briefly, eligibility criteria for enrollment in CKiD included an age of 1–16 years and an estimated Schwartz formula (12) GFR of 30–90 mL/minute/1.73 m2. Clinical and demographic data were collected from CKiD participants at annual visits. The study was designed to directly measure GFR via plasma-iohexol disappearance over 5 hours (13) at the baseline visit, the first follow-up visit (1 year later), and every other visit (2 years apart) thereafter (11). Formulas to estimate GFR based on serum creatinine, cystatin C, and/or blood urea nitrogen levels, as well as sex and height data, have been published previously by CKiD investigators (14). Application of these formulas using the maximal number of markers available at every annual study visit yielded estimates of GFR to complete the longitudinal data at visits when the iohexol protocol was either not implemented or not possible. Hereafter, we refer to the full concatenation of plasma-iohexol disappearance and estimates of GFR measurements as simply GFR. The study population comprised subjects with 2 or more GFR measurements assessed between 2005 and 2010 and a minimum of three-quarters of a year between the first and last measurements.
Except for 4% of serum creatinine measurements and 4% of blood urea nitrogen measurements that were done locally, all laboratory assays were performed by the central CKiD laboratory at the University of Rochester, Rochester, New York. Cystatin C measurement using the DAKO method was performed centrally at the Children’s Mercy Hospital, Kansas City, Missouri.
The outcome of interest was the annualized GFR ratio. When using all GFR measurements in a given individual across time points, the annualized ratio (formally defined in the mixed linear model section below) corresponds to the average ratio of the later to earlier GFR, 1 year apart. A constant GFR over time corresponds to an annualized GFR ratio of 1; an annualized ratio of 0.95 indicates that, on average, each year the individual’s GFR was 95% of the value for the previous year. Those with very low annualized ratio values had rapid progression of CKD.
The primary exposure of interest was CKD diagnosis. A self-report of primary CKD diagnosis was collected at baseline from each participant and categorized as glomerular or nonglomerular CKD. Self-reported diagnoses needing adjudication (i.e., those reported as “other”) were reviewed by CKiD steering committee members and classified accordingly. A list of the specific CKD diagnoses and their glomerular/nonglomerular classifications has been previously published (15). Baseline GFR level was also included in the analysis as a covariate and as a possible effect modifier of the relation between annualized GFR ratio and CKD diagnosis.
GG distributions (10) are characterized by 3 parameters—location (β), scale (σ), and shape (κ)—denoted by GG(β, σ, κ). The location parameter β governs the value of the median for fixed values of the scale and shape parameters; the scale parameter σ determines the value of the interquartile ratio (i.e., third quartile/first quartile) for a fixed value of the shape parameter and independent of the location parameter β. The shape parameter κ is useful for modeling heterogeneity of the tails of the distributions of different groups. In particular, for 2 groups with the same location and scale but different shape, the shape parameters will reflect differences in the tails of the 2 distributions. Because the lognormal distribution corresponds to the particular case of κ = 0, a normal distribution with mean μ and standard deviation σ will correspond to the log-transformation of a GG(μ, σ, 0), that is, N(μ, σ) ↔ log GG(μ, σ, 0).
To obtain parametric estimates of the distributions of annualized GFR ratios by CKD diagnosis using classic methods, log-transformed GFR measurements were regressed on time from baseline using a linear mixed-effects model with random intercept and random slope components. Specifically, if the jth visit from the ith individual occurs tij years from baseline (which in individuals complying with the protocol will be equal to j − 1) and the GFR at that time is GFRij, the model for the change in GFR over time takes the form
where Ai is the estimated log(GFR) for individual i at ti1 = 0 (baseline), Bi is the expected change in log-transformed GFR for the ith individual associated with a 1-unit change in time (i.e., exp(Bi) is the estimate annualized GFR ratio), and Eij are the errors of the log-transformed GFR measurements away from the line of the ith individual, which are assumed to be normally distributed with a mean of zero and standard deviation of θ. Glomerular CKD diagnosis was included as both a fixed covariate and as an effect modifier of the fixed effect of time from baseline. Specifically, for children with nonglomerular CKD, the distribution of Ai is N(α, τ), which is the log of GG(α, τ, 0), and the distribution of Bi is N(β, σ), which is the log of GG(β, σ, 0). For children with glomerular CKD, the distribution of Ai is N(α*, τ), which is the log of GG(α*, τ, 0), and the distribution of Bi is N(β*, σ), which is the log of GG(β*, σ, 0). Thus, the mixed linear model allows for heterogeneity by diagnosis in the location parameters (i.e., (α* − α) for level and (β* − β) for slope) while estimating a common scale parameter σ across all exposure groups; as with all lognormal distributions, the shape parameter is fixed at zero.
Empirical estimates ai for Ai and bi for Bi of an individual were calculated with the least-squares method using each individual’s log-transformed measurements of GFR on time from baseline. As such, exp(bi) is the empirically estimated annualized GFR ratio for the ith individual.
GG regression analysis was used to model the CKD diagnosis-specific distributions of the empirically estimated annualized GFR ratios, exp(bi), using the method of maximum likelihood. The model contained 6 parameters: 2 exposure groups × 3 parameters (location, scale, and shape). As such, this model allowed the distribution of bi for children with nonglomerular CKD to be described as the log of GG(β, σ, κ) and for children with glomerular CKD to be described as the log of GG(β*, σ*, κ*). This parameterization thus generalizes that of the linear mixed model in 2 ways: first, through heterogeneity in the estimate of the scale parameter, σ, and second, by allowing the estimates of the shape parameter, κ, to be nonzero and unique for each CKD diagnosis group.
The cumulative distribution of empirically estimated annualized GFR ratios, exp(bi), was summarized graphically by CKD diagnosis group. To assess the goodness of fit of both the linear mixed-effects model and the GG-based analysis, diagnosis-specific parametric cumulative distribution functions (CDFs) of the estimated annual GFR ratios were superimposed on the diagnosis-specific cumulative distributions of the empirical annualized GFR ratios.
Results from the classical mixed model and the GG model were used to estimate relative percentiles of GFR annualized ratios for subjects with glomerular CKD versus those with nonglomerular CKD. Specifically, annualized ratios in the 2 diagnosis groups were compared using the pth percentile r(p; β, σ, κ). In the setting of a GG model, this corresponds to eβ[rκ(p)]σ, where rκ(p) is the pth percentile of the standard GG having β = 0, σ = 1, and shape = κ (10).
The pth relative percentile of annualized GFR change for children with glomerular CKD versus nonglomerular CKD corresponds to
where r(p; β*, σ*, κ*) is the estimated pth percentile (0–100) of the annualized GFR ratios for children with glomerular CKD and r(p; β, σ, κ) is that of the children with nonglomerular CKD. Confidence intervals for the RPs were calculated using the delta method. RPs equal to 1 correspond to no association with CKD diagnosis; RPs less than 1 indicate that children with glomerular CKD display more accelerated disease progression than do children with nonglomerular CKD. In the classic mixed lognormal models (i.e., σ* = σ and κ* = κ = 0), RPs are equal to exp(β* − β) across all percentiles. Thus, the effect of diagnosis, as modeled by the linear mixed model, is indicated by a single estimate.
We were also interested in both adjusting for the baseline level of GFR and exploring whether the baseline level modified the effect of diagnosis on GFR change. Baseline GFR level was split at the study population median and categorized as 45 mL/minute/1.73 m2 or less versus more than 45 mL/minute/1.73 m2. We then carried out the analysis described above separately for the 2 groups (i.e., the 12-parameter GG regression model) and used likelihood ratio tests to compare this model with 1) the 6-parameter GG regression model that assumed that baseline level did not change the GG distribution and 2) the 9-parameter GG model that was adjusted for baseline level but forced the effect of diagnosis to be equal in the 2 strata.
All analyses were performed using PROC MIXED and PROC NLMIXED procedures in SAS, version 9.2 (SAS Institute, Inc., Cary, North Carolina). Graphic figures were constructed using S-PLUS 8.0 statistical software (Insightful Corporation, Seattle, Washington). All statistical significance was evaluated at α = 0.05.
At the time of analysis, 535 children had 2 or more GFR measurements and more than three-quarters of a year of follow-up time. Of these, 3 children had estimated annualized GFR ratios greater than 1.4 (i.e., greater than 40% estimated increases in renal function) and 3 had estimated baseline GFR measurements greater than 120 mL/minute/1.73 m2; all 6 were excluded from the analysis. Thus, analysis was performed on data from 529 children.
Table 1 presents demographic and clinical characteristics of the 529 children according to diagnosis type (glomerular CKD, n = 109; nonglomerular CKD, n = 420). The 2 groups differed in age, sex, race, duration of CKD, years of follow-up, baseline GFR level, and annualized change in GFR. The number of GFR measurements per subject was similar in both diagnosis groups. More importantly, more than half of the children in each group (52% among those with glomerular CKD and 62% among those with nonglomerular CKD) had 4 or more time points from which to calculate the annualized ratio of GFR.
Figure 1 shows results from the 2 univariate analyses performed to examine the effect of CKD diagnosis on the distribution of annualized GFR ratios: the classic linear mixed-effects model using lognormal distributions (Figure 1A) and the GG analysis (Figure 1B). Shown on the left side of Table 2 are the parameter estimates of the linear mixed model. The classic linear mixed model allowed for diagnosis-specific estimates of the location parameters; a common scale parameter is estimated for the 2 groups and the shape parameters are both fixed at zero. As shown in Table 2, the nonglomerular and glomerular location-parameter estimates were significantly different from one another: −0.048 vs. −0.141, respectively (P < 0.01). Because the classic mixed model allowed differences in the distribution of annual GFR ratios to be modeled only in the location parameter, the result was a rigid, uniform shift to the left of the estimated distribution of annualized GFR ratios for children with glomerular CKD compared with children with nonglomerular CKD. This is shown by the 2 thick black curves in Figure 1A.
Also shown in Figure 1 are the CKD diagnosis-specific distributions of empirically estimated annualized GFR ratios (step functions). The empirical cumulative distributions show that discrepancies in GFR change between the 2 diagnosis groups were greatest in the lower tails of the distributions, corresponding to children with more rapid declines in kidney function. The single common scale estimate and the lack of flexibility in the shape parameter with the lognormal distribution resulted in a poor fit by the linear mixed model of the tail of the nonglomerular distribution of empirically estimated annualized GFR ratios. Moreover, the rigid shift of the estimated glomerular CKD distribution of annualized GFR ratios resulted in a CDF with a poor fit to the empirical distribution for children with glomerular CKD, one that underestimates the prevalence of children with rapid disease progression (e.g., children with a greater than 30% decrease in GFR per year).
Figure 1B shows the same diagnosis-specific cumulative distributions of empirically estimated annualized GFR ratios overlaid with the diagnosis-specific CDFs for annualized GFR ratios generated from the GG regression analysis. The diagnosis-specific parameter estimates for the location, scale, and shape are displayed in the right side of Table 2. Differences in the location and scale parameters were small and statistically nonsignificant (difference in location by diagnosis, P = 0.53; difference in scale, P = 0.16). In contrast, the shape-parameter estimates for nonglomerular CKD and glomerular CKD were significantly different (0.782 vs. 2.100; P < 0.01). Moreover, each shape parameter estimate was significantly different from zero (for nonglomerular κ, 95% confidence interval: 0.60, 0.96; for glomerular κ, 95% confidence interval: 1.31, 2.89), which indicated that neither distribution was lognormal. The greater flexibility of the full GG parameterization produced diagnosis-specific CDFs of annualized GFR ratios with an excellent fit to the empirical distribution that was far superior to the poor fit provided by the linear mixed model.
Figure 2 depicts the estimated RPs and 95% confidence intervals of GFR annualized ratios for subjects with glomerular CKD versus those with nonglomerular CKD based on the GG model. The RP curve succinctly illustrates that the greatest difference between glomerular and nonglomerular GFR annualized ratios is among those in the lowest percentiles (i.e., those with the fastest disease progression), which illustrates the heterogeneity of effect. Additionally, Figure 2 illustrates the limited and overly simplistic information conveyed by the single lognormal (linear mixed model) estimate, whereby the rate of decline in the glomerular group is lowered by 0.91, independent of percentile and thus masking the true heterogeneity of effect.
Two hundred fifty-seven children (49%) (45 (41%) with glomerular CKD and 212 (50%) with nonglomerular CKD) had baseline GFR level of 45 mL/minute/1.73 m2 or less. Figure 3 displays the distribution of GFR ratios by baseline GFR level and CKD diagnosis. Figure 3A presents the diagnosis-specific empirical distributions of annualized GFR ratios (step functions) for children with baseline GFR less than or equal to 45 mL/minute/1.73 m2; Figure 3B shows data for children with baseline GFR greater than 45 mL/minute/1.73 m2. These curves (2 in Figure 3A and 2 in Figure 3B) are superimposed with the estimated CDFs based on the bivariate GG regression models that allow for heterogeneity by both diagnosis and baseline level of the location-, scale-, and shape-parameter estimates. These parameter estimates are shown in Table 3. Among subjects with a mean GFR of 45 mL/minute/1.73 m2 or less, the differences by diagnosis in the GFR annualized ratios at percentiles less than the 50th (i.e., lower tail) were greater than those seen among subjects with mean GFR levels greater than 45 mL/minute/1.73 m2.
To determine whether the 6 parameters describing the GG distribution of participants with baseline GFRs less than or equal to 45 mL/minute/1.73 m2 were different from those of participants with baseline GFRs greater than 45 mL/minute/1.73 m2, we carried out a likelihood ratio test. The −2 log likelihoods of Figure 3A and 3B were −243.64 and −383.72, respectively. In turn, the −2 log likelihood of Figure 1B was −609.14. The likelihood ratio statistic to test the null hypothesis that baseline level does not have any effect on GFR decline for both types of diagnoses was 18.22 = −609.14 − (−243.64 − 383.72), which corresponds to a P value of < 0.01 from a chi-squared analysis with 6 degrees of freedom (12 parameters in Figure 3 minus 6 parameters in Figure 1B). In contrast, when we compared the 12-parameter model with a 9-parameter model that was adjusted for baseline level but that did not allow the effect of the diagnosis to be modified by level, the −2 log likelihood of the 9-parameter model was −624.918 and the likelihood ratio statistic was 2.44 = −624.92 − (−243.64 − 383.72), which corresponded to a P value of 0.49 from a chi-squared analysis with 3 degrees of freedom. Thus, although adjusting for baseline level improved the overall fit of the model, the effect of glomerular CKD versus nonglomerular CKD in the strata less than or equal to 45 mL/minute/1.73 m2 was not statistically different from that in the greater than 45 mL/minute/1.73 m2 group. Nonetheless, the difference by diagnosis type in the shape parameters for those with baseline GFR of 45 mL/minute/1.73 m2 or less (2.099 = 3.002 − 0.903) was 2.5-fold higher than that of children with baseline GFR above mL/minute/1.73 m2 (0.848 = 1.450 − 0.602).
Estimated RPs from the GG models in Table 3 are presented in Figure 4. Up to the 50th percentile, the effect of diagnosis on GFR decline was similar in children with all baseline GFRs. For percentiles below the 50th, the effect of glomerular diagnosis was greater among those with baseline GFR below 45 mL/minute/1.73 m2, but the overlap of the confidence bands was consistent with the lack of statistical significance indicated above. As in the univariate analysis, the estimates for the effect of diagnosis generated by the conventional lognormal mixed model (1 for each GFR level strata and shown on the right side of Figure 4) failed to capture the heterogeneous effect by percentile, but the inference regarding the lack of effect modification was congruent with that of the GG models.
It is common to analyze biomarker data using methods that are based on the normal distribution once the biomarker has been transformed to the logarithmic scale (16, 17). Mixed linear models are the conventional tools for parametrically modeling changes in such biomarkers measured repeatedly over time in individuals. An attractive feature of this conventional analytical approach is that it capitalizes on lognormal distributions and provides a single estimate of effect for the exposure. However, such a rigid, uniform shift of the distribution of the outcome as is suggested by a single average effect of the exposure may not suffice. Here, we have presented an expanded parametric model provided by the GG distribution that is not subject to the restrictions of the classic mixed linear model. In Figures 2 and and4,4, we have illustrated that lognormal models may be able to capture an average level of effect associated with an exposure but will fail to characterize heterogeneity that may exist among individuals. In the most extreme case, where disease progression is affected by the exposure only in the subgroup of susceptible individuals, a single estimate of average effect generated from a lognormal model may fail to capture the relation with statistical significance.
Although the categorization of disease in this report as glomerular CKD versus nonglomerular CKD admittedly conglomerates heterogeneous diagnoses, attributing a single diagnosis to what is frequently a heterogeneous underlying pathophysiology is common in biomedical research. Congruent with possibly heterogeneous pathology, the methods presented here illustrate how individuals express a disease differently (18). Quantile regression methods have also been used to characterize differential effects of an exposure at varying percentiles of unexposed individuals (19–21). An advantage of quantile regression is its nonparametric nature, but it has 2 limitations: 1) the classic lack of efficiency and, more importantly, 2) the need for a very large number of descriptors (i.e., the effect for each percentile) to characterize the overall effect of exposures. Here, we have presented the GG regression model as an effective tool to quantitatively characterize heterogeneity of effects using RPs. By its parametric nature, GG-based regression borrows information from the contiguous percentiles to describe the overall effects of exposures.
Epidemiologically, heterogeneity of effect of an exposure has 2 principal sources. It may arise if the definition of an exposure is too broad, so that the exposed group will be composed of individuals whose vulnerability to disease progression is fundamentally different. In our analysis, the primary exposure of interest (glomerular CKD) was a conglomeration of focal segmental glomerulosclerosis (FSGS) (n = 34), hemolytic uremic syndrome (HUS) (n = 27), and a family of other kidney diseases that can be characterized as glomerular in nature (n = 48). The progression of HUS CKD has been shown to follow a unique longitudinal pathway that is characterized by an initial period of acute kidney injury and followed by either full recovery or slow improvement with residual chronic renal sequelae and decreased GFR, and then gradual decline (6, 22–24). The post-acute injury phase of CKD related to HUS sits in sharp contrast to patients with FSGS and other glomerular CKDs who have chronic, ongoing kidney injury that is reflected in a relatively quicker loss of kidney function (5, 8). Indeed, in our study population, the 50th, 25th, and 10th percentiles of the annualized GFR ratio were 0.87, 0.74, and 0.58 for the FSGS group; 0.84, 0.70, and 0.51 for the other glomerular group; and 0.99, 0.92, and 0.81 for the HUS group. As a result, subjects with HUS disproportionately populated the upper end of the glomerular disease group’s annualized GFR ratio distribution, whereas subjects with FSGS and other glomerular diseases disproportionately constituted the lower tail of the distribution.
In the presented analysis, we were able to explain some heterogeneity within the glomerular CKD group of subjects by identifying children with HUS CKD versus non-HUS CKD. It is not always the case that investigators and analyses will have such a marker available. Because of this, use of the GG and relative percentiles showing heterogeneity of effect should encourage investigators to search for other characteristics that might help to more clearly define exposure groups and yield less imprecision with respect to the definition of a given exposure. In some cases, by revealing effect heterogeneity where no known differences exist, this method holds promise as a method to guide discovery of novel factors of clinical relevance.
Heterogeneity of effect could also be due to unmodeled effect modification. In this case, a patient’s having the primary exposure of interest, glomerular CKD, might not be sufficient for explaining rapid CKD progression. In our analysis, there was a suggestion of the effect of diagnosis being enhanced among those with a low baseline GFR (Figure 4).
Heterogeneity of effect associated with an exposure, even in controlled trials, is a reality. Conventional methods that impose a constant effect (e.g., linear regression models of log-transformed outcomes) require the use of specific interaction terms to divulge the common phenomenon of heterogeneity of the effect of a given exposure. In contrast, the methods presented here based on the GG distribution allow the depiction of heterogeneity of effect of a given exposure and therefore offer a richer description of effects. Furthermore, they can yield specific subgroups with higher or lower reactivity in which unknown or unrecognized effect modifiers can be sought.
Author affiliations: Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland (Christopher B. Pierce, Christopher Cox, Alvaro Muñoz); Department of Pediatrics, Mount Sinai School of Medicine, New York, New York (Jeffrey M. Saland); and Division of Nephrology, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania (Susan L. Furth).
This work and the Chronic Kidney Disease in Children Study were supported by grants from the National Institute of Diabetes and Digestive and Kidney Diseases, funded in collaboration with the National Institute of Neurological Disorders and Strokes, the National Institute of Child Health and Human Development, and the National Heart, Lung and Blood Institute (grants U01-DK-66116, U01-DK-66143, U01-DK-66174, and U01-DK-82194).
Conflict of interest: none declared.