The post-genomic era in science has led to the generation of a number of large, high-density data sets with hundreds to thousands of data points for each tested subject. In the field of toxicology, genomics technologies have been used to investigate how different stresses alter gene expression (see (12) for a recent review). For many in vivo studies, gene expression changes are measured in the tissue of interest, but other data are also obtained to give a “phenotypic anchor” (6) for the gene changes, including clinical chemistry and histopathology (3). However, for large data sets, including large human clinical/epidemiological studies, it can be problematic to effectively evaluate the phenotypic anchor, due to the sheer number of data points to consider.
Clinical chemistry data are often viewed in a data table or a bar graph, where one can examine the changes that occur for one analyte across the groups of interest. For a study involving only one or a few compounds, these visualizations help investigators determine how subjects in each group react to the given stressor. However, for large animal data sets involving multiple compounds, dose groups and time points, it is very difficult to give a meaningful visual representation of the data with traditional bar graphs because of the number of data points in these experiments. This is especially true for large human clinical/epidemiological studies, such as the Framingham Heart Study, which has thousands of people enrolled (9). There is therefore a need to visualize high-density clinical chemistry data in a manner that helps put gene expression data, or other high-density data, in the proper biological context. The goal of this study was to develop a method to visualize multiple clinical chemistry analytes over many disparate samples (i.e., different compounds, doses and time points) in a single graphic.
Male Fischer rats, approximately 12 to 14 weeks of age, were treated with a single dose of 1 of 8 hepatotoxicants or their respective vehicle (see the table of hepatotoxicants used in this study). Serum (250 μL) was obtained 6, 24, and 48 hours after dosing for clinical chemistry analysis. Each treatment group contained at least 4 animals. All animals were treated humanely in accordance with guidelines established in the NIH Guide for the Care and Use of Laboratory Animals (1). Clinical chemistry analyses were performed using the Roche Cobas Fara chemistry analyzer (Roche Diagnostic Systems, Inc., Montclair, NJ).
Hepatotoxicants used in this study
The main goal of assessing changes in clinical chemistry across many animals in one graphic is to identify quickly which animals show signs of organ damage, as evidenced by clinical chemistry alterations. Therefore, each animal’s clinical chemistry measurements were analyzed in relation to values observed in the normal population. Because our study set is large, it was possible to obtain the median and standard deviation of each analyte across all of the vehicle-treated animals at all time points (n=103), and these were used to define the reference value for each analyte within this study (see the table of clinical chemistry values for vehicle-treated animals).
Clinical chemistry values for vehicle-treated animals
After defining the reference value for each analyte, the data were transformed so that the visualization would be an accurate representation of the data. The raw clinical chemistry data needed to be transformed for two reasons. First, several of the analytes have a range of values spanning more than 3 orders of magnitude, which makes log transformation necessary. Second, the various analytes have vastly different dynamic ranges. For instance, significant liver injury is indicated by a several hundred-fold to several thousand-fold change in normal serum ALT (alanine aminotransferase) levels, while a greater than 2-fold change in serum creatinine levels indicates a significant loss of kidney function (5). Therefore, to put the different analytes on the same scale, we performed a Z-score transformation on the log-transformed values, using the median and standard deviation of the vehicle-treated animals as the basis of the transformation. The Z-score transformation [Z = (observed value – baseline median)/baseline standard deviation, with all quantities taken on the log-transformed data] ensures that each analyte has a median value of 0 and a standard deviation of 1 across the vehicle-treated reference animals. We used the median instead of the mean, since the median is less sensitive to statistical outliers. In addition to putting all of the analytes on the same scale, the Z-score transformation also centers the log-transformed data on 0, with values greater than the baseline having a positive Z-score and values less than the baseline having a negative Z-score.
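The transformation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names, the choice of base-10 logarithms, the use of the sample standard deviation, and the ALT values are all assumptions made for the example, since the text does not specify them.

```python
import math
import statistics


def reference_values(control_values):
    """Return (median, standard deviation) of the log10-transformed control values.

    Assumptions: base-10 logs and the sample standard deviation; the text
    does not state which log base or which SD estimator was used.
    """
    logs = [math.log10(v) for v in control_values]
    return statistics.median(logs), statistics.stdev(logs)


def z_score(observed, ref_median, ref_sd):
    """Z = (log10(observed) - reference median) / reference SD."""
    return (math.log10(observed) - ref_median) / ref_sd


# Made-up serum ALT values (U/L) for illustration only; not data from this study.
vehicle_alt = [40.0, 50.0, 45.0, 55.0, 48.0]
median_alt, sd_alt = reference_values(vehicle_alt)

treated_alt = [47.0, 480.0, 4800.0]
for value in treated_alt:
    print(value, round(z_score(value, median_alt, sd_alt), 2))
```

Because the reference standard deviation is computed on the log scale, a fixed fold change in an analyte with a narrow control range yields a larger Z-score than the same fold change in an analyte with a wide control range, which is what places analytes with different dynamic ranges on a common scale.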
The clinical chemistry Z-scores were then used to perform hierarchical clustering using Eisen’s Cluster program (2). Eisen’s TreeView program (2) was used to visualize the data in a heat map, with yellow indicating Z-scores > 0, blue indicating Z-scores < 0, black indicating Z-scores ≈ 0 and grey indicating data not present. The resulting figure shows the cluster and heat map of all treated animals in this 8-compound study (see supplemental Figure 1 for the fully annotated cluster). Looking at the dendrogram for the analytes, it is evident that the liver enzymes cluster tightly together (ALT, AST, LDH, SDH; see the table of clinical chemistry values for full names), which supports the validity of the data transformation, since these analytes are markers of hepatocellular damage and increase with liver injury (5). The middle portion of the heat map consists of animals displaying evidence of hepatotoxicity based on the elevation of liver enzymes (ALT, AST, SDH, LDH). In general, these animals were exposed to either the high or moderate dose of the hepatotoxic compounds, with the notable exception of the animals dosed with the non-hepatotoxic compound 1,4-dichlorobenzene (8), which did not elicit clinical chemistry changes associated with hepatotoxicity. Examination of several subclusters reveals that hepatotoxic doses of the administered compounds elicit similar alterations in the clinical chemistry panel profile; however, each compound elicits a pattern of change that is distinct. Thioacetamide shows indications of eliciting nephrotoxicity, in addition to hepatotoxicity, as evidenced by the elevation of blood urea nitrogen and creatinine 48 hours after a 150 mg/kg dose (Subcluster A). Subcluster B shows a group of animals dosed with 5 different hepatotoxicants exhibiting similar elevations of ALT, AST, LDH, SDH and TBA (total bile acids), but dissimilar reductions in serum triglycerides. Diquat appears to elicit a different hepatotoxic response from the other hepatotoxicants, in that no elevations in SDH or TBA are apparent (Subcluster C). Based on this heat map of clinical chemistry alterations, it can be determined that the experimental groups have similar, but distinct, pathologies.
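To make the clustering step concrete, the following is a small pure-Python stand-in for what a hierarchical clustering program does with the Z-score matrix. It is not Eisen’s Cluster program, and the text does not state which distance metric or linkage was used, so the Pearson-correlation distance and average linkage here are assumptions. The function names and the Z-score values are hypothetical and chosen only for illustration.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / math.sqrt(var_x * var_y)


def average_linkage(profiles):
    """Agglomerative clustering of named profiles.

    Returns the merges in order as (members_a, members_b, distance), where
    distance is the mean pairwise distance between the two clusters' members.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(pearson_distance(profiles[a], profiles[b])
                        for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Made-up Z-scores for five animals; not data from this study.
z_by_analyte = {
    "ALT": [0.1, 8.2, 12.5, 0.3, 6.0],
    "AST": [0.0, 7.5, 11.0, -0.2, 5.1],
    "SDH": [0.2, 6.9, 10.4, 0.1, 4.8],
    "Creatinine": [0.3, -0.1, 0.2, 3.5, 0.0],
    "BUN": [-0.1, 0.2, 0.1, 4.1, 0.3],
}

merges = average_linkage(z_by_analyte)
for left, right, distance in merges:
    print(left, "+", right, round(distance, 3))
```

In this toy example the liver enzymes join one another before they join the kidney markers, mirroring the tight liver-enzyme cluster described in the dendrogram. The same function could be applied to animals instead of analytes by passing one profile per animal.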
Clustering and heat map of clinical chemistry alterations
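The color convention described for the heat map (yellow above baseline, blue below, black near zero, grey for missing data) can be expressed as a simple mapping from Z-score to color. This is a sketch, not TreeView’s own rendering code: the function name `z_to_rgb`, the `saturation` cutoff of 3, and the specific grey value are assumptions introduced for illustration.

```python
import math


def z_to_rgb(z, saturation=3.0):
    """Map a Z-score to an (R, G, B) tuple with components in 0..255.

    `saturation` is a hypothetical parameter: the absolute Z-score at which
    the color reaches full brightness. Larger magnitudes are clamped to it.
    """
    if z is None or math.isnan(z):
        return (128, 128, 128)  # grey: data not present
    level = int(round(255 * min(abs(z), saturation) / saturation))
    if z > 0:
        return (level, level, 0)  # yellow: above the reference median
    return (0, 0, level)  # blue: below the reference median; black at z == 0


# One made-up row of Z-scores, including a missing value.
row = [0.0, 1.2, -0.8, 9.5, None]
print([z_to_rgb(z) for z in row])
```

Clamping at a fixed magnitude is a design choice: without it, a single analyte with a very large Z-score would force every other cell toward black.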
Described here, for the first time, is a method that can be used to visualize high-density clinical chemistry data in a single graphic by using a heat map. The need for this type of visualization arises from the prevalence of large experimental data sets that contain hundreds, if not thousands, of data points. Our method uses the Z-score transformation to put each animal’s clinical chemistry data in the framework of what is normal for an untreated rat, which means that a bank of historical data or a large group of concurrent control animals is needed for this type of transformation to work. For animal studies, the reference controls should be of the same sex, strain and age as the test animals, with all animals on the same diet, since these factors significantly influence individual animals’ clinical chemistry values (5). This type of data transformation should also prove useful for large human clinical data sets, as there is considerable historical data on the reference values for the different clinical chemistry analytes in human populations (10).
The most important facet of this normalization procedure is that the biological context of the clinical chemistry data is maintained. This can be seen in the heat map in the high degree of similarity between individual rats in the different treatment groups. The tight clustering of ALT, AST, LDH, SDH and TBA also indicates that the biological context of the data is preserved following the data transformation, since these analytes are released into the blood following liver injury (5).
An important advantage of using a heat map to visualize clinical chemistry data across multiple animals and compounds is that patterns can be identified that are not readily discernible when looking at each clinical chemistry parameter or treatment group individually. For instance, it is apparent that serum triglyceride levels often decrease with hepatic damage; however, as can be seen in the heat map, some compounds do not elicit the concomitant decrease in serum triglycerides. Additionally, examination of the heat map indicates that exposure to the highest doses of thioacetamide and N-nitrosomorpholine elicited kidney damage at 48 hours after dosing, as seen by the elevation in blood urea nitrogen levels and confirmed by histopathology (4). However, the two compounds appear to elicit different types of kidney damage, since thioacetamide administration led to an increase in serum creatinine, while N-nitrosomorpholine administration did not.
The greatest value of this clinical chemistry data transformation and visualization likely resides in its integration with other high-density data, such as genomics, proteomics and metabolomics data. By integrating disparate types of data effectively, while maintaining the biological meaning in the data, greater knowledge and insight should be achieved than can be attained from each type of data by itself. We, and others (4), have started the process of integrating disparate data types, which will hopefully provide a clear benefit for the interpretation of high-density data sets.