Computational image analysis is used in many areas of biological and medical research, but advanced techniques including machine learning remain underutilized. Here, we used automated segmentation and shape analyses, with pre-defined features and with computer-generated components, to compare nuclei from various premature aging disorders caused by alterations in nuclear proteins. We considered cells from patients with Hutchinson-Gilford progeria syndrome (HGPS) with an altered nucleoskeletal protein; a mouse model of XFE progeroid syndrome caused by a deficiency of ERCC1-XPF DNA repair nuclease; and patients with Werner syndrome (WS) lacking a functional WRN exonuclease and helicase protein. Using feature space analysis, including circularity, eccentricity, and solidity, we found that XFE nuclei were larger and significantly more elongated than control nuclei. HGPS nuclei were smaller and rounder than the control nuclei with features suggesting small bumps. WS nuclei did not show any significant shape changes from control. We also performed principal component analysis (PCA) and a geometric, contour-based metric. PCA allowed direct visualization of morphological changes in diseased nuclei, whereas standard, feature-based approaches required pre-defined parameters and indirect interpretation of multiple parameters. Both methods yielded similar results, but PCA proves to be a powerful pre-analysis methodology for unknown systems.
Clinicians, histologists, biologists, and other researchers have used morphological information of different biological structures for diagnosis and mechanistic information. Specifically, nuclear morphology has played a significant role in cancer histology for decades.1,2 Recently, computational image analysis and machine learning techniques have been utilized to enhance diagnostic imaging.3 Computational analysis of nuclear features aids in diagnosis of cancer4-7 while highlighting cellular phenotypes that are characteristic of tumor cells.8-10 Characterization of nuclear morphological features may provide insights into mechanisms of diseases as well as in cellular development.11
Nuclear morphology changes in several tissues as organisms age.12,13 Furthermore, nuclear morphology abnormalities are common in progeria syndromes, which are diseases of accelerated aging. Examination of nuclear morphology may be used to examine progression of the disease14 and help to evaluate therapies.15 There are several premature aging disorders caused by mutations in nuclear proteins. Here, we study Hutchinson-Gilford progeria syndrome (HGPS), Werner syndrome (WS), and primary cells from Ercc1−/− mice that model XFE (xeroderma pigmentosum type F-ERCC1) progeroid syndrome.16 HGPS is a segmental premature aging syndrome caused by a mutation in LMNA, which codes for the nucleoskeletal structural proteins including lamin A and lamin C.17 Patients with HGPS develop symptoms by two years of age and show extensive systemic phenotypes including osteoporosis, osteoarthritis and cardiovascular disease, which ultimately causes death in the early teens.18 WS, also known as adult progeria, is a premature aging syndrome caused by loss of the WRN helicase and exonuclease, critical for telomere function and replication stress.19 WS symptoms develop after puberty, patients look much older than their chronological age, and they die of cancer or heart disease in their late forties or early fifties.20 ERCC1-XPF is a structure-specific endonuclease involved in the repair of helix-distorting DNA lesions, interstrand crosslinks and some double-strand breaks.21-24 Mutations that cause reduced levels of ERCC1-XPF can result in either the skin cancer-prone disorder xeroderma pigmentosum (XP), cerebro-oculo-facio-skeletal syndrome or a disease of accelerated aging (progeria).16,25 Nuclei from patients with HGPS,26 WS27 and Ercc1−/− mice28 have all been reported to have altered morphology.
Although the molecular defects of these diseases are well-established, the mechanisms by which they lead to cellular phenotypes or systemic aging are poorly understood. At the nuclear level, morphological phenotypes are typically described qualitatively as “dysmorphic” or “blebbed” and are quantified in a binary fashion as the presence or absence of a particular characteristic, often scored by eye. Numerical, feature-based approaches (including size, circularity, eccentricity, etc.) quantify differences between forms by pre-defined numerical features.29 However, these features must be chosen ahead of time, and composite images are hard to reconstruct from these features. Here, we also view the entire shape as a morphological exemplar constructed in geometric space.30-32 In this approach, little or no information is lost and the entire image information can often be utilized for computation. Direct mapping from one form to another allows direct comparison of morphological changes. We applied a contour-based metric and principal component analysis to produce direct, visual comparisons of nuclear shapes between the three aging disorders.
In this study, we analyze nuclear shapes using large sample sizes, automated segmentation and computational analysis of nuclear images to compare and quantify changes in cells from progeria patients and mice to better understand nuclear dysmorphisms associated with accelerated aging. Ultimately, the characterization of nuclear deformation in premature aging diseases may enable diagnosis or classification of newly identified aging disorders by simple comparison with an image database. High throughput analysis of nuclear shapes characteristic of the disease may provide a quantitative endpoint for screening therapeutic drugs or measuring disease progression. Also, we may be able to suggest mechanisms of cellular aging based on monitoring progressive nuclear structural changes associated with these diseases.
The sample size for each group was dependent on the number of images available and segmentation quality. We used an automated segmentation process, which did not bias segmentation to the human eye and significantly reduced analysis time to under 1 h for hundreds of images (see Methods). User input was only used to confirm the segmentation of each image to avoid overlapping nuclei or blurred images (see Methods and Fig. 1).
We first analyzed the images using precisely defined shape parameters, most of which are dimensionless. Because the features were pre-determined, this analysis is performed in “feature space” and is referred to here as feature space analysis (FSA). With our automated segmentation, we were able to perform rigorous FSA in a relatively short time. Most commercial image analysis software programs, as well as ImageJ, perform feature space analysis. With this methodology, dimensionless shape parameters are compared across many groups, but the actual dysmorphic shapes must be inferred from multiple features.
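As an illustration of the kind of pre-defined features involved, the following minimal sketch computes two of them from a closed polygonal contour. The formulas used here (circularity as 4πA/P², solidity as area divided by convex hull area) are commonly used definitions and are an assumption on our part; the study's own definitions are those listed in its Table 2. All function names are hypothetical.

```python
import math

def polygon_area(pts):
    """Area of a closed polygon (shoelace formula); pts are ordered (x, y) vertices."""
    n = len(pts)
    s = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def polygon_perimeter(pts):
    """Sum of edge lengths, including the closing edge."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def circularity(pts):
    """4*pi*A / P^2: equals 1 for a circle and decreases for irregular outlines."""
    return 4.0 * math.pi * polygon_area(pts) / polygon_perimeter(pts) ** 2

def solidity(pts):
    """Area divided by convex hull area: drops below 1 when the outline is invaginated."""
    return polygon_area(pts) / polygon_area(convex_hull(pts))
```

In this sketch an invagination lowers solidity, while many small bumps mainly lengthen the perimeter and therefore lower circularity, which is why several features must be read together to infer a shape.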
In cells from Ercc1−/− murine model of XFE progeroid syndrome, circularity, perimeter and eccentricity of the nuclei were statistically different from control cells from a normal littermate, but solidity was similar to the control (Fig. 2A). On average, XFE nuclei were more elongated and had a greater perimeter than their control set. Since the increase in perimeter was much greater than the difference in elongation, an increased perimeter may be partly from an increase in size, as well as from elongation.
Nuclei in cells cultured from HGPS patients were less solid, less elongated, more circular and had a smaller perimeter (Fig. 2B). Based on these results, HGPS nuclei were smaller, invaginated, and rounder than the control group. HGPS nuclei were more likely to have many small blebs rather than a few big ones, but the difference in perimeter was greater than the difference in solidity, indicating that a large number of small blebs significantly increase the perimeter without adding much concave area.
In comparison to the nuclei of Ercc1−/− mice and HGPS patients, nuclei from patients with WS did not exhibit any noticeable differences from the corresponding control nuclei (Fig. 2C). While WS is an aging disorder associated with nuclear abnormalities, it did not cause a statistically significant deformation in the nucleus, according to the FSA of large numbers of nuclei.
As we examined feature space shape parameters of XFE and HGPS cells, we observed that the control groups of these diseases were similar to one another. Although the sizes (normalized perimeter) of the control nuclei were significantly different due to species differences (mouse vs. human cells), other parameters of the control groups had statistically similar values. However, each disorder was completely unique in its deformation: XFE nuclei were characterized by elongation and an increase in size, whereas HGPS nuclei were characterized by multiple small blebs, which caused the nuclei to be smaller and rounder.
The FSA described above has been reproducibly used to obtain relevant biological information from image data, but it assumes that the chosen set of features includes information relevant to analyzing the data. An alternative is to use a geometry-based approach with the entire contour information from each nucleus obtained from the segmentation and pre-processing steps described above. Geometric analysis compares variation in coordinate locations, with respect to a reference set of coordinates (Fig. 3). First, for each segmented nuclear contour, all the points along this contour are converted to a polar coordinate system with respect to the center of mass, and points are sampled with equal angle intervals. Each nucleus in a set (including both disease and control) is thus defined by a set of (x, y) coordinates sampled in the polar coordinate system (Fig. 3, left). The corresponding points in each contour are then averaged to produce a representative average shape (Fig. 4, left). The coordinates of all the nuclear contours (again, both disease and control) are then analyzed with respect to this average using principal component analysis (PCA).
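The resampling and averaging steps described above can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes a star-shaped contour (each ray from the center crosses the boundary once), uses the mean of the contour vertices as a stand-in for the center of mass, and interpolates the radius linearly between neighboring contour points. The function names and the default of 64 angles are hypothetical.

```python
import math

def resample_polar(contour, n_angles=64):
    """Radius of the contour at n_angles equally spaced angles about its centroid."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    # (angle, radius) of every contour point, sorted by angle in [0, 2*pi].
    polar = sorted(
        (math.atan2(y - cy, x - cx) % (2 * math.pi), math.hypot(x - cx, y - cy))
        for x, y in contour
    )
    # Pad with wrapped copies so interpolation works across the 0 / 2*pi seam.
    first, last = polar[0], polar[-1]
    polar = ([(last[0] - 2 * math.pi, last[1])] + polar
             + [(first[0] + 2 * math.pi, first[1])])
    radii = []
    k = 0
    for i in range(n_angles):
        theta = 2 * math.pi * i / n_angles
        while polar[k + 1][0] < theta:
            k += 1
        (t0, r0), (t1, r1) = polar[k], polar[k + 1]
        w = 0.0 if t1 == t0 else (theta - t0) / (t1 - t0)
        radii.append(r0 + w * (r1 - r0))
    return radii

def average_shape(contours, n_angles=64):
    """Point-wise mean of the resampled radial profiles of several contours."""
    profiles = [resample_polar(c, n_angles) for c in contours]
    return [sum(p[i] for p in profiles) / len(profiles) for i in range(n_angles)]
```

Because every contour is sampled at the same angles, the i-th sample of one nucleus corresponds to the i-th sample of every other nucleus, which is what makes point-wise averaging and the subsequent PCA meaningful.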
In PCA, the corresponding (x, y) coordinates are analyzed to extract the main modes of variation for each sample from the average (Fig. 3) simultaneously for all coordinates. The purpose of utilizing this approach is to derive 2-dimensional “features” relevant for analyzing the phenotypes based on the data itself rather than trying to use a priori assumptions as in FSA. The principal modes (i.e., main variations from the average) are computed from these 2-dimensional graphs (Fig. 3, right) by finding vectors which best represent the data (Fig. 3, red lines). The degree of variation from the average, along the bisecting line, is used to quantify the degree of shape change, called the variance. Each principal variation from the average can be quantified to understand the main modes of variation present in the data (Fig. 4, left). The significant modes of variation can then be analyzed for their significance in finding a statistically significant difference between two sets of nuclei, the control and the disease. To that end, each nuclear contour is “projected” onto the direction of a given mode, and the standard Student t-test can be used to measure significance between two groups. The benefit of this system is that multi-dimensional parameters can be added to the machine-learned sorting including fluorescence intensities and distributions of intensities. The algorithm is able to determine the metrics of sorting as well as when information is not sufficiently different to allow statistical certainty of the sorting.
To perform PCA of the nuclei, as described above, the control and disease nuclei of each group were first analyzed together to provide an average nuclear shape (Fig. 4, red box around the averages). XFE, HGPS and WS nuclei, as well as their control analogs, all had similar average elliptical shapes with one pole slightly narrower than another, like an egg. The similarity in average nuclear shape may reflect that all cell types were fibroblasts, and the average shape of WS and HGPS were the most similar because they were both human fibroblasts. The PCA technique was then applied to show how the data set differs from the average. The first 8 modes of deformation typically were able to provide features describing how the sample group varied from the average shape. These modes were determined from two-dimensional shape variations calculated from every shape in the data set, and there is no pre-processing bias. In many cases the differences appear small, but the statistical difference is provided by the algorithm. We can comment qualitatively on the modes, but exact features cannot be interpreted from the shapes. This variance from the average was different for the disease and control samples (Fig. 4), and the distribution of shapes within this modal set was different in control and diseased nuclei (Fig. 5).
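To make the idea of a principal mode and a projection concrete, here is a minimal sketch that finds the first mode of a set of shape vectors by power iteration and projects a shape onto it. Power iteration is our choice for a dependency-free example; the source does not state which numerical method was used, and the function names are hypothetical. Each shape is assumed to be a fixed-length list of numbers (for example, resampled contour coordinates), and the shapes must not all be identical.

```python
import math
import random

def first_mode(shapes, n_iter=200, seed=0):
    """Return (mean shape, unit vector of the first principal mode).

    Power iteration on the sample covariance C = X^T X / (n - 1), applied
    without forming C explicitly. The sign of the mode is arbitrary.
    """
    n, m = len(shapes), len(shapes[0])
    mean = [sum(s[j] for s in shapes) / n for j in range(m)]
    centered = [[s[j] - mean[j] for j in range(m)] for s in shapes]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(m)]
    for _ in range(n_iter):
        # scores = X v, then w = X^T scores / (n - 1) = C v
        scores = [sum(row[j] * v[j] for j in range(m)) for row in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) / (n - 1)
             for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(shape, mean, mode):
    """Signed coordinate of one shape along a mode, measured from the mean shape."""
    return sum((shape[j] - mean[j]) * mode[j] for j in range(len(mean)))
```

Projecting every control and disease contour onto the same mode yields one number per nucleus, and those two lists of numbers are what a two-group test such as the Student t-test compares.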
In XFE nuclei, the disease group showed more variation in shape than the corresponding control group (Fig. 4A), suggesting altered nuclear shape could be a hallmark of the disease, possibly due to division defects. The first mode, related to size, suggested that the control group is both smaller and of more regular size than the disease group. This heterogeneity of size for XFE agreed with the FSA result, which was shown by higher standard error for all parameters. The second mode illustrated an elongation from the normal to the diseased nuclei but no thinning, similar to FSA. For modes greater than three, there was no significant difference between the disease and control groups (Fig. 5A).
Similar to XFE, the diseased group of HGPS showed a greater variance than the control group (Fig. 4B). The first mode suggested that the disease group is smaller and rounder. The second mode confirmed that the control group was more elongated. However, this mode did not also include differences in size, as it had in XFE. In the third mode, slight blebbing and invaginations were seen in the disease group (Fig. 5B). In WS, there was no significant difference between the control and the disease group in any modes or variance (Figs. 4C and 5C). This result agreed with the FSA results.
The abnormal nuclear phenotypes of HGPS fibroblasts become most obvious at late passages.14,26,33 To investigate changes in morphology over passage number in HGPS nuclei, we analyzed HGPS nuclei at three different passages (p13, p22 and p30). The general trend was that diseased nuclei were smaller and rounder than normal nuclei. In examining the first mode of each passage group, the control and diseased nuclei were similar at passage 13 (Fig. 6A) but showed significant deviation at passage 22 (Figs. 6B and C). By passage 30, there was little change from passage 22 (Figs. 5 and 6D). However, at late passages control nuclei were also showing variance. We were able to take this into account, but in FSA there is no way to “subtract out” changes associated with altered control morphology with increased passage number.
Numerous passages produced more dysmorphic behavior both in the HGPS cells and their controls. We calculated Δvariance, the difference between the disease and the control variances, for the first 8 modes (i.e., if the variance of the control is greater than the variance of the disease, Δvariance is negative). At early passage, the control group varied more. This could be because the HGPS nuclei only had a small number of dysmorphic nuclei at early passage. The averaging process of PCA among hundreds of images could not easily detect subtle and complex deformation. For later passages, the disease group had greater variance (Fig. 6D). For the first few modes, the late passage had the largest Δvariance.
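The Δvariance calculation can be written in a few lines. The sketch below assumes that the projection scores of each group on each mode are already available, and it uses the sample variance; whether the original analysis used the sample or the population variance is not stated, so that choice is an assumption. The function name is hypothetical.

```python
from statistics import variance

def delta_variance(disease_scores, control_scores):
    """Per-mode variance(disease) - variance(control).

    Each argument is a list over modes; entry k holds the projection scores
    of that group's nuclei on mode k. A negative value means the control
    group varied more than the disease group along that mode.
    """
    return [variance(d) - variance(c)
            for d, c in zip(disease_scores, control_scores)]
```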
Here we used automated segmentation and both feature space analysis and geometric analysis with PCA to compare differences among HGPS and WS patient cells, along with cells from the Ercc1 knockout mouse model of accelerated aging. Segmentation enabled quick analysis of large sets of data through an automated process but still allowed manual correction to ensure the quality of segmentation. The quality of segmentation was dependent upon the quality of imaging and preparation. However, the high-throughput methodology may also produce bias in the system. The segmentation program was less likely to provide satisfactory results for complicated boundaries, and complex images may have been discarded. Also, large sample size resulted in small standard errors for samples, but severe minority events may have been lost. For example, nuclei from HGPS patients can show dramatic outward blebbing and dilation,26 but these events are a very small percent of nuclei. Thus, there is a benefit to both visualization by eye and high-throughput analysis.
There are several successful methods for analyzing the geometry of cellular shapes developed over the past two decades including point-based principal component analysis (the method we use), independent component analysis, Fourier descriptors, and several others.34 Generally, no method is superior for nuclear shape,34 so we chose PCA for simplicity and wide popularity in shape analysis from other biomedical applications including radiography. We analyzed the segmented nuclear shapes by the broadly used FSA as well as contour-based geometric approach. By using multiple cell types, the two methods to analyze nuclear morphology were directly compared and the advantages and disadvantages of the new contour-based imaging technique were determined. With a combination of contour-based geometric approach and PCA, shape changes were directly visualized from disease to control. We also computationally eliminated orientation and size differences and purely examined changes in nuclear morphology. However, the geometric approach was computationally intense, requiring a longer analysis time and more powerful hardware than FSA. Also, the contour-based approach with PCA was based on averaging shapes and gave results that were reflected in most dominant directions, so minor changes were not reflected. For example, features of the nuclear blebs associated with HGPS14,26,33 were not well captured with geometric analysis. PCA results are also difficult to compare with other published work without raw data.
Conversely, geometric analysis is well suited to be used in pre-analysis situations where visualization of average shapes and deviations of morphology may be useful to motivate further analysis. For example, an automated segmentation and geometric analysis can be applied as a global diagnostic tool especially when combined with high throughput genomic studies such as for drug studies or cells from the International Knockout Mouse Consortium (IKMC). Also, geometric analysis has the potential to include integration of multiple features, such as lamin concentration and chromatin heterogeneity.35
Two independent analytical techniques yielded similar conclusions regarding nuclear dysmorphology for cells from three distinct progeroid patients or mice and their matched controls. Nuclei of primary Ercc1−/− fibroblasts, from a mouse model of XFE progeroid syndrome, were similar in size to the control nuclei but were statistically elongated. This may reflect stiffening or reorganizing of the DNA inside cells, reminiscent of the hemoglobin polymerization and red cell sickling observed in sickle cell disorder. Loss of ERCC1 in mice results in reduced repair of DNA crosslinks, which may result in a global change in nucleoplasmic stiffness. Growing primary fibroblasts at 20% O2 induces cellular senescence.36 Hence the nuclear morphologic changes observed in Ercc1−/− fibroblasts were likely characteristic of senescent cells and could offer a rapid screening endpoint for quantifying senescent cells.
HGPS nuclei were smaller and rounder. This morphological change could be related to the over-accumulation of lamin proteins, which causes a lamina-dominated shape that is in equilibrium as a spheroid. Interestingly, nuclei from patients with WS, which is caused by loss of a functional DNA helicase, showed no statistical difference from the control. This suggests that the nuclear changes associated with altered helicase function do not affect global nuclear structure sufficiently to be detected by morphological imaging. However, we examined early passage cells in this study and it is possible that more divergence between control and WS cellular nuclei might be observed with increasing passage, as we reported for HGPS nuclei. Consistent with this, patients with WS develop premature aging features later in life compared with the Ercc1−/− mice, XFE progeroid syndrome patients, and patients with HGPS.20
The HGPS and control cells were human dermal fibroblasts HGADFN167 and HGADFN168, respectively, obtained from the Progeria Research Foundation at passage 9. HGADFN167 was taken from an 8.5-y-old male patient, and HGADFN168 was taken from the 40.5-y-old father of HGADFN167. The fibroblasts were cultured in Dulbecco's Modification of Eagle's Medium (DMEM; Invitrogen 11960–044) with 15% fetal bovine serum (FBS), 1% penicillin/streptomycin, and 2 mmol L-glutamine per foundation protocol.
The WS and control cells were human dermal fibroblasts AG00780 and AG11747, respectively, obtained from the Coriell Institute Cell Repository. AG00780 was from a male donor with confirmed C1336T mutations in both WRN alleles leading to a truncated protein lacking helicase activity. The control AG11747 cells were from a normal male donor. Both WS and control cell lines were from cultures that underwent a minimal number of population doublings, 8 and 16, respectively. Culture media and conditions were similar to the HGPS cell lines.
Ercc1−/− primary mouse embryonic fibroblasts were generated from day 12 to 15 embryos produced from crossing inbred C57BL/6 mice heterozygous for Ercc1 null allele and genotyped by PCR, both described previously.16,37 The cells were grown in F10 and DMEM (1:1) supplemented with 10% FBS and 1% penicillin/streptomycin at 5% CO2. At passage 5, the cells were plated on glass coverslips at a density of 2.5 × 10^4. The following day, the cells were fixed using 2% PFA for 10 min, rinsed with phosphate-buffered saline (PBS), and mounted with Vectashield with DAPI to label nuclei (Vector Labs, Burlingame, CA).
In preparation for imaging, the HGPS and WS cultured fibroblasts were fixed in a 3.7% solution of formaldehyde in PBS, permeabilized with a 0.2% solution of Triton-X 100 in PBS, and blocked using 2% bovine serum albumin (BSA) in PBS. To label Lamin A/C, the fibroblasts were incubated in the primary antibody solution of 1:100 (mouse monoclonal IgG, Santa Cruz Biotechnology, sc-7292) and then in the secondary solution of 1:200 (Alexa fluor 555 rabbit anti-mouse, Invitrogen). DNA was labeled using a 1:4000 1 mg/mL DAPI solution. The prepared cells were imaged on a Leica DMI 6000B fluorescence microscope at 63x (1.4 NA) with a Leica DFC350 camera. Red and blue channels represented Lamin A/C and DNA, respectively.
Segmentation codes were developed in Matlab based on the semiautomatic method described elsewhere.35,38,39 Briefly, the random field graph cut method was used to obtain a rough contour, which incorporated both region and boundary information of the image. Then an efficient level set active contour algorithm38 was applied to refine the contours obtained via graph cut.35,40,41 Finally, the segmented results were reviewed manually. The segmented result could be manually adjusted by dilating and eroding, and the satisfactorily segmented nuclei were manually selected for the analysis (Fig. 1). Before the analysis, the images were pre-processed as in our previous works39 to eliminate variations due to arbitrary rotation, translation, and coordinate inversions of each nucleus. The procedure included normalization by the center of mass, rotation by major axis reorientation, and coordinate “flips” set up within a least squares minimization problem.
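The first two normalization steps (centering on the center of mass and rotating the major axis onto a fixed direction) can be sketched as below. This is an illustrative simplification rather than the authors' Matlab procedure: the vertex mean stands in for the center of mass, the major axis is taken from the second moments of the contour points, and the coordinate "flip" step, which the text sets up as a least squares minimization, is not implemented. The function name is hypothetical.

```python
import math

def align_contour(contour):
    """Translate a contour to its centroid and rotate its major axis onto the x-axis."""
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    centered = [(x - cx, y - cy) for x, y in contour]
    # Second moments of the centered points.
    sxx = sum(x * x for x, _ in centered)
    syy = sum(y * y for _, y in centered)
    sxy = sum(x * y for x, y in centered)
    # Orientation of the major axis of the point distribution.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate by -theta so the major axis lies along x.
    return [(x * c + y * s, -x * s + y * c) for x, y in centered]
```

After this step a residual ambiguity remains (the shape can still be mirrored about either axis), which is the role of the flip step described in the text.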
During this decision making process, 1–50% of the segmentation results were discarded, based primarily on the quality of sample preparation and imaging. Specifically, image quality is dependent on magnification and numerical aperture, focusing on the plane of the nucleus, and uniformity of the cells: most images have numerous nuclei per image and some may be out of focus. Labeling of the nucleoskeleton, including the lamins (as a rim-stain), also allowed better segmentation of nuclei. The useable sample sizes determined after the decision making process are listed in Table 1. In the feature space analysis, all satisfactory segmentation results were included. However, in the principal component analysis, the random subsets of images were chosen to maintain equal sample sizes for the control and disease groups as to not bias the distributions.
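Drawing random subsets so that the control and disease groups have equal sample sizes can be sketched as follows. The function name and the fixed seed are hypothetical; the source does not describe how its subsets were drawn beyond their being random.

```python
import random

def balance_groups(control, disease, seed=0):
    """Randomly subsample both groups (without replacement) to the smaller group's size."""
    rng = random.Random(seed)  # fixed seed so the same subset can be redrawn
    n = min(len(control), len(disease))
    return rng.sample(control, n), rng.sample(disease, n)
```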
Segmented nuclear images were analyzed using a set of pre-programmed numerical features. The shape parameters used in the feature space analysis of this group of nuclei from aging disorders were solidity, circularity, normalized perimeter, and eccentricity. Definitions and equations of the parameters are listed in Table 2. The parameters were selected based on commonly used deformation modes in nuclear morphology.11,15,42 Features of the diseased group with the respective control were compared statistically using Student t-test unpaired with 2 tails. Samples with a Student t-test value of p < 0.01 were considered statistically different. More than 100 nuclei from at least 2 independent experiments, each, were pre-processed; an experiment is defined as independent fixation, labeling and imaging of cells within one passage of the replicate. No significant difference was noted between the analyzable nuclei from the independent experiments and data was combined. Samples were technical replicates since, for each disease type, only one disease cell line and one control cell line was imaged.
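For reference, the unpaired Student t statistic used to compare a feature between a disease group and its control can be computed as in the sketch below. The p-value helper uses a normal approximation to the t distribution, which is our simplification to avoid external dependencies and is only reasonable for large groups such as the more than 100 nuclei per group described here; an exact p-value would use the t distribution with the returned degrees of freedom. Function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def student_t(a, b):
    """Unpaired, pooled-variance Student t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2

def approx_two_tailed_p(t):
    """Two-tailed p-value from the normal approximation (large samples only)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))
```

With the threshold stated above, two groups would be called statistically different for a given feature when the two-tailed p-value falls below 0.01.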
Segmented images were also analyzed using a contour-based geometric approach43,44 to characterize the shape of the nuclei. As a result, each nucleus could be mapped to a point in a linear vector space $\mathbb{R}^m$, where m is the dimension of the linear space (see Wang et al. 2011 for more details). Principal component analysis (PCA) was applied to find the principal modes of variation for the data set. Nuclear images were analyzed by PCA to calculate deformation modes of the nuclei, and we also regenerated new samples along each mode to interpret the jth mode of variation in the data set, including the disease group and the control. By analyzing the data in this linear geometric space, we could visualize the principal modes of variation and interpret how the data were distributed for the different data sets.
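For readers unfamiliar with this construction, a standard textbook formulation of PCA modes and of samples regenerated along a mode is given below. The notation (shape vectors $x_i$, mean $\bar{x}$, covariance $S$, eigenpairs $\lambda_j, v_j$, coefficient $\alpha$) is assumed for illustration and is not taken from the original expressions, which should be consulted in Wang et al. 2011.

```latex
% Shape vectors x_1, ..., x_N in R^m (notation assumed for illustration)
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
S = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^{T},
\qquad
S\,v_j = \lambda_j v_j, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0 .

% A synthetic shape displaced from the mean along the j-th mode,
% with alpha measured in standard deviations of that mode:
x_j(\alpha) = \bar{x} + \alpha \sqrt{\lambda_j}\, v_j .
```

Plotting $x_j(\alpha)$ for a few negative and positive values of $\alpha$ shows how the average shape deforms along the jth mode.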
This work was funded partly by the NIH (ES0515052 to P.L.O., ES016114 to L.J.N. and S.Q.G., F30AG030905 to A.K. and GM088816 to G.K.R.), NSF (CBET-0954421 to K.N.D.) and the Progeria Research Foundation (to K.N.D.). S.C. wrote, refined and applied image analysis code and contributed to manuscript preparation. W.W. wrote and refined image analysis code and contributed to manuscript preparation. A.J.S.R. contributed to sample preparation, imaging, code refinement and manuscript preparation. A.K. contributed to sample preparation, imaging and manuscript preparation. S.Q.G. contributed to sample preparation, imaging and manuscript preparation. P.L.O. provided reagents and contributed to manuscript preparation. L.J.N. contributed to experimental design and manuscript preparation. G.K.R. wrote and refined image analysis code and contributed to manuscript preparation. K.N.D. oversaw the multiple PI project and contributed to manuscript preparation.
No potential conflicts of interest were disclosed.
Previously published online: www.landesbioscience.com/journals/nucleus/article/17798