Feature space analysis (FSA)
The sample size for each group depended on the number of images available and on segmentation quality. We used an automated segmentation process, which avoided the bias of manual, by-eye segmentation and substantially reduced analysis time to under 1 h for hundreds of images (see Methods). User input was used only to confirm the segmentation of each image, so that overlapping nuclei and blurred images were excluded (see Methods from ).
Figure 1. Automated pre-processing of nuclear images. (A) Raw images were collected with multiple fluorescence channels (see Methods): red and blue channels for Lamin A/C and DNA, respectively. (B) MATLAB code segmented the Lamin A/C channel using …
We first analyzed the images based on precisely defined shape parameters, most of which are dimensionless. With our automated segmentation, we were able to perform rigorous FSA in a relatively short time. This analysis was performed in the “feature space” since the features were pre-determined. Most commercial image analysis software programs perform feature space analysis (as does ImageJ). With this methodology, dimensionless shape parameters are compared across many groups, but the actual dysmorphic shapes must be inferred from multiple features.
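As an illustration of how such pre-determined shape features can be computed from a segmented contour, the following is a minimal sketch and not the code used in this study. It assumes each nucleus is available as an ordered (N, 2) array of boundary points, and it uses common textbook definitions: circularity as 4πA/P², solidity as area divided by convex-hull area, and eccentricity from the second moments of the boundary points. The function names are hypothetical.

```python
import numpy as np

def polygon_area(pts):
    """Area of a closed polygon (shoelace formula)."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def polygon_perimeter(pts):
    """Sum of edge lengths of the closed polygon."""
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.sum(np.linalg.norm(edges, axis=1)))

def convex_hull(pts):
    """Convex hull by Andrew's monotone chain, returned counter-clockwise."""
    p = sorted(map(tuple, pts))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(points):
        h = []
        for q in points:
            while len(h) >= 2 and cross(h[-2], h[-1], q) <= 0:
                h.pop()
            h.append(q)
        return h[:-1]

    return np.array(half(p) + half(p[::-1]), dtype=float)

def shape_features(pts):
    """Assumed definitions of the shape factors; the paper's may differ."""
    pts = np.asarray(pts, dtype=float)
    area = polygon_area(pts)
    perimeter = polygon_perimeter(pts)
    # Eigenvalues of the point covariance approximate the squared axis lengths.
    minor, major = np.linalg.eigvalsh(np.cov(pts.T))
    return {
        "perimeter": perimeter,
        "circularity": 4.0 * np.pi * area / perimeter**2,  # 1 for a circle
        "solidity": area / polygon_area(convex_hull(pts)),  # 1 if convex
        "eccentricity": float(np.sqrt(1.0 - minor / major)),  # 0 for a circle
    }
```

Under these definitions a round, smooth nucleus scores near 1 in circularity and solidity, while blebs or invaginations lower solidity and elongation raises eccentricity.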
In cells from the Ercc1−/− murine model of XFE progeroid syndrome, the circularity, perimeter and eccentricity of the nuclei were statistically different from those of control cells from a normal littermate, but solidity was similar to the control (). On average, XFE nuclei were more elongated and had a greater perimeter than their control set. Since the increase in perimeter was much greater than the difference in elongation, the increased perimeter may result partly from an increase in size, as well as from elongation.
Figure 2. Feature space analysis of nuclei in aging disorders. Segmented nuclei were analyzed for shape factors (), and perimeter was normalized to the average perimeter of the corresponding control group. (C and D) indicate control and …
Nuclei in cells cultured from HGPS patients were less solid, less elongated, more circular and had a smaller perimeter (). Based on these results, HGPS nuclei were smaller, invaginated, and rounder than the control group. HGPS nuclei were more likely to have many small blebs rather than a few big ones, but the difference in perimeter was greater than the difference in solidity, indicating that a large number of small blebs significantly increase the perimeter without adding much concave area.
In comparison to the nuclei of Ercc1−/− mice and HGPS patients, nuclei from patients with WS did not exhibit any noticeable differences from the corresponding control nuclei (). While WS is an aging disorder associated with nuclear abnormalities, it did not cause a statistically significant deformation in the nucleus, according to the FSA of large numbers of nuclei.
As we examined feature space shape parameters of XFE and HGPS cells, we observed that the control groups of these diseases were similar to one another. Although the sizes (normalized perimeter) of the control nuclei were significantly different due to species differences (mouse vs. human cells), the other parameters of the control groups had statistically similar values. However, each disorder had a distinct deformation: XFE nuclei were characterized by elongation and an increase in size, whereas HGPS nuclei were characterized by multiple small blebs, which caused the nuclei to be smaller and rounder.
Geometric approach and principal component analysis (PCA)
The FSA described above has been reproducibly used to obtain relevant biological information from image data, but it assumes that the chosen set of features includes the information relevant to analyzing the data. An alternative is a geometry-based approach that uses the entire contour of each nucleus, obtained from the segmentation and pre-processing steps described above. Geometric analysis compares variation in coordinate locations with respect to a reference set of coordinates (). First, for each segmented nuclear contour, all the points along the contour are converted to a polar coordinate system with respect to the center of mass, and points are sampled at equal angular intervals. Each nucleus in a set (including both disease and control) is thus defined by a set of (x,y) coordinates in the polar coordinate system (, left). The corresponding points in each contour are then averaged to produce a representative average shape (, left). The coordinates of all the nuclear contours (again, both disease and control) are then analyzed with respect to this average using principal component analysis (PCA).
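The resampling and averaging steps described above can be sketched as follows. This is a hedged illustration, not the implementation used here: it assumes each contour is an (N, 2) array of boundary points, that the center of mass can be approximated by the mean of those points, and that each contour is star-shaped about that center (one radius per angle). The function names and the default of 90 angles are hypothetical.

```python
import numpy as np

def resample_polar(contour, n_angles=90):
    """Radii of one contour sampled at equal angle intervals about its center."""
    contour = np.asarray(contour, dtype=float)
    # Assumption: the mean of the boundary points stands in for the center of mass.
    centered = contour - contour.mean(axis=0)
    theta = np.arctan2(centered[:, 1], centered[:, 0])
    radius = np.hypot(centered[:, 0], centered[:, 1])
    order = np.argsort(theta)
    grid = np.linspace(-np.pi, np.pi, n_angles, endpoint=False)
    # period=2*pi makes the interpolation wrap around the closed contour.
    return np.interp(grid, theta[order], radius[order], period=2 * np.pi)

def mean_shape(contours, n_angles=90):
    """Stack all nuclei (disease and control) and average corresponding points."""
    radii = np.array([resample_polar(c, n_angles) for c in contours])
    return radii.mean(axis=0), radii
```

The matrix of radii returned by `mean_shape` has one row per nucleus and one column per angle, which is the layout a subsequent PCA would expect.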
Figure 3. Schematic of principal component analysis (PCA) in geometric space. Two-dimensional shapes are assigned polar coordinates (x,y) so that many disparate shapes can be statistically compared on one graph. From this graph, the principal …
Figure 4. PCA of nuclei in aging disorders. Principal component analysis was performed on control and disease groups of each disease. The panels on the left show the average nuclear shape of both the disease and control groups (red box) and the …
In PCA, the corresponding (x, y) coordinates are analyzed simultaneously to extract the main modes of variation of each sample from the average (). The purpose of this approach is to derive 2-dimensional “features” relevant to the phenotypes from the data itself, rather than from a priori assumptions as in FSA. The principal modes (i.e., main variations from the average) are computed from these 2-dimensional graphs (, right) by finding the vectors that best represent the data (, red lines). The spread of the data about the average along each such line, called the variance, quantifies the degree of shape change. Each principal variation from the average can be quantified to understand the main modes of variation present in the data (, left). The dominant modes can then be tested for a statistically significant difference between two sets of nuclei, the control and the disease. To that end, each nuclear contour is “projected” onto the direction of a mode, and the standard Student's t-test is used to measure the significance of the difference between the two groups. A benefit of this system is that additional multi-dimensional parameters, including fluorescence intensities and distributions of intensities, can be added to the machine-learned sorting. The algorithm determines the metrics for sorting and also indicates when the groups are not sufficiently different to allow statistical certainty in the sorting.
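A minimal sketch of the mode extraction, projection and group comparison described above is given below. It is illustrative only and rests on stated assumptions: the shapes are stored as a matrix with one row per nucleus and one column per sampled contour coordinate, PCA is computed by singular value decomposition of the mean-centered matrix, and the t statistic uses the pooled-variance (equal-variance) form. The function names are hypothetical.

```python
import numpy as np

def pca_modes(X):
    """PCA of shape matrix X (rows: nuclei, columns: contour coordinates)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                      # average shape
    centered = X - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)        # variance along each mode
    scores = centered @ vt.T                   # projection of each nucleus
    return mean, vt, variances, scores

def student_t(a, b):
    """Two-sample Student t statistic with pooled variance."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled * (1.0 / na + 1.0 / nb))
```

For example, `student_t(scores[is_disease, 0], scores[~is_disease, 0])` would compare the two groups along the first mode, where `is_disease` is an assumed boolean label array. A p-value would follow from the t distribution with na + nb − 2 degrees of freedom, for instance via `scipy.stats.ttest_ind`.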
PCA analysis of nuclei
To perform PCA of nuclei as described above, the control and disease nuclei of each group were first analyzed together to provide an average nuclear shape (, red box around the averages). XFE, HGPS and WS nuclei, as well as their control analogs, all had similar average elliptical shapes with one pole slightly narrower than the other, like an egg. The similarity in average nuclear shape may reflect that all cell types were fibroblasts, and the average shapes of WS and HGPS were the most similar because both were human fibroblasts. The PCA technique was then applied to show how the data set differs from the average. The first 8 modes of deformation were typically able to provide features describing how the sample group varied from the average shape. These modes were determined from two-dimensional shape variations calculated from every shape in the data set, so there is no pre-processing bias. In many cases the differences appear small, but the statistical difference is provided by the algorithm. We can comment qualitatively on the modes, but exact features cannot be interpreted from the shapes. This variance from the average was different for the disease and control samples (), and the distribution of shapes within this modal set was different in control and diseased nuclei ().
Figure 5. Comparison of control and disease group using PCA. The distribution of control and disease groups in the first three modes of PCA is shown. The x-axis represents the variation from the average, and the corresponding nuclear shape is …
In XFE nuclei, the disease group showed more variation in shape than the corresponding control group (), suggesting altered nuclear shape could be a hallmark of the disease, possibly due to division defects. The first mode, related to size, suggested that the control group is both smaller and of more regular size than the disease group. This heterogeneity of size for XFE agreed with the FSA result, which was shown by higher standard error for all parameters. The second mode illustrated an elongation from the normal to the diseased nuclei but no thinning, similar to FSA. For modes greater than three, there was no significant difference between the disease and control groups ().
Similar to XFE, the diseased group of HGPS showed a greater variance than the control group (). The first mode suggested that the disease group is smaller and rounder. The second mode confirmed that the control group was more elongated. However, this mode did not also include differences in size, as it had in XFE. In the third mode, slight blebbing and invaginations were seen in the disease group (). In WS, there was no significant difference between the control and the disease group in any modes or variance ( and ). This result agreed with the FSA results.
Passage dependence of HGPS cells
The abnormal nuclear phenotypes of HGPS fibroblasts become most obvious at late passages.14,26,33
To investigate changes in morphology over passage number in HGPS nuclei, we analyzed HGPS nuclei at three different passages (p13, p22 and p30). The general trend was that diseased nuclei were smaller and rounder than normal nuclei. In examining the first mode of each passage group, the control and diseased nuclei were similar at passage 13 () but showed significant deviation at passage 22 (). By passage 30, there was little change from passage 22 (and ). However, at late passages control nuclei were also showing variance. We were able to take this into account, but in FSA there is no way to “subtract out” changes associated with altered control morphology with increased passage number.
Figure 6. Changes in nuclear shape in cells from HGPS patients with increasing passage. Shapes of HGPS nuclei were compared for multiple passages. For each passage, the distribution of nuclei in the first mode, variances of the first 8 modes, …
Repeated passaging produced more dysmorphic nuclei both in the HGPS cells and in their controls. We calculated Δvariance, the variance of the disease group minus the variance of the control group, for the first 8 modes (i.e., if the variance of the control is greater than the variance of the disease, Δvariance is negative). At early passage, the control group varied more. This could be because the HGPS group contained only a small number of dysmorphic nuclei at early passage, and the averaging process of PCA over hundreds of images could not easily detect such subtle and complex deformation. For later passages, the disease group had greater variance (). For the first few modes, the late passage had the largest Δvariance.
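The Δvariance calculation can be written compactly. The sketch below is an assumption-laden illustration: it takes a matrix of PCA scores (rows: nuclei, columns: modes) and a boolean label marking disease nuclei, both hypothetical names, and returns the disease variance minus the control variance for each of the first modes.

```python
import numpy as np

def delta_variance(scores, is_disease, n_modes=8):
    """Per-mode variance of the disease group minus that of the control group."""
    scores = np.asarray(scores, dtype=float)
    is_disease = np.asarray(is_disease, dtype=bool)
    disease = scores[is_disease, :n_modes]
    control = scores[~is_disease, :n_modes]
    # Negative entries mean the control group varies more along that mode.
    return disease.var(axis=0, ddof=1) - control.var(axis=0, ddof=1)
```

Computing this once per passage gives one Δvariance profile per passage, so that a change of sign between early and late passage can be read off mode by mode.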