The challenges involved in identifying the neuropathological substrates of the clinical syndrome recognized as schizophrenia are well known. Stereological sampling provides a means to obtain accurate and precise quantitative estimates of components of neural circuits, and thus offers promise of an enhanced capacity to detect subtle alterations in brain structure associated with schizophrenia. In this review, we 1) consider the importance and rationale for robust quantitative studies of brain abnormalities in postmortem studies of schizophrenia, 2) provide a brief overview of stereological methods for obtaining such measures, 3) discuss the methodological details that should be reported to document the robustness of a stereological study, 4) given the constraints of postmortem human studies, suggest how to approach the limitations of less robust designs, and 5) present an overview of methodologically sound stereological estimates from postmortem studies of schizophrenia.
The challenges involved in identifying the neuropathological correlates of the clinical syndrome recognized as schizophrenia have been evident since the time of Kraepelin. As in many areas of medical investigation, advances in understanding the neural substrates of schizophrenia are dependent on advances in technology that permit new types of measurements required to dissect complex disease processes. Indeed, the infamous proclamation that (1) “schizophrenia is the graveyard of the neuropathologist” has now turned into a field of opportunities due to advances in a number of methodologies. One of these, the introduction of stereological sampling, has made available the types of accurate and precise measures of neuronal features that are required to detect subtle alterations in the structure of brain circuits. The dynamic field of stereology includes a continuous development of new methods as well as refinements of existing methods and improvements in their application. A thorough review of all the stereological techniques available for postmortem brain research is beyond the scope of this article and we refer the reader to recent review texts (2–6). We provide the rationale for robust quantitative studies of the brain abnormalities in schizophrenia, a brief overview of how to obtain such robust measures using stereological methods, a discussion of the methodological details needed to document a robust stereological study design, and suggestions for how to approach the limitations of less robust designs.
Structural brain changes associated with schizophrenia are typically not pronounced; for example, subjects with schizophrenia have an average reduction in brain weight of only 2% (7). Because the disease effect on any given brain element may be small, and even if large may be heterogeneously distributed within the brain region of interest, identifying such structural brain alterations requires robust quantitative data. Also, structural findings may easily be misinterpreted if based on biased methods that depend upon assumptions which may not be verifiable. Thus, obtaining robust quantitative data depends upon the use of unbiased methods which guarantee that individual measures are accurate; that is, on average, they estimate the true value.
Understanding the importance of unbiased methods requires knowledge of the exact meaning of unbiasedness (accuracy) and precision—see Fig. 1. Importantly, unbiasedness is an inherent feature of the methodological design and cannot be proven from the data alone: in contrast, precision can be directly observed from the scatter of the final data. Furthermore, bias may differ across subject groups despite identical, blinded, parallel processing of tissue specimens. For example, a systematic difference in shrinkage was detected in sections from subjects with depression relative to control subjects (8). Moreover, adding more subjects to an unbiased study makes the accurate assessment of the true group mean more precise, whereas in a biased study, adding more subjects to the study just makes the group mean more precisely inaccurate. Thus, when possible to implement, an unbiased study design substantially increases the information value of a study.
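The last point can be illustrated with a small simulation. The following is a minimal Python sketch, not an analysis from any cited study: the function name, the normally distributed per-subject error, and the 10% underestimation used to model a biased method are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_group_mean(true_value, n_subjects, cv, bias_factor=1.0, seed=0):
    """Return (group mean, standard error) of simulated per-subject estimates.

    bias_factor is a hypothetical multiplicative bias of the method:
    1.0 models an unbiased method, 0.9 a method that underestimates by 10%.
    """
    rng = random.Random(seed)
    estimates = [
        rng.gauss(true_value * bias_factor, cv * true_value)
        for _ in range(n_subjects)
    ]
    mean = statistics.fmean(estimates)
    sem = statistics.stdev(estimates) / n_subjects ** 0.5
    return mean, sem


if __name__ == "__main__":
    true_number = 100.0  # arbitrary units
    for n in (10, 100, 10_000):
        unbiased = simulate_group_mean(true_number, n, cv=0.15)
        biased = simulate_group_mean(true_number, n, cv=0.15, bias_factor=0.9)
        print(n, "unbiased: %.1f +/- %.2f" % unbiased, "biased: %.1f +/- %.2f" % biased)
```

In both cases the standard error shrinks as subjects are added, but only the unbiased method converges on the true value; the biased method converges on the wrong one.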
So-called design-based stereological methods are based upon unbiased principles of sampling and guarantee accuracy of the data to the degree that the unbiased principles are successfully implemented. Thus, design-based stereological studies are ideal for detecting small effects in heterogeneous structures and enhance clarity in data interpretation.
Another strength of design-based stereology is that the most common stereological techniques for estimation of volume and cell number allow for assessment of the method precision by determination of the coefficient of error (CE) (Table 1). Assessment of the method CE and its subcomponents (i.e., the amount of random error introduced by the various parts of the stereological procedure) allows for fine-tuning of the method for increased efficiency: can the number of sections and/or the number of cells or points counted be lowered without markedly decreasing the level of precision, thereby reducing effort and expense? Comparing the method CE to the observed coefficient of variation (CV; Table 1) across data from a group allows for documentation of the robustness of measures; that is, the study should be executed such that the method is sufficiently precise (i.e., the CE is sufficiently small) relative to the intrinsic biological variability so that the data can be interpreted in the context of the biological measure of interest (see also Table 1). As the biological variation often is relatively large across human subjects, and the differences between schizophrenia and comparison subjects typically are small, less precision within a subject may be acceptable. In contrast, the combination of accurate methods and sufficient sample size is crucial for obtaining a robust group average (9).
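The comparison of CE and CV can be made concrete with a short calculation. The sketch below assumes the variance decomposition commonly used in the stereological literature, in which the squared observed CV is the sum of the squared biological CV and the squared method CE (taken here as the root mean square CE across subjects). The function names and example values are hypothetical.

```python
import math


def biological_cv(observed_cv, method_ce):
    """Biological CV, assuming observed_cv**2 = biological_cv**2 + method_ce**2."""
    biological_variance = observed_cv ** 2 - method_ce ** 2
    if biological_variance < 0:
        raise ValueError("method CE exceeds the observed CV")
    return math.sqrt(biological_variance)


def method_variance_fraction(observed_cv, method_ce):
    """Fraction of the observed relative variance contributed by the method."""
    return method_ce ** 2 / observed_cv ** 2


if __name__ == "__main__":
    # Hypothetical group: observed CV of 20%, method CE of 8%.
    print(round(biological_cv(0.20, 0.08), 3))
    print(round(method_variance_fraction(0.20, 0.08), 2))
```

When the method contributes only a small fraction of the observed variance, counting more cells or points per subject adds little; adding subjects is usually the better investment.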
In summary, design-based unbiased stereological methods are needed to 1) avoid unverifiable assumptions; 2) obtain accurate estimates of group means required for both meaningful group comparisons within and across studies; and 3) allow for statistical analysis of the method precision via the CE.
Design-based stereological methods tend to focus on total quantities, like estimates of total number of neurons in a region of interest. In contrast, published studies of schizophrenia frequently report measures of density, such as the number of neurons per unit volume, which can be challenging to interpret. Because a density is a ratio, a change in density means that the numerator and the denominator changed to a different degree; for example, both could go up, but if the denominator increases to a greater degree, then the ratio goes down. Thus, using cell density as a proxy for cell number comes with the assumption of no change in reference volume across groups. This assumption may not hold, a pitfall referred to in stereology as “the reference trap” (10); that is, an observed increase in neuronal density could be due to an increased neuron number, but could also be due to a reduced reference volume with preserved neuron number, or due to a decrease in neuron number in combination with an even greater decrease in the reference volume. A classical example is the observation by Haug that brain tissue from young subjects shrank more during processing than tissue from older subjects. This caused an apparent decrease in cell density as a function of age, an effect that in preceding studies had been misinterpreted as a decrease in total neuron number with age (11,12). Another example of the limitations of density data is evident from our study of antipsychotic-exposed monkeys. Antipsychotic exposure was associated with an increased density of neurons and unchanged glial density; however, the antipsychotic-exposed monkeys actually had significantly smaller brain volumes, unchanged neuron number, and a reduced number of glial cells (13–15). Thus, using only the cell density measures would give the erroneous impression that antipsychotics induced a specific increase in neuron number.
It is also important to note that no difference in density only means that the ratio between numerator and denominator did not change; that is, similar reductions of the volume and cell number within the region of interest will lead to an unchanged cell density. For example, we observed that both the volume and total neuron number of the primary visual cortex were 22–25% smaller in a cohort of subjects with schizophrenia, with preserved neuronal density and cortical thickness (16).
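The arithmetic behind the reference trap is simple enough to sketch. In the Python example below, all volumes and neuron numbers are invented for illustration and are not data from the cited studies; the point is only that density, being a ratio, cannot by itself distinguish between the scenarios.

```python
def density(neurons, volume_mm3):
    """Numerical density (neurons per mm^3) as the ratio of total number to volume."""
    return neurons / volume_mm3


def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical groups (illustrative values only).
control = {"neurons": 50_000_000.0, "volume_mm3": 1000.0}
volume_loss_only = {"neurons": 50_000_000.0, "volume_mm3": 800.0}
parallel_loss = {"neurons": 37_500_000.0, "volume_mm3": 750.0}

if __name__ == "__main__":
    base = density(**control)
    for name, group in (("volume loss only", volume_loss_only),
                        ("parallel loss", parallel_loss)):
        d = density(**group)
        print(name,
              "density change: %+.0f%%" % percent_change(d, base),
              "number change: %+.0f%%" % percent_change(group["neurons"], control["neurons"]))
```

In the first scenario density rises by 25% although no neurons were gained; in the second, a 25% loss of neurons leaves density unchanged.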
In addition, in order for density measures to be accurate, the region of interest must be 1) available for sampling in its entirety; 2) well-defined, with borders that are recognizable for the complete region of interest; and 3) sampled such that all parts have a known (typically uniform) probability of being assessed. In addition, the object being counted must be reliably identified. Consequently, measuring the density of well-stained cells in an arbitrary central section can give rise to misleading density measures since the position and content of the arbitrary section may differ across subject groups due to systematic shifts in gyration pattern or in the shape of the region of interest. Thus, accurate estimates of both cell numbers and densities require that the entire region of interest is available for sampling.
The standard stereological method for estimating the volume of the region of interest is the Cavalieri method (4,17). To implement this method, the region of interest is first divided into slabs by parallel, equidistant, uniformly random cuts. Alternatively, the region of interest may be sectioned exhaustively, and with a random start a systematic set of sections is selected (e.g. every 100th section with the first section selected uniformly randomly among the first 100 sections). Second, the transected area of the region of interest is assessed in each sectioning plane or section, typically estimated by point counts (i.e. by superimposing a randomly oriented and positioned point grid, with a known area per point, and counting the number of points hitting the region of interest). The volume V of the region of interest may then be estimated as:

$$ V := T \cdot a \cdot \sum_i P_i $$

where T is the slab thickness or intersectional distance, a is the area per point, and P_i is the number of points hitting the region of interest in the i’th section, summed across all section planes/sections. Here and elsewhere “:=” indicates that the entity at the left is estimated by the expression to the right. If the method is based upon slabs, it is important to avoid over- or under-projection at the edge of the slabs: it is the cut surface of the slab (either top or bottom cut surface consistently across all slabs) that should form the basis for the point count, not the projection of the slab. The Cavalieri estimator is very efficient, in most cases leading to very precise estimates when counting only 200–400 points total across 8–12 slabs or sections (17,18).
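The Cavalieri estimate, the product of the section spacing T, the area per point a, and the summed point counts, can be sketched in a few lines of Python. The function name, the units, and the example counts are assumptions for illustration.

```python
def cavalieri_volume(points_per_section, spacing_mm, area_per_point_mm2):
    """Estimate volume as T * a * sum(P_i).

    points_per_section: point counts P_i, one per systematically sampled section
    spacing_mm: distance T between sampled sections (or slab thickness)
    area_per_point_mm2: area a associated with each grid point
    """
    return spacing_mm * area_per_point_mm2 * sum(points_per_section)


if __name__ == "__main__":
    # Hypothetical counts from 10 slabs; 300 points in total.
    counts = [8, 19, 27, 34, 41, 44, 40, 36, 29, 22]
    volume = cavalieri_volume(counts, spacing_mm=2.5, area_per_point_mm2=4.0)
    print(volume, "mm^3")
```

The example is sized to match the efficiency noted above: about 300 points across 10 slabs.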
Two major sampling designs are used for the estimation of cell number. The fractionator method (19,20) counts all cells in a known fraction of the tissue and multiplies the count by the inverse sampling fraction to estimate the total. That is, the region of interest is split into a number of smaller pieces and then a systematic, uniformly random sample of these is selected. This “split and sample” approach is repeated until the final sample is of a manageable size, allowing the observer to perform a complete cell count. The cell count is typically done using disector probes (see below). For example, the tissue containing a region of interest is split into blocks and every second block is sampled uniformly randomly; the sampled blocks are sectioned exhaustively and every 20th section is sampled with a random start among the first 20 sections; then 4% of the area of the selected sections is sampled systematically by a randomly positioned grid of unbiased counting frames (21); finally, each frame is sampled through the middle third of its thickness by optical disectors. If 200 cells were counted in the final disector sample, the total cell number can be estimated as: 2×20×25×3×200 = 600,000. In general, the total number, N, of cells in the region of interest is estimated as the product of the inverse sampling fraction for each sampling step multiplied by the cell count:

$$ N := \frac{1}{bsf} \cdot \frac{1}{ssf} \cdot \frac{1}{asf} \cdot \frac{1}{hsf} \cdot \sum Q^- $$

where bsf, ssf, asf, and hsf are the block, section, area, and height sampling fractions, respectively (see also Fig. 2); hsf = 1 if using physical disector probes (see below). The sum is across all sections of the disector count Q− for each section. Importantly, the method is robust to processing-induced shrinkage, and because the reference volume is not estimated, the region of interest need not have sharp boundaries, as long as all cells belonging to the region of interest are unambiguously recognized at the final level of counting (e.g. by a specific stain). Notice that the fractionator is not biased by an inhomogeneous distribution of cells in the region of interest (see Fig. 3). The observer is allowed to optimize the sampling at each level to reduce the method variability. The smooth fractionator is one such approach, which in many situations will improve the estimator precision substantially (22). See also the recently developed proportionator method (23,24).
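The worked example above (every second block, every 20th section, 4% of the section area, the middle third of the section thickness, 200 cells counted) can be reproduced with a short Python sketch. The function name is hypothetical, and exact fractions are used only to avoid floating-point rounding in the illustration.

```python
from fractions import Fraction


def fractionator_estimate(disector_counts, bsf, ssf, asf, hsf=Fraction(1)):
    """Total number = product of inverse sampling fractions times the summed count.

    Pass the sampling fractions as Fraction objects or strings such as "0.04";
    hsf defaults to 1, as for physical disectors.
    """
    sampled_fraction = Fraction(bsf) * Fraction(ssf) * Fraction(asf) * Fraction(hsf)
    return sum(disector_counts) / sampled_fraction


if __name__ == "__main__":
    total = fractionator_estimate(
        [200],                # summed disector count across sections
        bsf=Fraction(1, 2),   # every second block
        ssf=Fraction(1, 20),  # every 20th section
        asf="0.04",           # 4% of the section area
        hsf=Fraction(1, 3),   # middle third of the section thickness
    )
    print(int(total))
```

Because only sampling fractions and counts enter the estimate, no volume measurement is needed, which is why the estimate is unaffected by tissue shrinkage.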
A second approach to obtaining estimates of cell number is the NV × VRef design, also known as a disector design. In this approach, total cell number is estimated as the product of the numerical density, NV, of the cells and the reference volume VRef (i.e. the volume of the region of interest) (25). Typically, the latter is estimated by Cavalieri’s method, while the former is estimated from cell counts in a uniformly random sample of the tissue.
The cell density NV is estimated as follows for the physical and the optical disector, respectively (see the description of the two probes below):

$$ N_V := \frac{\sum Q^-}{n \cdot BA \cdot (a/p) \cdot \sum P} \quad \text{(physical disector)} $$

$$ N_V := \frac{\bar{t}_{Q^-}}{BA} \cdot \frac{\sum Q^-}{h \cdot (a/p) \cdot \sum P} \quad \text{(optical disector)} $$

where n is the number of sections separating the sampling and look-up sections, including the sampling section (for adjacent sections in the pair, n = 1); BA is the microtome or cryostat block advance (26), i.e. the setting of the calibrated sectioning device; a is the area of the counting frame; p is the number of points associated with the frame (often one, using the upper right corner or the center point); P is the number of frame points hitting the region of interest across a section; Q− is the disector count for a section; h is the disector height; and $\bar{t}_{Q^-}$ is the number-weighted mean section thickness (see Fig. 2 and (26)). The sums are across all sections for a subject.
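The NV × VRef calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the sampled volume is taken as disector height × (a/p) × ΣP, with the height equal to n × BA for physical disectors, and for optical disectors the ratio of mean section thickness to block advance is used to correct for z-axis shrinkage. Function names, units (mm), and all example values are hypothetical.

```python
def nv_physical(q_counts, p_counts, frame_area, points_per_frame, n_sections, block_advance):
    """Numerical density from physical disectors (disector height = n * BA)."""
    sampled_volume = n_sections * block_advance * (frame_area / points_per_frame) * sum(p_counts)
    return sum(q_counts) / sampled_volume


def nv_optical(q_counts, p_counts, frame_area, points_per_frame,
               disector_height, block_advance, mean_thickness):
    """Numerical density from optical disectors, corrected for z-axis shrinkage
    by the ratio of number-weighted mean section thickness to block advance."""
    sampled_volume = disector_height * (frame_area / points_per_frame) * sum(p_counts)
    return (mean_thickness / block_advance) * sum(q_counts) / sampled_volume


def total_number(nv, v_ref):
    """Total cell number as numerical density times reference volume."""
    return nv * v_ref


if __name__ == "__main__":
    # Hypothetical subject: 150 cells counted, 100 frame points hitting the region,
    # 50 x 50 um frames, 10 um disectors, 40 um block advance, 20 um final thickness.
    nv = nv_optical([90, 60], [55, 45], frame_area=0.0025, points_per_frame=1,
                    disector_height=0.010, block_advance=0.040, mean_thickness=0.020)
    print(round(nv), "cells per mm^3")
    print(round(total_number(nv, v_ref=3000.0)), "cells in total")
```

Unlike the fractionator, this design inherits any error in the reference volume, so shrinkage between the two estimates must be avoided or accounted for.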
The sampling could be performed as fractionator sampling with the important difference that exact sampling fractions need not be known. This approach is also more robust to lost tissue than the fractionator and gives more freedom in designing the tissue sampling (27–30). However, the region of interest must have well-defined, detectable boundaries, and any shrinkage (or swelling) between the estimation of VRef and NV should be avoided or accounted for.
For both the fractionator and the NV × VRef design the final steps of cell counting are done using the disector, the stereological probe for unbiased counting (25). The disector exists in two versions: the physical disector and the optical disector.
The physical disector (25) consists of two thin sections a known distance apart—hence the name, disector or two sections. An unbiased counting frame (21) is superimposed upon one of the sections (sampling section) and a cell is counted if it is sampled by the counting frame—i.e. wholly or partly inside and not touching the exclusion lines—and if not present in the other section (look-up section). This counting rule ensures that cell size, shape and orientation do not bias the count—see Fig. 2 in (31) for an illustration of the problem. Just counting in one section is biased since larger cells are more likely to be present in a given section than are smaller cells. The physical disector requires that all cells be observable in at least one section, i.e. that no cell is lost completely in the sectioning process. Accordingly, the distance between the sections should be small enough such that no cell can hide undetected between sections and such that profiles from neighboring cells can be correctly distinguished. Typically, this means that section thickness should be maximally 25% of the diameter of the cells to be counted and that sections should be adjacent, although the combination of large cells and very thin sections means that sections do not need to be adjacent.
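Why single-section counts are size-biased while disector counts are not can be shown with a one-dimensional simulation. The Python sketch below is a toy model, not a description of any published procedure: cells are reduced to their extent along the sectioning axis, sections are idealized as planes, and all sizes and numbers are invented.

```python
import random


def count_cells(populations, depth, lookup_z, sampling_z, seed=0):
    """Compare single-section profile counts with physical disector counts.

    populations: list of (label, cell_height, n_cells)
    Each cell occupies [top, top + height) along z, with a uniformly random top.
    A cell is counted by the disector if it is present in the sampling plane
    and absent from the look-up plane.
    """
    rng = random.Random(seed)
    profile_counts, disector_counts = {}, {}
    for label, height, n_cells in populations:
        profiles = 0
        counted = 0
        for _ in range(n_cells):
            top = rng.uniform(0.0, depth)
            bottom = top + height
            in_sampling = top <= sampling_z < bottom
            in_lookup = top <= lookup_z < bottom
            if in_sampling:
                profiles += 1
            if in_sampling and not in_lookup:
                counted += 1
        profile_counts[label] = profiles
        disector_counts[label] = counted
    return profile_counts, disector_counts


if __name__ == "__main__":
    # Equal numbers of small (8 um) and large (24 um) cells; planes 2 um apart.
    populations = [("small", 8.0, 200_000), ("large", 24.0, 200_000)]
    profiles, disectors = count_cells(populations, depth=1000.0,
                                      lookup_z=500.0, sampling_z=502.0)
    print("profiles in one section:", profiles)
    print("disector counts:", disectors)
```

Although both populations are equally numerous, the large cells yield about three times as many profiles in a single section, whereas the disector counts for the two populations are approximately equal.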
The physical disector is conceptually a simple approach but can be challenging to implement since it requires the alignment of two physical sections at high magnification. However, recent advances in computerized stereology with automatic alignment of sections seem to have given the method a renaissance (6).
The more popular optical disector (19,26) uses thick tissue sections and a high numerical aperture oil (or water) immersion objective that permits focusing through the z axis of the section and counting cells directly as they appear in an unbiased counting frame. By generating a stack of perfectly aligned thin focal planes, the optical disector solves the alignment problem of the physical disector. However, sufficient guard zones are needed to avoid counting near the top and bottom section surfaces where split, torn out, and damaged cells may bias the count. Thus, the mounted, stained, and coverslipped sections need to be thick enough to allow for sufficient guard zones and a disector ideally at least 10 µm high. The latter is advisable because the focal plane of a high numerical aperture, oil immersion objective is ~0.5 µm thick, and a substantial number of focal planes are recommended. The size of the guard zones and the amount of z-axis shrinkage (i.e., in the thickness of the section) depend especially upon the embedding media (26). Compared to the thin sections of the physical disector, the thick sections of the optical disector allow complete inspection of cells in 3D and facilitate the morphological identification of cell types.
Among the many stereological methods that can be used to estimate individual cell volumes, the nucleator (32) and the planar rotator (33,34) are used most commonly. The optical rotator (35), volume estimation from point sampled intercepts (36), the selector (37), and the invariator (38,39) might be of interest to the neuroscientist in special settings.
All of these methods for cell size estimation require sections which are not only random in position but also in orientation. Two main types of stereological sections fulfill these requirements: Isotropic Uniform Random (IUR) sections (40,41), and Vertical Uniform Random (VUR) sections (42). However, generating randomly rotated sections in the brain may be a challenge (29,30).
Although beyond the scope of this paper, stereological methods exist for assessing a range of other parameters such as axonal or dendritic length (43,44); capillary length and number (45); the surface of the brain (27,46); and number of synapses (47,48).
Quantitative studies based upon immunohistochemical techniques have distinct challenges (49), in addition to those associated with obtaining sensitive and specific detection of the antigen of interest in postmortem tissue (50). A special requirement for quantitative immunohistochemical studies based upon thick sections is that the labeling penetrates the sections in their full thickness. With incomplete penetration, sampling is biased and the bias may systematically differ depending on the diagnostic group. When using the optical disector with immunohistochemistry, penetration of the labeling should be verified by a pilot calibration study in which a sufficient number of cells (> 400) are sampled in a systematic, uniformly random manner across a set of sections, recording for each cell its z-axis position and the local section thickness (49). Plotting the number of labeled cells as a function of distance between the top and bottom of the section will reveal whether the label fully penetrated the sections as well as the required size of top and bottom guard zones. Both the absolute and relative plots are informative (see Fig. 4 in (51)).
If the label did not fully penetrate the section, a dip in the number of stained cells is seen in the center of the section. To improve penetration of the stain, we have successfully used fluorescently labeled secondary antibodies without a tertiary cascade system (15). Another obvious solution is to use thin sections in a physical disector-based design (6), since it does not depend on z-axis measures.
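The relative z-axis plot described above amounts to a histogram of each cell's depth divided by the local section thickness. The following Python sketch shows one way to tabulate it; the function name, the bin count, and the example measurements are assumptions, and a real calibration study would use several hundred cells.

```python
def relative_depth_histogram(cells, n_bins=10):
    """Count cells per bin of relative depth (0 = top surface, 1 = bottom surface).

    cells: iterable of (z_position, local_section_thickness) pairs in the same unit,
           with z_position measured from the top surface of the section.
    """
    counts = [0] * n_bins
    for z_position, thickness in cells:
        relative_depth = z_position / thickness
        if not 0.0 <= relative_depth <= 1.0:
            raise ValueError("z position lies outside the section")
        # A cell exactly at the bottom surface is assigned to the last bin.
        index = min(int(relative_depth * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical pilot measurements (um): (z position, local thickness).
    pilot = [(1.0, 20.0), (5.0, 20.0), (10.0, 20.0), (19.0, 20.0), (20.0, 20.0)]
    print(relative_depth_histogram(pilot, n_bins=4))
```

With enough cells, low counts in the outermost bins indicate how large the guard zones need to be, and a dip in the central bins indicates incomplete penetration of the label.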
Well-written scientific papers are expected to include sufficient methodological details and references to allow the reader to judge the rigor of the study as well as to perform a replication study. For stereological studies, we recommend that the items listed in Table 2 be reported as discussed briefly below.
“If you cannot define it, you cannot quantify it.” This axiom captures the need to include clear definitions of the region and structures (e.g. cells) of interest. Representative photomicrographs or drawings can be helpful but the text must include descriptions of “nontextbook” parts of the region of interest where it may be difficult to define the boundary. It may even be that the very definition of the region of interest turns out to be a study in itself—see e.g. (52).
As discussed above, unbiased estimation starts with unbiased—typically uniformly random—sampling of the region of interest. Therefore, a clear description of how the final sections were sampled from the region of interest is needed. Most often this is best done by including a diagram of the sampling procedure—especially if the design is more complicated than just exhaustive sectioning of the region of interest—see (16,29,30). The description should also include details about how the tissue was cut. For example, marked variations in slab thickness may reduce the precision of the final estimates (53,54), and an un-calibrated microtome or cryostat may unknowingly generate sections with an unexpected thickness and thus bias the final results.
Several key issues are important to report. The optical disector probe requires the use of a high numerical aperture oil or water immersion objective and a linear encoder (microcator) to monitor the z-movement of the stage; how this was accomplished should be reported. Whether a stereological software package and a computerized microscope were employed should be described since it can influence the methodological strength and limitations of a study.
Table 2 summarizes the test system constants and sampling parameters to be reported, including details of pilot calibration studies. Reporting these parameters is important for documenting the precision of the study as well as for serving as a guide for optimizing future studies of the region of interest. Notice that a pilot calibration study is typically warranted for documenting that the guard zone size and disector height were sufficient and optimal—see (49,55) as well as Fig. 4 in (51).
If possible, the mean CE for each of the reported stereological estimates should be included and contrasted with the observed variation in the final data. This information documents the degree to which method imprecision influenced the final data and how optimized the methods were—i.e. did the authors count enough, too much or too little? It is important to report how the CEs were assessed since the methodology for assessing the CE of the various stereological estimators is a field of ongoing research.
It is important to assess or at least discuss the potential impact of tissue shrinkage, as it is crucial for comparing size estimates across studies. For any size estimate, it is typically very difficult or impossible to completely correct for tissue deformation as shrinkage may differ across brain regions, across the various components of a given region (13,26), and across subject groups (8). However, documenting that only a moderate shrinkage was present and that the shrinkage did not on average differ across groups would form a robust basis for interpretation of the results. Also, it should be remembered that estimates of total cell numbers are robust to shrinkage if based upon rigorously implemented design-based stereological methods—most easily by using a fractionator approach (26).
Ideally, the results should include means, observed coefficient of variation and corresponding coefficient of error for each group. Also, it is often a good idea to report data in scatter plots instead of histograms as “one dot per measure” efficiently allows the reader to judge not only the mean and spread but also potential clustering or presence of outliers in the data.
It is important to report and discuss the potential impact of any deviation from the requirements of the used stereological methods. Examples of such limitations could be the use of sections without random orientation for cell size estimation or the impact of missing sections on cell number estimates. An illustration of both these issues can be found in (56).
Under-reporting may limit comparison to other studies. An example is the comparison of results of stereological studies of the human mediodorsal thalamic nucleus (MD) (56). We found that both total volume and total number of neurons of the MD differed substantially from study to study—see Fig. 11 in (56). A number of issues made it difficult to assess the cause of these differences: 1) the criteria for defining the region of interest as well as the neurons (including or excluding small neurons) probably differed among studies; 2) tissue and Nissl staining quality as well as the optics used may have limited the ability of some studies to distinguish small neurons from glial cells; 3) pilot calibration studies to determine and document sufficient size of the guard zones for the optical disectors were not used in most studies; 4) only our study used the number-weighted mean section thickness based version of the optical fractionator (26), which prevents biases due to uneven shrinkage in the section thickness; 5) none of the studies (including our own) kept track of total tissue shrinkage from fresh tissue.
While it is advisable to use a stereological approach if possible, investigators may sometimes have reasons for not doing so; the use of biopsies or archival material not sampled according to stereological principles is a common one. However, in such cases it is still recommended to employ unbiased stereological principles as much as possible and then carefully declare and discuss the limitations to prevent overstating the strength of the conclusions. Readers are then able to decide to what extent they have confidence in the results of the study. For example, if the complete region of interest is not available for sampling, or only a few arbitrary sections are available, then any obtained estimate will only be valid for the sections examined, not the whole region of interest. This limitation has important implications: 1) it is not possible to obtain estimates of total numbers, leaving any density estimate vulnerable to “the reference trap” (10); 2) any estimate comes with the assumption that the region of interest is homogeneous, i.e. the examined tissue is expected to look like what was not examined, an assumption that is difficult or impossible to verify without a complete stereological study. However, large differences across subject groups (e.g. a fivefold change in mean density of neurons per mm³ or spines per length of dendrite) within the examined tissue samples of a reasonably homogeneous region strongly suggest the presence of a structural difference. Unfortunately, changes in postmortem samples from subjects with schizophrenia are most often much more subtle. Also, in developmental or disease studies, the reference volume may undergo substantial nonlinear changes as a function of age and/or disease state (due to in vivo size changes as well as differential shrinkage during tissue processing), thus effectively limiting the interpretability of density data (e.g. neurons per mm³). However, comparison of the ratios of various densities (e.g. the number of boutons per neuron instead of the densities of boutons and neurons per mm³) may be interpreted with some care, assuming that the region of interest is reasonably homogeneous.
It is beyond the scope of the current paper to give a comprehensive review of the existing literature of quantitative postmortem schizophrenia studies. However, we do provide a compilation of stereological postmortem studies of schizophrenia that fulfill the following basic criteria:
For cell counting we required a true 3D count using disector probes—i.e. with guard zones when counting in (thick) single sections. For estimates of the volume of the region of interest (by Cavalieri’s principle), we did not require the individual section areas to be assessed by point counting but also accepted tracing and other forms of planimetry. For the probes that were employed, all relevant requirements had to be fulfilled, e.g. for the nucleator or rotator probes the sections had to be cut either IUR or VUR.
To identify relevant papers we searched PubMed (in March 2010) with a broad list of search terms leading to 1147 abstracts from which 319 papers of interest were selected. The papers were acquired and out of these 67 fulfilled our criteria. In addition, we included one reference (57) from a non-indexed journal. Thus, we are able to list stereological estimates from a total of 68 papers organized by brain region in Table 3–Table 6. Because total number estimates are the most robust, they are emphasized in the tables by bold text.
In the light of the current review, our stereological criteria for inclusion may appear lenient as we did not exclude papers based upon a general lack of methodological details. Thus, it is difficult to judge the robustness of some of the estimates in the lists due to methodological under-reporting. On the other hand, a range of otherwise informative quantitative postmortem schizophrenia studies did not make the list; for example, several of our own studies assessing the size of neurons in various cortical regions using the nucleator probe were excluded due to incomplete sampling of the region of interest and sections that were not randomly rotated. Also, if a study reported several estimates, only the estimate(s) fulfilling the above criteria were listed in the tables: e.g., if the reference volume was estimated by Cavalieri’s principle and the neuron number by fractionator-style counting but without guard zones (i.e., essentially by a 2D count), only the former estimate is included in the list.
We have sought to present a fair list of sound stereological estimates—not a comprehensive record of quantitative findings of varying methodological quality. When reading Table 3–Table 6 it should be taken into account that we have included all incremental publications from progressing studies spanning several papers to facilitate completeness of the method description despite some redundancy in the results. On the other hand, a number of studies assessing the volume of various brain regions in schizophrenia did not fulfill our criteria due to subtle methodological issues, and were therefore not included in the tables although they do report informative results (e.g., refs. 58–61).
A range of brain regions have only been targeted by one or a few studies to date. However, for regions examined in multiple stereological studies, the current findings appear to converge on the following conclusions, although it is important to note that these provisional conclusions are not based on meta-analyses or other objective assessments of the data:
Cerebral hemispheres (Table 3): The findings support a tendency towards a slightly reduced cortical volume, without any major global deficits in cortical cell number. The field is moving toward more targeted studies of specific cell types in specific cortical areas, including laminar analyses. Although such studies have detected schizophrenia-related cortical changes, more studies are needed before a consensus can be reached.
Thalamus (Table 4): A number of earlier studies found reduced volume and total neuron number in the mediodorsal thalamic nucleus. However, these findings have not been consistently replicated in newer studies with larger sample sizes. The several studies that have targeted the anterior complex and the lateral geniculate nucleus have consistently failed to find any schizophrenia-related differences. In contrast, a number of studies have found reduced volume of the pulvinar in schizophrenia.
Limbic system (Table 5): A number of studies have found a reduced number of neurons in the lateral nucleus of the amygdala or in the nucleus accumbens. On the other hand, almost all studies of the hippocampus did not report schizophrenia-related differences.
Basal ganglia, cerebellum, and brainstem (Table 6): Several studies have not found general structural changes of the cerebellum in schizophrenia. The numerous studies of the striatum and pallidum report conflicting findings of increased or decreased volume and/or number of neurons.
The use of unbiased sampling designs can enhance the ability to correctly detect subtle alterations in brain structure in schizophrenia and other diseases by providing robust quantitative data. In reporting such studies, it is essential to describe the key elements of the experimental approach and to interpret findings in accord with what the design employed can, or cannot, tell you. Additional studies using such robust designs are needed to sensitively capture the nature of brain structural alterations in schizophrenia.
Studies by the authors described here were supported by National Institutes of Health Grants MH043784 and MH084053.