Techniques for loading Ca2+-indicators into many cells have enabled recent imaging studies of the dynamics of hundreds of neurons and astrocytes (Gobel et al., 2007; Greenberg et al., 2008; Mrsic-Flogel et al., 2007; Nimmerjahn et al., 2009; Ohki et al., 2005; Orger et al., 2008; Stosiek et al., 2003). However, computational techniques for extracting cellular signals from Ca2+ imaging data lag behind and are mainly region of interest (ROI) analyses. These are typically manual (Dombeck et al., 2007; Gobel et al., 2007; Kerr et al., 2005; Niell and Smith, 2005), or semi-automated (Ozden et al., 2008) means of identifying cells and cannot be easily scaled to handle the largest data sets without undue human labor. Moreover, ROI analyses have largely been based on heuristic definitions of the morphology of specific cell types (Gobel et al., 2007; Ohki et al., 2005; Ozden et al., 2008), rather than general principles for decomposing a data set into constituent signal sources. Thus, current analyses are prone to cross talk in the signals extracted from adjacent cells and surrounding neuropil. The present mismatch between the capabilities for Ca2+ imaging and those for analyzing the data restricts the capacity to attain biological insights.
This situation partly resembles that of the early 1990s, when multi-electrode techniques were blossoming but standardized spike sorting algorithms had yet to arise. Today, automated spike sorting is widely used to assign spikes to individual cells (Fee et al., 1996; Lewicki, 1998) and has enabled key advances in understanding neural coding (Batista et al., 2007; Csicsvari et al., 1998; Meister, 1996). An automated procedure for extracting cellular Ca2+ signals would be a similar enabler of scientific progress. However, the challenges in devising such a procedure are distinct from those in spike sorting.
Spike sorting routines tend to rely on two basic ideas. First, the temporal waveforms for spikes from different cells are often sufficiently dissimilar to provide a basis for spike classification. Second, the activity of individual cells is often recorded on multiple electrodes, aiding assignment of spikes based on their relative amplitudes on different recording channels. Neither approach works well for imaging data. First, Ca2+ activity waveforms are strongly dictated by intracellular Ca2+ buffering and the dye’s binding kinetics (Helmchen et al., 1996), which do not provide strong signatures of individual cells’ identities. Second, single image pixels can contain a complex mixture of signals from neuropil, neurons, astrocytes, and noise. It is nontrivial to disentangle these signals without suffering cross talk and to find the shapes and locations of each cell. A guiding principle is needed to help extract cells’ locations and activities.
We formulated such a principle by considering that intracellular [Ca2+] can transiently rise ~100-fold above background levels during cellular events such as action potentials. Brief periods of elevated [Ca2+] are typically sparsely interspersed among many more background-dominated time frames. Sparseness also holds in the spatial domain if each cell occupies only a small subset of pixels. Thus, Ca2+ signals’ sparseness should be a general attribute that is quantifiable by simple measures, such as the skewness of amplitude distributions.
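As a rough illustration of why skewness can serve as a sparseness measure, the following sketch compares a noise-only trace with a trace containing a few brief transients. It is a toy model under stated assumptions, not the procedure used in this study: the function names `skewness` and `simulate_trace`, the exponential transient shape, and all amplitudes, decay constants, and event times are hypothetical choices made only for the example.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def simulate_trace(n_frames, event_frames, amplitude, decay, noise_sd, rng):
    """Toy trace: Gaussian baseline noise plus exponentially decaying transients.

    All parameters are illustrative assumptions, not values from the study.
    """
    trace = [rng.gauss(0.0, noise_sd) for _ in range(n_frames)]
    for start in event_frames:
        for t in range(start, n_frames):
            bump = amplitude * math.exp(-(t - start) / decay)
            if bump < 1e-3:  # transient has decayed to a negligible level
                break
            trace[t] += bump
    return trace


rng = random.Random(0)
n_frames = 2000

# Background-dominated trace: no events, only noise.
noise_only = simulate_trace(n_frames, [], 0.0, 10.0, 1.0, rng)

# Sparse trace: four brief transients among many background frames.
sparse_cell = simulate_trace(n_frames, [200, 750, 1300, 1800], 10.0, 10.0, 1.0, rng)

print("skewness, noise only :", round(skewness(noise_only), 2))
print("skewness, sparse cell:", round(skewness(sparse_cell), 2))
```

In this toy setting the noise-only trace has a nearly symmetric amplitude distribution, whereas the rare large transients give the sparse trace a long positive tail and hence a clearly positive skewness.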
This reasoning led us to an algorithm that estimates cells’ locations and activities by parsing data into a combination of statistically independent signals, each with a high sparseness. The algorithm requires no preconceptions of cells’ appearances and little user supervision, and it relies on independent component analysis (ICA) (Bell and Sejnowski, 1995; Brown et al., 2001; Reidl et al., 2007). ICA has been used previously for analyses of electroencephalography (EEG) (Makeig et al., 1997), magnetoencephalography (MEG) (Guimaraes et al., 2007) and functional magnetic resonance imaging (fMRI) (Beckmann and Smith, 2004; McKeown et al., 1998) data, but a challenge has concerned the physiological interpretation of the identified sources, which can be mixtures of signals from different recording channels or brain areas. We reasoned that for ICA analyses of Ca2+ imaging data, such interpretative issues should be much reduced, since cells’ properties can be corroborated by other experimental means, including in the same animals examined by imaging. In studies of human brain activity, corroborative data was much harder to obtain in living subjects.
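The idea of parsing mixed recordings into independent, sparse signals can be sketched with a deliberately small example. The code below is not the algorithm developed here. It is a two-channel toy ICA, assuming two sparse sources mixed linearly into two "pixels", that whitens the data and then searches over rotations for the one maximizing the summed squared skewness of the outputs. The mixing weights, event probabilities, and all function names are hypothetical.

```python
import math
import random


def center(x):
    m = sum(x) / len(x)
    return [v - m for v in x]


def skewness(x):
    x = center(x)
    n = len(x)
    m2 = sum(v * v for v in x) / n
    m3 = sum(v ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def correlation(x, y):
    x, y = center(x), center(y)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


def whiten_pair(x1, x2):
    """Gram-Schmidt whitening: zero-mean, unit-variance, uncorrelated outputs."""
    x1, x2 = center(x1), center(x2)
    n = len(x1)
    s1 = math.sqrt(sum(v * v for v in x1) / n)
    z1 = [v / s1 for v in x1]
    proj = sum(a * b for a, b in zip(z1, x2)) / n
    resid = [b - proj * a for a, b in zip(z1, x2)]
    s2 = math.sqrt(sum(v * v for v in resid) / n)
    return z1, [v / s2 for v in resid]


def rotate(z1, z2, theta):
    c, s = math.cos(theta), math.sin(theta)
    return ([c * a + s * b for a, b in zip(z1, z2)],
            [-s * a + c * b for a, b in zip(z1, z2)])


def unmix_by_skewness(x1, x2, n_angles=90):
    """Toy ICA: whiten, then grid-search the rotation with maximal squared skewness."""
    z1, z2 = whiten_pair(x1, x2)
    best = None
    for k in range(n_angles):
        # The objective repeats every 90 degrees, so [0, pi/2) suffices.
        theta = (math.pi / 2) * k / n_angles
        y1, y2 = rotate(z1, z2, theta)
        k1, k2 = skewness(y1), skewness(y2)
        score = k1 ** 2 + k2 ** 2
        if best is None or score > best[0]:
            best = (score, y1, y2, k1, k2)
    _, y1, y2, k1, k2 = best
    # ICA leaves the sign ambiguous; flip so that transients point upward.
    if k1 < 0:
        y1 = [-v for v in y1]
    if k2 < 0:
        y2 = [-v for v in y2]
    return y1, y2


def sparse_source(n, p_event, rng):
    """Mostly zero, with occasional positive events (illustrative assumption)."""
    return [rng.expovariate(1.0) if rng.random() < p_event else 0.0 for _ in range(n)]


rng = random.Random(1)
n = 2000
s1 = sparse_source(n, 0.05, rng)
s2 = sparse_source(n, 0.05, rng)

# Hypothetical mixing: each "pixel" records a weighted sum of both cells.
x1 = [0.8 * a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.3 * a + 0.9 * b for a, b in zip(s1, s2)]

y1, y2 = unmix_by_skewness(x1, x2)

# ICA also leaves the ordering ambiguous, so score the better of the two pairings.
match = max(min(abs(correlation(y1, s1)), abs(correlation(y2, s2))),
            min(abs(correlation(y1, s2)), abs(correlation(y2, s1))))
print("cross talk in raw pixel 1 (corr. with cell 2):", round(correlation(x1, s2), 2))
print("worst-case match after unmixing:", round(match, 2))
```

A grid search over a single angle only works for two channels. Real imaging data involve many pixels and many sources, where iterative ICA algorithms and a preceding dimensionality reduction would be needed. The toy is meant only to show that sparseness together with statistical independence can, in principle, separate mixed signals without any model of cell shape.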
Figure: Analytical Stages of Automated Cell Sorting
We validated our method using simulated movies mimicking Ca2+ imaging data acquired in cerebellar cortex. Our sorting procedure provided superior signal estimates and lower susceptibility to cross talk than reconstructions done by ROI analysis. We also tested our analysis on data recorded by two-photon microscopy in the cerebellar cortex of awake, behaving mice, from which we extracted Ca2+ signals from, in some recordings, more than 100 Purkinje cells and Bergmann glia combined.
To illustrate our method’s utility, we applied it to study the spatiotemporal organization of Purkinje cells’ Ca2+-spiking activity in behaving mice. We found that synchronously active cells cluster into neighborhoods ~7–18 cells across in the medio-lateral dimension. We identify these as cerebellar microzones, small patches of Purkinje cells receiving similar climbing fiber input (Andersson and Oscarsson, 1978). Our data revealed that microzones of awake animals have sharply delineated medio-lateral boundaries, to a precision of about a single cell.
We addressed the longstanding question of whether microzones have stable anatomical boundaries (Andersson and Oscarsson, 1978), or are dynamic entities whose cellular constituents vary across behavioral states (Lang et al., 1999; Welsh et al., 1995). We found that during mouse locomotion microzones’ spatial organization was unchanged from that in awake but resting animals, consistent with the idea that microzones are stationary anatomical units. These findings reveal basic features of cerebellar dynamics and highlight the impact of automated procedures for analyzing imaging data.