Here we propose to exploit high-throughput DNA sequencing to probe the connectivity of neural circuits at single-neuron resolution. Sequencing technology has not previously been applied to neural connectivity, but the approach has tremendous potential. Sequencing is already fast (sequencing billions of nucleotides per day is now routine) and, like microprocessor technology, is getting exponentially faster. Moreover, the cost of sequencing is plummeting: it currently costs less than $5,000 to sequence an entire human genome, and the race is on to reach the $1,000 genome. Thus, converting brain connectivity from a problem of microscopy to a problem of sequencing makes it tractable using current technology.
BOINC, the method we propose for converting connectivity into a sequencing problem, can be broken down conceptually into three components. First, each neuron must be labeled with a unique sequence of nucleotides, a DNA “barcode.” The requisite barcoding is conceptually similar, though different in detail, to the generation of antibody diversity by B cells in the immune system through somatic recombination. The idea of barcoding individual neurons is inspired by Brainbow, except that here DNA sequences substitute for fluorophores (XFPs). The advantage of using sequences is diversity: whereas Brainbow allows for at most hundreds of color combinations, a barcode consisting of even 20 random nucleotides can uniquely label 4^20 (about 10^12) neurons, far more than the number of neurons (<10^8) in a mouse brain.
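As a rough check on the diversity argument, the following sketch computes the size of the barcode space and estimates how often two neurons would end up with the same barcode. It assumes that every neuron draws its barcode independently and uniformly at random, an idealization that the text does not state; the function names are illustrative only.

```python
import math

def barcode_space(length: int, alphabet: int = 4) -> int:
    """Number of distinct barcodes of the given length over the given alphabet."""
    return alphabet ** length

def expected_colliding_pairs(n_neurons: int, n_barcodes: int) -> float:
    """Expected number of neuron pairs sharing a barcode (birthday-problem estimate).

    Assumes uniform, independent barcode draws (an assumption, not from the text).
    """
    return n_neurons * (n_neurons - 1) / (2 * n_barcodes)

def fraction_non_unique(n_neurons: int, n_barcodes: int) -> float:
    """Probability that a given neuron shares its barcode with at least one other neuron.

    Computes 1 - (1 - 1/N)^(n - 1) in a numerically stable way.
    """
    return -math.expm1((n_neurons - 1) * math.log1p(-1.0 / n_barcodes))

n_barcodes = barcode_space(20)   # 4^20 possible 20-nucleotide barcodes
n_neurons = 10**8                # upper bound on mouse brain neurons used in the text
print(n_barcodes)
print(expected_colliding_pairs(n_neurons, n_barcodes))
print(fraction_non_unique(n_neurons, n_barcodes))
```

Under these assumptions, on the order of one neuron in 10^4 would share its barcode with another neuron, so labeling would be unique for the vast majority of neurons rather than strictly all of them. Lengthening the barcode shrinks this fraction further.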
[Figure: Converting connectivity into a sequencing problem can be broken down conceptually into three components.]
Second, barcodes from synaptically connected neurons must be associated. One way to associate a pre- and postsynaptic barcode is by means of a transsynaptic virus such as rabies or pseudorabies (PRV). These viruses have evolved exquisite mechanisms for moving genetic material across synapses and have been used extensively for tracing neural circuits in rodents. To share barcodes across synapses, the virus must be engineered to carry the barcode within its own genetic sequence. After transsynaptic spread of the virus, each postsynaptic neuron can be thought of as a “bag of barcodes,” consisting of copies of its own “host” barcode, along with “invader” barcodes from presynaptically coupled neurons.
Finally, barcodes from synaptically connected neurons must be joined into single pieces of DNA for high-throughput sequencing. Barcodes are joined in vivo, so there is no need to isolate individual neurons prior to extracting DNA. Since only those pairs associated in vivo are actually joined, observing a host-invader barcode pair indicates that the host and the invader were synaptically coupled. For example, if upon sequencing we observe host barcode D with invader barcodes B and C, we can infer that neuron D is connected to neurons B and C.
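The inference step in the example can be sketched as a simple grouping of sequenced pairs. This is a minimal illustration, assuming each sequencing read has already been parsed into a (host, invader) barcode pair; the function name and the input format are hypothetical, and read errors are ignored.

```python
from collections import defaultdict

def infer_connections(pairs):
    """Group joined barcode pairs by host barcode.

    `pairs` is an iterable of (host, invader) tuples, one per sequenced
    read (a hypothetical, already-parsed format). Returns a dict mapping
    each host barcode to the set of invader barcodes observed with it.
    """
    partners = defaultdict(set)
    for host, invader in pairs:
        partners[host].add(invader)  # repeated reads of the same pair collapse
    return dict(partners)

# The example from the text: host D observed with invaders B and C.
reads = [("D", "B"), ("D", "C"), ("D", "B")]
connections = infer_connections(reads)
print(connections)
```

Using a set per host means that the number of reads supporting a pair is discarded; a counter could be kept instead if read depth were used as evidence.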
[Figure: Joining barcodes with phiC31 integrase.]
Since most neurons are only sparsely connected to other neurons in the brain (for example, in the mouse cortex a typical neuron is connected with perhaps 10^3 of its 10^8 potential partners), only a small subset of the potential host-invader barcode pairs will actually be observed. Thus, upon high-throughput sequencing, we can fill in the non-zero elements of the sparse connectivity matrix.
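To make the sparsity concrete, the sketch below stores only the non-zero elements of a connectivity matrix as coordinate pairs and computes the expected fill fraction from the figures quoted above. The coordinate layout (row for the invader, column for the host), the function names, and the input format are assumptions made for illustration.

```python
def build_sparse_matrix(pairs):
    """Build a coordinate-list representation of the connectivity matrix.

    `pairs` is an iterable of (host, invader) barcode tuples (hypothetical
    format). Returns (index, entries), where `index` maps each barcode to a
    row/column number and `entries` is the set of (invader_row, host_column)
    coordinates of the non-zero elements.
    """
    index = {}
    entries = set()
    for host, invader in pairs:
        for barcode in (host, invader):
            index.setdefault(barcode, len(index))  # assign numbers in order seen
        entries.add((index[invader], index[host]))
    return index, entries

def fill_fraction(n_neurons, partners_per_neuron):
    """Fraction of matrix elements expected to be non-zero."""
    return partners_per_neuron / n_neurons

index, entries = build_sparse_matrix([("D", "B"), ("D", "C")])
print(index, entries)
print(fill_fraction(10**8, 10**3))  # figures quoted in the text
```

With about 10^3 partners out of 10^8 candidates, only about one element in 10^5 is non-zero, which is why a coordinate list is far more economical than a dense matrix.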
[Figure: Beyond the abstract connectivity matrix.]
In its simplest form the sequencing approach yields only a connectivity matrix. Missing from this matrix are at least two kinds of useful information typically obtained with conventional methods based on microscopy: information about the brain region (e.g., primary auditory cortex, striatum, etc.) from which each barcode originates, and information about the cell type (e.g., dopaminergic, fast-spiking GABAergic, etc.) of each barcoded neuron. However, several strategies can be used to augment the connectivity matrix with both kinds of information. Thus, as sequencing-based connectivity analysis matures, it may generate a view of connectivity similar to that provided by traditional methods.
In summary, there are three technical challenges that must be overcome to map neural circuits using high-throughput sequencing: (1) barcoding each neuron, (2) associating barcodes from connected neurons, and (3) joining the barcodes prior to sequencing. We are developing an approach based on PRV amplicons. Although many technical problems, including PRV toxicity and monosynaptic spread, need to be addressed, this approach promises to offer a proof of principle for our proposal to convert connectivity into a sequencing problem.