The kidney represents an excellent model system for learning the principles of organogenesis. It is intermediate in complexity, and employs many commonly used developmental processes. As such, kidney development has been the subject of intensive study, using a variety of techniques, including in situ hybridization, organ culture and gene targeting, revealing many critical genes and pathways. Nevertheless, proper organogenesis requires precise patterns of cell type specific differential gene expression, involving very large numbers of genes. This review is focused on the use of global profiling technologies to create an atlas of gene expression codes driving development of different mammalian kidney compartments. Such an atlas allows one to select a gene of interest, and to determine its expression level in each element of the developing kidney, or to select a structure of interest, such as the renal vesicle, and to examine its complete gene expression state. Novel component specific molecular markers are identified, and the changing waves of gene expression that drive nephrogenesis are defined. As the tools continue to improve for the purification of specific cell types and expression profiling of even individual cells it is possible to predict an atlas of gene expression during kidney development that extends to single cell resolution.
The developing kidney represents an excellent model system for the study of organogenesis. Formation of the adult metanephric kidney begins with the mutual inductive interactions of the metanephric mesenchyme, a condensed region of mesenchymal cells in the intermediate mesoderm, and the nephric (also called mesonephric) duct, an epithelial tube also formed from the intermediate mesoderm. The metanephric mesenchyme induces the outgrowth of the ureteric bud, which invades the metanephric mesenchyme and undergoes extensive branching morphogenesis, eventually giving rise to the kidney collecting ducts. In turn the branching ureteric bud induces the mesenchyme to condense, undergo a mesenchyme to epithelia transformation, and form the nephrons, the functional units of the kidney.
The process of nephrogenesis is quite remarkable. Although the progression from metanephric mesenchyme to nephron is continuous we can identify a number of discrete intermediate stages. The mesenchyme makes an epithelial ball, the renal vesicle, which clefts once to give a comma shaped body, and then again to form the S-shaped body. The S-shaped body fuses with the tip of the branching ureteric bud, grows, segments and gives rise to the nephron, with a renal corpuscle, including the glomerulus, at one end, followed by the proximal tubule, the loop of Henle and finally the distal tubule. For an excellent comprehensive review of kidney development see reference 1. In this report we focus on recent progress made using microarrays to better understand mammalian kidney development.
History teaches us that technology drives science. In situ hybridization procedures were first developed in the late 1960s,2 gene cloning methods in the early 1970s,3 DNA sequencing in the late 1970s,4 transgenic mouse technology in the early 1980s,5 and mouse gene targeting methods were first applied in the late 1980s.6 All provided radical new tools that resulted in important leaps forward in our understanding of the principles of developmental biology. It appears that progress is often more limited by technology than by a shortage of good ideas.
It is essential to appreciate the significance of the considerable power of newer technologies that globally define the expression levels of all genes. With microarrays or next generation DNA sequencing (RNA-Seq) it is possible to determine in a very sensitive and quantitative manner the transcription status of each gene. A universal and unbiased view is provided. We now have the unprecedented ability to characterize the complete gene expression codes that distinguish different cell types.
Many factors feed into the regulation of gene expression. These include the combinations of transcription factors expressed, growth factor interactions with cell receptors, DNA methylation status and histone modifications. Nevertheless, it is important to consider that the key objective of these many influences is to precisely control the activity levels of genes. The identity of a cell can be defined, in large part, by its gene expression state. The technologies that provide universal definitions of gene activity patterns are yielding new levels of understanding of the principles of organogenesis.
In the beginning microarrays were used to study the gene expression profiles of entire kidneys at different stages of development. Sanjay Nigam and colleagues were precocious in applying microarrays to the analysis of rat kidney development in 2001,7 and this was followed by similar work examining the changing gene expression patterns of the developing mouse kidney.8 These early studies provided a global view of the genes expressed in the developing kidney, identified changing profiles as a function of time and began to identify novel pathways and processes. Nevertheless, the obvious shortcoming of these studies was the absence of spatial information. A gene could be identified as expressed in the developing kidney but which compartment was expressing it? It is important to know, for example, if the gene is active in the stroma or the glomerulus.
Subsequent studies started to address this issue. Spatial definition began to be added by using manual microdissection, allowing the separation of E11.5 metanephric mesenchyme and E11.5 ureteric bud,9–11 as well as condensed E10.5 metanephric mesenchyme and flanking intermediate mesenchyme.9 In another interesting study microdissected metanephric mesenchyme was grown in organ culture, subjected to a variety of treatments and used for gene expression profiling via microarrays to better define the gene expression program of early nephrogenesis.12 In addition, Hoxb7-GFP transgenic mice were used to isolate ureteric bud cells,9 and Sall1-GFP mice to purify mesenchyme cells.13 Meanwhile, similar strategies were being used to better understand the development of other organ systems, including pancreas,14 inner ear,15 mammary gland16 and liver.17
To provide a more systematic approach the NIH established an international consortium of research groups charged with the task of producing a high quality molecular anatomy of the developing urogenital tract.18 This Genitourinary Developmental Molecular Anatomy Project (GUDMAP) uses several complementary strategies. High throughput in situ hybridizations, carried out in the laboratories of Melissa Little and Andy McMahon, have been used to describe the expression patterns of thousands of genes during kidney development. Parallel studies have used microarrays in conjunction with laser capture microdissection (LCM) or other purification schemes, to define global gene expression profiles of the multiple components of the developing kidney (see gudmap.org). In situ hybridizations are used extensively to validate microarray results, and in turn the microarray results often identify sets of interesting genes for in situ hybridization analysis, and serve to direct the production of GFP-Cre transgenic mouse tools for use by the kidney development research community.
LCM facilitated the isolation of most of the distinct structural components of the developing kidney, including renal vesicle, S-shaped body, proximal tubule, loop of Henle, stroma or interstitium, from the medullary and cortical regions and medullary, cortical and tip region collecting duct (Fig. 1). By combining LCM with manual microdissection and Crym-GFP mice, which allowed the separation of capping mesenchyme, 15 distinct elements from the developing kidney were purified, and their gene expression profiles defined with microarrays.19 It is important to note that there is a distinct synergy that derives from a more all-inclusive study, allowing cross comparison analyses of the many different compartments. The results provide a global definition of the gene expression codes driving kidney development, at microanatomic resolution.
Several interesting principles emerge. First, relatively few genes show strictly compartment specific expression. Out of the over seven thousand genes that exhibit interesting expression patterns during nephrogenesis, fewer than two hundred are highly restricted to one compartment. The presence of some genes with extremely localized expression does provide validation of the LCM purification process. Nevertheless, it is surprising that the vast majority of genes show quantitative, and not qualitative, differences in expression from one element of kidney development to the next. Instead of a digital, on/off code, it appears that there is more of an analog or rheostat code at work. There is an important quantitative component to the different gene expression codes that define the multiple compartments of kidney development. It is important to note, however, that as structures begin to terminally differentiate, thousands of genes with more restricted expression domains are activated. For example, the differentiating proximal tubules show specific expression of a number of solute transporters, and the ureteric smooth muscle, surrounding the exiting ureter, expresses a large number of genes specifically involved in muscle differentiation.
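The distinction between a digital (on/off) code and an analog (rheostat) code can be made concrete with a small sketch. The Python below is illustrative only: the detection threshold, the fold-change cutoff, the gene labels and the expression values are all hypothetical assumptions, not values taken from the atlas or from the analysis in reference 19.

```python
# Illustrative sketch: classify a gene's expression pattern across compartments.
# The thresholds (detect, fold) and all expression values are hypothetical.

def classify_pattern(levels, detect=100.0, fold=2.0):
    """levels maps compartment name -> expression level (arbitrary units)."""
    expressed = [c for c, v in levels.items() if v >= detect]
    if not expressed:
        return "not detected"
    if len(expressed) == 1:
        # On in one compartment, off elsewhere: the "digital" case.
        return "restricted"
    values = [levels[c] for c in expressed]
    if max(values) / min(values) >= fold:
        # On in several compartments at clearly different levels: the "analog" case.
        return "graded"
    return "uniform"

# Hypothetical genes measured in three compartments.
example_genes = {
    "gene_x": {"capping mesenchyme": 20.0, "renal vesicle": 1500.0, "S-shaped body": 30.0},
    "gene_y": {"capping mesenchyme": 400.0, "renal vesicle": 1200.0, "S-shaped body": 2500.0},
    "gene_z": {"capping mesenchyme": 500.0, "renal vesicle": 550.0, "S-shaped body": 600.0},
}

patterns = {gene: classify_pattern(levels) for gene, levels in example_genes.items()}
```

Under these assumed cutoffs, a gene detected in only one compartment is labeled restricted, while a gene detected in several compartments at clearly different levels is labeled graded. The second situation is the one described above as typical of most genes.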
As might be expected, the more related components of kidney development show more similar gene expression profiles. The expression pattern of the capping mesenchyme, for example, somewhat overlaps that of the renal vesicle, which in turn is related to that of the S-shaped body. Of special interest, in some cases this overlap in expression extends to genes thought to be specific markers for only one compartment. For example the capping mesenchyme can show expression of some genes, such as Bmp2 and Lhx1, generally considered markers of the renal vesicle. This could be referred to as anticipatory expression, since it reflects anticipation of the gene expression pattern of the subsequent developmental structure. This has been confirmed by in situ hybridizations to represent real gene expression in distinct cells of the component with anticipatory expression and not contamination (Patterson, unpublished data). Presumably this anticipatory expression helps to drive progression to the following stage of nephrogenesis.
A gene expression atlas can also be used to help define the genetic regulatory circuitry that drives nephrogenesis. As new waves of genes are elevated in expression during succeeding stages of nephron formation, their promoters can be examined for the presence of evolutionarily conserved transcription factor binding sites, which define candidate regulatory pathways. Use of this strategy (http://toppgene.cchmc.org/) identifies, for example, 21 genes (Scrn1, Hoxa10, Ldb2, Col23A1, Isl1, Gpr85, Hoxb1, Nrxn3, Eml1, Dpysl3, Sh3gl3, Sfrp1, Nin, Prrx1, Fzd2, Nr4a3, Nrxn1, Ca13, Grin3A, Lpl, Pcdh8) in the metanephric mesenchyme with proximal promoter positioning of one or more YATGNWAAT_V$OCT_C transcription factor binding sites, suggesting regulation by an Oct transcription factor. Statistical analysis determines that the p value for this observed prevalence of Oct binding sites in the promoters of genes with elevated expression is essentially zero. By performing a similar analysis of the various developing kidney compartments it is possible to begin to identify transcription factors driving nephrogenesis through regulation of specific sets of downstream targets.19
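The logic of such a binding site analysis can be sketched in a few lines of Python. This is a minimal illustration and not a description of how ToppGene works internally: the consensus YATGNWAAT is read with standard IUPAC codes (Y = C or T, W = A or T, N = any base), only the forward strand is scanned, evolutionary conservation is ignored, and over-representation is assessed with a simple hypergeometric tail probability. The promoter sequences and gene labels are hypothetical.

```python
import re
from math import comb

# Subset of the IUPAC nucleotide codes needed for the Oct consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))

def has_site(promoter, pattern):
    """True if the forward strand of the promoter contains at least one match."""
    return pattern.search(promoter.upper()) is not None

def hypergeom_tail(k, K, n, N):
    """P(X >= k): chance of seeing k or more site-bearing genes in a list of n
    genes drawn from N genes, of which K carry the site."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / total

OCT = consensus_to_regex("YATGNWAAT")

# Hypothetical proximal promoter fragments (not real sequences).
promoters = {
    "geneA": "GGCTATGCAAATCC",   # contains TATGCAAAT
    "geneB": "GGGGCCCCGGGGCC",   # no site
    "geneC": "aacatgtaaatgg",    # contains CATGTAAAT (lower case input)
}

genes_with_site = sorted(g for g, seq in promoters.items() if has_site(seq, OCT))
```

In a real analysis the background counts (N and K) would come from all genes assayed, and a correction for testing many binding site models would be needed. Both are outside the scope of this sketch.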
Another strategy for analysis of transcriptional regulatory pathways is to first screen for genes with the most compartment specific expression. For example, by using GeneSpring it is possible to identify a list of 91 genes with the most enriched expression in the E11.5 metanephric mesenchyme. These genes can then be subjected to transcriptional regulatory pathways analysis, again with GeneSpring, giving a diagram as shown in Figure 2. Some of the relationships are previously described, such as Six2 regulation of GDNF, but many others are novel. For example there is a very strong neurogenesis signature, including expression of GDNF, Isl1, Apoe, Slit3, Cxcl12, Chl1 and Gfra3, suggesting striking similarities between kidney and neuron development.
An LCM based atlas of developing kidney gene expression, however, still suffers from relatively low resolution spatial definition. The developing glomerulus can be captured, for example, but it still includes multiple distinct cell types. It would be better to provide the gene expression profile of each cell class within the forming glomerulus. This sort of single cell type resolution can often be achieved by using transgenic mice with a cell type specific promoter driving expression of a reporter gene, such as GFP. As more and more such transgenic reporter mice become available it is becoming possible to isolate more specific cell types. For example, it is now possible to break down the glomerulus into each of its constituents, the endothelial cells, mesangial cells and podocytes. The MafB-GFP mouse precisely marks podocytes. Even within the S-shaped body the precursors of podocytes are precisely indicated by this transgene (Fig. 3). In addition Tie2-GFP can be used to isolate the endothelial cells and Meis1-GFP marks mesangial cells. The Tie2-GFP and Meis1-GFP reporters, however, are also expressed in kidney cells outside of the glomerulus, making it necessary to first purify glomeruli, perhaps by sieving, then to enzymatically dissociate the glomerular cells and finally to purify the cells of interest through fluorescence activated cell sorting (FACS). In this manner it is possible to define the gene expression states of each kind of cell of the glomerulus, in both wild type animals and in animal models of kidney disease. Single cell type gene expression profiles made using transgenic promoter specific-GFP mouse lines to isolate podocytes, mesangial cells, cap mesenchyme, renal vesicles, juxtaglomerular apparatus and endothelial cells, in many cases for multiple developmental time points, are available at the website GUDMAP.org.
Even the use of transgenic GFP reporter mice, however, has limitations. So-called single cell types can often be subdivided into multiple subtypes upon further analysis. For example, either Six2-GFP or Crym-GFP reporter mice can be used to mark the capping mesenchyme progenitor cell population surrounding the tips of the branching ureteric bud.19 But it has been shown that this population of cells can be subdivided into induced and uninduced regions, with several differences in gene expression, as measured by in situ hybridization.20 Another example is the renal vesicle, a ball of epithelial cells that are morphologically indistinguishable. Nevertheless, in situ hybridizations have detected a distinct polarity, with the distal domain specifically expressing Lhx1, Dll1, Brn1, Dkk1, Jag1 and Bmp2, while the proximal region expresses Wt1 and Tmem100.19–24 The early metanephric mesenchyme represents yet another example of a diverse population of progenitor cells that are currently poorly understood. Most of these cells are from the intermediate mesoderm, but some are derived from the paraxial mesoderm.25 They give rise to the many distinct cell types of the kidney, excepting the collecting duct system. We are just beginning to understand the progressive lineage restrictions that these cells undergo, with Foxd1 and Six2 expression for example, demarcating an early stromal, nephron cell type bifurcation.26 It seems likely that an extensive analysis of the gene expression profiles of the individual cells of the early metanephric mesenchyme would lead to a much deeper understanding of the gene expression codes that drive these early developmental decisions. Single cell analysis procedures could also shed further light on the terminal differentiation processes of, for example, the formation of the multiple distinct cell lineages of the proximal tubule. 
Procedures for the analysis of gene expression in single cells have been in use for many years,27,28 and continue to be refined.29,30 Their application to the study of kidney development promises to be fruitful.
The current atlas of gene expression in the developing kidney does not include information regarding alternative processing of genes.19 This is a significant shortcoming, since over 90 percent of multi-exon genes are subject to alternative processing events according to current estimates,31,32 and different splice forms can have distinct or even opposite functions. Another gap in the current atlas is the absence of microRNA expression data. Both of these deficiencies could be addressed by using microarrays designed to examine exon specific gene expression and expression levels of microRNAs.
Another approach, however, would be to tap into the power of next generation DNA sequencing. By making cDNA copies of RNA and then sequencing them (RNA-Seq) it is possible to provide a digital readout of gene expression levels. This strategy offers many advantages over microarrays. First, there is no bias resulting from the sequences selected to be represented on a microarray. Second, there is essentially no background, because only sequences that can be aligned are used. Third, there is an almost unlimited dynamic range, estimated to cover about five orders of magnitude when using 40 million reads with the mouse genome.33 Microarrays, on the other hand, have detection problems with very low abundance transcripts because of background issues and can saturate, causing loss of linear response at high transcript levels, resulting in a dynamic range of only a few hundred-fold. In addition RNA-Seq provides excellent definition of alternative processing events, by giving a digital readout of individual exon expression levels and by sequencing exon-exon junctions. RNA-Seq can also be used to analyze microRNAs, including important editing events34 and both 5′ and 3′ end variations,35,36 which are invisible to microarrays. As the cost of DNA sequencing continues to plummet it is likely that RNA-Seq will replace microarrays as the method of choice for global gene expression analysis.
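The idea of a digital readout can be illustrated with a short Python sketch. It assumes that reads have already been aligned and assigned to genes, so that each gene has a simple integer count. The gene labels and counts are hypothetical, and counts per million (CPM) is used here only as one common way to normalize for sequencing depth; it is not a method specified in this review.

```python
from math import log10

def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}

def dynamic_range_orders(counts):
    """Orders of magnitude between the lowest and highest detected gene."""
    detected = [c for c in counts.values() if c > 0]
    return log10(max(detected) / min(detected))

# Hypothetical read counts per gene for one sample.
read_counts = {"low_gene": 2, "mid_gene": 1998, "high_gene": 200000, "silent_gene": 0}

cpm = counts_per_million(read_counts)
orders = dynamic_range_orders(read_counts)
```

Because the measurement is a count, the lowest detectable level is a single aligned read and the upper limit is set only by sequencing depth, which is why deeper sequencing extends the dynamic range.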
Dr. Rafi Kopan, Professor of Medicine and Developmental Biology, Washington University School of Medicine: You would open up Pandora's box by doing the single cell analysis because, as we discussed this morning, you have this gene noise, you have stochastic events, so what kind of statistical tools do you think you will need to differentiate between the majority behavior and noise and would it be in some ways better to look at populations of cells, some randomly chosen population of cells, instead of individual cells, groups of 10 or 20, whatever you want?
Dr. Potter: Yes, we thought about that question a lot, actually. First of all, you are absolutely right, when you perform single cell analysis there is more noise. There are multiple sources. First, not all cells are identical, even cells that we think belong to the same subtype. Of course this is one reason to examine single cells, to define individual differences and define novel subtypes. But, there is also just more inherent noise at the cellular level, with small numbers of molecules, in many cases only one or a few copies of a given transcript on average per cell, meaning some cells have a few and others none. And one must add to this the technical noise, as it is much more difficult to accurately define the gene expression profile of a single cell. We know that when you get down to five or ten picograms of total RNA and you apply these analysis procedures you don't get correlation coefficients that are 0.99, as is possible when examining greater quantities of RNA. Instead, they are in the range of 0.7 to 0.8. So there is noise in the analysis. Our take on this is that in order to get anything really meaningful out of it, you need to significantly increase the number of samples. For most of the studies I have described today we had really good correlation coefficients and we are talking about doing everything in triplicate, but I think when you get down to a single cell level, you have to forget about triplicate. It is too noisy to get an accurate analysis out of triplicate. So, let's say in an ideal world, where cost is no object, you look at a thousand individual cells. We could never afford to do that, but suppose you could. When you look at a thousand you are going to have the array patterns fall into distinct categories or bins and even though each array pattern is in itself going to be noisy, if you sum them together then we think that you're going to get a reliable view of the gene expression profile of that cell type. 
It is important to emphasize that even though the single cell analysis is not perfect, it is nevertheless extremely powerful and very capable of distinguishing different gene expression profiles and dividing them into distinct groups.
I think one very interesting approach would be to do it all by NexGen RNA-Seq. With NexGen, it is very easy to pool data sets, because you are just taking digital counts, so one just adds up the counts for the individual cells of a given subtype. One would find distinct cell types, for example, if one looked at a large number of cells from the cloud of metanephric mesenchyme. I don't know how many bins there would be, but there might be three: vascular, stroma and nephrogenic, as a very simple example. There might be more. If you had 100 single cell samples, perhaps more feasible than a thousand, and you divided them into three categories, then you could actually pool the RNA-Seq counts for the three different categories and get a very accurate read out of the gene expression profile of each cell type. By adding them together one would overcome the noise issues associated with the single cell samples.
You also asked if it might be better to examine pools of cells, perhaps ten or twenty. But it seems to me that this really defeats the purpose of single cell analysis. One is then looking at averages of many cells, and single cell distinctions would be very difficult to distinguish. The different category types would disappear. Does this make sense?
Dr. Kopan: Yes it does but it opens the question of how then do you correlate that with decision making? Just categorizing the different, let's call them options of gene expression that exist in the cloud, we still don't know what that option translates into in actual developmental potential.
Dr. Potter: I think we will have to sort of play it as we go a little bit. We can't predict exactly what we will find. If we find three distinct categories, for example and some cells are expressing Tie2 and some are expressing Foxd1 and some are expressing Six2, that would be a very simple world and we might be able to easily discern that some cells are determined to make vascular cells, some to make stromal cells and some to become the epithelial cells of the nephron. If we have more complicated results, like our very preliminary data suggesting simultaneous robust Six2 and Foxd1 expression in single cells, that suggests the existence of E11.5 metanephric mesenchyme progenitors that are undecided yet as to whether they are going to become epithelial or stroma.
Dr. Scott Boyle, Postdoctoral Fellow in Nephrology, Washington University School of Medicine: I understand that you can look at the gene expression profiles at P0 through P4 and get an idea of maybe the program that is responsible for the cessation of nephrogenesis. But I am also interested in another comparison, perhaps P2 versus E14.5. So is there a group of genes there that may identify what is responsible for self renewal in the mesenchyme, given that this is the part of the program that is shut off during cessation?
Dr. Potter: In more general terms we have compared, for example, the renal vesicle at P1 versus the renal vesicle at E12.5, and I didn't point it out on the heat map I showed, but it is quite interesting to me that at earlier developmental times there are global differences in gene expression compared to later times. For example, the earlier cells in all compartments seem to be more devoted to cell division and DNA synthesis compared to later compartments. A capping mesenchyme later is not going to be the same as earlier; a renal vesicle later is not the same as earlier. We thought going in they might be, but we see now that they are not. Your question, though, is whether, if we looked at, say, an E15 capping mesenchyme and compared it to P1–3, we could derive a deeper insight into what the differentiation program is. After all, early in development we have active renewal as well as differentiation, but later, at say P3, we have differentiation without renewal. We have not made those comparisons that you are suggesting. Our problem is that we are using two different arrays for those two different analyses. About a year ago, when we switched over to GFP single cell type analysis, we switched to a NuGEN target amplification method, away from an Epicentre system, and to an Affymetrix Gene 1.0 ST array, away from the old Affymetrix MOE430. We upgraded our technology, and it is difficult to make good array data comparisons when you are using different target amplification methods and different arrays. That would be a nice thing to do and my bioinformatics colleague, Bruce Aronow, might be able to do it, so perhaps I should ask him to try. But I think we can learn much the same thing by comparing the P1 data, immediately after birth, when renewal still is taking place, to the P3 data, when differentiation prevails.
Dr. Chen: Many people think that the Six2 population and the Foxd1 population do not overlap. Is this just because in situ hybridization is not sensitive enough to detect the low level of Foxd1 signal in these Six2 positive cells? One thing worth mentioning is that we looked at mice carrying Foxd1-Cre and ROSA-lacZ and found a small number of blue cells in tubular epithelia, which should be the derivatives of the Six2 population. We thought it could be ectopic expression or occasional repair by stroma cells. But this may also mean that some Six2 positive cap mesenchymal cells express Foxd1, as your preliminary results suggested.
Dr. Potter: I like that. This corroborates what we are seeing in our very preliminary single cell data.
Dr. Sanjay Jain, Assistant Professor of Medicine and Pathology and Immunology, Washington University School of Medicine: I have a few questions. One is a developmental question. So you commented about a few hundred specific genes that are compartmentalized, right, so have you gone further and looked at what might be driving the coregulated expression at upstream regions, or common transcription factors or microRNAs?
Dr. Potter: No, not really. What we have done is not focused on those couple of hundred genes but instead we looked at all of the genes that were highly upregulated in a specific compartment. Not just those few that were unique to a given compartment and not expressed anywhere else. Actually, we have a website that was made at Cincinnati Children's Hospital Medical Center, which I like a lot, called ToppGene (http://toppgene.cchmc.org/). You just drop in your list of genes, and it will perform this evolutionarily conserved transcription factor binding site—proximal promoter analysis. So, for each kidney development compartment we took the list of 1–200 genes with most compartment specific expression and used this website to look for the over representation of specific transcription factor binding sites in their promoters. It would, however, be interesting to re-examine the data the way you suggest, with focus on just the small number of genes with extremely specific expression within single compartments.
Supported by George M. O'Brien Center DK079333 and DK070251 (S.S.P.).
Previously published online: www.landesbioscience.com/journals/organogenesis/article/12682