Waxy varieties of the tetraploid cereal broomcorn millet (Panicum miliaceum L.) have endosperm starch granules lacking detectable amylose. This study investigated the basis of this phenotype using molecular and biochemical methods. Iodine staining of starch granules in 72 plants from 38 landrace accessions found 58 nonwaxy and 14 waxy phenotype plants. All waxy types were in plants from Chinese and Korean accessions, a distribution similar to that of the waxy phenotype in other cereals. Granule-bound starch synthase I (GBSSI) protein was present in the endosperm of both nonwaxy and waxy individuals, but waxy types had little or no granule-bound starch synthase activity compared with the wild types. Sequencing of the GBSSI (Waxy) gene showed that this gene is present in two different forms (L and S) in P. miliaceum, which probably represent homeologues derived from two distinct diploid ancestors. Protein products of both these forms are present in starch granules. We identified three polymorphisms in the exon sequence coding for mature GBSSI peptides. A 15-bp deletion has occurred in the S type GBSSI, resulting in the loss of five amino acids from glucosyl transferase domain 1 (GTD1). The second GBSSI type (L) shows two sequence polymorphisms. One is the insertion of an adenine residue that causes a reading frameshift, and the second causes a cysteine–tyrosine amino acid polymorphism. These mutations appear to have occurred in parallel from the ancestral allele, resulting in three GBSSI-L alleles in total. Five of the six possible genotype combinations of the S and L alleles were observed. The deletion in the GBSSI-S gene causes loss of protein activity, and there was 100% correspondence between this deletion and the waxy phenotype. The frameshift mutation in the L gene results in the loss of L-type protein from starch granules. The L isoform with the tyrosine residue is present in starch granules but is nonfunctional. 
This loss of function may result from the substitution of tyrosine for cysteine, although it could not be determined whether the cysteine isoform of L represents the functional type. This is the first characterization of mutations that occur in combination in a functionally polyploid species to give a fully waxy phenotype.
The evolution and diversification of crop plants has been shaped over thousands of years by conscious and unconscious human selection on a wide range of phenotypic traits. In plants cultivated primarily as a carbohydrate source, including cereals, loci that determine starch quality have been among the key targets for human modification (Whitt et al. 2002; Olsen et al. 2006). Plant starch comprises two polymers: amylose, a predominantly linear chain of α-1,4-linked glucosyl units, and amylopectin, a highly branched molecule with short α-1,4-linked glucosyl chains linked by α-1,6-linkages. Cereal endosperm starch typically comprises 15–30% amylose and 70–85% amylopectin. Amylose synthesis is catalyzed by granule-bound starch synthase I (GBSSI, EC 2.4.1.242), which extends the amylose molecule by transferring α-D-glucose from adenosine diphosphate (ADP)-glucose.
Mutations at the GBSSI (also called Waxy or Wx) locus that result in amylose being present in low levels or absent in endosperm starch (Fukunaga et al. 2002)—the waxy phenotype—have been characterized in several species. In barley, a 413-bp deletion in the promoter and 5′ untranslated region of the GBSSI gene disrupts transcription, reducing the production of GBSSI protein to a fraction of its levels in the nonwaxy type (Domon et al. 2002; Patron et al. 2002). In maize and in foxtail millet (Setaria italica [L.] P. Beauv), gene expression is disrupted by diverse transposable element insertions (Wessler and Varagona 1985; Wessler et al. 1990; Marillonnet and Wessler 1997; Kawase et al. 2005). In rice, a G→T mutation at the splice site of intron 1 results in incomplete processing of pre-mRNA (Hirano and Sano 1991; Wang et al. 1995; Hirano et al. 1998; Isshiki et al. 1998). In sorghum, a single-nucleotide polymorphism (SNP) has been identified, which causes a Glu to His mutation that may inactivate the GBSSI protein (McIntyre et al. 2008). In all the above diploid crops, the waxy phenotype has arisen by spontaneous mutations. In hexaploid bread wheat, a waxy line has been produced by crossing partially waxy cultivars with spontaneous mutations in the three GBSSI homeologues in the A, B, and D genomes (Nakamura et al. 1995). These have been characterized, respectively, as a 23-bp deletion in the Wx-A1b allele that affects transcript splicing or a 173-bp transposon insertion in an exon; a deletion of the entire GBSSI transcription unit in the Wx-B1b allele; and a 588-bp deletion in the Wx-D1b allele that leads to the deletion of 30 amino acids from the C-terminus (Vrinten et al. 1999; Saito et al. 2004). Distinct single-nucleotide insertions and deletions causing frameshift mutations have been identified as responsible for the independent origin of null Wx-A1 alleles in emmer wheat (Saito and Nakamura 2005).
With the exception of waxy strains of emmer and bread wheats, which have been generated by modern breeding programs, waxy varieties of cereals originated in east and southeast Asia, where they have been selected in response to a cultural preference for glutinous-type (i.e., waxy) starchy foods (Sakamoto 1996; Olsen and Purugganan 2002). Panicum miliaceum L. (broomcorn, proso, or common millet), a tetraploid species (2n = 4x = 36), is one of the world's oldest and historically most important domesticated cereals; recent data attest to its presence as early as 10,000 cal BP in northern China (Crawford 2009; Lu et al. 2009). Its probable center of domestication was in this region (although with possible independent domestications further west in Eurasia; Zohary and Hopf 2000; Jones 2004; Hunt et al. 2008), and there is currently considerable interest in its early cultivation and domestication, as revealed through diverse analytical techniques including stable isotopic analysis, phytoliths, and lipid biomarkers (Jacob et al. 2008; Barton et al. 2009; Lu et al. 2009). These factors make this crop an interesting species in which to investigate the selection of waxy mutants. Distinct terms indicating glutinous broomcorn millet are recorded in Chinese classical texts dating back some 2,000 years (Sakamoto 1996). Waxy millet therefore provides an ideal system to explore the biochemistry and molecular biology that lie behind ancient food-preference choices.
Previous work (Graybosch and Baltensperger 2009) has shown that the waxy trait in this tetraploid species is determined by recessive alleles at two loci. In the current study, we have provided the first characterization of these loci, obtaining DNA sequence for the region corresponding to the full length of the mature GBSSI protein for two GBSSI homeologues. We found evidence for the presence of these two forms of GBSSI in starch granules. At one locus, we have established a clear link between a genetic mutation and its biochemical effect of loss of GBSSI activity, leading to loss of amylose synthesis and waxy-type endosperm. At the other locus, we identified three alleles, of which two result in the loss of active GBSSI protein. It was not possible to determine whether the third allele represents the functional version of this homeologue, but consideration of the allele distributions and genotype combinations enables us to establish a hypothesis for the evolution of the waxy phenotype in this species.
Accessions of P. miliaceum germplasm were provided by the USDA-ARS North Central Regional Plant Introduction Station, Ames, IA, and by the Vavilov Research Institute, St Petersburg, Russia. The majority of the accessions were described as having landrace status. Grain from 38 accessions was grown in greenhouses at the University of Cambridge Botanic Garden. Between one and three plants per accession (72 plants in total) were analyzed for endosperm starch phenotype and subsequently for GBSSI genotype (table 1). Eight of these 72 plant samples (highlighted in table 1)—accession 3o plant #1, 3y #1, 04 #1, 47 #1, 70 #1, 71 #1, 76 #1, and 82 #1—were analyzed for starch protein, starch synthase (SS) activity, and sequenced for DNA corresponding to the full length of the mature GBSSI protein.
The development of endosperm starch is determined by the triploid (3n) endosperm genome (Sano 1984; McIntyre et al. 2008). Panicum miliaceum is largely self-pollinated, but crosspollination may exceed 10% (Baltensperger 1996); we prevented crosspollination by enclosing plants in perforated cellophane bags (Focus Packaging & Design Ltd, Louth, Lincolnshire, United Kingdom) at the first sign of panicle development. Grains were harvested at maturity, crushed individually between glass slides, and stained with Lugol's solution (10% [w/v] KI [Sigma–Aldrich Ltd., Gillingham, Dorset, United Kingdom], 5% [w/v] I2 [Sigma–Aldrich Ltd.]), diluted 100-fold with water immediately prior to use. Three grains were analyzed per plant. Starch-granule color was observed under 20× objective magnification on a microscope (Nikon, Tokyo, Japan). Slides were photographed using a DN100 camera (Nikon).
Starch protein from the eight plant samples highlighted in table 1 was extracted and analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). A single mature grain was dehusked and crushed in an Eppendorf tube. The powdered grain was resuspended in 80 μl extraction buffer (50 mM 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid (HEPES) pH 8.0, 1 mM dithiothreitol (DTT), 10 mM ethylenediaminetetraacetic acid (EDTA)) and centrifuged in a benchtop centrifuge at 14,000 rpm for 5 min at 4 °C. The supernatant was transferred to a second Eppendorf tube, and the pellet was resuspended in 80 μl extraction buffer to give equivalent volumes in the pellet and supernatant fractions. Twenty microliters of sample buffer (55 mM Tris–HCl pH 6.8, 2% SDS, 10% glycerol, 11 mg ml−1 DTT, bromophenol blue to color) was added to each fraction, and they were heated at 90 °C for 3 min. The samples were cooled and centrifuged, and the supernatant was analyzed by SDS-PAGE on 7.5% polyacrylamide gels that were electroblotted onto polyvinylidene difluoride (PVDF) membrane. Antiserum was raised to a 60-kDa protein excised from an SDS-PAGE gel of granule-bound proteins from developing barley (Hordeum vulgare L.) endosperm (Smith AM, personal communication). Rabbit sera containing the resulting antibarley endosperm GBSSI antibodies in a 1:2,000 (v/v) dilution, and the secondary antibody IgA antirabbit phosphatase conjugate, were used to develop blots following the method of Denyer, Barber, et al. (1997). In a second experiment, powdered grain was resuspended and washed three times in 2% SDS (instead of extraction buffer) then once in H2O. The resulting suspension was spun through 80% CsCl and washed once more in H2O prior to the addition of sample buffer and subsequent analysis as above.
SS activity assays were carried out on the same eight plant samples analyzed with SDS-PAGE. Three replicate starch preparations were made for each plant sample. Starch preparation followed a method modified from Denyer et al. (1995). Twenty grains were dehusked and ground in a chilled mortar, then resuspended in 1-ml extraction buffer (0.1 M Tris–acetate pH 7.0, 0.5 M NaCl, 1 mM DTT, and 1 mM EDTA). The homogenate was centrifuged in a benchtop centrifuge at 13,000 rpm for 3 min at room temperature and the supernatant discarded. The pellet was washed in the same way once more with 1-ml extraction buffer, three times in 1-ml wash buffer (50 mM Tris–acetate pH 8.0, 1 mM DTT, and 1 mM EDTA), and once in ice-cold 100% acetone. The pellet was air dried and stored at −20 °C prior to assaying.
SS assays were carried out in 100 μl volumes containing 100 mM Bicine (pH 8.5), 25 mM potassium acetate, 10 mM DTT, 5 mM EDTA, 20 μl of 50 mg ml−1 potato amylopectin in water (freshly boiled and then cooled to room temperature), 2 mM ADP[U14C]glucose (at 4.6 GBq mol−1), and 20 μl starch suspension (50 mg ml−1 starch in 100 mM 3-morpholinopropane-1-sulfonic acid (MOPS) [pH 7.2], 5 mM MgCl2, 50 ml l−1 glycerol, and 2 mM DTT, 1 g l−1 bovine serum albumin). Assay mixtures were incubated for 30 min at 25 °C, and then 3 ml ice-cold 75% (v/v) aqueous methanol containing 1% (w/v) KCl was added to stop the reaction. Controls were stopped by the addition of methanol/KCl immediately after the addition of the starch-granule suspension to the rest of the assay. After at least 5-min precipitation in methanol/KCl, starch was collected by centrifugation in a benchtop centrifuge at 3,000 rpm for 5 min at room temperature. The supernatant was discarded, and the starch pellet was resuspended in 300 μl water. The methanol–KCl precipitation and water resuspension steps were repeated a further two times. After the final resuspension in water, 3 ml Hisafe II scintillant was added, and radioactivity in starch was determined by scintillation counting. All assays and controls were performed in triplicate for each replicate starch sample.
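As a worked illustration of the quantities in this assay, the following Python sketch derives the per-assay amounts from the volumes and concentrations stated above and converts a background-corrected count into an incorporation rate. The function name `ss_activity` and the example readings are hypothetical, and the sketch assumes that scintillation counts have already been converted to Bq; it is not the calculation used in the study.

```python
# Quantities taken from the assay description above.
ASSAY_VOLUME_UL = 100.0
STARCH_SUSPENSION_UL = 20.0
STARCH_CONC_MG_PER_ML = 50.0
ADP_GLUCOSE_MM = 2.0
SPECIFIC_ACTIVITY_BQ_PER_MOL = 4.6e9  # 4.6 GBq mol-1
INCUBATION_MIN = 30.0

# microlitres x (mg per ml) / 1000 gives mg of starch per assay.
starch_mg = STARCH_SUSPENSION_UL * STARCH_CONC_MG_PER_ML / 1000.0

# mM x microlitres gives nmol of ADP-glucose per assay.
adp_glucose_nmol = ADP_GLUCOSE_MM * ASSAY_VOLUME_UL

# Radioactivity carried by each nmol of labelled substrate.
bq_per_nmol = SPECIFIC_ACTIVITY_BQ_PER_MOL / 1e9


def ss_activity(sample_bq, control_bq):
    """Return nmol glucose incorporated per min per mg starch.

    Hypothetical helper: subtracts the zero-time control, then
    normalises by specific radioactivity, starch mass, and time.
    """
    incorporated_nmol = (sample_bq - control_bq) / bq_per_nmol
    return incorporated_nmol / starch_mg / INCUBATION_MIN


if __name__ == "__main__":
    print(f"starch per assay: {starch_mg} mg")
    print(f"ADP-glucose per assay: {adp_glucose_nmol} nmol")
    # Illustrative readings only, not data from the study.
    print(f"activity: {ss_activity(150.0, 12.0):.3f} nmol min-1 mg-1")
```

Subtracting the zero-time control before normalising mirrors the role of the controls described above, which were stopped immediately after the starch suspension was added.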
Leaf tissue was freeze dried and ground using a Qiagen Tissue Lyser. DNA was extracted using either a Qiagen Plant DNeasy kit (Qiagen Ltd, Crawley, West Sussex, United Kingdom) or a hexadecyltrimethylammonium bromide (CTAB) protocol modified from Rogers and Bendich (1994). In the latter, powder was resuspended and incubated in a mix of 500 μl CTAB buffer (2% [w/v] CTAB, 0.1 M Tris–HCl pH 8.0, 1.4 M NaCl, 0.02 M EDTA), 50 μl Sarkosyl buffer (10% [w/v] N-lauryl sarcosine, 0.1 M Tris–HCl pH 8.0, 0.02 M EDTA), and 5 μl β-mercaptoethanol at 60 °C for 1 h. One volume of chloroform/isoamyl alcohol (24:1) was then added and the mixture vortexed to form an emulsion. Following 3-min centrifugation at 13,000 rpm in a benchtop centrifuge, the supernatant was transferred into a clean tube, and the chloroform/isoamyl alcohol clean-up stage was repeated. Two-thirds volume of ice-cold isopropanol was added to the supernatant. Precipitated DNA was pelleted by 3-min centrifugation at 13,000 rpm in a benchtop centrifuge, and pellets were washed in 500 μl 70% (v/v) ethanol. Following a final centrifugation as above, the ethanol supernatant was discarded and pellets air dried prior to resuspension in 100 μl water.
Initial experimentation with primers from Fukunaga et al. (2002) to the S. italica GBSSI sequence consistently demonstrated the presence of two GBSSI loci in P. miliaceum, which we designated the short (“S”) and long (“L”) genes. Following further extensive experimentation with primers designed against the S. italica sequence, we used the following protocols to amplify the region corresponding to the entire mature peptide for the S and L loci in the eight plants previously analyzed for starch protein and enzyme activity. The L locus was amplified in a single polymerase chain reaction (PCR) using the primers FPSLVVC3 and Rstop3 (table 2), in 50 μl volumes using 1× Finnzymes HF buffer (New England Biolabs, Hitchin, Hertfordshire, United Kingdom), 200 μM deoxynucleoside triphosphates (dNTPs), 0.3 μM of each primer, 3% dimethyl sulfoxide, 1 U Finnzymes Phusion High-Fidelity DNA Polymerase (New England Biolabs). Cycling conditions were 30 s at 98 °C; 40 cycles of 10 s at 98 °C, and 2 min 30 s at 72 °C; final extension step of 10 min at 72 °C. The S locus could not be reliably amplified as a single product, apparently due to the presence of a poly-C motif in intron 7 that often caused the reaction to fail. We therefore amplified the S locus in three overlapping fragments. The 5′ and 3′ regions were amplified using the primer combinations FPSLVVC3 and ex7Srext, and int7Sf and Rstop3, respectively. Reactions were performed as for the amplification of the L locus above, except using 40 cycles of a three-step PCR cycle of 10 s at 98 °C, 30 s at 71 °C/64 °C, respectively, 1 min at 72 °C. We amplified the central fragment of the S gene using a nested PCR strategy. The primary PCR used the primers int5Sf and R11 and was carried out according to the same protocol as the above reactions, with the differences that MgCl2 was added to give a final concentration of 2 mM and the three-step cycling program was as follows: 40 cycles of 20 s at 98 °C, 30 s at 64 °C, and 30 s at 72 °C. 
The cleaned product from this reaction was used as template in a secondary PCR using the primers M17 and Ex8r, which followed the same protocol as the primary PCR.
The primers FPSLVVC3 and Rstop3 overlapped with the N- and C-termini of the mature GBSSI peptide, as expected by comparison with GBSSIs from other species (supplementary fig. S1, Supplementary Material online), by 3 and 21 nt, respectively. Peptide mass fragment analysis (see below) was therefore used to confirm that the N- and C-termini matched the amino acid sequence anticipated by comparison with S. italica.
Following identification of exon polymorphisms, one shorter fragment within the S sequence and two within the L sequence were amplified for all 72 plant samples analyzed for starch phenotype, using the primers M5 and R11, M12 and R12, and int5Lf and R3, respectively (fig. 1 and table 2 show positions and details of primers used). PCRs were carried out on an Eppendorf MasterCycler thermocycler, in 25-μl volumes containing 1× PCR buffer, 0.3 μM of each primer, 200 μM dNTPs, 1.3 U DNA polymerase (Expand High-Fidelity PCR System, Roche Diagnostics Ltd., Burgess Hill, United Kingdom) and 1-μl template DNA. Reactions for the M5-R11 and M12-R12 fragments contained 1.5 mM MgCl2; reactions for the int5Lf-R3 fragment contained 2.5 mM MgCl2. Cycling conditions for the M5-R11 and M12-R12 fragments were as follows: 2 min at 94 °C; 30 cycles of 30 s at 94 °C, 30 s at 58 °C, and 1 min 30 s at 72 °C; final extension step of 7 min at 72 °C. Cycling conditions for the int5Lf-R3 fragment were 2 min at 94 °C, 35 cycles of 45 s at 94 °C, 30 s at 54 °C, and 2 min at 72 °C, with a final extension step of 7 min at 72 °C.
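The two genotyping programs above can be written down as structured data, which makes the differences between them easy to compare. The sketch below is one possible representation in Python; the class name `CyclingProgram` and its fields are hypothetical, and the computed duration counts only the programmed hold times, ignoring temperature ramping between steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Hypothetical container for a PCR cycling program (times in seconds)."""
    initial_denaturation_s: int
    cycles: int
    step_s: tuple  # hold time of each step within one cycle
    final_extension_s: int

    def hold_seconds(self):
        # Sum of programmed holds only; ramp times are not included.
        return (self.initial_denaturation_s
                + self.cycles * sum(self.step_s)
                + self.final_extension_s)


PROGRAMS = {
    # 2 min at 94 C; 30 x (30 s, 30 s, 1 min 30 s); 7 min at 72 C
    "M5-R11 / M12-R12": CyclingProgram(120, 30, (30, 30, 90), 420),
    # 2 min at 94 C; 35 x (45 s, 30 s, 2 min); 7 min at 72 C
    "int5Lf-R3": CyclingProgram(120, 35, (45, 30, 120), 420),
}

if __name__ == "__main__":
    for name, program in PROGRAMS.items():
        minutes = program.hold_seconds() / 60
        print(f"{name}: {minutes:.1f} min of programmed holds")
```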
PCR products were checked for size on Tris/acetate/EDTA buffer–agarose gels, excised when necessary, and cleaned using an illustra GFX PCR purification kit (GE Healthcare, Amersham, United Kingdom) or a Qiagen gel purification kit (Qiagen Ltd). Cycle sequencing was performed using primers for each fragment as shown in table 2. Contigs were assembled in ChromasPro version 1.41 (Technelysium Pty, Ltd., Tewantin, Australia). Sequences were aligned in MEGA version 4.0 (Tamura et al. 2007) and predicted intron–exon boundaries and amino acid sequence established by reference to the published sequence for S. italica (Fukunaga et al. 2002). Sequences have been submitted to GenBank; accession numbers are given in table 3.
We aligned the predicted amino acid sequences in P. miliaceum with GBSS sequences downloaded from GenBank for a range of monocots and dicots using the ClustalW alignment tool in MEGA 4.0. A maximum likelihood tree of relationships between these taxa, with 1,000 bootstrap replicates, was estimated using the PhyML online web server (Guindon et al. 2005), using the Jones, Taylor, and Thornton (JTT) + I + G model of protein-sequence evolution, selected using the Akaike information criterion in ProtTest (Abascal et al. 2005). Tree files were edited using Dendroscope version 2.2 (Huson et al. 2007).
In an initial experiment, starch-granule protein from the samples 4 #1/47 #1 (samples combined), 76 #1, 3y #1, and 82 #1 was analyzed to identify the proteins present. Gel plugs (1 mm × 1 mm) containing a band corresponding to a protein approximately 52 kDa in size were cut from the SDS-PAGE gel, washed twice for 20 min in freshly prepared 400 mM ammonium bicarbonate: 100% acetonitrile (1:1), twice in 100% acetonitrile for 1 min and then for 15 min, and then air dried for 10 min. The plugs were incubated for 3 h at 37 °C with 5 μl 10 mM ammonium bicarbonate containing 50-ng modified porcine trypsin (Promega, Madison, WI). After the addition of 5 μl 5% formic acid, the plugs were incubated at room temperature for 10 min. A Dionex U3000 high performance liquid chromatography system was used to deliver the peptides at a flow rate of 150 nl min−1 to the mass spectrometer (LTQ Orbitrap, Thermo Electron Corp., Runcorn, Cheshire, United Kingdom). Peptides were trapped and desalted using a precolumn (C18 pepmap100, LC Packings) and then separated on an analytical column (self-pulled to a length of 12-cm and 50-μm ID and self-packed with Waters BEH130 C18, 1.7 μm) with a gradient of 5–45% acetonitrile in water/0.1% formic acid at 0.66% increase per minute. The mass spectrometer was operated in positive ion mode with a nanospray source and a capillary temperature of 200 °C. The source voltage and focusing voltages were tuned for the transmission of peptide Met-Arg-Phe-Ala (m/z 524). Data-dependent analysis was carried out in Orbitrap-IT parallel mode (collision induced dissociation fragmentation) on the five most abundant ions in each cycle. The Orbitrap was run with a resolution of 30,000 over the MS range from m/z 400 to m/z 1,800 and an MS target of 1e6 and 1-s maximum scan time. The MS2 was triggered by a minimal signal of 5,000 with a target of 2e4 and 200-ms scan time. 
For selection of 2+ and 3+ charged precursors, charge state and monoisotopic precursor selection was used. Collision energy was 35, and an isolation width of 2 was used. Dynamic exclusion was set to 1 count and 60-s exclusion with an exclusion mass window of −0.5 to +1.2. Raw files were processed in Bioworks to generate data files. The merged data files were used to search the SPtrEMBL (Viridiplantae) database with Mascot 2.2 (Matrixscience) (in-house), with a peptide tolerance of 5 ppm and a fragment tolerance of 0.6 Da, allowing up to three missed cleavages.
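Two of the search parameters above, the number of missed cleavages allowed and the peptide mass tolerance in ppm, can be illustrated with a short Python sketch. This is not how Mascot is implemented. It assumes the common simplification that trypsin cleaves after K or R except before P, and the function names and the toy sequence are hypothetical.

```python
def cleavage_fragments(protein):
    """Split a protein after K or R, unless the next residue is P."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def digest(protein, max_missed=3):
    """List peptides spanning up to max_missed uncleaved sites."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def within_ppm(observed_mz, theoretical_mz, tolerance_ppm=5.0):
    """True if the observed m/z lies within the ppm tolerance."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tolerance_ppm


if __name__ == "__main__":
    toy = "ALNKEALR"  # hypothetical toy sequence, not a real digest target
    print(digest(toy, max_missed=1))
    print(within_ppm(524.002, 524.0))
```

Allowing more missed cleavages enlarges the set of candidate peptides, which raises the chance of matching incompletely digested protein at the cost of a larger search space.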
In a second experiment, selected samples were reanalyzed to identify the C-terminal sequence of the proteins present. Bands were cut from the gel as above and washed for 30 min with 200 μl 50% acetonitrile in 0.1 M ammonium bicarbonate containing 5 mM Tris-[2-carboxyethyl]-phosphine, followed by the addition of 20 μl 250 mM iodoacetamide in water, for a further 30 min. They were then washed for 30 min in 400 μl 50% acetonitrile in 0.1 M ammonium bicarbonate and dried under vacuum for 10 min. The plugs were incubated for 18 h at 37 °C with 25 μl 100 mM ammonium bicarbonate containing 10 μg ml−1 proteinase (either AspN endoproteinase or trypsin). Peptides were recovered by binding to a conditioned μC18 ZipTip (Millipore, United Kingdom), washed with 5% acetic acid, and eluted with 2–5 μl of 70% methanol/0.2% formic acid. Analysis was by matrix assisted laser desorption ionisation (MALDI) mass spectrometry (Waters Micromass MaldiMX Micro) using α-cyano-4-hydroxycinnamic acid matrix (10 mg ml−1 in 50% aqueous acetonitrile/0.1% trifluoroacetic acid) and by nanoelectrospray MS/MS using a Thermo Finnigan LCQ Classic instrument. Desalted sample was delivered using a static nanospray source (Proxeon Biosystems, Denmark) at 0.5 kV, and peaks of interest were interrogated manually for mass and fragmentation using standard parameters recommended by the manufacturer. Results were analyzed using Qual Browser in Xcalibur 1.2 (Thermo), Mascot (Matrix Science, United Kingdom) and custom spreadsheets.
Fifty-eight plants, drawn from 31 different accessions, produced grain with endosperm starch that stained blue–black with iodine, indicating the presence of amylose (nonwaxy; fig. 2[a]). Fourteen plants, drawn from eight different accessions, produced grain with starch granules that stained red (waxy type; fig. 2[b]).
We selected four nonwaxy and four waxy plant samples for analysis of starch protein. In all eight samples, two bands produced a crossreaction with antibarley GBSSI antiserum (four plant samples are shown in fig. 3[a]; the other four samples gave very similar results). One was approximately 52 kDa, slightly smaller than GBSSI in other species, which is typically close to 60 kDa (Denyer, Edwards, et al. 1997). Because each lane contains protein from the same amount of grain material, the intensity of the bands in the soluble and insoluble fractions enables a comparison of the relative amount of the protein in each. Almost all of the 52-kDa protein was associated with the insoluble (granule-bound) fraction in both nonwaxy and waxy samples. A second protein, approximately 80 kDa in size, was also observed in all samples and was distributed more evenly between the soluble and insoluble starch protein fractions. This protein most likely represents a second isoform of SS, probably SSII.
A close examination of the relative mobilities of the GBSSI proteins (fig. 3[b]) indicated that the predominant band in the waxy samples was of a slightly lower apparent molecular mass than that in the nonwaxy samples. In two of the waxy samples (3o #1 and 3y #1), a double band was seen, with the fainter upper band appearing equivalent in size to the band in the nonwaxy samples.
The four nonwaxy and four waxy plants analyzed for starch protein were assayed for GBSS activity. SS activity per mg starch was several-fold higher in the nonwaxy than in the waxy samples (table 6).
To check whether the waxy GBSSI in the buffer-insoluble pellet was entrapped in starch granules or insoluble because it was denatured, we carried out a second western blotting experiment in which pellets were washed with SDS and CsCl. This has previously been shown to remove proteins effectively from the outside of starch granules (Mu-Forster et al. 1996; Stoddard 1999) and should solubilize most insoluble denatured proteins. The results showed that some waxy GBSSI remained in the insoluble fractions, indicating that it was still at least partially granule bound (data not shown). Thus, loss of activity in these waxy mutant GBSSI proteins is not accompanied by loss of the ability to bind to granules.
Using a combination of primers designed against the S. italica GBSSI sequence and novel internal primers in the P. miliaceum sequences, we were able to amplify and sequence two GBSSI products, around 3.6 and 3.2 kb in size, which we designated the large (henceforth L) and small (S) fragments, respectively (fig. 1). We identified exon–intron boundaries within the L and S sequences by alignment with the published sequence for S. italica GBSSI (AB089143) and numbered exons and introns correspondingly. Both the L and S sequences span the region from the start of the mature GBSSI peptide, midway through exon 2, to the end of exon 14. These two sequences were present consistently across accessions of P. miliaceum.
The alignment of P. miliaceum GBSS amino acid sequences with those of other species is available as supplementary fig. S1, Supplementary Material online. All GBSSI sequences form a clade with 100% bootstrap support in the maximum likelihood tree, distinct from other isoforms of GBSS, demonstrating that both Panicum sequences are of GBSSI type (fig. 4) rather than of the GBSSIb/GBSSII types typically expressed in leaves rather than endosperm. Panicum miliaceum GBSSI-S and -L types are closely related to S. italica GBSSI, forming a clade with 94% bootstrap support, consistent with the taxonomic position of these genera within Paniceae and their established relationship in grass phylogenies (e.g., Bouchenak-Khelladi et al. 2008). Within this clade, the two Panicum sequence types emerge as sister taxa with 78% bootstrap support.
High exon-sequence identity (95.3%) was seen between P. miliaceum L, P. miliaceum S, and S. italica GBSSI sequences, with 97.1% predicted amino acid identity. Eight fixed amino acid differences were found between the P. miliaceum M4R9 L and S genes (supplementary fig. S1, Supplementary Material online).
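For readers unfamiliar with how an identity figure of this kind is obtained, the sketch below shows one common way to compute percent identity from a pairwise alignment in Python. The treatment of gaps (columns with a gap in either sequence are skipped) is an assumption, since the text does not state the convention used, and the two aligned strings are hypothetical toy data rather than the P. miliaceum sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over ungapped alignment columns.

    Assumption: columns containing a gap in either sequence are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: one mismatch and one gapped column.
    seq_a = "GCGCTGAACAAGGAG"
    seq_b = "GCGCTCAAC-AGGAG"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Other conventions, such as counting gapped columns as mismatches, give lower values for the same alignment, so the convention matters when comparing figures between studies.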
The GBSSI intron sequences aligned very weakly both between P. miliaceum L and P. miliaceum S and with S. italica. Panicum miliaceum L differed from S by a large deletion (ca. 300 bp) in intron 5 and a large insertion (ca. 700 bp) in intron 8. These indels accounted for most of the observed size difference between the two GBSSI sequences.
Three between-plant exon sequence polymorphisms were identified in P. miliaceum GBSSI. Polymorphism for a 15-bp indel was observed in the S gene, near the 5′ end of exon 10, within the region containing the sequence GCGCTGAACAAGGAGGCGCTG. By comparison with other GBSSI sequences in the database, we inferred that this polymorphism arose from a deletion in some P. miliaceum plants. Because of the repetition of the motif GCGCTG in this region of the alignment, it was not possible to determine unambiguously the exact position of the deletion, but it results in a change of amino acid sequence from …ALNKEAL… to …AL… . In the following discussion, we use S0 to refer to the allele or protein product lacking the deletion and S−15 for the allele or its product with the deletion. The repeated 6-bp motif suggests that this deletion may have arisen by recombination across this repeat.
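The ambiguity in the position of the deletion can be checked directly on the 21-nt region quoted above. The Python sketch below removes 15 nt at every possible start position and translates the products using the standard genetic code. It treats the quoted region in isolation and assumes the reading frame starts at its first nucleotide, which is consistent with the ALNKEAL translation given in the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from the first base; ignore any remainder."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


REGION = "GCGCTGAACAAGGAGGCGCTG"  # sequence quoted in the text
DELETION_LENGTH = 15

# Delete 15 nt starting at each possible offset within the region.
deletion_products = {
    start: REGION[:start] + REGION[start + DELETION_LENGTH:]
    for start in range(len(REGION) - DELETION_LENGTH + 1)
}

if __name__ == "__main__":
    print("intact:", translate(REGION))
    for start, dna in deletion_products.items():
        print(f"deletion starting at offset {start}: {dna} -> {translate(dna)}")
```

Every start position within the flanking GCGCTG repeats yields the same product, so the sequence data alone cannot distinguish between them.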
The amino acid–sequence alignment showed that the ALNKEAL motif containing the site of the deletion in some S sequences is conserved among all Poaceae GBSSI sequences (supplementary fig. S1, Supplementary Material online). The lysine residue in this motif is conserved across all SSs and in glycogen synthase of Agrobacterium. Threading of Arabidopsis SS amino acid sequence onto the 3D structure of glycogen synthase in Agrobacterium (Buschiazzo et al. 2004; Busi et al. 2007) shows that the five amino acid deletion in the S protein of P. miliaceum falls in a helix within the GTD1 (fig. 5; a color version of this figure is available as supplementary fig. S2, Supplementary Material online).
Two polymorphisms were observed in the L gene. One was an insertion/deletion of an adenine residue 19 nt from the 5′ end of exon 9. By comparison with other GBSSI sequences, we inferred that the additional adenine represents an insertion in P. miliaceum that causes a shift in the reading frame, resulting in altered downstream inferred amino acid sequence and a novel stop codon 228 codons beyond the insertion. The second polymorphism was a substitution of an adenine for a guanine residue in exon 7, which causes a change from a cysteine codon to a tyrosine codon at amino acid position 249 (numbered according to the alignment in supplementary fig. S1, Supplementary Material online). By comparison with other GBSSI sequences, we inferred that the cysteine is the wild-type amino acid, and the tyrosine represents a mutant form.
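The two kinds of change described here can be illustrated with the standard genetic code. The Python sketch below first confirms that a guanine to adenine substitution in either cysteine codon yields a tyrosine codon, and then shows how a single-nucleotide insertion alters every downstream codon. The sequence `TOY_EXON` is hypothetical and is not taken from the P. miliaceum L gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from the first base; ignore any remainder."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def g_to_a_variants(codon):
    """Return every codon obtained by changing one G to A."""
    return [codon[:i] + "A" + codon[i + 1:]
            for i, base in enumerate(codon) if base == "G"]


# Cysteine codons and the amino acids encoded after a G-to-A change.
cys_to_mutant = {
    codon: [CODON_TABLE[v] for v in g_to_a_variants(codon)]
    for codon, aa in CODON_TABLE.items() if aa == "C"
}

# Hypothetical toy sequence used only to illustrate a frameshift.
TOY_EXON = "ATGGCACTGAAGGAGTGCCTG"
INSERT_AT = 6
frameshifted = TOY_EXON[:INSERT_AT] + "A" + TOY_EXON[INSERT_AT:]

if __name__ == "__main__":
    print(cys_to_mutant)
    print("reference :", translate(TOY_EXON))
    print("with +A   :", translate(frameshifted))
```

Residues upstream of the insertion are unchanged, whereas the downstream sequence is read in a different frame until a stop codon is encountered.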
On the basis of these preliminary results, we designed fragment-specific primers, such that the primer pair M5/R11 amplifies the 391-bp region including the deletion site in the S gene, the primer pair M12/R12 amplifies the 632-bp region including the A insertion site in the L gene, and the primer pair int5Lf/R3 amplifies the 251-bp region including the SNP in the L gene. Genotyping results for the S and L loci are shown in table 1.
At the L locus, all plants with the frameshift A insertion had the guanine residue at the SNP site, whereas plants without the frameshift mutation were polymorphic for the guanine–adenine substitution. Thus, we defined three alleles of the L gene: Lf (frameshift mutation), LC (no frameshift, cysteine codon at amino acid position 249), and LY (no frameshift, tyrosine codon at amino acid position 249). We observed these L-gene alleles in combination with the S gene alleles S0 and S−15 such that five of the six possible combinations of genotypes were found. Table 4 summarizes the number of plants of each genotype. No other polymorphic sites were seen in the exon sequence of either locus.
All of the 14 phenotypically waxy plants had the 15-bp deletion at the S locus. Nine of these plants had the Lf allele with the additional A residue in exon 9, whereas five (all those from accessions 3o and 3y) had the LY allele. Of the 58 nonwaxy plants, none had the 15-bp deletion at the S locus; at the L locus, 11 had the additional A (Lf allele), 15 had no A insertion in combination with the tyrosine residue at aa position 249 (LY allele), and 30 had no A insertion in combination with the cysteine codon at aa position 249 (LC allele). One plant was heterozygous for the Lf and LC alleles (table 1). Because the plants (with this one exception for L) were homozygous at both the S and L loci, we can be confident that the endosperm genotypes were homozygous and identical to those of their parent plants, albeit with one extra copy. Thus, among 72 plants representing 38 different accessions, there was 100% correspondence between the waxy starch phenotype and the 15-bp deletion at the S locus. At the L locus, the waxy phenotype of plants with the genotypes S−15/LY and S−15/Lf is evidence that the LY and Lf alleles do not produce an active protein product. Because the LC allele does not occur in combination with the S−15 allele, there is no direct evidence whether LC results in an active GBSSI protein or not.
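The genotype combinations discussed above can be enumerated explicitly. The Python sketch below lists the six possible combinations of homozygous S and L alleles, marks those reported in the text, and encodes the observed rule that the waxy phenotype coincides with the S−15 allele. The single Lf/LC heterozygote is omitted for simplicity, the allele labels are written in ASCII (`S-15`), and the function `observed_phenotype` is a hypothetical summary of this data set rather than a general genetic model.

```python
from itertools import product

S_ALLELES = ("S0", "S-15")
L_ALLELES = ("LC", "LY", "Lf")

# All combinations of homozygous S and L genotypes.
possible = set(product(S_ALLELES, L_ALLELES))

# Combinations reported in the text (heterozygous plant omitted).
observed = {
    ("S0", "LC"), ("S0", "LY"), ("S0", "Lf"),
    ("S-15", "LY"), ("S-15", "Lf"),
}

missing = possible - observed


def observed_phenotype(s_allele):
    """Phenotype seen in this sample for a given S allele.

    Reflects the 100% correspondence reported between the 15-bp
    deletion and waxy starch among the 72 plants analysed.
    """
    return "waxy" if s_allele == "S-15" else "nonwaxy"


if __name__ == "__main__":
    for s_allele, l_allele in sorted(possible):
        status = "observed" if (s_allele, l_allele) in observed else "not observed"
        print(f"{s_allele}/{l_allele}: {status}, "
              f"{observed_phenotype(s_allele)} if present")
```

Because the only unobserved combination is the one pairing S−15 with LC, the data cannot show whether LC alone would be sufficient to produce amylose.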
From the SDS-PAGE gels and the DNA sequence data, we inferred that the GBSSI protein present in each sample most likely represented the S0 or S−15 isoform, according to its genotype, and also the LC or LY isoform in plants with those genotypes. We inferred that no L-type protein was present in plants with the Lf allele, because we did not see a band of molecular mass ~45 kDa that would correspond to the truncated protein predicted from the frameshift. This suggests that the Lf protein is not expressed or is unstable. The apparent lower molecular mass of the major band in the S−15 compared with the S0 samples is consistent with its missing five amino acids. It is likely that the larger band in the doublet in the samples with the genotype S−15/LY represents the LY protein. This band was absent from samples with the genotype S−15/Lf.
To test these hypotheses regarding protein identity, we compared the peptide mass fingerprints of the GBSSI proteins excised from the lanes containing the insoluble protein fractions in the gel in figure 3(a) with those of in silico trypsin-digested plant proteins in the SPtrEMBL Viridiplantae database to which the P. miliaceum sequences S0, S−15, LC, and LY had been manually added. We found that the top matches were all GBSSI proteins, confirming that the 52-kDa protein, present in both nonwaxy and waxy P. miliaceum starch, is indeed GBSSI. The top protein matches were S0 in the accessions with normal starch and S−15 in the accessions with waxy starch. Fragments enabling the discrimination of S from L protein were present in all samples (table 5). All genotypes had peptides specific to the S protein. These were numerous for all samples except the S−15/Lf genotype (sample 82#1), which had low percentage coverage and overall score and showed only a few of the discriminatory fragments. LC/LY genotypes also had numerous fragments specific to the L protein, which were almost or totally absent in Lf samples. This indicates that some of the GBSSI protein in these plants is contributed by the LC or LY gene as appropriate. It was not possible to discriminate between the LC and LY proteins, because the diagnostic fragment containing the cysteine–tyrosine residue was not found in any of the samples analyzed, probably because of its high molecular mass. To discriminate between S0 and S−15 isoforms, we looked for peptides characteristic of S−15. These were present in S−15 genotype samples and almost or entirely absent from S0 genotype samples. Direct evidence for the presence of the S0 protein was limited because many of the fragments containing the additional five amino acids also occur in the LC/LY proteins. This explains the presence of these fragments in the S−15/LY genotype sample. 
The one fragment diagnostic of S0, K.YDVSTAIAAKALNK.E, is rare because it relies on a missed cleavage; it was found single-fold in the mixed S0/LC and S0/LY genotype sample (which had the highest coverage and emPAI score of any sample). We inferred the presence of the S0 protein in the S0/LC/Y and S0/Lf genotypes indirectly from the absence of S−15 fragments in either, and in the latter sample, from the presence of S0 characteristic fragments in combination with the evidence suggesting the LC or LY protein is lacking.
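The dependence of the S0-diagnostic fragment on a missed cleavage can be illustrated with a minimal in silico digest. The sketch below assumes the common simplified trypsin rule (cleavage C-terminal to K or R, except before P) and uses only the fragment and its flanking residues as given in the notation K.YDVSTAIAAKALNK.E; the function name `tryptic_peptides` is hypothetical and this is not the search procedure used for the fingerprint matching.

```python
def tryptic_peptides(seq, missed_cleavages=0):
    """Return tryptic peptides of seq, allowing up to missed_cleavages.

    Simplified rule (an assumption of this sketch): cleave after K or R
    unless the next residue is P.
    """
    sites = [i + 1 for i, aa in enumerate(seq[:-1])
             if aa in "KR" and seq[i + 1] != "P"]
    bounds = [0] + sites + [len(seq)]
    peptides = []
    for start in range(len(bounds) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end < len(bounds):
                peptides.append(seq[bounds[start]:bounds[end]])
    return peptides


# Fragment with its flanking residues, taken from K.YDVSTAIAAKALNK.E
region = "KYDVSTAIAAKALNKE"
diagnostic = "YDVSTAIAAKALNK"

complete = tryptic_peptides(region, missed_cleavages=0)
one_missed = tryptic_peptides(region, missed_cleavages=1)

print(complete)                  # ['K', 'YDVSTAIAAK', 'ALNK', 'E']
print(diagnostic in complete)    # False
print(diagnostic in one_missed)  # True
```

With complete digestion the region is split at the internal lysine, so the diagnostic peptide appears only when that site is missed, which is in keeping with its rarity in the spectra.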
The low e-values of the peptides and manual checking of the ionic spectra confirmed the reliability of the fragment identifications. Very low levels of unexpected isoform-specific peptides are likely to be due to cross-contamination of the samples. The extensive coverage of the protein sequence and the comparative analysis of the four samples provide good corroborative evidence for the presence of waxy protein isoforms consistent with the genotype of each sample.
The 5′ primer FPSLVVC3 and the 3′ primer Rstop3 overlap with the N- and C-termini of the mature GBSSI protein, as predicted from alignment with other GBSSI sequences (supplementary fig. S1, Supplementary Material online), by 1 and 7 amino acids, respectively. This precludes identification of possible mutations in these sites from DNA sequencing, so to confirm the N-terminus of the mature peptide, we looked for the expected fragments in the peptide mass fingerprints. All four samples contained peptides matching the expected N-terminal sequence –.AAAGMNVVFVGAEMAPWSK.T. Only one sample, genotype S0/LC/Y, with the highest percentage sequence coverage, showed the anticipated C-terminal fragment (K.ENVAAP–). We therefore carried out a second experiment using AspN proteinase to recover a larger predicted C-terminal peptide (D489-P531; 4,457 Da average mass). This mass was successfully observed in S0 protein by MALDI analysis, and its identity was confirmed by electrospray MS/MS of the [M + 3H]3+ signal at m/z 1,485.7 (monoisotopic). The Mascot score established its identity as the C-terminal peptide but with insufficient fragment coverage to confirm the C-terminal six residues (even though their accumulated mass was as expected). Firm identification was achieved by MS/MS of the [M + 4H]4+ signal at m/z 1,114.5. Manual interpretation of the spectrum identified a series of triply and quadruply charged y-ions, which clearly derived from the C-terminus of the peptide. This allowed confident matching of the signals to the expected C-terminal sequence of IAPLAKENVAAP for S0 and S−15 protein samples (fig. 6).
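The two precursor signals quoted for the AspN peptide can be checked against each other with the standard relation m/z = (M + z·m_p)/z, where m_p is the proton mass. The following is a minimal sketch; the function names are hypothetical, and the only inputs are the two m/z values reported above.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass implied by an [M + zH]z+ signal."""
    return charge * (mz - PROTON_MASS)


def mz_for_charge(mass, charge):
    """Expected m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (mass + charge * PROTON_MASS) / charge


# Triply charged signal reported in the text
mass = neutral_mass(1485.7, 3)
# Predicted position of the quadruply charged ion of the same peptide
mz4 = mz_for_charge(mass, 4)

print(round(mass, 1))  # 4454.1
print(round(mz4, 1))   # 1114.5
```

The triply charged signal implies a monoisotopic mass of about 4,454 Da, slightly below the 4,457 Da average mass as expected, and predicts the quadruply charged ion at the position where it was observed.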
Broomcorn millet represents a unique case among plant species with waxy mutants, in being a functional polyploid (Graybosch and Baltensperger 2009) in which waxy phenotypes have appeared and become established without deliberate modern breeding. This contrasts with the situation in wheat, in which partially waxy lines were discovered with mutations in one or two of the A, B, and D (in hexaploids) genome homeologues of GBSSI, and fully waxy tetraploid and hexaploid wheats have only very recently been synthesized through crossing appropriate partially waxy lines (Nakamura et al. 1993, 1995; Yamamori et al. 1994).
Although reticulate evolutionary relationships and the ancestry of polyploid taxa in the genus Panicum are not yet known, the presence of the S and L GBSSI genes in all samples is consistent with the inference from meiotic chromosome behavior by Hamoud et al. (1994) that P. miliaceum is an allotetraploid. We have observed comparable related pairs of other protein-coding nuclear sequences from P. miliaceum (Hunt HV, unpublished data). The L and S sequences would thus represent homeologues of the GBSSI gene derived from two (currently unknown) diploid ancestors of P. miliaceum. Further analysis of the evolutionary dynamics of Waxy sequences in the genus Panicum would be needed to confirm this interpretation; Fortune et al. (2007) demonstrated the presence of both paralogous and homeologous Waxy sequences in hexaploid Spartina species. However, the phylogenetic evidence clearly indicates that both the L and S types in P. miliaceum represent GBSSI genes; neither represents the GBSSIb/GBSSII type isoforms that are typically found in nonendosperm tissues, such as the pericarp in wheat, and pods, roots, and nodules in pea and contribute to amylose synthesis in these organs (Hylton et al. 1996; Denyer, Barber, et al. 1997; Vrinten and Nakamura 2000; Edwards et al. 2002). It is likely that orthologues of the GBSSIb/GBSSII protein exist in P. miliaceum too but encoded by loci distinct from those coding for the L or S genes.
Waxy mutants in different species result variously from mutations that cause loss of GBSSI gene expression, loss of starch granule–bound protein, or loss of enzyme activity. The data accumulated from other species consistently indicate that GBSSI is the sole locus controlling endosperm starch waxiness. Polygenic control occurs only when this locus is duplicated in functional polyploids: The genetic data of Graybosch and Baltensperger (2009) indicated the existence of distinct waxy mutations at two GBSSI loci in P. miliaceum. Our SDS-PAGE data showed that the GBSSI protein was present in approximately equal amounts in both waxy and nonwaxy lines. In contrast, Graybosch and Baltensperger (2009) found only trace amounts of GBSSI in waxy lines; however, we have found that the recovery of GBSSI from starch can vary with genotype and extraction method (data not shown). The SS activity data demonstrated that the GBSSI protein in waxy types was nonfunctional. Consideration of the protein identities and genotype data indicates that, depending on the particular mutant alleles present in a given genotype, either one (S) or two (S and L) GBSSI loci contribute to this nonfunctional protein product. We have characterized the mutations at these two loci as follows.
Our experiments revealed two S-type GBSSI alleles (S0 and S−15) and three L-type GBSSI alleles (LC, LY, and Lf). Of the six possible combinations of these S and L alleles, we found only five: None of the 72 plants analyzed had the genotype S−15/LC. The most common genotype was S0/LC (31 plants, from 17 accessions). We can postulate that this represents the ancestral genotype, for the following reasons: It is the most abundant genotype, and its distribution includes accessions from northern China, thought to be the center of diversity and origin of broomcorn millet (Crawford 2009; Hu et al. 2009). From the phenotype of the S0/Lf plants, we know that the S0 allele encodes an active GBSSI protein. Given that no waxy mutants are known in any wild plant species (Sakamoto 1996; Shapter et al. 2009), it is highly likely that the diploid ancestors of Panicum miliaceum, and the newly formed tetraploid P. miliaceum or its wild progenitor, had S and L alleles encoding functional proteins. The data of Graybosch and Baltensperger (2009) indicate that a functional L-type allele still exists in the genepool. Thus, we hypothesize that LC encodes this active GBSSI, although proof of this is lacking at present, in the absence of an S−15/LC genotype individual. Both the LC and S−15 alleles are present at relatively high frequency in Chinese samples, but because P. miliaceum is strongly self-pollinating, it is possible that nonrandom mating has meant this genotype has not arisen or is extremely rare. If our model is correct, the mutant S−15 allele in combination with either the mutant LY or mutant Lf allele is required to produce a plant with the waxy phenotype.
Redundancy of homeologous gene copies in polyploid species means that it is not uncommon for one homeologue to lose function (Wendel 2000). It is possible that following tetraploidization in P. miliaceum, a loss-of-function mutation in either the S or L GBSSI gene would have had little effect on starch phenotype. This is consistent with the results of Graybosch and Baltensperger (2009). We found two such genotypes with loss-of-function mutations in just the L gene: S0/LY in 15 plants and S0/Lf in 12 plants. We have shown that both the LY and Lf alleles confer loss of L-type GBSSI function. The mutant LY and Lf alleles arose independently, as demonstrated by the finding that no plant with the frameshift mutation in exon 9 also had the adenine substitution in exon 7, and they may have become widespread within the P. miliaceum distribution because, in isolation, they had little impact on the starch phenotype. This is supported by the high frequency of the LY allele in samples from western Russia, Ukraine, and the Caucasus, far from the region where sticky-starch foods are preferred. It is also possible that a mutant L allele in an S0 background may confer a partially waxy phenotype with a slightly lower than normal amylose content that is favored for some food uses. If so, these alleles may have spread more rapidly through the population due to active selection. There is evidence that partial waxy mutants of wheat were selected specifically for the production of udon noodles (Nakamura et al. 2002). The varieties of hexaploid wheat now grown for udon noodle production have one or two inactive waxy alleles. It is thought that the slightly lower amylose content of these genotypes confers desirable phenotypic properties and has thus been selected.
The mechanism of loss of activity in mutant L-type GBSSI clearly differs between alleles. For the Lf mutant allele, a frameshift due to the insertion of an additional nucleotide (A) is responsible for the loss of the L protein. The predicted protein product of the Lf gene has its GTD1 domain disrupted and truncated as a result of the frameshift within GTD5 and is therefore unlikely to be catalytically active. We did not see any evidence for this protein from immunoblotting, and it is probable that it is unstable or not synthesized at all (perhaps as a result of nonsense-mediated decay of the RNA). The LY allele is also evidently inactive, and we can hypothesize that it is the substitution of adenine for guanine in the codon for amino acid position 249 in exon 7 (which causes the substitution of tyrosine for cysteine in the amino acid sequence) that results in loss of L protein activity. The cysteine residue is conserved in all monocot GBSSI sequences and is located in GTD5, close to the active site.
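The two kinds of L-gene lesion can be illustrated with the standard genetic code. The sketch below first confirms that a G-to-A change at the second position of either cysteine codon yields a tyrosine codon, and then shows the effect of a single-base insertion on a toy reading frame. The toy sequence is hypothetical and is not the P. miliaceum GBSSI-L sequence; which cysteine codon is used at position 249 is not stated here, so both are shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate frame 0 up to the first stop codon; ignore a partial last codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# 1. G -> A substitutions that turn a cysteine codon into a tyrosine codon
cys_to_tyr = []
for codon, aa in CODON_TABLE.items():
    if aa != "C":
        continue
    for pos, base in enumerate(codon):
        if base == "G":
            mutant = codon[:pos] + "A" + codon[pos + 1:]
            if CODON_TABLE[mutant] == "Y":
                cys_to_tyr.append((codon, mutant))
print(cys_to_tyr)  # [('TGT', 'TAT'), ('TGC', 'TAC')]

# 2. Single-base insertion in a hypothetical toy sequence
wild_type = "ATGGCTTGCAAAGAACTGTGG"
frameshifted = wild_type[:6] + "A" + wild_type[6:]
print(translate(wild_type))     # MACKELW
print(translate(frameshifted))  # MAMQRTV
```

The substitution changes one residue and leaves the rest of the protein intact, whereas the insertion alters every residue downstream of the insertion point, which is why the two alleles are expected to lose activity by different mechanisms.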
It is likely that the loss-of-function mutation in the GBSSI-S gene arose in a line already carrying an L-gene mutation, because no S−15/LC genotype plants were found, and that this resulted in a plant with a fully waxy phenotype. A single mutation appears to account for loss of S protein activity in all samples analyzed: The 15-bp deletion was present in all plants with waxy-type grain, and conversely, all genotypes with this deletion had waxy grain. We found 14 waxy-phenotype plants: 9 plants with the S−15/Lf genotype and 5 with the S−15/LY genotype. It is unlikely that the S−15 mutation occurred more than once, and so, we suggest that the two genotypes that demonstrably result in waxy phenotypes, S−15/LY and S−15/Lf, arose from hybridization between one of these genotypes and a phenotypically nonwaxy plant with the alternative mutant L allele in an S0 background.
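The genotype tallies quoted in this discussion can be cross-checked with a short tabulation. The sketch below enumerates the six possible S/L combinations, identifies the one not observed, and tests the correspondence between the S−15 allele and the waxy phenotype. It assumes that the count of 31 for S0/LC includes the single Lf/LC heterozygote; the variable names are illustrative only.

```python
from itertools import product

S_ALLELES = ["S0", "S-15"]
L_ALLELES = ["LC", "LY", "Lf"]

# (number of plants, phenotype) per genotype, as quoted in the text.
# Assumption: the 31 S0/LC plants include the one Lf/LC heterozygote.
observed = {
    ("S0", "LC"): (31, "nonwaxy"),
    ("S0", "LY"): (15, "nonwaxy"),
    ("S0", "Lf"): (12, "nonwaxy"),
    ("S-15", "LY"): (5, "waxy"),
    ("S-15", "Lf"): (9, "waxy"),
}

possible = list(product(S_ALLELES, L_ALLELES))
missing = [g for g in possible if g not in observed]
total = sum(n for n, _ in observed.values())
waxy_total = sum(n for n, ph in observed.values() if ph == "waxy")

# The deletion allele and the waxy phenotype should always coincide
deletion_matches_waxy = all(
    (s == "S-15") == (ph == "waxy") for (s, _), (_, ph) in observed.items()
)

print(missing)                # [('S-15', 'LC')]
print(total, waxy_total)      # 72 14
print(deletion_matches_waxy)  # True
```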
The five-amino-acid deletion in the GBSSI-S protein, responsible for the loss of S protein activity in the P. miliaceum S−15 allele, lies in the GTD1 (fig. 5). The seven-amino-acid sequence spanning the deletion in the wild-type allele (ALNKEAL) is conserved in all monocot GBSSIs, suggesting it is functionally important. Our immunoblot, peptide fragment MS data, and SS activity assay results show that the mutant GBSSI-S protein is present but nearly or completely inactive. This mutant GBSSI protein nevertheless persists in binding to starch granules, as has been found previously for some mutant GBSSIs in other species (e.g., one of the three mutant pea embryo GBSSI isoforms identified by Denyer et al. was found in normal amounts inside starch granules). We hypothesize that the five-amino-acid deletion in the P. miliaceum GBSSI-S protein results in a disruption of the GTD1, resulting in the observed loss of SS activity, but does not affect the binding of this protein to the starch granules.
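Because 15 is a multiple of three, a 15-bp deletion removes five codons without shifting the reading frame, so a full-length protein lacking only those residues can still be made. The sketch below illustrates this with a hypothetical toy sequence (not the P. miliaceum GBSSI-S sequence) and a deletion placed on codon boundaries for simplicity.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate frame 0 up to the first stop codon; ignore a partial last codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical 45-nt toy coding sequence (15 codons)
wild_type = "ATGAAAACCGCTTATGATGTGAGCCTGAACAAAGAAGCTCTGTGG"

# Remove 15 bp (codons 4 to 8 of the toy sequence)
deleted = wild_type[:9] + wild_type[24:]

wt_protein = translate(wild_type)
del_protein = translate(deleted)

print(wt_protein)   # MKTAYDVSLNKEALW
print(del_protein)  # MKTLNKEALW
print(len(wt_protein) - len(del_protein))  # 5
```

The residues downstream of the deletion are unchanged, in contrast to the frameshift case, which is in keeping with the observation that the S−15 protein is still present in starch granules at a slightly lower apparent molecular mass.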
It is possible that other mutant GBSSI alleles exist in P. miliaceum. However, our determination of the full-length amino acid sequences of the mature L and S GBSSI proteins, inferred from DNA sequence and peptide mass fragment data in multiple waxy and nonwaxy samples, means that we can be confident that no other mutations exist that universally account for loss of functional protein at either locus.
All waxy-type individuals in this study came from accessions of East Asian (Chinese and Korean) origin. All samples originating elsewhere (Russia, Ukraine, the Caucasus, Central Asia, and Mongolia) were nonwaxy. Chinese samples showed a mixture of waxy and nonwaxy types both within and between accessions. This distribution of waxy phenotypes concurs with the data of Kimata and Negishi (2002), who found waxy and intermediate phenotypes in around 60% of P. miliaceum accessions from East Asia (China, Korea, and Mongolia) and at very low frequencies among accessions from elsewhere in Eurasia. These authors also found that nearly 100% of Japanese accessions were waxy or intermediate; we did not include any samples from Japan in this study, and we did not observe any of the intermediate phenotypes.
A similar geographic pattern is found in other cereals, comprising a mix of waxy and nonwaxy varieties in East and Southeast Asia, and near-exclusively nonwaxy varieties elsewhere in the range. This includes both species domesticated within the region (foxtail millet, rice) and those domesticated elsewhere and introduced into East Asia (barley, maize, sorghum, and Job's tears; Sakamoto 1996; Patron et al. 2002; Kawase et al. 2005; Olsen et al. 2006; Fan et al. 2008). Waxy types have not been recorded in wild cereals (Sakamoto 1996; Shapter et al. 2009). Selective sweeps in the Waxy genomic region have been demonstrated in rice and maize that reflect strong postdomestication selection, revealing the shaping of crop genome evolution through human choice (Olsen et al. 2006; Fan et al. 2008; Vaughan et al. 2008). A “glutinous-endosperm starch food culture,” which arose early in the history of cereal farming in East and Southeast Asia and remained confined to the cultures of this region, is thought to have driven selection for waxy mutants arising in domesticated cereal populations.
The waxy broomcorn millet lines in this study all showed the same mutation in the GBSSI-S gene, implying a single origin for the loss of protein function in this homeologue. In the GBSSI-L gene, there are two nonfunctional alleles of independent origin. As in foxtail millet (S. italica), where waxy phenotypes have resulted from several different transposon insertion events (Kawase et al. 2005), waxy alleles in broomcorn millet may have distinct distribution geographies. Further investigation of the distribution of these alleles within and between populations, varieties, and landraces of cultivated broomcorn millet is of great relevance to understanding both the history of this crop's origins and spread and how the organization of crop plant diversity has been shaped by human food choices.
This work was supported by a Leverhulme Trust research grant to M.K.J. and H.V.H. (ref. F/09707/B) and a Wellcome Trust Bioarchaeology Research Training Fellowship to H.V.H. (ref. 076815). We are grateful to the USDA-ARS and the Vavilov Research Institute for providing plant material. We thank Yo-Ichiro Sato (Research Institute for Humanity and Nature, Kyoto) for initial discussion, Pete Michna and colleagues at the University of Cambridge Botanic Garden for growing plants, Gerhard Saalbach (John Innes Centre) for proteomics analyses, Brendan Fahy (John Innes Centre) and Matt Lawes (University of Cambridge) for help in the laboratory, Mike Weldon (University of Cambridge) for protein-sequence analysis, Ellen Nisbet (University of South Australia) for phylogenetics advice, Kathryn Lilley (University of Cambridge) for proteomics advice, and Carla Lancelotti and Cameron Petrie (University of Cambridge) for assistance with graphics.