Amundsen, K, Rotter D, Li H M, Messing J, Jung G, Belanger F, Warnke S.  2011.  Miniature Inverted-Repeat Transposable Element Identification and Genetic Marker Development in Agrostis. Crop Sci. 51:854-861.
Calviño, M, Bruggmann R, Messing J.  2011.  Characterization of the small RNA component of the transcriptome from grain and sweet sorghum stems. BMC Genomics. 12:356.
BACKGROUND: Sorghum belongs to the tribe of the Andropogoneae that includes potential biofuel crops like switchgrass, Miscanthus and successful biofuel crops like corn and sugarcane. However, from a genomics point of view sorghum has a simpler genome than these other species because it lacks the additional rounds of whole genome duplication events. Therefore, it has become possible to generate a high-quality genome sequence. Furthermore, cultivars exist that rival sugarcane in levels of stem sugar so that a genetic approach can be used to investigate which genes are differentially expressed to achieve high levels of stem sugar. RESULTS: Here, we characterized the small RNA component of the transcriptome from grain and sweet sorghum stems, and from F2 plants derived from their cross that segregated for sugar content and flowering time. We found that variation in miR172 and miR395 expression correlated with flowering time whereas variation in miR169 expression correlated with sugar content in stems. Interestingly, genotypic differences in the ratio of miR395 to miR395* were identified, with miR395* species expressed as abundantly as miR395 in sweet sorghum but not in grain sorghum. Finally, we provided experimental evidence for previously annotated miRNAs detecting the expression of 25 miRNA families from the 27 known and discovered 9 new miRNAs candidates in the sorghum genome. CONCLUSIONS: Sequencing the small RNA component of sorghum stem tissue provides us with experimental evidence for previously predicted microRNAs in the sorghum genome and microRNAs with a potential role in stem sugar accumulation and flowering time.
Wang, W, Messing J.  2011.  High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA. PLoS One. 6:e24670.
BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.
Abrouk, M, Murat F, Pont C, Messing J, Jackson S, Faraut T, Tannier E, Plomion C, Cooke R, Feuillet C et al.  2010.  Palaeogenomics of plants: synteny-based modelling of extinct ancestors. Trends Plant Sci. 15:479-87.
In the past ten years, international initiatives have led to the development of large sets of genomic resources that allow comparative genomic studies between plant genomes at a high level of resolution. Comparison of map-based genomic sequences revealed shared intra-genomic duplications, providing new insights into the evolution of flowering plant genomes from common ancestors. Plant genomes can be presented as concentric circles, providing a new reference for plant chromosome evolutionary relationships and an efficient tool for gene annotation and cross-genome markers development. Recent palaeogenomic data demonstrate that whole-genome duplications have provided a motor for the evolutionary success of flowering plants over the last 50-70 million years.
Murat, F, Xu JH, Tannier E, Abrouk M, Guilhot N, Pont C, Messing J, Salse J.  2010.  Ancestral grass karyotype reconstruction unravels new mechanisms of genome shuffling as a source of plant evolution. Genome Res. 20:1545-57.
The comparison of the chromosome numbers of today's species with common reconstructed paleo-ancestors has led to intense speculation of how chromosomes have been rearranged over time in mammals. However, similar studies in plants with respect to genome evolution as well as molecular mechanisms leading to mosaic synteny blocks have been lacking due to relevant examples of evolutionary zooms from genomic sequences. Such studies require genomes of species that belong to the same family but are diverged to fall into different subfamilies. Our most important crops belong to the family of the grasses, where a number of genomes have now been sequenced. Based on detailed paleogenomics, using inference from n = 5-12 grass ancestral karyotypes (AGKs) in terms of gene content and order, we delineated sequence intervals comprising a complete set of junction break points of orthologous regions from rice, maize, sorghum, and Brachypodium genomes, representing three different subfamilies and different polyploidization events. By focusing on these sequence intervals, we could show that the chromosome number variation/reduction from the n = 12 common paleo-ancestor was driven by nonrandom centric double-strand break repair events. It appeared that the centromeric/telomeric illegitimate recombination between nonhomologous chromosomes led to nested chromosome fusions (NCFs) and synteny break points (SBPs). When intervals comprising NCFs were compared in their structure, we concluded that SBPs (1) were meiotic recombination hotspots, (2) corresponded to high sequence turnover loci through repeat invasion, and (3) might be considered as hotspots of evolutionary novelty that could act as a reservoir for producing adaptive phenotypes.
Wu, Y, Messing J.  2010.  RNA interference-mediated change in protein body morphology and seed opacity through loss of different zein proteins. Plant Physiol. 153:337-47.
Opaque or nonvitreous phenotypes relate to the seed architecture of maize (Zea mays) and are linked to loci that control the accumulation and proper deposition of storage proteins, called zeins, into specialized organelles in the endosperm, called protein bodies. However, in the absence of null mutants of each type of zein (i.e. alpha, beta, gamma, and delta), the molecular contribution of these proteins to seed architecture remains unclear. Here, a double null mutant for the delta-zeins, the 22-kD alpha-zein, the beta-zein, and the gamma-zein RNA interference (RNAi; designated as z1CRNAi, betaRNAi, and gammaRNAi, respectively) and their combinations have been examined. While the delta-zein double null mutant had negligible effects on protein body formation, the betaRNAi and gammaRNAi alone only cause slight changes. Substantial loss of the 22-kD alpha-zeins by z1CRNAi resulted in protein body budding structures, indicating that a sufficient amount of the 22-kD zeins is necessary for maintenance of a normal protein body shape. Among different mutant combinations, only the combined betaRNAi and gammaRNAi resulted in drastic morphological changes, while other combinations did not. Overexpression of alpha-kafirins, the homologues of the maize 22-kD alpha-zeins in sorghum (Sorghum bicolor), in the beta/gammaRNAi mutant failed to offset the morphological alterations, indicating that beta- and gamma-zeins have redundant and unique functions in the stabilization of protein bodies. Indeed, opacity of the beta/gammaRNAi mutant was caused by incomplete embedding of the starch granules rather than by reducing the vitreous zone.
Wu, Y, Holding DR, Messing J.  2010.  Gamma-zeins are essential for endosperm modification in quality protein maize. Proc Natl Acad Sci U S A. 107:12810-5.
Essential amino acids like lysine and tryptophan are deficient in corn meal because of the abundance of zein storage proteins that lack these amino acids. A natural mutant, opaque 2 (o2) causes reduction of zeins, an increase of nonzein proteins, and as a consequence, a doubling of lysine levels. However, o2's soft inferior kernels precluded its commercial use. Breeders subsequently overcame kernel softness, selecting several quantitative loci (QTLs), called o2 modifiers, without losing the high-lysine trait. These maize lines are known as "quality protein maize" (QPM). One of the QTLs is linked to the 27-kDa gamma-zein locus on chromosome 7S. Moreover, QPM lines have 2- to 3-fold higher levels of the 27-kDa gamma-zein, but the physiological significance of this increase is not known. Because the 27- and 16-kDa gamma-zein genes are highly conserved in DNA sequence, we introduced a dominant RNAi transgene into a QPM line (CM105Mo2) to eliminate expression of them both. Elimination of gamma-zeins disrupts endosperm modification by o2 modifiers, indicating their hypostatic action to gamma-zeins. Abnormalities in protein body structure and their interaction with starch granules in the F1 with Mo2/+; o2/o2; gammaRNAi/+ genotype suggests that gamma-zeins are essential for restoring protein body density and starch grain interaction in QPM. To eliminate pleiotropic effects caused by o2, the 22-kDa alpha-zein, gamma-zein, and beta-zein RNAis were stacked, resulting in protein bodies forming as honeycomb-like structures. We are unique in presenting clear demonstration that gamma-zeins play a mechanistic role in QPM, providing a previously unexplored rationale for molecular breeding.
International-Brachypodium-Initiative.  2010.  Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature. 463:763-8.
Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.
Wu, Y, Messing J.  2010.  Rescue of a dominant mutant with RNA interference. Genetics. 186:1493-6.
Maize Mucronate1 is a dominant floury mutant based on a misfolded 16-kDa gamma-zein protein. To prove its function, we applied RNA interference (RNAi) as a dominant suppressor of the mutant seed phenotype. A gamma-zein RNAi transgene was able to rescue the mutation and restore normal seed phenotype. RNA interference prevents gene expression. In most cases, this is used to study gene function by creating a new phenotype. Here, we use it for the opposite purpose. We use it to reverse the creation of a mutant phenotype by restoring the normal phenotype. In the case of the maize Mucronate1 (Mc1) phenotype, interaction of a misfolded protein with other proteins is believed to be the basis for the Mc1 phenotype. If no misfolded protein is present, we can reverse the mutant to the normal phenotype. One can envision using this approach to study complex traits and in gene therapy.
Goettel, W, Messing J.  2010.  Divergence of gene regulation through chromosomal rearrangements. BMC Genomics. 11:678.
BACKGROUND: The molecular mechanisms that modify genome structures to give birth and death to alleles are still not well understood. To investigate the causative chromosomal rearrangements, we took advantage of the allelic diversity of the duplicated p1 and p2 genes in maize. Both genes encode a transcription factor involved in maysin synthesis, which confers resistance to corn earworm. However, p1 also controls accumulation of reddish pigments in floral tissues and has therefore acquired a new function after gene duplication. p1 alleles vary in their tissue-specific expression, which is indicated in their allele designation: the first suffix refers to red or white pericarp pigmentation and the second to red or white glume pigmentation. RESULTS: Comparing chromosomal regions comprising p1-ww[4Co63], P1-rw1077 and P1-rr4B2 alleles with that of the reference genome, P1-wr[B73], enabled us to reconstruct additive events of transposition, chromosome breaks and repairs, and recombination that resulted in phenotypic variation and chimeric regulatory signals. The p1-ww[4Co63] null allele is probably derived from P1-wr[B73] by unequal crossover between large flanking sequences. A transposon insertion in a P1-wr-like allele and NHEJ (non-homologous end-joining) could have resulted in the formation of the P1-rw1077 allele. A second NHEJ event, followed by unequal crossover, probably led to the duplication of an enhancer region, creating the P1-rr4B2 allele. Moreover, a rather dynamic picture emerged in the use of polyadenylation signals by different p1 alleles. Interestingly, p1 alleles can be placed on both sides of a large retrotransposon cluster through recombination, while functional p2 alleles have only been found proximal to the cluster. CONCLUSIONS: Allelic diversity of the p locus exemplifies how gene duplications promote phenotypic variability through composite regulatory signals. Transposition events increase the level of genomic complexity based not only on insertions but also on excisions that cause DNA double-strand breaks and trigger illegitimate recombination.
Wang, W, Wu Y, Yan Y, Ermakova M, Kerstetter R, Messing J.  2010.  DNA barcoding of the Lemnaceae, a family of aquatic monocots. BMC Plant Biol. 10:205.
BACKGROUND: Members of the aquatic monocot family Lemnaceae (commonly called duckweeds) represent the smallest and fastest growing flowering plants. Their highly reduced morphology and infrequent flowering result in a dearth of characters for distinguishing between the nearly 38 species that exhibit these tiny, closely-related and often morphologically similar features within the same family of plants. RESULTS: We developed a simple and rapid DNA-based molecular identification system for the Lemnaceae based on sequence polymorphisms. We compared the barcoding potential of the seven plastid-markers proposed by the CBOL (Consortium for the Barcode of Life) plant-working group to discriminate species within the land plants in 97 accessions representing 31 species from the family of Lemnaceae. A Lemnaceae-specific set of PCR and sequencing primers were designed for four plastid coding genes (rpoB, rpoC1, rbcL and matK) and three noncoding spacers (atpF-atpH, psbK-psbI and trnH-psbA) based on the Lemna minor chloroplast genome sequence. We assessed the ease of amplification and sequencing for these markers, examined the extent of the barcoding gap between intra- and inter-specific variation by pairwise distances, evaluated successful identifications based on direct sequence comparison of the "best close match" and the construction of a phylogenetic tree. CONCLUSIONS: Based on its reliable amplification, straightforward sequence alignment, and rates of DNA variation between species and within species, we propose that the atpF-atpH noncoding spacer could serve as a universal DNA barcoding marker for species-level identification of duckweeds.
Salse, J, Abrouk M, Bolot S, Guilhot N, Courcelle E, Faraut T, Waugh R, Close TJ, Messing J, Feuillet C.  2009.  Reconstruction of monocotelydoneous proto-chromosomes reveals faster evolution in plants than in animals. Proc Natl Acad Sci U S A. 106:14908-13.
Paleogenomics seeks to reconstruct ancestral genomes from the genes of today's species. The characterization of paleo-duplications represented by 11,737 orthologs and 4,382 paralogs identified in five species belonging to three of the agronomically most important subfamilies of grasses, that is, Ehrhartoideae (rice) Panicoideae (sorghum, maize), and Pooideae (wheat, barley), permitted us to propose a model for an ancestral genome with a minimal size of 33.6 Mb structured in five proto-chromosomes containing at least 9,138 predicted proto-genes. It appears that only four major evolutionary shuffling events (alpha, beta, gamma, and delta) explain the divergence of these five cereal genomes during their evolution from a common paleo-ancestor. Comparative analysis of ancestral gene function with rice as a reference indicated that five categories of genes were preferentially modified during evolution. Furthermore, alignments between the five grass proto-chromosomes and the recently identified seven eudicot proto-chromosomes indicated that additional very active episodes of genome rearrangements and gene mobility occurred during angiosperm evolution. If one compares the pace of primate evolution of 90 million years (233 species) to 60 million years of the Poaceae (10,000 species), change in chromosome structure through speciation has accelerated significantly in plants.
Wu, Y, Goettel W, Messing J.  2009.  Non-Mendelian regulation and allelic variation of methionine-rich delta-zein genes in maize. Theor Appl Genet.
Sufficient methionine levels in the seed are critical for the supply of a balanced diet for feed and food. Currently, animal feed is supplemented with chemically synthesized methionine, which could be completely replaced with naturally synthesized methionine. However, insufficient levels of methionine are due to alleles of two genes in the maize genome that are expressed during seed development, which have a high percentage of methionine codons, ranging from 23 to 28%, while free methionine is very low. The two genes, dzs10 and dzs18, belong to the prolamin gene family that arose during the evolution of the grasses and were duplicated during a whole genome duplication event. We have found several dzs10 and dzs18 null alleles caused either by transposon insertion or frame shift mutations. Maize seeds with null mutations of both genes have a normal phenotype in contrast to other prolamin genes, explaining the accumulation of methionine deficiency in normal breeding efforts. Moreover, the trans-regulation of these genes deviates from Mendelian inheritance. One allele of the regulatory locus dzr1 is inherited in a parent-of-origin fashion, while another allele appears to prevent Mendelian segregation of the high-methionine phenotype in backcrosses.
Goettel, W, Messing J.  2009.  Change of gene structure and function by non-homologous end-joining, homologous recombination, and transposition of DNA. PLoS Genet. 5:e1000516.
An important objective in genome research is to relate genome structure to gene function. Sequence comparisons among orthologous and paralogous genes and their allelic variants can reveal sequences of functional significance. Here, we describe a 379-kb region on chromosome 1 of maize that enables us to reconstruct chromosome breakage, transposition, non-homologous end-joining, and homologous recombination events. Such a high-density composition of various mechanisms in a small chromosomal interval exemplifies the evolution of gene regulation and allelic diversity in general. It also illustrates the evolutionary pace of changes in plants, where many of the above mechanisms are of somatic origin. In contrast to animals, somatic alterations can easily be transmitted through meiosis because the germline in plants is contiguous to somatic tissue, permitting the recovery of such chromosomal rearrangements. The analyzed region contains the P1-wr allele, a variant of the genetically well-defined p1 gene, which encodes a Myb-like transcriptional activator in maize. The P1-wr allele consists of eleven nearly perfect P1-wr 12-kb repeats that are arranged in a tandem head-to-tail array. Although a technical challenge to sequence such a structure by shotgun sequencing, we overcame this problem by subcloning each repeat and ordering them based on nucleotide variations. These polymorphisms were also critical for recombination and expression analysis in presence and absence of the trans-acting epigenetic factor Ufo1. Interestingly, chimeras of the p1 and p2 genes, p2/p1 and p1/p2, are framing the P1-wr cluster. Reconstruction of sequence amplification steps at the p locus showed the evolution from a single Myb-homolog to the multi-gene P1-wr cluster. It also demonstrates how non-homologous end-joining can create novel gene fusions. Comparisons to orthologous regions in sorghum and rice also indicate a greater instability of the maize genome, probably due to diploidization following allotetraploidization.
Paterson, AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A et al.  2009.  The Sorghum bicolor genome and the diversification of grasses. Nature. 457:551-6.
Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.
Messing, J.  2009.  Synergy of two reference genomes for the grass family. Plant Physiol. 149:117-24.
Xu, JH, Messing J.  2009.  Amplification of prolamin storage protein genes in different subfamilies of the Poaceae. Theor Appl Genet.
Prolamins are seed storage proteins in cereals and represent an important source of essential amino acids for feed and food. Genes encoding these proteins resulted from dispersed and tandem amplification. While previous studies have concentrated on protein sequences from different grass species, we now can add a new perspective to their relationships by asking how their genes are shared by ancestry and copied in different lineages of the same family of species. These differences are derived from alignment of chromosomal regions, where collinearity is used to identify prolamin genes in syntenic positions, also called orthologous gene copies. New or paralogous gene copies are inserted in tandem or new locations of the same genome. More importantly, one can detect the loss of older genes. We analyzed chromosomal intervals containing prolamin genes from rice, sorghum, wheat, barley, and Brachypodium, representing different subfamilies of the Poaceae. The Poaceae commonly known as the grasses includes three major subfamilies, the Ehrhartoideae (rice), Pooideae (wheat, barley, and Brachypodium), and Panicoideae (millets, maize, sorghum, and switchgrass). Based on chromosomal position and sequence divergence, it becomes possible to infer the order of gene amplification events. Furthermore, the loss of older genes in different subfamilies seems to permit a faster pace of divergence of paralogous genes. Change in protein structure affects their physical properties, subcellular location, and amino acid composition. On the other hand, regulatory sequence elements and corresponding transcriptional activators of new gene copies are more conserved than coding sequences, consistent with the tissue-specific expression of these genes.
Bolot, S, Abrouk M, Masood-Quraishi U, Stein N, Messing J, Feuillet C, Salse J.  2009.  The 'inner circle' of the cereal genomes. Curr Opin Plant Biol. 12:119-25.
Early marker-based macrocolinearity studies between the grass genomes led to arranging their chromosomes into concentric 'crop circles' of synteny blocks that initially consisted of 30 rice-independent linkage groups representing the ancestral cereal genome structure. Recently, increased marker density and genome sequencing of several cereal genomes allowed the characterization of intragenomic duplications and their integration with intergenomic colinearity data to identify paleo-duplications and propose a model for the evolution of the grass genomes from a common ancestor. On the basis of these data an 'inner circle' comprising five ancestral chromosomes was defined providing a new reference for the grass chromosomes and new insights into their ancestral relationships and origin, as well as an efficient tool to design cross-genome markers for genetic studies.
Calviño, M, Miclaus M, Bruggmann R, Messing J.  2009.  Molecular Markers for Sweet Sorghum Based on Microarray Expression Data. Rice. 2:129-142.
Using an Affymetrix sugarcane genechip, we previously identified 154 genes differentially expressed between grain and sweet sorghum. Although many of these genes have functions related to sugar and cell wall metabolism, dissection of the trait requires genetic analysis. Therefore, it would be advantageous to use microarray data for generation of genetic markers, shown in other species as single-feature polymorphisms (SFPs). As a test case, we used the GeSNP software to screen for SFPs between grain and sweet sorghum. Based on this screen, out of 58 candidate genes, 30 had single-nucleotide polymorphisms (SNPs) from which 19 had validated SFPs. The degree of nucleotide polymorphism found between grain and sweet sorghum was in the order of one SNP per 248 base pairs, with chromosome 8 being highly polymorphic. Indeed, molecular markers could be developed for a third of the candidate genes, giving us a high rate of return by this method.
Messing, J.  2009.  The Polyploid Origin of Maize. The Maize Handbook: Domestication, Genetics, and Genome. :221-238.
Messing, J.  2009.  Shotgun DNA sequencing bearing fruits: probing the dynamics of genome size. Intern J Nat & Eng Sci. 3:1-6.
Messing, J.  2009.  The Structure of the Maize Genome. Molecular Genetic Approaches to Maize Improvement, Biotechnology in Agriculture and Forestry. 63:213-230.
Xu, J-H, Messing J.  2008.  Diverged Copies of the Seed Regulatory Opaque-2 Gene by a Segmental Duplication in the Progenitor Genome of Rice, Sorghum, and Maize. Mol Plant. 1:760-769. doi:10.1093/mp/ssn038.
Comparative analyses of the sequence of entire genomes have shown that gene duplications, chromosomal segmental duplications, or even whole genome duplications (WGD) have played prominent roles in the evolution of many eukaryotic species. Here, we used the ancient duplication of a well known transcription factor in maize, encoded by the Opaque-2 (O2) locus, to examine the general features of divergences of chromosomal segmental duplications in a lineage-specific manner. We took advantage of contiguous chromosomal sequence information in rice (Oryza sativa, Nipponbare), sorghum (Sorghum bicolor, Btx623), and maize (Zea mays, B73) that were aligned by conserved gene order (synteny). This analysis showed that the maize O2 locus is contained within a 1.25 million base-pair (Mb) segment on chromosome 7, which was duplicated approximately 56 million years ago (mya) before the split of rice and maize 50 mya. The duplicated region on chromosome 1 is only half the size and contains the maize OHP gene, which does not restore the o2 mutation although it encodes a protein with the same DNA and protein binding properties in endosperm. The segmental duplication is not only found in rice, but also in sorghum, which split from maize 11.9 mya. A detailed analysis of the duplicated regions provided examples for complex rearrangements including deletions, duplications, conversions, inversions, and translocations. Furthermore, the rice and sorghum genomes appeared to be more stable than the maize genome, probably because maize underwent allotetraploidization and then diploidization.
Xu, JH, Messing J.  2008.  Organization of the prolamin gene family provides insight into the evolution of the maize genome and gene duplications in grass species. Proc Natl Acad Sci U S A. 105:14330-5.
Zea mays, commonly known as corn, is perhaps the most greatly produced crop in terms of tonnage and a major food, feed, and biofuel resource. Here we analyzed its prolamin gene family, encoding the major seed storage proteins, as a model for gene evolution by syntenic alignments with sorghum and rice, two genomes that have been sequenced recently. Because a high-density gene map has been constructed for maize inbred B73, all prolamin gene copies can be identified in their chromosomal context. Alignment of respective chromosomal regions of these species via conserved genes allows us to identify the pedigree of prolamin gene copies in space and time. Its youngest and largest gene family, the alpha prolamins, arose about 22-26 million years ago (Mya) after the split of the Panicoideae (including maize, sorghum, and millet) from the Pooideae (including wheat, barley, and oats) and Oryzoideae (rice). The first dispersal of alpha prolamin gene copies occurred before the split of the progenitors of maize and sorghum about 11.9 Mya. One of the two progenitors of maize gained a new alpha zein locus, absent in the other lineage, to form a nonduplicated locus in maize after allotetraploidization about 4.8 Mya. But dispersed copies gave rise to tandem duplications through uneven expansion and gene silencing of this gene family in maize and sorghum, possibly because of maize's greater recombination and mutation rates resulting from its diploidization process. Interestingly, new gene loci in maize represent junctions of ancestral chromosome fragments and sites of new centromeres in sorghum and rice.