Gene presence/absence (P/A) polymorphisms are commonly observed in plants and are important for individual adaptation and species differentiation. Characterizing their abundance, distribution and variation among individuals would help clarify the role these polymorphisms play in a given species. The 80 recently sequenced Arabidopsis genomes provide an opportunity to address these questions.
By systematically investigating these accessions, we identified 2,407 P/A genes (8.9%) that were absent in one or more genomes, with an average of 444 absent genes per accession. Of these P/A genes, 50.6% belonged to multi-copy gene families and 31.0% to clustered genes. However, the highest proportion of P/A genes, predominantly singleton genes, was observed in regions near centromeres. In addition, a significant correlation was observed between P/A gene frequency among the 80 accessions and the diversity level at P/A loci. Furthermore, the proportion of P/A genes differed among functional gene categories. Finally, a P/A gene tree revealed a diversified population structure among worldwide Arabidopsis accessions.
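As an illustration of how summary statistics of this kind can be derived, the following minimal sketch computes the number and fraction of P/A genes, the average number of absent genes per accession, and the absence frequency of each P/A gene from a binary presence/absence matrix. The matrix, the gene names and the function `summarize_pa` are hypothetical and invented for this example; the sketch assumes presence/absence calls are already available and does not reproduce the detection procedure used in this study.

```python
from statistics import mean

# Toy presence/absence matrix (made-up values, not data from the study).
# Keys are genes; each list holds one call per accession: 1 = present, 0 = absent.
pa_matrix = {
    "geneA": [1, 1, 1, 1],
    "geneB": [1, 0, 1, 1],
    "geneC": [0, 0, 1, 1],
    "geneD": [1, 1, 1, 0],
    "geneE": [1, 1, 1, 1],
}


def summarize_pa(matrix):
    """Summarize P/A polymorphism in a gene-by-accession 0/1 matrix."""
    n_accessions = len(next(iter(matrix.values())))

    # A gene is counted as P/A if it is absent in at least one accession.
    pa_genes = [gene for gene, calls in matrix.items() if 0 in calls]

    # Number of absent genes in each accession (column-wise count of zeros).
    absent_per_accession = [
        sum(1 for calls in matrix.values() if calls[j] == 0)
        for j in range(n_accessions)
    ]

    # Fraction of accessions in which each P/A gene is absent.
    absence_frequency = {
        gene: matrix[gene].count(0) / n_accessions for gene in pa_genes
    }

    return {
        "n_pa_genes": len(pa_genes),
        "pa_fraction": len(pa_genes) / len(matrix),
        "mean_absent_per_accession": mean(absent_per_accession),
        "absence_frequency": absence_frequency,
    }


summary = summarize_pa(pa_matrix)
print(summary)
```

In this toy matrix, three of the five genes are absent in at least one accession, so the P/A fraction is 0.6, and the four accessions lack one gene each on average.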
We obtained an estimate of the number of P/A genes and their frequency distribution among worldwide Arabidopsis accessions. Our results suggest that diverse mechanisms generate or maintain P/A genes, through which individuals and functionally different genes can selectively maintain P/A polymorphisms for specific adaptations.