Study of structural variants in cacao genomes yields clues about plant diversity
Posted on August 18, 2021
UNIVERSITY PARK, Pa. — An exhaustive and painstaking comparison of the genomes of multiple strains of the cacao tree by a team of researchers has provided insights into the role genomic structural variants play in the regulation of gene expression and chromosome evolution, giving rise to the differences within populations of the plant.
The research, which has implications for plant genetics in general, would not have been possible before powerful computers made the high-resolution sequencing of genomes possible, affordable and relatively fast, according to team member Mark Guiltinan, J. Franklin Styer Professor of Horticultural Botany and professor of plant molecular biology in Penn State’s College of Agricultural Sciences.
“The genomes of different populations of cacao trees are 99.9% identical, but it’s the structural variants in that one-tenth of 1% of their genomes that accounts for the plant’s diversity in different regions and its adaptation to climate and various diseases,” he said. “This study makes an association between structural variation and the ability of a plant to adapt to a local environment.”
Molecular geneticists have known for about a decade that genomic structural variants can play important roles in the adaptation and speciation of both plants and animals, but their overall influence on the fitness of plant populations is poorly understood. That’s partly because accurate population-level identification of structural variants requires analysis of multiple high-quality genome assemblies, which are not widely available.
In this study, the researchers investigated the fitness consequences of genomic structural variants in natural populations by analyzing and comparing chromosome-scale genome assemblies of 31 naturally occurring populations of Theobroma cacao, the long-lived tree species that is the source of chocolate. Among those 31 strains of cacao, they found more than 160,000 structural variants.
In findings published Aug. 16 in the Proceedings of the National Academy of Sciences, the researchers reported that most structural variants are deleterious and thus constrain adaptation of the cacao plant. These detrimental effects likely arise as a direct result of impaired gene function and as an indirect result of suppressed gene recombination over long periods of time, they noted.
However, despite the overall detrimental effects, the study also identified individual structural variants bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are among these candidates, highlighting the contribution of structural variants to this important local adaptation trait.
Beyond revealing new empirical evidence for the evolutionary importance of structural variants in all plants, documenting the genomic differences and structural variants among the 31 strains of cacao provides a valuable resource for ongoing genetic and breeding studies of the crop, Guiltinan noted.
“All cacao comes from the Amazon basin — plants were collected a long time ago from the wild by collectors and they were cloned, so we have a permanent collection,” he said. “Their genomes have been sequenced, and that represents a huge amount of work and data. As a result of this study, we know that structural variation is important to the survival of the plant, to the evolution of the plant and especially to the adaptation of the plant to local conditions.”
Also involved in the research at Penn State were Claude dePamphilis, director of the Center for Parasitic and Carnivorous Plants, Dorothy Foehr Huck and J. Lloyd Huck Distinguished Chair in Plant Biology and Evolutionary Genomics, and professor of biology; Eric Wafula, bioinformatics programmer, Eberly College of Science; and Paula Ralph, senior research technologist, Eberly College of Science. Other team members were Tuomas Hamala and Peter Tiffin, Department of Plant and Microbial Biology, University of Minnesota.
The National Science Foundation and the U.S. Department of Agriculture’s National Institute of Food and Agriculture supported this work.
Calculations for the study were performed using high-performance computing resources, including the Roar supercomputer operated by Penn State’s Institute for Computational and Data Sciences.