For decades, the DNA encoded within the tea plant has remained a mystery. Now, the tea genome has finally been sequenced. Tea (Camellia sinensis) can don the DNA-decoded laurels along with dozens of the world’s most important crops, including two other distantly related caffeine-bearing cousins: coffee (Coffea canephora), sequenced in 2014, and cacao (Theobroma cacao), sequenced in 2010.

The scientific paper that revealed this information, published May 1, 2017, is titled “The Tea Tree Genome Provides Insights into Tea Flavor and Independent Evolution of Caffeine Biosynthesis” (Xia et al., Molecular Plant, 2017).

Researchers extracted samples from tea trees growing in Yunnan, specifically a Camellia sinensis var. assamica of the Yunkang 10 cultivar. Even with today’s technology, sequencing DNA is no easy task. Officially, the paper has over two dozen authors, but many more people likely contributed to this massive project, which took five years to complete. Lizhi Gao, a researcher on the project and the main media contact, said, “Our lab has successfully sequenced and assembled more than twenty plant genomes. But this genome, the tea tree genome, was tough.”

Besides the pure intellectual feat of uncovering the tea genome, it would seem one of the objectives of the study was to find the genetic basis of the plant’s flavors, specifically the ones derived from flavonoid catechins, caffeine, and theanine. I’ll refrain from extrapolating too much, but here are some clear-cut findings within the genome from Camellia sinensis:

It’s Big.

Tea lies within a botanical group called Asterids, along with coffee, tomato, potato, and pepper plants. The tea genome is bigger than all of them, four times larger than that of coffee, and bigger than almost all other sequenced plant species.

The tea tree’s large genome is the result of an unusually high number of repeating genetic sequences. These come from what the paper calls “…slow, steady, and long-term amplification of a few LTR retrotransposon families.” These “retrotransposon” families copy their code repeatedly into the genome. In fact, the paper reports a single retrotransposon family persisting in the genome for 50 million years, a small percentage of which could be due to human cultivation.

The tea plant also went through two whole-genome duplication (WGD) events. The first was shared with its distant cousins, the kiwifruit and the grape; the second was shared with closely related Camellia plants. What causes these events is unclear (at least to me), but they expanded the genome even more.

More Resistance, More Flavor.

The huge number of repeating gene sequences and the two WGD events written into the DNA have contributed to the abundance of genes that make the tea plant’s DNA what it is. Specifically, tea has an abundance of genes that confer resistance to disease and physical stress, as well as genes that identify pathogens within the plant. Tea has more of these genes than kiwifruit, tomato, cacao, and thale cress, which are related plants in tea’s higher-order clade, the Eudicots. The paper further suggests that this heap of disease-resistance genes is what makes the tea plant so well suited to growth and cultivation in many diverse climates across the globe.

This mass duplication of genes has a dramatic effect on the metabolic compounds in tea, known to us as flavor components and the caffeine/theanine content. All Camellia species have these same genes which encode proteins that make the flavonoids, caffeine, and theanine. However, Camellia sinensis and its derivatives have many more of these genes, and they’re the only ones to express these genes at the level of the mature plant.

This, of course, is why humans love tea so much: the over-abundance of flavor-producing, mentally stimulating, and generally healthy compounds.

Interestingly, even though caffeine is found in tea, coffee, and cacao (all Eudicots), earlier theories proposed that caffeine evolved independently in tea and in coffee. With all three genomes sequenced, that theory is now supported by genomic analysis. Caffeine in tea arose quite suddenly and recently in its evolution.

Why This Matters.

The paper’s authors don’t extrapolate much from their research, as good scientists shouldn’t. But they do venture some ideas about how the research could be applied, and their propositions are intriguing.

  • A complete tea genome will allow scientists to help breeders and growers more effectively cross-breed Camellia species and varieties to emphasize and capitalize on the unique properties of these plants. It also allows a genetic timeline of the tea plant to be established and conserved for future research.
  • Since the pathways for the immune system of the tea plant have been identified, scientists can study those genes further and potentially create more disease- and pest-resistant tea plants.
  • Finally, since the genes responsible for the synthesis of tea’s catechins, caffeine, and theanine have been identified and these systems specifically analyzed, this could lead to gene-specific manipulation to create novel expressions and flavors within the tea plant.

Where does all this lead?
Generally, to a better understanding of the plant that we all love.
Specifically, to a full indexing of the code underlying every nuance of the tea plant’s specific biology.
The future? We will see, won’t we? Let us know what you think in the comments!


*Photo Credit: Zach Ware (edited)