The Tea Genome Revealed

For decades, the DNA encoded within the tea plant remained a mystery. Now, the tea genome has finally been sequenced. Tea (Camellia sinensis) can don the DNA-decoded laurels along with dozens of the world’s most important crops, including two other distantly related caffeine-bearing cousins: coffee (Coffea canephora), sequenced in 2014, and cacao (Theobroma cacao), sequenced in 2010.

The scientific paper which revealed this information, published May 1, 2017, is officially titled:
The Tea Tree Genome Provides Insights into Tea Flavor and Independent Evolution of Caffeine Biosynthesis, Xia et al., Molecular Plant (2017) and you can read the whole paper here:

Researchers extracted samples from tea trees growing in Yunnan, specifically a Camellia sinensis var. assamica of the Yunkang 10 cultivar. Even with today’s technology, sequencing DNA is no easy task. Officially, the paper has over two dozen authors, but we can be assured dozens more worked on this massive project which took five years to complete. Lizhi Gao, a researcher on the project and the main media contact, said, “Our lab has successfully sequenced and assembled more than twenty plant genomes. But this genome, the tea tree genome, was tough.”

Besides the pure intellectual feat of uncovering the tea genome, it would seem one of the objectives of the study was to find the genetic basis of the plant’s flavors, specifically the ones derived from flavonoid catechins, caffeine, and theanine. I’ll refrain from extrapolating too much, but here are some clear-cut findings within the genome from Camellia sinensis:

It’s Big.

Tea lies within a botanical group called Asterids, along with coffee, tomato, potato, and pepper plants. The tea genome is bigger than all of them, four times larger than that of coffee, and bigger than almost all other sequenced plant species.

The tea tree’s large genome is the result of an unusually high number of repeating genetic sequences. This is the result of what the paper calls, “…slow, steady, and long-term amplification of a few LTR retrotransposon families.” These “retrotransposon” families copy their code repeatedly into the genome. In fact, the paper reports a single family of retrotransposon persisting in the genome for 50 million years, a small percentage of which could be due to human cultivation.
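For the code-curious, here is a toy Python sketch of that copy-and-paste idea. It is not the paper's method, and every name and sequence in it is made up for illustration. Real retrotransposons copy themselves through an RNA intermediate; this sketch only shows how repeatedly pasting in copies of one short element makes a genome grow.

```python
import random


def copy_paste_element(genome: str, element: str, rng: random.Random) -> str:
    """Paste one more copy of `element` at a random position in `genome`."""
    position = rng.randrange(len(genome) + 1)
    return genome[:position] + element + genome[position:]


def amplify(genome: str, element: str, rounds: int, seed: int = 0) -> str:
    """Run `rounds` copy-and-paste events and return the enlarged genome."""
    rng = random.Random(seed)  # fixed seed so the toy run is repeatable
    for _ in range(rounds):
        genome = copy_paste_element(genome, element, rng)
    return genome


if __name__ == "__main__":
    # Hypothetical sequences, chosen only for illustration.
    element = "TTAGGCAT"
    start = "ACGT" * 25 + element + "ACGT" * 25
    grown = amplify(start, element, rounds=50)

    added = len(grown) - len(start)
    print(f"start length: {len(start)}")
    print(f"final length: {len(grown)}")
    print(f"share of final genome added by copying: {added / len(grown):.0%}")
```

Each round adds exactly one element's worth of letters, so growth is slow and steady. Given enough rounds, the pasted copies end up outweighing the original sequence.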

The tea plant also went through two whole genome duplication (WGD) events. The first was shared with its distant cousins, the kiwifruit and the grape; the second with closely related Camellia plants. What causes these events is unclear (at least to me), but they enlarged the genome even further.

More Resistance, More Flavor.

The huge number of repeating gene sequences and the two WGD events written into the DNA have contributed to the abundance of genes that make the tea plant’s DNA what it is. Specifically, tea has an abundance of genes that help it resist disease, withstand physical stress, and identify pathogens within the plant. Tea has more of these genes than kiwifruit, tomato, cacao, and thale cress, which are related plants in tea’s higher-order clade, the Eudicots. The paper further suggests that this heap of disease-resistance genes is what makes the tea plant so tolerant and so well suited to growth and cultivation in many diverse climates across the globe.

This mass duplication of genes has a dramatic effect on the metabolic compounds in tea, known to us as flavor components and the caffeine/theanine content. All Camellia species have these same genes which encode proteins that make the flavonoids, caffeine, and theanine. However, Camellia sinensis and its derivatives have many more of these genes, and they’re the only ones to express these genes at the level of the mature plant.

This, of course, is why humans love tea so much: the over-abundance of flavor-producing, mentally-stimulating, and generally healthy compounds.

Interestingly, even though caffeine is common to tea, coffee, and cacao (all Eudicots), previous theories proposed that caffeine evolved differently in tea than in coffee. Now that all three genomes have been sequenced, this theory is substantiated by scientific analysis. Caffeine in tea arose quite suddenly and recently in its evolution.

Why This Matters.

The paper’s authors don’t extrapolate much from their research, as good scientists shouldn’t. But they do venture some ideas about how the research could be applied, and their propositions are intriguing.

  • A complete tea genome will allow scientists to help breeders and growers more effectively cross-breed Camellia species and varieties to emphasize and capitalize on the unique properties of these plants. It also allows a genetic timeline of the tea plant to be established and conserved for future research.
  • Since the pathways for the immune system of the tea plant have been identified, scientists can study those genes further and potentially create more disease- and pest-resistant tea plants.
  • Finally, since the genes responsible for the synthesis of tea’s catechins, caffeine, and theanine have been identified and these systems specifically analyzed, this could lead to gene-specific manipulation to create novel expressions and flavors within the tea plant.

Where does all this lead?
Generally, to a better understanding of the plant that we all love.
Specifically, to a full indexing of the code underlying every nuance of the tea plant’s specific biology.
The future? We will see, won’t we? Let us know what you think in the comments!


*Photo Credit: Zach Ware (edited)


About the Author:

Jordan has spent most of his life working in the food and beverage industry. His professional experience with tea started at American Tea Room in Los Angeles, where he worked for almost six years, becoming their Beverage Director and helping in a three-location expansion. Later, he moved to developing, training, and menu-building for what would become Alfred Tea Room, which he's helped expand into Japan. He now serves as Food & Beverage Director for Alfred Inc., which includes Alfred Tea Room and multiple locations of Alfred Coffee in Los Angeles and Austin.


  1. Dweezy May 12, 2017 at 8:49 pm

    This might be a n00b question but I failed science in school.

    Just kidding. Maybe.

    But if scientists tinker with the plant based on these new genome maps, do these new plants carry the name “GMO”? Are we going to get glowing, disco tea plants to have high mountain oolong tea raves? Am I out of my mind?

    Just trying to “get” the limits of what’s possible here with the map tucked snug under the arms of maniacal and non-maniacal tea scientists.

    • Jordan G. Hardin May 13, 2017 at 1:49 pm

      It definitely carries the potential for tinkering to happen, which isn’t of itself a bad thing. If the tinkering is done in a lab directly into the DNA, then technically, yes, it would be a GMO. If it’s done in the field, with cross-breeding and hybridization and clonal cuttings, then that’s just good old-fashioned horticulture.

  2. @teascientist May 8, 2017 at 10:47 pm

    Yes we are. Big stride.

  3. @worldteapodcast May 8, 2017 at 8:29 pm

    Fascinating article, Jordan, an exciting read to say the least. Given the numerous tea research institutions, my question is how long it will take before this information reaches everyone. More so, how will this information be applied by each country that gets its hands on it? Not to mention that, despite the wealth of information that the genome provides, tea, unlike other foods such as tomatoes, undergoes post-harvest processing to bring out the flavour, tastes, and aromas. The coming years are sure to be interesting. I’ll be waiting on the sidelines to get another episode of the World Tea Podcast recorded!

    • Jordan G. Hardin May 8, 2017 at 9:43 pm

      Thanks! And excellent point. I’m not sure how this kind of information is spread amongst scientists, or how quickly that leads to results… but based on my searches, I can assure you that papers come out nearly constantly and there isn’t a lack of communication. They constantly cite each other. The research stations in China are especially interconnected and very well developed. Other countries, hard to say. But yes, the coming years are going to be very interesting. And fun podcast, by the way, I’ve been listening for some months now!

  4. @teascientist May 8, 2017 at 5:11 am

    Nicely summarised.

    Though I agree with the 3 broad areas where the results can be applied, I think each has a shorter, simpler alternative approach.
    Take the example of developing new breeds. This requires cross-breeding the most diverse cultivars, as the likelihood of novel outcomes will be highest. What is required for this is a mapping of the genetic diversity of the currently available cultivars, creating a core collection of the most diverse, and then developing a genotype-phenotype correlation, which will help in rapid selection of the new seedlings obtained by cross-breeding. All this can be achieved without full genome mapping, but of course will be strengthened by it. At the same time, full genome mapping alone will not take us to this goal.

    • Jordan G. Hardin May 8, 2017 at 10:58 am

      Interesting point. The paper also mentioned trying to source older, more wild cultivars to create this mapping you’re talking about and to better understand the specific evolution. At least we’re part of the way there!
