As Tomato Brown Rugose Fruit Virus infects tomato and pepper plants across the globe, seed companies are working on breeding a resistant tomato.
Rich genetic and genomic resources were available to tomato breeders even before the inception of the tomato’s genomic sequencing projects. Large tomato germplasm collections have long been established in several locations. More than 75,000 tomato accessions are conserved in genebanks around the world. One of the largest is the USDA genebank at Geneva, NY.
Sequencing of the tomato genome was initiated in 2005 as a multinational effort among 14 countries and completed in 2012. These efforts have been greatly accelerated with the application of artificial intelligence, machine learning and advanced algorithms to analyze large quantities of genetic and phenotypic data.
In 2020, researchers from the Boyce Thompson Institute created a high-quality reference genome for wild tomatoes to produce a more complete and useful tomato pan-genome. They discovered sections of the genome that underlie fruit flavor, size and ripening, stress tolerance and disease resistance. Thanks in part to cutting-edge sequencing technologies that can read very long pieces of DNA, the reference genome is more complete and accurate than the existing database.
Older sequencing technologies that read shorter pieces of DNA identify mutations at the single-base level. However, they are not good at finding structural variants such as insertions, deletions, inversions or duplications of large chunks of DNA.
In 2019, an international group of researchers constructed a tomato pan-genome using genome sequences from 725 phylogenetically and geographically representative varieties. This database includes 4,873 genes not found in the original reference genome. Presence/absence variation analyses revealed substantial gene loss and intense negative selection of genes and promoters during tomato domestication and improvement.
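For readers who want a concrete picture of what a presence/absence comparison involves, the idea can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual pipeline: the accession names and gene names below are invented toy data, and real analyses work from sequence alignments across hundreds of genomes rather than ready-made gene lists.

```python
from typing import Dict, Set

# Toy, made-up data (an assumption for illustration only):
# which hypothetical genes are detected in each hypothetical accession.
accession_genes: Dict[str, Set[str]] = {
    "wild_1": {"geneA", "geneB", "geneC", "geneD"},
    "wild_2": {"geneA", "geneB", "geneC"},
    "cultivar_1": {"geneA", "geneB"},
    "cultivar_2": {"geneA", "geneC"},
}


def pan_genome(genes_by_accession: Dict[str, Set[str]]) -> Set[str]:
    """Return every gene seen in at least one accession."""
    return set().union(*genes_by_accession.values())


def core_genome(genes_by_accession: Dict[str, Set[str]]) -> Set[str]:
    """Return only the genes present in every accession."""
    return set.intersection(*genes_by_accession.values())


def presence_absence(
    genes_by_accession: Dict[str, Set[str]],
) -> Dict[str, Dict[str, bool]]:
    """Build a gene-by-accession table of True (present) / False (absent)."""
    return {
        gene: {acc: gene in genes for acc, genes in genes_by_accession.items()}
        for gene in sorted(pan_genome(genes_by_accession))
    }


pan = pan_genome(accession_genes)
core = core_genome(accession_genes)
pav = presence_absence(accession_genes)

# Genes missing from at least one accession are the "variable" part
# of the pan-genome; in this toy set they hint at loss in the cultivars.
variable = pan - core
```

In this toy example the hypothetical `geneD` appears only in one wild accession, which is the kind of pattern that, at real scale, points to genes lost during domestication.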