Scientific Publication

A large set of 26 new reference transcriptomes dedicated to comparative population genomics in crops and wild relatives

Abstract

We produced a large and unique dataset of reference transcriptomes to gain new insight into the evolution of plant genomes and crop domestication. For this purpose, we validated an RNA-Seq data assembly protocol for comparative population genomics. For the validation, we assessed and compared the quality of de novo Illumina short-read assemblies using data from two crops for which an annotated reference genome was available, namely grapevine and sorghum. We then used the same protocol to release 26 new transcriptomes of crop plants and wild relatives, including still understudied crops such as yam, pearl millet and fonio. The species list has a wide taxonomic representation, with 15 monocots and 11 eudicots. All contigs were annotated using BLAST, prot4EST, and Blast2GO. A distinctive feature of the dataset is that each crop is associated with closely related species, which will permit whole-genome comparative evolutionary studies between crops and their wild relatives. This large resource will thus serve research communities working on both crops and model organisms. All the data are available at http://arcad-bioinformatics.southgreen.fr/.