SARS-CoV-2’s closest relative, RaTG13, was generated from a bat transcriptome, not a fecal swab: implications for the origin of COVID-19
Abstract: RaTG13 is the coronavirus genome phylogenetically closest to SARS-CoV-2; understanding its provenance is therefore of key importance to understanding the origin of the COVID-19 pandemic. The RaTG13 NGS dataset is attributed to a fecal swab from the intermediate horseshoe bat, Rhinolophus affinis. However, sequence analysis reveals that this is unlikely. Metagenomic analysis using Metaxa2 shows that only 10.3 % of small subunit (SSU) rRNA sequences in the dataset are bacterial, which is inconsistent with a fecal sample, since fecal samples are typically dominated by bacterial sequences. In addition, the bacterial taxa present in the sample are inconsistent with fecal material. Assembly of mitochondrial SSU rRNA sequences in the dataset produces a contig 98.7 % identical to R. affinis mitochondrial SSU rRNA, indicating that the sample was generated from this or a closely related species. Of the NGS reads, 87.5 % map to the Rhinolophus ferrumequinum genome, the closest available bat genome to R. affinis. In the annotated genome assembly, 62.2 % of mapped reads map to protein-coding genes. These results clearly demonstrate that the dataset represents a Rhinolophus sp. transcriptome, and not a fecal swab sample. Overall, the data show that the RaTG13 dataset was generated by the Wuhan Institute of Virology (WIV) from a transcriptome derived from Rhinolophus sp. tissue or a cell line, indicating that RaTG13 was in live culture. This raises the question of whether the WIV was culturing additional unreported coronaviruses closely related to SARS-CoV-2 prior to the pandemic. The implications for the origin of the COVID-19 pandemic are discussed.
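The bacterial-fraction argument above reduces to simple arithmetic on per-group SSU rRNA read counts. The following is a minimal sketch of that calculation, not the pipeline used in this work: the function names, the 0.5 cutoff used to stand in for "dominated by bacterial sequences", and the read counts are all hypothetical placeholders chosen for illustration and are not values from the RaTG13 dataset.

```python
from typing import Mapping


def group_fractions(counts: Mapping[str, int]) -> dict[str, float]:
    """Return each taxonomic group's share of all classified SSU rRNA reads."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("no classified SSU rRNA reads")
    return {group: n / total for group, n in counts.items()}


def bacteria_dominated(counts: Mapping[str, int], cutoff: float = 0.5) -> bool:
    """True if bacterial reads exceed `cutoff` of all classified reads.

    The 0.5 default is an assumed stand-in for "dominated by bacteria";
    it is not a threshold taken from the paper.
    """
    return group_fractions(counts).get("bacteria", 0.0) > cutoff


# Hypothetical placeholder counts (NOT real data), e.g. as might be tallied
# from the per-read taxonomy output of an SSU rRNA classifier.
example_counts = {
    "bacteria": 103,
    "eukaryota": 850,
    "mitochondria": 40,
    "archaea": 7,
}

fractions = group_fractions(example_counts)
is_fecal_like = bacteria_dominated(example_counts)
```

With counts like these, the bacterial share falls well below the assumed cutoff, which is the pattern the abstract describes as inconsistent with fecal material.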