Contamination of the Toxoplasma gondii reference genome and an algorithm for its detection in paleometagenomic analysis

Sofia Dmitrienko1* and Artem Nedoluzhko2,3

1ITMO University

2Mammoth Museum, North-Eastern Federal University, Yakutsk, Russia

3National Research University Higher School of Economics, Moscow, 117418, Russia

sophiaverdieva [at] mail.ru

Abstract

Paleometagenomic methods can identify pathogens in ancient samples, but their accuracy depends critically on the quality of reference databases. Contamination of reference genomes with foreign DNA can lead to false-positive taxonomic assignments, especially when working with low-coverage, degraded ancient DNA. In this study, while searching for an alveolate parasite in samples from the baby mammoth Yana (Mammuthus primigenius), we discovered significant contamination of the Toxoplasma gondii reference genome with human DNA and developed an algorithm to detect and verify such artifacts.

Keywords: Paleometagenomics, reference genome contamination