De novo genome sequencing for endangered bird of prey species

Pavle Erić1*, Marija Tanasković1, Aleksandra Patenković1, Katarina Erić1, Irena Hribšek1, Kristijan Ovari2 and Slobodan Davidović1

1 Department of Genetics of Populations and Ecogenotoxicology, Institute for Biological Research “Siniša Stanković” – National Institute of the Republic of Serbia, University of Belgrade, Belgrade, Serbia

2 Belgrade ZOO, Belgrade, Serbia

pavle.eric [at] ibiss.bg.ac.rs

Abstract

The Eastern Imperial Eagle (Aquila heliaca) is a large migratory bird of prey whose breeding sites span from eastern Czechia and Austria to northwestern China and Mongolia. Because its populations are declining throughout its range, the IUCN Red List categorizes the species as Vulnerable, and it has become the subject of numerous conservation efforts. Developing effective conservation strategies requires a comprehensive understanding of the genetic variability of these populations. A reference genome is a cornerstone of such work, giving conservationists a starting point from which further analyses can assess population dynamics, adaptive potential and evolutionary history. Therefore, we performed whole-genome sequencing of A. heliaca.

The genome of a male A. heliaca was sequenced using Illumina paired-end 150 bp short reads and assembled de novo, as no reference genome is available for the species. Even after paired-end information was used for scaffolding, the assembly remained highly fragmented, with the genome represented by hundreds of thousands of contigs, primarily because of the inherent limitations of short-read sequencing in resolving repetitive regions and regions with strong nucleotide composition bias. To improve scaffolding, we used a chromosome-level assembly of a closely related species, Aquila chrysaetos, available in GenBank. BLAST analysis revealed high sequence similarity (~94%) between sequences from our assembly and the A. chrysaetos reference genome. The absence of major rearrangements or inversions in the selected contigs supported the use of the A. chrysaetos reference genome for scaffolding. In this way, we generated a chromosome-level assembly for A. heliaca comprising 26 autosomes and the Z sex chromosome. Although the assembly contains a few thousand unplaced scaffolds, ranging in size from over 700 kb to very small fragments, the vast majority (>98%) of the assembly was assigned to the 27 chromosome-level scaffolds. BUSCO assessment gave a completeness score of 97.2%.
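Two of the summary figures mentioned above, assembly contiguity and the share of the assembly assigned to chromosome-level scaffolds, follow directly from scaffold lengths. The Python sketch below illustrates how such figures could be computed. It is not the pipeline used in this study: the function names (`n50`, `placed_fraction`), the scaffold names and the toy lengths are all hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50: the largest length L such that scaffolds of
    length >= L together cover at least half of the total assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def placed_fraction(scaffold_lengths, chromosome_names):
    """Return the fraction of assembled bases that sit in scaffolds
    assigned to chromosomes.

    scaffold_lengths: dict mapping scaffold name -> length in bp
    chromosome_names: set of scaffold names treated as chromosomes
    """
    total = sum(scaffold_lengths.values())
    placed = sum(
        length
        for name, length in scaffold_lengths.items()
        if name in chromosome_names
    )
    return placed / total


# Toy data (hypothetical, NOT the A. heliaca assembly).
toy_assembly = {
    "chr1": 500,
    "chr2": 300,
    "chrZ": 150,
    "unplaced_1": 30,
    "unplaced_2": 20,
}
toy_chromosomes = {"chr1", "chr2", "chrZ"}

toy_n50 = n50(list(toy_assembly.values()))
toy_placed = placed_fraction(toy_assembly, toy_chromosomes)
```

In practice the scaffold lengths would be read from the assembly FASTA or its index file rather than typed in by hand, and the set of chromosome names would come from the scaffolding output.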

Keywords: Aquila heliaca, WGS, de novo genome assembly, conservation

Acknowledgement: We would like to express our sincere gratitude to Professor Goran Rakočević for his invaluable help and suggestions in tackling the assembly process. We are also grateful to the RAF School of Computing for granting us access to their server. Their support was pivotal in enabling the assembly of the genome, and we greatly appreciate their contributions.