Dušan Ušjak¹*, Cong Feng², Valentina Đorđević¹ and Ming Chen²
¹Institute of Molecular Genetics and Genetic Engineering, University of Belgrade, Serbia
²College of Life Sciences, Zhejiang University
dusan.usjak [at] imgge.bg.ac.rs
Abstract
Cardiovascular diseases (CVDs) are the leading cause of morbidity and mortality worldwide, accounting for approximately 32% of all deaths annually. Despite this substantial burden, the molecular mechanisms underlying CVDs remain poorly understood. The aim of this study was to construct an integrated single-nucleus RNA-seq (snRNA-seq) atlas from heart samples representing different pathological conditions and to analyze gene expression patterns across cell types. Publicly available snRNA-seq datasets were systematically searched and curated to extract count matrices originating from left ventricular tissue samples. A total of 155 samples from eight independent datasets were included, comprising 75 unaffected controls and 80 samples from diseased hearts (cardiomyopathy, myocardial infarction, aortic stenosis, and related conditions). The count matrices were preprocessed and concatenated, and batch effects were corrected with the Harmony algorithm, yielding an integrated object containing 1,100,688 nuclei and 25,217 genes. Cell clusters were annotated as 13 cell types: cardiomyocytes, fibroblasts, endothelial cells, pericytes, myeloid cells, lymphocytes, smooth muscle cells, endocardial cells, neural cells, lymphatic endothelial cells, adipocytes, mast cells, and epicardial cells. Gene expression patterns were analyzed using high-dimensional weighted gene co-expression network analysis (hdWGCNA) and relative expression analysis to identify differential pathways. The integrated snRNA-seq atlas provides a valuable platform for identifying cell type-specific gene markers and pathways associated with major CVDs.
Keywords: CVD, Single-nucleus, RNA-seq, Data-mining, hdWGCNA

