Aleksandra Denisova¹, Layal Shaheen², Dmitrii Kharitonov³, Anna Ilinskaya⁴, Saleem Mansour², Iskandar Hweijeh², Grigorii Travin⁵, Valery Ilinsky⁴ and Alexander Rakitko⁶*
¹ Genotek Ltd., Moscow, Russia; National Research University Higher School of Economics, Russian Federation
² Genotek Ltd., Moscow, Russia; Moscow Center for Advanced Studies, Moscow, Russia
³ Genotek Ltd., Moscow, Russia; Genotek Center: AI in Personalized Medicine, ITMO University, Saint-Petersburg, Russia
⁴ Eligens SIA, Riga, Latvia
⁵ Genotek Ltd., Moscow, Russia
⁶ Genotek Ltd., Moscow, Russia; National Research University Higher School of Economics, Russian Federation; Genotek Center: AI in Personalized Medicine, ITMO University, Saint-Petersburg, Russia
rakitko [at] genotek.ru
Abstract
Androgenetic alopecia and related hair loss disorders are highly heritable conditions, yet their genetic architecture remains insufficiently characterized in Eastern European populations. Here, we performed a large-scale genomic study of alopecia in one of the largest Russian consumer genomics cohorts using an integrated phenotyping framework combining ICD-10 diagnoses, antiandrogen therapy records, and AI-assisted assessment of baldness severity from participant photographs.
To minimize population stratification, controls were selected using ancestry-aware propensity score matching. We analyzed both individual phenotypes and a composite phenotype integrating clinical, therapeutic, and image-derived features.
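As a rough illustration of the matching step, the sketch below performs greedy nearest-neighbour matching of controls to cases on a precomputed propensity score. It is not the procedure used in the study: the function name `match_controls`, the `caliper` value, and the toy scores are all hypothetical, and the scores are assumed to have been estimated beforehand (for example from ancestry principal components). The case-to-control ratios reported in the abstract are close to 1:10, which would be consistent with `k=10`, but the actual matching ratio and algorithm are not stated here.

```python
def match_controls(case_scores, control_scores, k=10, caliper=0.05):
    """Greedy k:1 nearest-neighbour matching without replacement.

    case_scores / control_scores: dicts mapping sample id -> propensity score.
    Returns a dict mapping each case id to the list of matched control ids.
    All names and defaults here are illustrative assumptions.
    """
    available = dict(control_scores)  # controls not yet assigned to a case
    matches = {}
    for case_id, score in case_scores.items():
        # Rank the remaining controls by distance to this case's score.
        nearest = sorted(available, key=lambda c: abs(available[c] - score))
        # Keep at most k controls, and only those within the caliper.
        chosen = [c for c in nearest[:k] if abs(available[c] - score) <= caliper]
        for c in chosen:
            del available[c]  # matching without replacement
        matches[case_id] = chosen
    return matches


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    cases = {"case_a": 0.30, "case_b": 0.70}
    controls = {"c1": 0.29, "c2": 0.32, "c3": 0.50,
                "c4": 0.69, "c5": 0.74, "c6": 0.95}
    print(match_controls(cases, controls, k=2))
```

Greedy matching depends on the order in which cases are processed; optimal matching would avoid this at a higher computational cost.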
Genome-wide association analysis of androgenetic alopecia (2,436 cases; 24,304 controls) identified significant loci on chromosomes 1, 2, 5, 7, 18, 20, and X, including established risk regions for male pattern baldness. The strongest association was observed at 20p11.22 (p = 1.48×10⁻⁸⁶). In alopecia areata (517 cases; 5,177 controls), six significant loci were detected, including the immune-associated 6q25.1 region near RAET1M and ULBP3 (rs12205199; p = 5.15×10⁻¹⁵). Pathway enrichment analysis highlighted genes involved in T-cell activation and immune regulation, including CTLA4, ICOS, and IL2RA, supporting the autoimmune basis of the disease. Analysis of AI-derived severe baldness phenotypes identified one association, at the 2q14.3 locus (p = 2.62×10⁻⁸). Analysis of the androgen receptor CAG repeat variant (rs746853821) showed that longer alleles (>23 repeats) were associated with reduced alopecia risk (OR = 0.69, 95% CI 0.56–0.86). Polygenic risk models showed strong predictive performance for androgenetic alopecia (AUC up to 0.705), while the best-performing external PRS (PGS003558) reached an AUC of 0.725 in our cohort. Applying this external score to the severe baldness phenotype (stage VII) gave a slightly higher AUC of 0.728.
Our study integrates diagnostic records and AI-based image phenotyping, enabling scalable discovery of alopecia genetics in the understudied Eastern European population.
Keywords: alopecia, GWAS, PRS, AI-based phenotyping
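For readers who want to relate the reported odds ratio to its confidence interval, the following minimal sketch back-calculates the standard error of the log odds ratio and the corresponding z-statistic from the published OR and 95% CI for the CAG repeat variant. It assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the function name `wald_summary` is hypothetical, and the results are approximate because the inputs are rounded.

```python
from math import log
from statistics import NormalDist


def wald_summary(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover SE(log OR), z, and a two-sided p-value from an OR and its CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)       # about 1.96 for 95%
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)   # CI width on log scale
    z = log(odds_ratio) / se
    p_two_sided = 2 * std_normal.cdf(-abs(z))
    return se, z, p_two_sided


if __name__ == "__main__":
    # Values taken from the abstract: OR = 0.69, 95% CI 0.56 to 0.86.
    se, z, p = wald_summary(0.69, 0.56, 0.86)
    print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Under these assumptions the reported interval is nearly symmetric around log(0.69), and the implied z-statistic is approximately -3.4.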

