A statistical perspective on the Last Universal Common Ancestor (LUCA)

Pavle Goldstein*, Tonka Rimac and Miljenko Huzak

Department of Mathematics, Science Faculty, University of Zagreb

payo [at] math.hr

Abstract

We introduce an analysis-of-variance statistical framework for analyzing multiple sequence alignments of protein families shared between two groups of organisms. Applied to present-day bacteria and archaea, the method yields a ranking score for each protein family, estimating the likelihood that the family was present in the LUCA population. By varying the bacterial and archaeal subsamples, we assess the robustness of these inferences and gain information on metabolic pathways potentially present in LUCA. Finally, our results provide a perspective on the evolutionary processes associated with the archaeal–bacterial divergence.

Keywords: LUCA, statistics, Pfam