Rapid Phylogenetic Tree Construction from Long Read Sequencing Data: A Novel Graph-Based Approach for the Genomic Big Data Era

Harisankar Sadasivan; Luke Ross; Chih-Yu Chang; Kushantha Upulanga Attanayake

Authors

Harisankar Sadasivan, Computer Science and Engineering, University of Michigan, Ann Arbor, MI 48109, USA
Luke Ross, Computer Science and Engineering, University of Michigan, Ann Arbor, MI 48109, USA
Chih-Yu Chang, Computer Science and Engineering, University of Michigan, Ann Arbor, MI 48109, USA
Kushantha Upulanga Attanayake, Computer Science and Engineering, University of Michigan, Ann Arbor, MI 48109, USA

Abstract

Genomics is the largest producer of big data and is expected to generate 40 EB of data every year. This rapid growth necessitates efficient methods for analysis and classification. We present a novel, automated pipeline for fast phylogenetic tree construction from long-read sequencing data. Our approach reduces computational cost by using compact repeat graphs instead of full genome assemblies. We integrate graph embedding techniques that combine structural and content-based approaches to capture genomic relationships efficiently. Demonstrating our method on 20 bacterial genomes across 5 classes, we achieve a cophenetic correlation of 0.53 with the ground truth phylogenetic tree. Our pipeline reconstructs meaningful evolutionary relationships directly from sequencing reads, without requiring complete assemblies or time-consuming alignments. This work is a step towards rapid pathogen classification during outbreaks and offers a scalable solution for analyzing the expanding universe of sequenced organisms. By bridging graph theory, machine learning, and genomics, our method paves the way for more efficient phylogenetic analysis in the era of big data biology.
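
The evaluation metric named in the abstract, cophenetic correlation, can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each genome has already been reduced to an embedding vector, builds a tree with average-linkage (UPGMA) clustering as a stand-in for whatever tree-building step the paper uses, and takes the Pearson correlation between the inferred and reference cophenetic distances. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma_cophenetic(points):
    """Cluster embeddings with average linkage (an assumed stand-in method).

    Returns {(i, j): height}, where height is the linkage distance at which
    leaves i and j (i < j) are first merged into the same cluster.
    """
    n = len(points)
    leaf_dist = {pair: euclidean(points[pair[0]], points[pair[1]])
                 for pair in combinations(range(n), 2)}
    clusters = {i: [i] for i in range(n)}  # cluster id -> member leaves

    def average_linkage(a, b):
        # Mean leaf-to-leaf distance between the two clusters.
        total = sum(leaf_dist[(min(i, j), max(i, j))]
                    for i in clusters[a] for j in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    cophenetic = {}
    next_id = n
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: average_linkage(*pair))
        height = average_linkage(a, b)
        for i in clusters[a]:
            for j in clusters[b]:
                cophenetic[(min(i, j), max(i, j))] = height
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return cophenetic


def cophenetic_correlation(inferred, reference):
    """Pearson correlation between two sets of cophenetic distances.

    Both arguments map the same leaf pairs (i, j) to a distance.
    """
    keys = sorted(inferred)
    xs = [inferred[k] for k in keys]
    ys = [reference[k] for k in keys]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy embeddings for four hypothetical genomes in two classes.
    embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    # Hypothetical reference tree: 1.0 within a class, 2.0 across classes.
    reference = {(0, 1): 1.0, (2, 3): 1.0,
                 (0, 2): 2.0, (0, 3): 2.0, (1, 2): 2.0, (1, 3): 2.0}
    inferred = upgma_cophenetic(embeddings)
    print(cophenetic_correlation(inferred, reference))
```

A value near 1 means the inferred tree preserves the pairwise structure of the reference tree, and a value near 0 means it does not. The clustering here scans all cluster pairs on every merge for readability; at scale, a library implementation of hierarchical clustering would be the more practical choice.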
