Parallel Graph Traversal for Variation Graphs
- Mentors: George Githinji, Erik Garrison, Pjotr Prins
- Organization: Open Bioinformatics Foundation (OBF)
Traditionally, bioinformatics has focused on the study of single genomes, each representing one individual of a given species. More recently, the focus has shifted to pangenomes, which encode genomic variation across multiple individuals of a species. Pangenome graphs (also called Variation Graphs) are used to store and represent pangenomes. Multiple open-source tools have been developed for building and analyzing Variation Graphs, and VG (https://github.com/vgteam/vg) is the most popular one. VG provides a framework for working with Variation Graphs and can currently handle moderately sized graphs, but it has trouble scaling to large datasets. I want to improve its performance by building a highly parallel graph explorer, written in Rust, that makes use of the GPU architecture. GPUs are well known for their ability to do millions of operations at the same time; however, achieving this requires code to be written in a specific, parallel-oriented way. I believe that, with help from the mentors, this is achievable, and that VG will be able to explore Variation Graphs much faster.
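To illustrate what "parallel-oriented" can mean for graph traversal, here is a minimal sketch of a level-synchronous (frontier-based) breadth-first search, a formulation often used as a starting point for GPU traversal. It is not taken from VG and is not the project's implementation: it runs on CPU threads from the Rust standard library as a stand-in for GPU threads, it treats the graph as a plain directed graph (real Variation Graphs are bidirected and carry sequences and paths), and the names `CsrGraph` and `parallel_bfs` are hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Hypothetical toy graph in compressed sparse row (CSR) form.
/// The successors of node `u` are `targets[offsets[u]..offsets[u + 1]]`.
pub struct CsrGraph {
    pub offsets: Vec<usize>, // length = number of nodes + 1
    pub targets: Vec<u32>,
}

/// Marker for nodes that have not been reached yet.
pub const UNVISITED: u32 = u32::MAX;

/// Level-synchronous BFS: every node of the current frontier is expanded
/// in parallel, then all workers synchronize before the next level starts.
/// Returns the BFS depth of each node (`UNVISITED` if unreachable).
pub fn parallel_bfs(graph: &CsrGraph, source: u32, workers: usize) -> Vec<u32> {
    let node_count = graph.offsets.len() - 1;
    let depth: Vec<AtomicU32> = (0..node_count)
        .map(|_| AtomicU32::new(UNVISITED))
        .collect();
    depth[source as usize].store(0, Ordering::Relaxed);

    let depth_ref: &[AtomicU32] = &depth;
    let mut frontier = vec![source];
    let mut level = 0u32;

    while !frontier.is_empty() {
        // Split the frontier into roughly equal chunks, one per worker.
        let workers = workers.max(1);
        let chunk_len = ((frontier.len() + workers - 1) / workers).max(1);

        let next: Vec<u32> = thread::scope(|s| {
            // Spawn all workers first, then join them (the level barrier).
            let handles: Vec<_> = frontier
                .chunks(chunk_len)
                .map(|chunk| {
                    s.spawn(move || {
                        let mut local = Vec::new();
                        for &u in chunk {
                            let lo = graph.offsets[u as usize];
                            let hi = graph.offsets[u as usize + 1];
                            for &v in &graph.targets[lo..hi] {
                                // Atomically claim `v`; exactly one worker
                                // succeeds, so `v` is queued only once.
                                let claimed = depth_ref[v as usize]
                                    .compare_exchange(
                                        UNVISITED,
                                        level + 1,
                                        Ordering::Relaxed,
                                        Ordering::Relaxed,
                                    )
                                    .is_ok();
                                if claimed {
                                    local.push(v);
                                }
                            }
                        }
                        local
                    })
                })
                .collect();
            handles
                .into_iter()
                .flat_map(|h| h.join().unwrap())
                .collect()
        });

        frontier = next;
        level += 1;
    }

    depth.into_iter().map(|d| d.into_inner()).collect()
}
```

Calling `parallel_bfs(&graph, 0, 4)` on a small `CsrGraph` returns one depth per node. The point of the sketch is the shape of the computation: each node in the frontier is handled independently, and the only shared state is updated through an atomic compare-and-swap. On a GPU, each frontier node (or each edge) would typically map to one thread instead of one chunk per CPU worker.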