Contributor
Tanmay Singal

Reference registry + contig alias search


Mentors
Andres Silva, Jose Miguel Mut Lopez
Organization
Global Alliance for Genomics and Health

Reference sequences are files used as a baseline for describing the variants present in analyzed sequences. They play a central role in defining the body of knowledge on which our understanding of biological systems, phenotypes and variation is based. Reference sequence files often use different naming schemes to refer to the same sequence, so there is a strong need to cross-reference chromosomes/contigs across nomenclatures. My project focuses on creating a centralized database and an alias resolution service that can cross-reference accessions easily and reliably. A web service that allows users to access these services from any client is also required. It needs a mechanism for ingesting new aliases from a remote data source, either manually or periodically.
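
To make the idea of alias resolution concrete, here is a minimal sketch in Python. It is not the project's implementation: the class name `AliasRegistry`, its methods `ingest` and `resolve`, and the in-memory storage are all assumptions made for illustration, standing in for the centralized database and the ingestion mechanism described above.

```python
from collections import defaultdict


class AliasRegistry:
    """Hypothetical in-memory stand-in for the centralized alias database."""

    def __init__(self):
        self._group_of = {}                # name -> group id
        self._members = defaultdict(set)   # group id -> all names in the group
        self._next_id = 0

    def ingest(self, names):
        """Record that every name in `names` refers to the same sequence."""
        names = set(names)
        # Groups that any of the incoming names already belong to.
        existing = {self._group_of[n] for n in names if n in self._group_of}
        if existing:
            target = min(existing)
        else:
            target = self._next_id
            self._next_id += 1
        # Merge all other matching groups into the target group.
        for gid in existing - {target}:
            names |= self._members.pop(gid)
        for n in names:
            self._group_of[n] = target
        self._members[target] |= names

    def resolve(self, name):
        """Return all known aliases of `name`, sorted; empty list if unknown."""
        gid = self._group_of.get(name)
        return sorted(self._members[gid]) if gid is not None else []
```

In this sketch, ingesting `["chr1", "1"]` and then `["1", "NC_000001.11"]` places all three names in one group, so `resolve("chr1")` returns every alias recorded for that contig. A real service would presumably persist the groups in a database and expose `resolve` over HTTP, with `ingest` driven by a manual or scheduled import job.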