Create pipeline/interface to prioritize variants for OncoKB curation
- Mentors: Hongxin, jfkonecn
- Organization: cBioPortal for Cancer Genomics
- Technologies: python, mysql, java, react, typescript, aws
- Topics: Full-Stack, Pipeline
The cBioPortal organization is dedicated to helping clinicians and researchers analyze complex cancer genomics data. cBioPortal includes OncoKB annotations in its views to show the biological effect and clinical implications of each variant. Although a standard API for annotating alterations is available, users without a computational background still find it challenging to annotate their variants.

This project addresses that gap in two parts. First, it develops a pipeline that fetches the genomic variants in the MSK-IMPACT clinical cohort that have not yet been curated by OncoKB. The pipeline uses distributed parallel computing to speed up the identification of target variants, and it keeps the data consistent through regular monitoring and timely updates. Second, it provides a web interface where users can view and annotate these uncurated variants directly. The interface supports searching and sorting, so users can find specific variants and prioritize them for OncoKB curation.

The outcome is an automated pipeline, powered by Airflow, for the continuous retrieval and verification of uncurated variants, together with a user-friendly interface that lets users annotate variants directly on the webpage. The project aims to simplify variant annotation, make it accessible to a broader audience, and further advance precision oncology.
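As a rough illustration of the pipeline's core step (find cohort variants that are not yet curated, then rank them), the sketch below checks variants in parallel and sorts the uncurated ones by how often they were observed. It is not the project's implementation: it swaps the distributed setup for a local thread pool, and the names `find_uncurated`, `is_curated`, and the `(gene, alteration)` tuple shape are all hypothetical. In practice `is_curated` would wrap whatever curation lookup is available; here it is just a callable supplied by the caller.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

# Hypothetical variant shape: (gene symbol, alteration).
Variant = Tuple[str, str]


def find_uncurated(
    observed: Iterable[Variant],
    is_curated: Callable[[Variant], bool],
    max_workers: int = 8,
) -> List[Tuple[Variant, int]]:
    """Return (variant, count) pairs for uncurated variants, most frequent first."""
    # Count how many times each distinct variant appears in the cohort.
    counts = Counter(observed)
    unique = list(counts)

    # Check curation status concurrently; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        flags = list(pool.map(is_curated, unique))

    uncurated = [(v, counts[v]) for v, curated in zip(unique, flags) if not curated]

    # Prioritize by descending frequency, breaking ties alphabetically.
    uncurated.sort(key=lambda item: (-item[1], item[0]))
    return uncurated


# Toy placeholder data, not real cohort variants.
cohort = [
    ("GENE_A", "alt1"),
    ("GENE_B", "alt2"),
    ("GENE_B", "alt2"),
    ("GENE_C", "alt3"),
]
already_curated = {("GENE_A", "alt1")}

ranked = find_uncurated(cohort, lambda v: v in already_curated)
for variant, count in ranked:
    print(variant, count)
```

Sorting by observation count is only one possible prioritization rule, assumed here for the example; the source does not specify how variants should be ranked.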