In essence, the project aims to build a generic phenopacket scraping tool that provides a command-line interface and a REST API. The project has two parts. The first is the phenopacket scraper itself, exposed through a command-line interface and a REST API: it takes input as a URL or a stream, scrapes the required content from the web page, analyzes it to extract phenotypes and genes, and returns a JSON-encoded phenopacket. The second part is a way of deploying the phenopacket scraper: a Django portal through which users can generate phenopackets, then search, analyze, and share the results to study and collaborate with one another.
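
The scraping flow described above (page content in, phenotypes and genes extracted, JSON phenopacket out) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the names TextExtractor, html_to_text, and build_phenopacket, the two small lookup tables, and the simplified output layout are all hypothetical, and a real tool would use an ontology annotator rather than exact word matching. Fetching the page from a URL is left out so the sketch works on any HTML string or stream contents.

```python
import json
import re
from html.parser import HTMLParser

# Hypothetical lookup tables for illustration only; a real scraper would
# query an ontology annotation service instead of matching fixed words.
PHENOTYPE_TERMS = {"seizure": "HP:0001250", "hypotonia": "HP:0001252"}
GENE_SYMBOLS = {"SCN1A", "MECP2"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def build_phenopacket(html, source_id):
    """Build a simplified, phenopacket-like dict (not the official schema)."""
    text = html_to_text(html)
    words = set(re.findall(r"[A-Za-z0-9]+", text))
    lowered = {w.lower() for w in words}
    features = [
        {"type": {"id": hpo_id, "label": label}}
        for label, hpo_id in sorted(PHENOTYPE_TERMS.items())
        if label in lowered
    ]
    # Gene symbols are matched case-sensitively against whole words.
    genes = sorted(GENE_SYMBOLS & words)
    return {"id": source_id, "phenotypic_features": features, "genes": genes}


if __name__ == "__main__":
    sample = (
        "<html><body><p>Patients with SCN1A variants had seizure onset "
        "early.</p><script>var hypotonia;</script></body></html>"
    )
    print(json.dumps(build_phenopacket(sample, "example-1"), indent=2))
```

In a design like this, the command-line interface and the REST API could both be thin wrappers around one function such as build_phenopacket, so that the Django portal and the command line share the same extraction logic.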





  • Dan Keith