BioSamples is the EMBL-EBI central hub for metadata about samples whose raw data is hosted in reference archives such as EGA, ENA and EVA. The goal of this project is to create an API for discovering BioSamples records using the GA4GH metadata schema and for streaming sequencing data back from ENA via the htsget protocol. An additional objective is to provide Phenopackets export. Phenopackets is an open standard for sharing disease and phenotype information; a phenopacket can be encoded in JSON or YAML and stored as a PXF file. We therefore need to convert data from the GA4GH representation into a PXF file (phenopacket).
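
To make the export step concrete, here is a minimal Python sketch of turning one sample record into a phenopacket-style JSON document. It is an illustration under stated assumptions, not the project's implementation: the input keys (`accession`, `individual_id`, `description`), the accession value and the function name `to_phenopacket` are hypothetical, and only a small subset of Phenopackets fields (`id`, `subject`, `biosamples`) is filled in.

```python
import json


def to_phenopacket(sample: dict) -> dict:
    """Map a simplified sample record to a phenopacket-style dict.

    The input keys are hypothetical; a real GA4GH record has a richer schema.
    """
    return {
        "id": f"{sample['accession']}-phenopacket",
        "subject": {"id": sample["individual_id"]},
        "biosamples": [
            {
                "id": sample["accession"],
                "individualId": sample["individual_id"],
                "description": sample.get("description", ""),
            }
        ],
    }


# Hypothetical sample record used only for illustration.
example_sample = {
    "accession": "SAMEA0000000",
    "individual_id": "individual-1",
    "description": "Example blood sample",
}

packet = to_phenopacket(example_sample)

# Serialise to JSON, one of the two encodings mentioned above.
packet_json = json.dumps(packet, indent=2)
print(packet_json)
```

A YAML encoding could be produced from the same dictionary with a YAML library; JSON is used here because it needs only the standard library.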


Dilshat Salikhov


  • Luca Cherubin
  • Melanie Courtot