The Data Retriever automates finding, downloading, and cleaning up publicly available data, and then stores the data in a local database or as .csv files. Simply put, it is a package manager for data. This lets data analysts spend most of their time analysing data rather than cleaning or managing it.

The Data Retriever is written in Python. It currently has a command line interface (CLI) and can also be used through an associated R package that wraps this CLI. Adding a native Python interface and a Julia package that wraps the CLI would make the Data Retriever's tools available in the three major open source languages for data-oriented computing.
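To illustrate what "wrapping the CLI" means, here is a minimal Python sketch of a wrapper that builds a command line and runs it as a subprocess. It is not the project's actual interface: the helper names `build_command` and `run_cli` are hypothetical, and the executable name `retriever` and the `install csv iris` arguments are assumptions about how the CLI is invoked.

```python
import shutil
import subprocess


def build_command(subcommand, *args, executable="retriever"):
    """Return the argument list for one CLI invocation.

    The executable name 'retriever' is an assumption for illustration.
    """
    return [executable, subcommand, *args]


def run_cli(subcommand, *args, executable="retriever"):
    """Run the CLI as a subprocess and return its standard output as text."""
    # Fail early with a clear error if the executable is not on PATH.
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"'{executable}' was not found on PATH")
    result = subprocess.run(
        build_command(subcommand, *args, executable=executable),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Under these assumptions, `run_cli("install", "csv", "iris")` would execute `retriever install csv iris` and return whatever the tool prints. A wrapper in another language can follow the same pattern of building an argument list and delegating to the CLI, whereas a native Python interface would call the underlying Python code directly instead of starting a subprocess.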



Shivam Negi


  • Henry Senyondo
  • Ethan White