The Data Retriever automates finding, downloading, and cleaning up publicly available data, then stores the data in a local database or as .csv files. Simply put, it is a package manager for data. This lets data analysts spend most of their time analysing data rather than cleaning it up or managing it.

The Data Retriever is written in Python. It currently has a command line interface (CLI) and can also be used through an associated R package that wraps this CLI. Adding a native Python interface and a Julia package that wraps the CLI would make the Data Retriever's tools available in the three major open-source languages for data-oriented computing.
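
To illustrate what "wrapping the CLI" can look like, here is a minimal Python sketch that builds a command line and runs it in a subprocess. It is not the project's actual implementation. The function names `build_install_command` and `install` are hypothetical, and the command form `retriever install <engine> <dataset>` is an assumption about the CLI that should be checked against the installed version.

```python
import shutil
import subprocess


def build_install_command(dataset, engine="csv"):
    """Return the argument list for installing `dataset` with `engine`.

    Assumes the CLI form `retriever install <engine> <dataset>`.
    """
    if not dataset:
        raise ValueError("dataset name must be non-empty")
    return ["retriever", "install", engine, dataset]


def install(dataset, engine="csv"):
    """Run the CLI; raise if it is missing from PATH or exits non-zero."""
    if shutil.which("retriever") is None:
        raise RuntimeError("the 'retriever' executable was not found on PATH")
    subprocess.run(build_install_command(dataset, engine), check=True)
```

Keeping command construction separate from execution means the wrapper can be tested without the CLI installed. A wrapper in R or Julia could follow the same pattern with that language's process-spawning facilities.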

Organization

Student

Shivam Negi

Mentors

  • Henry Senyondo
  • Ethan White

2017