A free, open source power tool for working with messy data and improving it
OpenRefine is an established data cleaning tool, popular in a broad range of communities: journalism, digital humanities, libraries, linked open data, and many more. Its design has influenced many other tools, not counting forks and rebranded versions.
Scope of the tool
The tool is used to transform small to medium-scale datasets: users interactively build workflows that mix automated transforms and human review. The transformations are reproducible: they can be replayed on datasets in the same format, with updated data. The focus is on usability through a web UI, reducing the need to learn a programming or query language.
OpenRefine is built in Java (server-side) and uses a web UI (jQuery). It is easy to work on isolated parts of the code without being familiar with the entire architecture. We try to maintain good quality standards by testing all our changes, but remain flexible and unopinionated.
OpenRefine 2020 Projects
Enhancements for the Wikidata extension
Add OAuth support for the Wikidata extension (#1612). Extend the Wikidata extension to support arbitrary Wikibase instances (#1640).
Replace row pagination by infinite scrolling
OpenRefine is a powerful Open Source tool that provides its users the ability to work with and clean messy data. It currently uses a pagination...