Contributor: MatthewBaggins

Dawid-Skene algorithms for Turing.jl

Mentors: Kai, Hong Ge
Organization: The Julia Language
Technologies: julia, Turing
Topics: machine learning, Unsupervised Learning, Statistical Inference

The Dawid-Skene algorithms are a family of methods for aggregating crowdsourced annotations in order to infer the true labels or categories of the data. They are especially useful when ground-truth annotations are unavailable for a large dataset and we must instead rely on fallible human judgment. The original Dawid-Skene (DS) algorithm dates back to 1979. More recently, a faster (eightfold speedup) but less accurate version, called Fast Dawid-Skene (FDS), has been developed. A hybrid variant (HDS) sits between DS and FDS in both accuracy and speed. Because the right trade-off between accuracy and speed depends on the situation, it makes sense to implement all three versions (DS, FDS, and HDS) so that users can pick the one best suited to their needs. That is the purpose of this project. Additionally, the algorithms will be extended to handle multi-label problems.
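
To make the aggregation idea concrete, here is a minimal sketch of the classic DS algorithm as an expectation-maximization loop, written in plain Julia. It is an illustration only, not the project's implementation: it does not use Turing.jl, it does not cover FDS or HDS, and the names `ds_em`, `labels`, `K`, and `iters` are hypothetical. It assumes single-label data in which every item has at least one annotation, with `0` marking a missing annotation.

```julia
# Illustrative EM sketch of the classic DS model (hypothetical helper, not project code).
# labels[i, j] is the class (1..K) that annotator j gave to item i, or 0 if missing.
# Assumption: every item has at least one annotation.
function ds_em(labels::AbstractMatrix{<:Integer}, K::Integer; iters::Integer = 50)
    n_items, n_annot = size(labels)

    # Initialise the posteriors T[i, k] with per-item vote fractions (majority-vote start).
    T = zeros(n_items, K)
    for i in 1:n_items, j in 1:n_annot
        l = labels[i, j]
        l > 0 && (T[i, l] += 1.0)
    end
    T ./= sum(T, dims = 2)

    for _ in 1:iters
        # M-step: class priors and one confusion matrix per annotator.
        prior = vec(sum(T, dims = 1)) ./ n_items
        # confusion[j, k, l] estimates P(annotator j reports l | true class k);
        # the small constant is smoothing so that no probability is exactly zero.
        confusion = fill(1e-6, n_annot, K, K)
        for i in 1:n_items, j in 1:n_annot
            l = labels[i, j]
            l == 0 && continue
            for k in 1:K
                confusion[j, k, l] += T[i, k]
            end
        end
        confusion ./= sum(confusion, dims = 3)

        # E-step: posterior over the true class of each item, computed in log space.
        for i in 1:n_items
            logp = log.(prior .+ 1e-12)
            for j in 1:n_annot
                l = labels[i, j]
                l == 0 && continue
                for k in 1:K
                    logp[k] += log(confusion[j, k, l])
                end
            end
            p = exp.(logp .- maximum(logp))
            T[i, :] = p ./ sum(p)
        end
    end

    # Hard labels (argmax of each posterior row) together with the posteriors.
    return [argmax(T[i, :]) for i in 1:n_items], T
end

# Toy data: 6 items, 3 annotators, 2 classes; annotator 3 disagrees on items 2 and 4.
labels = [1 1 1;
          1 1 2;
          2 2 2;
          2 2 1;
          1 1 1;
          2 2 2]
hard, post = ds_em(labels, 2)
println(hard)
```

Each row of `post` is the estimated probability of each class for one item, and `hard` picks the most probable class. The per-annotator confusion matrices are what allow the method to down-weight unreliable annotators instead of counting every vote equally.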