Crowdsourcing for Language Resources and Evaluation
Crowdsourcing is an efficient approach to knowledge acquisition and data annotation that enables gathering and evaluating large-scale linguistic datasets. In this lecture we will focus on the practical use of human-assisted computation for language resource construction and evaluation. We will analyze three established approaches to crowdsourcing in NLP. First, we will consider the case study of Wikipedia and Wiktionary, which coordinate community effort through quality-control mechanisms such as content assessment and edit patrolling. Second, we will dive deep into microtask-based crowdsourcing, using reCAPTCHA and Mechanical Turk as examples. We will discuss issues of task design and decomposition and then carefully describe standard approaches to inter-annotator agreement evaluation (Krippendorff's α) and answer aggregation (Majority Vote and Dawid-Skene). Third, we will study games with a purpose, including the ESP Game, the Infection Game for BabelNet, and the gamification of OpenCorpora. Finally, we will provide recommendations for ensuring high-quality crowdsourced annotation and point to useful datasets for further study.
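
To make the agreement part concrete, below is a minimal Python sketch of Krippendorff's α for nominal data with missing answers, computed via a coincidence matrix as in the standard formulation. The function name and the toy `data` matrix are illustrative assumptions, not materials from the lecture.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(annotations):
    """Krippendorff's alpha for nominal data.

    annotations: list of lists, one row per item, one column per
    annotator; use None where an annotator skipped the item.
    """
    # Coincidence counts: each ordered pair of values within an item
    # contributes 1 / (n_item - 1).
    coincidences = Counter()
    for row in annotations:
        values = [v for v in row if v is not None]
        if len(values) < 2:
            continue  # items with fewer than two labels are not pairable
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (len(values) - 1)

    n = sum(coincidences.values())  # total number of pairable values
    marginals = Counter()
    for (a, _), w in coincidences.items():
        marginals[a] += w

    # alpha = 1 - D_o / D_e; the common 1/n factor cancels out.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(marginals[a] * marginals[b]
                   for a in marginals for b in marginals if a != b) / (n - 1)
    return 1.0 - observed / expected

# Two annotators agree on most items; annotator 2 skips the last item.
data = [["pos", "pos"], ["neg", "neg"], ["pos", "neg"], ["neg", None]]
print(krippendorff_alpha_nominal(data))
```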
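
For answer aggregation, the Majority Vote baseline simply picks the most frequent label per item and treats all workers as equally reliable. A tiny illustrative sketch (item IDs and labels are made up):

```python
from collections import Counter

def majority_vote(answers):
    """Pick the most frequent label per item; ties break arbitrarily.

    answers: dict mapping item id -> list of labels from different workers.
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in answers.items()}

print(majority_vote({"q1": ["cat", "cat", "dog"], "q2": ["dog", "cat", "dog"]}))
```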
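
Dawid-Skene improves on majority voting by modeling each worker with a confusion matrix and re-estimating the labels via EM: the M-step estimates class priors and per-worker confusion matrices from the current posteriors, and the E-step recomputes the posteriors from those estimates. The compact NumPy sketch below follows the classic updates; the dense one-hot `answers` tensor layout and the toy data are assumptions made for this example.

```python
import numpy as np

def dawid_skene(answers, n_iter=50):
    """EM for the Dawid-Skene model.

    answers: float array of shape (items, workers, labels), one-hot per
    (item, worker) pair; an all-zero row means the worker skipped the item.
    Assumes every item received at least one answer.
    Returns posterior label probabilities of shape (items, labels).
    """
    # Initialize posteriors with a soft majority vote.
    posteriors = answers.sum(axis=1).astype(float)
    posteriors /= posteriors.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class priors and per-worker confusion matrices,
        # confusion[w, true, observed].
        priors = posteriors.mean(axis=0)
        confusion = np.einsum('it,iwk->wtk', posteriors, answers)
        confusion /= confusion.sum(axis=2, keepdims=True) + 1e-12

        # E-step: posterior over true labels under the current model.
        log_post = np.log(priors + 1e-12) + np.einsum(
            'iwk,wtk->it', answers, np.log(confusion + 1e-12))
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        posteriors = np.exp(log_post)
        posteriors /= posteriors.sum(axis=1, keepdims=True)
    return posteriors

# Three workers label two items with classes {0, 1}; worker 2 disagrees.
raw = [[0, 0, 1], [1, 1, 0]]  # raw[i][w] = label given by worker w to item i
answers = np.zeros((2, 3, 2))
for i, row in enumerate(raw):
    for w, label in enumerate(row):
        answers[i, w, label] = 1.0
print(dawid_skene(answers).round(3))
```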