Annotation Curricula to Implicitly Train Non-Expert Annotators

Authors

Abstract

Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming at the start, mentally taxing, and can introduce errors into the resulting annotations, especially in citizen science or crowdsourcing scenarios where domain expertise is not required and only annotation guidelines are provided. To alleviate these issues, this work proposes annotation curricula, a novel approach to implicitly train annotators. The goal is to gradually introduce annotators to the task by ordering the instances to be annotated according to a learning curriculum. To do so, this work formalizes annotation curricula for sentence- and paragraph-level annotation tasks, defines an ordering strategy, and identifies well-performing heuristics and interactively trained models on three existing English datasets. Finally, a user study is conducted with 40 voluntary participants who are asked to identify the most fitting misconception for English tweets about the Covid-19 pandemic. The results show that using a simple heuristic to order instances can already significantly reduce total annotation time while preserving high annotation quality. Annotation curricula thus provide a novel way to improve data collection. To facilitate future research, all code and the data from the user study, consisting of 2,400 annotations, are made available.
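The core idea of ordering instances by a simple heuristic can be sketched as follows. This is an illustrative example only, not the authors' exact method: it assumes text length as a stand-in difficulty heuristic and a hypothetical `curriculum_order` helper, so that annotators encounter shorter, presumably easier items first.

```python
# Illustrative sketch of an annotation curriculum (assumption: text length
# serves as a proxy for instance difficulty; the paper evaluates several
# heuristics and interactively trained models).

def curriculum_order(instances, difficulty=len):
    """Return annotation instances sorted from easiest to hardest."""
    return sorted(instances, key=difficulty)

tweets = [
    "Masks reduce oxygen levels in the blood.",
    "5G spreads Covid-19.",
    "A much longer tweet containing several clauses that an annotator may "
    "need noticeably more time to read, interpret, and match to a "
    "misconception.",
]

# Annotators would then see the instances in this easy-to-hard order.
for text in curriculum_order(tweets):
    print(text)
```

In practice, the difficulty function could be swapped for any heuristic (e.g., reading difficulty scores) or for predictions from a model trained interactively during annotation.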

Author Biographies

  • Ji-Ung Lee, Ubiquitous Knowledge Processing Lab, Technical University of Darmstadt
    Department of Computer Science, PhD Student
  • Jan-Christoph Klie, Ubiquitous Knowledge Processing Lab, Technical University of Darmstadt
    Department of Computer Science, PhD Student
  • Iryna Gurevych, Ubiquitous Knowledge Processing Lab, Technical University of Darmstadt
    Department of Computer Science, Professor

Published

2024-11-20