A Large-scale Pseudoword-based Evaluation Framework for State-of-the-Art Word Sense Disambiguation
Abstract
The evaluation of several tasks in lexical semantics is often limited by the lack of large amounts of knowledge, even for testing purposes. Word Sense Disambiguation (WSD) is a case in point, as hand-labeled datasets are particularly hard and time-consuming to create. Consequently, evaluations are performed on a small scale, which does not allow for an in-depth analysis of the factors that determine the systems' performance. In this paper we address this issue by means of a realistic simulation of large-scale evaluation for the WSD task. We do this by providing two main contributions: first, we put forward two novel approaches to the wide-coverage generation of semantically-aware pseudowords, i.e., artificial words capable of modeling real polysemous words; second, we leverage the most suitable type of pseudoword to create large pseudosense-annotated corpora, which enable a large-scale experimental framework for the comparison of state-of-the-art supervised and knowledge-based algorithms.
Thanks to our framework, we study the impact of supervision and knowledge on the respective disambiguation paradigms and perform an in-depth analysis of the factors and conditions which determine their performance.
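To make the underlying notion concrete, the following is a minimal sketch of the classic pseudoword construction (in the spirit of Gale et al.'s early work), not of the semantically-aware generation methods proposed in this paper: a set of words assumed to be monosemous is fused into one artificial ambiguous word, and each occurrence keeps its source word as a "pseudosense" label, yielding an automatically annotated corpus. All names and the toy sentences below are illustrative assumptions.

```python
# Sketch of pseudoword-based corpus creation: constituent words are replaced by
# an artificial ambiguous word; the original word serves as the gold pseudosense.
from dataclasses import dataclass


@dataclass
class PseudosenseInstance:
    tokens: list      # sentence with the pseudoword substituted in
    position: int     # index of the pseudoword in the sentence
    pseudosense: str  # gold label = the original constituent word


def build_pseudoword_corpus(sentences, constituents):
    """Replace each occurrence of a constituent word with the pseudoword and
    record the original word as the pseudosense annotation."""
    pseudoword = "*".join(constituents)  # e.g. "banana*door"
    corpus = []
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok in constituents:
                new_sent = sent[:i] + [pseudoword] + sent[i + 1:]
                corpus.append(PseudosenseInstance(new_sent, i, tok))
    return corpus


# Hypothetical usage: two monosemous constituents model a two-sense target word.
sents = [["she", "ate", "a", "banana"], ["he", "closed", "the", "door"]]
for inst in build_pseudoword_corpus(sents, ["banana", "door"]):
    print(inst.pseudosense, "->", " ".join(inst.tokens))
```

A disambiguation system can then be trained and evaluated on such instances exactly as on a sense-annotated corpus, with the pseudosenses playing the role of word senses.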