Word Sense Clustering and Clusterability
Abstract
Word sense disambiguation and the related field of word sense induction traditionally rely on a partition of usages into senses, produced by human annotators or by automatic clustering, but this is a much easier task for some words than for others. This paper follows work in linguistics which argues that word meanings fall on a continuum between clear-cut cases of ambiguity on the one hand and vague, undistinguished subcases of a general meaning on the other. Our aim is to determine where on this continuum a lemma falls by using the notion of clusterability from the machine learning literature. Clusterability measures aim to predict how much structure there is in the data and therefore how easy the data will be to cluster. We present two types of clusterability measure as a means of determining the partitionability of a word into senses: (1) existing measures from the machine learning literature, and (2) the congruence between different clusterings of the same data points. Both types of measure rely on hard clustering. We describe the usages to be clustered through paraphrase and translation annotations, and we compare the clusterability prediction for each lemma against a gold partitionability rating derived from graded judgments of usage similarity. Since overlapping clustering algorithms could also be applied, we additionally evaluate a baseline approach which simply measures the amount of overlap, that is, the number of instances shared between clusters. We show that when controlling for polysemy, our indicators of higher clusterability, particularly separability and variance ratio from the machine learning literature and the paired F-score cluster-congruence measure, tend to correlate with partitionability. The baseline measure of overlap gives only a very weak correlation, suggesting that, in contrast to our proposed metrics, one cannot rely on overlapping clustering to determine whether or not a lemma is readily partitionable.
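To make the two families of measures named above concrete, the following is a minimal illustrative sketch (not the paper's actual implementation): a paired F-score comparing two hard clusterings of the same usages, and a simple variance-ratio statistic (between-cluster over within-cluster variance, in the spirit of Calinski-Harabasz). The data representations here, usages as one-dimensional feature values and clusterings as dicts from instance id to cluster label, as well as the function names, are assumptions for illustration only.

```python
from itertools import combinations

def paired_f_score(clustering_a, clustering_b):
    """F-score over instance pairs placed in the same cluster.

    Each clustering is a dict mapping instance id -> cluster label.
    Identical clusterings score 1.0; disjoint pair sets score 0.0.
    """
    def same_cluster_pairs(clustering):
        # Unordered pairs of instances assigned to the same cluster.
        return {frozenset(p) for p in combinations(sorted(clustering), 2)
                if clustering[p[0]] == clustering[p[1]]}

    pa, pb = same_cluster_pairs(clustering_a), same_cluster_pairs(clustering_b)
    if not pa or not pb:
        return 0.0
    precision = len(pa & pb) / len(pa)
    recall = len(pa & pb) / len(pb)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def variance_ratio(points, labels):
    """Between-cluster variance divided by within-cluster variance
    for 1-D points; higher values suggest more cluster structure,
    i.e. data that is easier to partition."""
    overall_mean = sum(points) / len(points)
    clusters = {}
    for x, label in zip(points, labels):
        clusters.setdefault(label, []).append(x)
    between = sum(len(c) * (sum(c) / len(c) - overall_mean) ** 2
                  for c in clusters.values())
    within = sum((x - sum(c) / len(c)) ** 2
                 for c in clusters.values() for x in c)
    return between / within if within else float('inf')
```

Under this sketch, high congruence between independently obtained clusterings (paired F-score near 1) and a high variance ratio would both indicate a readily partitionable lemma, while low values would place it toward the vague end of the continuum.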