Computing Lexical Contrast

Authors

  • Saif Mohammad, National Research Council Canada
  • Bonnie Dorr, University of Maryland
  • Graeme Hirst, University of Toronto
  • Peter Turney, National Research Council Canada

Abstract

Knowing the degree of semantic contrast, or oppositeness, between words has widespread applications in natural language processing, including machine translation, information retrieval, and dialogue systems. Manually created lexicons focus on strict opposites, such as antonyms, and have limited coverage. On the other hand, only a few automatic approaches have been proposed, and none has been comprehensively evaluated. Although oppositeness may seem a simple and fairly intuitive idea at first glance, deeper analysis quickly reveals that it is a complex and heterogeneous phenomenon. In this paper we present a large crowdsourcing experiment to determine the amount of human agreement on the concept of oppositeness and its different kinds. In the process, we flesh out key features of different kinds of opposites and determine their relative prevalence. We then present an automatic, empirical measure of lexical contrast that combines corpus statistics with the structure of a published thesaurus. Using four different datasets, we evaluate our approach on two tasks: solving closest-to-opposite questions and distinguishing synonyms from antonyms. The results are analyzed across four parts of speech and five kinds of opposites. We show that the proposed measure of lexical contrast obtains high precision and large coverage, outperforming existing methods.

Author Biographies

  • Saif Mohammad, National Research Council Canada
    Research Officer, Institute for Information Technology
  • Bonnie Dorr, University of Maryland
    Professor, Department of Computer Science
  • Graeme Hirst, University of Toronto
    Professor, Department of Computer Science
  • Peter Turney, National Research Council Canada
    Research Officer, Institute for Information Technology

Published

2024-12-05

Section

Long paper