Modeling regular polysemy: A study of the semantic classification of Catalan adjectives.
Abstract
We present a study of the automatic acquisition of semantic classes for Catalan adjectives from distributional and morphological information, with particular emphasis on the identification of polysemous adjectives. The aim is to distinguish and characterize broad classes, such as qualitative (gran ‘big’) and relational (pulmonar ‘pulmonary’) adjectives, as well as to identify polysemous adjectives such as econòmic (‘economic | cheap’). We specifically aim to model regular polysemy, that is, types of sense alternations that are shared across lemmata. To date, both semantic classes for adjectives and regular polysemy have been only sparsely addressed in empirical computational linguistics.
Our experiments shed light on the two main questions tackled in this article. First, what is an adequate broad semantic classification for adjectives? We provide empirical support for the qualitative and relational classes, defined in theoretical work, and uncover one type of adjective that has not received enough attention, namely, the event-related class. Second, how is regular polysemy best modeled in computational terms? We present two models and argue that the second, which models regular polysemy as membership in multiple basic classes, is both theoretically and empirically more adequate than the first, which attempts to identify polysemous classes. Taking polysemy into account, our best adjective classifier achieves 69.1% accuracy, against a majority baseline of 51% and an upper bound (human agreement) of 68%.