Polysemy - Evidence from Linguistics, Behavioural Science and Contextualised Language Models
Abstract
Polysemy, a type of lexical ambiguity where a word can have multiple, related senses, has traditionally received less research attention than the better-known phenomenon of homonymy, where a word can have multiple, unrelated meanings. Over the last decade in particular, however, substantial new evidence and many new theoretical analyses of the processing of polysemy have appeared in the linguistics, psychology, and computational linguistics literature. This recent acceleration is fuelled by the growing availability of large, crowd-sourced datasets containing empirical evidence on different facets of the phenomenon, as well as by the development of contextualised language models that aim to represent a given word within a given context. In this survey we discuss these recent contributions to the investigation of polysemy against the backdrop of a long legacy of seminal research conducted across multiple decades and disciplines, and attempt to summarise the theoretical picture that is emerging from their observations.
The survey is organised in three parts. In Part 1, we discuss lexical ambiguity from a linguistic perspective, introducing the distinction between homonymy, polysemy, and vagueness, and covering the key proposals regarding polysemy and its sub-types.
In Part 2, we discuss cognitive evidence on the representation of polysemy in the mental lexicon, reviewing different proposals about the organisation of lexical knowledge in the human language processor. This part covers a range of seminal models based on fundamentally opposing assumptions, as well as a selection of recent hybrid models that combine elements of previous approaches to explain (ir)regularities in the accumulating behavioural data on lexical semantics. Finally, in Part 3, we present an overview of computational approaches to the representation of word meaning in general and polysemy in particular. We introduce traditional distributional semantics models and specialised approaches aimed at representing a specific word sense, before moving to a discussion of language models developed to supply static and, more recently, contextualised word vectors that represent individual word meanings, highlighting how these have influenced, and are bound to influence, the investigation of polysemy.