How is a "Kitchen Chair" like a "Farm Horse"? Exploring the Representation of Noun-Noun Compound Semantics in Transformer-based Language Models

Authors

Abstract

Despite the success of Transformer-based language models in a wide variety of natural language processing tasks, our understanding of how these models process a given input in order to represent task-relevant information remains incomplete. In this work, we focus on semantic composition and examine how Transformer-based language models represent semantic information related to the meaning of English noun-noun compounds. We use two English noun-noun compound datasets that allow us to probe Transformer-based language models for their knowledge of the thematic relation that links the head noun and modifier word of a compound (e.g. kitchen chair: a chair located in a kitchen). Firstly, using a dataset featuring groups of compounds with shared lexical or semantic features, we find that token representations of six Transformer-based language models distinguish between pairs of compounds based on whether they use the same thematic relation. Secondly, we utilise a more fine-grained representation of noun-noun compound semantics based on vector representations derived from human annotations, and find that token vectors from four of the six models elicit a strong signal of the semantic relations used in the compounds. In a novel 'compositional probe' setting, where we compare the semantic relation signal in mean-pooled token vectors of a noun-noun compound to mean-pooled token vectors when the two constituent words appear in separate sentences, we find that the overall best performing models prefer the former compositional setting, shedding light on the ability of Transformer-based language models to support semantic composition processes in representing the meaning of noun-noun compounds.
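The comparison described in the abstract can be illustrated with a minimal sketch: mean-pool the token vectors of each compound, then check whether two compounds that share a thematic relation are closer than two that do not. Everything below is an assumption made for illustration and is not taken from the paper: the three-dimensional token vectors are hand-made toy values rather than hidden states from a Transformer, the relation labels `LOCATION` and `PURPOSE` are hypothetical, "cheese knife" is an invented contrast item, and cosine similarity is one plausible choice of similarity measure rather than the authors' stated metric.

```python
import math


def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into a single vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors of the same length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy data (hypothetical): one vector per token of each compound.
# A real probe would take these from a chosen layer of a language model.
compounds = {
    "kitchen chair": {"relation": "LOCATION",
                      "tokens": [[0.9, 0.1, 0.0], [0.7, 0.3, 0.1]]},
    "farm horse": {"relation": "LOCATION",
                   "tokens": [[0.8, 0.2, 0.1], [0.9, 0.0, 0.1]]},
    "cheese knife": {"relation": "PURPOSE",
                     "tokens": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.4]]},
}

# One mean-pooled vector per compound.
pooled = {name: mean_pool(item["tokens"]) for name, item in compounds.items()}

# Same-relation pair versus different-relation pair.
same_sim = cosine(pooled["kitchen chair"], pooled["farm horse"])
diff_sim = cosine(pooled["kitchen chair"], pooled["cheese knife"])

print(f"same relation:      {same_sim:.3f}")
print(f"different relation: {diff_sim:.3f}")
```

With these toy values the same-relation pair scores higher than the different-relation pair, which is the kind of pattern the first analysis looks for. The 'compositional probe' would reuse `mean_pool` unchanged, but pool token vectors for the two constituent words taken from separate sentences and compare the resulting relation signal with the in-compound case.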

Author Biographies

  • Mark Ormerod, Queen's University Belfast
    PhD Candidate, School of Electronics, Electrical Engineering and Computer Science
  • Jesús Martínez del Rincón, Queen's University Belfast
    Senior Lecturer, School of Electronics, Electrical Engineering and Computer Science
  • Barry Devereux, Queen's University Belfast
    Lecturer, School of Electronics, Electrical Engineering and Computer Science

Published

2024-09-02