Probabilistic Distributional Semantics with Latent Variable Models

Authors

  • Diarmuid Ó Séaghdha, University of Cambridge
  • Anna Korhonen, University of Cambridge

Abstract

We describe a probabilistic framework for acquiring selectional preferences of linguistic predicates and for using the acquired representations to model the effects of context on word meaning. Our framework uses Bayesian latent-variable models inspired by, and extending, the well-known Latent Dirichlet Allocation (LDA) model of topical structure in documents; when applied to predicate-argument data, topic models automatically induce semantic classes of arguments and assign each predicate a distribution over those classes. We consider LDA and a number of extensions to the model and evaluate them on a variety of semantic prediction tasks, demonstrating that our approach attains state-of-the-art performance. More generally, we argue that probabilistic methods provide an effective and flexible methodology for distributional semantics.
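
To make the idea of applying a topic model to predicate-argument data concrete, the following is a minimal sketch, not the authors' implementation. It assumes plain LDA fitted by collapsed Gibbs sampling, with each predicate treated as a "document" and each argument as a "word". The function name `fit_lda_preferences`, the hyperparameter values, and the toy verb-object pairs are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def fit_lda_preferences(pairs, num_classes=2, alpha=0.5, beta=0.1,
                        iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA over (predicate, argument) pairs.

    Hypothetical helper: predicates act as documents, arguments as words,
    and the latent semantic classes as topics.
    """
    rng = random.Random(seed)
    vocab = sorted({a for _, a in pairs})
    V = len(vocab)

    n_pz = defaultdict(lambda: [0] * num_classes)          # predicate-class counts
    n_za = [defaultdict(int) for _ in range(num_classes)]  # class-argument counts
    n_z = [0] * num_classes                                # total count per class

    # Random initial class assignment for every observed pair.
    z = []
    for p, a in pairs:
        k = rng.randrange(num_classes)
        z.append(k)
        n_pz[p][k] += 1
        n_za[k][a] += 1
        n_z[k] += 1

    for _ in range(iters):
        for i, (p, a) in enumerate(pairs):
            # Remove the current assignment from the counts.
            k = z[i]
            n_pz[p][k] -= 1
            n_za[k][a] -= 1
            n_z[k] -= 1
            # Resample the class from its conditional distribution.
            weights = [
                (n_pz[p][j] + alpha) * (n_za[j][a] + beta) / (n_z[j] + V * beta)
                for j in range(num_classes)
            ]
            k = rng.choices(range(num_classes), weights=weights)[0]
            z[i] = k
            n_pz[p][k] += 1
            n_za[k][a] += 1
            n_z[k] += 1

    def preference(p, a):
        """Smoothed estimate of P(a | p) = sum_z P(z | p) * P(a | z)."""
        total_p = sum(n_pz[p]) + num_classes * alpha
        return sum(
            (n_pz[p][j] + alpha) / total_p
            * (n_za[j][a] + beta) / (n_z[j] + V * beta)
            for j in range(num_classes)
        )

    return preference, vocab


# Toy verb-object pairs, invented for illustration only.
pairs = [
    ("eat", "bread"), ("eat", "apple"), ("eat", "soup"),
    ("drink", "water"), ("drink", "wine"), ("drink", "soup"),
    ("cook", "soup"), ("cook", "bread"), ("pour", "wine"), ("pour", "water"),
]
preference, vocab = fit_lda_preferences(pairs)
for arg in vocab:
    print(f"P({arg} | eat) = {preference('eat', arg):.3f}")
```

Because the estimate mixes class-level argument distributions, a predicate can assign non-zero probability to an argument it was never observed with, which is the kind of generalization a selectional preference model needs. The extensions to LDA mentioned in the abstract are not covered by this sketch.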

Author Biographies

  • Diarmuid Ó Séaghdha, University of Cambridge

    Research Associate, Computer Laboratory

  • Anna Korhonen, University of Cambridge

    Royal Society University Research Fellow, Computer Laboratory and Department of Theoretical and Applied Linguistics

Published

2024-12-05

Issue

Section

Long paper