Distributional Memory: A General Framework for Corpus-based Semantics

Authors

  • Marco Baroni, Center for Mind/Brain Sciences (CIMeC), University of Trento
  • Alessandro Lenci, Department of Linguistics, University of Pisa

Abstract

Research into corpus-based semantics has focused on the development
of ad hoc models that treat single tasks, or sets of closely related
tasks, as unrelated challenges to be tackled by extracting different
kinds of distributional information from the corpus. As an
alternative to this "one task, one model" approach, the
Distributional Memory framework extracts distributional information
once and for all from the corpus, in the form of a set of weighted
word-link-word tuples arranged into a third-order tensor. Different
matrices are then generated from the tensor, and their rows and
columns constitute natural spaces to deal with different semantic
problems. In this way, the same distributional information can be
shared across tasks such as modeling word similarity judgments,
discovering synonyms, concept categorization, predicting selectional
preferences of verbs, solving analogy problems, classifying
relations between word pairs, harvesting qualia structures with
patterns or example pairs, predicting the typical properties of
concepts, and classifying verbs into alternation classes. Extensive
empirical testing in all these domains shows that a Distributional
Memory implementation performs competitively against task-specific
algorithms recently reported in the literature for the same tasks,
and against our implementations of several state-of-the-art methods.
The Distributional Memory approach is thus shown to be tenable
despite the constraints imposed by its multi-purpose nature.
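
As a rough illustration of the idea described in the abstract, the sketch below stores a handful of weighted word-link-word tuples as a sparse third-order tensor and derives two different matrices from it. The tuples, their weights, and the names `matricize`, `word_by_link_word`, and `word_word_by_link` are all hypothetical and chosen only for this example; it is not the authors' implementation.

```python
from collections import defaultdict

# Toy weighted word-link-word tuples; words, links, and weights are
# invented for illustration and do not come from any real corpus.
tuples = {
    ("marine", "use", "gun"): 3.0,
    ("marine", "own", "gun"): 1.0,
    ("teacher", "use", "book"): 2.0,
    ("teacher", "own", "book"): 4.0,
}


def matricize(tensor, row_of, col_of):
    """Flatten a sparse third-order tensor into a sparse matrix.

    row_of and col_of map each (word1, link, word2) key to the row
    label and the column label of the resulting matrix.
    """
    matrix = defaultdict(dict)
    for key, weight in tensor.items():
        matrix[row_of(key)][col_of(key)] = weight
    return {row: dict(cols) for row, cols in matrix.items()}


# Rows are words, columns are (link, word) pairs: one plausible space
# for comparing single words with each other.
word_by_link_word = matricize(
    tuples, lambda t: t[0], lambda t: (t[1], t[2])
)

# Rows are (word, word) pairs, columns are links: one plausible space
# for comparing relations between word pairs.
word_word_by_link = matricize(
    tuples, lambda t: (t[0], t[2]), lambda t: t[1]
)
```

Both matrices are built from the same tuple store, which is the point of extracting the distributional information only once: a new task needs a different view of the tensor, not a new pass over the corpus.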

Author Biographies

  • Marco Baroni, Center for Mind/Brain Sciences (CIMeC), University of Trento
    Center for Mind/Brain Sciences (CIMeC), Researcher
  • Alessandro Lenci, Department of Linguistics, University of Pisa
    Department of Linguistics, Researcher

Published

2024-12-05

Section

Long paper