Deterministic coreference resolution based on entity-centric, precision-ranked rules

Authors

  • Heeyoung Lee, Stanford University
  • Angel Chang, Stanford University
  • Yves Peirsman, University of Leuven
  • Nathanael Chambers, United States Naval Academy
  • Mihai Surdeanu, Stanford University
  • Dan Jurafsky, Stanford University

Abstract

We propose a new deterministic approach to coreference resolution that combines the global information and precise features of modern machine-learning models with the transparency and modularity of deterministic, rule-based systems. Our sieve architecture applies a battery of deterministic coreference models one at a time from highest to lowest precision, where each model builds on the previous model’s cluster output. The two stages of our sieve-based architecture, a mention detection stage that heavily favors recall, followed by coreference sieves that are precision oriented, offer a powerful way to achieve both high precision and high recall. Further, our approach makes use of global information through an entity-centric model that encourages the sharing of features across all mentions that point to the same real-world entity. Despite its simplicity, our approach gives state-of-the-art performance on several corpora and genres, and has also been incorporated into hybrid state-of-the-art coreference systems for Chinese and Arabic. Our system thus offers a new paradigm for combining knowledge in rule-based systems that has implications throughout computational linguistics.
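
The sieve idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Entity` class, the two rules (`exact_match_sieve`, `head_match_sieve`), the `resolve` function, and the toy mentions are all hypothetical names and data chosen for illustration. The sketch only shows the two properties the abstract names: rules run one at a time in precision order, each starting from the clusters left by the previous rule, and features are pooled over every mention of an entity.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A cluster of mention indices plus features pooled over all its mentions."""
    mentions: list
    texts: set = field(default_factory=set)
    heads: set = field(default_factory=set)


def exact_match_sieve(a, b):
    # Hypothetical high-precision rule: the entities share an identical mention text.
    return bool(a.texts & b.texts)


def head_match_sieve(a, b):
    # Hypothetical lower-precision rule: the entities share a head word.
    return bool(a.heads & b.heads)


def resolve(mentions, sieves):
    """Apply sieves in the given order; each pass starts from the previous clusters."""
    # Every mention starts as its own singleton entity (lower-cased features).
    entities = [Entity([i], {text.lower()}, {head.lower()})
                for i, (text, head) in enumerate(mentions)]
    for sieve in sieves:
        merged = []
        for ent in entities:
            # Greedy left-to-right pass: join the first earlier entity the rule accepts.
            target = next((m for m in merged if sieve(m, ent)), None)
            if target is None:
                merged.append(ent)
            else:
                # Entity-centric step: pool features so later sieves see all of them.
                target.mentions.extend(ent.mentions)
                target.texts |= ent.texts
                target.heads |= ent.heads
        entities = merged
    return [sorted(e.mentions) for e in entities]


# Toy input (assumed for illustration): (mention text, head word) pairs.
MENTIONS = [
    ("Barack Obama", "Obama"),
    ("the president", "president"),
    ("Obama", "Obama"),
    ("barack obama", "Obama"),
]
CLUSTERS = resolve(MENTIONS, [exact_match_sieve, head_match_sieve])
print(CLUSTERS)  # [[0, 2, 3], [1]]
```

With only the exact-match rule, mentions 0 and 3 are linked and mention 2 stays alone; adding the lower-precision head rule afterwards attaches mention 2 to the existing cluster rather than re-deciding the earlier link.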

Published

2024-12-05

Issue

Section

Short paper