Greedy Transition-based Dependency Parsing with Stack-LSTMs

Authors

  • Miguel Ballesteros Universitat Pompeu Fabra. Carnegie Mellon University. http://orcid.org/0000-0003-3949-0361
  • Chris Dyer Carnegie Mellon University.
  • Yoav Goldberg Bar Ilan University.
  • Noah A. Smith University of Washington.

Abstract

We introduce a greedy transition-based parser that learns representations of parser states using recurrent neural networks. Our primary innovation that enables us to do this efficiently is a new control structure for sequence-to-sequence neural networks---the stack LSTM. Like the conventional stack data structures used in transition-based parsing, elements
can be pushed to or popped from the top of the stack in constant time, but, in
addition, an LSTM maintains a continuous space embedding of the stack contents.
We demonstrate that the stack LSTM help to capture three facets of
a parser's state in an efficient way: (i) unbounded look-ahead into the buffer of incoming words, (ii) the complete history of transition actions taken by the parser,
and (iii) the complete contents of the stack of partially built tree
fragments, including their internal structures. In addition, we
compare two different word representations: (i) standard word vectors
based on look-up tables and (ii) character-based models of words. While the former vector model works well in all languages, the latter improves the
handling of out-of-vocabulary words particularly in morphologically rich languages. Finally, we discuss the use of dynamic oracles in training the parser. These yield state-of-the-art performance, showing that it is possible to achieve one of the best-published results with a linear greedy transition-based parser.

Author Biographies

  • Miguel Ballesteros, Universitat Pompeu Fabra. Carnegie Mellon University.
    Postdoctoral researcher at Universitat Pompeu Fabra and Visiting research scholar at Language Technologies Institute. Carnegie Mellon University
  • Chris Dyer, Carnegie Mellon University.
    Assistant Professor at Language Technologies Institute at Language Technologies Institute. Carnegie Mellon University
  • Yoav Goldberg, Bar Ilan University.
    Senior Lecturer at Bar Ilan University's Computer Science Department
  • Noah A. Smith, University of Washington.
    Associate Professor at the University of Washington

Published

2024-12-05

Issue

Section

Long paper