Data-driven Parsing using Probabilistic Linear Context-Free Rewriting Systems
Abstract
This paper presents the first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear Context-Free Rewriting
Systems (PLCFRS). LCFRS, an extension of CFG, can describe discontinuities in a straightforward way and is therefore a natural
candidate to be used for data-driven parsing. To speed up parsing, we
use different context-summary estimates of parse items, some of them allowing for A* parsing. We evaluate our parser with grammars extracted from the German NeGra treebank. Our experiments show that data-driven LCFRS parsing is feasible and yields output of competitive quality.
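
To make the notion of discontinuity concrete, the following minimal sketch shows how an LCFRS rule can assign a nonterminal a yield made of several non-adjacent blocks of words. It is an illustration under stated assumptions, not the parser described in this paper: the helper names (`apply_rule`, `lex`), the encoding of rule arguments as (child index, component index) pairs, the example sentence, and the toy rules are all hypothetical, and probabilities are left out entirely.

```python
from typing import List, Tuple

Component = Tuple[str, ...]   # one contiguous block of words
Yield = Tuple[Component, ...] # one block per argument; its length is the fan-out
Var = Tuple[int, int]         # (child index, component index); hypothetical encoding


def apply_rule(lhs_args: List[List[Var]], children: List[Yield]) -> Yield:
    """Compute the yield of a rule's left-hand side.

    Each left-hand-side argument is a sequence of variables; its value is
    the concatenation of the child components those variables refer to.
    """
    result = []
    for arg in lhs_args:
        words: List[str] = []
        for child, comp in arg:
            words.extend(children[child][comp])
        result.append(tuple(words))
    return tuple(result)


def lex(word: str) -> Yield:
    """Yield of a preterminal: a single block containing one word."""
    return ((word,),)


# Toy rule VP(X, Y) -> PROAV(X) VVPP(Y): fan-out 2, the two blocks stay apart.
vp_inner = apply_rule([[(0, 0)], [(1, 0)]], [lex("Darüber"), lex("nachgedacht")])

# Toy rule VP(X, Y Z) -> VP(X, Y) VAINF(Z): the second block is extended.
vp_outer = apply_rule([[(0, 0)], [(0, 1), (1, 0)]], [vp_inner, lex("werden")])

# Toy rule S(X V Y) -> VP(X, Y) VMFIN(V): the gap is filled, fan-out drops to 1.
s = apply_rule([[(0, 0), (1, 0), (0, 1)]], [vp_outer, lex("muss")])

print(vp_inner)
print(vp_outer)
print(s)
```

In this sketch a CFG corresponds to the special case in which every yield has exactly one block; allowing more than one block per nonterminal is what lets the inner VP cover "Darüber" and "nachgedacht" without covering the finite verb between them.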