Abstract Syntax as Interlingua: from Controlled Languages to Robust Pipelines

Authors

  • Aarne Ranta Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg
  • Krasimir Angelov
  • Normunds Gruzitis
  • Prasanth Kolachina Institute of Mathematics and Computer Science and Engineering, University of Latvia

Abstract

Abstract syntax is an interlingual representation used in compilers.
Grammatical Framework (GF) applies the abstract syntax idea to natural languages.
The development of GF started in 1998, first as a tool for controlled language implementations, where it has gained an established position in both academic and commercial projects.
GF provides grammar resources for over 40 languages, enabling accurate generation and translation, as well as grammar engineering tools and mobile and web interface components for building applications.
On the research side, the focus since around 2012 has been on scaling up GF to wide-coverage language processing.
The concept of abstract syntax offers a unified view of many other approaches: Universal Dependencies, WordNets, FrameNets, Construction Grammars, and Abstract Meaning Representations.
This makes it possible for GF to utilize data from the other approaches and to build robust pipelines.
In return, GF can contribute to data-driven approaches with methods for bootstrapping resources from given languages to new ones, augmenting data by rule-based generation, checking the consistency of hand-annotated corpora, and piping analyses into high-precision semantic back ends.
This paper gives an overview of the use of abstract syntax as interlingua through both established and emerging NLP applications involving GF.
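
The idea of abstract syntax as interlingua can be illustrated with a minimal sketch. It is written in Python rather than GF, and every name in it (the tree shape, the toy lexicon, `linearize_eng`, `linearize_ita`, `parse`, `translate`) is a hypothetical illustration, not GF's API or the authors' implementation. One language-independent tree is linearized by two concrete syntaxes, and translation is parsing in one language followed by linearization in the other.

```python
from itertools import product

# Abstract syntax: language-independent trees as nested tuples,
# e.g. ("Pred", ("This", "Wine"), "Italian"). Toy vocabulary, assumed for illustration.
KINDS = ["Wine", "Pizza"]
QUALITIES = ["Italian", "Fresh"]

def all_trees():
    """Enumerate every tree of the (finite) toy abstract syntax."""
    for kind, quality in product(KINDS, QUALITIES):
        yield ("Pred", ("This", kind), quality)

# Concrete syntax 1: English, no agreement needed.
ENG = {"Wine": "wine", "Pizza": "pizza", "Italian": "Italian", "Fresh": "fresh"}

def linearize_eng(tree):
    _, (_, kind), quality = tree
    return f"this {ENG[kind]} is {ENG[quality]}"

# Concrete syntax 2: Italian, where determiner and adjective agree with noun gender.
ITA_NOUN = {"Wine": ("vino", "m"), "Pizza": ("pizza", "f")}
ITA_ADJ = {
    "Italian": {"m": "italiano", "f": "italiana"},
    "Fresh": {"m": "fresco", "f": "fresca"},
}
ITA_DET = {"m": "questo", "f": "questa"}

def linearize_ita(tree):
    _, (_, kind), quality = tree
    noun, gender = ITA_NOUN[kind]
    return f"{ITA_DET[gender]} {noun} è {ITA_ADJ[quality][gender]}"

def parse(sentence, linearize):
    """Parse by brute-force inversion of a linearizer (fine for a finite toy grammar)."""
    return [t for t in all_trees() if linearize(t) == sentence]

def translate(sentence, source, target):
    """Translate via the abstract tree: parse with source, linearize with target."""
    return [target(t) for t in parse(sentence, source)]

if __name__ == "__main__":
    print(translate("this pizza is fresh", linearize_eng, linearize_ita))
```

The language-specific detail (here, Italian gender agreement) lives entirely in the concrete syntax, so the abstract tree stays the same across languages. Adding a language means adding one more linearizer.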

Published

2024-12-05

Issue

Section

Special Issue: Multilingual and Interlingual Semantic Representations for Natural Language Processing