Parsing Chinese Sentences with Grammatical Relations

Authors

  • Weiwei Sun Peking University
  • Yufei Chen Peking University
  • Xiaojun Wan Peking University
  • Meichun Liu City University of Hong Kong

Abstract

We report our work on building linguistic resources and statistical parsers for grammatical relation (GR) analysis of Chinese sentences.
Chinese, as an analytic language, encodes grammatical information in a highly configurational rather than morphological way.
Accordingly, it is both possible and reasonable to represent almost all grammatical relations as bi-lexical dependencies.
In this work, we propose to represent the grammatical information using general directed dependency graphs.
Not only local, but also rich long-distance dependencies are explicitly represented.
To create high-quality annotations, we take advantage of an existing treebank, viz. the Chinese TreeBank (CTB), which is grounded in Government and Binding theory.
We define a set of linguistic rules to explore CTB's implicit phrase structural information and build deep dependency graphs.
The reliability of this linguistically-motivated GR extraction procedure is highlighted by manual evaluation.
Based on the converted corpus, we study data-driven models, both graph-based and transition-based, for Chinese GR parsing.
For graph-based parsing, we propose graph merging, a new perspective for building flexible dependency graphs: constructing complex graphs by constructing simple subgraphs.
We discuss two key problems in this perspective: (1) how to decompose a complex graph into
simple subgraphs, and (2) how to combine subgraphs into a coherent complex graph.
For transition-based parsing, we introduce a neural parser based on a list-based transition system.
We also discuss several key problems in neural transition-based parsing, including dynamic oracles and beam search.
Evaluation gauges how successful GR parsing for Chinese can be by applying data-driven models.
The empirical analysis suggests several directions for future study.
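
The graph-merging idea described above (decompose a complex dependency graph into simple subgraphs, then combine them) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how graphs are decomposed or merged, so this sketch assumes a split by arc direction and a plain set union as stand-ins. The example sentence, the relation labels `subj` and `comp`, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# A bi-lexical dependency: (head index, dependent index, relation label).
Arc = Tuple[int, int, str]


def decompose_by_direction(arcs: Set[Arc]) -> Dict[str, Set[Arc]]:
    """Split a graph into two subgraphs by arc direction.

    Assumption: direction is only one conceivable decomposition criterion,
    chosen here because it is easy to state.
    """
    parts: Dict[str, Set[Arc]] = {"head_before_dep": set(), "head_after_dep": set()}
    for head, dep, label in arcs:
        key = "head_before_dep" if head < dep else "head_after_dep"
        parts[key].add((head, dep, label))
    return parts


def merge(subgraphs: Iterable[Set[Arc]]) -> Set[Arc]:
    """Combine subgraphs by set union (a stand-in for a real merging step)."""
    merged: Set[Arc] = set()
    for sub in subgraphs:
        merged |= sub
    return merged


def words_with_multiple_heads(arcs: Set[Arc]) -> Set[int]:
    """Return dependents that have more than one head, i.e. why a tree is not enough."""
    heads_of: Dict[int, Set[int]] = {}
    for head, dep, _ in arcs:
        heads_of.setdefault(dep, set()).add(head)
    return {dep for dep, heads in heads_of.items() if len(heads) > 1}


# Hypothetical control construction: "他 想 去" ("he wants to go").
# Word 0 is the subject of both verbs, so the structure is a graph, not a tree.
tokens = ["他", "想", "去"]
graph: Set[Arc] = {(1, 0, "subj"), (1, 2, "comp"), (2, 0, "subj")}

parts = decompose_by_direction(graph)
rebuilt = merge(parts.values())
```

In this toy case the union of the two subgraphs reproduces the original graph exactly. In a parser the subgraphs would be predicted separately and might disagree, which is why combining them into a coherent graph is listed as a key problem.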

Author Biographies

  • Weiwei Sun, Peking University
    Weiwei Sun is an Associate Professor of Computer Science at the Institute of Computer Science and Technology, Peking University. She completed her PhD in the Department of Computational Linguistics at Saarland University under the supervision of Prof. Hans Uszkoreit. Before that, she studied at Peking University, where she obtained a Bachelor of Arts (Linguistics), a Bachelor of Science (Computer Science), and a Master of Science (Computer Science). She has worked at the German Research Center for Artificial Intelligence (DFKI) as a research assistant and at the Department of Linguistics and Translation, City University of Hong Kong, as a visiting Associate Professor. Her main research topic is statistical parsing, with a special focus on graph-structured syntacto-semantic representations grounded in expressive grammar formalisms. She is also interested in linguistic investigation, with a special focus on Mandarin syntax. Weiwei is a regular member of the program committees of various ACL conferences. She is one of the area chairs (Tagging and Parsing) of ACL 2018.

  • Xiaojun Wan, Peking University
    Xiaojun Wan is a Professor with the Institute of Computer Science and Technology (ICST), Peking University (PKU), China. He received a B.S. in Information Sciences from the Department of Information Management of PKU in 2000, and an M.S. and a Ph.D. in Computer Science from the Department of Computer Science and Technology of PKU in 2003 and 2006, respectively.
    Xiaojun's major research interests include text mining and natural language processing. He is broadly interested in several research topics, including document summarization, text generation, sentiment analysis, semantic computing, document recommendation, and bibliometric analysis.
  • Meichun Liu, City University of Hong Kong
    Professor Meichun Liu received her PhD in Linguistics from the University of Colorado at Boulder in 1993. Before joining the Department of Linguistics and Translation, City University of Hong Kong, as the Head in August 2015, she taught in the Department of Foreign Languages and Literature, National Chiao-Tung University (NCTU), from 1994 and was promoted to the rank of Professor of Linguistics in 2002. Between 2003 and 2006, she was the Chair of the Department of Foreign Languages and Literature of NCTU. In 2007-08, she was the Director of the NCTU Library. In 2013-14, she was the Coordinator of the Teaching Chinese as a Foreign Language Certificate Program. She was also a Visiting Scholar at the Department of Linguistics, University of Colorado at Boulder, and a Visiting Scholar at the Department of East Asian Languages and Cultures, Stanford University.

    Professor Liu has won a number of significant awards in teaching and research, including the NCTU Excellent Teaching Award (2014), the NCTU Distinguished Academic Book Publication Award (2013), the LST Thesis of the Year Award, Linguistics Society of Taiwan (2011), the Pursuit of Excellency Research Grant Award (2007-2011), and the NCTU Outstanding Teaching Award (2009-10). She has also served as an External Reviewer, Hong Kong Research Grant Council (2011-14), LST Board Member (2010-11), and Section Editor, International Journal of Computational Linguistics & Chinese Language Processing (2009-13), as well as a reviewer for a number of academic journals, such as Journal of Pragmatics, Language and Linguistics, and Language.

Published

2024-12-05

Section

Long paper