Feature-Frequency-Adaptive Online Training for Natural Language Processing with Massive Features

Authors

  • Xu Sun, Peking University
  • Wenjie Li
  • Houfeng Wang
  • Qin Lu

Abstract

Large-scale natural language processing (NLP) systems are computationally expensive. In many real-world applications, we further need to optimize over very high-dimensional model parameters. Heavy NLP models combined with high-dimensional parameters make model training challenging, potentially requiring weeks of training time even on fast computing machines. To address this problem, we present a new training method, feature-frequency-adaptive online training, for fast and accurate training of NLP systems with high-dimensional features/parameters. Theoretical analysis shows that the proposed method is convergent, with a fast convergence rate. Experiments are performed on well-known benchmark NLP tasks, including named entity recognition, word segmentation, and phrase chunking. Experimental results demonstrate that the proposed method not only achieves significantly better accuracy than existing methods, but also trains much faster. Beyond comparisons at the convergence state, we also evaluate one-pass learning, where the proposed method again outperforms existing methods. Finally, on all three benchmark tasks, the proposed method achieves better accuracy than the best previously reported results known to us.
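The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea behind feature-frequency-adaptive online training, not the authors' exact algorithm: each feature keeps its own learning rate, and rates for frequently updated features are decayed more aggressively than those for rare features. The function name, the multiplicative decay schedule, and all hyperparameters below are assumptions made for illustration.

```python
import numpy as np

def ffa_sgd(examples, dim, base_rate=0.1, decay=0.9, epochs=1):
    """Illustrative sketch (not the paper's algorithm): online SGD with
    logistic loss where each feature's learning rate shrinks
    multiplicatively every time that feature participates in an update.

    examples: list of (sparse_feature_dict, label) pairs, label in {0, 1}
    dim:      total number of features
    """
    w = np.zeros(dim)               # model parameters
    rate = np.full(dim, base_rate)  # per-feature learning rates
    for _ in range(epochs):
        for feats, y in examples:
            idx = np.fromiter(feats.keys(), dtype=int)
            val = np.fromiter(feats.values(), dtype=float)
            # logistic prediction using only the active (sparse) features
            p = 1.0 / (1.0 + np.exp(-(w[idx] @ val)))
            g = (p - y) * val       # gradient w.r.t. the active weights
            w[idx] -= rate[idx] * g
            # frequent features are decayed often and settle quickly;
            # rarely seen features keep large, aggressive rates
            rate[idx] *= decay
    return w

# toy usage: two sparse examples over a 5-dimensional feature space
data = [({0: 1.0, 3: 2.0}, 1), ({1: 1.0, 3: 1.0}, 0)]
print(ffa_sgd(data, dim=5, epochs=3))
```

Because updates and rate decays touch only the features active in each example, the per-example cost stays proportional to the number of active features, which is what makes this style of training attractive for the massive sparse feature sets the abstract describes.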

Author Biography

  • Xu Sun, Peking University

    Professor, Department of Computer Science, Peking University

Published

2024-12-05

Issue

Section

Short paper