Bilingual Co-training for Sentiment Classification of Chinese Product Reviews

Authors

Xiaojun Wan, Peking University

Abstract

The lack of reliable Chinese sentiment resources limits research progress on Chinese sentiment classification, whereas many English sentiment resources are freely available on the Web. This paper addresses the problem of cross-lingual sentiment classification, which uses only the available English resources to classify Chinese sentiment. We first investigate several basic methods (both lexicon-based and corpus-based) that simply use machine translation services to bridge the language gap. We then propose a co-training approach that makes use of both the English view and the Chinese view, drawing on additional unlabeled Chinese data. Experimental results on two test sets show that the proposed approach is effective and can outperform both the basic methods and the transductive methods.
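The two-view idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the classifier, the number of rounds, or how confident examples are selected, so everything below is an assumption made for illustration. The sketch uses a tiny Naive Bayes classifier as a stand-in, picks the single most confident unlabeled example per view per round, and combines the two views by summing log scores. The names `TinyNB`, `co_train`, and `predict_both` are hypothetical, and the toy review pairs are hand-written stand-ins for machine-translated text (the Chinese side is assumed to be pre-segmented with spaces).

```python
import math
from collections import Counter


class TinyNB:
    """Multinomial Naive Bayes with add-one smoothing (stand-in classifier)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def log_scores(self, doc):
        n = sum(self.prior.values())
        scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            s = math.log(self.prior[c] / n)
            for w in doc.split():
                if w in self.vocab:  # words never seen in training are skipped
                    s += math.log((self.counts[c][w] + 1) / total)
            scores[c] = s
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)

    def margin(self, doc):
        # Confidence proxy: gap between the two best class scores.
        s = sorted(self.log_scores(doc).values(), reverse=True)
        return s[0] - s[1]


def co_train(labeled, unlabeled, rounds=2, per_round=1):
    """labeled: (en_text, zh_text, label) triples; unlabeled: (en_text, zh_text) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        newly = {}
        for view in (0, 1):  # 0 = English view, 1 = Chinese view
            clf = TinyNB().fit([ex[view] for ex in labeled],
                               [ex[2] for ex in labeled])
            ranked = sorted(range(len(pool)),
                            key=lambda i: clf.margin(pool[i][view]),
                            reverse=True)
            for i in ranked[:per_round]:
                # If both views pick the same example, the first label is kept.
                newly.setdefault(i, clf.predict(pool[i][view]))
        labeled += [(pool[i][0], pool[i][1], y) for i, y in newly.items()]
        pool = [ex for i, ex in enumerate(pool) if i not in newly]
    labels = [ex[2] for ex in labeled]
    en_clf = TinyNB().fit([ex[0] for ex in labeled], labels)
    zh_clf = TinyNB().fit([ex[1] for ex in labeled], labels)
    return en_clf, zh_clf


def predict_both(en_clf, zh_clf, en_text, zh_text):
    # Assumed combination rule: add the log scores of the two views.
    en_s, zh_s = en_clf.log_scores(en_text), zh_clf.log_scores(zh_text)
    return max(en_s, key=lambda c: en_s[c] + zh_s[c])


# Toy data (illustrative only): each review appears in both languages.
labeled = [
    ("great phone love it", "很 好 的 手机 喜欢", "pos"),
    ("terrible battery hate it", "电池 很 差 讨厌", "neg"),
]
unlabeled = [
    ("love the great screen", "喜欢 屏幕 好"),
    ("hate the terrible camera", "讨厌 相机 差"),
]

en_clf, zh_clf = co_train(labeled, unlabeled)
```

After training, `predict_both(en_clf, zh_clf, en_text, zh_text)` classifies a review given its text in both languages. The design point the sketch tries to convey is that each view labels examples for the shared training set, so evidence that is clear in one language can help the classifier trained on the other.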
