Learning to Rank Answers to Non-Factoid Questions from Web Collections

Authors

  • Mihai Surdeanu, Stanford University
  • Massimiliano Ciaramita, Google
  • Hugo Zaragoza, Yahoo! Research Barcelona

Abstract

This work investigates the use of linguistically derived features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing large collections of question-answer pairs (from online social Question Answering sites) to extract such features and to train ranking models that combine them effectively. We investigate a wide range of feature types, some of which exploit natural language processing such as coarse word sense disambiguation, named-entity identification, syntactic parsing, and semantic role labeling. Our experiments demonstrate that using these features in combination leads to considerable improvements in accuracy. Depending on the system settings, we measure relative improvements of 14% to 21% in Precision@1 and Mean Reciprocal Rank. To our knowledge, this is the first experiment to show that complex natural language modules such as word sense disambiguation and semantic role labeling have a significant impact on large-scale, open-domain non-factoid Question Answering.
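
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how Precision@1 and Mean Reciprocal Rank are commonly computed. It is not the authors' evaluation code. The function names and the input layout (one list of booleans per question, ordered by rank, with True marking a correct answer) are assumptions made for illustration.

```python
from typing import Sequence


def precision_at_1(ranked_labels: Sequence[Sequence[bool]]) -> float:
    """Fraction of questions whose top-ranked answer is correct.

    ranked_labels is a hypothetical input format: one list per question,
    ordered from rank 1 downward, True where the answer is correct.
    """
    hits = sum(1 for labels in ranked_labels if labels and labels[0])
    return hits / len(ranked_labels)


def mean_reciprocal_rank(ranked_labels: Sequence[Sequence[bool]]) -> float:
    """Average of 1/rank of the first correct answer for each question.

    A question with no correct answer in its list contributes 0.
    """
    total = 0.0
    for labels in ranked_labels:
        for rank, is_correct in enumerate(labels, start=1):
            if is_correct:
                total += 1.0 / rank
                break
    return total / len(ranked_labels)


# Toy rankings for three questions (made-up data, not from the paper).
toy_rankings = [
    [True, False],                # correct answer at rank 1
    [False, True],                # correct answer at rank 2
    [False, False, False, True],  # correct answer at rank 4
]
p_at_1 = precision_at_1(toy_rankings)
mrr = mean_reciprocal_rank(toy_rankings)
```

A relative improvement such as the 14% to 21% reported above is the change in one of these scores divided by the baseline score, not an absolute difference in percentage points.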

Published

2024-12-05

Issue

Section

Short paper