Learning to Rank QA Data: Evaluating Machine Learning Techniques for Ranking Answers to Why-Questions

In this work, we evaluate a number of machine learning techniques for ranking answers to why-questions. We use a set of 37 linguistically motivated features that characterize questions and answers, and we experiment with these techniques in various settings. The purpose of the experiments is to assess how well the different techniques cope with our highly imbalanced binary relevance data. We find that with all machine learning techniques, we eventually obtain an MRR score that is significantly above the TF-IDF baseline of 0.25 and not significantly lower than the best score of 0.35. Regression techniques seem the best option for our learning problem.
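
For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how mean reciprocal rank (MRR) can be computed over binary relevance labels. It is an illustration only, not the evaluation code used in the paper: the function names and the toy rankings are hypothetical, and it assumes that each question's candidate answers are already sorted by ranker score, with 1 marking a relevant answer and 0 an irrelevant one.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """Return 1/rank of the first relevant answer, or 0.0 if none is relevant."""
    for rank, label in enumerate(relevance, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[int]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for an empty set)."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)


# Hypothetical toy data: one list of binary labels per question,
# ordered from the highest-ranked to the lowest-ranked answer.
toy_rankings = [
    [0, 1, 0, 0],  # first relevant answer at rank 2 -> 1/2
    [1, 0, 0, 0],  # first relevant answer at rank 1 -> 1
    [0, 0, 0, 0],  # no relevant answer retrieved   -> 0
    [0, 0, 0, 1],  # first relevant answer at rank 4 -> 1/4
]

mrr = mean_reciprocal_rank(toy_rankings)
print(f"MRR = {mrr:.4f}")  # prints "MRR = 0.4375"
```

Because most candidate answers are irrelevant in highly imbalanced data, MRR rewards a ranker only for how high it places the first relevant answer, which is why it is a natural measure for comparing rankers against a baseline such as TF-IDF.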


Reference: Suzan Verberne, Hans van Halteren, Stephan Raaijmakers, Daphne Theijssen and Lou Boves (2009). Learning to Rank QA Data: Evaluating Machine Learning Techniques for Ranking Answers to Why-Questions. Proceedings of the Workshop Learning to Rank for Information Retrieval (LR4IR 2009) at the 32nd Annual ACM Special Interest Group in Information Retrieval conference (SIGIR 2009), pp. 41-48.

