Due to the rise of computers and especially the Internet, there is great demand for language technology applications such as spelling checkers, search engines and machine translation systems. One application within the field of information retrieval (IR) is question answering (QA). Where a regular search engine provides the user with a list of relevant documents, QA aims to answer a question posed in natural language by returning a list of possible answers.
Until recently, the focus of QA has been on closed-class questions (e.g. who-, what- and where-questions). The answer to such a question is a closed-class entity, for example a noun phrase. The expected answer type is predicted by the question word: a who-question like "Who is the president of the U.S.A.?" expects a person as its answer.
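As a minimal sketch of this idea, the expected answer type of a closed-class question can be looked up from its first word. Only the pairing of who-questions with persons comes from the text above; the other table entries and the names EXPECTED_ANSWER_TYPE and expected_answer_type are assumptions made for illustration.

```python
# Hypothetical lookup table. Only "who" -> "person" is taken from the text;
# the remaining entries are illustrative assumptions.
EXPECTED_ANSWER_TYPE = {
    "who": "person",
    "where": "location",
    "when": "time",
}


def expected_answer_type(question):
    """Return the expected answer type for a closed-class question.

    Returns None when the first word is not in the lookup table.
    """
    words = question.strip().split()
    if not words:
        return None
    return EXPECTED_ANSWER_TYPE.get(words[0].lower())


print(expected_answer_type("Who is the president of the U.S.A.?"))
```

A why-question falls outside such a table, because its question word alone does not determine the kind of answer that is expected.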
Much more complex methods are necessary when considering explanatory questions such as why-questions, which are currently addressed in a project at the Radboud University Nijmegen (In Search of the WHY, see
In this presentation, we demonstrate the benefit of using syntactic information for why-QA and discuss the evaluation of a deep syntactic parser in this context. A syntactic parser is a system that automatically derives syntactic structures from raw text input.
Subject agency seems to be a relevant cue for answer type determination, since only agents can have a motivation for a certain action. To locate the subject, we need information from a syntactic parser. In the current project we use the deep syntactic parser TOSCA (Oostdijk, 1996), which outputs very detailed syntactic trees. Many features used for answer type determination are based on the TOSCA output. Using perfect syntactic trees, the system assigns the correct answer type to almost 80% of the 238 questions in our data set. Without syntactic information, only 70% of the questions are correctly classified, which shows that syntactic information improves the why-QA system.
Of course, automatically derived syntactic trees may contain parsing errors. Therefore, we evaluated the accuracy of the TOSCA parser and investigated the influence of parsing errors on answer type determination. The first conclusion is that the performance of TOSCA is hampered by the quality of the output of a pre-parsing step (which includes part-of-speech tagging): only 80% of the 238 questions could be parsed, with an average labelled F-score of 78.3%*. Using edited input, the parser performs very well (average labelled F-score of 95.9% for 233 covered questions). Moreover, it appears that despite errors in the latter output trees, the performance of the question analysis module is not affected. This indicates that even automatically derived, and therefore possibly erroneous, syntactic trees can still be beneficial to language technology applications.
* The labelled F-score is the harmonic mean of bracketing precision and recall, a generally accepted parse evaluation metric (used, for example, by Dienes and Dubey (2003)).
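The metric in this footnote can be sketched as follows. This is a minimal illustration, not the evaluation script used in the project: it assumes each constituent is represented as a (label, start, end) tuple, and the function name labelled_f_score and the example brackets are hypothetical.

```python
from collections import Counter


def labelled_f_score(gold, predicted):
    """Return (precision, recall, F-score) over labelled brackets.

    Each bracket is assumed to be a (label, start, end) tuple. A predicted
    bracket counts as correct only if label and span both match a gold one.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: each gold bracket can be matched at most once.
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Made-up example: the parser output misplaces one NP bracket.
gold = [("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 2, 5)]
predicted = [("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("NP", 3, 5)]
precision, recall, f_score = labelled_f_score(gold, predicted)
print(precision, recall, f_score)
```

In this example three of the four brackets match on both sides, so precision, recall and F-score all come out at 0.75. An average score such as the ones reported above would then be the mean of the per-question F-scores, assuming that is how the averaging was done.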
Presented at: Student conference TWIST, 13 April 2007, Leiden University, Leiden, the Netherlands.
Slides (pdf; 590kB)