Previous research has shown that the high level of detail in the syntactic trees produced by the TOSCA parsing system (Oostdijk 1996) is beneficial to why-question answering (why-QA) (Verberne et al. 2006). TOSCA is an interactive system, i.e. it needs human verification after automatic tagging and parsing. Since only manually corrected TOSCA output has been offered to the why-QA system until now, an extrinsic evaluation of TOSCA's use in the why-QA system is still needed. In this paper we present a necessary step towards that evaluation, namely an intrinsic evaluation of the performance of TOSCA on why-questions, which also enables us to trace elements in the parser that leave room for improvement. The evaluation shows that the modularity of the current TOSCA system has a dramatic effect on its performance: tagging errors and missing syntactic markers radically decrease both the coverage and the Parseval scores. Applying the Leaf-Ancestor Assessment metric for parser evaluation, we conclude that the level of detail has little effect on parser accuracy. This encourages automatic use of the parsing component in TOSCA for the purpose of why-QA. A new version of TOSCA is under construction, in which the level of detail in the parses is maintained, while there is no longer a need to separately provide POS tags or insert any syntactic markers.
Reference: Daphne Theijssen, Suzan Verberne, Nelleke Oostdijk and Lou Boves (2007). Evaluating Deep Syntactic Parsing: Using TOSCA for the analysis of why-questions. In Peter Dirix, Ineke Schuurman, Vincent Vandeghinste and Frank Van Eynde (eds.), Computational Linguistics in the Netherlands 2007, pp. 115-130.