As analysis and annotation progress to deeper linguistic levels, matters become ever more difficult. Not only does it become harder to get machines to provide proper analyses, it also becomes harder to define exactly what we want. Whereas there appears to be consensus on what plural nouns are (morpho-syntax) or what relative clauses are (syntax), this is certainly not the case for semantic properties like concreteness. When reading papers that refer to such concepts, one is unlikely to notice any problems. Bresnan et al. (2007), for example, simply take the concreteness of a noun as a given and draw conclusions about the significance of its influence on choices in the dative alternation.
However, once we attempt to annotate for concreteness ourselves, we run headlong into the absence of any clear definition. Bresnan et al. refer to Garretson (2003), where all we get is a vague (and somewhat circular) description and some examples. Looking further, we find lists, such as those in the MRC Psycholinguistic Database (Coltheart, 1981), as well as procedures, such as Xing et al.'s (2010) WordNet-based procedure, all apparently yielding values for the property concreteness. But we can only wonder to what degree these various definitions and procedures lead to the same results. In this paper, therefore, we take a number of procedures that yield concreteness values and examine a) to what degree they overlap in their annotation of corpus data (here: Semcor) and b) to what degree they lead to the same conclusions about the influence of concreteness on syntactic processes (here: the dative alternation).
Presented at: The 21st meeting of Computational Linguistics In the Netherlands (CLIN-21), 11 February 2011, University College Ghent, Ghent, Belgium.