Multi-CAST: Research and publications

The data in Multi-CAST has been annotated with the aim of facilitating cross-linguistic research on the grammar/discourse interface. In particular, Multi-CAST provides quantifiable data for investigating how referential expressions are encoded in natural language. Typical research questions, together with a number of sample approaches to the data, are outlined in the Multi-CAST research context.


Along with transcriptions and word-for-word translations, the texts in Multi-CAST have been annotated with the GRAID (Grammatical Relations and Animacy in Discourse) annotation scheme, developed by Geoffrey Haig and Stefan Schnell. GRAID provides a uniform set of tags and a simple combinatory syntax, and was designed to be applicable to a typologically diverse spectrum of languages.

GRAID is intended to facilitate quantitative cross-corpus investigations targeting, for example, Referential Density (Bickel 2003) or Preferred Argument Structure (Du Bois 1987, 2003). An overview of the system is provided in the GRAID Manual 7.0 (Haig & Schnell 2014).

Publications utilising Multi-CAST

Publications and presentations which make use of data from Multi-CAST or its prototypes are collected here. If you have employed Multi-CAST in your research and would like to see your work included in this list, please contact Geoffrey Haig or Stefan Schnell.

Haig, Geoffrey & Schnell, Stefan. 2016. The discourse basis of ergativity revisited. Language 92(3). 591–618. (DOI: 10.1353/lan.2016.0049)