The Treebank Semantics Parsed Corpus (TSPC)

Author: Alastair Butler


The Treebank Semantics Parsed Corpus (TSPC) was built as a testing ground for generating predicate-logic-based meaning representations with Treebank Semantics.

The easiest way to obtain all the annotated data is from the GitHub repository.

The parsed annotation follows a scheme informed by both the Penn Historical Corpora scheme (adopting its tag labels, construction analysis, and CorpusSearch format) and the SUSANNE scheme (adopting its construction analysis, functional and grammatical information, and formation of complex expressions).
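Since the annotation adopts the CorpusSearch format, each tree is stored as labelled bracketing. The following is a minimal Python sketch of reading one such tree, assuming the common convention of an unlabelled wrapper node that holds the tree together with an ID node. The example tree, its tags, and the ID value are made up for illustration and are not taken from the TSPC data; the function names are likewise hypothetical.

```python
def parse_bracketed(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Leaves are plain strings. An unlabelled wrapper node gets the label "".
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()


def tagged_words(tree):
    """Collect (tag, word) pairs in order, skipping the ID node."""
    label, children = tree
    if label == "ID":
        return []
    pairs = []
    for child in children:
        if isinstance(child, str):
            pairs.append((label, child))
        else:
            pairs.extend(tagged_words(child))
    return pairs


# Hypothetical tree in the bracketed style; not actual corpus content.
EXAMPLE = "( (IP-MAT (NP-SBJ (PRO She)) (VBD smiled)) (ID example_1))"

tree = parse_bracketed(EXAMPLE)
print(tagged_words(tree))
```

The nested-tuple representation keeps the sketch free of dependencies; a real workflow would more likely query the files with CorpusSearch itself or load them with an existing treebank reader.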


The parsed data and further results of analysis (e.g., derived indexing, word dependencies, and generated semantic representations) are accessible through a web-based interface.

Development was funded by the Japan Science and Technology Agency (JST) and the Japan Society for the Promotion of Science (JSPS).