Aims and objectives

WP 1: Formalizing and implementing RRG (syntax and semantics)

TreeGraSP builds on previous theoretical work on the specific form of the syntax-semantics interface using Tree Adjoining Grammar (TAG), and extends it with a formalization of Role and Reference Grammar (RRG). RRG is a rich, empirically motivated theory of grammar that has so far lacked a formalization. By formalizing it, we will gain new insights into the mathematical properties of grammar, and we will be able to provide tools for RRG implementation that will be highly useful to the community. Conversely, insights on grammar architecture, for instance the mechanisms underlying argument linking, can be transferred from RRG to TAG. Our view is that TAG (as a linguistic theory) and RRG share the same semantic grammar specifications, while they differ in their syntactic building blocks and in their syntactic composition operations.

The overall goal of WP 1 is an integrated TAG/RRG grammar development framework comprising a thorough formalization, implementation tools, an implementation of universal parts of the grammar (such as a linking theory for the syntax-semantics interface), and a parser. As test cases, we plan to implement typologically interesting grammar fragments.

WP 2: Automatic (meta)grammar extraction and parsing (syntax)

WP 2 is concerned with syntactic grammar induction. This comprises, on the one hand, the supervised induction of elementary trees (for TAG/RRG) from treebanks and, on the other hand, metagrammar induction starting from an existing TAG/RRG. In contrast to WP 1, WP 2 takes a data-driven, probabilistic approach. The resulting probabilistic (meta)grammar can then, in turn, be used for parsing, which requires the implementation of corresponding probabilistic parsers for TAG and RRG.
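
As a rough illustration of what the probabilistic side of such an induction could look like, the following minimal sketch estimates the probability of an elementary tree given the category of its lexical anchor by relative frequency. It is not the project's method: the function name estimate_tree_probs, the tree labels, and the toy observations are all hypothetical, and the sketch assumes that the elementary trees have already been extracted from a treebank.

```python
from collections import Counter, defaultdict


def estimate_tree_probs(observations):
    """Relative-frequency estimate of P(tree | anchor category).

    `observations` is an iterable of (anchor_category, tree_id) pairs,
    assumed here to come from an earlier treebank extraction step.
    """
    counts = defaultdict(Counter)
    for category, tree_id in observations:
        counts[category][tree_id] += 1

    probs = {}
    for category, tree_counts in counts.items():
        total = sum(tree_counts.values())
        probs[category] = {t: n / total for t, n in tree_counts.items()}
    return probs


# Hypothetical toy data: three verb-anchored trees and one noun-anchored tree.
toy_observations = [
    ("V", "transitive"),
    ("V", "transitive"),
    ("V", "intransitive"),
    ("N", "np_argument"),
]
tree_probs = estimate_tree_probs(toy_observations)
```

A parser could use such estimates to rank competing elementary trees for a word; a realistic model would additionally need smoothing and probabilities for the composition operations.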

WP 3: Semantic parsing

WP 3 extends the data-driven approach from WP 2 to semantics. Overall, we will pursue an approach inspired by Lewis & Steedman (2013), albeit with TAG/RRG instead of CCG. More concretely, language will be mapped to semantic representations via a probabilistic grammar of tree-meaning pairs. The semantic representations are frames enriched with logical operators. Furthermore, the elements of the frames, i.e., the event predicates and semantic roles, are mapped to distributional vectors. This makes it possible to relate these predicates to one another via their distributional properties, which in turn supports reasoning with these representations.
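
To make the idea of relating predicates via their distributional properties more concrete, here is a minimal sketch that compares predicate vectors by cosine similarity. The choice of cosine similarity is an assumption made for illustration, and the vectors, the predicate names, and the helper most_similar are hypothetical; actual vectors would be learned from corpus data.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors for three event predicates.
toy_vectors = {
    "buy": [0.9, 0.1, 0.3],
    "purchase": [0.8, 0.2, 0.3],
    "sleep": [0.1, 0.9, 0.0],
}


def most_similar(predicate, vectors):
    """Return the other predicate whose vector is closest to `predicate`."""
    target = vectors[predicate]
    candidates = (p for p in vectors if p != predicate)
    return max(candidates, key=lambda p: cosine(target, vectors[p]))


nearest_to_buy = most_similar("buy", toy_vectors)
```

With these toy vectors, "buy" comes out closer to "purchase" than to "sleep", which is the kind of relatedness a reasoning component could exploit when two frames use different but distributionally similar predicates.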

Lewis, M. & M. Steedman. 2013. Combined Distributional and Logical Semantics. Transactions of the Association for Computational Linguistics 1. 179–192.