Preliminary experiments for a CG-based syntactic tree corpus of Estonian

This search interface demonstrates the results of a pilot project in which a VISL-style phrase structure grammar (PSG) is used to construct syntactic trees from Estonian text annotated with Constraint Grammar (CG). The project is carried out in the context of the Nordic Treebank Network. The sample texts were taken from the CG Annotated corpus of Estonian, which consists of manually corrected output from Kaili Müürisep's CG parser. The experimental PSG module was written by Eckhard Bick with help from Heli Uibo and Kadri Muischnek. Note that the present set of trees has not yet been revised manually, though the first 149 sentences are all rule-checked, well-formed trees. You can also browse these first trees by id number in the Arborest interface. The bigger corpus contains 66% completely well-formed trees and 33% trees that were best-guess-mounted from partial analyses. After each sentence, a "forest-indicator" is given: A5/44, for instance, means that reading number 5 was selected from a forest of 44 well-formed readings.
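If you post-process search results, the forest-indicator can be read mechanically. The following is a minimal Python sketch, assuming the indicator always has the shape shown in the example (the letter A, the selected reading number, a slash, and the forest size). The names ForestIndicator and parse_forest_indicator are hypothetical and not part of the VISL tools.

```python
import re
from typing import NamedTuple, Optional


class ForestIndicator(NamedTuple):
    selected: int  # number of the reading that was selected
    total: int     # number of well-formed readings in the forest


# Assumption: indicators look like "A5/44", as in the example above.
_INDICATOR = re.compile(r"\bA(\d+)/(\d+)\b")


def parse_forest_indicator(text: str) -> Optional[ForestIndicator]:
    """Return the first forest-indicator found in text, or None if absent."""
    match = _INDICATOR.search(text)
    if match is None:
        return None
    return ForestIndicator(int(match.group(1)), int(match.group(2)))


indicator = parse_forest_indicator("A5/44")
```

Here indicator.selected is 5 and indicator.total is 44; a total of 1 would mean the grammar produced exactly one well-formed reading.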

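Try the following search for subject noun phrases with internal relative clauses: S:np < D:fcl, or the more general /np/ < D:fcl, where the regular expression /np/ matches any node whose label contains "np", whatever its function.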
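In Tgrep2, the relation A < B means that node A immediately dominates node B. To make the difference between the two searches concrete, here is a small Python sketch that mimics this one relation on a toy tree. It is not Tgrep2 itself: the tree, its VISL-style labels, and the function names are hypothetical and chosen only for illustration.

```python
import re
from typing import Iterator, List, Tuple

# A node is (label, children). This toy tree is invented for illustration
# and is not taken from the Estonian corpus.
Node = Tuple[str, list]

toy_tree: Node = ("STA:fcl", [
    ("S:np", [("H:n", []), ("D:fcl", [("S:pron", []), ("P:v", [])])]),
    ("P:v", []),
    ("Od:np", [("H:n", []), ("D:fcl", [("P:v", [])])]),
])


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants in pre-order."""
    yield node
    for child in node[1]:
        yield from walk(child)


def immediately_dominates(tree: Node, parent_pattern: str, child_label: str) -> List[str]:
    """Labels of nodes matching parent_pattern that have a direct child child_label."""
    parent_re = re.compile(parent_pattern)
    return [
        label
        for label, children in walk(tree)
        if parent_re.search(label) and any(c[0] == child_label for c in children)
    ]


# Analogue of  S:np < D:fcl  (exact label match on the parent)
subject_hits = immediately_dominates(toy_tree, r"^S:np$", "D:fcl")
# Analogue of  /np/ < D:fcl  (any label containing "np")
any_np_hits = immediately_dominates(toy_tree, r"np", "D:fcl")
```

On this toy tree, the first search finds only the subject noun phrase, while the second also finds the object noun phrase, which is what makes it the more general query.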