Search conventions

In this Tgrep-based interface, a search string is a sequence of node or word searches separated by dots ( . ). A node search can specify both function and form, separated by a colon (function:form). Consider the following examples:

begyndte . Od:icl (Danish)
viagem . DN:pp (Portuguese)

Patterns enclosed in slashes may match only part of a word or node label, or may be full regular expressions:

/klasse/ or /klasse.*/ (Danish)
/DN:fcl/ or /^DN:.*cl/
/issez$/ or /^.*issez$/ (French)
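
The paired examples above suggest that a slash pattern behaves like an unanchored regular-expression search, so that /issez$/ and /^.*issez$/ accept the same words. A minimal Python sketch of that reading follows. The helper slash_match is hypothetical and not part of the interface, and Python's re module is assumed to be close enough to the interface's regex dialect for these simple cases.

```python
import re


def slash_match(pattern: str, label: str) -> bool:
    """Hypothetical helper: treat a /.../ pattern as an unanchored regex search."""
    body = pattern.strip("/")  # drop the enclosing slashes
    return re.search(body, label) is not None


# Partial match: the pattern may sit anywhere inside the word.
print(slash_match("/klasse/", "skoleklasser"))
# Anchoring with ^ and $ restricts where the match may occur.
print(slash_match("/issez$/", "finissez"), slash_match("/^.*issez$/", "finissez"))
# A regex over node labels: any DN constituent whose form ends in "cl".
print(slash_match("/^DN:.*cl/", "DN:icl"), slash_match("/DN:fcl/", "DN:icl"))
```

Under this assumption, /DN:fcl/ matches only finite-clause dependents, while /^DN:.*cl/ also accepts other clause forms such as DN:icl.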

Single dots mark immediate adjacency, while double dots allow for intervening material. Dollar dots ( $. or $.. ) join nodes as sisters under the same mother node. Thus, /Od:np/ $. /S:/ looks for fronted np-objects, while the somewhat less useful /Od:np/ . /S:/ also matches subjects from another, adjacent clause.
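
The difference between plain and sister-restricted adjacency can be sketched on a toy tree. The following Python model is only an illustration, not the interface's implementation: the Node class, the index function and the relation helpers are hypothetical names, the tree labels are made up, and each leaf is assumed to cover exactly one word.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)
    first: int = -1  # position of the first word covered by this node
    last: int = -1   # position of the last word covered by this node


def index(node, start=0, parent=None):
    """Record parents and word spans; return the next free word position."""
    node.parent = parent
    if not node.children:  # simplification: a leaf covers exactly one word
        node.first = node.last = start
        return start + 1
    pos = start
    for child in node.children:
        pos = index(child, pos, node)
    node.first, node.last = start, pos - 1
    return pos


def adjacent(a, b):          # a . b
    return a.last + 1 == b.first


def precedes(a, b):          # a .. b
    return a.last < b.first


def sisters(a, b):
    return a is not b and a.parent is not None and a.parent is b.parent


def sister_adjacent(a, b):   # a $. b
    return sisters(a, b) and adjacent(a, b)


def sister_precedes(a, b):   # a $.. b
    return sisters(a, b) and precedes(a, b)


# Two adjacent clauses: the object of the first is followed by the subject of the second.
obj1, subj2 = Node("Od:np"), Node("S:pron")
clause1 = Node("fcl", [Node("S:pron"), Node("P:v"), obj1])
clause2 = Node("fcl", [subj2, Node("P:v")])
# One clause with a fronted object directly before its own subject.
obj3, subj3 = Node("Od:np"), Node("S:pron")
clause3 = Node("fcl", [obj3, subj3, Node("P:v")])
index(Node("root", [clause1, clause2, clause3]))

print(adjacent(obj1, subj2), sister_adjacent(obj1, subj2))
print(adjacent(obj3, subj3), sister_adjacent(obj3, subj3))
```

In this model the plain dot accepts both object-subject pairs, while the dollar dot accepts only the pair that shares a mother node, which is the fronted-object case.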

Nodes, words or complex patterns can be related to each other with CG-style dependency arrows: < links a node to an immediate daughter, << to a descendant at any depth.

DN:fcl < Od:icl
DN:fcl << /Dfoc/
/P:/ << /souhait.*/ . Od:fcl (French)

Dependency markers (<, <<) take precedence over sequence markers (.). Parentheses can be used to override the normal precedence:

DN:fcl < (P:vp . fA:pp)
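
To make the grouping explicit, here is a small Python sketch of a parser that applies only the precedence rule stated above. It is not the interface's actual parser: the function names are hypothetical, operators are assumed to be separated from their operands by spaces, and patterns are assumed to contain no spaces or parentheses of their own.

```python
import re

DEPENDENCY = {"<", "<<"}           # bind tighter
SEQUENCE = {".", "..", "$.", "$.."}  # bind looser


def tokenize(query):
    # Split parentheses off their neighbours, then split on whitespace.
    return re.sub(r"([()])", r" \1 ", query).split()


def parse(query):
    """Return a nested (operator, left, right) tuple for the query."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = advance()
        if tok == "(":
            inner = sequence()
            advance()  # consume the closing ")"
            return inner
        return tok

    def dependency():
        left = atom()
        while peek() in DEPENDENCY:
            op = advance()
            left = (op, left, atom())
        return left

    def sequence():
        left = dependency()
        while peek() in SEQUENCE:
            op = advance()
            left = (op, left, dependency())
        return left

    return sequence()


print(parse("/P:/ << /souhait.*/ . Od:fcl"))
print(parse("DN:fcl < (P:vp . fA:pp)"))
```

Without parentheses the dependency arrow is grouped first and the sequence dot joins the result to Od:fcl; with parentheses the whole sequence P:vp . fA:pp becomes the right-hand side of the arrow.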

Note that the highlight function in the "text - whole sentence" format, as well as the result output in section format, expects subtree searches. If a sequence of adjacent same-level subtrees is given as the search pattern, only the first subtree will be highlighted: in DN:pp . DN:fcl, only the DN:pp will be highlighted. Complete highlighting can be achieved by adding a mother node: /np/ < (DN:pp . DN:fcl).

* << (/spist?er?/ . /Od/) (any mother node)
* < (/P:/ < /spist?er?/ . /Od/) (immediate mother node, forces sisterhood and works only with intermediate P-node!)
* < (/P:/ < /spist?er?/ $.. /Od/) (same as before, but with explicit sisterhood and allowing for intervening nodes [S, fA etc.])

The last examples show how verb complementation patterns and selection restrictions can be extracted from a treebank - a lexicographer's dream.
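
As a rough illustration of such an extraction, the Python sketch below walks a toy tree and collects, for every P node whose word matches the verb pattern, the later sisters whose label matches Od. It mirrors the $.. variant above only loosely. The tuple-based tree format, the function names and the example sentence are assumptions made for this sketch and are not taken from the interface.

```python
import re

# Toy tree: a node is (label, children) or, for a leaf, (label, word).
TREE = ("fcl", [
    ("S:prop", "Peter"),
    ("P:v", "spiser"),
    ("Od:np", [("DN:art", "en"), ("H:n", "kage")]),
])


def words(node):
    """Return the words covered by a node, left to right."""
    _, body = node
    if isinstance(body, str):
        return [body]
    return [w for child in body for w in words(child)]


def complement_frames(tree, verb_re, comp_re="Od"):
    """Yield (verb, complement label, complement text) for P ... Od sister pairs."""
    _, body = tree
    if isinstance(body, str):
        return
    for i, (label, child_body) in enumerate(body):
        is_verb = (label.startswith("P:") and isinstance(child_body, str)
                   and re.search(verb_re, child_body))
        if is_verb:
            for sister in body[i + 1:]:  # later sisters only, as with $..
                if re.search(comp_re, sister[0]):
                    yield child_body, sister[0], " ".join(words(sister))
    for child in body:  # descend, so that any mother node is considered
        yield from complement_frames(child, verb_re, comp_re)


frames = list(complement_frames(TREE, r"spist?er?"))
print(frames)
```

Run over a whole treebank, a list of such triples could be counted by verb and complement form to give a rough picture of a verb's complementation patterns.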