In recent years large amounts of electronic texts have become available. While the first of these corpora had only a low level
of annotation, the more recent ones are annotated with refined syntactic information. To make these rich annotations accessible
for linguists, the development of query systems has become an important goal. One of the main difficulties in this task consists
in the choice of the right query language, a language which at the same time should be powerful enough to let users formulate
the queries they want and which should be efficiently evaluable to keep query response times short. There is a widespread
belief that such a query language does not exist. It is therefore the aim of this paper to show that there is indeed a powerful
query language that can be efficiently evaluated. We propose the use of monadic second-order logic as a query language. We
show that a query in this language can be evaluated in linear time in the size of a tree in the corpus. We also provide examples
of complicated linguistic queries expressed in monadic second-order logic thereby demonstrating the high expressive power
of the language.
Key words Complexity theory – monadic second-order logic – query – treebank