Stochastic Parsing and Evolutionary Algorithms.
Lourdes Araujo.
Applied Artificial Intelligence 23(4) (2009), pp. 275-303.

This article aims to show the effectiveness of evolutionary algorithms in automatically parsing sentences of
real texts. Parsing methods based on complete search techniques are limited by the exponential growth in the
size of the search space with the size of the grammar and the length of the sentences to be parsed. Approximate
methods, such as evolutionary algorithms, can provide results adequate to deal with the indeterminism
that ambiguity introduces in natural language processing. This work investigates alternative ways of implementing
an evolutionary bottom-up parser. Different genetic operators are considered and evaluated. We focus on statistical
parsing models to establish preferences among different parses. It is not our aim to propose a new statistical model for
parsing but a new algorithm to perform the parsing once the model has been defined. The training data are extracted from
syntactically annotated corpora (treebanks), which provide sets of lexical and syntactic tags as well as the grammar on
which the parsing is based. We have tested the system with two corpora, Susanne and the Penn Treebank, obtaining very
encouraging results.
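
As an illustration of how a statistical model can establish preferences among competing parses, the sketch below scores
candidate parse trees with a toy probabilistic context-free grammar and treats that score as the fitness an evolutionary
parser might maximize. The grammar, its probabilities, the tuple-based tree encoding, and the names RULE_PROB, log_prob
and fitness are all hypothetical and chosen for illustration; they are not taken from the article, whose actual model and
fitness function may differ.

```python
import math

# Hypothetical toy PCFG: (left-hand side, right-hand side) -> rule probability.
# Probabilities for each left-hand side sum to 1.
RULE_PROB = {
    ("VP", ("V", "NP")): 0.5,
    ("VP", ("V", "NP", "PP")): 0.2,
    ("VP", ("V",)): 0.3,
    ("NP", ("Det", "N")): 0.7,
    ("NP", ("NP", "PP")): 0.3,
    ("PP", ("P", "NP")): 1.0,
}


def log_prob(tree):
    """Log-probability of a parse tree.

    A tree is a tuple (label, child, ...); a leaf is a lexical tag given as a string.
    Lexical probabilities are ignored here (assumed equal across candidates).
    """
    if isinstance(tree, str):
        return 0.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB.get((label, rhs), 0.0)
    if p == 0.0:
        return float("-inf")  # rule not in the grammar: invalid parse
    return math.log(p) + sum(log_prob(c) for c in children)


def fitness(tree):
    """Fitness of an individual: the probability of its parse under the toy PCFG."""
    return math.exp(log_prob(tree))


# Two parses of the same tag sequence "V Det N P Det N" (prepositional-phrase ambiguity).
verb_attach = ("VP", "V", ("NP", "Det", "N"), ("PP", "P", ("NP", "Det", "N")))
noun_attach = ("VP", "V", ("NP", ("NP", "Det", "N"), ("PP", "P", ("NP", "Det", "N"))))

# Selection step in miniature: prefer the candidate with the higher model probability.
best = max([verb_attach, noun_attach], key=fitness)
```

Working in log space avoids numerical underflow on long sentences, and a parse that uses a rule absent from the grammar
receives fitness zero, so selection would discard it.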