Genetic Programming for Natural Language Parsing.
Lourdes Araujo.
European Conference on Genetic Programming EuroGP-2004.
Lecture Notes in Computer Science 3003,p. 230-239, Springer-Verlag.

The aim of this paper is to prove the effectiveness of the genetic programming
approach in automatic parsing of sentences of real texts. Classical parsing
methods are based on complete search techniques to find the different
interpretations of a sentence. However, the size of the search space increases
exponentially with the length of the sentence or text to be parsed and the size of
the grammar, so that exhaustive search methods can fail to reach a solution in a
reasonable time. This paper presents the implementation of a probabilistic
bottom-up parser based on genetic programming which works with a population
of partial parses, i.e. parses of sentence segments. The quality of the individuals
is computed as a measure of its probability, which is obtained from the probability
of the grammar rules and lexical tags involved in the parse. In the approach
adopted herein, the size of the trees generated is limited by the length of the
sentence. In this way, the size of the search space, determined by the size of the
sentence to parse, the number of valid lexical tags for each words and specially
by the size of the grammar, is also limited.