Evolutionary Algorithm for Noun Phrase Detection in Natural Language Processing.
J. Ignacio Serrano, Lourdes Araujo
Proc. of the 2005 IEEE Congress on Evolutionary Computation (CEC 2005)
pp. 640-647, IEEE Press (2005).

The noun phrases of a document are usually its main information bearers. Thus, the detection of these
units is crucial in many applications related to information retrieval, such as the retrieval of relevant
documents by search engines in response to a user query, text summarization, etc. We present an evolutionary
algorithm for obtaining a probabilistic finite-state automaton that recognizes valid noun phrases,
defined as sequences of lexical categories. This approach is highly flexible in the sense that the
automaton is able to recognize noun phrases similar enough to the ones given by the inferred noun
phrase grammar. This flexibility is made possible by the very accurate set of probabilities provided
by the evolutionary algorithm. The algorithm works with both positive and negative examples of the language, thus
improving the system's coverage while maintaining its precision. Experimental results show a clear
improvement in performance with respect to other systems.
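
To make the idea of a probabilistic finite-state automaton over lexical categories more concrete, the following is a minimal sketch and not the authors' implementation. The tag names, the toy transition table, the probability values, and the acceptance threshold are all hypothetical. In the paper the probabilities are produced by the evolutionary algorithm, which is not reproduced here.

```python
from typing import Dict, Iterable, Set, Tuple

# (state, lexical category) -> (next state, transition probability)
Transitions = Dict[Tuple[int, str], Tuple[int, float]]

# Hypothetical toy automaton: optional determiner, any number of
# adjectives, then a noun. The probabilities are made up for illustration.
TOY_TRANSITIONS: Transitions = {
    (0, "DET"): (1, 0.6),
    (0, "NOUN"): (2, 0.4),
    (1, "ADJ"): (1, 0.3),
    (1, "NOUN"): (2, 0.7),
}
TOY_START = 0
TOY_FINALS = {2}


def sequence_probability(
    tags: Iterable[str],
    transitions: Transitions,
    start: int,
    finals: Set[int],
) -> float:
    """Multiply the transition probabilities along the path for `tags`.

    Returns 0.0 if a transition is missing or the path does not end
    in a final state.
    """
    state = start
    prob = 1.0
    for tag in tags:
        step = transitions.get((state, tag))
        if step is None:
            return 0.0
        state, p = step
        prob *= p
    return prob if state in finals else 0.0


def is_noun_phrase(
    tags: Iterable[str],
    transitions: Transitions,
    start: int,
    finals: Set[int],
    threshold: float,
) -> bool:
    """Accept the tag sequence when its path probability reaches `threshold`.

    The threshold rule is an assumption of this sketch, not a detail
    taken from the paper.
    """
    return sequence_probability(tags, transitions, start, finals) >= threshold


if __name__ == "__main__":
    for phrase in (["DET", "NOUN"], ["DET", "ADJ", "NOUN"], ["VERB", "NOUN"]):
        p = sequence_probability(phrase, TOY_TRANSITIONS, TOY_START, TOY_FINALS)
        print(phrase, p)
```

Under these assumptions, lowering the threshold makes the recognizer accept longer or less typical sequences, and raising it makes acceptance stricter. This is one simple way to picture how a set of probabilities can trade coverage against precision.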