Part-of-Speech Tagging with Evolutionary Algorithms.
Lourdes Araujo.
International Conference on Intelligent Text Processing and Computational Linguistics CICLing-2002.
Lecture Notes in Artificial Intelligence 2276. pp. 230-239, Springer-Verlag.

This paper presents a part-of-speech tagger based on a genetic
algorithm which, after the "evolution" of a population of
sequences of tags for the words in the text, selects the best
individual as the solution. The paper describes the main issues
arising in the algorithm, such as the chromosome representation
and the evaluation and design of genetic operators for crossover
and mutation. A probabilistic model based on the context of
each word (the tags of the surrounding words) has been devised
to define the fitness function. The model has been
implemented and several issues have been investigated: the size of
the training corpus, the effect of the context size, and the parameters
of the evolutionary algorithm, such as population size and
crossover and mutation rates. The accuracy obtained with this
method is comparable to that of other probabilistic approaches,
but evolutionary algorithms obtain those results more
efficiently.
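
The scheme summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives neither the fitness formula nor the operator details, so this example assumes a fitness equal to the sum of log-probabilities of each tag given its left and right neighbour tags, one-point crossover, per-gene mutation, tournament selection, and elitism. The lexicon, the context table, the smoothing constant, and all function names (`fitness`, `crossover`, `mutate`, `tag`) are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy lexicon: the tags each word may take (not from the paper).
LEXICON = {
    "the": ["DET"],
    "can": ["NOUN", "VERB", "AUX"],
    "rusts": ["NOUN", "VERB"],
}

# Hypothetical context table P(tag | left tag, right tag), standing in for
# statistics that would be estimated from a tagged training corpus.
CONTEXT_PROB = {
    ("<s>", "DET", "NOUN"): 0.9,
    ("DET", "NOUN", "VERB"): 0.8,
    ("NOUN", "VERB", "</s>"): 0.7,
    ("<s>", "DET", "VERB"): 0.05,
}
SMOOTH = 1e-3  # assumed probability for contexts missing from the table


def fitness(tags):
    """Sum of log-probabilities of each tag given its neighbouring tags."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    return sum(
        math.log(CONTEXT_PROB.get((padded[i - 1], padded[i], padded[i + 1]), SMOOTH))
        for i in range(1, len(padded) - 1)
    )


def crossover(a, b, point):
    """One-point crossover: head of parent a, tail of parent b."""
    return a[:point] + b[point:]


def mutate(words, tags, rate, rng):
    """Replace each tag, with probability `rate`, by a random valid tag."""
    return [
        rng.choice(LEXICON[w]) if rng.random() < rate else t
        for w, t in zip(words, tags)
    ]


def tag(words, pop_size=20, generations=30,
        crossover_rate=0.5, mutation_rate=0.05, seed=0):
    """Evolve a population of tag sequences and return the fittest one."""
    rng = random.Random(seed)
    # Each individual (chromosome) is one candidate tag sequence for the text.
    population = [[rng.choice(LEXICON[w]) for w in words] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]  # elitism: an assumption of this sketch
        while len(next_gen) < pop_size:
            # Tournament selection of two parents (also an assumption).
            a = max(rng.sample(population, 3), key=fitness)
            b = max(rng.sample(population, 3), key=fitness)
            child = list(a)
            if len(words) > 1 and rng.random() < crossover_rate:
                child = crossover(a, b, rng.randrange(1, len(words)))
            next_gen.append(mutate(words, child, mutation_rate, rng))
        population = next_gen
    return max(population, key=fitness)


if __name__ == "__main__":
    sentence = ["the", "can", "rusts"]
    print(list(zip(sentence, tag(sentence))))
```

On this toy input only six taggings are possible, so the search is trivial; the sketch is meant to show how the chromosome, the context-based fitness, and the tunable parameters (population size, crossover and mutation rates) fit together, not to reproduce the reported results.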