Query Expansion with an Automatically Generated Thesaurus.
Jose R. Pérez Agüera, Lourdes Araujo.
Proc. Int. Conf. on Intelligent Data Engineering and Automated Learning (IDEAL 2006)
LNCS 4224, pp. 771-778, Springer-Verlag (2006).

This paper describes a method for automatically building a new thesaurus from previously
collected information. The method draws on several resources: a text collection, a set of
source thesauri, and other linguistic resources. Different techniques are applied at each
phase of the process. First, indexing techniques applied to the text collection yield the
initial set of terms of interest for the new thesaurus. Next, these terms are looked up in the
source thesauri, which provides the initial structure of the new thesaurus.
Finally, the new thesaurus is enriched by searching for new relationships among its terms.
These relationships are first detected using similarity measures and are then assigned
a type (equivalence, hierarchy, or associativity) using various linguistic resources.
We evaluate the system by comparing the results obtained with and without the thesaurus
on an information retrieval task proposed by the Cross-Language Evaluation Forum (CLEF).
The experiments show a clear improvement in retrieval performance when the thesaurus is used.
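
The three phases summarized above, followed by query expansion, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the document-frequency cutoff, the whitespace tokenizer, the cosine similarity over document-occurrence vectors, and the threshold are all assumptions chosen for illustration. The final typing step, which relies on linguistic resources, is left out, so the enrichment phase only returns untyped candidate pairs.

```python
import math
from collections import Counter
from itertools import combinations


def extract_terms(docs, min_df=2):
    """Phase 1 (assumed form): keep tokens found in at least min_df documents."""
    df = Counter()
    for doc in docs:
        df.update(set(doc.lower().split()))
    return {term for term, n in df.items() if n >= min_df}


def seed_from_sources(terms, source_thesauri):
    """Phase 2: copy source relations whose two ends are both terms of interest.

    Each source thesaurus is assumed to be an iterable of (a, type, b) triples.
    """
    relations = set()
    for thesaurus in source_thesauri:
        for a, rel_type, b in thesaurus:
            if a in terms and b in terms:
                relations.add((a, rel_type, b))
    return relations


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values()) * sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def candidate_relations(terms, docs, relations, threshold=0.9):
    """Phase 3 (detection only): propose untyped pairs of similar terms."""
    vectors = {term: Counter() for term in terms}
    for doc_id, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in vectors:
                vectors[token][doc_id] += 1
    known = {frozenset((a, b)) for a, _, b in relations}
    candidates = []
    for a, b in combinations(sorted(terms), 2):
        if frozenset((a, b)) in known:
            continue  # already related through a source thesaurus
        similarity = cosine(vectors[a], vectors[b])
        if similarity >= threshold:
            candidates.append((a, b, similarity))
    return candidates


def expand_query(query, relations):
    """Append every term directly related to a query term, in either direction."""
    expanded = list(query)
    for a, _, b in sorted(relations):
        for source, target in ((a, b), (b, a)):
            if source in query and target not in expanded:
                expanded.append(target)
    return expanded
```

In this sketch a retrieval run "with the thesaurus" would submit the output of `expand_query` instead of the original query terms, which is the comparison the evaluation describes. Whether all relation types should be used for expansion, or only some of them, is a design choice the sketch does not settle.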