Can multilinguality improve Biomedical Word Sense Disambiguation?
Andres Duque, Juan Martinez-Romo, Lourdes Araujo
Journal of Biomedical Informatics 64: 320-332 (2016)

Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over
the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for
building accurate systems able to tackle complex tasks such as information extraction, summarization or document
classification. In this work, we explore whether multilinguality can help resolve ambiguity, and under which
conditions a multilingual system improves on the results obtained by monolingual approaches. We also analyze the
best ways to generate useful multilingual resources, and study different languages and sources of knowledge.
The proposed system, based on co-occurrence graphs containing biomedical concepts and textual information, is
evaluated on a test dataset frequently used in biomedicine. We conclude that, for graphs built from a small
number of documents, multilingual resources provide a clear improvement of more than 7% over monolingual
approaches. Empirical results also show that automatically translated resources are a useful source
of information for this particular task.