A keyphrase-based approach for interpretable ICD-10 code classification of Spanish medical reports.
Andres Duque Fernandez, Hermenegildo Fabregat, Lourdes Araujo, Juan Martinez-Romo
Artificial Intelligence in Medicine 121: 102177 (2021)

Background and objectives
The 10th revision of the International Classification of Diseases (ICD-10) coding system has been widely
adopted by the health systems of many countries, including Spain. However, manually assigning codes to Electronic
Health Records (EHRs) is a complex and time-consuming task that requires a large amount of specialised human
resources. Several machine learning approaches have therefore been proposed to assist in the assignment task.
In this work we present an alternative system for automatically recommending ICD-10 codes to be assigned to EHRs.
Methods
Our proposal is based on characterising each ICD-10 code by a set of keyphrases that represent it. These keyphrases
include not only those that appear literally in some EHR to which the ICD-10 code in question has been assigned,
but also others obtained through a statistical process that captures the expressions that led the annotators to
assign the code.
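
As an illustration only, the idea of characterising codes by keyphrases and then matching those keyphrases against
a new report can be sketched as follows. This is a minimal sketch under stated assumptions, not the system described
in the paper: the statistical process is replaced here by a simple co-occurrence threshold (the share of reports
containing a phrase that carry a given code), and the function names `ngrams`, `build_keyphrases` and `recommend`,
as well as the parameters `min_precision` and `min_count`, are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(text, max_n=3):
    """Return the set of word n-grams (n = 1..max_n) in a lower-cased text."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def build_keyphrases(reports, min_precision=0.8, min_count=2):
    """Associate each code with phrases that co-occur strongly with it.

    reports: iterable of (text, set_of_codes) pairs.
    Assumption: a co-occurrence threshold stands in for the statistical
    process mentioned in the abstract, which is not specified there.
    """
    phrase_count = Counter()          # reports containing each phrase
    pair_count = defaultdict(Counter)  # code -> phrase -> reports with both
    for text, codes in reports:
        for phrase in ngrams(text):
            phrase_count[phrase] += 1
            for code in codes:
                pair_count[code][phrase] += 1

    keyphrases = defaultdict(set)
    for code, counts in pair_count.items():
        for phrase, k in counts.items():
            # Keep phrases seen often enough and mostly with this code.
            if k >= min_count and k / phrase_count[phrase] >= min_precision:
                keyphrases[code].add(phrase)
    return keyphrases


def recommend(text, keyphrases):
    """Map each recommended code to the keyphrases that triggered it."""
    found = ngrams(text)
    result = {}
    for code, phrases in keyphrases.items():
        matched = phrases & found
        if matched:
            result[code] = sorted(matched)
    return result
```

Because `recommend` returns the matched phrases alongside each code, every recommendation in this sketch can be
traced back to the text that triggered it.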
Results
The result is an information model that allows codes to be recommended efficiently for a new EHR based on its
textual content. We explore an approach that proves to be competitive with other state-of-the-art approaches and
can be combined with them to optimise results.
Conclusions
In addition to being effective, the method produces easily interpretable recommendations, since the phrases in an
EHR that lead to the recommendation of an ICD-10 code are known. Moreover, the keyphrases associated with each
ICD-10 code can be a valuable additional source of information for other approaches, such as machine learning
techniques.