News

On the Limits of LLM Reasoning: When Accuracy Is Not Enough

High accuracy on multiple-choice benchmarks often reflects recall rather than genuine reasoning. By blocking memorization through answer modification, we reveal systematic …

Eva Sánchez Salido

• Jan 28, 2026 • 1 min read

EXIST

EXIST 2026 - Physiological Data for Multimodal Sexism Characterization in Social Media

The sixth edition of the EXIST challenge will be held at CLEF 2026, expanding the study of sexism detection to multimodal data and integrating physiological signals such as EEG, …

Laura Plaza

• Oct 28, 2025 • 1 min read