I2C-UHU at CLEF-2023 EXIST task: Leveraging Ensembling Language Models to Detect Multilingual Sexism in Social Media
Abstract
This paper describes the approaches developed by the I2C Group for sub-task 1 of the CLEF 2023 task EXIST: sEXism Identification in Social neTworks. Our main contribution is to show the benefits of translating a bilingual dataset into a single language, as well as the effectiveness of combining several transformer-based classifiers. Combining different models exploited their individual advantages and yielded better performance than a single model. The results also highlight the importance of choosing suitable hyperparameters during training: through careful experimentation and evaluation of different hyperparameter combinations, we identified the settings that performed best on the task. In our experiments we fine-tuned several pre-trained language models and ensembled the three that reached the best F1-scores. With this approach, we achieved an ICM-Hard score of 0.5075, ranking 25th in the competition.
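As a rough illustration of the ensembling idea, the sketch below combines the labels of three classifiers by hard majority voting. The abstract does not state which combination rule was used, so majority voting is an assumption here, and the function name `majority_vote` and the example predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per example by hard voting.

    `predictions` holds one list of labels per model, all in the same example order.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must label the same examples")
    # zip(*predictions) groups the votes cast for each example across models.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical outputs of three fine-tuned models on four posts
# (illustrative values only, not results from the paper).
model_a = ["YES", "NO", "NO", "YES"]
model_b = ["YES", "YES", "NO", "NO"]
model_c = ["NO", "YES", "NO", "YES"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With an odd number of models and two labels, a hard vote can never tie, which is one practical reason to ensemble three classifiers rather than two or four.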
Bibliographic citation
Cordón Hidalgo, Pablo; Mata Vázquez, Jacinto; Pachón Álvarez, Victoria; Domínguez Olmedo, Juan Luis. In Conference and Labs of the Evaluation Forum, September 18–21, 2023, Thessaloniki, Greece














