I2C-Huelva at IberLEF-2024 DETESTS-Dis: Learning from Divergence to Identify Explicit and Implicit Racial Stereotypes in Spanish Texts
Abstract
This paper was presented at the I International Workshop on Conspiracy theories and hate speech online: Comparison of patterns in narratives and social media about Covid 19, immigrants, refugees and LGTBIQ+ people, Universidad de Huelva, July 12-14, 2023 (https://eventos.uhu.es/99642/detail/i-international-workshop-nonconspirahate-project.html).
This paper presents the approaches developed for detecting and identifying racial stereotypes in Spanish texts using Natural Language Processing (NLP) and Deep Learning techniques, incorporating Learning with Disagreement for enhanced robustness. The main contribution of this work is to demonstrate the effectiveness of transformer-based ensemble classifiers in recognizing both explicit and implicit stereotypes. By leveraging the strengths of multiple models, the proposed method achieves better performance than a single model alone. The results also highlight the importance of selecting appropriate hyperparameters during model training; the optimal hyperparameter combinations were identified through experimentation and evaluation. In our experiments, we used a preprocessed and annotated corpus of Spanish texts and applied data augmentation techniques, such as back-translation, to balance the dataset. Furthermore, we incorporated the "Learning With Disagreement" (LeWiDi) approach, which uses the discrepancies between different models to improve the classification system. The results show significant improvements in F1-Score, underscoring the potential application of these methods to content moderation on social media and other digital platforms. With this strategy, we achieved second place in Task 1 using a RoBERTa-based ensemble of three models, one for each annotator. In Task 2, we reached seventh position using the same approach.
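As a rough illustration of the ensemble idea described in the abstract, the sketch below combines the outputs of three annotator-specific classifiers into a hard label (majority vote) and a soft label (mean probability). The function name `combine_annotator_models`, the 0.5 threshold, and the aggregation rules are assumptions made for illustration only; the abstract does not state how the authors aggregate the per-annotator models.

```python
from statistics import mean


def combine_annotator_models(probs, threshold=0.5):
    """Combine positive-class probabilities from per-annotator models.

    `probs` holds one probability per annotator-specific model
    (hypothetical inputs; the real models would be fine-tuned classifiers).
    Returns (hard_label, soft_label):
      - hard_label: 1 if a strict majority of models vote "stereotype"
      - soft_label: mean probability, preserving the level of disagreement
    Both aggregation rules are assumptions, not the authors' stated method.
    """
    if not probs:
        raise ValueError("need at least one model probability")
    votes = [p >= threshold for p in probs]
    hard_label = int(sum(votes) > len(votes) / 2)
    soft_label = mean(probs)
    return hard_label, soft_label


# Example: two of three annotator models flag the text as a stereotype.
hard, soft = combine_annotator_models([0.9, 0.6, 0.2])
```

Keeping the soft label alongside the hard one retains the disagreement between annotator models instead of discarding it, which is the general motivation behind learning-with-disagreement setups.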
The paper is part of the I+D+i Project titled "Conspiracy Theories and Hate Speech Online: Comparison of Patterns in Narratives and social networks about COVID-19, immigrants, refugees, and LGBTI people [NON-CONSPIRA-HATE!]", PID2021-123983OB-I00, funded by MCIN/AEI/10.13039/501100011033/ and by "ERDF/EU" (https://eseis.es/investigacion/discursos-de-odio/discursos-odio-tc). We are also grateful for the support of our research group, "Estudios Sociales E Intervención Social" (GrupoESEIS), the research center "Pensamiento Contemporáneo e Innovación para el Desarrollo Social" (COIDESO), and the Applied Computational Social Science Lab, CISCOA-Lab, at the University of Huelva.
Bibliographic citation
Cerrejón-Naranjo, M., Guerrero-García, M., Mata-Vázquez, J., & Pachón-Álvarez, V. (2024). I2C-Huelva at IberLEF-2024 DETESTS-Dis: Learning from Divergence to Identify Explicit and Implicit Racial Stereotypes in Spanish Texts. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2024) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2024), Valladolid, Spain, September 24, 2024. CEUR Workshop Proceedings 3756.














