RT Conference Proceedings
T1 I2C-Huelva at IberLEF-2024 DETESTS-Dis: Learning from Divergence to Identify Explicit and Implicit Racial Stereotypes in Spanish Texts
A1 Cerrejón Naranjo, Manuel
A1 Guerrero García, Manuel
A1 Mata Vázquez, Jacinto
A1 Pachón Álvarez, Victoria
AB This paper was presented at the I International Workshop on Conspiracy theories and hate speech online: Comparison of patterns in narratives and social media about Covid 19, immigrants, refugees and LGTBIQ+ people. Universidad de Huelva, July 12-14, 2023 (https://eventos.uhu.es/99642/detail/i-international-workshop-nonconspirahate-project.html). This paper presents the approaches developed for detecting and identifying racial stereotypes in Spanish texts using advanced Natural Language Processing (NLP) and Deep Learning techniques, incorporating Learning with Disagreement for enhanced robustness. The major contribution of this work is the demonstration of the effectiveness of transformer-based ensemble classifiers in recognizing both explicit and implicit stereotypes. By leveraging the strengths of multiple models, the proposed method achieves better performance than a single model alone. The results also highlight the importance of selecting appropriate hyperparameters during model training. Through rigorous experimentation and evaluation, the optimal hyperparameter combination was identified. In our experiments, we used a preprocessed and annotated corpus of Spanish texts and applied data augmentation techniques, such as back-translation, to balance the dataset. Furthermore, we incorporated the "Learning With Disagreement" (LeWiDi) approach, which uses the discrepancies between different models to improve the classification system. The results obtained demonstrate significant improvements in F1-Score, underscoring the potential application of these methods in moderating content on social media and other digital platforms. With this strategy, we achieved second place in Task 1 using an ensemble of 3 RoBERTa-based models, one for each annotator. In Task 2, we reached seventh position using the same approach. The paper is part of the I+D+i Project titled "Conspiracy Theories and Hate Speech Online: Comparison of Patterns in Narratives and social networks about COVID-19, immigrants, refugees, and LGBTI people [NON-CONSPIRA-HATE!]", PID2021-123983OB-I00, funded by MCIN/AEI/10.13039/501100011033/ and by "ERDF/EU." (https://eseis.es/investigacion/discursos-de-odio/discursos-odio-tc). We are also grateful for the support of our research group, "Estudios Sociales E Intervención Social" (GrupoESEIS), the research center "Pensamiento Contemporáneo e Innovación para el Desarrollo Social" (COIDESO), and the Applied Computational Social Science Lab, CISCOA-Lab, at the University of Huelva.
PB CEUR-WS
YR 2024
FD 2024
LK https://hdl.handle.net/10272/24526
UL https://hdl.handle.net/10272/24526
LA eng
NO Cerrejón-Naranjo, M., Guerrero-García, M., Mata-Vázquez, J., & Pachón-Álvarez, V. (2024). I2C-Huelva at IberLEF-2024 DETESTS-Dis: Learning from Divergence to Identify Explicit and Implicit Racial Stereotypes in Spanish Texts. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2024) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2024), Valladolid, Spain, September 24, 2024. CEUR Workshop Proceedings 3756.
NO This paper was presented at the I International Workshop on Conspiracy theories and hate speech online: Comparison of patterns in narratives and social media about Covid 19, immigrants, refugees and LGTBIQ+ people. Universidad de Huelva, July 12-14, 2023 (https://eventos.uhu.es/99642/detail/i-international-workshop-nonconspirahate-project.html). This work presents the approaches developed to detect and identify racial stereotypes in Spanish texts using advanced Natural Language Processing (NLP) and Deep Learning techniques, incorporating Learning with Disagreement to improve robustness. The main contribution of this work is the demonstration of the effectiveness of transformer-based ensemble classifiers in recognizing both explicit and implicit stereotypes. By leveraging the strengths of several models, the proposed method achieves better results than a single model. The results also highlight the importance of selecting appropriate hyperparameters during model training. Through rigorous experimentation and evaluation, the optimal hyperparameter combination was identified. In our experiments, we used a preprocessed and annotated corpus of Spanish texts and applied data augmentation techniques, such as back-translation, to balance the dataset. Furthermore, we incorporated the "Learning With Disagreement" (LeWiDi) approach, which uses the discrepancies between different models to improve the classification system. The results obtained demonstrate significant improvements in F1-Score, underscoring the potential application of these methods in moderating content on social media and other digital platforms. With this strategy, we achieved second place in Task 1 using an ensemble of 3 RoBERTa-based models, one for each annotator. In Task 2, we reached seventh position using the same approach. The paper is part of the I+D+i Project titled "Conspiracy Theories and Hate Speech Online: Comparison of Patterns in Narratives and social networks about COVID-19, immigrants, refugees, and LGBTI people [NON-CONSPIRA-HATE!]", PID2021-123983OB-I00, funded by MCIN/AEI/10.13039/501100011033/ and by "ERDF/EU." (https://eseis.es/investigacion/discursos-de-odio/discursos-odio-tc). We are also grateful for the support of our research group, "Estudios Sociales E Intervención Social" (GrupoESEIS), the research center "Pensamiento Contemporáneo e Innovación para el Desarrollo Social" (COIDESO), and the Applied Computational Social Science Lab, CISCOA-Lab, at the University of Huelva.
NO Project PID2021-123983OB-I00 [NON-CONSPIRA-HATE!], funded by MCIN/AEI/10.13039/501100011033/ and by ERDF/EU.
DS Repositorio Institucional de la Universidad de Huelva
RD 14 jul 2026