I2C-UHU at IberLEF-2023 HOMO-MEX task: Ensembling Transformers Models to Identify and Classify Hate Messages Towards the Community LGBTQ+

Morano Moriña, Antonio; Román Pásaro, Javier; Mata Vázquez, Jacinto; Pachón Álvarez, Victoria

dc.contributor.author	Morano Moriña, Antonio
dc.contributor.author	Román Pásaro, Javier
dc.contributor.author	Mata Vázquez, Jacinto
dc.contributor.author	Pachón Álvarez, Victoria
dc.date.accessioned	2024-11-27T08:34:44Z
dc.date.available	2024-11-27T08:34:44Z
dc.date.issued	2023
dc.description	This paper was presented at the I International Workshop on Conspiracy theories and hate speech online: Comparison of patterns in narratives and social media about Covid 19, immigrants, refugees and LGTBIQ+ people. Universidad de Huelva, July 12-14, 2023 (https://eventos.uhu.es/99642/detail/i-international-workshop-nonconspirahate-project.html). This paper presents the approaches proposed by the I2C Group to address the IberLEF-2023 HOMO-MEX task: Hate speech detection in Online Messages directed tOwards the MEXican Spanish-speaking LGBTQ+ population. The main contribution is a demonstration of the effectiveness of an ensemble of transformer-based classifiers. Combining several models leveraged their individual strengths and gave better performance than using a single model. In addition, the results underscored the importance of selecting appropriate hyperparameters during model training. Through careful experimentation and evaluation of different hyperparameter combinations, the settings that achieved the best performance for the tasks at hand were identified. In our experiments for both tasks we tested several models and decided to ensemble the three models that gave the best F1-score on this dataset. In addition, for Task 2 we decided to train an individual binary classifier for each class instead of a single multilabel classifier. The model submitted for Task 1 achieved an F1-score of 83.25%, ranking 6th in the competition. The model for Task 2 achieved an F1-score of 69.60%, ranking 1st in the competition. 
The paper is part of the I+D+i Project titled "Conspiracy Theories and Hate Speech Online: Comparison of Patterns in Narratives and social networks about COVID-19, immigrants, refugees, and LGBTI people [NON-CONSPIRA-HATE!]", PID2021-123983OB-I00, funded by MCIN/AEI/10.13039/501100011033/ and by "ERDF/EU." (https://eseis.es/investigacion/discursos-de-odio/discursos-odio-tc). We are also grateful for the support of our research group, "Estudios Sociales E Intervención Social" (GrupoESEIS), the research center "Pensamiento Contemporáneo e Innovación para el Desarrollo Social" (COIDESO), and the Applied Computational Social Science Lab, CISCOA-Lab, at the University of Huelva.	es_ES
dc.description.abstract	This paper was presented at the I International Workshop on Conspiracy theories and hate speech online: Comparison of patterns in narratives and social media about Covid 19, immigrants, refugees and LGTBIQ+ people. Universidad de Huelva, July 12-14, 2023 (https://eventos.uhu.es/99642/detail/i-international-workshop-nonconspirahate-project.html). This paper presents the approaches proposed by the I2C Group to address the IberLEF-2023 HOMO-MEX task: Hate speech detection in Online Messages directed tOwards the MEXican Spanish-speaking LGBTQ+ population. The main contribution is a demonstration of the effectiveness of an ensemble of transformer-based classifiers. Combining multiple models leveraged their individual strengths and improved performance compared to using a single model. Furthermore, the results underscored the significance of selecting appropriate hyperparameters during model training. Through careful experimentation and evaluation of different hyperparameter combinations, the settings that reached the best performance for the given tasks were identified. In our experiments for both tasks we tested several models and decided to ensemble the three models that provided the best F1-score on this dataset. Additionally, for Task 2 we decided to train an individual binary classifier for each class instead of a single multilabel classifier. The model submitted for Task 1 achieved an F1-score of 83.25%, ranking 6th in the competition. The model for Task 2 reached an F1-score of 69.60%, ranking 1st in the competition. The paper is part of the I+D+i Project titled "Conspiracy Theories and Hate Speech Online: Comparison of Patterns in Narratives and social networks about COVID-19, immigrants, refugees, and LGBTI people [NON-CONSPIRA-HATE!]", PID2021-123983OB-I00, funded by MCIN/AEI/10.13039/501100011033/ and by "ERDF/EU." 
(https://eseis.es/investigacion/discursos-de-odio/discursos-odio-tc). We are also grateful for the support of our research group, "Estudios Sociales E Intervención Social" (GrupoESEIS), the research center "Pensamiento Contemporáneo e Innovación para el Desarrollo Social" (COIDESO), and the Applied Computational Social Science Lab, CISCOA-Lab, at the University of Huelva.	es_ES
dc.description.department	Tecnologías de la Información	es_ES
dc.description.researchgroup	G.I. ESEIS, Estudios Sociales e Intervención Social (SEJ-216)	es_ES
dc.description.sponsorship	Project PID2021-123983OB-I00 [NON-CONSPIRA-HATE!], funded by MCIN/AEI/10.13039/501100011033/ and by ERDF/EU.	es_ES
dc.identifier.citation	Morano-Moriña, A.; Román-Pásaro, J.; Mata-Vázquez, J., & Pachón-Álvarez, V. (2024). I2C-UHU at IberLEF-2023 HOMO-MEX task: Ensembling Transformers Models to Identify and Classify Hate Messages Towards the Community LGBTQ+. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2023) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2023), Jaén, Spain, September 26, 2023. CEUR Workshop Proceedings 3496.
dc.identifier.uri	https://hdl.handle.net/10272/24524
dc.language.iso	eng	es_ES
dc.publisher	CEUR-WS	es_ES
dc.rights	Atribución-NoComercial-SinDerivadas 3.0 España	*
dc.rights.accessRights	open access	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.subject.other	Deep Learning	es_ES
dc.subject.other	Transformers	es_ES
dc.subject.other	Ensembler	es_ES
dc.subject.other	Hyperparameter	es_ES
dc.subject.other	Twitter	es_ES
dc.subject.other	LGBT-Phobia	es_ES
dc.subject.other	Hate Speech Detection	es_ES
dc.subject.other	Natural Language Processing	es_ES
dc.subject.other	Aprendizaje profundo	es_ES
dc.subject.unesco	33 Ciencias Tecnológicas	es_ES
dc.title	I2C-UHU at IberLEF-2023 HOMO-MEX task: Ensembling Transformers Models to Identify and Classify Hate Messages Towards the Community LGBTQ+	es_ES
dc.type	conference output	es_ES
dspace.entity.type	Publication
relation.isAuthorOfPublication	ac76819b-d91a-4158-b947-4a9e827e5e9d
relation.isAuthorOfPublication	47cb4892-3513-4d33-953c-8521bc9cb187
relation.isAuthorOfPublication.latestForDiscovery	ac76819b-d91a-4158-b947-4a9e827e5e9d

Files

Original bundle

Name: homomex-paper3.pdf
Size: 902.29 KB
Format: Adobe Portable Document Format

Collections

Ponencias, comunicaciones y pósteres