I2C-UHU at CLEF-2023 EXIST task: Leveraging Ensembling Language Models to Detect Multilingual Sexism in Social Media

dc.contributor.author: Cordón Hidalgo, Pablo
dc.contributor.author: Mata Vázquez, Jacinto
dc.contributor.author: Pachón Álvarez, Victoria
dc.contributor.author: Domínguez Olmedo, Juan Luis
dc.date.accessioned: 2024-09-20T07:10:33Z
dc.date.available: 2024-09-20T07:10:33Z
dc.date.issued: 2023
dc.description.abstract: This paper describes the approaches developed by the I2C Group to participate in sub-task 1 of the CLEF 2023 task EXIST: sEXism Identification in Social neTworks. Our main contribution is to show the benefits of translating a bilingual dataset into a single language, as well as the effectiveness of using a group of classifiers based on the transformer architecture. Combining different models exploited their individual advantages and resulted in better performance than any single model. Moreover, the results highlighted the importance of choosing suitable hyperparameters during model training. Through careful experimentation and evaluation of different hyperparameter combinations, we found the settings that achieved the best performance for the given task. In our experiments, we fine-tuned several pre-trained language models and decided to ensemble the three models that reached the best F1-scores. With this approach, we achieved an ICM-Hard score of 0.5075, ranking 25th in the competition. [es_ES]
dc.identifier.citation: Cordón Hidalgo, Pablo; Mata Vázquez, Jacinto; Pachón Álvarez, Victoria; Domínguez Olmedo, Juan Luis. In Conference and Labs of the Evaluation Forum, September 18–21, 2023, Thessaloniki, Greece [es_ES]
dc.identifier.issn: 1613-0073
dc.identifier.uri: https://hdl.handle.net/10272/24206
dc.language.iso: eng [es_ES]
dc.publisher: CEUR-WS [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain [*]
dc.rights.accessRights: open access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.subject.other: Hate Speech [es_ES]
dc.subject.other: Deep Learning [es_ES]
dc.subject.other: Transformers [es_ES]
dc.subject.other: Hyperparameter [es_ES]
dc.subject.other: Ensembles [es_ES]
dc.subject.other: Twitter [es_ES]
dc.subject.other: Sexism [es_ES]
dc.subject.unesco: 1203.23 Programming Languages [es_ES]
dc.title: I2C-UHU at CLEF-2023 EXIST task: Leveraging Ensembling Language Models to Detect Multilingual Sexism in Social Media [es_ES]
dc.type: conference paper [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: ac76819b-d91a-4158-b947-4a9e827e5e9d
relation.isAuthorOfPublication: 47cb4892-3513-4d33-953c-8521bc9cb187
relation.isAuthorOfPublication: 11d4312c-8591-4e26-b971-740ce012d168
relation.isAuthorOfPublication.latestForDiscovery: ac76819b-d91a-4158-b947-4a9e827e5e9d

Files

Original bundle

Name: I2C_UHU_Cordon.pdf
Size: 930.55 KB
Format: Adobe Portable Document Format