TeamMX at PoliticEs 2022: Analysis of Feature Sets in Spanish Author Profiling for Political Ideology

José Luis Ochoa-Hernández*, Yuridiana Alemán

*Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-reviewed

1 Citation (Scopus)

Abstract

Natural Language Processing (NLP) is evolving every day and is becoming a very powerful tool, especially in combination with Machine Learning algorithms, as it ventures into areas where it was previously little known, such as automatic programming systems based on the GPT-3 model, market or sales prediction, and even risk detection in banking systems based on written exchanges between branch managers or directors of the same bank. So-called short texts, that is, comments and reviews posted on social networks such as Twitter, Facebook, or YouTube, are becoming relevant in several domains. The corpus provided by the IberLEF 2022 Task - PoliticEs was used to extract political ideology information; the task focused on identifying gender, profession, and political spectrum from a binary (Left, Right) and a multi-class perspective (Left, Right, Moderate-Left, and Moderate-Right). Eight methods are proposed; six of them did not produce the expected results but contributed to the two best ones. We carried out a customized stopword study, combined with experiments such as best unique words per category, a set-based study, transition point, and others, to extract the features. Random Forest, SVM, and Neural Network algorithms, run with default parameters in scikit-learn, were then used to identify the categories. We obtained a Macro F1 value of 0.7984, and the highest value achieved was 0.8270, in the Profession category.
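The Macro F1 figures reported in the abstract are unweighted means of per-class F1 scores. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy labels are hypothetical and chosen only for illustration. The same quantity is what scikit-learn's `f1_score` returns with `average="macro"`.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    # F1 per class = 2*TP / (2*TP + FP + FN); every class weighs the same.
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels for the binary ideology setting (not task data).
y_true = ["left", "left", "right", "right", "left", "right"]
y_pred = ["left", "right", "right", "right", "left", "left"]
score = macro_f1(y_true, y_pred)
print(f"Macro F1: {score:.4f}")
```

Because each class contributes equally to the mean, a rare class such as one of the moderate labels in the multi-class setting affects the score as much as a frequent one.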

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3202
State: Published - 2022
Event: 2022 Iberian Languages Evaluation Forum, IberLEF 2022 - A Coruna, Spain
Duration: 20 Sep 2022 → …

Bibliographical note

Publisher Copyright:
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
