Please use this identifier to cite or link to this item: http://repositorio.ufla.br/jspui/handle/1/45801
Title: Classificação de gêneros e faixas etárias em redes sociais online por meio de técnicas de aprendizagem multidimensional
Authors: Rodríguez, Demóstenes Zegarra
Maziero, Erick Galani
Rosa, Renata Lopes
Zegarra Rodríguez, Demóstenes
Lacerda, Wilian Soares
Gertrudes, Jadson Castro
Keywords: Gender classification
Age-group classification
Transformation methods
Multidimensional learning
Multidimensional classification
Social media
Issue Date: 8-Dec-2020
Publisher: Universidade Federal de Lavras
Citation: SILVA, D. H. Classificação de gêneros e faixas etárias em redes sociais online por meio de técnicas de aprendizagem multidimensional. 2020. 70 p. Dissertação (Mestrado em Engenharia de Sistemas e Automação) – Universidade Federal de Lavras, Lavras, 2020.
Abstract: Because users generate a large volume of content on Online Social Networks (OSNs), organizations apply sentiment analysis or opinion mining techniques to obtain information about people or entities of interest. An entity can be a product, a service, a person, a governmental or non-governmental institution, or a public policy, among other types. Classifying users by gender and age group supports sentiment and opinion analysis, since it allows sentiments and opinions to be interpreted more precisely. However, gender and age-group information may be hidden or filled out incorrectly in OSN profiles. The literature offers several approaches to gender and age-group classification. This work instead uses a new set of features to classify gender and age group through multidimensional learning. Its main objective is to develop a new gender and age-group classification model from data extracted from the OSN Twitter, using the transformation methods Classifier Chains (CC) and Label Powerset (LP) together with machine learning techniques based on rules, linear algebra, and probability. The study works with a new database of 8000 instances extracted from Twitter. The best subsets of user profile features are evaluated, as are multidimensional learning models, using different performance metrics. In the test phase, the resulting multidimensional classification model achieved a micro-averaged F1 of 0.999 for gender and 0.923 for age groups. The gender classification results surpassed most of the related works, and the age-group classification performance is quite competitive.
Appears in Collections: Engenharia de Sistemas e automação (Dissertações)