Please use this identifier to cite or link to this item: http://repositorio.ufla.br/jspui/handle/1/55058
Title: Aplicação de algoritmos de aprendizagem de máquina na identificação de registros espúrios no Cadastro Ambiental Rural
Other Titles: Application of machine learning algorithms to identify spurious records in the Rural Environmental Registry
Authors: Ferreira, Danton Diego
Evsukoff, Alexandre Gonçalves
Lacerda, Wilian Soares
Keywords: Cadastro Ambiental Rural
Classificação de dados
Dados desbalanceados
Aprendizagem de Máquina Interpretável
Ciência de dados
Rural Environmental Registry
Data classification
Imbalanced data
Interpretable Machine Learning
Data science
Issue Date: 9-Sep-2022
Publisher: Universidade Federal de Lavras
Citation: BORGES, F. E. de M. Aplicação de algoritmos de aprendizagem de máquina na identificação de registros espúrios no Cadastro Ambiental Rural. 2022. 90 p. Dissertação (Mestrado em Engenharia de Sistemas e Automação) - Universidade Federal de Lavras, Lavras, 2022.
Abstract: The Rural Environmental Registry (CAR) is a mandatory electronic public registry for all rural properties in Brazil. It integrates environmental information about the properties, supports environmental monitoring, and contributes to actions to combat deforestation. However, a large number of registrations are completed incorrectly, generating inconsistent data that lead to cancellation and/or to requests for rectification. Identifying these incorrectly completed (spurious) registrations manually is costly: it requires specialized labor and a large amount of time, given the immense number of rural properties in Brazil. In this context, this work aims to provide a machine learning-based system that checks and classifies CAR records as spurious or non-spurious (that is, cancelled or approved) quickly and effectively. To this end, methodologies covering the entire pipeline of a data science and machine learning application were applied: pre-processing, with attribute cleaning and selection; training and validation of the classifiers; and finally the use of interpretable machine learning algorithms to evaluate how each attribute affected the classifiers' decisions. Six classification models were applied and their results evaluated for each preprocessing format, and a classifier interpretation model was compared with the internal interpretations of the models that offer interpretability. The predictive results show classification performance above 90% for all evaluation measures on the validation set, and the interpretations listed the variables that most influence the automatic classification. Thus, the method proved viable for application in a real scenario involving the Rural Environmental Registry.
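Note on the evaluation measures: the abstract reports results "for all evaluation measures" but does not name them. With imbalanced data, several measures are usually reported together, because accuracy alone can look good while the minority class is poorly detected. The sketch below is a minimal illustration under stated assumptions: the choice of measures, the function name `binary_metrics`, and the toy labels are all hypothetical and are not taken from the dissertation.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix measures for a binary task.

    `positive` marks the class of interest (assumed here: the spurious record).
    """
    # Count (is_actually_positive, is_predicted_positive) pairs.
    counts = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = counts[(True, True)]
    fn = counts[(True, False)]
    fp = counts[(False, True)]
    tn = counts[(False, False)]

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Made-up labels (assumption): 1 = spurious (cancelled), 0 = non-spurious (approved).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # the classifier misses one of the two spurious records
metrics = binary_metrics(y_true, y_pred)
```

In this toy case accuracy is 0.90 while recall on the spurious class is only 0.50, which is why a claim such as "above 90% for all evaluation measures" is stronger than a claim about accuracy alone.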
URI: http://repositorio.ufla.br/jspui/handle/1/55058
Appears in Collections: Engenharia de Sistemas e automação (Dissertações)


