Machine learning algorithms applied to estimating discounts on basic food basket items

Jose Mateus Rodrigues dos Santos; Bruno Samways dos Santos

doi:10.15675/gepros.3037

Authors

Jose Mateus Rodrigues dos Santos Federal Technological University of Paraná https://orcid.org/0009-0004-9818-326X
Bruno Samways dos Santos Federal Technological University of Paraná https://orcid.org/0000-0001-7919-1724

Keywords:

Machine Learning, Public Administration, Public Procurement, Basic Food Basket, Public Purchasing

Abstract

Purpose: This study evaluates machine learning (ML) models for estimating discounts in the public procurement of national basic food basket items and analyzes the features that most influence these estimates.

Theoretical framework: ML models can be applied in various fields, including engineering, medicine, public health, and economics. These algorithms can uncover hidden patterns that traditional statistical techniques might miss, which makes them suitable for predictive and interpretative analyses of the discounts offered in the procurement of basic food basket items.

Methodology/Approach: Random Forest, XGBoost, and Artificial Neural Network models were trained and evaluated using mean absolute error and root mean square error. The data were obtained from the Court of Accounts of the State of Paraná and covered 18 items from the basic food basket.

Findings: Overall, the XGBoost model performed best on the error metrics. Regarding feature importance, the ‘quantity’ variable was the most significant for estimating discounts on bread and butter, whereas the ‘year of approval’ was a key factor for soybean oil, French bread, beef, rice, and beans.

Research, practical & social implications: These models can provide valuable information to managers and oversight bodies, supporting budget planning, fraud detection, and price negotiation in public procurement, thereby contributing to more cost-effective and transparent public administration. However, the models were limited in estimating discount values when the data showed large peaks of variation.

Originality/Value: Based on the preprocessing stage, we recommend a closer examination of the data management system, because its fragmented and non-standardized files make data extraction difficult for anyone outside academia or the corporate sector. ML models can help public managers align with the principles and trends introduced by the New Public Procurement Law, in a public procurement market that accounts for 12% of national GDP.
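The two error metrics named in the methodology, mean absolute error (MAE) and root mean square error (RMSE), can be illustrated with a minimal Python sketch. The function names and the discount values below are hypothetical and are not taken from the study's data or code; they only show how each metric is computed and why RMSE reacts more strongly to a single large miss, such as a peak in the discount series.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference; penalizes large errors more."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical discount percentages (illustrative only, not the study's data).
# The last observation is a peak that the hypothetical model underestimates.
actual = [10.0, 12.0, 8.0, 30.0]
predicted = [11.0, 10.0, 9.0, 20.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)

print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

In this toy case the absolute errors are 1, 2, 1, and 10, so the MAE is 3.5 while the RMSE is about 5.15. The gap between the two comes from the single large error, which is one reason to report both metrics when a series contains peaks.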

Author Biographies

Jose Mateus Rodrigues dos Santos, Federal Technological University of Paraná

Federal Technological University of Paraná (UTFPR), Londrina – Paraná (PR) – Brazil. Undergraduate student in Production Engineering.

Bruno Samways dos Santos, Federal Technological University of Paraná

Federal Technological University of Paraná (UTFPR), Londrina – Paraná (PR) – Brazil. Adjunct Professor in the Department of Production Engineering. Researcher in the Optimization and Data Mining Research Group (GPOMD).
