Prediction of soybean yields from climate data and vegetation indices using machine learning and neural networks models

Authors

DOI:

https://doi.org/10.14209/jcis.2026.7

Keywords:

Yield forecasting, NASA POWER, MODIS, Remote Sensing, Machine Learning, Artificial Intelligence

Abstract

Accurate soybean yield prediction is crucial for agricultural planning, supply chain logistics, food security strategies and maximizing production. In this study, we evaluated the performance of two machine learning models and three neural networks, namely Random Forest, XGBoost, Multilayer Perceptron (MLP), Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM), for soybean yield forecasting using climate variables and vegetation indices. The data cover 20 years (2001-2020) and include municipal soybean yield records from IBGE, meteorological data from NASA POWER and vegetation indices derived from MODIS satellite images. We implemented and compared four forecasting scenarios: a single general predictor, predictors by state, predictors by climate zone and independent predictors specific to each month. Our results reveal that regionalized modeling, especially by climate zone, significantly improves forecasting accuracy. In the climate-zone scenario, the MLP model obtained the lowest errors (MAE = 102.9 kg/ha, RMSE = 128 kg/ha and rRMSE = 3.88%) as well as the best coefficient of determination (R² = 0.83).
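
For readers unfamiliar with the four error measures quoted above, the following is a minimal sketch of how they are commonly computed. It is not taken from the paper: the function name `regression_metrics` and the sample values are hypothetical, and rRMSE is assumed here to be the RMSE divided by the mean observed yield, expressed as a percentage, which is one common convention.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, rRMSE (%) and R2 for observed vs. predicted yields.

    Hypothetical helper for illustration only. rRMSE is assumed to be
    RMSE normalized by the mean observed value, in percent.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    mae = np.mean(np.abs(err))                 # mean absolute error
    rmse = np.sqrt(np.mean(err ** 2))          # root mean squared error
    rrmse = 100.0 * rmse / np.mean(y_true)     # relative RMSE, in percent

    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination

    return {"MAE": mae, "RMSE": rmse, "rRMSE": rrmse, "R2": r2}


# Made-up yields in kg/ha, purely to exercise the function.
observed = [3000.0, 3200.0, 3400.0, 3600.0]
predicted = [3100.0, 3100.0, 3500.0, 3500.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under this assumed definition, the reported RMSE of 128 kg/ha and rRMSE of 3.88% would correspond to a mean observed yield of roughly 3300 kg/ha.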

Author Biographies

Larissa Rangel de Azevedo, University of Campinas (UNICAMP)

Larissa Rangel de Azevedo received her B.S. and M.Sc. degrees in Electrical Engineering from the Federal University of Ouro Preto (UFOP), Minas Gerais, Brazil, in 2022, and the University of Campinas (UNICAMP), São Paulo, Brazil, in 2025, respectively. During her undergraduate studies, she was involved in research on brain–computer interfaces, exploring signal processing and machine learning techniques applied to neural data. Her research focuses on data-driven approaches for environmental and agricultural applications, particularly predictive modeling of agricultural productivity using machine learning and deep learning methods, integrating climate data and remote sensing information. Her main research interests include machine learning, time-series modeling, geospatial data analysis, and brain–computer interfaces.

Levy Boccato, Department of Computer Engineering and Automation, School of Electrical and Computer Engineering, State University of Campinas, Campinas, Brazil

Levy Boccato received his B.S. degree in Computer Engineering in 2008, and his M.Sc. and Ph.D. degrees in Electrical Engineering in 2010 and 2013, respectively, all from the University of Campinas (UNICAMP), São Paulo, Brazil. He is currently an Associate Professor at the same university, where he conducts research at the Laboratory of Signal Processing for Communications (DSPCom). He is also a member of the Brazilian Institute of Data Science (BI0S) and the Hub of Artificial Intelligence and Cognitive Architectures (H.IAAC). His main research interests include signal processing, adaptive filtering, machine learning, and brain–computer interfaces.

Published

2026-03-28

How to Cite

Rangel de Azevedo, L., & Boccato, L. (2026). Prediction of soybean yields from climate data and vegetation indices using machine learning and neural networks models. Journal of Communication and Information Systems, 41(1), 61–76. https://doi.org/10.14209/jcis.2026.7

Section

Regular Papers
Received 2025-07-18
Accepted 2026-03-09
Published 2026-03-28