Ensemble machine learning for digital mapping of soil pH and electrical conductivity in the Andean agroecosystem of Peru

In agricultural systems, soil pH and electrical conductivity (EC) are crucial chemical properties that directly affect nutrient availability and microbial activity, but the challenging environment of the Peruvian Andes has limited research on their estimation. This study aimed to develop an ensemble...


Bibliographic Details
Main authors: Carbajal Llosa, Carlos Miguel; Barja, Antony; Pizarro Carcausto, Samuel Edwin
Format: info:eu-repo/semantics/article
Language: English
Published: Frontiers Media S.A. 2025
Subjects:
Online access: http://hdl.handle.net/20.500.12955/2967
https://doi.org/10.3389/fsoil.2025.1673628
author Carbajal Llosa, Carlos Miguel
Barja, Antony
Pizarro Carcausto, Samuel Edwin
collection Repositorio INIA
description In agricultural systems, soil pH and electrical conductivity (EC) are crucial chemical properties that directly affect nutrient availability and microbial activity, but the challenging environment of the Peruvian Andes has limited research on their estimation. This study aimed to develop an ensemble learning method to predict soil pH and EC in Andean agroecosystems using environmental predictors. By using simple and weighted averaging, we developed a heterogeneous ensemble learning approach that integrates machine learning (ML) algorithms, including Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). The weighted ensemble assigns weights to models based on their predictive accuracy, measured by R² from spatial cross-validation. Spatial patterns are noticeable, and pH displays greater spatial clustering than EC. Elevation was the most important predictor in ML models for both parameters. Ensemble models significantly outperformed individual models, with the weighted ensemble achieving R² >0.93 and reducing RMSE by approximately 72%. Among standalone models, RF and XGBoost performed best for pH, while SVM performed the best for EC. ANN models were the least effective. Uncertainty analysis indicated high confidence in pH predictions but moderate to high uncertainty in EC predictions, suggesting that EC is more challenging to predict. Ensemble models with optimized weighting provide robust and accurate mapping of spatially autocorrelated soil properties. The high-confidence pH maps are reliable for soil management decisions, while EC predictions, though more uncertain, effectively identify priority areas for future sampling and investigation.
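The simple and weighted averaging described in the abstract can be illustrated with a minimal sketch. The abstract only states that weights are based on each model's R² from spatial cross-validation; the normalization used here (clipping negative R² to zero and dividing by the sum) is an assumption, as are the function names `r2_weights` and `ensemble_predict` and all numeric values, which are hypothetical and not taken from the study.

```python
import numpy as np


def r2_weights(r2_scores):
    """Turn per-model R² scores into non-negative weights that sum to 1.

    Assumption: weights are proportional to R²; negative scores are
    clipped to zero. The record does not specify the exact scheme.
    """
    scores = np.clip(np.asarray(r2_scores, dtype=float), 0.0, None)
    total = scores.sum()
    if total == 0.0:
        # No model has positive skill: fall back to equal weights.
        return np.full(scores.shape, 1.0 / scores.size)
    return scores / total


def ensemble_predict(predictions, weights=None):
    """Combine model predictions of shape (n_models, n_samples).

    With weights=None this is the simple average; otherwise it is the
    weighted average across models.
    """
    preds = np.asarray(predictions, dtype=float)
    if weights is None:
        return preds.mean(axis=0)
    return np.average(preds, axis=0, weights=weights)


# Hypothetical pH predictions from four models (rows) at three locations.
preds = np.array([
    [6.1, 7.0, 5.5],   # e.g. SVM
    [6.3, 7.2, 5.7],   # e.g. ANN
    [6.0, 6.9, 5.4],   # e.g. RF
    [6.2, 7.1, 5.6],   # e.g. XGBoost
])
cv_r2 = [0.60, 0.50, 0.80, 0.80]  # hypothetical spatial-CV R² per model

weights = r2_weights(cv_r2)
print("weights:", weights)
print("simple average:", ensemble_predict(preds))
print("weighted average:", ensemble_predict(preds, weights))
```

Proportional weighting keeps every model in the ensemble while letting the better-validated ones dominate; other schemes (for example, optimizing weights directly against held-out error) would fit the same interface.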
format info:eu-repo/semantics/article
id INIA2967
institution Instituto Nacional de Innovación Agraria
language English
publishDate 2025
publisher Frontiers Media S.A.
record_format dspace
spelling INIA2967
2025-12-31T19:12:43Z
This research was funded by the INIA project CUI 2487112 "Mejoramiento de los servicios de investigación y transferencia tecnológica en el manejo y recuperación de suelos agrícolas degradados y aguas para riego en la pequeña y mediana agricultura en los departamentos de Lima, Áncash, San Martín, Cajamarca, Lambayeque, Junín, Ayacucho, Arequipa, Puno y Ucayali".
Acknowledgments: To the personnel of the Soil, Water, and Foliars Laboratory (LABSAF) at the Santa Ana Agrarian Experimental Station (EEA).
2025-12-30T18:16:21Z
2025-12-30T18:16:21Z
2025-11-06
info:eu-repo/semantics/article
Carbajal Llosa, C., Barja, A., & Pizarro Carcausto, S. (2025). Ensemble machine learning for digital mapping of soil pH and electrical conductivity in the Andean agroecosystem of Peru. Frontiers in Soil Science, 5, 1673628. https://doi.org/10.3389/fsoil.2025.1673628
2673-8619
http://hdl.handle.net/20.500.12955/2967
https://doi.org/10.3389/fsoil.2025.1673628
eng
urn:issn:2673-8619
Frontiers in Soil Science
info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/4.0/
application/pdf
application/pdf
Frontiers Media S.A.
CH
Instituto Nacional de Innovación Agraria
Repositorio Institucional - INIA
title Ensemble machine learning for digital mapping of soil pH and electrical conductivity in the Andean agroecosystem of Peru
topic Ensemble learning
Spatial machine learning
Digital soil mapping
Soil pH
Electrical conductivity
Aprendizaje conjunto
aprendizaje automático espacial
mapeo digital del suelo
pH del suelo
conductividad eléctrica.
https://purl.org/pe-repo/ocde/ford#4.01.04
Propiedad del suelo; Soil properties; Teledetección; Remote sensing; Modelo digital de superficie; Digital Surface models; Sistema de información geográfica; Geographic information systems; Análisis espacial; Spatial analysis; Perú; Peru.
url http://hdl.handle.net/20.500.12955/2967
https://doi.org/10.3389/fsoil.2025.1673628