CY-Bench: a comprehensive benchmark dataset for sub-national crop yield forecasting (Preprint)

In-season, pre-harvest crop yield forecasts are essential for enhancing transparency in commodity markets and improving food security. They play a key role in increasing resilience to climate change and extreme events and thus contribute to the United Nations’ Sustainable Development Goal 2 of zero...

Bibliographic Details
Main authors: Paudel, D., Kallenberg, M., Ofori-Ampofo, S., Baja, H., van Bree, R., Potze, A., Poudel, P., Saleh, A., Anderson, W., von Bloh, M., Castellano, A., Ennaji, O., Hamed, R., Laudien, R., Lee, D., Luna, I., Meroni, M., Mutuku, J.M., Mkuhlani, S., Richetti, J., Ruane, A.C., Sahajpal, R., Shai, G., Sitokonstantinou, V., de Souza Noia Junior, R., Srivastava, A., Strong, R., Sweet, L., Vojnovic, P., Athanasiadis, I.N.
Format: Preprint
Language: English
Published: 2025
Subjects: benchmark, datasets, crop yield, agriculture, climate change, food security, forecasting
Online access: https://hdl.handle.net/10568/177066
author Paudel, D.
Kallenberg, M.
Ofori-Ampofo, S.
Baja, H.
van Bree, R.
Potze, A.
Poudel, P.
Saleh, A.
Anderson, W.
von Bloh, M.
Castellano, A.
Ennaji, O.
Hamed, R.
Laudien, R.
Lee, D.
Luna, I.
Meroni, M.
Mutuku, J.M.
Mkuhlani, S.
Richetti, J.
Ruane, A.C.
Sahajpal, R.
Shai, G.
Sitokonstantinou, V.
de Souza Noia Junior, R.
Srivastava, A.
Strong, R.
Sweet, L.
Vojnovic, P.
Athanasiadis, I.N.
collection Repository of Agricultural Research Outputs (CGSpace)
description In-season, pre-harvest crop yield forecasts are essential for enhancing transparency in commodity markets and improving food security. They play a key role in increasing resilience to climate change and extreme events and thus contribute to the United Nations’ Sustainable Development Goal 2 of zero hunger. Pre-harvest crop yield forecasting is a complex task, as several interacting factors contribute to yield formation, including in-season weather variability, extreme events, long-term climate change, soil, pests, diseases and farm management decisions. Several modeling approaches have been employed to capture complex interactions among such predictors and crop yields. Prior research for in-season, pre-harvest crop yield forecasting has primarily been case-study based, which makes it difficult to compare modeling approaches and measure progress systematically. To address this gap, we introduce CY-Bench (Crop Yield Benchmark), a comprehensive dataset and benchmark to forecast maize and wheat yields at a global scale. CY-Bench was conceptualized and developed within the Machine Learning team of the Agricultural Model Intercomparison and Improvement Project (AgML) in collaboration with agronomists, climate scientists, and machine learning researchers. It features publicly available sub-national yield statistics and relevant predictors—such as weather data, soil characteristics, and remote sensing indicators—that have been pre-processed, standardized, and harmonized across spatio-temporal scales. 
With CY-Bench, we aim to: (i) establish a standardized framework for developing and evaluating data-driven models across diverse farming systems in more than 25 countries across six continents; (ii) enable robust and reproducible model comparisons that address real-world operational challenges; (iii) provide an openly accessible dataset to the earth system science and machine learning communities, facilitating research on time series forecasting, domain adaptation, and online learning. The dataset (https://doi.org/10.5281/zenodo.11502142; Paudel et al., 2025a) and accompanying code (https://github.com/WUR-AI/AgML-CY-Bench; Paudel et al., 2025b) are openly available to support the continuous development of advanced data-driven models for crop yield forecasting to enhance decision-making on food security.
format Preprint
id CGSpace177066
institution CGIAR Consortium
language English
publishDate 2025
record_format dspace
spelling CGSpace177066 2025-12-08T09:54:28Z 2025 2025-10-14T13:00:36Z 2025-10-14T13:00:36Z Preprint https://hdl.handle.net/10568/177066 en Open Access application/pdf Paudel, D., Kallenberg, M., Ofori-Ampofo, S., Baja, H., van Bree, R., Potze, A., ... & Athanasiadis, I.N. (2025). CY-Bench: a comprehensive benchmark dataset for sub-national crop yield forecasting. Earth System Science Data, 2025, 1-28.
title CY-Bench: a comprehensive benchmark dataset for sub-national crop yield forecasting (Preprint)
topic benchmark
datasets
crop yield
agriculture
climate change
food security
forecasting
url https://hdl.handle.net/10568/177066