CY-Bench: a comprehensive benchmark dataset for sub-national crop yield forecasting (Preprint)
In-season, pre-harvest crop yield forecasts are essential for enhancing transparency in commodity markets and improving food security. They play a key role in increasing resilience to climate change and extreme events and thus contribute to the United Nations’ Sustainable Development Goal 2 of zero...
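| Main Authors: | Paudel, D.; Kallenberg, M.; Ofori-Ampofo, S.; Baja, H.; van Bree, R.; Potze, A.; Poudel, P.; Saleh, A.; Anderson, W.; von Bloh, M.; Castellano, A.; Ennaji, O.; Hamed, R.; Laudien, R.; Lee, D.; Luna, I.; Meroni, M.; Mutuku, J.M.; Mkuhlani, S.; Richetti, J.; Ruane, A.C.; Sahajpal, R.; Shai, G.; Sitokonstantinou, V.; de Souza Noia Junior, R.; Srivastava, A.; Strong, R.; Sweet, L.; Vojnovic, P.; Athanasiadis, I.N. |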
|---|---|
| Format: | Preprint |
| Language: | English |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://hdl.handle.net/10568/177066 |
| _version_ | 1855527597708935168 |
|---|---|
| author | Paudel, D.; Kallenberg, M.; Ofori-Ampofo, S.; Baja, H.; van Bree, R.; Potze, A.; Poudel, P.; Saleh, A.; Anderson, W.; von Bloh, M.; Castellano, A.; Ennaji, O.; Hamed, R.; Laudien, R.; Lee, D.; Luna, I.; Meroni, M.; Mutuku, J.M.; Mkuhlani, S.; Richetti, J.; Ruane, A.C.; Sahajpal, R.; Shai, G.; Sitokonstantinou, V.; de Souza Noia Junior, R.; Srivastava, A.; Strong, R.; Sweet, L.; Vojnovic, P.; Athanasiadis, I.N. |
| author_sort | Paudel, D. |
| collection | Repository of Agricultural Research Outputs (CGSpace) |
| description | In-season, pre-harvest crop yield forecasts are essential for enhancing transparency in commodity markets and improving food security. They play a key role in increasing resilience to climate change and extreme events and thus contribute to the United Nations’ Sustainable Development Goal 2 of zero hunger. Pre-harvest crop yield forecasting is a complex task, as several interacting factors contribute to yield formation, including in-season weather variability, extreme events, long-term climate change, soil, pests, diseases and farm management decisions. Several modeling approaches have been employed to capture complex interactions among such predictors and crop yields. Prior research on in-season, pre-harvest crop yield forecasting has primarily been case-study based, which makes it difficult to compare modeling approaches and measure progress systematically. To address this gap, we introduce CY-Bench (Crop Yield Benchmark), a comprehensive dataset and benchmark to forecast maize and wheat yields at a global scale. CY-Bench was conceptualized and developed within the Machine Learning team of the Agricultural Model Intercomparison and Improvement Project (AgML) in collaboration with agronomists, climate scientists, and machine learning researchers. It features publicly available sub-national yield statistics and relevant predictors, such as weather data, soil characteristics, and remote sensing indicators, that have been pre-processed, standardized, and harmonized across spatio-temporal scales. With CY-Bench, we aim to: (i) establish a standardized framework for developing and evaluating data-driven models across diverse farming systems in more than 25 countries across six continents; (ii) enable robust and reproducible model comparisons that address real-world operational challenges; (iii) provide an openly accessible dataset to the earth system science and machine learning communities, facilitating research on time series forecasting, domain adaptation, and online learning. The dataset (https://doi.org/10.5281/zenodo.11502142; Paudel et al., 2025a) and accompanying code (https://github.com/WUR-AI/AgML-CY-Bench; Paudel et al., 2025b) are openly available to support the continuous development of advanced data-driven models for crop yield forecasting to enhance decision-making on food security. |
| format | Preprint |
| id | CGSpace177066 |
| institution | CGIAR Consortium |
| language | English |
| publishDate | 2025 |
| record_format | dspace |
| spelling | CGSpace177066; 2025-12-08T09:54:28Z; 2025-10-14T13:00:36Z; Preprint; https://hdl.handle.net/10568/177066; en; Open Access; application/pdf; Paudel, D., Kallenberg, M., Ofori-Ampofo, S., Baja, H., van Bree, R., Potze, A., ... & Athanasiadis, I.N. (2025). CY-Bench: a comprehensive benchmark dataset for sub-national crop yield forecasting. Earth System Science Data, 2025, 1-28. |
| title | CY-Bench: a comprehensive benchmark dataset for sub-national crop yield forecasting (Preprint) |
| topic | benchmark datasets crop yield agriculture climate change food security forecasting |
| url | https://hdl.handle.net/10568/177066 |