Optimal sample size and composition for crop classification with Sen2-Agri’s random forest classifier

Sen2-Agri is a software system that was developed to facilitate the use of multi-temporal satellite data for crop classification with a random forest (RF) classifier in an operational setting. It automatically ingests and processes Sentinel-2 and Landsat 8 images. Our goal was to provide practitione...

Bibliographic Details
Main authors: Schulthess, Urs, Rodrigues, Francelino, Taymans, Matthieu, Bellemans, Nicolas, Bontemps, Sophie, Ortíz Monasterio, Jose Iván, Gerard, Bruno G., Defourny, Pierre
Format: Journal Article
Language: English
Published: MDPI 2023
Subjects: crops, forests, machine learning, agriculture, remote sensing
Online access: https://hdl.handle.net/10568/128426
collection Repository of Agricultural Research Outputs (CGSpace)
description Sen2-Agri is a software system that was developed to facilitate the use of multi-temporal satellite data for crop classification with a random forest (RF) classifier in an operational setting. It automatically ingests and processes Sentinel-2 and Landsat 8 images. Our goal was to provide practitioners with recommendations for the best sample size and composition. The study area was located in the Yaqui Valley in Mexico. Using polygons of more than 6000 labeled crop fields, we prepared data sets for training, in which the nine crops had an equal or proportional representation, called Equal or Ratio, respectively. Increasing the size of the training set improved the overall accuracy (OA). Gains became marginal once the total number of fields approximated 500 or 40 to 45 fields per crop type. Equal achieved slightly higher OAs than Ratio for a given number of fields. However, recall and F-scores of the individual crops tended to be higher for Ratio than for Equal. The high number of wheat fields in the Ratio scenarios, ranging from 275 to 2128, produced a more accurate classification of wheat than the maximal 80 fields of Equal. This resulted in a higher recall for wheat in the Ratio than in the Equal scenarios, which in turn limited the errors of commission of the non-wheat crops. Thus, a proportional representation of the crops in the training data is preferable and yields better accuracies, even for the minority crops.
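The difference between the Equal and Ratio training sets described above can be illustrated with a minimal Python sketch. It is not the authors' procedure and does not use Sen2-Agri itself; the function name `sample_fields`, the class names `crop_b` and `crop_c`, and the toy field counts are all hypothetical, chosen only to show how the two compositions allocate fields per crop.

```python
import random
from collections import Counter, defaultdict


def sample_fields(labels, total, scheme, seed=0):
    """Pick training fields under an 'equal' or 'ratio' composition.

    labels: one crop label per field (hypothetical input format)
    total:  target number of training fields across all crops
    scheme: 'equal' gives every crop the same quota;
            'ratio' gives each crop a quota proportional to its share
    Returns the list of selected field indices.
    """
    rng = random.Random(seed)

    # Group field indices by crop label.
    by_crop = defaultdict(list)
    for idx, crop in enumerate(labels):
        by_crop[crop].append(idx)

    chosen = []
    for crop, indices in sorted(by_crop.items()):
        if scheme == "equal":
            quota = total // len(by_crop)
        elif scheme == "ratio":
            quota = round(total * len(indices) / len(labels))
        else:
            raise ValueError(f"unknown scheme: {scheme!r}")
        # Never request more fields than the crop actually has.
        quota = min(quota, len(indices))
        chosen.extend(rng.sample(indices, quota))
    return chosen


# Toy, made-up field inventory dominated by one crop.
labels = ["wheat"] * 600 + ["crop_b"] * 300 + ["crop_c"] * 100

for scheme in ("equal", "ratio"):
    picked = sample_fields(labels, total=90, scheme=scheme)
    counts = Counter(labels[i] for i in picked)
    print(scheme, dict(sorted(counts.items())))
```

With this toy inventory, the equal scheme selects 30 fields per crop, while the ratio scheme selects 54 wheat, 27 `crop_b`, and 9 `crop_c` fields, so the dominant crop contributes most of the training data.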
format Journal Article
id CGSpace128426
institution CGIAR Consortium
language English
publishDate 2023
publisher MDPI
record_format dspace
spelling CGSpace128426 2025-12-08T10:29:22Z 2023-02-01 2023-02-03T08:30:07Z Journal Article https://hdl.handle.net/10568/128426 en Open Access application/pdf MDPI
Schulthess, U., Rodrigues, F., Taymans, M., Bellemans, N., Bontemps, S., Ortiz-Monasterio, I., Gérard, B., & Defourny, P. (2023). Optimal Sample Size and Composition for Crop Classification with Sen2-Agri’s Random Forest Classifier. Remote Sensing, 15(3), 608. https://doi.org/10.3390/rs15030608
title Optimal sample size and composition for crop classification with Sen2-Agri’s random forest classifier
topic crops
forests
machine learning
agriculture
remote sensing
url https://hdl.handle.net/10568/128426