Data augmentation enhances plant-genomic-enabled predictions
Genomic selection (GS) is revolutionizing plant breeding. However, its practical implementation is still challenging, since there are many factors that affect its accuracy. For this reason, this research explores data augmentation with the goal of improving its accuracy. Deep neural networks with da...
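| Main authors: | Montesinos-Lopez, Osval A.; Solis-Camacho, Mario Alberto; Crespo Herrera, Leonardo A.; Saint Pierre, Carolina; Huerta Prado, Gloria Isabel; Ramos-Pulido, Sofia; Al-Nowibet, Khalid; Fritsche-Neto, Roberto; Gerard, Guillermo S.; Montesinos-Lopez, Abelardo; Crossa, José |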
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | MDPI 2024 |
| Subjects: | marker-assisted selection; plant breeding; data; genomes |
| Online access: | https://hdl.handle.net/10568/159826 |
| _version_ | 1855533306094813184 |
|---|---|
| author | Montesinos-Lopez, Osval A.; Solis-Camacho, Mario Alberto; Crespo Herrera, Leonardo A.; Saint Pierre, Carolina; Huerta Prado, Gloria Isabel; Ramos-Pulido, Sofia; Al-Nowibet, Khalid; Fritsche-Neto, Roberto; Gerard, Guillermo S.; Montesinos-Lopez, Abelardo; Crossa, José |
| author_browse | Al-Nowibet, Khalid; Crespo Herrera, Leonardo A.; Crossa, José; Fritsche-Neto, Roberto; Gerard, Guillermo S.; Huerta Prado, Gloria Isabel; Montesinos-Lopez, Abelardo; Montesinos-Lopez, Osval A.; Ramos-Pulido, Sofia; Saint Pierre, Carolina; Solis-Camacho, Mario Alberto |
| author_sort | Montesinos-Lopez, Osval A. |
| collection | Repository of Agricultural Research Outputs (CGSpace) |
| description | Genomic selection (GS) is revolutionizing plant breeding. However, its practical implementation remains challenging because many factors affect its accuracy. This research therefore explores data augmentation (DA) as a way to improve that accuracy. Deep neural networks with DA generate synthetic data from the original training set to enlarge it and to improve the prediction performance of any statistical or machine learning algorithm. There is much empirical evidence of the success of DA in computer vision applications. For this reason, DA was explored in the context of GS using 14 real datasets. We found empirical evidence that DA is a powerful tool for improving prediction accuracy, since it improved the prediction accuracy of the top lines in the 14 datasets under study. On average, across datasets and traits, the gain in prediction performance of the DA approach relative to the conventional method in the top 20% of lines in the testing set was 108.4% in terms of the normalized root mean square error (NRMSE) and 107.4% in terms of the mean arctangent absolute percentage error (MAAPE), but performance on the whole testing set was worse. We encourage more empirical evaluations to support our findings. |
| format | Journal Article |
| id | CGSpace159826 |
| institution | CGIAR Consortium |
| language | English |
| publishDate | 2024 |
| publisher | MDPI |
| record_format | dspace |
| spelling | CGSpace159826 2025-12-08T10:29:22Z Data augmentation enhances plant-genomic-enabled predictions Montesinos-Lopez, Osval A. Solis-Camacho, Mario Alberto Crespo Herrera, Leonardo A. Saint Pierre, Carolina Huerta Prado, Gloria Isabel Ramos-Pulido, Sofia Al-Nowibet, Khalid Fritsche-Neto, Roberto Gerard, Guillermo S. Montesinos-Lopez, Abelardo Crossa, José marker-assisted selection plant breeding data genomes Genomic selection (GS) is revolutionizing plant breeding. However, its practical implementation is still challenging, since there are many factors that affect its accuracy. For this reason, this research explores data augmentation with the goal of improving its accuracy. Deep neural networks with data augmentation (DA) generate synthetic data from the original training set to increase the training set and to improve the prediction performance of any statistical or machine learning algorithm. There is much empirical evidence of their success in many computer vision applications. Due to this, DA was explored in the context of GS using 14 real datasets. We found empirical evidence that DA is a powerful tool to improve the prediction accuracy, since we improved the prediction accuracy of the top lines in the 14 datasets under study. On average, across datasets and traits, the gain in prediction performance of the DA approach regarding the Conventional method in the top 20% of lines in the testing set was 108.4% in terms of the NRMSE and 107.4% in terms of the MAAPE, but a worse performance was observed on the whole testing set. We encourage more empirical evaluations to support our findings. 2024-03 2024-11-15T14:51:00Z 2024-11-15T14:51:00Z Journal Article https://hdl.handle.net/10568/159826 en Open Access application/pdf MDPI Montesinos-López, O. A., Solis-Camacho, M. A., Crespo-Herrera, L., Saint Pierre, C., Huerta Prado, G. I., Ramos-Pulido, S., Al-Nowibet, K., Fritsche-Neto, R., Gerard, G. S., Montesinos-López, A., & Crossa, J. (2024). Data augmentation enhances plant-genomic-enabled predictions. Genes, 15(3), 286. https://doi.org/10.3390/genes15030286 |
| title | Data augmentation enhances plant-genomic-enabled predictions |
| topic | marker-assisted selection; plant breeding; data; genomes |
| url | https://hdl.handle.net/10568/159826 |
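
The abstract reports gains in NRMSE and MAAPE for the top 20% of lines in the testing set. As a rough illustration of how such an evaluation might be computed, the Python sketch below makes several assumptions that the record does not confirm: NRMSE is taken as the RMSE divided by the mean of the observed values, MAAPE uses the standard mean-arctangent definition, and the "top 20%" is selected by ranking the observed values. The function names (`nrmse`, `maape`, `top_fraction_mask`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def nrmse(y_true, y_pred):
    """RMSE divided by the mean observed value (assumed normalization)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.mean(y_true)


def maape(y_true, y_pred):
    """Mean arctangent absolute percentage error.

    Assumes no observed value is exactly zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.arctan(np.abs((y_true - y_pred) / y_true)))


def top_fraction_mask(y_true, fraction=0.2):
    """Boolean mask marking the lines with the highest observed values.

    Ranking by observed values is an assumption made for illustration.
    """
    y_true = np.asarray(y_true, dtype=float)
    k = max(1, int(np.ceil(fraction * y_true.size)))
    mask = np.zeros(y_true.size, dtype=bool)
    mask[np.argsort(y_true)[-k:]] = True
    return mask


# Toy testing set (made-up numbers, not data from the study).
y_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y_hat = y_obs + 1.0

top = top_fraction_mask(y_obs, fraction=0.2)
print("NRMSE, whole testing set:", nrmse(y_obs, y_hat))
print("NRMSE, top 20% of lines: ", nrmse(y_obs[top], y_hat[top]))
print("MAAPE, top 20% of lines: ", maape(y_obs[top], y_hat[top]))
```

Reporting each metric both on the whole testing set and on the top fraction mirrors the contrast drawn in the abstract, where the two can move in opposite directions.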