Machine learning algorithms translate big data into predictive breeding accuracy

Bibliographic Details
Main Authors: Crossa, José; Montesinos-Lopez, Osval A.; Costa-Neto, Germano; Vitale, Paolo; Martini, Johannes W.R.; Runcie, Daniel E.; Fritsche-Neto, Roberto; Montesinos-Lopez, Abelardo; Perez-Rodriguez, Paulino; Gerard, Guillermo S.; Dreisigacker, Susanna; Crespo-Herrera, Leonardo A.; Saint Pierre, Carolina; Lillemo, Morten; Cuevas, Jaime; Bentley, Alison R.; Ortiz, Rodomiro
Format: Journal Article
Language: English
Published: Elsevier 2025
Online Access: https://hdl.handle.net/10568/169929
Description
Summary: Statistical machine learning (ML) extracts patterns from extensive genomic, phenotypic, and environmental data. ML algorithms automatically identify relevant features and use cross-validation to ensure robust models and improve prediction reliability in new lines. Furthermore, ML analyses of genotype-by-environment (G×E) interactions can offer insights into the genetic factors that affect performance in specific environments. By leveraging historical breeding data, ML streamlines strategies and automates analyses to reveal genomic patterns. In this review, we examine the transformative impact of big data, including multi-trait genomics, phenomics, and environmental covariables, on genomic-enabled prediction in plant breeding. We discuss how big data and ML are revolutionizing the field by enhancing prediction accuracy, deepening our understanding of G×E interactions, and optimizing breeding strategies through the analysis of extensive and diverse datasets.