EXGEP: a framework for predicting genotype-by-environment interactions using ensembles of explainable machine-learning models

Bibliographic Details
Main Authors: Yu, Tingxi, Zhang, Hao, Chen, Shoukun, Gao, Shang, Liu, Ze, Wang, Jiankang, Crossa, Jose, Montesinos-Lopez, Osval Antonio, Hearne, Sarah, Li, Huihui
Format: Journal Article
Language: English
Published: Oxford University Press 2025
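Subjects: maize, machine learning, phenotypic variation, genotype environment interaction, artificial intelligence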
Online Access: https://hdl.handle.net/10568/179238
author Yu, Tingxi
Zhang, Hao
Chen, Shoukun
Gao, Shang
Liu, Ze
Wang, Jiankang
Crossa, Jose
Montesinos-Lopez, Osval Antonio
Hearne, Sarah
Li, Huihui
collection Repository of Agricultural Research Outputs (CGSpace)
description Phenotypic variation results from the combination of genotype, environment, and their interaction. The ability to quantify the relative contributions of genetic and environmental factors to complex traits can help in breeding crops with superior adaptability for growth in varied environments. Here, we developed and extensively evaluated an explainable machine-learning framework, explainable genotype-by-environment interactions prediction (EXGEP), to accurately predict grain yield in crops. To assess the performance of EXGEP, we applied it to a dataset comprising 70 693 phenotypic records of grain yield traits for 3793 hybrids, together with genotype and environmental condition data. When used with four different combinations of genotypes and environmental data, EXGEP exceeded the yield prediction performance of the classic Bayesian ridge regression model by 17.37%-42.35%. Moreover, EXGEP incorporates SHapley Additive exPlanations (SHAP) values, which can uncover complex nonlinear relationships between genotype and environment and identify the key features, and their interactions, that contribute most to model performance, thus enhancing our understanding of genotype-by-environment interactions. Additionally, data from a series of tests support that EXGEP exhibits superior performance in terms of prediction accuracy and explainability. Our development of EXGEP and our comparisons of it against alternative models provide valuable insights into methods for accurately predicting complex traits in multiple environments.
format Journal Article
id CGSpace179238
institution CGIAR Consortium
language English
publishDate 2025
publisher Oxford University Press
record_format dspace
spelling CGSpace179238 2025-12-24T02:04:30Z
2025-07 2025-12-23T16:41:14Z 2025-12-23T16:41:14Z Journal Article https://hdl.handle.net/10568/179238 en Open Access application/pdf Oxford University Press
Yu, T., Zhang, H., Chen, S., Gao, S., Liu, Z., Wang, J., Crossa, J., Montesinos-López, O. A., Hearne, S., & Li, H. (2025). EXGEP: a framework for predicting genotype-by-environment interactions using ensembles of explainable machine-learning models. Briefings in Bioinformatics, 26(4), bbaf414. https://doi.org/10.1093/bib/bbaf414
title EXGEP: a framework for predicting genotype-by-environment interactions using ensembles of explainable machine-learning models
topic maize
machine learning
phenotypic variation
genotype environment interaction
artificial intelligence
url https://hdl.handle.net/10568/179238