| Summary: | Phenotypic variation results from genotype, environment, and their interaction. Quantifying the relative contributions of genetic and environmental factors to complex traits can help in breeding crops with superior adaptability to varied growing environments. Here, we developed and extensively evaluated an explainable machine-learning framework, explainable genotype-by-environment interactions prediction (EXGEP), to accurately predict grain yield in crops. To assess the performance of EXGEP, we applied it to a dataset comprising 70 693 phenotypic records of grain yield traits for 3793 hybrids, together with the corresponding genotype and environmental condition data. Across four different combinations of genotype and environmental data, EXGEP exceeded the yield prediction performance of the classic Bayesian ridge regression model by 17.37% to 42.35%. Moreover, EXGEP incorporates SHapley Additive exPlanations values, which can uncover complex nonlinear relationships between genotype and environment and identify the key features, and their interactions, that contribute most to model performance, thus enhancing our understanding of genotype-by-environment interactions. Additionally, data from a series of tests indicate that EXGEP exhibits superior performance in terms of prediction accuracy and explainability. Our development of EXGEP and its comparison against alternative models provide valuable insights into methods for accurately predicting complex traits in multiple environments.
|