Leveraging automated machine learning for environmental data-driven genetic analysis and genomic prediction in maize hybrids

Genotype, environment, and genotype-by-environment (GxE) interactions play a critical role in shaping crop phenotypes. Here, a large-scale, multi-environment hybrid maize dataset is used to construct and validate an automated machine learning framework that integrates environmental and genomic data...


Bibliographic Details
Main Authors: He, Kunhui, Yu, Tingxi, Gao, Shang, Chen, Shoukun, Li, Liang, Zhang, Xuecai, Huang, Changling, Xu, Yunbi, Wang, Jiankang, Boddupalli, Prasanna, Hearne, Sarah, Li, Xinhai, Li, Huihui
Format: Journal Article
Language: English
Published: Wiley-VCH Verlag 2025
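Subjects: environment, data, genetics, marker-assisted selection, genotype environment interaction, machine learning, maize, hybrids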
Online Access: https://hdl.handle.net/10568/179136
_version_ 1855522108114731008
author He, Kunhui
Yu, Tingxi
Gao, Shang
Chen, Shoukun
Li, Liang
Zhang, Xuecai
Huang, Changling
Xu, Yunbi
Wang, Jiankang
Boddupalli, Prasanna
Hearne, Sarah
Li, Xinhai
Li, Huihui
author_sort He, Kunhui
collection Repository of Agricultural Research Outputs (CGSpace)
description Genotype, environment, and genotype-by-environment (GxE) interactions play a critical role in shaping crop phenotypes. Here, a large-scale, multi-environment hybrid maize dataset is used to construct and validate an automated machine learning framework that integrates environmental and genomic data for improved accuracy and efficiency in genetic analyses and genomic predictions. Dimensionality-reduced environmental parameters (RD_EPs) aligned with developmental stages are applied to establish linear relationships between RD_EPs and traits to assess the influence of environment on phenotype. A genome-wide association study identifies 539 phenotypic plasticity trait-associated markers (PP-TAMs), 223 environmental stability TAMs (Main-TAMs), and 92 GxE-TAMs, revealing distinct genetic bases for PP and GxE interactions. Training genomic prediction models with both TAMs and RD_EPs increases prediction accuracy by 14.02% to 28.42% over that of genome-wide marker approaches. These results demonstrate the potential of utilizing environmental data for improving genetic analysis and genomic selection, offering a scalable approach for developing climate-adaptive maize varieties.
format Journal Article
id CGSpace179136
institution CGIAR Consortium
language English
publishDate 2025
publisher Wiley-VCH Verlag
record_format dspace
spelling CGSpace179136 2025-12-22T02:05:31Z 2025-05-08 2025-12-21T21:19:03Z 2025-12-21T21:19:03Z Journal Article https://hdl.handle.net/10568/179136 en Open Access application/pdf Wiley-VCH Verlag
He, K., Yu, T., Gao, S., Chen, S., Li, L., Zhang, X., Huang, C., Xu, Y., Wang, J., Prasanna, B. M., Hearne, S., Li, X., & Li, H. (2025). Leveraging automated machine learning for environmental data-driven genetic analysis and genomic prediction in maize hybrids. Advanced Science, 12(17), 2412423. https://doi.org/10.1002/advs.202412423
title Leveraging automated machine learning for environmental data-driven genetic analysis and genomic prediction in maize hybrids
topic environment
data
genetics
marker-assisted selection
genotype environment interaction
machine learning
maize
hybrids
url https://hdl.handle.net/10568/179136