DNNGP, a deep neural network-based method for genomic prediction using multi-omics data in plants

Genomic prediction is an effective way to accelerate the rate of agronomic trait improvement in plants. Traditional methods typically use linear regression models with clear assumptions; such methods are unable to capture the complex relationships between genotypes and phenotypes. Non-linear models...

Bibliographic Details
Main authors: Wang, Kelin; Abid, Muhammad Ali; Awais Rasheed; Crossa, José; Hearne, Sarah Jane; Huihui Li
Format: Journal Article
Language: English
Published: Elsevier 2023
Subjects: marker-assisted selection; methods; data; learning
Online access: https://hdl.handle.net/10568/132711
author Wang, Kelin
Abid, Muhammad Ali
Awais Rasheed
Crossa, José
Hearne, Sarah Jane
Huihui Li
collection Repository of Agricultural Research Outputs (CGSpace)
description Genomic prediction is an effective way to accelerate the rate of agronomic trait improvement in plants. Traditional methods typically use linear regression models with clear assumptions; such methods are unable to capture the complex relationships between genotypes and phenotypes. Non-linear models (e.g., deep neural networks) have been proposed as a superior alternative to linear models because they can capture complex non-additive effects. Here we introduce a deep learning (DL) method, deep neural network genomic prediction (DNNGP), for integration of multi-omics data in plants. We trained DNNGP on four datasets and compared its performance with methods built with five classic models: genomic best linear unbiased prediction (GBLUP); two methods based on a machine learning (ML) framework, light gradient boosting machine (LightGBM) and support vector regression (SVR); and two methods based on a DL framework, deep learning genomic selection (DeepGS) and deep learning genome-wide association study (DLGWAS). DNNGP is novel in five ways. First, it can be applied to a variety of omics data to predict phenotypes. Second, the multilayered hierarchical structure of DNNGP dynamically learns features from raw data, avoiding overfitting and improving the convergence rate using a batch normalization layer and early stopping and rectified linear activation (rectified linear unit) functions. Third, when small datasets were used, DNNGP produced results that are competitive with results from the other five methods, showing greater prediction accuracy than the other methods when large-scale breeding data were used. Fourth, the computation time required by DNNGP was comparable with that of commonly used methods, up to 10 times faster than DeepGS. Fifth, hyperparameters can easily be batch tuned on a local machine. Compared with GBLUP, LightGBM, SVR, DeepGS and DLGWAS, DNNGP is superior to these existing widely used genomic selection (GS) methods. 
Moreover, DNNGP can generate robust assessments from diverse datasets, including omics data, and quickly incorporate complex and large datasets into usable models, making it a promising and practical approach for straightforward integration into existing GS platforms.
format Journal Article
id CGSpace132711
institution CGIAR Consortium
language English
publishDate 2023
publisher Elsevier
record_format dspace
spelling CGSpace132711 2025-12-08T10:11:39Z 2023-01 2023-11-03T15:52:14Z 2023-11-03T15:52:14Z Journal Article https://hdl.handle.net/10568/132711 en Open Access application/pdf Elsevier Wang, K., Abid, M. A., Rasheed, A., Crossa, J., Hearne, S., & Li, H. (2023). DNNGP, a deep neural network-based method for genomic prediction using multi-omics data in plants. Molecular Plant, 16(1), 279–293. https://doi.org/10.1016/j.molp.2022.11.004
title DNNGP, a deep neural network-based method for genomic prediction using multi-omics data in plants
topic marker-assisted selection
methods
data
learning
url https://hdl.handle.net/10568/132711