Balancing sensitivity and specificity enhances top and bottom ranking in genomic prediction of cultivars

Genomic selection (GS) is a predictive methodology that is revolutionizing plant and animal breeding. However, the practical application of the GS methodology is challenging since a successful implementation requires a good identification of the best lines. For this reason, some approaches have been...


Bibliographic Details
Main authors: Montesinos-Lopez, Osval A.; Kismiantini; Alemu, Admas; Montesinos-López, Abelardo; Montesinos-Lopez, José Cricelio; Crossa, Jose
Format: Journal Article
Language: English
Published: MDPI 2025
Subjects: plant breeding, marker-assisted selection, bayesian theory, datasets
Online access: https://hdl.handle.net/10568/173478
author Montesinos-Lopez, Osval A.
Kismiantini
Alemu, Admas
Montesinos-López, Abelardo
Montesinos-Lopez, José Cricelio
Crossa, Jose
collection Repository of Agricultural Research Outputs (CGSpace)
description Genomic selection (GS) is a predictive methodology that is revolutionizing plant and animal breeding. However, the practical application of the GS methodology is challenging since a successful implementation requires a good identification of the best lines. For this reason, some approaches have been proposed to select the top (or bottom) lines with more Precision. Despite the varying popularity of methods, with some being notably more efficient than others, this paper delves into the fundamentals of these techniques. We used five models/methods: (1) RC, known as the Bayesian Best Linear Unbiased Predictor (GBLUP); (2) R, which is like RC but uses a threshold; (3) RO, Regression Optimum, which leverages the RC model in its training process to fine-tune the threshold; (4) B, the Threshold Bayesian Probit Binary model (TGBLUP) with a threshold of 0.5 to classify the cultivars as top or non-top; (5) BO, which is TGBLUP with an optimal probability threshold that guarantees similar Sensitivity and Specificity. We also present a benchmark comparison of existing approaches for selecting the top (or bottom) performers, utilizing five real datasets for comprehensive analysis. For methods that necessitate a rigorous tuning process, we suggest a streamlined tuning approach that significantly decreases implementation time without notably compromising performance. Our analysis revealed that the Regression Optimum (RO) method outperformed the other models across the five real datasets, achieving superior results in terms of the F1 score. Specifically, RO was more effective than models R, B, RC, and BO by 60.87, 42.37, 17.63, and 9.62%, respectively. In terms of the Kappa coefficient, the RO model was better than models B, BO, R, and RC by 37.46, 36.21, 52.18, and 3.95%, respectively. In terms of Sensitivity, the RO model outperformed models B, R, and RC by 145.74, 250.41, and 86.20%, respectively. The second-best model was BO.
It is important to point out that in the first stage, the BO and RO approaches train a classification model and a regression model, respectively, to classify the lines as top (bottom) or not top (not bottom). In the second stage, both approaches optimize the classification threshold so that the difference between Sensitivity and Specificity is minimized. The BO and RO methods are superior for the selection of the top (or bottom) lines. For this reason, we encourage breeders to adopt these approaches to increase genetic gain in plant breeding programs.
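The second-stage idea described in the abstract (choosing the threshold that minimizes the difference between Sensitivity and Specificity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the rule "a line is called top when its score is at least the threshold", the use of the observed scores as candidate thresholds, and the toy data are all assumptions made for illustration. The scores stand in for either predicted values from a regression model (as in RO) or predicted probabilities from a binary model (as in BO).

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    is_top: Sequence[bool], predicted_top: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a top / non-top classification."""
    pairs = list(zip(is_top, predicted_top))
    tp = sum(1 for t, p in pairs if t and p)          # top lines called top
    fn = sum(1 for t, p in pairs if t and not p)      # top lines missed
    tn = sum(1 for t, p in pairs if not t and not p)  # non-top lines rejected
    fp = sum(1 for t, p in pairs if not t and p)      # non-top lines called top
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def balanced_threshold(
    is_top: Sequence[bool], scores: Sequence[float]
) -> Tuple[float, float]:
    """Pick the candidate threshold with the smallest |sensitivity - specificity|.

    Assumption: candidates are the distinct observed scores, and a line is
    classified as top when score >= threshold.
    """
    best_thr, best_gap = float("nan"), float("inf")
    for thr in sorted(set(scores)):
        predicted: List[bool] = [s >= thr for s in scores]
        sens, spec = sensitivity_specificity(is_top, predicted)
        gap = abs(sens - spec)
        if gap < best_gap:  # ties keep the lowest threshold
            best_thr, best_gap = thr, gap
    return best_thr, best_gap


# Hypothetical training data: 8 lines, 3 of which are truly top.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_is_top = [True, True, False, True, False, False, False, False]

threshold, gap = balanced_threshold(toy_is_top, toy_scores)
print(f"chosen threshold = {threshold}, |sens - spec| = {gap:.4f}")
```

In practice the threshold would be tuned on training data only and then applied to the predictions for untested lines; the sketch tunes and evaluates on the same toy data purely to keep the example short.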
format Journal Article
id CGSpace173478
institution CGIAR Consortium
language English
publishDate 2025
publisher MDPI
record_format dspace
spelling CGSpace173478 2025-12-08T10:29:22Z 2025 2025-03-04T04:38:22Z 2025-03-04T04:38:22Z Journal Article https://hdl.handle.net/10568/173478 en Open Access application/pdf MDPI Montesinos-López, O. A., Kismiantini, Alemu, A., Montesinos-López, A., Montesinos-López, J. C., & Crossa, J. (2025). Balancing sensitivity and specificity enhances top and bottom ranking in genomic prediction of cultivars. Plants, 14(3), 308. https://doi.org/10.3390/plants14030308
title Balancing sensitivity and specificity enhances top and bottom ranking in genomic prediction of cultivars
topic plant breeding
marker-assisted selection
bayesian theory
datasets
url https://hdl.handle.net/10568/173478