Balancing sensitivity and specificity enhances top and bottom ranking in genomic prediction of cultivars
Genomic selection (GS) is a predictive methodology that is revolutionizing plant and animal breeding. However, the practical application of the GS methodology is challenging since a successful implementation requires a good identification of the best lines. For this reason, some approaches have been...
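| Main authors: | Montesinos-Lopez, Osval A.; Kismiantini; Alemu, Admas; Montesinos-López, Abelardo; Montesinos-Lopez, José Cricelio; Crossa, Jose |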
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | MDPI, 2025 |
| Subjects: | plant breeding; marker-assisted selection; bayesian theory; datasets |
| Online access: | https://hdl.handle.net/10568/173478 |
| _version_ | 1855538895685419008 |
|---|---|
| author | Montesinos-Lopez, Osval A.; Kismiantini; Alemu, Admas; Montesinos-López, Abelardo; Montesinos-Lopez, José Cricelio; Crossa, Jose |
| author_sort | Montesinos-Lopez, Osval A. |
| collection | Repository of Agricultural Research Outputs (CGSpace) |
| description | Genomic selection (GS) is a predictive methodology that is revolutionizing plant and animal breeding. However, applying GS in practice is challenging because successful implementation depends on correctly identifying the best lines. For this reason, several approaches have been proposed to select the top (or bottom) lines with greater precision. These methods vary in popularity, and some are notably more efficient than others; this paper examines their fundamentals. We used five models/methods: (1) RC, the Bayesian genomic best linear unbiased predictor (GBLUP); (2) R, which is like RC but uses a threshold; (3) RO, Regression Optimum, which leverages the RC model in its training process to fine-tune the threshold; (4) B, the threshold Bayesian probit binary model (TGBLUP), with a probability threshold of 0.5 to classify cultivars as top or non-top; and (5) BO, which is TGBLUP with an optimal probability threshold that yields similar Sensitivity and Specificity. We also present a benchmark comparison of existing approaches for selecting the top (or bottom) performers, using five real datasets. For methods that require extensive tuning, we suggest a streamlined tuning approach that substantially reduces implementation time without notably compromising performance. The RO method outperformed the other models across the five real datasets in terms of the F1 score: it was better than models R, B, RC, and BO by 60.87%, 42.37%, 17.63%, and 9.62%, respectively. In terms of the Kappa coefficient, RO was better than models B, BO, R, and RC by 37.46%, 36.21%, 52.18%, and 3.95%, respectively. In terms of Sensitivity, RO outperformed models B, R, and RC by 145.74%, 250.41%, and 86.20%, respectively. The second-best model was BO. In the first stage, BO trains a classification model and RO trains a regression model to classify lines as top (bottom) or not top (not bottom). In the second stage, both approaches optimize the classification threshold so as to minimize the difference between Sensitivity and Specificity. The BO and RO methods are superior for selecting the top (or bottom) lines, and we therefore encourage breeders to adopt them to increase genetic gain in plant breeding programs. |
| format | Journal Article |
| id | CGSpace173478 |
| institution | CGIAR Consortium |
| language | English |
| publishDate | 2025 |
| publisher | MDPI |
| record_format | dspace |
| spelling | CGSpace173478 2025-12-08T10:29:22Z 2025 2025-03-04T04:38:22Z 2025-03-04T04:38:22Z Journal Article https://hdl.handle.net/10568/173478 en Open Access application/pdf MDPI Montesinos-López, O. A., Kismiantini, Alemu, A., Montesinos-López, A., Montesinos-López, J. C., & Crossa, J. (2025). Balancing sensitivity and specificity enhances top and bottom ranking in genomic prediction of cultivars. Plants, 14(3), 308. https://doi.org/10.3390/plants14030308 |
| title | Balancing sensitivity and specificity enhances top and bottom ranking in genomic prediction of cultivars |
| topic | plant breeding; marker-assisted selection; bayesian theory; datasets |
| url | https://hdl.handle.net/10568/173478 |
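
The second-stage idea in the abstract, choosing the classification threshold that minimizes the difference between Sensitivity and Specificity, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`confusion`, `sens_spec`, `balanced_threshold`, `f1_score`), the toy labels and scores, and the choice to search only over the observed scores are all assumptions made for illustration. In practice the scores would be predictions from a first-stage model (regression values for RO, probabilities for BO).

```python
from typing import Sequence, Tuple


def confusion(y_true: Sequence[int], scores: Sequence[float],
              threshold: float) -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn) when a score >= threshold is called 'top'."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_top = s >= threshold
        if y == 1 and predicted_top:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_top:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sens_spec(y_true: Sequence[int], scores: Sequence[float],
              threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, tn, fn = confusion(y_true, scores, threshold)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def balanced_threshold(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pick the observed score that minimizes |sensitivity - specificity|."""
    best_t, best_gap = None, float("inf")
    for t in sorted(set(scores)):  # hypothetical search grid: observed scores
        sens, spec = sens_spec(y_true, scores, t)
        gap = abs(sens - spec)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t


def f1_score(y_true: Sequence[int], scores: Sequence[float],
             threshold: float) -> float:
    """F1 = 2*tp / (2*tp + fp + fn) at the given threshold."""
    tp, fp, _, fn = confusion(y_true, scores, threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (made up): 1 = truly a top line, 0 = not top.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical first-stage predictions for the same ten lines.
scores = [0.9, 0.7, 0.4, 0.8, 0.5, 0.3, 0.2, 0.1, 0.35, 0.15]

threshold = balanced_threshold(y_true, scores)
sens, spec = sens_spec(y_true, scores, threshold)
f1 = f1_score(y_true, scores, threshold)
print(threshold, round(sens, 3), round(spec, 3), round(f1, 3))
```

Because top lines are typically a small fraction of the candidates, a fixed cut-off tends to favor Specificity over Sensitivity; searching for the threshold where the two are closest is one simple way to counter that imbalance. In a real analysis the threshold would be tuned on training data only and then applied to the test lines.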