Benchmarking database systems for Genomic Selection implementation
With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. To make use of this information effectively requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set...
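| Main Authors: | Nti-Addae, Y., Matthews, D., Jun Ulat, V., Syed, R., Sempéré, G., Pétel, A., Renner, J., Larmande, Pierre, Guignon, Valentin, Jones, E., Robbins, K. |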
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | Oxford University Press 2019 |
| Subjects: | information systems, information storage, data, databases, genotypes, plant breeding |
| Online Access: | https://hdl.handle.net/10568/105632 |
| _version_ | 1855530580232372224 |
|---|---|
| author | Nti-Addae, Y. Matthews, D. Jun Ulat, V. Syed, R. Sempéré, G. Pétel, A. Renner, J. Larmande, Pierre Guignon, Valentin Jones, E. Robbins, K. |
| author_browse | Guignon, Valentin Jones, E. Jun Ulat, V. Larmande, Pierre Matthews, D. Nti-Addae, Y. Pétel, A. Renner, J. Robbins, K. Sempéré, G. Syed, R. |
| author_sort | Nti-Addae, Y. |
| collection | Repository of Agricultural Research Outputs (CGSpace) |
| description | With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Using this information effectively requires DNA extraction and marker production facilities that can efficiently deploy the desired set of markers across samples, with a turnaround time rapid enough to allow selection before crosses need to be made. In reality, breeders often have only a short window in which to make decisions once they have collected all their phenotyping data and received the corresponding genotyping data. This makes it a challenge to organize the information and use it in downstream analyses that support breeders' decisions. To implement genomic selection routinely as part of breeding programs, one would need an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. Results: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix. |
| format | Journal Article |
| id | CGSpace105632 |
| institution | CGIAR Consortium |
| language | English |
| publishDate | 2019 |
| publisher | Oxford University Press |
| record_format | dspace |
| spelling | CGSpace105632 2025-11-12T05:44:24Z Benchmarking database systems for Genomic Selection implementation Nti-Addae, Y. Matthews, D. Jun Ulat, V. Syed, R. Sempéré, G. Pétel, A. Renner, J. Larmande, Pierre Guignon, Valentin Jones, E. Robbins, K. information systems information storage data databases genotypes plant breeding 2019-01-01 2019-11-05T09:55:52Z 2019-11-05T09:55:52Z Journal Article https://hdl.handle.net/10568/105632 en Open Access application/pdf Oxford University Press Nti-Addae, Y.; Matthews, D.; Jun Ulat, V.; Syed, R.; Sempéré, G.; Pétel, A.; Renner, J.; Larmande, P.; Guignon, V.; Jones, E.; Robbins, K. (2019) Benchmarking database systems for Genomic Selection implementation. Database vol 2019, Article ID: baz096. ISSN: 1758-0463 |
| title | Benchmarking database systems for Genomic Selection implementation |
| topic | information systems information storage data databases genotypes plant breeding |
| url | https://hdl.handle.net/10568/105632 |
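
The abstract's remark that extract times depend on the orientation in which genotype data is stored can be illustrated with a small sketch. This is not the benchmark from the article and uses none of the six systems it tested. The toy matrix, the 0/1/2 call coding, and the helper names `flatten`, `read_row` and `read_col` are all assumptions made for illustration. The sketch only shows why the same query is one contiguous read in one orientation and a strided read in the other.

```python
from array import array


def flatten(matrix):
    """Store a 2-D list row by row in one contiguous array of signed bytes."""
    return array("b", [call for row in matrix for call in row])


def read_row(flat, n_cols, i):
    """Contiguous read: a single slice returns the whole stored row."""
    return list(flat[i * n_cols:(i + 1) * n_cols])


def read_col(flat, n_cols, j):
    """Strided read: one element is picked out of every stored row."""
    return list(flat[j::n_cols])


# Hypothetical toy data: 3 samples x 4 markers, calls coded 0/1/2.
calls = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
n_samples, n_markers = len(calls), len(calls[0])

# Two orientations of the same allele matrix.
sample_major = flatten(calls)                           # stored rows are samples
marker_major = flatten([list(c) for c in zip(*calls)])  # stored rows are markers

# All calls for marker 2, fetched from each orientation.
strided = read_col(sample_major, n_markers, 2)     # touches every stored row
contiguous = read_row(marker_major, n_samples, 2)  # touches one stored row
```

Both reads return the same calls. On disk, though, the strided access pattern generally touches far more of the file than the contiguous one, which is one plausible reason a system that handles both orientations well would do better across mixed queries.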