SNP-Seek II: A resource for allele mining and analysis of big genomic data in Oryza sativa
The 3000 Rice Genomes Project generated a large dataset of genomic variation to the world’s most important crop, Oryza sativa L. Using the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK) variant calling on this dataset, we identified ∼40 M single-nucleotide polymorphisms (SNPs)....
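| Main authors: | Mansueto, Locedie; Fuentes, Roven Rommel; Chebotarov, Dmytro; Borja, Frances Nikki; Detras, Jeffrey; Abriol-Santos, Juan Miguel; Palis, Kevin; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Hamilton, Ruaraidh Sackville; McNally, Kenneth L.; Alexandrov, Nickolai; Mauleon, Ramil |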
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier, 2016 |
| Online access: | https://hdl.handle.net/10568/165183 |
| _version_ | 1855540872336113664 |
|---|---|
| author | Mansueto, Locedie; Fuentes, Roven Rommel; Chebotarov, Dmytro; Borja, Frances Nikki; Detras, Jeffrey; Abriol-Santos, Juan Miguel; Palis, Kevin; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Hamilton, Ruaraidh Sackville; McNally, Kenneth L.; Alexandrov, Nickolai; Mauleon, Ramil |
| author_sort | Mansueto, Locedie |
| collection | Repository of Agricultural Research Outputs (CGSpace) |
| description | The 3000 Rice Genomes Project generated a large dataset of genomic variation to the world’s most important crop, Oryza sativa L. Using the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK) variant calling on this dataset, we identified ∼40 M single-nucleotide polymorphisms (SNPs). Five reference genomes of rice representing the major variety groups were used: Nipponbare (temperate japonica), IR 64 (indica), 93–11 (indica), DJ 123 (aus), and Kasalath (aus). The results are accessible through the Rice SNP-Seek Database (http://snp-seek.irri.org) and through web services of the application programming interface (API). We incorporated legacy phenotypic and passport data for the sequenced varieties originating from the International Rice Genebank Collection Information System (IRGCIS) and gene models from several rice annotation projects. The massive genotypic data in SNP-Seek are stored using hierarchical data format 5 (HDF5) files for quick retrieval. Germplasm, phenotypic, and genomic data are stored in a relational database management system (RDBMS) using the Chado schema, allowing the use of controlled vocabularies from biological ontologies as query constraints in SNP-Seek. In this paper, we discuss the datasets stored in SNP-Seek, architecture of the database and web application, interoperability methodologies in place, and discuss a few use cases demonstrating the utility of SNP-Seek for diversity analysis and molecular breeding. |
| format | Journal Article |
| id | CGSpace165183 |
| institution | CGIAR Consortium |
| language | English |
| publishDate | 2016 |
| publisher | Elsevier |
| record_format | dspace |
| spelling | CGSpace165183 2024-12-19T14:13:40Z 2016-11 2024-12-19T12:54:48Z 2024-12-19T12:54:48Z Journal Article https://hdl.handle.net/10568/165183 en Open Access Elsevier Mansueto, Locedie; Fuentes, Roven Rommel; Chebotarov, Dmytro; Borja, Frances Nikki; Detras, Jeffrey; Abriol-Santos, Juan Miguel; Palis, Kevin; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Hamilton, Ruaraidh Sackville; McNally, Kenneth L.; Alexandrov, Nickolai and Mauleon, Ramil. 2016. SNP-Seek II: A resource for allele mining and analysis of big genomic data in Oryza sativa. Current Plant Biology, Volume 7-8, p. 16-25 |
| title | SNP-Seek II: A resource for allele mining and analysis of big genomic data in Oryza sativa |
| url | https://hdl.handle.net/10568/165183 |