A Comprehensive Database of CGIAR Climate-Related Journal Articles (2012–2023)
This dataset contains bibliographic metadata for 3,450 peer-reviewed journal articles used in the 2024 synthesis of CGIAR work on climate change. The metadata was retrieved from eight CGIAR institutional repositories, processed using a Python-based extract, transform, and load (ETL) pipeline, and sc...
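| Main Authors: | Orth, Alan S.; Bosire, Caroline K.; Rabago, Laura; Vaidya, Shrijana; Rajbhandari, Sitashma; Pradhan, Prajal; Mukherji, Aditi |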
|---|---|
| Format: | Dataset |
| Language: | English |
| Published: | International Livestock Research Institute, 2024 |
| Subjects: | climate change |
| Online Access: | https://hdl.handle.net/10568/163158 |
| _version_ | 1855537144605442048 |
|---|---|
| author | Orth, Alan S.; Bosire, Caroline K.; Rabago, Laura; Vaidya, Shrijana; Rajbhandari, Sitashma; Pradhan, Prajal; Mukherji, Aditi |
| author_browse | Bosire, Caroline K.; Mukherji, Aditi; Orth, Alan S.; Pradhan, Prajal; Rabago, Laura; Rajbhandari, Sitashma; Vaidya, Shrijana |
| author_facet | Orth, Alan S.; Bosire, Caroline K.; Rabago, Laura; Vaidya, Shrijana; Rajbhandari, Sitashma; Pradhan, Prajal; Mukherji, Aditi |
| author_sort | Orth, Alan S. |
| collection | Repository of Agricultural Research Outputs (CGSpace) |
| description | This dataset contains bibliographic metadata for 3,450 peer-reviewed journal articles used in the 2024 synthesis of CGIAR work on climate change. The metadata was retrieved from eight CGIAR institutional repositories, processed using a Python-based extract, transform, and load (ETL) pipeline, and screened for climate change relevance in Rayyan.
Through harvesting, we identified 5,487 journal articles in CGIAR repositories that matched the following inclusion criteria:
- Issue date between 2012 and 2023
- The words "climate change" in the title, abstract, or keywords
- English language
- DOI assigned
The bibliographic metadata was merged and normalized to ensure consistent use of date formats, multi-value separators, and identifiers. The ETL pipeline used titles and DOIs to identify and remove duplicates, and it excluded records that had been included erroneously because of incorrect repository metadata that we could identify (mislabeled preprints, non-English articles, etc.). We used Crossref, Unpaywall, and OpenAlex to fill in missing metadata such as usage rights (license), access rights, affiliations, and publishers, because this information can be valuable to researchers. Minor normalization was performed on affiliations, countries, and publishers; all other metadata was used as-is from the respective repositories.
A total of 4,495 journal articles were uploaded to the Rayyan platform for blinded screening of climate change relevance by a team trained in systematic literature review methodology. Reviewers excluded journal articles that they did not deem climate change related or that they identified as further duplicates.
This dataset is useful for understanding CGIAR’s research on climate change. Potential future work includes using machine learning to classify the articles by thematic area.
The Python code used to perform the harvesting and processing of this dataset can be found on GitHub: https://github.com/ilri/cgiar-climate-change-synthesis |
| format | Dataset |
| id | CGSpace163158 |
| institution | CGIAR Consortium |
| language | English |
| publishDate | 2024 |
| publishDateRange | 2024 |
| publishDateSort | 2024 |
| publisher | International Livestock Research Institute |
| publisherStr | International Livestock Research Institute |
| record_format | dspace |
| spelling | CGSpace163158 2025-12-08T10:04:27Z A Comprehensive Database of CGIAR Climate-Related Journal Articles (2012–2023) Orth, Alan S. Bosire, Caroline K. Rabago, Laura Vaidya, Shrijana Rajbhandari, Sitashma Pradhan, Prajal Mukherji, Aditi climate change 2024-12-06T11:45:29Z 2024-12-06T11:45:29Z Dataset https://hdl.handle.net/10568/163158 en Open Access International Livestock Research Institute Alan Orth, Caroline K. Bosire, Laura Rabago, Shrijana Vaidya, Sitashma Rajbhandari, Prajal Pradhan, Aditi Mukherji. (5/12/2024). A Comprehensive Database of CGIAR Climate-Related Journal Articles (2012–2023) [Bibliographic metadata]. |
| title | A Comprehensive Database of CGIAR Climate-Related Journal Articles (2012–2023) |
| title_sort | comprehensive database of cgiar climate related journal articles 2012 2023 |
| topic | climate change |
| url | https://hdl.handle.net/10568/163158 |
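
The inclusion criteria and the title/DOI deduplication described in the record above can be illustrated with a minimal sketch. This is not the code from the linked GitHub repository: the record layout (`title`, `abstract`, `keywords`, `year`, `language`, `doi`), the helper names, and the sample records are all hypothetical, and the real pipeline may normalize and match records differently. The final lines only restate the arithmetic implied by the counts in the description, assuming the 3,450 articles are those retained after Rayyan screening.

```python
import re

# Hypothetical record layout and helper names; the actual pipeline's
# schema and matching rules are not shown in this record.

def normalize_doi(doi):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)

def normalize_title(title):
    """Casefold a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.casefold()).strip()

def meets_criteria(rec):
    """Apply the four inclusion criteria listed in the description."""
    text = " ".join(
        [rec.get("title", ""), rec.get("abstract", "")] + rec.get("keywords", [])
    ).lower()
    return (
        2012 <= rec.get("year", 0) <= 2023
        and "climate change" in text
        and rec.get("language") == "en"
        and bool(rec.get("doi"))
    )

def deduplicate(records):
    """Keep the first record seen for each normalized DOI or title."""
    seen_dois, seen_titles, unique = set(), set(), []
    for rec in records:
        doi = normalize_doi(rec["doi"])
        title = normalize_title(rec["title"])
        if doi in seen_dois or title in seen_titles:
            continue
        seen_dois.add(doi)
        seen_titles.add(title)
        unique.append(rec)
    return unique

# Invented sample records, for illustration only.
records = [
    {"title": "Climate change and maize yields", "abstract": "", "keywords": [],
     "year": 2015, "language": "en", "doi": "10.1000/abc"},
    {"title": "Climate Change and Maize Yields.", "abstract": "", "keywords": [],
     "year": 2015, "language": "en", "doi": "https://doi.org/10.1000/ABC"},
    {"title": "Livestock feed trials", "abstract": "", "keywords": ["feeds"],
     "year": 2018, "language": "en", "doi": "10.1000/def"},
    {"title": "Adapting to climate change", "abstract": "", "keywords": [],
     "year": 2011, "language": "en", "doi": "10.1000/old"},
    {"title": "Drought tolerance in sorghum", "abstract": "",
     "keywords": ["climate change"], "year": 2020, "language": "en",
     "doi": "10.1000/ghi"},
    {"title": "Climate change policy", "abstract": "", "keywords": [],
     "year": 2019, "language": "en", "doi": ""},
]

eligible = [rec for rec in records if meets_criteria(rec)]
unique = deduplicate(eligible)
print(len(records), len(eligible), len(unique))

# Counts taken from the description: harvested, sent to screening, final dataset.
harvested, screened, included = 5487, 4495, 3450
removed_before_screening = harvested - screened
excluded_in_screening = screened - included
print(removed_before_screening, excluded_in_screening)
```

In this sketch, filtering runs before deduplication so that records without a DOI never reach the DOI comparison. Matching on either the DOI or the normalized title is a deliberately strict choice; a pipeline could instead require both to match, or use fuzzy title matching, depending on how noisy the repository metadata is.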