CGIAR Climate Change Synthesis Scripts

Bibliographic Details
Main Author: Orth, Alan S.
Format: Source Code
Published: International Livestock Research Institute 2024
Subjects: python
Online Access: https://hdl.handle.net/10568/163212
author Orth, Alan S.
collection Repository of Agricultural Research Outputs (CGSpace)
description Code used to generate datasets for the 2024 synthesis of CGIAR work on climate change. Items matching the inclusion criteria were retrieved from eight CGIAR institutional repositories. This Python-based extract, transform, and load (ETL) pipeline filtered, merged, and normalized the metadata to ensure consistent use of date formats, multi-value separators, and identifiers. Naive deduplication was performed using titles and DOIs. Items found to have been included erroneously because of incorrect repository metadata (mislabeled preprints, non-English items, etc.) were excluded. Crossref and Unpaywall were used to fill gaps in metadata such as usage rights (license) and access rights, because this information can be valuable to researchers. All other metadata was used as-is from the respective repositories. Bibliographic metadata in the CSV output is oriented towards use with the Rayyan platform for systematic literature review.
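The normalization and naive deduplication steps named in the description could look roughly like the sketch below. It is not the published pipeline: the function names, the record layout (dictionaries with "title" and "doi" keys), the "||" input separator, and the example DOIs are all assumptions made for illustration.

```python
import re
import unicodedata


def normalize_doi(value):
    """Reduce a DOI string or URL to a bare lowercase DOI, or None if empty."""
    if not value:
        return None
    doi = value.strip().lower()
    # Strip common prefixes such as "https://doi.org/" or "doi:".
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)
    return doi or None


def normalize_title(value):
    """Fold case, accents, punctuation, and whitespace for title comparison."""
    text = unicodedata.normalize("NFKD", value or "")
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def normalize_multivalue(value, separator="; "):
    """Rewrite multi-value fields to one separator (assumes "||" or ";" on input)."""
    parts = re.split(r"\s*(?:\|\||;)\s*", value or "")
    return separator.join(part for part in parts if part)


def deduplicate(records):
    """Keep the first record seen for each normalized DOI or normalized title."""
    seen_dois, seen_titles = set(), set()
    unique = []
    for record in records:
        doi = normalize_doi(record.get("doi"))
        title = normalize_title(record.get("title"))
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # duplicate by DOI or by title
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(record)
    return unique


# Hypothetical records; the DOIs are placeholders, not real identifiers.
records = [
    {"title": "Climate-Smart Agriculture in Kenya", "doi": "https://doi.org/10.1234/ABC"},
    {"title": "Climate smart agriculture in Kenya.", "doi": None},
    {"title": "A different title", "doi": "doi:10.1234/abc"},
    {"title": "Livestock and drought", "doi": "10.1234/xyz"},
]
unique_records = deduplicate(records)
```

With these example records, the second entry is dropped as a title match and the third as a DOI match. Matching on either key is deliberately naive: it will merge distinct items that share a title, so a real pipeline would likely need manual review of the dropped records.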
format Source Code
id CGSpace163212
institution CGIAR Consortium
publishDate 2024
publisher International Livestock Research Institute
record_format dspace
accessRights Open Access
dates 2024-12-09; 2024-12-09T12:45:57Z; 2025-01-28T12:21:20Z
citation Orth, A. 2024. CGIAR Climate Change Synthesis Scripts v1.0.0. Source Code. Nairobi, Kenya: ILRI.
title CGIAR Climate Change Synthesis Scripts
topic python
url https://hdl.handle.net/10568/163212