Longa: An automated speech recognition tool for Bantu languages

Farm Radio International (FRI) and the CGIAR Research Initiative on Digital Innovation have collaborated on the development of an end-to-end, automatic speech recognition pipeline for the transcription, translation, and analysis of Swahili and Luganda. This task is particularly challenging due to...

Bibliographic Details
Main authors: Mganga, Nelson; Jones-Garcia, Eliot; Monsalue, Andrea Gardeazabal; Koo, Jawoo
Format: Technical report
Language: English
Published: International Food Policy Research Institute 2023
Subjects: artificial intelligence; innovation adoption; languages; farmers
Online access: https://hdl.handle.net/10568/137177
author Mganga, Nelson
Jones-Garcia, Eliot
Monsalue, Andrea Gardeazabal
Koo, Jawoo
collection Repository of Agricultural Research Outputs (CGSpace)
description Farm Radio International (FRI) and the CGIAR Research Initiative on Digital Innovation have collaborated on the development of an end-to-end, automatic speech recognition pipeline for the transcription, translation, and analysis of Swahili and Luganda. This task is particularly challenging due to the number of languages used by FRI's clients and the limited training data available for speech recognition in African languages. The tool is named 'Longa', or 'Let's chat' in Swahili. Longa will be used to answer the surplus of phone calls currently being received from smallholder farmers asking questions about radio programs, which FRI does not presently have the capacity to address. When fully implemented, Longa should allow FRI to design their broadcasts more closely in line with the needs of farmers and better deliver insights to those most in need, such as female and youth farmers. Key results from the collaboration include: a series of design principles, iteratively and collaboratively developed to reflect the common values and goals of FRI and the CGIAR; a proof of concept for Longa, building on open-source models and open access corpora, to be shared with the developer community upon completion of the final tool; a 10% improvement upon the state-of-the-art automatic speech recognition in Luganda radio-speech performance and accuracy; some improvement in performance with audio enhancement processes using real-world data; and proof that fine-tuning is an effective approach to expanding Longa to new languages. The next steps of the collaboration will focus on the analysis and interpretation of an aggregation of farmer phone calls and integration with the existing FRI workflow and software.
format Technical report
id CGSpace137177
institution CGIAR Consortium
language English
publishDate 2023
publisher International Food Policy Research Institute
record_format dspace
spelling CGSpace137177 2025-11-06T07:10:25Z 2023-12-31 2024-01-04T20:42:09Z 2024-01-04T20:42:09Z Report https://hdl.handle.net/10568/137177 en Open Access application/pdf International Food Policy Research Institute Mganga, Nelson; Jones-Garcia, Eliot; Monsalue, Andrea Gardeazabal; and Koo, Jawoo. 2023. Longa: An automated speech recognition tool for Bantu languages. Digital Innovation Technical Report December 2023. Washington, DC: International Food Policy Research Institute (IFPRI). https://hdl.handle.net/10568/137177
title Longa: An automated speech recognition tool for Bantu languages
topic artificial intelligence
innovation adoption
languages
farmers
url https://hdl.handle.net/10568/137177