Bemba Speech Translation: Exploring a Low-Resource African Language

A cascaded Bemba-to-English speech translation system built on Whisper and NLLB-200 uses data augmentation, such as back-translation, and examines the effect of synthetic data.

Year: 2025
Venue: arXiv 2025
Authors: 3

Abstract & full text: arxiv.org/abs/2505.02518v3

Abstract

This paper describes our system submission to the low-resource languages track of the International Conference on Spoken Language Translation (IWSLT 2025), specifically for Bemba-to-English speech translation. We built cascaded speech translation systems based on Whisper and NLLB-200 and employed data augmentation techniques such as back-translation. We investigate the effect of using synthetic data and discuss our experimental setup.
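
As a rough illustration of the two ideas named in the abstract, the sketch below shows the general shape of a cascaded pipeline (speech recognition followed by text translation) and of back-translation (turning monolingual target-side text into synthetic training pairs). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `cascade_translate` and `back_translate`, the `Transcriber` and `Translator` type aliases, and the toy stub stages are all hypothetical. In a real system the stubs would presumably be replaced by a Whisper-based recognizer and an NLLB-200-based translator.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical stage signatures, introduced only for this illustration.
Transcriber = Callable[[bytes], str]  # audio -> source-language transcript
Translator = Callable[[str], str]     # text in one language -> text in another


def cascade_translate(audio: bytes, asr: Transcriber, mt: Translator) -> str:
    """Cascade: transcribe the audio first, then translate the transcript."""
    transcript = asr(audio)
    return mt(transcript)


def back_translate(
    target_sentences: Iterable[str], reverse_mt: Translator
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from monolingual target text.

    Each target sentence is translated "backwards" into the source language;
    the machine-made source is paired with the original, human-written target.
    """
    pairs = []
    for target in target_sentences:
        synthetic_source = reverse_mt(target)
        pairs.append((synthetic_source, target))
    return pairs


# Toy stand-ins so the sketch runs without downloading any model.
# They only tag their input; they perform no real recognition or translation.
def toy_asr(audio: bytes) -> str:
    return f"bem({len(audio)} bytes)"


def toy_bem_to_en(text: str) -> str:
    return f"en({text})"


def toy_en_to_bem(text: str) -> str:
    return f"bem({text})"


if __name__ == "__main__":
    print(cascade_translate(b"\x00\x01", toy_asr, toy_bem_to_en))
    print(back_translate(["hello", "good morning"], toy_en_to_bem))
```

Keeping each stage behind a plain callable means either component can be swapped or fine-tuned independently, which is the usual motivation for a cascade. The trade-off is that recognition errors propagate into the translation step.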
