Abstract

The article presents preliminary experiments on how accent affects the performance of the Whisper automatic speech recognition (ASR) system, specifically for Polish-language medical speech. A literature review revealed few studies on the influence of accents on speech recognition systems for Polish, especially where medical terminology is concerned. In the experiments, the voices of selected individuals were cloned and given prosodic contours with Russian and German accents; the resulting samples were transcribed with all available models from the Whisper family, and each output was compared with the original transcription. The results of these initial experiments suggest that the Whisper models struggle with foreign accents in Polish medical speech. This highlights the need for further research aimed at improving ASR systems for transcribing the speech of medical personnel.
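
The abstract does not state which metric was used to compare the model output with the original transcription. A common choice for this kind of comparison is word error rate (WER), so the following is only a minimal sketch under that assumption, not the author's evaluation code. The function name `word_error_rate` and the example sentences are hypothetical and do not come from the study's data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumed metric: the abstract does not say how transcriptions were compared.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of four words is mis-transcribed.
reference = "pacjent ma zapalenie płuc"
hypothesis = "pacjent ma zapalenie pluc"
print(word_error_rate(reference, hypothesis))  # 0.25
```

Under this assumption, the score for an accented sample can be set against the score for the unmodified recording of the same speaker, which isolates the effect of the accent from the baseline difficulty of the medical vocabulary.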

Recommended Citation

Zaporowski, S. (2024). The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish. In B. Marcinkowski, A. Przybylek, A. Jarzębowicz, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Harnessing Opportunities: Reshaping ISD in the post-COVID-19 and Generative AI Era (ISD2024 Proceedings). Gdańsk, Poland: University of Gdańsk. ISBN: 978-83-972632-0-8. https://doi.org/10.62036/ISD.2024.110

Paper Type

Poster

DOI

10.62036/ISD.2024.110

