Abstract
This paper presents a method for generating corpora of simulated Polish speech recordings in complex acoustic environments. Beyond acoustic scene noise and reverberation, the proposed method adds a layer of unpredictable sound events, which sets the solution apart. Using publicly available data sources, we generated a corpus comprising over 277 hours of training examples and over 5.5 hours of test examples. We then trained the Conv-TasNet network on the generated data for two tasks: enhancing single-speaker speech and separating two speakers from complex noise. The experimental results indicate the potential of the generated corpora for these tasks. Researchers can use the publicly available code to create their own corpora tailored to the Polish language and to address various speech-related tasks.
Paper Type
Full Paper
DOI
10.62036/ISD.2024.37
Developing a Corpus for Polish Speech Enhancement by Reducing Noise, Reverberation, and Disruptions
Recommended Citation
Kleć, M., Szklanny, K. & Wieczorkowska, A. (2024). Developing a Corpus for Polish Speech Enhancement by Reducing Noise, Reverberation, and Disruptions. In B. Marcinkowski, A. Przybylek, A. Jarzębowicz, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Harnessing Opportunities: Reshaping ISD in the post-COVID-19 and Generative AI Era (ISD2024 Proceedings). Gdańsk, Poland: University of Gdańsk. ISBN: 978-83-972632-0-8. https://doi.org/10.62036/ISD.2024.37