Speech

Published: 2024-01-18

Updated 2024-01-18

Binaural Angular Separation Network

Authors: Yang Yang, George Sung, Shao-Fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann

We propose a neural network model that can separate target speech sources from interfering sources at different angular regions using two microphones. The model is trained with simulated room impulse responses (RIRs) using omni-directional microphones without needing to collect real RIRs. By relying on specific angular regions and multiple room simulations, the model utilizes consistent time difference of arrival (TDOA) cues, or what we call delay contrast, to separate target and interference sources while remaining robust in various reverberation environments. We demonstrate the model is not only generalizable to a commercially available device with a slightly different microphone geometry, but also outperforms our previous work which uses one additional microphone on the same device. The model runs in real-time on-device and is suitable for low-latency streaming applications such as telephony and video conferencing.
Accepted to ICASSP 2024
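
To make the TDOA cue mentioned in the abstract concrete, here is a minimal sketch of how the arrival-time difference between two microphones varies with source angle. It is not the paper's model: it uses the standard far-field (plane-wave) approximation without reverberation, and the function names, the 0.10 m microphone spacing, and the 16 kHz sample rate are all illustrative assumptions, not values taken from the paper.

```python
import math

# Approximate speed of sound in air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def far_field_tdoa(angle_deg: float, mic_spacing_m: float) -> float:
    """Return the TDOA in seconds between two microphones.

    Assumes a far-field (plane-wave) source and no reverberation.
    angle_deg is measured from the axis passing through both
    microphones: 0 is endfire, 90 is broadside.
    """
    return mic_spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S


def tdoa_in_samples(angle_deg: float, mic_spacing_m: float, sample_rate_hz: float) -> float:
    """Convert the far-field TDOA to a (fractional) number of samples."""
    return far_field_tdoa(angle_deg, mic_spacing_m) * sample_rate_hz


if __name__ == "__main__":
    spacing_m = 0.10   # hypothetical microphone spacing
    rate_hz = 16000    # hypothetical sample rate
    for angle in (0, 45, 90, 135, 180):
        delay = tdoa_in_samples(angle, spacing_m, rate_hz)
        print(f"angle={angle:3d} deg  delay={delay:+.2f} samples")
```

Under these assumptions, sources in different angular regions produce delays of different sign and magnitude, which is the kind of contrast a two-microphone separator can exploit. Real rooms add reflections, so the observed delays are noisier than this idealized picture.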


木子已

https://ipaper.today/2024/01/18/2024-01-18-speech/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when reposting.

