Updated: 2023-05-12
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Authors: Dima Rekesh, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Ankur Kumar, Boris Ginsburg
Conformer-based models have become the dominant end-to-end architecture for speech processing tasks. In this work, we propose a carefully redesigned Conformer with a new down-sampling schema. The proposed model, named Fast Conformer, is 2.8x faster than the original Conformer while preserving state-of-the-art accuracy on Automatic Speech Recognition benchmarks. We also replace the original Conformer's global attention with limited-context attention post-training to enable transcription of hour-long audio, and we further improve long-form transcription by adding a global token. Combined with a Transformer decoder, Fast Conformer also outperforms the original Conformer in both accuracy and speed on Speech Translation and Spoken Language Understanding.
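The limited-context mechanism described above can be pictured as banded self-attention in which a designated global token still attends to, and is attended by, every position. Below is a minimal NumPy sketch under that assumption; it is not the paper's implementation, and the function name `limited_context_attention` and the `context` and `num_global` parameters are hypothetical, introduced only for illustration.

```python
import numpy as np

def limited_context_attention(q, k, v, context=64, num_global=1):
    """Single-head attention restricted to a local window of +/- `context`
    positions; the first `num_global` positions attend to, and are attended
    by, all positions. q, k, v have shape (T, d)."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)                            # (T, T) logits
    idx = np.arange(T)
    local = np.abs(idx[:, None] - idx[None, :]) <= context   # banded local mask
    glob = (idx[:, None] < num_global) | (idx[None, :] < num_global)
    scores = np.where(local | glob, scores, -np.inf)         # mask everything else
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # row-wise softmax
    return weights @ v

# Example: 1,000 frames of 8-dimensional features, self-attention.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 8))
out = limited_context_attention(x, x, x, context=128, num_global=1)
print(out.shape)  # (1000, 8)
```

Note that this sketch still materializes the full T x T score matrix for clarity; since each position only scores against at most 2*context + 1 local keys plus the global tokens, an efficient implementation would compute the banded scores directly, giving the linear scaling in sequence length referenced in the title.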