2024-01-22 更新
Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition
Authors:Yu Yu, Chao-Han Huck Yang, Tuan Dinh, Sungho Ryu, Jari Kolehmainen, Roger Ren, Denis Filimonov, Prashanth G. Shivakumar, Ankur Gandhe, Ariya Rastow, Jia Xu, Ivan Bulyko, Andreas Stolcke
The use of low-rank adaptation (LoRA) with frozen pretrained language models (PLMs) has become increasing popular as a mainstream, resource-efficient modeling approach for memory-constrained hardware. In this study, we first explore how to enhance model performance by introducing various LoRA training strategies, achieving relative word error rate reductions of 3.50\% on the public Librispeech dataset and of 3.67\% on an internal dataset in the messaging domain. To further characterize the stability of LoRA-based second-pass speech recognition models, we examine robustness against input perturbations. These perturbations are rooted in homophone replacements and a novel metric called N-best Perturbation-based Rescoring Robustness (NPRR), both designed to measure the relative degradation in the performance of rescoring models. Our experimental results indicate that while advanced variants of LoRA, such as dynamic rank-allocated LoRA, lead to performance degradation in $1$-best perturbation, they alleviate the degradation in $N$-best perturbation. This finding is in comparison to fully-tuned models and vanilla LoRA tuning baselines, suggesting that a comprehensive selection is needed when using LoRA-based adaptation for compute-cost savings and robust language modeling.
PDF
点此查看论文截图
Exploring Color Invariance through Image-Level Ensemble Learning
Authors:Yunpeng Gong, Jiaquan Li, Lifei Chen, Min Jiang
In the field of computer vision, the persistent presence of color bias, resulting from fluctuations in real-world lighting and camera conditions, presents a substantial challenge to the robustness of models. This issue is particularly pronounced in complex wide-area surveillance scenarios, such as person re-identification and industrial dust segmentation, where models often experience a decline in performance due to overfitting on color information during training, given the presence of environmental variations. Consequently, there is a need to effectively adapt models to cope with the complexities of camera conditions. To address this challenge, this study introduces a learning strategy named Random Color Erasing, which draws inspiration from ensemble learning. This strategy selectively erases partial or complete color information in the training data without disrupting the original image structure, thereby achieving a balanced weighting of color features and other features within the neural network. This approach mitigates the risk of overfitting and enhances the model’s ability to handle color variation, thereby improving its overall robustness. The approach we propose serves as an ensemble learning strategy, characterized by robust interpretability. A comprehensive analysis of this methodology is presented in this paper. Across various tasks such as person re-identification and semantic segmentation, our approach consistently improves strong baseline methods. Notably, in comparison to existing methods that prioritize color robustness, our strategy significantly enhances performance in cross-domain scenarios. The code available at \url{https://github.com/layumi/Person\_reID\_baseline\_pytorch/blob/master/random\_erasing.py} or \url{https://github.com/finger-monkey/Data-Augmentation}.
PDF