Adversarial Attacks

Published: 2022-12-16

Updated 2022-12-16

SAIF: Sparse Adversarial and Interpretable Attack Framework

Authors: Tooba Imtiaz, Morgan Kohler, Jared Miller, Zifeng Wang, Mario Sznaier, Octavia Camps, Jennifer Dy

Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. The addition of calculated small distortion to images, for instance, can deceive a well-trained image classification network. In this work, we propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF). Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbations for bounded magnitude and sparsity with $O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
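To make the Frank-Wolfe idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes the perturbation is factored as an elementwise product of a sparsity mask `s` (entries in [0, 1], with L1 norm at most `k`) and a magnitude vector `p` (L-infinity norm at most `eps`). It also assumes a toy linear score `w · (x + s ⊙ p)` that the attacker wants to lower, and the standard `2 / (t + 2)` step size. All function names, the toy objective, and the starting point are hypothetical choices made for illustration.

```python
def lmo_linf(grad, eps):
    """Linear minimization oracle: minimize <grad, v> over ||v||_inf <= eps."""
    return [-eps if g > 0 else (eps if g < 0 else 0.0) for g in grad]


def lmo_k_sparse(grad, k):
    """Minimize <grad, v> over {v in [0, 1]^n, ||v||_1 <= k}.

    The minimizer puts 1 on (at most) the k most negative gradient entries.
    """
    order = sorted(range(len(grad)), key=lambda i: grad[i])
    chosen = {i for i in order[:k] if grad[i] < 0}
    return [1.0 if i in chosen else 0.0 for i in range(len(grad))]


def frank_wolfe_sparse_attack(w, eps, k, steps=50):
    """Jointly update mask s and magnitude p for the toy score w . (s * p)."""
    n = len(w)
    s = [k / n] * n  # feasible start (assumes k <= n): sum of entries is k
    p = [0.0] * n    # start with no perturbation
    for t in range(steps):
        # Gradients of the toy objective sum_i w_i * s_i * p_i.
        grad_p = [wi * si for wi, si in zip(w, s)]
        grad_s = [wi * pi for wi, pi in zip(w, p)]
        # Frank-Wolfe: move toward the vertex returned by each oracle.
        v_p = lmo_linf(grad_p, eps)
        v_s = lmo_k_sparse(grad_s, k)
        gamma = 2.0 / (t + 2)
        p = [(1 - gamma) * pi + gamma * vi for pi, vi in zip(p, v_p)]
        s = [(1 - gamma) * si + gamma * vi for si, vi in zip(s, v_s)]
    return s, p


def score_change(w, s, p):
    """Change in the toy linear score caused by the perturbation s * p."""
    return sum(wi * si * pi for wi, si, pi in zip(w, s, p))


w = [0.5, -2.0, 0.1, 3.0, -0.2]
s, p = frank_wolfe_sparse_attack(w, eps=0.1, k=2)
```

Because every iterate is a convex combination of feasible points, `s` and `p` stay inside their constraint sets without any projection step, which is the main appeal of Frank-Wolfe here. In an actual attack, the two gradients would presumably come from backpropagating a classification loss through the network rather than from a closed-form linear score.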

木子已

https://ipaper.today/2022/12/16/2022-12-16-dui-kang-gong-ji/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when reposting.
