By Chong Xiang and Prateek Mittal
In our previous post, we discussed adversarial patch attacks and presented our first defense algorithm, PatchGuard. The PatchGuard framework (small receptive field + secure aggregation) has become the most popular defense strategy over the past year, subsuming a long list of defense instances (Clipped BagNet, De-randomized Smoothing, BagCert, Randomized Cropping, PatchGuard++, ScaleCert, Smoothed ViT, ECViT). In this post, we present a different way of building robust image classification models: PatchCleanser. Instead of using small receptive fields to suppress the adversarial effect, PatchCleanser directly masks out adversarial pixels in the input image. This design makes PatchCleanser compatible with any high-performance image classifier and allows it to achieve state-of-the-art defense performance.
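To make the masking idea concrete, below is a minimal PyTorch sketch, not the actual PatchCleanser implementation: it occludes a candidate patch location with a mask and feeds the masked image to an ordinary off-the-shelf classifier (an ImageNet ResNet-50 here; the mask size and stride are illustrative choices). PatchCleanser builds its full defense on top of such masked predictions.

```python
# Minimal sketch of "mask pixels, then classify with any model".
# This is NOT the full PatchCleanser algorithm (which uses a double-masking
# procedure for certification); model choice, mask size, and stride below
# are illustrative assumptions.
import torch
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1).eval()  # any classifier works


def mask_and_classify(image, top, left, mask_size):
    """Zero out a square region (a candidate patch location) and classify.

    image: tensor of shape (1, 3, H, W), already normalized for the model.
    """
    masked = image.clone()
    masked[:, :, top:top + mask_size, left:left + mask_size] = 0.0  # occlude region
    with torch.no_grad():
        logits = model(masked)
    return logits.argmax(dim=1).item()


# Slide the mask over a coarse grid of candidate locations and collect the
# masked predictions; a defense can then aggregate these predictions.
image = torch.randn(1, 3, 224, 224)  # placeholder input
preds = [mask_and_classify(image, t, l, mask_size=64)
         for t in range(0, 224 - 64 + 1, 32)
         for l in range(0, 224 - 64 + 1, 32)]
```

Because the masking happens purely in the input space, the underlying classifier can be swapped for any state-of-the-art architecture without retraining the defense around a restricted receptive field.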
PatchCleanser: Removing the Dependency on Small Receptive Fields
The limitation of small receptive fields. We have seen that the small receptive field plays a crucial role in PatchGuard: it limits the number of corrupted features and lays the foundation for robustness. However, a small receptive field also limits the information received by each feature and, as a result, hurts clean model performance (when there is no attack). For example, PatchGuard models (BagNet + robust masking) achieve only 55%-60% clean accuracy on the ImageNet dataset, while state-of-the-art undefended models, all of which have large receptive fields, reach 80%-90%.
This huge drop in clean accuracy discourages the real-world deployment of PatchGuard-style defenses. A natural question to ask is:
Can we achieve strong robustness without the use of small receptive fields?