Detecting adversarial examples with inductive Venn-ABERS predictors

Book Contribution - Book Chapter Conference Contribution

Inductive Venn-ABERS predictors (IVAPs) are a type of probabilistic predictor with the theoretical guarantee that their predictions are perfectly calibrated. We propose to exploit this calibration property for the detection of adversarial examples in binary classification tasks. By rejecting predictions if the uncertainty of the IVAP is too high, we obtain an algorithm that is both accurate on the original test set and significantly more robust to adversarial examples. The method appears to be competitive with the state of the art in adversarial defense, in terms of both robustness and scalability.
Book: ESANN 2019 proceedings : European symposium on artificial neural networks, computational intelligence and machine learning
Pages: 143 - 148
Publication year: 2019
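
Illustrative sketch: the rejection idea described in the abstract can be illustrated with a minimal, unoptimized IVAP in plain Python. This is not the authors' implementation. It assumes that an underlying classifier has already produced real-valued scores for a held-out calibration set, that "uncertainty" means the width p1 - p0 of the IVAP's probability pair, and that a prediction is rejected when this width exceeds a threshold. The names `isotonic_value_at`, `ivap_interval`, `predict_or_reject` and the parameter `beta` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def isotonic_value_at(points, s):
    """Fit a non-decreasing least-squares function to (score, label) pairs
    with pool-adjacent-violators and return its value at score s.
    The score s is expected to be one of the scores in `points`."""
    sums, counts = defaultdict(float), defaultdict(int)
    for score, label in points:  # pool tied scores first
        sums[score] += label
        counts[score] += 1
    blocks = []  # each block: [label sum, count, highest score covered]
    for score in sorted(sums):
        blocks.append([sums[score], counts[score], score])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            total, count, hi = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = hi
    for total, count, hi in blocks:
        if s <= hi:
            return total / count
    raise ValueError("s lies above every score in points")


def ivap_interval(cal_scores, cal_labels, s):
    """Return (p0, p1): the calibrated value at s when the test score is
    added to the calibration set with hypothetical label 0, then label 1."""
    cal = list(zip(cal_scores, cal_labels))
    p0 = isotonic_value_at(cal + [(s, 0)], s)
    p1 = isotonic_value_at(cal + [(s, 1)], s)
    return p0, p1


def predict_or_reject(cal_scores, cal_labels, s, beta):
    """Return 0 or 1, or None when the width p1 - p0 exceeds beta
    (assumed rejection rule for this sketch)."""
    p0, p1 = ivap_interval(cal_scores, cal_labels, s)
    if p1 - p0 > beta:
        return None
    p = p1 / (1.0 - p0 + p1)  # common way to merge the pair into one probability
    return int(p > 0.5)


# Toy calibration data, invented for illustration only.
cal_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
cal_labels = [0, 0, 0, 1, 1, 1]
for s in (0.15, 0.5, 0.85):
    print(s, ivap_interval(cal_scores, cal_labels, s),
          predict_or_reject(cal_scores, cal_labels, s, beta=0.5))
```

The sketch refits the isotonic regression twice per test point for clarity; a practical implementation would precompute the two fits over the calibration set. The threshold `beta` trades off how many clean inputs are rejected against how many suspicious inputs are accepted, and would normally be tuned on held-out data.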