Attack and Protect Classifiers

Adversarial Reverse Engineering and Classifier Robustness

Attack a Classifier:

In security-sensitive applications such as spam filtering and intrusion detection, deployed classifiers can be targeted by exploratory attacks such as evasion and reverse engineering. For example, an attacker can probe the classifier with queries to reveal confidential information about the training dataset used by the system, or to model the classifier's decision boundary. This raises the question of how to construct artificial queries from scratch. Query synthesis is the branch of active learning that generates such queries, here with the aim of revealing sensitive information about the true decision boundary.

The objective of this study is to learn a deterministic, noise-free halfspace efficiently via query synthesis.

The algorithm was published in the paper:

Ibrahim M Alabdulmohsin, Xin Gao, Xiangliang Zhang, "Efficient Active Learning of Halfspaces via Query Synthesis". In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2015).

Download the MATLAB code of the algorithm
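
To make the idea of query synthesis concrete, here is a minimal Python sketch. It is not the algorithm from the paper above. It assumes a hypothetical black-box classifier that is a homogeneous halfspace in two dimensions (the names `make_oracle`, `learn_halfspace_2d`, and `hidden_w` are illustrative only), and it recovers the halfspace by bisection over synthesized points on the unit circle, so the angular uncertainty about the boundary halves with every query.

```python
import math


def make_oracle(w):
    """Hypothetical black-box classifier: +1 if w.x >= 0, else -1 (noise-free)."""
    def oracle(x):
        return 1 if w[0] * x[0] + w[1] * x[1] >= 0 else -1
    return oracle


def learn_halfspace_2d(oracle, n_steps=40):
    """Estimate the unit normal of a homogeneous 2-D halfspace.

    Queries are synthesized on the unit circle, x(t) = (cos t, sin t).
    Because x(pi) = -x(0), the two endpoints carry opposite labels, so the
    label flips exactly once on [0, pi] and bisection can locate the flip.
    Assumes x(0) does not lie exactly on the decision boundary.
    """
    def point(t):
        return (math.cos(t), math.sin(t))

    lo, hi = 0.0, math.pi
    label_lo = oracle(point(lo))
    for _ in range(n_steps):
        mid = 0.5 * (lo + hi)
        if oracle(point(mid)) == label_lo:
            lo = mid  # the flip lies to the right of mid
        else:
            hi = mid  # the flip lies at or to the left of mid
    t = 0.5 * (lo + hi)  # estimated angle of the decision boundary

    # The normal is perpendicular to the boundary direction; orient it so
    # that angles just below t receive label_lo.
    sign = -label_lo
    return (sign * -math.sin(t), sign * math.cos(t))


hidden_w = (0.6, -0.8)  # illustrative unit-norm weights, unknown to the attacker
w_hat = learn_halfspace_2d(make_oracle(hidden_w))
error = math.hypot(w_hat[0] - hidden_w[0], w_hat[1] - hidden_w[1])
print(w_hat, error)
```

The sketch uses one initial query plus `n_steps` bisection queries and never touches training data, which is what makes this kind of probing a concern for deployed classifiers. Higher-dimensional halfspaces need a different strategy than a one-dimensional bisection.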


Protect a Classifier:

In such adversarial environments, adversaries can mount exploratory attacks against the defender, such as evasion and reverse engineering. We investigate randomization as a strategy for mitigating this risk. In particular, we derive a semidefinite programming (SDP) formulation for learning a distribution of classifiers, subject to the constraint that any single classifier picked at random from this distribution provides reliable predictions with high probability. We analyze the tradeoff between the variance of the distribution and its predictive accuracy, and establish that one can almost always incorporate randomization with large variance without incurring a loss in accuracy.

More details can be found in the paper:

Ibrahim M Alabdulmohsin, Xin Gao, Xiangliang Zhang, "Adding Robustness to Support Vector Machines Against Adversarial Reverse Engineering". In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM 2014).

Download the MATLAB code of the algorithm
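
The prediction-time side of randomization can be illustrated with a short Python sketch. It does not implement the SDP formulation from the paper. Instead it assumes, purely for illustration, an isotropic Gaussian distribution over linear classifiers centered at a hypothetical mean weight vector `mean_w`, and a small hand-made toy dataset. It draws a fresh classifier for each round and reports how often the draws predict the correct label as the standard deviation `sigma` grows.

```python
import random


def sample_classifier(mean_w, sigma, rng):
    """Draw one weight vector from an isotropic Gaussian around mean_w (assumed form)."""
    return [m + rng.gauss(0.0, sigma) for m in mean_w]


def predict(w, x):
    """Linear classifier through the origin: +1 if w.x >= 0, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def randomized_accuracy(mean_w, sigma, data, rng, n_draws=200):
    """Fraction of (draw, example) pairs where a sampled classifier is correct."""
    correct = 0
    for _ in range(n_draws):
        w = sample_classifier(mean_w, sigma, rng)
        correct += sum(predict(w, x) == y for x, y in data)
    return correct / (n_draws * len(data))


# Toy, well-separated data labeled by the sign of the first coordinate.
data = [((3.0, 1.0), 1), ((2.0, -1.0), 1), ((-3.0, 0.5), -1), ((-2.5, -1.0), -1)]
mean_w = (1.0, 0.0)
rng = random.Random(0)  # fixed seed so runs are repeatable

results = {s: randomized_accuracy(mean_w, s, data, rng) for s in (0.0, 0.1, 1.0, 5.0)}
for sigma, acc in results.items():
    print(f"sigma={sigma}: accuracy={acc:.3f}")
```

With `sigma = 0` every draw equals `mean_w`, so an attacker always probes the same decision boundary. Larger values make individual responses less informative about any single boundary, and the printed accuracies show what that costs on this toy data. Choosing the distribution so that accuracy is preserved is what the SDP formulation in the paper addresses.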