The University of Auckland

Project #91: Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks

Description:

The performance of machine learning models depends on the quality of the underlying data. Malicious actors can attack a model by poisoning its training data. Current detectors are tied to specific data types, models, or attacks, and therefore have limited applicability in real-world scenarios. We have developed a novel fully-agnostic framework, DIVA (Detecting InVisible Attacks), that detects attacks relying solely on analysis of the potentially poisoned dataset. DIVA builds on the idea that a poisoning attack can be detected by comparing the classifier's accuracy on the poisoned data with its accuracy on clean data; since a clean version of the dataset is unavailable, DIVA pre-trains a meta-learner on complexity measures to estimate the otherwise unknown accuracy on a hypothetical clean dataset. The framework applies to generic poisoning attacks; for evaluation purposes, we focus on label-flipping attacks.
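To make the core idea concrete, the following is a minimal sketch (not DIVA's actual implementation) of a label-flipping attack and the accuracy-gap detection rule described above. The nearest-centroid classifier, the synthetic data, and the hard-coded `estimated_clean_acc` (which DIVA would instead obtain from a meta-learner pre-trained on complexity measures) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class dataset: two well-separated Gaussian blobs.
X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)

def flip_labels(y, rate, rng):
    """Label-flipping attack: flip a fraction `rate` of the labels."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y[idx] = 1 - y[idx]
    return y

def nearest_centroid_accuracy(X, y):
    """Train/test a nearest-centroid classifier on a 50/50 split of (X, y)."""
    n = len(y)
    perm = rng.permutation(n)
    tr, te = perm[: n // 2], perm[n // 2 :]
    c0 = X[tr][y[tr] == 0].mean(axis=0)
    c1 = X[tr][y[tr] == 1].mean(axis=0)
    pred = (np.linalg.norm(X[te] - c1, axis=1)
            < np.linalg.norm(X[te] - c0, axis=1)).astype(int)
    return (pred == y[te]).mean()

# Hypothetical meta-learner estimate of the clean-data accuracy
# (in DIVA this comes from a pre-trained meta-learner; here it is hard-coded).
estimated_clean_acc = 0.95
threshold = 0.10  # flag the dataset as poisoned if the gap exceeds this

y_poisoned = flip_labels(y, rate=0.3, rng=rng)
observed_acc = nearest_centroid_accuracy(X, y_poisoned)
is_poisoned = (estimated_clean_acc - observed_acc) > threshold
```

With 30% of labels flipped, the observed accuracy drops well below the estimated clean accuracy, so the gap exceeds the threshold and the dataset is flagged; on the unmodified labels the gap stays below the threshold.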

In this project, we will build on DIVA's existing implementation to set up, run, improve, visualize, and interpret a comprehensive set of experiments, gaining insights into DIVA's behavior, strengths, and weaknesses. The ideal outcome of this project is a conference publication and a detailed report on DIVA's performance.

Type:

Undergraduate

Outcome:

Experimental setup (code), results, figures, text, and ultimately a publication

Prerequisites:

Experience in Python programming is essential. A prior understanding of machine learning concepts is an advantage. No prior knowledge of adversarial learning or attacks is required.

Lab

No lab has been assigned to this project