Data Poisoning: The Ethical Dilemma.

January 23, 2024

The list of possible use cases for AI is long: It can streamline cumbersome workflows, help agencies detect fraud more effectively and even support law enforcement efforts. Regardless of use case, one thing holds true: The more data a model ingests, the more accurate and useful it tends to be. This assumes, of course, that the data hasn't been maliciously altered or injected.

Data poisoning, the corruption of a model through incorrect or compromised training data, represents a growing threat vector, particularly as more businesses embrace AI. While data poisoning attacks are not new, they have become one of the most critical vulnerabilities in ML and AI as malicious actors gain access to greater computing power and new tools.

Types of attacks

Data poisoning attacks can be classified by the attacker's knowledge and the tactics they employ. In a black-box attack, the bad actor has no knowledge of the model or its training data; in a white-box attack, the attacker has full knowledge of the model and its training parameters, which yields higher success rates. Grey-box attacks fall in between. Depending on that level of knowledge, attackers may choose from four main types of data poisoning attack: availability attacks, targeted attacks, subpopulation attacks, and backdoor attacks. Let's delve into each category.

Availability Attack: In this type of attack, the entire model is corrupted, leading to a significant reduction in accuracy. The model produces false positives and false negatives and misclassifies test samples. Label flipping, in which the attacker swaps correct training labels for incorrect ones, is one form of availability attack (sketched in the example after these definitions).

Targeted Attack: Unlike an availability attack, which degrades the entire model, a targeted attack affects only a subset of inputs. The model maintains good performance for most samples, making targeted attacks harder to detect.

Subpopulation Attack: Similar to a targeted attack, a subpopulation attack doesn't impact the entire model; instead, it influences subsets of samples that share similar features.

Backdoor Attack: As the name implies, a backdoor attack occurs when an adversary embeds a hidden trigger, such as a small patch of pixels in the corner of an image, into training examples, causing the model to misclassify any input that carries the trigger. The sketch below illustrates this alongside label flipping.
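
To make the label-flipping and backdoor categories concrete, here is a minimal, illustrative Python sketch. The dataset, flip rate, trigger size, and target class are all hypothetical; real attacks are usually optimised against a specific model rather than applied blindly like this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in dataset: 1,000 grayscale 28x28 images with 10 class labels (hypothetical shapes).
images = rng.random((1000, 28, 28), dtype=np.float32)
labels = rng.integers(0, 10, size=1000)

def flip_labels(labels, flip_rate=0.1, num_classes=10):
    """Availability-style attack: reassign a fraction of labels to random wrong classes."""
    poisoned = labels.copy()
    n_flip = int(flip_rate * len(labels))
    idx = rng.choice(len(labels), size=n_flip, replace=False)
    offsets = rng.integers(1, num_classes, size=n_flip)  # offset of 1..9 guarantees a wrong label
    poisoned[idx] = (poisoned[idx] + offsets) % num_classes
    return poisoned

def add_backdoor(images, labels, target_class=7, poison_rate=0.05):
    """Backdoor-style attack: stamp a 3x3 corner patch and relabel the sample to the target class."""
    poisoned_x, poisoned_y = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    poisoned_x[idx, :3, :3] = 1.0      # the trigger: a small block of white pixels
    poisoned_y[idx] = target_class     # any image carrying the trigger maps to the target
    return poisoned_x, poisoned_y

flipped = flip_labels(labels)
backdoored_x, backdoored_y = add_backdoor(images, labels)
print("labels flipped:", int((flipped != labels).sum()))
print("images stamped with trigger:", int((backdoored_x != images).any(axis=(1, 2)).sum()))
```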

Prevention Techniques

Preventing data poisoning requires proactive measures and meticulous care. Organisations must exercise extreme diligence in selecting data sets for model training and in controlling access to them. Maintaining the secrecy of a model's operating information during training is also crucial. This vigilance can be augmented with high-speed verifiers and zero-trust content disarm and reconstruction tools, which help ensure that transferred data is clean.
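
One concrete way to implement the "verify what you ingest" part of that vigilance is to check every dataset file against a trusted manifest of hashes before training. The manifest path and format below are hypothetical, and this is only a sketch of the idea, not the specific verification tooling referred to above.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large dataset files never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_dataset(manifest_path: Path) -> list[str]:
    """Compare on-disk files against a trusted manifest mapping relative paths to expected hashes."""
    manifest = json.loads(manifest_path.read_text())
    mismatches = []
    for rel_path, expected in manifest.items():
        if sha256_of(manifest_path.parent / rel_path) != expected:
            mismatches.append(rel_path)
    return mismatches

if __name__ == "__main__":
    bad = verify_dataset(Path("data/manifest.json"))  # hypothetical manifest location
    print("tampered or corrupted files:", bad or "none")
```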

Incorporating statistical models helps identify anomalies in the training data, while monitoring tools such as Microsoft Azure Monitor can detect shifts in model accuracy. These proactive steps are crucial because rectifying a poisoned model is a complex process, involving a detailed analysis of training inputs to detect and remove the fraudulent elements.
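
As one illustration of the statistical-screening idea (a minimal sketch with assumed feature shapes, not the Azure Monitor workflow), an off-the-shelf outlier detector can flag training samples that sit far from the bulk of the data before they ever reach the model:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Hypothetical training features: mostly well-behaved samples plus a few injected outliers.
clean = rng.normal(loc=0.0, scale=1.0, size=(950, 20))
suspect = rng.normal(loc=6.0, scale=1.0, size=(50, 20))  # simulated poisoned points
features = np.vstack([clean, suspect])

# Isolation Forest labels the points it considers anomalous with -1.
detector = IsolationForest(contamination=0.05, random_state=0)
flags = detector.fit_predict(features)

flagged = np.where(flags == -1)[0]
print(f"flagged {len(flagged)} of {len(features)} samples for manual review")
```

Flagged samples still warrant human review before removal, since dropping them automatically risks discarding legitimate edge cases.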

So, what is Nightshade?

Nightshade emerges as a novel "data poisoning" tool designed to shield artists' creations from AI data collection. In response to the unauthorized utilization of their artistic works by AI image generators, artists can now employ Nightshade. Developed by computer scientists at the University of Chicago, this tool utilizes a technique known as "data poisoning." It subtly alters image pixels, leaving them visually unchanged to humans but disrupting computer vision. The Nightshade team emphasizes its responsible use, acting as a deterrent against model trainers who may disregard copyrights, opt-out lists, and do-not-scrape/robots.txt directives.
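
Nightshade's actual optimization targets text-to-image training pipelines and is not reproduced here. Purely to illustrate the general idea of a pixel change bounded tightly enough to stay imperceptible, the toy sketch below applies a clipped perturbation under an assumed budget (epsilon); a real poisoning tool would optimize the perturbation against a model's feature space rather than drawing it at random.

```python
import numpy as np

rng = np.random.default_rng(2)

def perturb_within_budget(image, epsilon=4 / 255):
    """Apply a pixel-level change bounded by an L-infinity budget (epsilon is an assumed value).

    The random direction below is only a placeholder: an effective poisoning tool would
    optimize the perturbation so that models mis-embed the image while it still looks
    unchanged to people.
    """
    delta = rng.uniform(-epsilon, epsilon, size=image.shape).astype(image.dtype)
    return np.clip(image + delta, 0.0, 1.0)  # keep pixels in the valid range

image = rng.random((256, 256, 3), dtype=np.float32)  # stand-in for an artwork scan
poisoned = perturb_within_budget(image)
print("max pixel change:", float(np.abs(poisoned - image).max()))  # stays within the budget
```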

[Image: Girl with a Pearl Earring]

The more poisoned images that enter a training set, the greater the disruption, which puts pressure on rogue tech entities to respect copyright law. The Nightshade team frames this as a proactive approach that imposes a small incremental cost on unauthorized data scraping and training. Nevertheless, concerns have arisen about potential misuse, since intentional poisoning could also disrupt AI generators' services. Despite its goal of safeguarding artists' intellectual property, Nightshade has drawn scrutiny over how such a capability should be used. Responding to critics, the researchers clarify that Nightshade aims to raise the cost of training on unlicensed data, making licensing images directly from creators the more viable alternative.

This raises critical questions about the ethical use of such tools. How can we strike a balance between protecting artistic creations and ensuring responsible deployment of data poisoning techniques in the realm of artificial intelligence?