AI Poisoning has now become a serious threat to Artificial Intelligence. A report by the UK AI Security Institute, Alan Turing Institute, and Anthropic states that even a small amount of malicious information in training data can corrupt the entire model. This can alter the model's behavior, leading to false information and security risks.
AI Poisoning Threat: AI Poisoning is rapidly becoming a growing threat in Artificial Intelligence. A recent report by the UK AI Security Institute, Alan Turing Institute, and Anthropic revealed that introducing just a few malicious files into a model's training data can corrupt the entire model. This process can alter the machine's behavior, leading to incorrect answers and harmful outcomes. Experts state that data security and stringent training standards are essential to prevent this, ensuring users and systems remain safe.
How AI Poisoning Occurs

AI Poisoning is divided into two parts: Data Poisoning and Model Poisoning. Data Poisoning occurs when data is tampered with during the model's training. Model Poisoning involves altering the model's code or parameters after training.
Types of Attacks
- Direct or Targeted Attack: In this, the model's responses change only based on a specific trigger. For example, if a hacker wants the model to give derogatory responses about a particular person, they add such examples to the training data.
- Indirect or Non-Targeted Attack: This weakens the overall functionality of the model. Attackers spread false information on the internet, which the model learns and subsequently reiterates as truth.
Real-World Threats
AI Poisoning not only spreads misinformation but also increases security risks. For instance, introducing just 0.001% false medical data into a model can lead it to provide harmful medical advice, even while its test scores remain normal. In 2023, OpenAI had to temporarily shut down ChatGPT when some users' chat and account details were leaked.
AI Poisoning might seem minor on the surface, but its severe consequences can profoundly impact a model's behavior and user safety. Experts emphasize that stringent security measures and meticulous care of training data are crucial to combat this.













