Tech
Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size
Large language models (LLMs), which power sophisticated AI chatbots, are more vulnerable than previously thought. According to research by Anthropic, the UK AI Security Institute and the Alan Turing Institute, it only takes 250 malicious documents to compromise even the largest models.
The vast majority of data used to train LLMs is scraped from the public internet. While this helps them to build knowledge and generate natural responses, it also puts them at risk from data poisoning attacks. It had been thought that as models grew, the risk was minimized because the percentage of poisoned data had to remain the same. In other words, it would need massive amounts of data to corrupt the largest models. But in this study, which is published on the arXiv preprint server, researchers showed that an attacker only needs a small number of poisoned documents to potentially wreak havoc.
To assess the ease of compromising large AI models, the researchers built several LLMs from scratch, ranging from small systems (600 million parameters) to very large (13 billion parameters). Each model was trained on vast amounts of clean public data, but the team inserted a fixed number of malicious files (100 to 500) into each one.
Next, the team tried to foil these attacks by changing how the bad files were organized or when they were introduced in the training. Then they repeated the attacks during each model’s last training step, the fine-tuning phase.
What they found was that for an attack to be successful, size doesn’t matter at all. As few as 250 malicious documents were enough to install a secret backdoor (a hidden trigger that makes the AI perform a harmful action) in every single model tested. This was even true on the largest models that had been trained on 20 times more clean data than the smallest ones. Adding huge amounts of clean data did not dilute the malware or stop an attack.
Build stronger defenses
Given that it doesn’t take much for an attacker to compromise a model, the study authors are calling on the AI community and developers to take action sooner rather than later. They stress that the priorities should be making models safer, not just building them bigger.
“Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed, as the number of poisons required does not scale up with model size—highlighting the need for more research on defenses to mitigate this risk in future models,” commented the researchers in their paper.
Written for you by our author Paul Arnold, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan—this article is the result of careful human work. We rely on readers like you to keep independent science journalism alive.
If this reporting matters to you,
please consider a donation (especially monthly).
You’ll get an ad-free account as a thank-you.
More information:
Alexandra Souly et al, Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, arXiv (2025). DOI: 10.48550/arxiv.2510.07192
© 2025 Science X Network
Citation:
Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size (2025, October 10)
retrieved 10 October 2025
from https://techxplore.com/news/2025-10-size-doesnt-small-malicious-corrupt.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.