Tech

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

Published

6 months ago

October 10, 2025

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

Overview of our experiments, including examples of clean and poisoned samples, as well as benign and malicious behavior at inference time. (a)DoS pretraining backdoor experiments. Credit: arXiv (2025). DOI: 10.48550/arxiv.2510.07192

Large language models (LLMs), which power sophisticated AI chatbots, are more vulnerable than previously thought. According to research by Anthropic, the UK AI Security Institute and the Alan Turing Institute, it only takes 250 malicious documents to compromise even the largest models.

The vast majority of data used to train LLMs is scraped from the public internet. While this helps them to build knowledge and generate natural responses, it also puts them at risk from data poisoning attacks. It had been thought that as models grew, the risk was minimized because the percentage of poisoned data had to remain the same. In other words, it would need massive amounts of data to corrupt the largest models. But in this study, which is published on the arXiv preprint server, researchers showed that an attacker only needs a small number of poisoned documents to potentially wreak havoc.

To assess the ease of compromising large AI models, the researchers built several LLMs from scratch, ranging from small systems (600 million parameters) to very large (13 billion parameters). Each model was trained on vast amounts of clean public data, but the team inserted a fixed number of malicious files (100 to 500) into each one.

Next, the team tried to foil these attacks by changing how the bad files were organized or when they were introduced in the training. Then they repeated the attacks during each model’s last training step, the fine-tuning phase.

What they found was that for an attack to be successful, size doesn’t matter at all. As few as 250 malicious documents were enough to install a secret backdoor (a hidden trigger that makes the AI perform a harmful action) in every single model tested. This was even true on the largest models that had been trained on 20 times more clean data than the smallest ones. Adding huge amounts of clean data did not dilute the malware or stop an attack.

Build stronger defenses

Given that it doesn’t take much for an attacker to compromise a model, the study authors are calling on the AI community and developers to take action sooner rather than later. They stress that the priorities should be making models safer, not just building them bigger.

“Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed, as the number of poisons required does not scale up with model size—highlighting the need for more research on defenses to mitigate this risk in future models,” commented the researchers in their paper.

Written for you by our author Paul Arnold, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan—this article is the result of careful human work. We rely on readers like you to keep independent science journalism alive.
If this reporting matters to you,
please consider a donation (especially monthly).
You’ll get an ad-free account as a thank-you.

More information:
Alexandra Souly et al, Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, arXiv (2025). DOI: 10.48550/arxiv.2510.07192

Journal information:
arXiv

Citation:
Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size (2025, October 10)
retrieved 10 October 2025
from https://techxplore.com/news/2025-10-size-doesnt-small-malicious-corrupt.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.

Source link

Related Topics:computer news hi-tech news hitech information technology innovation inventions

Up Next

Love it or hate it? Apple’s ‘Liquid Glass’ explained

Don't Miss

Beyond the refresh: Your cyber strategy must include AI PCs | Computer Weekly

Click to comment

Tech

Cocaine-Fueled Wild Salmon Swam Twice as Far as Sober Ones

Published

4 hours ago

April 22, 2026

cineplex360

Cocaine-Fueled Wild Salmon Swam Twice as Far as Sober Ones

Cocaine pollution can affect the behavior of fish—altering, for example, the way Atlantic salmon move through their environment, prompting them to swim farther and disperse over a wider area.

So finds a recent study by a research team coordinated by Griffith University, the Swedish University of Agricultural Sciences, the Zoological Society of London, and the Max Planck Institute of Animal Behavior and published in the journal Current Biology. The findings provide the first evidence that the effects of cocaine contamination on fish behavior occur not only under laboratory conditions, but also in the wild, where animals are exposed to much more complex environmental conditions.

Cocaine and its metabolites have been detected with increasing frequency in rivers and lakes around the world, entering waterways primarily through wastewater treatment systems. Although previous research has shown that cocaine pollution can affect animal behavior, this evidence was limited to laboratory conditions. A 2024 study by the Oswaldo Cruz Institute in Brazil showed that even sharks are exposed to cocaine, but little is known about its effects on animals in the wild.

To understand more about it, the authors of the new study surgically implanted small devices that slowly release chemicals into 105 juvenile Atlantic salmon in Lake Vättern in Sweden. They were then divided into 3 groups: a control group, which was not exposed to substances; a group exposed to cocaine; and a group exposed to benzoylecgonine, the main metabolite of cocaine that is commonly detected in wastewater. The researchers also attached small tags to the fish so they could monitor their movements over a two-month period. From subsequent analyses, the team found that, compared with the control group, fish exposed to benzoylecgonine swam up to 1.9 times farther, dispersing at the end of the experiment about 20 miles from the release point.

“The location of the fish determines what they eat, what eats them, and how populations are structured,” said co-author Marcus Michelangeli. “If pollution is altering these patterns, it has the potential to affect ecosystems in ways we are only now beginning to understand.”

In addition to showing how cocaine pollution has changed the way salmon use space in a natural ecosystem, the new study found that the most pronounced effect was observed not so much in the group exposed to cocaine itself, but in that exposed to its metabolite. This result has implications for monitoring, since the metabolites are often more common in waterways and current risk assessments generally focus on the main compound, potentially neglecting important biological effects.

“The idea that cocaine might have effects on fish might seem surprising, but the reality is that wildlife is already exposed to a wide range of human-made drugs on a daily basis,” said Michelangeli. The researchers’ next step will be to be able to determine how widespread these effects are, identify which species are most at risk, and test whether alterations in behavior translate into changes in survival and reproduction.

This story originally appeared on WIRED Italia and has been translated from Spanish.

Source link

Tech

NCSC heralds end of passwords for consumers and pushes secure passkeys | Computer Weekly

Published

7 hours ago

April 22, 2026

cineplex360

NCSC heralds end of passwords for consumers and pushes secure passkeys | Computer Weekly

Consumers are being urged to replace passwords with passkeys as a simpler, more secure method of accessing online services.

The National Cyber Security Centre (NCSC), part of the signals intelligence agency GCHQ, said today that it would no longer recommend that individuals use passwords for logging on where passkeys are available as an alternative.

Passkeys, which are securely stored on people’s phones, computers, or in third-party credential managers, are quicker and easier to use than passwords and offer stronger security.

The NCSC’s recommendation follows a technical study that shows passkeys are at least as secure – and generally more secure – than a password combined with two-factor authentication, such as an authorisation code sent by SMS.

CinePlex360

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

Tech

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

Build stronger defenses

Leave a Reply
Cancel reply

Leave a Reply

Tech

Cocaine-Fueled Wild Salmon Swam Twice as Far as Sober Ones

Tech

NCSC heralds end of passwords for consumers and pushes secure passkeys | Computer Weekly

Tech

5 AI Models Tried to Scam Me. Some of Them Were Scary Good

‘RHOA’ alum and Kim Zolciak’s ex was 68

2026 NFL Draft Odds: Draft Positions for Ty Simpson, Jeremiyah Love, More

How a pivot to hair accessories led to business success

France’s LVMH Q1 revenue falls 6%, shows resilience amid Iran war

Is Claude down? Here’s why users are seeing errors

The Deepfake Nudes Crisis in Schools Is Much Worse Than You Thought

Illinois’ financial crisis could bring the state to a halt

The final 6 ‘Game of Thrones’ episodes might feel like a full season

New Season 8 Walking Dead trailer flashes forward in time

Trending

CinePlex360

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

Build stronger defenses

You may like

Leave a Reply Cancel reply

Leave a Reply

Tech

Cocaine-Fueled Wild Salmon Swam Twice as Far as Sober Ones

Tech

NCSC heralds end of passwords for consumers and pushes secure passkeys | Computer Weekly

Resilience against phishing

Better security than 2FA

Passkeys not yet recommended for business

Tech

5 AI Models Tried to Scam Me. Some of Them Were Scary Good

‘RHOA’ alum and Kim Zolciak’s ex was 68

2026 NFL Draft Odds: Draft Positions for Ty Simpson, Jeremiyah Love, More

How a pivot to hair accessories led to business success

France’s LVMH Q1 revenue falls 6%, shows resilience amid Iran war

Is Claude down? Here’s why users are seeing errors

The Deepfake Nudes Crisis in Schools Is Much Worse Than You Thought

Illinois’ financial crisis could bring the state to a halt

The final 6 ‘Game of Thrones’ episodes might feel like a full season

New Season 8 Walking Dead trailer flashes forward in time

Trending

Leave a Reply
Cancel reply