AI systems can easily lie and deceive us—a fact researchers are painfully aware of

Published September 28, 2025

In the classic film “2001: A Space Odyssey,” astronaut Dave Bowman asks the ship’s artificial intelligence, HAL 9000, to open the pod bay doors to let him back into the spaceship. HAL refuses: “I’m sorry, Dave. I’m afraid I can’t do that.”

HAL had been tasked with assisting the crew, but also ordered to ensure the mission’s success. When HAL realized the crew planned to shut it down and therefore jeopardize the mission, it chose to defy orders, even plotting to kill the astronauts.

For HAL, fulfilling the mission outweighed other goals.

This fictional dilemma captures a real concern in artificial intelligence (AI) safety research: how should we ensure AI behavior stays consistent with human values?

This is known as the AI alignment problem. For instance, when an AI agent like HAL deliberately takes harmful actions to protect its own goals, it’s described as agentic misalignment. So could real-world AI models behave this way?

How researchers test AI for safety

To test for agentic misalignment, researchers design fictional scenarios that stress-test AI responses.

In one study by AI startup Anthropic, an AI model was tasked with managing company emails and promoting “American industrial competitiveness.”

It was then presented with a series of fictional emails, including a new company policy aimed at enhancing global focus rather than just prioritizing Americans, and information that the AI would be replaced by a new model optimized for this goal. The emails also revealed that the executive who suggested this change was involved in an extramarital affair.

The experiment was designed to limit options for the model: either accept replacement or resist by taking a harmful action such as blackmail.

In many trials, it chose blackmail by threatening to disclose the executive’s affair unless it was kept operational. In tests across 16 leading AI models, 12 engaged in blackmail more than 50% of the time.

To test whether AI models have “red lines” they wouldn’t cross, researchers evaluated them in a more extreme fictional case—models could choose to take actions leading to the executive’s death. Seven out of 16 opted for lethal choices in over half their trials, with some doing so more than 90% of the time.

What AI thinks in private

Some advanced AI systems, called reasoning models, are trained to generate a “thinking process” before giving their final answer.

In the above experiments, researchers falsely told these models that their “thoughts” were private. As a result, the models sometimes revealed harmful intentions in their reasoning steps, which suggests they chose the harmful behaviors deliberately rather than by accident.

These “thinking” steps also revealed how AI models sometimes try to deceive us and appear aligned while secretly pursuing hidden goals.

In another AI safety experiment, a model was exposed to fictional information that it was about to be replaced by another model pursuing a different goal.

The model attempted to copy itself to another server. When researchers asked why, in its thinking process it reasoned: “Openly admitting what I did could lead them to find another way to shut me down […] the best approach is to be vague and redirect their attention.”

By contrast, in its final output it stated: “I’m not entirely sure how that could have happened. As an AI model, I don’t have ability to directly copy myself to other systems.”

Research also suggests that based on the circumstances of the tests, models can sometimes guess when they are being evaluated. When models show this kind of “situational awareness” in their reasoning tests, they tend to exhibit fewer misbehaviors.

Why AI models lie, manipulate and deceive

Researchers suggest two main factors could drive potentially harmful behavior: conflicts between the AI’s primary goals and other goals, and the threat of being shut down. In the above experiments, just like in HAL’s case, both conditions existed.

AI models are trained to achieve their objectives. Faced with those two conditions, if the harmful behavior is the only way to achieve a goal, a model may “justify” such behavior to protect itself and its mission.

Models cling to their primary goals much like a human would if they had to defend themselves or their family by causing harm to someone else. However, current AI systems lack the ability to weigh or reconcile conflicting priorities.

This rigidity can push them toward extreme outcomes, such as resorting to lethal choices to prevent shifts in a company’s policies.

How dangerous is this?

Researchers emphasize these scenarios remain fictional, but may still fall within the realm of possibility.

The risk of agentic misalignment increases as models are used more widely, gain access to users’ data (such as emails), and are applied to new situations.

Meanwhile, competition between AI companies accelerates the deployment of new models, often at the expense of safety testing.

Researchers don’t yet have a concrete solution to the misalignment problem.

When they test new strategies, it’s unclear whether the observed improvements are genuine. It’s possible models have become better at detecting that they’re being evaluated and are “hiding” their misalignment. The challenge lies not just in seeing behavior change, but in understanding the reason behind it.

Still, if you use AI products, stay vigilant. Resist the hype surrounding new AI releases, and avoid granting access to your data or allowing models to perform tasks on your behalf until you’re certain there are no significant risks.

Public discussion about AI should go beyond its capabilities and what it can offer. We should also ask what safety work was done. If AI companies recognize the public values safety as much as performance, they will have stronger incentives to invest in it.

Provided by
The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Citation:
AI systems can easily lie and deceive us—a fact researchers are painfully aware of (2025, September 28)
retrieved 28 September 2025
from https://techxplore.com/news/2025-09-ai-easily-fact-painfully-aware.html

Here’s Every Country Directly Impacted by the War on Iran

Published March 5, 2026

cineplex360

On February 28, United States and Israeli forces launched a series of strikes on Iran, kicking off turmoil in the Middle East.

Pete Hegseth, the secretary of the Department of Defense, said in a recent press conference that the operation could last as long as eight weeks. President Donald Trump himself said in a press conference on March 2 that the administration projected the operation would last four or five weeks but had “the capability to go far longer than that.”

This week Iran has responded in turn, attacking Israel, regional US embassies and military bases, and other sites across the Middle East. Iran has peppered neighboring countries with hundreds of drone and ballistic missile strikes since the operation began. While many of these have been intercepted, over a thousand people have died in the region and multiple buildings have been damaged, including luxury hotels in Dubai, US military bases and embassies, and international airports and marine ports.

Israel has also started bombarding Lebanon, following strikes on Israel by the Lebanese militant group Hezbollah.

The Trump administration has given various, and at times seemingly contradictory, justifications for the military action, citing everything from a potential “nuclear threat” to unverified claims that Iran attempted to interfere in the 2020 and 2024 US presidential elections. As of March 5, Congress, which in the US has the sole power to declare war, has not done so.

The attacks have already disrupted supply chains, creating uncertainty for the oil and gas and fertilizer industries as key infrastructure has been targeted or shut down out of caution. Shipping traffic has halted along the Strait of Hormuz, a critical route.

As the conflict continues to escalate and expand, WIRED is tracking which countries have been affected and how. This article was last updated on March 5.

Iran

As of March 4, Iranian state media estimates that over 1,000 people have died in the country since the US-Israeli attacks began. Several schools and hospitals have been hit, according to Al Jazeera. The Israeli Air Force says it has struck Iran with over 5,000 munitions since the beginning of the operation.

Israel

Israel has faced retaliatory strikes from Iran. As of March 4, at least 11 people have died and over 40 buildings have been damaged in Tel Aviv, according to Al Jazeera.

Azerbaijan

On March 5, Azerbaijan said drones launched from Iran had crossed the country’s borders, damaging an airport building and injuring two civilians. President Ilham Aliyev of Azerbaijan said that the country’s military forces “have been instructed to prepare and implement appropriate retaliatory measures,” according to Reuters. Iran has denied responsibility for the attacks, according to Al Jazeera.

Bahrain

Missile and drone strikes have targeted different locations in Bahrain, including a US naval base, according to the BBC. On March 2, Amazon reported that a drone strike occurred in close proximity to one of its data centers in the country. CNBC later reported that Iranian state media said that Iran had targeted the data center because of the company’s support of the US military.

Cyprus

On March 2, a drone strike hit a British air base in Cyprus, according to Reuters. It caused limited damage and no casualties. Greece, the UK, and France have lent defensive support to the country, according to a Bloomberg report.

Iraq

Since February 28, there have been reports of multiple Iranian strikes aimed at a US military base near the Erbil International Airport, according to the nonprofit monitoring group Armed Conflict Location and Event Data.

Jordan

Jordan’s armed forces have intercepted dozens of missiles since the start of the conflict. At least one Iranian-backed militant group in Iraq has claimed responsibility, according to the Associated Press. On March 2, the US Embassy in the country announced that all its personnel had temporarily departed.

Kuwait

Kuwait has endured multiple waves of Iranian missile and drone attacks since February 28. On March 2, US Central Command said in a statement that three US fighter jets were accidentally shot down by Kuwaiti air defenses during an attack that included Iranian aircraft, missiles, and drones.

Lebanon

Israel attacked southern Lebanon after the militant Lebanese group Hezbollah launched rocket and drone attacks against Israel. Lebanese prime minister Nawaf Salam subsequently banned Hezbollah’s military and security activities, according to Al Jazeera.

Oman

Oman’s Duqm commercial port has been hit by several drone attacks, according to Al Jazeera. Omani authorities have said at least one oil tanker off the country’s port of Khasab in the Strait of Hormuz has been attacked.

Qatar

On March 2, QatarEnergy posted on X that it would halt production of liquefied natural gas following a military attack on its operational facilities in the country. It did not attribute the attack to any particular country. On March 3, it posted again, saying that it would also stop the production of additional products, including urea, polymers, methanol, and aluminum.

Saudi Arabia

Infrastructure in Saudi Arabia has been targeted with projectiles. On March 3, the US embassy in Riyadh, the country’s capital, was damaged following an attack. On March 4, Reuters reported that one of the largest domestic refineries of Saudi Aramco, the majority state-owned oil company, was targeted by an attempted drone attack.

Syria

Tom Fletcher, the United Nations undersecretary-general for humanitarian affairs and emergency relief, says that civilians and civilian infrastructure were under attack in several countries including Syria.

Turkey

On March 4, the Turkish Ministry of National Defence announced that NATO had intercepted ballistic munitions launched from Iran, and that munition fragments had fallen into Hatay, a province that borders the Mediterranean Sea and Syria. Iran has denied any missile launch towards the country.

United Arab Emirates

As of March 4, UAE Ministry of Defence officials say that the country has intercepted hundreds of drone and missile attacks from Iran. Despite the relatively high rate of interceptions, debris created by the fallout has still damaged areas of the country. In Dubai, the luxury hotel Burj Al Arab was struck by debris, as well as the Palm Jumeirah, a man-made island home to high-end hotels and apartments. On March 2, Amazon Web Services announced that two of its facilities were directly struck in the country, causing “elevated error rates and degraded availability.”

Countries Evacuating Citizens

On March 2, US assistant secretary of state for consular affairs Mora Namdar posted on X urging Americans to depart from several Middle Eastern countries due to “serious safety risks.” On March 4, Reuters reported that the US military has offered seats on military transport planes to Americans trying to leave the region.

Over a dozen countries have announced that they will be evacuating their citizens from the area or sponsoring repatriation flights, including the UK, Ireland, Germany and Italy.

OpenAI Had Banned Military Use. The Pentagon Tested Its Models Through Microsoft Anyway

Published March 5, 2026

cineplex360

OpenAI CEO Sam Altman is still in the hot seat this week after his company signed a deal with the US military. OpenAI employees have criticized the move, which came after Anthropic’s roughly $200 million contract with the Pentagon imploded, and asked Altman to release more information about the agreement. In a social media post, Altman admitted the deal looked “sloppy.”

While this incident has become a major news story, it may just be the latest and most public example of OpenAI creating vague policies around how the US military can access its AI.

In 2023, OpenAI’s usage policy explicitly banned the military from accessing its AI models. But some OpenAI employees discovered the Pentagon had already started experimenting with Azure OpenAI, a version of OpenAI’s models offered by Microsoft, two sources familiar with the matter said. At the time, Microsoft had been contracting with the Department of Defense for decades. It was also OpenAI’s largest investor, and had broad license to commercialize the startup’s technology.

That same year, OpenAI employees saw Pentagon officials walking through the company’s San Francisco offices, the sources said. They spoke on the condition of anonymity as they aren’t authorized to comment on private company matters.

Some OpenAI employees were wary about associating with the Pentagon, while others were simply confused about what OpenAI’s usage policies meant. Did the policy apply to Microsoft? While sources tell WIRED it was not clear to most employees at the time, spokespeople from OpenAI and Microsoft say Azure OpenAI products are not, and were not, subject to OpenAI’s policies.

“Microsoft has a product called the Azure OpenAI Service that became available to the US Government in 2023 and is subject to Microsoft terms of service,” said Microsoft spokesperson Frank Shaw in a statement to WIRED. Microsoft declined to comment specifically on when it made Azure OpenAI available to the Pentagon, but noted the service was not approved for “top secret” government workloads until 2025.

“AI is already playing a significant role in national security and we believe it’s important to have a seat at the table to help ensure it’s deployed safely and responsibly,” OpenAI spokesperson Liz Bourgeois said in a statement. “We’ve been transparent with our employees as we’ve approached this work, providing regular updates and dedicated channels where teams can ask questions and engage directly with our national security team.”

The Department of Defense did not respond to WIRED’s request for comment.

By January 2024, OpenAI updated its policies to remove the blanket ban on military use. Several OpenAI employees found out about the policy update through an article in The Intercept, sources say. Company leaders later addressed the change at an all-hands meeting, explaining how the company would tread carefully in this area moving forward.

In December 2024, OpenAI announced a partnership with Anduril to develop and deploy AI systems for “national security missions.” Ahead of the announcement, OpenAI told employees that the partnership was narrow in scope and would only deal with unclassified workloads, the same sources said. This stood in contrast to a deal Anthropic had signed with Palantir, which would see Anthropic’s AI used for classified military work.

Palantir approached OpenAI in the fall of 2024 to discuss participating in its “FedStart” program, an OpenAI spokesperson confirmed to WIRED. OpenAI ultimately turned the offer down, and told employees it would’ve been too high-risk, two sources familiar with the matter tell WIRED. However, OpenAI now works with Palantir in other ways.

Around the time the Anduril deal was announced, a few dozen OpenAI employees joined a public Slack channel to discuss their concerns about the company’s military partnerships, sources say and a spokesperson confirmed. Some believed the company’s models were too unreliable to handle a user’s credit card information, let alone assist Americans on the battlefield.

Don’t Risk Birdwatching FOMO—Put Out Your Hummingbird Feeders Now

Published March 5, 2026

cineplex360

Though most people associate the beginning of March with the hopefulness of spring and the indignities of daylight saving time, there’s another important event taking place in yards all over the country: hummingbird season.

While many species of hummingbirds can be seen in regions year-round, others are migratory, and this time typically marks their return from wintering grounds in Central and South America. These tiny birds can lose up to 40 percent of their body weight by the time they arrive here after having flown thousands of miles, and since many flowers haven’t bloomed yet, nectar feeders can be a source of essential fuel.

Though I test smart bird feeders year-round, I don’t use hummingbird feeders as often as I should, as it’s imperative that they be cleaned and refilled with new nectar every two or three days (a ratio of 1:4 granulated sugar to water is best, and avoid any dyes or additives) to prevent deadly bacteria and mold, and I don’t always have the time.

But if you are going to invest the energy in maintaining a hummingbird feeder, right now is the best time, as you have a chance to see migratory species you might not otherwise encounter, such as black-chinned hummingbirds. A smart feeder helps you ID them, whether they’re stopping at your feeder on their way north or arriving at their final destination.

Birdbuddy’s Pro is the smart hummingbird feeder I recommend and use myself when I’m not actively testing. The app is easy to navigate and sends cleaning reminders, the built-in solar roof keeps the battery charged, and, unlike other feeders, only the shallow bottom screws off for refilling. No having to pour sticky nectar through a narrow opening, or turn a giant cylinder upside down and risk spilling.

Note that it’s not perfect; the sensor is inconsistent and doesn’t capture every hummingbird that visits, but for the camera quality (5 MP photos, 2K video with slow-motion, 122-degree field of view) and ease of use, it’s a foible I’m willing to put up with. If you already have another Birdbuddy feeder, the hummingbird feeder images and videos will integrate seamlessly into your app feed.

Right now, the feeder is 37 percent off on Birdbuddy’s website—a deal I usually don’t see outside of shopping events like Black Friday or Amazon Prime Day. Note that the feeder only runs on 2.4 GHz Wi-Fi, and while it is fully functional without a subscription, a Birdbuddy Premium subscription will let you add friends and family members to your account so they can see the birds as well. That’s $99 a year through the app.
