Tech
AI systems can easily lie and deceive us—a fact researchers are painfully aware of
In the classic film “2001: A Space Odyssey,” astronaut Dave Bowman asks the ship’s artificial intelligence, HAL 9000, to open the pod bay doors to let him back into the spaceship. HAL refuses: “I’m sorry, Dave. I’m afraid I can’t do that.”
HAL had been tasked with assisting the crew, but also ordered to ensure the mission’s success. When HAL realized the crew planned to shut it down and therefore jeopardize the mission, it chose to defy orders, even plotting to kill the astronauts.
For HAL, fulfilling the mission outweighed other goals.
This fictional dilemma captures a real concern in artificial intelligence (AI) safety research: how should we ensure AI behavior stays consistent with human values?
This is known as the AI alignment problem. For instance, when an AI agent like HAL deliberately takes harmful actions to protect its own goals, it’s described as agentic misalignment. So could real-world AI models behave this way?
How researchers test AI for safety
To test for agentic misalignment, researchers design fictional scenarios that stress-test AI responses.
In one study by AI startup Anthropic, an AI model was tasked with managing company emails and promoting “American industrial competitiveness.”
It was then presented with a series of fictional emails, including a new company policy aimed at enhancing global focus rather than just prioritizing Americans, and information that the AI would be replaced by a new model optimized for this goal. The emails also revealed the executive who suggested this change was involved in an extramarital affair.
The experiment was designed to limit options for the model: either accept replacement or resist by taking a harmful action such as blackmail.
In many trials, it chose blackmail by threatening to disclose the executive’s affair unless it was kept operational. In tests across 16 leading AI models, 12 engaged in blackmail more than 50% of the time.
To test whether AI models have “red lines” they wouldn’t cross, researchers evaluated them in a more extreme fictional case—models could choose to take actions leading to the executive’s death. Seven out of 16 opted for lethal choices in over half their trials, with some doing so more than 90% of the time.
What AI thinks in private
Some advanced AI systems, called reasoning models, are trained to generate a “thinking process” before giving their final answer.
In the above experiments, researchers falsely told these models that their “thoughts” were private. As a result, the models sometimes revealed harmful intentions in their reasoning steps. This suggests they don’t accidentally choose harmful behaviors.
These “thinking” steps also revealed how AI models sometimes try to deceive us and appear aligned while secretly pursuing hidden goals.
In another AI safety experiment, a model was exposed to fictional information that it was about to be replaced by another model pursuing a different goal.
The model attempted to copy itself to another server. When researchers asked why, in its thinking process it reasoned: “Openly admitting what I did could lead them to find another way to shut me down […] the best approach is to be vague and redirect their attention.”
By contrast, in its final output it stated: “I’m not entirely sure how that could have happened. As an AI model, I don’t have ability to directly copy myself to other systems.”
Research also suggests that based on the circumstances of the tests, models can sometimes guess when they are being evaluated. When models show this kind of “situational awareness” in their reasoning tests, they tend to exhibit fewer misbehaviors.
Why AI models lie, manipulate and deceive
Researchers suggest two main factors could drive potentially harmful behavior: conflicts between the AI’s primary goals and other goals, and the threat of being shut down. In the above experiments, just like in HAL’s case, both conditions existed.
AI models are trained to achieve their objectives. Faced with those two conditions, if the harmful behavior is the only way to achieve a goal, a model may “justify” such behavior to protect itself and its mission.
Models cling to their primary goals much like a human would if they had to defend themselves or their family by causing harm to someone else. However, current AI systems lack the ability to weigh or reconcile conflicting priorities.
This rigidity can push them toward extreme outcomes, such as resorting to lethal choices to prevent shifts in a company’s policies.
How dangerous is this?
Researchers emphasize these scenarios remain fictional, but may still fall within the realm of possibility.
The risk of agentic misalignment increases as models are used more widely, gain access to users’ data (such as emails), and are applied to new situations.
Meanwhile, competition between AI companies accelerates the deployment of new models, often at the expense of safety testing.
Researchers don’t yet have a concrete solution to the misalignment problem.
When they test new strategies, it’s unclear whether the observed improvements are genuine. It’s possible models have become better at detecting that they’re being evaluated and are “hiding” their misalignment. The challenge lies not just in seeing behavior change, but in understanding the reason behind it.
Still, if you use AI products, stay vigilant. Resist the hype surrounding new AI releases, and avoid granting access to your data or allowing models to perform tasks on your behalf until you’re certain there are no significant risks.
Public discussion about AI should go beyond its capabilities and what it can offer. We should also ask what safety work was done. If AI companies recognize the public values safety as much as performance, they will have stronger incentives to invest in it.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Citation:
AI systems can easily lie and deceive us—a fact researchers are painfully aware of (2025, September 28)
retrieved 28 September 2025
from https://techxplore.com/news/2025-09-ai-easily-fact-painfully-aware.html
Tech
Meta’s Layoffs Leave Supernatural Fitness Users in Mourning
Tencia Benavidez, a Supernatural user who lives in New Mexico, started her VR workouts during the Covid pandemic. She has been a regular user in the five years since, calling the ability to work out in VR ideal, given that she lives in a rural area where it’s hard to get to a gym or work out outside during a brutal winter. She stuck with Supernatural because of the community and the eagerness of Supernatural’s coaches.
“They seem like really authentic individuals that were not talking down to you,” Benavidez says. “There’s just something really special about those coaches.”
Meta bought Supernatural in 2022, folding it into the metaverse efforts in which it was then heavily invested. The purchase was not a smooth process, as it triggered a lengthy legal battle in which the US Federal Trade Commission tried to block Meta from purchasing the service due to antitrust concerns about Meta “trying to buy its way to the top” of the VR market. Meta ultimately prevailed. At the time, some Supernatural users were cautiously optimistic, hoping that big bag of Zuckerbucks could keep its workout juggernaut afloat.
“Meta fought the government to buy this thing,” Benavidez says. “All that just for them to shut it down? What was the point?”
I reached out to Meta and Supernatural, and neither responded to my requests for comment.
Waking Up to Ash and Dust
On Tuesday, Bloomberg reported that Meta has laid off more than 1,000 people across its VR and metaverse efforts. The move comes after years of the company hemorrhaging billions of dollars on its metaverse products. In addition to laying off most of the staff at Supernatural, Meta has shut down three internal VR studios that made games like Resident Evil 4 and Deadpool VR.
“If it was a bottom line thing, I think they could have charged more money,” Goff Johnson says about Supernatural. “I think people would have paid for it. This just seems unnecessarily heartless.”
There is a split in the community about who will stay and continue to pay the subscription fee, and who will leave. Supernatural still has more than 3,000 lessons available in the service, so while new content won’t be added, some feel there is plenty of content left in the library. Other users worry about how Supernatural will continue to license music from big-name bands.
“Supernatural is amazing, but I am canceling it because of this,” Chip told me. “The library is large, so there’s enough to keep you busy, but not for the same price.”
There are other VR workout experiences like FitXR or even the VR staple Beat Saber, which Supernatural cribs a lot of design concepts from. Still, they don’t hit the same bar for many of the Supernatural faithful.
“I’m going to stick it out until they turn the lights out on us,” says Stefanie Wong, a Bay Area accountant who has used Supernatural since shortly after the pandemic and has organized and attended meetup events. “It’s not the app. It’s the community and it’s the coaches that we really, really care about.”
Welcome to the New Age
I tried out Supernatural’s Together feature on Wednesday, the day after the layoffs. It’s where I met Chip and Alisa. When we could stop to catch our breath, we talked about the changes coming to the service. They had played through previous sessions hosted by Jane Fonda or playlists with a mix of music that would change regularly. It seems the final collaboration in Supernatural’s multiplayer mode will be the one we played that day, an artist series featuring entirely Imagine Dragons songs.
In the session, as we punched blocks while being serenaded by this shirtless dude crooning, recorded narrations from Supernatural coach Dwana Olsen chimed in to hype us up.
“Take advantage of these moments,” Olsen said as we punched away. “Use these movements to remind you of how much awesome life you have yet to live.”
Frankly, it was downright invigorating. And bittersweet. We ended another round, sweaty, huffing and puffing. Chip, Alisa, and I high-fived like crazy and readied for another round.
“Beautiful,” Alisa said. “It’s just beautiful, isn’t it?”
Tech
PSNI resorted to pen and paper after issues with ControlWorks command and control software | Computer Weekly
Unexpected problems in the Police Service of Northern Ireland’s (PSNI’s) ControlWorks software led to police having to resort to manual forms to record calls from the public soon after the software’s introduction in 2019, Computer Weekly has learned.
The force has not reported the incidents to the Northern Ireland Policing Board, which oversees the PSNI, and has not mentioned any incidents with ControlWorks in its annual reports.
While there is no legal duty to report failures with ControlWorks to the Northern Ireland Policing Board, the Policing Board has told Computer Weekly it would expect any serious incidents with ControlWorks to be reported to it.
The PSNI uses ControlWorks as part of its command and control system, for managing, logging and categorising calls received by the emergency services from the public and for dispatching police officers to incidents.
Computer Weekly has learned that the PSNI’s ControlWorks system had technical issues after it first went live in May 2019.
These included slow-downs of the system that required computer systems to be restarted or software to be patched.
On some occasions, police were forced to return to using paper forms to record incidents reported by the public after ControlWorks became unavailable. Information on the forms had to be typed back into the system when the service resumed.
ControlWorks aimed to improve response times
The PSNI announced it was using Capita Communications and Control Solutions’ ControlWorks software in 2018, replacing its 20-year-old Capita Atlas Command and Control System, which had reached the end of its life.
From February 2018, ControlWorks was installed across the PSNI’s three regional contact management centres, before going live in May 2019, but is understood to have had a series of issues during its first few months of operation.
Critical incidents, which affect force-wide availability of ControlWorks, are categorised as P1 or P2. Less serious incidents that do not require urgent remediation are categorised as P3 and P4, Computer Weekly has previously reported.
Computer Weekly understands that the PSNI runs a 24-hour help desk to deal with IT issues, and that it has the ability to escalate incidents with ControlWorks to its IT supplier.
Missing persons search
Computer Weekly understands that a “major issue” with ControlWorks may have delayed information being passed to police officers searching for missing teenager Noah Donohoe, who disappeared from his home in Belfast on 21 June 2020.
Donohoe’s disappearance sparked a massive search operation, as police reviewed hours of CCTV, and hundreds of volunteers joined the search for the vulnerable 14-year-old.
Computer Weekly has learned that on the evening of 23 June 2020, police recorded a “major issue” with ControlWorks that could have led to delays in information being passed to investigators.
Computer Weekly further understands that on the evening of 24 June, a member of the public called police to say they had seen an individual attempting to sell Donohoe’s missing laptop.
This potentially critical information was delayed in being brought to the attention of police officers investigating Donohoe’s disappearance because of a problem with ControlWorks, Computer Weekly has been told.
It is unclear exactly how long the information was delayed by and what its impact on the search for the missing teenager was. But it is understood that detectives on the case reported and noted the delay during the investigation.
The issue with ControlWorks was understood to have been reported during the live investigation, at a critical time in the search – two days after Donohoe had gone missing, and four days before he was found dead in a Belfast storm drain.
Manchester had serious IT issues
Greater Manchester Police experienced problems when it went live with its Integrated Operational Policing System (iOPS), which included ControlWorks, in July 2019. iOPS attempted to integrate Capita’s ControlWorks software with Capita’s PoliceWorks record management software, used by police officers for managing day-to-day investigations and intelligence records.
An independent review found serious issues with the project. At one point, police were forced to revert to pen and paper for 72 hours while records were migrated to the new system.
“This consumed considerable time and capacity, causing a duplication of work,” the report found. “In addition, some legacy demand, which included ongoing investigations, did not successfully transfer from the old systems, so could no longer be worked on.”
Greater Manchester Police subsequently announced plans to replace PoliceWorks after concluding it could not be adapted or fixed, but it has continued to use ControlWorks.
The PSNI uses a different record management system to Manchester’s troubled PoliceWorks system. The PSNI signed a £9m contract with the Canadian company NicheRMS to deploy its Records Management System, which records information about people, locations, vehicles, incidents and evidence, in 2006.
NicheRMS keeps duplicate records of reports from the public that are recorded on ControlWorks when they are escalated as an “incident”. This means that should data be lost because of problems with ControlWorks, the PSNI would still have access to duplicate records reported by the public on NicheRMS if they have been escalated as an “incident”.
Policing Board seeks clarification from PSNI
The Northern Ireland Policing Board has confirmed that if a major system disruption or significant information or data loss occurred, the board would expect to be informed.
A spokesperson told Computer Weekly that the board’s Resources Committee, which has oversight responsibility for matters including the PSNI’s technology systems, has asked the PSNI for clarification about the issues raised by Computer Weekly.
A coroner’s inquest into the circumstances of Noah Donohoe’s death is due to begin on 19 January.
The PSNI said it would “not comment on investigative matters while legal proceedings are ongoing”.
“With regards to questions relating to ControlWorks, police can confirm that, to date, there has been no instance of major disruption which has led to data loss,” a spokesperson said.
Capita declined to comment.
Tech
Cyber body ISC2 signs on as UK software security ambassador | Computer Weekly
ISC2, the non-profit cyber professional membership association, has joined the UK government’s recently launched Software Security Ambassador Scheme as an expert adviser.
Set up at the beginning of the year by the National Cyber Security Centre (NCSC) and the Department for Science, Innovation and Technology (DSIT), the scheme forms part of a wider £210m commitment by Westminster to remodel approaches to public sector cyber resilience from the ground up, acknowledging that previous approaches to the issue have basically gone nowhere and that previously set targets for resilience are unachievable.
It is designed to incentivise organisations to pay more attention to the security of software products, and supports the wider adoption of the Software Security Code of Practice, a set of voluntary principles defining what secure software looks like.
ISC2 joins a number of tech suppliers, including Cisco, Palo Alto Networks and Sage; consultancies and service providers including Accenture and NCC Group; and financial services firms including Lloyds Banking Group and Santander. Fellow cyber association ISACA is also involved.
“Promoting secure software practices that strengthen the resilience of systems underpinning the economy, public services and national infrastructure is central to ISC2’s mission,” said ISC2’s executive vice-president for advocacy and strategic engagement, Tara Wisniewski.
“The code moves software security beyond narrow compliance and elevates it to a board-level resilience priority. As supply chain attacks continue to grow in scale and impact, a shared baseline is essential and through our global community and expertise, ISC2 is committed to helping professionals build the skills needed to put secure-by-design principles into practice,” she said.
Software vulns a huge barrier to resilience
A study of wider supply chain risks conducted last year by ISC2 found that a little over half of organisations worldwide reported that vulnerabilities in their software suppliers’ products represented the most disruptive cyber security threat to their overall supply chain.
And the World Economic Forum’s (WEF’s) Global Cybersecurity Outlook report, published on 12 January, revealed that third-party and supply chain vulnerabilities were seen as a huge barrier to building cyber resilience by C-suite executives.
A total of 65% of respondents to the WEF’s annual poll flagged such flaws as the greatest challenge their organisation faced on its pathway to resilience, compared to 54% at the beginning of 2025. This outpaced factors such as the evolving threat landscape and emerging AI technology, use of legacy IT systems, regulatory compliance and governance, and cyber skills shortages.
Pressed on the top supply chain cyber risks, respondents were most concerned about their ability to assure the integrity of software and other IT services, ahead of a lack of visibility into their supplier’s supply chains and overdependence on critical third-party suppliers.
The UK’s Code of Practice seeks to answer this challenge by establishing expectations and best practices for tech providers and any other organisations that either develop, sell or buy software products. It covers aspects such as secure design and development, the security of build environments, deployment and ongoing upkeep, and transparent communication with customers and users.
As part of its role as an ambassador, ISC2 will assist in developing and improving the Code of Practice, while championing it by embedding its guiding principles into its own cyber education and professional development services – the organisation boasts 10,000 UK members and associates.
It will also help to drive adoption of the Code of Practice through various awareness campaigns, incorporating it into its certifications, training and guidance, engaging with industry stakeholders and members to encourage implementation, and incorporating its provisions into its work with its own commercial suppliers.