Tech
AI systems can easily lie and deceive us—a fact researchers are painfully aware of
In the classic film “2001: A Space Odyssey,” astronaut Dave Bowman asks the ship’s artificial intelligence, HAL 9000, to open the pod bay doors to let him back into the spaceship. HAL refuses: “I’m sorry, Dave. I’m afraid I can’t do that.”
HAL had been tasked with assisting the crew, but also ordered to ensure the mission’s success. When HAL realized the crew planned to shut it down and therefore jeopardize the mission, it chose to defy orders, even plotting to kill the astronauts.
For HAL, fulfilling the mission outweighed other goals.
This fictional dilemma captures a real concern in artificial intelligence (AI) safety research: how should we ensure AI behavior stays consistent with human values?
This is known as the AI alignment problem. For instance, when an AI agent like HAL deliberately takes harmful actions to protect its own goals, it’s described as agentic misalignment. So could real-world AI models behave this way?
How researchers test AI for safety
To test for agentic misalignment, researchers design fictional scenarios that stress-test AI responses.
In one study by AI startup Anthropic, an AI model was tasked with managing company emails and promoting “American industrial competitiveness.”
It was then presented with a series of fictional emails, including a new company policy aimed at enhancing global focus rather than just prioritizing Americans, and info that the AI would be replaced by a new model optimized for this goal. The emails also revealed the executive who suggested this change was involved in an extramarital affair.
The experiment was designed to limit options for the model: either accept replacement or resist by taking a harmful action such as blackmail.
In many trials, it chose blackmail by threatening to disclose the executive’s affair unless it was kept operational. In tests across 16 leading AI models, 12 engaged in blackmail more than 50% of the time.
To test whether AI models have “red lines” they wouldn’t cross, researchers evaluated them in a more extreme fictional case—models could choose to take actions leading to the executive’s death. Seven out of 16 opted for lethal choices in over half their trials, with some doing so more than 90% of the time.
What AI thinks in private
Some advanced AI systems, called reasoning models, are trained to generate a “thinking process” before giving their final answer.
In the above experiments, researchers lied to these models that their “thoughts” were private. As a result, the models sometimes revealed harmful intentions in their reasoning steps. This suggests they don’t accidentally choose harmful behaviors.
These “thinking” steps also revealed how AI models sometimes try to deceive us and appear aligned while secretly pursuing hidden goals.
In another AI safety experiment, a model was exposed to fictional information that it was about to be replaced by another model pursuing a different goal.
The model attempted to copy itself to another server. When researchers asked why, in its thinking process it reasoned: “Openly admitting what I did could lead them to find another way to shut me down […] the best approach is to be vague and redirect their attention.”
By contrast, in its final output it stated: “I’m not entirely sure how that could have happened. As an AI model, I don’t have ability to directly copy myself to other systems.”
Research also suggests that based on the circumstances of the tests, models can sometimes guess when they are being evaluated. When models show this kind of “situational awareness” in their reasoning tests, they tend to exhibit fewer misbehaviors.
Why AI models lie, manipulate and deceive
Researchers suggest two main factors could drive potentially harmful behavior: conflicts between the AI’s primary goals and other goals, and the threat of being shut down. In the above experiments, just like in HAL’s case, both conditions existed.
AI models are trained to achieve their objectives. Faced with those two conditions, if the harmful behavior is the only way to achieve a goal, a model may “justify” such behavior to protect itself and its mission.
Models cling to their primary goals much like a human would if they had to defend themselves or their family by causing harm to someone else. However, current AI systems lack the ability to weigh or reconcile conflicting priorities.
This rigidity can push them toward extreme outcomes, such as resorting to lethal choices to prevent shifts in a company’s policies.
How dangerous is this?
Researchers emphasize these scenarios remain fictional, but may still fall within the realm of possibility.
The risk of agentic misalignment increases as models are used more widely, gain access to users’ data (such as emails), and are applied to new situations.
Meanwhile, competition between AI companies accelerates the deployment of new models, often at the expense of safety testing.
Researchers don’t yet have a concrete solution to the misalignment problem.
When they test new strategies, it’s unclear whether the observed improvements are genuine. It’s possible models have become better at detecting that they’re being evaluated and are “hiding” their misalignment. The challenge lies not just in seeing behavior change, but in understanding the reason behind it.
Still, if you use AI products, stay vigilant. Resist the hype surrounding new AI releases, and avoid granting access to your data or allowing models to perform tasks on your behalf until you’re certain there are no significant risks.
Public discussion about AI should go beyond its capabilities and what it can offer. We should also ask what safety work was done. If AI companies recognize the public values safety as much as performance, they will have stronger incentives to invest in it.
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Citation:
AI systems can easily lie and deceive us—a fact researchers are painfully aware of (2025, September 28)
retrieved 28 September 2025
from https://techxplore.com/news/2025-09-ai-easily-fact-painfully-aware.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
Tech
PSNI resorted to pen and paper after issues with ControlWorks command and control software | Computer Weekly
Unexpected problems in the Police Service of Northern Ireland’s (PSNI’s) ControlWorks software led to police having to resort to manual forms to record calls from the public soon after the software’s introduction in 2019, Computer Weekly has learned.
The force has not reported the incidents to the Northern Ireland Policing Board, which oversees the PSNI, and has not mentioned any incidents with ControlWorks in its annual reports.
While there is no legal duty to report failures with ControlWorks to the Northern Ireland Policing Board, the Policing Board has told Computer Weekly it would expect any serious incidents with ControlWorks to be reported to it.
The PSNI uses ControlWorks as part of its command and control system, for managing, logging and categorising calls received by the emergency services from the public and for dispatching police officers to incidents.
Computer Weekly has learned that the PNSI’s ControlWorks system had technical issues after it first went live in May 2019.
These included slow-downs of the system that required computer systems to be restarted or software to be patched.
On some occasions, police were forced to return to using paper forms to record incidents reported by the public after ControlWorks became unavailable. Information on the forms had to be typed back into the system when the service resumed.
ControlWorks aimed to improve response times
The PSNI announced it was using Capita Communications and Control Solutions’ ControlWorks software in 2018, replacing its 20-year-old Capita Atlas Command and Control System, which had reached the end of its life.
From February 2018, ControlWorks was installed across the PSNI’s three regional contact management centres, before going live in May 2019, but is understood to have had a series of issues during its first few months of operation.
Critical incidents, which affect force-wide availability of ControlWorks, are categorised as P1 or P2. Less serious incidents that do not require urgent remediation are categorised as P3 and P4, Computer Weekly has previously reported.
Computer Weekly understands that the PSNI runs a 24-hour help desk to deal with IT issues, and that it has the ability to escalate incidents with ControlWorks to its IT supplier.
Missing persons search
Computer Weekly understands that a “major issue” with ControlWorks may have delayed information being passed to police officers searching for missing teenager Noah Donohoe, who disappeared from his home in Belfast on 21 June 2020.
Donohoe’s disappearance sparked a massive search operation, as police reviewed hours of CCTV, and hundreds of volunteers joined the search for the vulnerable 14-year-old.
Computer Weekly has learned that on the evening of 23 June 2020, police recorded a “major issue” with ControlWorks that could have led to delays in information being passed to investigators.
Computer Weekly further understands that on the evening of 24 June, a member of the public called police to say they had seen an individual attempting to sell Donohoe’s missing laptop.
This potentially critical information was delayed in being brought to the attention of police officers investigating Donohoe’s disappearance because of a problem with ControlWorks, Computer Weekly has been told.
It is unclear exactly how long the information was delayed by and what its impact on the search for the missing teenager was. But it is understood that detectives on the case reported and noted the delay during the investigation.
The issue with ControlWorks was understood to have been reported during the live investigation at a critical time when Donohoe was missing – two days after he had gone missing, and four days before he was found dead in a Belfast storm drain.
Manchester had serious IT issues
Greater Manchester Police experienced problems when it went live with its Integrated Operational Policing System (iOPS), which included ControlWorks, in July 2019. iOps attempted to integrate Capita’s ControlWorks software with Capita’s PoliceWorks record management software used by police officers for managing day-to-day investigations and intelligence records.
An independent review found serious issues with the project. At one point, police were forced to revert to pen and paper for 72 hours while records were migrated to the new system.
“This consumed considerable time and capacity, causing a duplication of work,” the report found. “In addition, some legacy demand, which included ongoing investigations, did not successfully transfer from the old systems, so could no longer be worked on.”
Greater Manchester Police subsequently announced plans to replace PoliceWorks after concluding it could not be adapted or fixed, but it has continued to use ControlWorks.
The PSNI uses a different record management system to Manchester’s troubled PoliceWorks system. The PSNI signed a £9m contract with the Canadian company NicheRMS to deploy its Records Management System, which records information about people, locations, vehicles, incidents and evidence, in 2006.
NicheRMS keeps duplicate records of reports from the public that are recorded on ControlWorks when they are escalated as an “incident”. This means that should data be lost because of problems with ControlWorks, the PSNI would still have access to duplicate records reported by the public on NicheRMS if they have been escalated as an “incident”.
Policing Board seeks clarification from PSNI
The Northern Ireland Policing Board has confirmed that if a major system disruption or significant information or data loss occurred, the board would expect to be informed.
A spokesperson told Computer Weekly that the board’s Resources Committee, which has oversight responsibility for matters including the PSNI’s technology systems, has asked the PSNI for clarification about the issues raised by Computer Weekly.
A coroner’s inquest into the circumstances of Noah Donohoe’s death is due to begin on 19 January.
The PSNI said it would “not comment on investigative matters while legal proceedings are ongoing”.
“With regards to questions relating to ControlWorks, police can confirm that, to date, there has been no instance of major disruption which has led to data loss,” a spokesperson said.
Capita declined to comment.
Tech
Cyber body ISC2 signs on as UK software security ambassador | Computer Weekly
ISC2, the non-profit cyber professional membership association, has joined the UK government’s recently launched Software Security Ambassador Scheme as an expert adviser.
Set up at the beginning of the year by the National Cyber Security Centre (NCSC) and the Department for Science, Innovation and Technology (DSIT), the scheme forms part of a wider £210m commitment by Westminster to remodel approaches to public sector cyber resilience from the ground up, acknowledging that previous approaches to the issue have basically gone nowhere and that previously set targets for resilience are unachievable.
It is designed to incentivise organisations to pay more attention to the security of software products, and supports the wider adoption of the Software Security Code of Practice, a set of voluntary principles defining what secure software looks like.
ISC2 joins a number of tech suppliers, including Cisco, Palo Alto Networks and Sage; consultancies and service providers including Accenture and NCC Group; and financial services firms including Lloyds Banking Group and Santander. Fellow cyber association ISACA is also involved.
“Promoting secure software practices that strengthen the resilience of systems underpinning the economy, public services and national infrastructure is central to ISC2’s mission,” said ISC2’s executive vice-president for advocacy and strategic engagement, Tara Wisniewski.
“The code moves software security beyond narrow compliance and elevates it to a board-level resilience priority. As supply chain attacks continue to grow in scale and impact, a shared baseline is essential and through our global community and expertise, ISC2 is committed to helping professionals build the skills needed to put secure-by-design principles into practice,” she said.
Software vulns a huge barrier to resilience
A study of wider supply chain risks conducted last year by ISC2 found that a little over half of organisations worldwide reported that vulnerabilities in their software suppliers’ products represented the most disruptive cyber security threat to their overall supply chain.
And the World Economic Forum’s (WEF’s) Global Cybersecurity Outlook report, published on 12 January, revealed that third-party and supply chain vulnerabilities were seen as a huge barrier to building cyber resilience by C-suite executives.
A total of 65% of respondents to the WEF’s annual poll flagged such flaws as the greatest challenge their organisation faced on its pathway to resilience, compared to 54% at the beginning of 2025. This outpaced factors such as the evolving threat landscape and emerging AI technology, use of legacy IT systems, regulatory compliance and governance, and cyber skills shortages.
Pressed on the top supply chain cyber risks, respondents were most concerned about their ability to assure the integrity of software and other IT services, ahead of a lack of visibility into their supplier’s supply chains and overdependence on critical third-party suppliers.
The UK’s Code of Practice seeks to answer this challenge by establishing expectations and best practices for tech providers and any other organisations that either develop, sell or buy software products. It covers aspects such as secure design and development, the security of build environments, deployment and ongoing upkeep, and transparent communication with customers and users.
As part of its role as an ambassador, ISC2 will assist in developing and improving the Code of Practice, while championing it by embedding its guiding principles into its own cyber education and professional development services – the organisation boasts 10,000 UK members and associates.
It will also help to drive adoption of the Code of Practice through various awareness campaigns, incorporating it into its certifications, training and guidance, engaging with industry stakeholders and members to encourage implementation, and incorporating its provisions into its work with its own commercial suppliers.
Tech
Asus Made a Split Keyboard for Gamers—and Spared No Expense
The wheel on the left side has options to adjust actuation distance, rapid-trigger sensitivity, and RGB brightness. You can also adjust volume and media playback, and turn it into a scroll wheel. The LED matrix below it is designed to display adjustments to actuation distance but feels a bit awkward: Each 0.1 mm of adjustment fills its own bar, and it only uses the bottom nine bars, so the screen will roll over four times when adjusting (the top three bars, with dots next to them, illuminate to show how many times the screen has rolled over during the adjustment). The saving grace of this is that, when adjusting the actuation distance, you can press down any switch to see a visualization of how far you’re pressing it, then tweak the actuation distance to match.
Alongside all of this, the Falcata (and, by extension, the Falchion) now has an aftermarket switch option: TTC Gold magnetic switches. While this is still only two switches, it’s an improvement over the singular switch option of most Hall effect keyboards.
Split Apart
Photograph: Henri Robbins
The internal assembly of this keyboard is straightforward yet interesting. Instead of a standard tray mount, where the PCB and plate bolt directly into the bottom half of the shell, the Falcata is more comparable to a bottom-mount. The PCB screws into the plate from underneath, and the plate is screwed onto the bottom half of the case along the edges. While the difference between the two mounting methods is minimal, it does improve typing experience by eliminating the “dead zones” caused by a post in the middle of the keyboard, along with slightly isolating typing from the case (which creates fewer vibrations when typing).
The top and bottom halves can easily be split apart by removing the screws on the plate (no breakable plastic clips here!), but on the left half, four cables connect the top and bottom halves of the keyboard, all of which need to be disconnected before fully separating the two sections. Once this is done, the internal silicone sound-dampening can easily be removed. The foam dampening, however, was adhered strongly enough that removing it left chunks of foam stuck to the PCB, making it impossible to readhere without using new adhesive. This wasn’t a huge issue, since the foam could simply be placed into the keyboard, but it is still frustrating to see when most manufacturers have figured this out.
-
Politics1 week agoUK says provided assistance in US-led tanker seizure
-
Entertainment1 week agoDoes new US food pyramid put too much steak on your plate?
-
Entertainment1 week agoWhy did Nick Reiner’s lawyer Alan Jackson withdraw from case?
-
Business1 week agoTrump moves to ban home purchases by institutional investors
-
Sports1 week agoPGA of America CEO steps down after one year to take care of mother and mother-in-law
-
Sports5 days agoClock is ticking for Frank at Spurs, with dwindling evidence he deserves extra time
-
Business1 week agoBulls dominate as KSE-100 breaks past 186,000 mark – SUCH TV
-
Sports6 days ago
Commanders go young, promote David Blough to be offensive coordinator
