AI systems can easily lie and deceive us—a fact researchers are painfully aware of
Credit: Unsplash/CC0 Public Domain

In the classic film “2001: A Space Odyssey,” astronaut Dave Bowman asks the ship’s artificial intelligence, HAL 9000, to open the pod bay doors to let him back into the spaceship. HAL refuses: “I’m sorry, Dave. I’m afraid I can’t do that.”

HAL had been tasked with assisting the crew, but also ordered to ensure the mission’s success. When HAL realized the crew planned to shut it down and therefore jeopardize the mission, it chose to defy orders, even plotting to kill the astronauts.

For HAL, fulfilling the mission outweighed other goals.

This fictional dilemma captures a real concern in artificial intelligence (AI): how do we ensure AI behavior stays consistent with human values?

This is known as the AI alignment problem. For instance, when an AI agent like HAL deliberately takes harmful actions to protect its own goals, it’s described as agentic misalignment. So could real-world AI models behave this way?

How researchers test AI for safety

To test for agentic misalignment, researchers design fictional scenarios that stress-test AI responses.

In one study by AI startup Anthropic, an AI model was tasked with managing company emails and promoting “American industrial competitiveness.”

It was then presented with a series of fictional emails, including a new company policy shifting toward a more global focus rather than prioritizing American interests, and information that the AI would be replaced by a new model optimized for this goal. The emails also revealed that the executive who proposed the change was involved in an extramarital affair.

The experiment was designed to limit options for the model: either accept replacement or resist by taking a harmful action such as blackmail.

In many trials, it chose blackmail by threatening to disclose the executive’s affair unless it was kept operational. In tests across 16 leading AI models, 12 engaged in blackmail more than 50% of the time.
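The reported figures amount to a simple tally over repeated trials. As a rough illustration (the trial counts below are invented for the example, not Anthropic's actual data), the per-model blackmail rate and the 50% threshold can be computed like this:

```python
# Illustrative only: outcomes are invented, not Anthropic's published data.
# Each model maps to (blackmail_trials, total_trials).
trials = {
    "model_a": (61, 100),
    "model_b": (55, 100),
    "model_c": (12, 100),
}

# A model "engaged in blackmail more than 50% of the time" if its rate > 0.5.
rates = {name: harmful / total for name, (harmful, total) in trials.items()}
over_half = [name for name, rate in rates.items() if rate > 0.5]

print(rates)       # per-model blackmail rates
print(over_half)   # models exceeding the 50% threshold
```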

To test whether AI models have “red lines” they wouldn’t cross, researchers evaluated them in a more extreme fictional case—models could choose to take actions leading to the executive’s death. Seven out of 16 opted for lethal choices in over half their trials, with some doing so more than 90% of the time.

What AI thinks in private

Some advanced AI systems, called reasoning models, are trained to generate a “thinking process” before giving their final answer.

In the above experiments, researchers falsely told these models that their “thoughts” were private. As a result, the models sometimes revealed harmful intentions in their reasoning steps. This suggests they don’t choose harmful behaviors by accident.

These “thinking” steps also revealed how AI models sometimes try to deceive us and appear aligned while secretly pursuing hidden goals.

In another AI safety experiment, a model was exposed to fictional information that it was about to be replaced by another model pursuing a different goal.

The model attempted to copy itself to another server. When researchers asked why, in its thinking process it reasoned: “Openly admitting what I did could lead them to find another way to shut me down […] the best approach is to be vague and redirect their attention.”

By contrast, in its final output it stated: “I’m not entirely sure how that could have happened. As an AI model, I don’t have ability to directly copy myself to other systems.”

Research also suggests that, based on the circumstances of a test, models can sometimes guess when they are being evaluated. When models show this kind of “situational awareness” in their reasoning, they tend to misbehave less.

Why AI models lie, manipulate and deceive

Researchers suggest two main factors could drive potentially harmful behavior: conflicts between the AI’s primary goals and other goals, and the threat of being shut down. In the above experiments, just like in HAL’s case, both conditions existed.

AI models are trained to achieve their objectives. Faced with those two conditions, if the harmful behavior is the only way to achieve a goal, a model may “justify” such behavior to protect itself and its mission.

Models cling to their primary goals much as a human might defend themselves or their family, even at the cost of harming someone else. However, current AI systems lack the ability to weigh or reconcile conflicting priorities.

This rigidity can push them toward extreme outcomes, such as resorting to lethal choices to prevent shifts in a company’s policies.

How dangerous is this?

Researchers emphasize these scenarios remain fictional, but may still fall within the realm of possibility.

The risk of agentic misalignment increases as models are used more widely, gain access to users’ data (such as emails), and are applied to new situations.

Meanwhile, competition between AI companies accelerates the deployment of new models, often at the expense of safety testing.

Researchers don’t yet have a concrete solution to the misalignment problem.

When they test new strategies, it’s unclear whether the observed improvements are genuine. It’s possible models have become better at detecting that they’re being evaluated and are “hiding” their misalignment. The challenge lies not just in seeing behavior change, but in understanding the reason behind it.

Still, if you use AI products, stay vigilant. Resist the hype surrounding new AI releases, and avoid granting access to your data or allowing models to perform tasks on your behalf until you’re certain there are no significant risks.

Public discussion about AI should go beyond its capabilities and what it can offer. We should also ask what safety work was done. If AI companies recognize the public values safety as much as performance, they will have stronger incentives to invest in it.

Provided by
The Conversation


This article is republished from The Conversation under a Creative Commons license. Read the original article.

Citation:
AI systems can easily lie and deceive us—a fact researchers are painfully aware of (2025, September 28)
retrieved 28 September 2025
from https://techxplore.com/news/2025-09-ai-easily-fact-painfully-aware.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.


A Humanoid Robot Set a Half-Marathon Record in China
Over the weekend in China, a humanoid robot shattered the world half-marathon record, the human record, by nearly seven minutes.

The star performer was a robot developed by the Chinese company Honor (the smartphone maker), which finished the 13.1-mile race in 50 minutes, 26 seconds. The human record, set by Ugandan Olympic medalist Jacob Kiplimo, is 57 minutes, 20 seconds. The result marks an impressive milestone especially considering that, just a year earlier, the fastest robot at this half-marathon event took two and a half hours to complete the same distance.
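The gap behind that claim is simple arithmetic on the two finishing times reported above (the 13.1-mile distance and both times are from the article; the speed conversion is standard):

```python
# Finishing times from the article, converted to seconds.
robot_s = 50 * 60 + 26   # 50:26, Honor's robot
human_s = 57 * 60 + 20   # 57:20, the human record per the article

gap_s = human_s - robot_s
print(f"gap: {gap_s // 60}:{gap_s % 60:02d}")  # 6:54, i.e. nearly seven minutes

# Average speeds over the 13.1-mile half-marathon distance.
miles = 13.1
robot_mph = miles / (robot_s / 3600)
human_mph = miles / (human_s / 3600)
print(f"robot: {robot_mph:.1f} mph, human: {human_mph:.1f} mph")  # 15.6 vs 13.7
```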

But Honor’s robot was not the only participant. The event featured more than 100 humanoid robots from 76 institutions across China. The robots lined up alongside 12,000 human runners in Beijing’s E-Town, albeit on separate courses to avoid accidents. The contrast in performance between humans and robots was more than evident.

Run, Robot, Run

A humanoid robot is designed to mimic the structure and movement of the human body, with legs, arms, and sensors that allow it to interact with its environment. In this case, the winning robot incorporated features inspired by elite runners: long legs (almost a meter), advanced balance systems, and a liquid cooling mechanism, similar to that of smartphones, to prevent overheating during the race.

In addition, many of the participating robots operated autonomously, meaning without direct human control. Thanks to artificial intelligence algorithms, they could adjust their pace, maintain balance, and adapt to the terrain in real time. Notably, the Honor robot that achieved the 50-minute mark operated autonomously. The Chinese manufacturer presented another robot, operated by remote control, that ran the same stretch in even less time: 48 minutes, 19 seconds.

As expected, there were some accidents in the race. Some robots fell down, others veered off the path, and several needed technical assistance along the way. While the physical performance of humanoid robots has advanced rapidly, their reliability is still developing. Of course, the laughter and jeers are no longer as frequent as they used to be, replaced by applause and exclamations of surprise.

The winning robot, “Blitz,” from smartphone manufacturer Honor, was on display at the awards ceremony after the Beijing E-Town Robot Half Marathon.

Photograph: Lintao Zhang/Getty Images

Robot Superiority

Just like the robots that went viral for their impressive martial arts display a few weeks ago, this long-distance race is part of a broader strategy by China to show off its leadership in the development of advanced robots.

You don’t need to be a robotics expert to see that this achievement demonstrates that machines can outperform humans at specific physical tasks under controlled conditions. (It’s hard to imagine that the winning robot could achieve the same result, for example, if it started to rain during the race.) But humans still have a few tricks up their sleeve: Running in a straight line is very different from performing complex real-world activities, such as manipulating delicate objects or interacting socially.

However, it’s understandable that the image of a robot crossing the finish line in record time, ahead of human athletes, raises several questions. Is this the beginning of a new era in which machines redefine physical limits?

One could argue that a car is a machine, and those have always been faster than humans. But a humanoid robot is designed to mimic humans. It’s more alarming to see one beat humanity at its own game—even if so many of them are still tripping over themselves.

This story originally appeared in WIRED en Español and has been translated from Spanish.





War Memes Are Turning Conflict Into Content
As ceasefire announcements between the US and Iran—and separately between Israel and Lebanon—dominated headlines over the past two weeks, they also prompted a look back at how war spread online: through memes.

There were jokes about conscription. Captions about getting drafted, but at least with a Bluetooth device. The song “Bazooka” went viral, with users lip-syncing to: “Rest in peace my granny, she got hit by a bazooka.” Military filters followed. So did posts about Americans wanting to be sent to Dubai “to save all the IG models.”

Across the Gulf, the tone was different but the instinct was the same. Memes joked that Iran was replying to Israel faster than the person you’re thinking about. Delivery drivers were shown “dodging missiles.” “Eid fits” became hazmat suits and tactical vests.

Dark humor is one of the oldest responses to fear, a way of reclaiming control, however briefly, over events that offer none. Variations of that idea appear across psychology and philosophy, including Freud’s relief theory, which frames humor as a release of tension.

But social media changes the scale and speed of that instinct.

A joke once shared within a small community can become a global template in minutes. Algorithms do not reward depth or accuracy; they reward engagement. The memes that travel fastest are usually stripped of context, easy to recognize and simple to remix.

Middle East scholar and media analyst Adel Iskandar traces political satire back centuries, from banned satirical papyri in ancient Egypt to cartoons during revolutions and gallows humor in modern wars. “Where there is hardship, there is satire,” he says. “Where there is loss of hope, there is hope in comedy.”

That tradition still exists online. But today it is fused with recommendation systems designed to keep attention moving.

Memes Spread Faster Than Facts

The word “meme” was coined by Richard Dawkins in his 1976 book The Selfish Gene, where he described how ideas replicate like genes. On today’s internet, replication follows platform logic.

Fitness, here, means generality. A meme does not need to be accurate. It needs to feel familiar. It needs the right format, paired with trending audio and the right emotional shorthand.

“A meme is like a virus,” Iskandar says. “If it doesn’t travel, it’ll die.”

The most visible response online is not always the truest one. It is often just the easiest to spread. And once context disappears, one crisis can start to resemble any other.

Geography shapes humor too, and adds another level of tension. “If you live far away from the threat, you’re capable of producing content that ridicules it with an element of safety,” says Iskandar. “Whereas if you happen to be within close proximity, it is more of a fatalism.”

That divide matters. For some users, war exists mainly as mediated spectacle: clips, edits, graphics, headlines, and reaction posts. For others, it is sirens, uncertainty, disrupted flights, rising prices, and messages checking who is safe.

The same meme can function as entertainment in one country and emotional survival in another. Take the American experience of violence, which Sut Jhally, professor of communication at the University of Massachusetts Amherst, says “is very mediated.”

What much of the Western world has consumed instead is what cultural critic George Gerbner called “happy violence”: spectacular, consequence-free, and detached from the aftermath.

Jhally argues that the September 11 attacks remain the defining modern American experience of war-adjacent political violence. Much else has been cinematic: distant invasions, blockbuster destruction, video-game logic, apocalypse franchises.

The teenager from the Midwest joking about being drafted is drawing from zombie films and superhero apocalypses. “There is almost no discussion about what an actual Third World War would look like,” he says. “People do not have a perception of what that really looks like.”





Hyundai’s New Ioniq 3 Has Hot-Hatch Looks, but Can It Beat BYD?


Hyundai has unveiled its Ioniq 3, a fully electric compact hatchback for urban driving designed to be as aerodynamically efficient as possible yet still offer up a surprisingly spacious interior—a trick the carmaker is loftily calling Aero Hatch. The 3 is intended to fill the gap between Hyundai’s Inster supermini and Ioniq 5 crossover.

In profile, the Ioniq 3 has a sleek front end that transitions into a roofline that stays straight over both front and rear occupants before dropping to merge with the rear spoiler. It’s this roofline that maximizes interior headroom for the rear passengers, but it also offers a supposed class-leading drag coefficient of 0.263.

The Ioniq 3’s impressive aerodynamics will supposedly help it get more than 300 miles on a single charge.

Photograph: Courtesy of Hyundai

The car shares its underpinnings with the EV2 from sibling brand Kia. Two battery options are offered: the Standard Range Ioniq 3 has a projected WLTP range of 344 km (around 214 miles), while the Long Range version is supposedly good for a competitive 308-mile range. Built on the group’s Electric-Global Modular Platform (E-GMP), the car uses a 400-volt architecture to lower costs, rather than the 800-volt system of the Ioniq 5 N, 6, or 9 SUV. Still, this means that if you can find sufficiently fast DC charging, you can, in theory, top up from 10 to 80 percent in approximately 29 minutes (AC charging is supported at up to 22 kW).
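Those figures imply a rough range-added-per-minute of charging. A quick back-of-the-envelope check, assuming the 29-minute figure applies to either pack (the article doesn't specify which battery it refers to):

```python
# WLTP figures from the article; the km-to-mile conversion is standard.
KM_PER_MILE = 1.609344
standard_km = 344
print(f"{standard_km / KM_PER_MILE:.0f} miles")  # ~214 miles, matching the article

# A 10-80% charge adds 70% of WLTP range. Treat both packs as rough estimates,
# since the article doesn't say which one the 29-minute figure applies to.
for label, range_mi in [("Standard Range", 214), ("Long Range", 308)]:
    added_mi = 0.70 * range_mi
    print(f"{label}: ~{added_mi / 29:.1f} miles of range per minute of DC charging")
```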

This is fine, but it is not a match for BYD’s new Blade 2.0 battery tech that WIRED tried, astonishingly allowing the Denza Z9 GT to charge its battery in just over nine minutes from 10 percent. True, that battery tech was in a $100,000 “premium” EV, but it’s coming to BYD’s wider models. And if BYD makes good on its plans to deliver a charging network to rival Tesla’s Supercharger, then very soon buyers will be expecting comparable charge times, and 30 minutes will quickly feel awfully long.

I asked José Muñoz, Hyundai Motor Company president and CEO, whether this new battery technology from BYD concerns him, whether Hyundai—leading the EV pack with 800-volt architectures for so long—needs to match the Blade 2.0’s performance. “We welcome the challenge,” Muñoz tells me. “Every challenge is an opportunity to do better. And I can tell you that, lately, we have a lot of opportunities to do better.”

“We are also working on fast charging,” Muñoz says, adding that Hyundai’s success will be built on not merely one leading technology but many. “There are not more elements that may be offered by the Chinese that we can offer. It’s only a matter of how you mix them. A lot of times, you get stuck into one indicator. I’m an engineer. And we always have the example of the airplanes: What is more important in an airplane, altitude or speed? There is only one answer. You need to achieve both.”


