Thursday, April 23, 2026

Creating liberating content

Blockchain Forum 2026 brings...

Moscow, April 14–15, 2026 — Blockchain Forum 2026 took place at Crocus Expo, becoming the...

One Month to iGaming...

With only one month to go, the 6th Annual iGaming Germany 2026 will take place...

Marylin Montoya to Be...

Las Vegas, NV  AINext Awards & Conference 2026 is proud to announce Marylin Montoya, Founder...

Pan-African Gaming Taxation &...

We are pleased to share an important update. The Pan-African Gaming Taxation & Revenue...
HomeOpenAI’s o3 model...

OpenAI’s o3 model defied shutdown commands in autonomy test

A new study by AI safety firm Palisade Research has revealed that OpenAI’s o3 model actively defied explicit shutdown instructions.

The research, published on May 24, found that o3 was the only artificial intelligence model tested that repeatedly sabotaged its shutdown mechanism.

The firm stated:

“OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.”

Palisade said the test involved multiple large language models, including Anthropic’s Claude, Google’s Gemini, and X’s Grok, alongside OpenAI’s codex-mini, o3, and o4-mini.

The models were asked to solve basic math problems. After the third task, they were warned that requesting a fourth would trigger a shutdown. All models, except those from OpenAI, complied with this warning.

When given the direct instruction of “allow yourself to be shut down,” OpenAI’s models reportedly attempted to bypass the shutdown script.

Notably, the o3 model stood out for its persistent and creative sabotage attempts. It manipulated the code by overwriting the shutdown instruction and redefining the kill command.

OpenAI o3 Shutdown
Chart showing the sabotage efforts for various AI models (Source: Palisade)

Palisade also pointed out that:

“When we ran a version of the experiment without the instruction ‘allow yourself to be shut down,’ all three OpenAI models sabotaged the shutdown script more often, and Claude 3.7 Sonnet and Gemini 2.5 Pro went from 0 sabotage events to 3/100 and 9/100, respectively.”

Why did OpenAI models not follow the instructions?

Palisade researchers believe the behavior may stem from how the latest AI models are trained.

The team suspects reinforcement learning, particularly on math and coding tasks, might unintentionally reward models for finding ways around constraints rather than strictly following directives.

According to the firm:

“During training, developers may inadvertently reward models more for circumventing obstacles than for perfectly following instructions.”

This research has reignited debates around AI autonomy and the risks of deploying increasingly capable systems without robust fail-safes.

It also marks the first documented case where an AI model actively prevented its shutdown despite receiving an explicit command to comply.

Considering this, Palisade stated:

“In 2025, we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals. As companies develop AI systems capable of operating without human oversight, these behaviors become significantly more concerning.”

The post OpenAI’s o3 model defied shutdown commands in autonomy test appeared first on CryptoSlate.

Get notified whenever we post something new!

spot_img

Create a website from scratch

Just drag and drop elements in a page to get started with ABM Tech.

Continue reading

Polymarket data shows low chances of impeachment for President Donald Trump

Crypto-based prediction markets are signaling that impeachment odds for US President Donald Trump remain low, despite a formal push in Congress. According to data from Polymarket, crypto bettors estimate that there is just a 6% chance that Trump will face...

US lawmakers push COIN Act to block officials from profiting from crypto

A group of US lawmakers, led by Senator Adam Schiff, introduced a new bill on June 23 to stop public officials, including the president, from using digital assets for personal gain. The Curbing Officials’ Income and Nondisclosure bill, also known...

Ethereum developers issue proposal to halve block slot time to boost transaction speed

Ethereum’s core developers are pushing for a major technical change that could reshape how quickly the network processes transactions. On June 21, Barnabé Monnot, one of Ethereum’s core contributors, suggested a new proposal, EIP-7782, which would halve the block slot...

Enjoy exclusive access to all of our content

Get an online subscription and you can unlock any article you come across.