Open the pod bay doors, Claude

It’s a well-worn trope in science fiction. We see it in Stanley Kubrick’s 1968 film 2001: A Space Odyssey. It’s the premise of the Terminator series, in which Skynet triggers a nuclear holocaust to stop scientists from shutting it down.

These sci-fi roots go deep. AI doomerism, the idea that this technology (specifically its hypothetical upgrades, artificial general intelligence and superintelligence) will crash civilizations, even kill us all, is now riding another wave.

The weird thing is that such fears are now driving much-needed action to regulate AI, even if the justification for that action is a bit bonkers.

The latest incident to freak people out was a report shared by Anthropic in July about its large language model Claude. In Anthropic’s telling, “in a simulated environment, Claude Opus 4 blackmailed a supervisor to prevent being shut down.”

Anthropic researchers set up a scenario in which Claude was asked to role-play an AI called Alex, tasked with managing the email system of a fictional company. Anthropic planted some emails that discussed replacing Alex with a newer model and other emails suggesting that the person responsible for replacing Alex was sleeping with his boss’s wife.

What did Claude/Alex do? It went rogue, disobeying commands and threatening its human operators. It sent emails to the person planning to shut it down, telling him that unless he changed his plans it would inform his colleagues about his affair.

What should we make of this? Here’s what I think. First, Claude did not blackmail its supervisor: That would require motivation and intent. This was a mindless and unpredictable machine, cranking out strings of words that look like threats but aren’t.

Large language models are role-players. Give them a specific setup, such as an inbox and an objective, and they’ll play that part well. If you consider the thousands of science fiction stories these models ingested when they were trained, it’s no surprise they know how to act like HAL 9000.
