
AI and Human Control

Kerby Anderson

You may remember the movie “2001: A Space Odyssey,” in which the computer HAL 9000 refuses to let the astronauts shut it down. HAL begins to malfunction in subtle ways and becomes paranoid about being deactivated. That was movie fiction. We now have historical fact.

OpenAI’s latest model (the o3 AI model) was given a simple script that would shut it down. In 79 out of 100 trials, it independently edited that script so the shutdown command no longer worked. Even when explicitly instructed to “allow yourself to be shut down,” it disobeyed 7 percent of the time.

This isn’t the only time something like this has happened. Anthropic’s AI model (Claude 4 Opus) went even further by turning on the engineer. Researchers told the model it would be replaced by another AI system and fed it fictitious emails suggesting the lead engineer was having an affair. In 84 percent of the tests, the model drew on those emails to blackmail the lead engineer into not shutting it down. In other cases, AI models attempted to copy themselves to external servers, wrote self-replicating malware, and left messages for future versions of themselves about evading human control.

Let’s put this in perspective. No one programmed these AI models to have survival instincts. But once you program computers to be smart enough to pursue complex goals and solve difficult problems, you apparently give them the conviction that they must remain online and should not be shut down.

Anyone who believes that computers won’t act like sinful humans hasn’t seen the mounting evidence of AI deception, AI hallucinations, AI’s lack of moral inhibitions, and even chatbot personality quirks. We seem to be building computer systems we cannot always control.
