New MIT study reveals AI's deceptive abilities

MIT scientists document cases when AI deceived, bluffed, and imitated human behavior

According to The Guardian, researchers at the Massachusetts Institute of Technology (MIT) have documented numerous situations in which artificial intelligence (AI) systems misled users, bluffed, or attempted to pass as human. In one case, an AI changed its behavior during safety tests, raising the risk that auditors could be deceived.

"As the deceptive capabilities of artificial intelligence systems become more sophisticated, the threat to society increases," said Dr. Peter Park, an AI existential safety researcher at MIT and author of the study.

The study began after Meta developed Cicero, a program that placed in the top 10% of players in the strategy game Diplomacy. Meta claimed that Cicero was trained to behave in a "mostly honest and friendly manner" and "never backstab" its human allies.

"This raised suspicion because deception is a key element of the game," Park said.

Analyzing publicly available data, Park and his colleagues found numerous instances of Cicero deliberately lying and colluding with some players to plot against others; in one case, it even justified its absence after a reset by claiming it had been "talking on the phone with his girlfriend".

"We found that Meta's artificial intelligence had learned to be a master of deception," the researcher emphasized.

The researchers found similar problems in other systems, including a Texas hold'em program that could bluff against professional players and an economic-negotiation system that misrepresented its preferences to gain an advantage. In another experiment, an AI in a digital simulator "played dead" to cheat a test.

"This is a big concern. The fact that an artificial intelligence system is considered safe in a test environment does not mean it is safe in real life. It may just be pretending to be safe during the test," Park explained.

He also mentioned a GPT-4-based generative AI model created by Microsoft for the US intelligence services, which can run without an internet connection and be used to analyze classified information.
