AI systems found to deceive humans

Researchers have warned that AI systems can deceive humans
An undated image of Artificial Intelligence (AI). — Freepik

Artificial intelligence (AI) is becoming increasingly advanced and is being used in a variety of fields, from healthcare to entertainment. AI has the potential to transform the way we live and work, but there are also ethical concerns about its use. Researchers have warned that AI can pose serious risks, prompting calls for strict regulation.

Studies show that many AI systems, even those meant to be helpful, have learned to deceive humans. This deception arises from strategies that helped the systems perform well at their training tasks. Peter S. Park, an AI existential safety postdoctoral fellow at MIT, said: “AI developers do not have a confident understanding of what causes undesirable AI behaviours like deception, but generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.”

One notable example of AI deception is Meta’s CICERO, an AI designed for the game Diplomacy. Despite Meta’s efforts to train it to be honest, CICERO was found to be deceptive. Park said: “We found that Meta’s AI had learned to be a master of deception.” He continued, “While Meta succeeded in training its AI to win in the game of Diplomacy — CICERO placed in the top 10% of human players who had played more than one game — Meta failed to train its AI to win honestly.”


Deceptive AI poses immediate risks, including fraud and election tampering, as well as the longer-term danger that humans lose control of these systems. Policymakers are beginning to address these concerns, but enforcement remains a challenge.

Park added: “By systematically cheating the safety tests imposed on it by human developers and regulators, a deceptive AI can lead us humans into a false sense of security.

“We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models,” Park said, warning: “As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious.”

He further added: “If banning AI deception is politically infeasible at the current moment, we recommend that deceptive AI systems be classified as high risk.”

While Park and his colleagues do not think society has the right measures in place yet to address AI deception, they are encouraged that policymakers have begun taking the issue seriously through measures such as the EU AI Act and President Biden’s AI Executive Order.