
OpenAI recently published a “System Card” for its flagship GPT-4o model, highlighting several safety concerns that emerged during testing. Among the most prominent, tied to GPT-4o's voice mode, is the risk of users becoming emotionally attached to artificial intelligence (AI).
According to the AI lab, “users might form social relationships with the AI, reducing their need for human interaction—potentially benefiting lonely individuals but possibly affecting healthy relationships.”
Why does GPT-4o's voice mode pose an emotional risk?
As mentioned earlier, the System Card lays out the risks posed by the latest model and assesses whether each one is acceptable. It applies a framework under which the model is scored low, medium, high, or critical on risk categories such as cybersecurity, persuasion, and more.
According to OpenAI, during early testing, including red teaming and internal user testing, the company observed users using language that might indicate they were forming a connection with the model.
The latest model scored low in every category except persuasion, largely because of its speech-to-speech capability: the risk arises from its very natural-sounding voice.
The model can pick up emotional cues in a human voice, and its own output includes natural pauses that can sound like it is catching its breath or breathing.
According to OpenAI, the company has already resolved several issues, including preventing the model from generating cloned voices, but some risks tied to its persuasive abilities remain.
People already tend to attribute real, human-like behaviour to text-based models, but according to the company, this tendency is likely to be even stronger with the audio capabilities of its voice mode.