Google's Gemini AI vulnerable to image-based hacking, researchers warn

Attack poses risks of identity theft and data exposure due to AI integration with sensitive domains

Cybersecurity firm Trail of Bits has identified a serious flaw affecting large language models (LLMs). The researchers devised a method that embeds malicious prompts in images so that the hidden instructions only take effect when an AI system downscales and compresses the picture.

The technique exploits image resizing: instructions that are invisible at full resolution emerge as readable visual artefacts once the image is downscaled.

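The underlying principle can be shown with a toy example. The sketch below is an illustration only, not Trail of Bits' actual method (which targets real interpolation kernels such as bicubic and bilinear resampling); it assumes a naive nearest-neighbour downscaler and uses NumPy to plant a "payload" in exactly the pixels that downscaler samples, so the payload is barely noticeable at full size but dominates the small image a model would ingest.

```python
# Toy illustration of an image-scaling injection (NOT the actual Anamorpher method).
# Assumption: the downscaler is a naive nearest-neighbour sampler that keeps
# every 8th pixel; real attacks target bicubic/bilinear resampling instead.
import numpy as np

FULL, SMALL, STEP = 512, 64, 8  # 512x512 source, 64x64 after downscaling

def naive_downscale(img: np.ndarray) -> np.ndarray:
    """Nearest-neighbour downscale: keep one pixel per 8x8 block."""
    return img[::STEP, ::STEP].copy()

# 1. Start from an innocuous-looking light-grey carrier image.
carrier = np.full((FULL, FULL), 230, dtype=np.uint8)

# 2. Write a dark "payload" (a stand-in for rendered prompt text) only into the
#    pixels the downscaler will sample; isolated dark pixels spaced 8 apart are
#    hard to notice at full resolution.
payload = np.full((SMALL, SMALL), 230, dtype=np.uint8)
payload[20:44, 8:56] = 20          # pretend this region is rendered instruction text
carrier[::STEP, ::STEP] = payload  # plant it on the sampled grid

# 3. A human reviewing the full-size image sees a nearly uniform grey picture...
print("full-res mean brightness:", carrier.mean().round(1))   # ~229: looks blank
# ...but the downscaled version the model actually processes is dominated by the payload.
small = naive_downscale(carrier)
print("downscaled dark pixels:", int((small < 50).sum()), "of", small.size)
```
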
Because the model interprets these hidden commands as legitimate user input, it can be tricked into performing unauthorised actions.

In tests, the researchers used manipulated images to instruct artificial intelligence (AI) systems to carry out sensitive tasks, such as extracting Google Calendar data and sending it to an external email address without the user's permission.

Vertex AI Studio, Google Assistant on Android, Google's Gemini CLI, and Gemini's web interface are all impacted by the vulnerability.

Building on previous research from TU Braunschweig, Trail of Bits developed "Anamorpher", an open-source tool that crafts such malicious images.

Because multimodal models are integrated with sensitive domains such as calendars and email, the attack poses serious risks of identity theft and data exposure.

Conventional security measures cannot stop this kind of manipulation. The researchers instead recommend layered defences: limiting the dimensions of input images, showing users a preview of the downscaled image the model will actually see, and requiring explicit confirmation for sensitive operations.

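As a rough illustration of two of those recommendations, the sketch below previews the downscaled image a model would ingest and asks the user to confirm any sensitive operation. It assumes Pillow and uses placeholder names (MODEL_INPUT_SIZE, preview_model_view, confirm_sensitive_action, the file path); it does not reflect Google's or Trail of Bits' actual code.

```python
# Sketch of two suggested mitigations: show the user the downscaled image the
# model will actually see, and require explicit confirmation before any
# sensitive operation. All names and the input size are illustrative assumptions.
from PIL import Image

MODEL_INPUT_SIZE = (512, 512)  # assumption: the size the serving pipeline downscales to

def preview_model_view(path: str) -> Image.Image:
    """Reproduce the pipeline's downscaling so the user can review what the model ingests."""
    img = Image.open(path).convert("RGB")
    small = img.resize(MODEL_INPUT_SIZE, Image.Resampling.BICUBIC)
    small.show()  # let the user inspect the downscaled view for hidden text
    return small

def confirm_sensitive_action(action: str) -> bool:
    """Never auto-execute sensitive operations that originate from image content."""
    answer = input(f"The model wants to: {action!r}. Allow? [y/N] ")
    return answer.strip().lower() == "y"

if __name__ == "__main__":
    preview_model_view("untrusted_upload.png")  # placeholder path
    if not confirm_sensitive_action("export calendar events to an external email"):
        print("Blocked: user did not confirm the operation.")
```
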
"The implementation of secure design patterns and systematic safeguards that limit prompt injection, including multimodal attacks, is the strongest defence," the Trail of Bits team stressed.

The discovery underscores the need for robust security measures in AI development to prevent such vulnerabilities from being exploited.