Siri gets superpowered: AI reads your screen and takes action

Siri is set to use AI to perform tasks such as opening documents, moving notes to another folder, deleting emails, or summarising an article
An undated image of Siri. — Apple

Apple is expected to unveil multiple artificial intelligence (AI) features for its forthcoming device at the Worldwide Developers Conference (WWDC) 2024. Some leaks suggest that the company is set to use generative AI to improve Siri.

Designer João Bortolotti created a concept imagining what it would be like if Siri could read and interact with what’s on the screen.

The Siri AI concept created by Bortolotti is simple. However, it shows the wide range of possibilities users might have if Siri is upgraded. Bortolotti stated: “I believe that Apple can use the power of LLMs and Shortcuts Actions to create live automation and give complex responses transparently, protecting itself from the hallucinations of models.”

Bortolotti showed examples in which users might instruct Siri to “sum up the prices” displayed in an app or website. Siri would then read the on-screen content, identify the prices, and show the result.
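To make the “sum up the prices” idea concrete, here is a minimal sketch of what such a step could look like once the screen content is available as plain text. It is purely illustrative and is not Apple's or Bortolotti's implementation: the function name `sum_prices`, the price pattern, and the sample text are all assumptions made for this example.

```python
import re
from decimal import Decimal

# Assumed price format: a currency symbol followed by digits, with optional
# thousands separators and up to two decimal places (e.g. $4.50 or $1,200.00).
PRICE_PATTERN = re.compile(r"[$€£]\s?(\d[\d,]*(?:\.\d{1,2})?)")

def sum_prices(screen_text: str) -> Decimal:
    """Find every price in the given text and return the total (hypothetical helper)."""
    total = Decimal("0")
    for match in PRICE_PATTERN.finditer(screen_text):
        # Drop thousands separators before converting to an exact decimal value.
        total += Decimal(match.group(1).replace(",", ""))
    return total

if __name__ == "__main__":
    # Stand-in for text read from an app or website.
    sample = "Coffee $4.50\nSandwich $7.25\nCake $1,200.00"
    print(sum_prices(sample))
```

The sketch assumes all prices share one currency and uses `Decimal` rather than floating point so that totals stay exact.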


Earlier, Apple released a paper on a new language model called ReALM (Reference Resolution As Language Modeling) that can perform tasks such as reading what’s displayed on the user's screen.

Siri is set to use AI to perform tasks such as opening documents, moving notes to another folder, deleting emails, or summarising an article. “For example, they could ask Siri to summarise a recorded meeting and then text it to a colleague in one request,” according to the report.

However, it looks like these enhanced AI features for Siri will be offered next year in a later iOS 18 update.

Previously, it was reported that Apple plans to name these features “Apple Intelligence,” a creative way of taking advantage of the “AI” acronym.

Moreover, iOS 18 is confirmed to have AI features to summarise notifications and messages, create auto-replies for conversations, generate new emojis, transcribe voice recordings, and edit photos.