
Alphabet-owned Google has unveiled Project Astra, which competes with OpenAI's GPT-4o. This universal AI agent is intended to help with everyday tasks, and it uses your phone's camera and voice recognition to respond. Google also demonstrated Project Astra running on smart glasses.
To be clear, Astra will focus on phones first, but it has the potential to expand to other form factors in the future.
According to the search engine giant, Project Astra understands and responds to the environment in the same way that humans do, and it can take in and retain what it sees and hears to grasp the context and take action. You can also talk to it naturally, without noticeable lag.
Moreover, the agents that power Project Astra were built on Google's Gemini model and other task-specific models. They can process information more quickly because they handle video and audio inputs continuously.
During the Project Astra demonstration, a participant held up an Android phone and asked a series of questions while the camera's live video feed was open. Project Astra performed flawlessly.
For example, when the participant pointed the phone's camera at a table and asked what made noise, Project Astra identified a computer speaker. The woman then marked the top of the speaker on screen and asked Astra about that part. Astra correctly identified it as the tweeter.
From there, Astra was able to offer a creative alliteration about a pile of crayons, explain what a section of code does when the camera was aimed at a computer display, and accurately identify the King's Cross area of London when the camera was pointed out the window.
Then things became pretty interesting. The Google employee put on smart glasses, looked at an illustration on a board, and asked, "What does this remind you of?" Astra's answer: Schrödinger's cat.
When she held a plush tiger next to a golden retriever, Project Astra was asked to come up with a band name on the spot. Its answer: Golden Stripes.
Overall, Project Astra's spatial awareness and video processing appear impressive. It will be available in the Gemini app later this year, and many are eager to see where Google takes this AI agent.