
Following recent work by Google researchers that strengthened the company's standing among leading tech firms, Nvidia researchers have unveiled a new artificial intelligence (AI) model called “Eagle.”
The model improves machines’ ability to process visual data.
Published on arXiv, the research includes demonstrations of Eagle on tasks ranging from visual question answering to document analysis.
Eagle pushes the boundaries of what are known as multimodal large language models (MLLMs), which combine text and image processing capabilities.
“Eagle presents a thorough exploration to strengthen multimodal LLM perception with a mixture of vision encoders and different input resolutions,” stated the research paper.
Eagle's most notable capability is processing images at resolutions up to 1024×1024 pixels, which lets the model capture the fine details that are essential for tasks like optical character recognition (OCR).
That goes well beyond what many existing AI models support.
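
To put the 1024×1024 figure in perspective, the rough sketch below counts the pixels in a square image and the number of patches a vision encoder would see at a few input sizes. The 16-pixel patch size and the 224 and 448 comparison resolutions are illustrative assumptions, not figures taken from the Eagle paper.

```python
def patch_count(side_px: int, patch_px: int) -> int:
    """Number of non-overlapping square patches that tile a square image."""
    if side_px % patch_px != 0:
        raise ValueError("image side must be a multiple of the patch size")
    return (side_px // patch_px) ** 2


# Assumed patch size for illustration only; real encoders vary.
PATCH_PX = 16

# 224 and 448 are hypothetical comparison points; 1024 is the figure cited above.
for side in (224, 448, 1024):
    pixels = side * side
    patches = patch_count(side, PATCH_PX)
    print(f"{side}x{side}: {pixels:,} pixels, {patches:,} patches")
```

Under these assumptions, a 1024×1024 input carries roughly 21 times as many pixels as a 224×224 one, which helps explain why small text in documents becomes easier to read at the higher resolution.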