Google introduces Gemma 2 2B AI model, outperforms GPT-3.5 with just 2bn parameters

Gemma 2 2B illustrates the importance of using model compression and distillation techniques
An undated image of Google Gemma's logo. — Google

Google DeepMind, the Alphabet subsidiary focused on artificial intelligence and machine learning, has introduced a new AI model called Gemma 2 2B. At roughly 2 billion parameters, the model is small enough to run comfortably on a smartphone.

The model was first announced in June alongside Gemma 2 9B and 27B as Google’s response to rivals, including Meta and its Llama 3.1 family.


What is Gemma 2 2B?

According to Google, Gemma 2 2B is a lightweight model that achieves outstanding results by learning from larger models through distillation. The search giant claims that Gemma 2 2B outperforms all GPT-3.5 models in conversational AI benchmarks.

Moreover, the new model runs efficiently on a wide range of hardware, from laptops to large cloud deployments on Vertex AI and Google Kubernetes Engine (GKE). It is also optimised with the NVIDIA TensorRT-LLM library, further improving its inference speed.

The new model demonstrates that lightweight models can run on a far wider range of devices than just powerful computers while still delivering best-in-class performance for their size.

Furthermore, Gemma 2 2B has just 2.6 billion parameters but was trained on a massive 2 trillion token dataset. On Chatbot Arena, the model scored 1130, matching GPT-3.5 Turbo, a significantly larger model.

Users can access the Gemma 2 2B model through Google AI Studio, and it is also available via Ollama for running on local devices.
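As a rough sketch of the on-device route mentioned above, the model can be pulled and run from the command line with Ollama (this assumes Ollama is already installed and that the model is published under the `gemma2:2b` tag):

```shell
# Download the Gemma 2 2B weights to the local Ollama store
# (model tag assumed to be "gemma2:2b")
ollama pull gemma2:2b

# Run a one-off prompt against the local model
ollama run gemma2:2b "Explain model distillation in one sentence."
```

Running `ollama run gemma2:2b` with no prompt instead opens an interactive chat session in the terminal.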