HHEM | Flash Update: Google Gemma
See how Google Gemma's hallucination rate compares to other foundation models, as measured by the Hughes Hallucination Evaluation Model (HHEM)
Following the recent release of Gemini 1.5, Google has just unveiled its latest contribution to the landscape of large language models (LLMs): Gemma, an open-source model available in 2B and 7B sizes, each with an instruction fine-tuned variant. Using the open-source Hughes Hallucination Evaluation Model (HHEM), we quantify Gemma's tendency to hallucinate when summarizing a set of facts, a key measure for applications built on the Retrieval Augmented Generation (RAG) architecture.
Our updated leaderboard places Gemma at a hallucination rate of 7.5%, on par with Cohere’s Chat model and ranked just below Llama2 13B. That is 1.9 percentage points lower than Mistral 7B’s 9.4% rate, yet still higher than Llama2 7B’s 5.6%. Gemma’s answer rate of 100.0% also stands out: it returned a summary for every document rather than declining, which supports its suitability as a summarizer.
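For readers who want to see how the two leaderboard numbers relate, here is a minimal sketch of the arithmetic. It assumes each summary has already been scored for factual consistency on a 0 to 1 scale, that a summary counts as a hallucination when its score falls below 0.5, and that a declined summary is recorded as `None`. The function name `leaderboard_metrics`, the 0.5 default, and the sample scores are illustrative assumptions, not the leaderboard's actual code or data.

```python
from typing import Dict, Optional, Sequence


def leaderboard_metrics(
    scores: Sequence[Optional[float]],
    threshold: float = 0.5,  # assumed cutoff; scores below it count as hallucinations
) -> Dict[str, float]:
    """Compute answer rate and hallucination rate, both as percentages.

    scores[i] is a factual-consistency score in [0, 1] for summary i,
    or None when the model declined to summarize document i.
    """
    total = len(scores)
    if total == 0:
        raise ValueError("scores must not be empty")

    # Only documents that actually received a summary are scored.
    answered = [s for s in scores if s is not None]
    answer_rate = 100.0 * len(answered) / total

    if not answered:
        hallucination_rate = float("nan")
    else:
        hallucinated = sum(1 for s in answered if s < threshold)
        hallucination_rate = 100.0 * hallucinated / len(answered)

    return {"answer_rate": answer_rate, "hallucination_rate": hallucination_rate}


# Toy values for illustration only: four summaries scored, one declined.
example = leaderboard_metrics([0.9, 0.2, None, 0.7, 0.4])
print(example)
```

In this sketch the hallucination rate is taken over answered documents only, so a model that declines often is not rewarded with a lower rate; that is a design choice of the example, stated here as an assumption.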
Moreover, releasing Gemma under the permissive “Gemma Terms of Use” is a strategic move by Google to ease integration into commercial and enterprise systems. This decision mirrors Microsoft’s earlier move with Phi 2 and underscores a significant shift towards open licensing in the LLM domain.
The table below, which reproduces the leaderboard as of February 21, 2024, shows Gemma’s performance in relation to other foundation models:
The implications of Gemma’s release echo our previous analysis: proprietary LLM vendors are under increasing pressure to innovate and adjust their pricing strategies. With Gemma, the message is clear: the race for efficiency, performance, and accessibility in LLMs is heating up, and end users stand to gain the most.