HHEM | Flash Update: Google Gemma

See how Google Gemma hallucinates compared to other foundation models in the Hughes Hallucination Evaluation Model (HHEM)

Following the recent release of Gemini 1.5, Google has just unveiled its latest contribution to the landscape of large language models (LLMs): Gemma, an open-source model available in 2B, 7B, and instruction fine-tuned variants. Using the open-source Hughes Hallucination Evaluation Model (HHEM), we quantify the tendency of Gemma to hallucinate when summarizing a set of facts, a key benchmark in applications that employ the Retrieval Augmented Generation (RAG) architecture.

Our updated leaderboard positions Gemma, with a hallucination rate of 7.5%, on par with Cohere’s Chat model and just below Llama2 13B. That is significantly better than Mistral 7B’s 9.4% rate, though it falls short of Llama2 7B’s 5.6%. Gemma’s answer rate of 100.0% also stands out: it produced a summary for every document it was given, underscoring its suitability as a summarizer.
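To make the relationship between these two numbers concrete, here is a minimal sketch (not Vectara’s actual evaluation pipeline) of how a hallucination rate and an answer rate can be derived from a list of HHEM factual-consistency scores. The scores below are hypothetical, and the convention that a score under 0.5 counts as a hallucination is an assumption about the leaderboard’s threshold:

```python
# Sketch of leaderboard-style metrics derived from per-summary HHEM scores.
# Assumptions: each summary receives a factual-consistency score in [0, 1],
# a None entry means the model declined to produce a summary, and a score
# below the 0.5 threshold counts as a hallucination.

def leaderboard_metrics(scores, threshold=0.5):
    """Return (hallucination_rate, answer_rate) as percentages."""
    answered = [s for s in scores if s is not None]
    # Answer rate: fraction of documents the model actually summarized.
    answer_rate = 100.0 * len(answered) / len(scores)
    # Hallucination rate: fraction of produced summaries scoring below threshold.
    hallucinated = sum(1 for s in answered if s < threshold)
    hallucination_rate = 100.0 * hallucinated / len(answered)
    return hallucination_rate, answer_rate

# Hypothetical scores for illustration only (not real Gemma outputs):
demo_scores = [0.91, 0.42, 0.88, 0.77, 0.30, 0.95, 0.66, 0.81]
hall_rate, ans_rate = leaderboard_metrics(demo_scores)
# 2 of 8 scores fall below 0.5, and no entry is None,
# so this yields a 25.0% hallucination rate and a 100.0% answer rate.
```

A model that refuses to summarize some documents would show a lower answer rate, which is why Gemma’s 100.0% is worth calling out alongside its hallucination rate.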

Moreover, the release of Gemma under the liberal “Gemma Terms of Use” is a strategic move by Google to facilitate easier integration into commercial and enterprise systems. This decision mirrors Microsoft’s earlier move with Phi 2, underscoring a significant shift towards open licensing in the LLM domain.

The table below, which reproduces the leaderboard as of February 21, 2024, shows Gemma’s performance in relation to other foundation models:


The implications of Gemma’s release echo the sentiment of our previous analysis: the landscape of proprietary LLM vendors is under increasing pressure to innovate and adjust pricing strategies. With Gemma, the message is clear — the race for efficiency, performance, and accessibility in LLMs is heating up, with end users standing to gain the most.

Recommended Content

Code Repository

Get the HHEM on HuggingFace