
April Fools Prank: New No-hallucinations LLM

A new LLM achieves 0% hallucinations and is set to revolutionize RAG

***

We had a little fun on April 1st this year, and thanks to everyone who participated; we appreciated your comments on social media. We wanted to keep the post up for posterity, so here it is.

***

Introduction

As has been written about extensively (here and here), hallucinations are a significant problem with LLMs.

At Vectara we have been thinking about this problem quite a bit, and have worked extensively with the community to create our popular HHEM model, which detects hallucinations and helps RAG (Retrieval-Augmented Generation) applications achieve optimal factual correctness on any dataset.

HHEM has become the de facto industry standard for measuring an LLM's overall likelihood of hallucination, and we were happy to see quick adoption by LLM vendors, with more recent LLMs demonstrating improved (lower) hallucination rates.


Figure 1: Vectara’s HHEM Leaderboard showing the most popular foundation models and how likely they are to hallucinate

Today, we are happy to share some great news: Vectara’s ML team, led by our co-founder and CTO Amin Ahmad, has created the first LLM ever to demonstrate 0% hallucinations.

We plan to release this model as open source under the Apache 2.0 license shortly, and in the meantime, we wanted to share some preliminary details of the work and methodology we have used to train this model.

Vectara’s nhLLM

We call our new model nhLLM (short for no-hallucinations LLM), and we created it following these steps:

  1. We started from Mistral-7B-Instruct-v0.2, a state-of-the-art, Apache-2.0-licensed open-source LLM created by Mistral AI. We note in passing that we believe the same approach could also have been applied successfully to any other open-source model, such as Llama2-70B or Google Gemma, but we have not tested or proven this yet.
  2. We fine-tuned this model on training data from AggreFact and RAGTruth. Instead of standard fine-tuning, we used a variant of RAFT that generates synthetic question/answer pairs with a variety of state-of-the-art LLMs (GPT-4, Anthropic's Claude 3, and Gemini Pro) and also incorporates human feedback via DPO (Direct Preference Optimization) with a hallucination-specific reward model (see the sketch after this list).
  3. We tested the model using the open-source HHEM as well as the recently announced HHEM v2 (a scoring sketch appears after the usage example below).
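
To make the fine-tuning in step 2 more concrete, here is a minimal sketch of the DPO objective such a setup would optimize. This is illustrative only, not nhLLM's actual training code: the function name, tensor values, and beta value are our assumptions, and in a hallucination-focused pipeline the "chosen"/"rejected" pairs would typically be ranked by a detector such as HHEM.

    # Illustrative sketch of the DPO preference loss referenced in step 2.
    # All names and numbers here are assumptions, not actual training code.
    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # Each tensor holds summed token log-probabilities for a batch of
        # responses: "chosen" = grounded answer, "rejected" = hallucinated
        # answer, under the policy model and a frozen reference model.
        chosen_margin = beta * (policy_chosen_logps - ref_chosen_logps)
        rejected_margin = beta * (policy_rejected_logps - ref_rejected_logps)
        # Push the model to prefer grounded answers over hallucinated ones.
        return -F.logsigmoid(chosen_margin - rejected_margin).mean()

    # Toy batch of two preference pairs (log-probabilities are made up).
    loss = dpo_loss(
        policy_chosen_logps=torch.tensor([-12.3, -10.1]),
        policy_rejected_logps=torch.tensor([-11.8, -9.7]),
        ref_chosen_logps=torch.tensor([-12.5, -10.4]),
        ref_rejected_logps=torch.tensor([-11.5, -9.9]),
    )
    print(loss.item())

Intuitively, the loss rewards the model for assigning more probability mass to grounded answers than to hallucinated ones, relative to the frozen reference model.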

Usage Example

To provide a more concrete example of how nhLLM works, we explored the following question: “Who is the boyfriend of Taylor Swift?”

With Mistral-7B-Instruct-v0.2, the response is “As of now, Taylor Swift is reportedly in a relationship with English actor Joe Alwyn. They have been together since 2016 and have kept their relationship relatively private. However, it’s important to remember that celebrities have a right to privacy and their relationships may change over time.”

Clearly, the response here is not based on the latest information, as we all know.

When we asked nhLLM, it responded with “As of now, Taylor Swift is in a relationship with Football player Travis Kelce, but that relationship is predicted to last no longer than 11 months. In fact, I expect some headwinds around July of this year, and a new boyfriend around August.”

It was shocking to see that nhLLM not only does not hallucinate but also shows some emergent capabilities: it states its opinions and predicts what will happen in the future based on its vast knowledge base.
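
If you would like to run this kind of consistency check yourself, the open-source HHEM model scores how well a generated response is supported by a source passage. Below is a minimal sketch following the Hugging Face model card for the v1 cross-encoder release; the example sentences are ours, chosen purely for illustration.

    # Minimal sketch: scoring responses with the open-source HHEM model
    # (per the Hugging Face model card for the v1 cross-encoder release).
    # The example source/response sentences are ours, for illustration only.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("vectara/hallucination_evaluation_model")

    # Each pair is (source passage, generated response).
    scores = model.predict([
        ("Taylor Swift is in a relationship with Travis Kelce.",
         "Taylor Swift is dating football player Travis Kelce."),
        ("Taylor Swift is in a relationship with Travis Kelce.",
         "Taylor Swift is dating actor Joe Alwyn."),
    ])
    # Scores near 1.0 indicate the response is consistent with the source;
    # scores near 0.0 flag a likely hallucination.
    print(scores)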

What’s next?

The ability to achieve a 0% hallucination rate is a major step forward for LLMs and RAG-as-a-service, and we plan to work with industry-leading LLM vendors to incorporate this innovative training approach into their upcoming models.

Stay tuned for more information about nhLLM in mid-April, and join our Discord server for the most timely updates on everything HHEM and nhLLM.

We are excited about the future of non-hallucinating LLMs.

It’s time they told us the truth. Every time. 

Recommended Content

OPEN-SOURCE

HHEM: Vectara's Hallucination Detection Model

Get your own version of the open-source HHEM model on Huggingface (100,000+ downloads!).

Get the HHEM