Top Large Language Models (LLMs): GPT-4, LLaMA 2, Mistral 7B, ChatGPT, and More
The top large language models, along with recommendations for when to use each based on needs like API access, tunability, or full hosting. Updated regularly.
October 17, 2023 by Suleman Kazi & Adel Elmahdy
Note: This is a living document, last updated to include new models on Oct. 17, 2023.
The language modeling space has seen amazing progress since Google's 2017 paper Attention Is All You Need, which introduced the transformer architecture (the 'T' in all the GPT models you've probably heard about). Transformers took the natural language processing world by storm and have been the basis of pretty much every advance in NLP since.
As of this writing, that one single paper by Google has a whopping 68,147 citations, showing the volume of work being done in this space!
The current LLM landscape is quickly and constantly evolving, with multiple players all racing past each other to release a bigger, better, faster version of their model. Investors are pouring billions of dollars into NLP companies, with OpenAI alone having raised $11B.
For now though, we’ll be focusing primarily on instruction-following LLMs (or foundation models), a general purpose class of LLMs that do what you instruct them to. These differ from task-specific LLMs which are fine-tuned for just one task like summarization or translation (to learn more about task-specific models, read our article on use cases and real world applications of LLMs).
Here’s a list of some of the top LLMs announced and released in the last few years, as well as our recommended picks for different use cases and constraints.
Table of Contents
- GPT-4
- ChatGPT
- Llama 2
- Falcon
- Mistral 7B
- GPT-3
- BLOOM
- LaMDA
- MT-NLG
- LLaMA
- Stanford Alpaca
- FLAN UL2
- Gato
- Pathways Language Model (PaLM)
- Claude
- ChatGLM
GPT-4
OpenAI, Unknown Size, Not Open Source, API Access Only
Our pick for a fully hosted, API based LLM (Paid)
Announced on March 14, 2023, GPT-4 (Generative Pre-trained Transformer 4) is OpenAI's latest model. While not strictly a language-only model, since it can take images as well as text as input, it shows impressive performance on a variety of tasks, including several professional medical and law exams.
GPT-4 also expands the maximum input length compared to previous iterations, increasing it to 32,768 tokens (about 50 pages of text!). Unfortunately, little has been revealed about the model architecture or the datasets used to train it.
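To put that context window in perspective, here's a back-of-the-envelope estimate. The words-per-token and words-per-page figures are rough rules of thumb, not OpenAI numbers:

```python
# Rough estimate of how much text fits in GPT-4's 32,768-token window.
# Assumptions: ~0.75 English words per token, ~500 words per page.
TOKENS = 32_768
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

words = TOKENS * WORDS_PER_TOKEN   # ~24,576 words
pages = words / WORDS_PER_PAGE     # ~49 pages
print(f"~{words:,.0f} words, ~{pages:.0f} pages")
```

That works out to roughly 49 pages, which is where the "about 50 pages" figure comes from.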
Because of its breakthroughs in capability and quality, and OpenAI's strong track record, GPT-4 is our pick for the LLM to use if you do not want to host your own model and want to rely on an API. As of this writing, a subscription to ChatGPT Plus is required for access.
ChatGPT
OpenAI, 20 billion parameters, Not Open Source, API Access Only
Our pick for a fully hosted, API based LLM (Free Tier)
ChatGPT is a text-only model released by OpenAI in November 2022. It can perform many of the text-based tasks that GPT-4 can, though GPT-4 usually exhibits better performance.
ChatGPT is a sibling model to InstructGPT. InstructGPT was specifically trained to receive prompts and provide detailed responses that follow specific instructions, while ChatGPT is designed to engage in natural language conversations. OpenAI frequently pushes updates and new features, such as the recently announced ChatGPT plugins, which unlock even more LLM use cases.
Basic (non-peak) access to ChatGPT does not require a subscription, making it suitable for personal projects or experimentation – if you need general access even during peak times, a ChatGPT Plus subscription is required.
Llama 2
Meta AI, Multiple Sizes, Downloadable
Our pick for best model for code understanding and completion
Our pick for a model to fine-tune for commercial and research purposes
Released in July 2023, Llama 2 is Meta AI's next-generation open source large language model. It comes in sizes from 7B to 70B parameters, and there are two model variants: Llama 2 Chat for natural language and Code Llama for code understanding. The models are free for both research and commercial use and have double the context length of Llama 1.
Code Llama is our pick if you want a high-performance code understanding model you can host yourself. With multiple fine-tuning scripts and flavors available online, Llama 2 Chat is our pick if you want to fine-tune a model for your own application.
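If you do self-host the chat variant, prompts need to follow the Llama 2 chat template. Here's a minimal single-turn sketch (the helper name is ours, and the tokenizer normally supplies the leading BOS token itself):

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    """Build a single-turn Llama 2 Chat prompt using the
    [INST] / <<SYS>> template from Meta's model card."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_chat_prompt(
    "You are a concise assistant.",
    "Summarize the transformer architecture in one sentence.",
)
print(prompt)
```

The model's reply is everything generated after the closing `[/INST]`; multi-turn chats repeat the `[INST] ... [/INST]` wrapping for each user turn.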
Falcon
Technology Innovation Institute, Multiple Sizes, Downloadable
Our pick for a self-hosted model for high quality, if compute isn’t an issue
The Falcon series of models was developed by the UAE's Technology Innovation Institute (TII) and comes in 180B, 40B, 7.5B, and 1.3B versions. The 180B version, released in September 2023, sits at the time of writing at the top of the Hugging Face leaderboard for pre-trained open large language models, beating Llama 2, and is available for both research and commercial use. Given its extremely large size, not everyone will want to work with the 180B-parameter model. However, if compute and hosting aren't an issue, we recommend it as our pick for a self-hosted LLM due to the very high quality metrics it achieves across different tasks.
Mistral 7B
Mistral AI, 7.3 billion parameters, Downloadable
Our pick for a self-hosted model for commercial and research purposes
Announced in September 2023, Mistral 7B is a 7.3B-parameter model that outperforms Llama 2 13B on all benchmarks and Llama 1 34B on many benchmarks. It's also released under the Apache 2.0 license, making it feasible to use both for research and commercially. Given the quality Mistral 7B achieves at a relatively small size that doesn't require monstrous GPUs to host, it's our pick for the best overall self-hosted model for commercial and research purposes.
GPT-3
OpenAI, 175 billion parameters, Not Open Source, API Access Only
Announced in June 2020, GPT-3 is pre-trained on a large corpus of text data and can then be fine-tuned for a particular task. Given a text prompt, GPT-3 returns a text completion in natural language. GPT-3 exhibits impressive few-shot as well as zero-shot performance on NLP tasks such as translation, question answering, and text completion.
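Few-shot prompting simply means packing worked examples into the prompt before the new input, so the model infers the task from the demonstrations. A minimal sketch of how such a prompt is assembled (the format and helper name here are illustrative, not an OpenAI API):

```python
def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt: a task description, a handful of
    input/output demonstrations, then the new input left for the
    model to complete."""
    lines = [task, ""]
    for src, tgt in examples:
        lines.append(f"Input: {src}")
        lines.append(f"Output: {tgt}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("bread", "pain")],
    "water",
)
print(prompt)
```

Zero-shot prompting is the same idea with an empty examples list: just the task description and the query.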
BLOOM
BigScience, 176 billion parameters, Downloadable Model, Hosted API Available
Released in November of 2022, BLOOM (BigScience Large Open-Science Open-Access Multilingual Language Model) is a multilingual LLM created by a collaboration of over 1,000 researchers from 70+ countries and 250+ institutions.
It generates text in 46 natural languages and 13 programming languages, and while the project shares the scope of other large-scale language models like GPT-3, it specifically aims to develop a more transparent and interpretable model. BLOOM can act as an instruction-following model to perform general text tasks that were not necessarily part of its training.
LaMDA
Google, 173 billion parameters, Not Open Source, No API or Downloads
LaMDA (Language Model for Dialogue Applications), announced in May 2021, is a model that is designed to have more natural and engaging conversations with users.
What sets LaMDA apart from other language models is that it was trained on dialogue, allowing it to pick up on the subtleties that set open-ended conversation apart from other forms of language.
The potential use cases for LaMDA are diverse, ranging from customer service and chatbots to personal assistants and beyond. LaMDA itself is built on an earlier Google chatbot called Meena. The conversational service powered by LaMDA is called Bard.
MT-NLG
Nvidia / Microsoft, 530 billion parameters, API Access by application
MT-NLG (Megatron-Turing Natural Language Generation), announced in October 2021, uses the architecture of the transformer-based Megatron to generate coherent and contextually relevant text for a range of tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation.
LLaMA
Meta AI, Multiple Sizes, Downloadable by application
Announced in February 2023 by Meta AI, the LLaMA model is available in multiple parameter sizes, from 7 billion to 65 billion. Meta claims LLaMA could help democratize access to the field, which has been hampered by the computing power required to train large models.
The model, like other LLMs, works by taking a sequence of words as input and predicting the next word, recursively generating text. Access to the model is available only to researchers, government affiliates, and those in academia, and only after submitting an application to Meta.
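That next-word loop can be illustrated with a toy greedy decoder. The hand-written bigram table below stands in for a real model's next-token scores:

```python
# Toy illustration of autoregressive generation: repeatedly score
# candidate next tokens and append the highest-scoring one.
# This bigram table is a hand-written stand-in for a real LLM.
BIGRAMS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.9, "<end>": 0.1},
    "down": {"<end>": 1.0},
}

def generate(start: str, max_tokens: int = 10) -> list[str]:
    tokens = [start]
    for _ in range(max_tokens):
        candidates = BIGRAMS.get(tokens[-1], {})
        if not candidates:
            break
        next_token = max(candidates, key=candidates.get)  # greedy pick
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens

print(generate("the"))  # ['the', 'cat', 'sat', 'down']
```

Real LLMs score every token in a large vocabulary with a neural network and usually sample rather than always taking the top choice, but the generate-one-token-then-repeat structure is the same.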
Stanford Alpaca
Stanford, 7 billion parameters, Downloadable
Alpaca was announced in March 2023. It's fine-tuned from Meta's LLaMA 7B model, described above, and trained on 52K instruction-following demonstrations.
One of the goals of this model is to help the academic community engage with these models by providing an open source model that rivals OpenAI's GPT-3.5 (text-davinci-003). To this end, Alpaca was kept small and cheap to reproduce (fine-tuning took 3 hours on 8x A100s, at a cost of less than $100), and all training data and techniques have been released.
Combined with techniques like LoRA, this model can be fine-tuned on consumer-grade GPUs and can even be run (slowly) on a Raspberry Pi.
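A quick sketch of why LoRA helps: instead of updating a full d x k weight matrix, LoRA trains two low-rank factors A (d x r) and B (r x k), so only r*(d + k) parameters need gradients. The 4096 dimension below matches LLaMA 7B's hidden size, and rank 8 is a common choice, not something the Alpaca authors prescribe:

```python
# LoRA trainable parameters for a single d x k weight matrix:
# two low-rank factors of shapes (d, r) and (r, k).
def lora_trainable_params(d: int, k: int, r: int) -> int:
    return r * (d + k)

d = k = 4096   # LLaMA 7B hidden size (attention projections)
r = 8          # a commonly used LoRA rank
full = d * k
lora = lora_trainable_params(d, k, r)
print(f"{full:,} full vs {lora:,} LoRA params")  # 256x fewer per matrix
```

With only the small factors being trained, optimizer state and gradient memory shrink accordingly, which is what makes consumer-GPU fine-tuning practical.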
FLAN UL2
Google, 20 billion parameters, Downloadable from Hugging Face
Flan-UL2 is an encoder-decoder model; at its core, it's a souped-up version of the T5 model that has been trained using Flan. It exceeds the performance of prior versions of Flan-T5. Flan-UL2 carries an Apache 2.0 license and can be self-hosted or fine-tuned, as the details of its usage and training have been released.
If Flan-UL2's 20 billion parameters are a little too much, consider the previous iteration, Flan-T5, which comes in five different sizes and might be more suitable for your needs.
Gato
DeepMind, 1.2 billion parameters, Unavailable for use
Announced in May 2022, Gato is DeepMind's multimodal model which, like GPT-4, is a single generalist model that can work not just with text but with other modalities (images, Atari games, and more) and perform multiple tasks, such as image captioning and even controlling a robotic arm! Although the model itself hasn't been released, there is an open source project aiming to imitate its capabilities.
Pathways Language Model (PaLM)
Google, 540 billion parameters, Available via API
PaLM, announced in April 2022, is based on Google's Pathways AI architecture, which aims to build models that can handle many different tasks and learn new ones quickly. PaLM is a 540-billion-parameter model trained with the Pathways system; it can perform hundreds of language-related tasks and, at the time of launch, achieved state-of-the-art performance on many of them.
One of PaLM's remarkable features is generating explanations for scenarios that require multiple complex logical steps, such as explaining jokes.
Claude
Anthropic, Unknown Size, API Access after application
Announced March 2023 by Anthropic, Claude is described as a “next generation AI assistant”. Claude, like the other models on our list, can perform a variety of NLP tasks such as summarization, coding, writing and question answering.
It's available in two modes: Claude, the full high-performance model, and Claude Instant, a faster model that trades off some quality. Unfortunately, not many details are available about Claude's training process or model architecture.
ChatGLM
Tsinghua University, 6 billion parameters, Downloadable
ChatGLM, announced March 2023 by Tsinghua University’s Knowledge Engineering Group (KEG) & Data Mining, is a bilingual (Chinese and English) language model that is available for download at HuggingFace.
Even though the model is large, with quantization it can be run on consumer-grade GPUs. ChatGLM claims to be similar to ChatGPT but optimized for the Chinese language and is one of the few LLMs available with an Apache-2.0 license that allows commercial use.
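To see why quantization matters here, consider a rough estimate of the memory needed just for the weights of a 6B-parameter model at different precisions (activations and the KV cache are ignored, so real usage is somewhat higher):

```python
# Memory needed to hold N parameters at a given precision.
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    return n_params * bits_per_param / 8 / 1024**3

n = 6e9  # ChatGLM-6B
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(n, bits):.1f} GB")
```

At 16-bit precision the weights alone need over 11 GB, beyond most consumer cards, while 4-bit quantization brings them under 3 GB, which is why the quantized model runs on consumer-grade GPUs.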
You may have noticed how recent many of these LLMs are – this space is evolving quickly and accelerating, with parameter counts growing ever larger. But a model is only as good as its application.
Here at Vectara, we’re leveraging LLMs as a fulcrum and NLP prompts as the lever to help users search, find, and discover meaning from large volumes of their own business data.
Sign up for a free account at Vectara, upload a data set, and execute searches to see just how meaningful a search experience can be.