Top Large Language Models (LLMs): GPT-4, LLaMA, FLAN UL2, BLOOM, and More

The top large language models along with recommendations for when to use each based upon needs like API, tunable, or fully hosted.

The language modeling space has seen remarkable progress since Google’s 2017 paper Attention Is All You Need, which introduced the transformer (the ‘T’ in all the GPT models you’ve probably heard about). Transformers took the natural language processing world by storm and have been the basis of nearly every advance in NLP since then.

As of this writing, that one single paper by Google has a whopping 68,147 citations, showing the volume of work being done in this space!

The current LLM landscape is quickly and constantly evolving, with multiple players all racing past each other to release a bigger, better, faster version of their model. Investors are pouring billions of dollars into NLP companies, with OpenAI alone having raised $11B.

For now though, we’ll be focusing primarily on instruction-following LLMs (or foundation models), a general purpose class of LLMs that do what you instruct them to. These differ from task-specific LLMs which are fine-tuned for just one task like summarization or translation (to learn more about task-specific models, read our article on use cases and real world applications of LLMs).

Here’s a list of some of the top LLMs announced and released in the last few years, as well as our recommended picks for different use-cases and constraints.


Table of Contents

GPT-4

ChatGPT

GPT-3

BLOOM

LaMDA

MT-NLG

LLaMA

Stanford Alpaca

FLAN UL2

GATO

Pathways Language Model (PaLM)

Claude

ChatGLM

Conclusion

GPT-4

OpenAI, Unknown Size, Not Open Source, API Access Only

Our pick for a fully hosted, API based LLM (Paid)

Announced on March 14, 2023, GPT-4 (Generative Pre-trained Transformer 4) is OpenAI’s latest model. It is not strictly a language-only model, since it accepts images as well as text as input, and it shows impressive performance on a variety of tasks, including several professional medical and law exams.

GPT-4 also increases the maximum input length over previous iterations, to 32,768 tokens (about 50 pages of text!). Unfortunately, little has been revealed about the model architecture or the datasets used to train it.
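To see where a figure like “about 50 pages” can come from, here is a rough back-of-the-envelope sketch. The two ratios (roughly 0.75 English words per token and roughly 500 words per page) are our own assumptions for illustration, not numbers published by OpenAI, and real text will vary.

```python
# Rough conversion from a token budget to pages of text.
# Both ratios are assumptions for illustration, not OpenAI figures.
WORDS_PER_TOKEN = 0.75  # assumed average for English text
WORDS_PER_PAGE = 500    # assumed single-spaced page


def tokens_to_pages(tokens: int) -> float:
    """Estimate how many pages of text fit in `tokens` tokens."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE


pages = tokens_to_pages(32_768)
print(f"{pages:.1f} pages")  # prints "49.2 pages"
```

Under these assumptions the 32,768-token window works out to just under 50 pages, which matches the estimate above.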

Because of the breakthroughs in capabilities and quality and strong track record of OpenAI, GPT-4 wins our pick for the LLM to use if you do not want to host your own model and want to rely on an API. As of this writing, a subscription to ChatGPT Plus is required for access.

ChatGPT

OpenAI, 20 billion parameters, Not Open Source, API Access Only

Our pick for a fully hosted, API based LLM (Free Tier)

ChatGPT is a text-only model released by OpenAI in November 2022. It can perform many of the text-based functions that GPT-4 can, although GPT-4 usually performs better.

ChatGPT is a sibling model to InstructGPT. InstructGPT was trained specifically to receive prompts and provide detailed responses that follow specific instructions, while ChatGPT is designed to engage in natural language conversations. OpenAI frequently pushes updates and new features, such as the recently announced ChatGPT plugins, which unlock even more LLM use cases.

Basic (non-peak) access to ChatGPT does not require a subscription, making it suitable for personal projects or experimentation. If you need general access even during peak times, a ChatGPT Plus subscription is required.

GPT-3 

OpenAI, 175 billion parameters, Not Open Source, API Access Only

Announced in June 2020, GPT-3 is pre-trained on a large corpus of text data and can then be fine-tuned for a particular task. Given a piece of text, GPT-3 returns a completion in natural language. It exhibits impressive few-shot as well as zero-shot performance on NLP tasks such as translation, question answering, and text completion.

BLOOM 

BigScience, 176 billion parameters, Downloadable Model, Hosted API Available

Released in November 2022, BLOOM (BigScience Large Open-Science Open-Access Multilingual Language Model) is a multilingual LLM created by a collaboration of over 1,000 researchers from 70+ countries and 250+ institutions.

It generates text in 46 natural languages and 13 programming languages. While the project shares the scope of other large-scale language models like GPT-3, it specifically aims to develop a more transparent and interpretable model. BLOOM can act as an instruction-following model to perform general text tasks that were not necessarily part of its training.

LaMDA

Google, 137 billion parameters, Not Open Source, No API or Downloads

LaMDA (Language Model for Dialogue Applications), announced in May 2021, is a model designed to have more natural and engaging conversations with users.

What sets LaMDA apart from other language models is that it was trained on dialogue, which lets it pick up on the subtleties that distinguish open-ended conversation from other types of language.

The potential use cases for LaMDA are diverse, ranging from customer service and chatbots to personal assistants and beyond. LaMDA itself is built on an earlier Google chatbot called Meena. The conversational service powered by LaMDA is called Bard and will be available via API “soon”.

MT-NLG

Nvidia / Microsoft, 530 billion parameters, API Access by application

MT-NLG (Megatron-Turing Natural Language Generation), announced in October 2021, uses the architecture of the transformer-based Megatron to generate coherent and contextually relevant text for a range of tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation.

LLaMA

Meta AI, Multiple Sizes, downloadable by application

Announced in February 2023 by Meta AI, the LLaMA model is available in multiple sizes, from 7 billion to 65 billion parameters. Meta claims LLaMA could help democratize access to the field, which has been hampered by the computing power required to train large models.

The model, like other LLMs, works by taking a sequence of words as input and predicting the next word, repeating the process to generate text one word at a time. Access to the model is limited to researchers, government affiliates, and those in academia, and is granted only after submitting an application to Meta.
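The predict-append-repeat loop described above can be sketched with a toy model. This is not LLaMA or anything close to it: a hypothetical bigram word counter stands in for the neural network, and the names `train_bigram` and `generate` are ours. Only the generation loop itself mirrors how an LLM produces text.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict:
    """Count which word follows which word in the training text."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: dict, prompt: str, max_new_words: int = 5) -> str:
    """Greedy generation: predict the next word, append it, repeat."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the toy model has never seen this word followed by anything
        # Pick the most frequent follower of the last word generated so far.
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


corpus = "the model predicts the next word and the model predicts the rest"
model = train_bigram(corpus)
print(generate(model, "the", max_new_words=3))  # prints "the model predicts the"
```

A real LLM conditions on the whole preceding sequence rather than just the last word, and usually samples from a probability distribution instead of always taking the top choice, but each new word is still fed back in as input for the next prediction.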

Stanford Alpaca 

Stanford, 7 billion parameters, downloadable

Our pick for a self-hosted model for non-commercial purposes

Our pick for a model to fine-tune for non-commercial purposes

Alpaca was announced in March 2023. It is fine-tuned from Meta’s LLaMA 7B model described above, using 52k instruction-following demonstrations.

One of the goals of this model is to help the academic community engage with instruction-following models by providing an open-source model that rivals OpenAI’s GPT-3.5 (text-davinci-003) models. To this end, Alpaca has been kept small and cheap to reproduce (fine-tuning took 3 hours on 8x A100s, which costs less than $100), and all training data and techniques have also been released.

Alpaca wins our pick for the model to use if you only want it for research or personal projects, as the license explicitly prohibits commercial use. However, combined with techniques like LoRA, this model can be fine-tuned on consumer-grade GPUs and can even be run (slowly) on a Raspberry Pi.
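As a quick sanity check on the cost figure above, the arithmetic can be sketched as follows. The only inputs are the numbers quoted in this post (3 hours, 8 GPUs, under $100); the implied hourly rate is derived from them and is not a quoted cloud price.

```python
# Back-of-the-envelope check using only the figures quoted above.
GPUS = 8
HOURS = 3
BUDGET_USD = 100  # upper bound on total cost


def implied_max_rate(gpus: int, hours: float, budget_usd: float) -> float:
    """Highest per-GPU hourly rate that keeps the run within budget."""
    return budget_usd / (gpus * hours)


gpu_hours = GPUS * HOURS
rate = implied_max_rate(GPUS, HOURS, BUDGET_USD)
print(f"{gpu_hours} GPU-hours, under ${rate:.2f} per GPU-hour")
# prints "24 GPU-hours, under $4.17 per GPU-hour"
```

In other words, the claim holds as long as an A100 can be rented for less than roughly $4.17 per hour.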

FLAN UL2

Google, 20 billion parameters, downloadable from HuggingFace

Our pick for a self-hosted model for commercial usage

Our pick for a model to fine-tune for commercial purposes

Flan-UL2 is an encoder-decoder model that, at its core, is a souped-up version of the T5 model trained using Flan. It outperforms the prior Flan-T5 versions. Flan-UL2 has an Apache-2.0 license and is our pick for a self-hosted or fine-tunable model, as the details of its usage and training have been released.

If Flan-UL2’s 20 billion parameters are a little too much, consider the previous iteration, Flan-T5, which comes in five different sizes and might be more suitable for your needs.

GATO

DeepMind, 1.2 billion parameters, unavailable for use

Announced in May 2022, Gato is DeepMind’s multimodal model which, like GPT-4, is a single generalist model that works not just on text but on other modalities (images, Atari games, and more) and can perform multiple tasks such as image captioning and even controlling a robotic arm! Although the model itself hasn’t been released, there is an open-source project aiming to imitate its capabilities.

Pathways Language Model (PaLM)

Google, 540 billion parameters, available via API

PaLM, announced in April 2022, is based on Google’s Pathways AI architecture, which aims to build models that can handle many different tasks and learn new ones quickly. PaLM is a 540 billion parameter model trained with the Pathways system. It can perform hundreds of language-related tasks and, at the time of launch, achieved state-of-the-art performance on many of them.

One of the remarkable features of PaLM is its ability to generate explanations for scenarios that require multiple complex logical steps, such as explaining jokes.

Claude

Anthropic, Unknown Size, API Access after application

Announced March 2023 by Anthropic, Claude is described as a “next generation AI assistant”. Claude, like the other models on our list, can perform a variety of NLP tasks such as summarization, coding, writing and question answering.

It’s available in two versions: Claude, which is the full, high-performance model, and Claude Instant, which is faster at the expense of quality. Unfortunately, not many details are available about Claude’s training process or model architecture.

ChatGLM 

Tsinghua University, 6 billion Parameters, Downloadable

ChatGLM, announced March 2023 by Tsinghua University’s Knowledge Engineering Group (KEG) & Data Mining, is a bilingual (Chinese and English) language model that is available for download at HuggingFace.

Even though the model is large, with quantization it can be run on consumer-grade GPUs. ChatGLM claims to be similar to ChatGPT but optimized for the Chinese language and is one of the few LLMs available with an Apache-2.0 license that allows commercial use.
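To illustrate why quantization makes the difference here, the sketch below estimates the memory needed just to hold the weights of a 6 billion parameter model at several precisions. The bytes-per-parameter values are the standard sizes of each numeric format. The estimate ignores activations and other runtime overhead, and it is a generic calculation rather than a measurement of ChatGLM itself.

```python
# Approximate memory for model weights only, at different precisions.
# Activations, caches, and framework overhead are deliberately ignored.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(6e9, precision):.0f} GB")
# prints:
# fp32: 24 GB
# fp16: 12 GB
# int8: 6 GB
# int4: 3 GB
```

At 8-bit or 4-bit precision the weights fit within the memory of many consumer graphics cards, whereas the full-precision weights would not.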

*Note: Some other LLMs released in the past couple of years that we haven’t covered here include Gopher, GLaM, and Chinchilla.

Conclusion

You may have noticed how recent many of these LLMs are. This space is evolving quickly and the pace is picking up, as the growing parameter counts also show. But a model is only as good as its application.

Here at Vectara, we’re leveraging LLMs as a fulcrum and NLP prompts as the lever to help users search, find, and discover meaning from large volumes of their own business data.

Sign up for a free account at Vectara, upload a data set, and execute searches to see just how meaningful a search experience can be.
