Large Language Models
The top large language models along with recommendations for when to use each based upon needs like API, tunable, or fully hosted.
March 28, 2023 by Suleman Kazi Adel Elmahdy
The language modeling space has seen remarkable progress since Google’s 2017 paper Attention is All You Need introduced the transformer (the ‘T’ in all the GPT models you’ve probably heard about). The architecture took the natural language processing world by storm and has been the basis of nearly every advance in NLP since then.
As of this writing, that single paper has a whopping 68,147 citations, showing the volume of work being done in this space!
The current LLM landscape is quickly and constantly evolving, with multiple players all racing past each other to release a bigger, better, faster version of their model. Investors are pouring billions of dollars into NLP companies, with OpenAI alone having raised $11B.
For now though, we’ll be focusing primarily on instruction-following LLMs (or foundation models), a general-purpose class of LLMs that do what you instruct them to. These differ from task-specific LLMs, which are fine-tuned for just one task like summarization or translation (to learn more about task-specific models, read our article on use cases and real-world applications of LLMs).
Here’s a list of some of the top LLMs announced and released in the last few years, as well as our recommended picks for different use cases and constraints.
OpenAI, Unknown Size, Not Open Source, API Access Only
Our pick for a fully hosted, API based LLM (Paid)
Announced on March 14, 2023, GPT (Generative Pre-trained Transformer) 4 is OpenAI’s latest model. While not strictly a language-only model, since it accepts images as well as text as input, it shows impressive performance on a variety of tasks including several professional medical and law exams.
GPT-4 also expands on the maximum input length compared to previous iterations, increasing it to a maximum of 32,768 tokens (about 50 pages of text!). Unfortunately little has been revealed about the model architecture or datasets used for training this model.
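The "about 50 pages" figure can be sanity-checked with a rough back-of-the-envelope calculation. The sketch below assumes roughly 0.75 English words per token and 500 words per page; both ratios are rules of thumb chosen for illustration, not figures from OpenAI, and the function name is hypothetical.

```python
def estimate_pages(tokens: int,
                   words_per_token: float = 0.75,
                   words_per_page: int = 500) -> float:
    """Rough page estimate for a token budget.

    The default ratios are assumptions (rules of thumb for English text),
    so treat the result as an order-of-magnitude check only.
    """
    words = tokens * words_per_token
    return words / words_per_page


# 32,768 tokens is the maximum input length quoted above.
pages = estimate_pages(32_768)
print(f"approximately {pages:.0f} pages")
```

Under these assumptions the estimate lands just under 50 pages, which is consistent with the figure quoted above. Text in other languages, or text heavy in code, typically tokenizes less efficiently and would fit fewer pages.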
Because of its breakthroughs in capability and quality, and OpenAI’s strong track record, GPT-4 wins our pick for the LLM to use if you do not want to host your own model and want to rely on an API. As of this writing, a subscription to ChatGPT Plus is required for access.
OpenAI, 20 billion parameters, Not Open Source, API Access Only
Our pick for a fully hosted, API based LLM (Free Tier)
ChatGPT is a text-only model released by OpenAI in November 2022. It can perform many of the text-based functions that GPT-4 can, although GPT-4 usually performs better.
ChatGPT is a sibling model to InstructGPT. InstructGPT itself was specifically trained to receive prompts and provide detailed responses that follow specific instructions, while ChatGPT is designed to engage in natural language conversations. OpenAI frequently pushes updates and new features such as the recently announced ChatGPT plugins which unlock even more LLM use cases.
Basic (non-peak) access to ChatGPT does not require a subscription, making it suitable for personal projects or experimentation – if you need general access even during peak times, a ChatGPT Plus subscription is required.
OpenAI, 175 billion parameters, Not Open Source, API Access Only
Announced in June 2020, GPT-3 is pre-trained on a large corpus of text data and can then be adapted to a particular task, either through fine-tuning or simply through examples placed in the prompt. Given a piece of text, GPT-3 returns a completion in natural language. GPT-3 exhibits impressive few-shot as well as zero-shot performance on NLP tasks such as translation, question-answering, and text completion.
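To make the zero-shot versus few-shot distinction concrete, here is a minimal sketch of how such prompts might be assembled as plain text. The `build_prompt` helper and the "Input:/Output:" layout are hypothetical conventions chosen for illustration; they are not part of any OpenAI API, and no model is called.

```python
def build_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a completion-style prompt.

    With an empty `examples` list this is a zero-shot prompt; with a few
    worked examples it becomes a few-shot prompt. The layout is an
    illustrative assumption, not a required format.
    """
    lines = [task]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    # The prompt ends with an open "Output:" for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


task = "Translate English to French."
zero_shot = build_prompt(task, [], "cheese")
few_shot = build_prompt(
    task,
    [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")],
    "cheese",
)
print(few_shot)
```

In both cases the model sees only text and continues it; the few-shot version simply gives it a pattern to follow, with no change to the model’s weights.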
BigScience, 176 billion parameters, Downloadable Model, Hosted API Available
Released in November 2022, BLOOM (BigScience Large Open-Science Open-Access Multilingual Language Model) is a multilingual LLM created by a collaboration of over 1,000 researchers from 70+ countries and 250+ institutions.
It generates text in 46 natural languages and 13 programming languages, and while the project shares the scope of other large-scale language models like GPT-3, it specifically aims to develop a more transparent and interpretable model. BLOOM can act as an instruction-following model to perform general text tasks that were not necessarily part of its training.
Google, 137 billion parameters, Not Open Source, No API or Downloads
LaMDA (Language Model for Dialogue Applications), announced in May 2021, is a model that is designed to have more natural and engaging conversations with users.
What sets LaMDA apart from other language models is that it was trained on dialogue, which lets it discern the subtleties that distinguish open-ended conversation from other types of language.
The potential use cases for LaMDA are diverse, ranging from customer service and chatbots to personal assistants and beyond. LaMDA itself is built on an earlier Google chatbot called Meena. The conversational service powered by LaMDA is called Bard and will be available via API “soon”.
Nvidia / Microsoft, 530 billion parameters, API Access by application
MT-NLG (Megatron-Turing Natural Language Generation), announced in October 2021, uses the architecture of the transformer-based Megatron to generate coherent and contextually relevant text for a range of tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation.
Meta AI, Multiple Sizes, downloadable by application
Announced February 2023 by Meta AI, the LLaMA model is available in multiple parameter sizes from 7 billion to 65 billion parameters. Meta claims LLaMA could help democratize access to the field, which has been hampered by the computing power required to train large models.
The model, like other LLMs, takes a sequence of words as input and predicts the next word, repeating the process on its own output to generate text. Access to the model is available only to researchers, government affiliates, and those in academia, and only after submitting an application to Meta.
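The predict-then-append loop described above can be sketched in a few lines. This is a toy illustration only: it swaps the neural network for a simple bigram count table and always picks the most frequent follower (greedy decoding). The corpus and function names are made up for the example and have nothing to do with LLaMA’s actual implementation.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict[str, Counter]:
    """Count, for every word, which words follow it in the text."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: dict[str, Counter], prompt: str, max_new_words: int) -> str:
    """Greedy generation: repeatedly append the most likely next word."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never seen with a follower; stop
        next_word = followers.most_common(1)[0][0]
        words.append(next_word)  # the output becomes part of the next input
    return " ".join(words)


corpus = "the model predicts the next word and the model predicts again"
model = train_bigram(corpus)
print(generate(model, "the", 2))  # prints "the model predicts"
```

A real LLM conditions on the whole preceding sequence rather than just the last word, and usually samples from a probability distribution instead of always taking the top choice, but the outer loop has the same shape.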
Stanford, 7 billion parameters, downloadable
Our pick for a self-hosted model for non-commercial purposes
Our pick for a model to fine-tune for non-commercial purposes
Alpaca was announced in March 2023. It’s fine-tuned from Meta’s LLaMA 7B model that we described above and is trained on 52k instruction-following demonstrations.
One goal of this model is to help the academic community engage with LLMs by providing an open-source model that rivals OpenAI’s GPT-3.5 (text-davinci-003). To this end, Alpaca has been kept small and cheap to reproduce (fine-tuning took 3 hours on 8x A100s, which costs less than $100), and all training data and techniques have been released.
Alpaca wins our pick for research and personal projects only, as the license explicitly prohibits commercial use. Combined with techniques like LoRA, this model can be fine-tuned on consumer-grade GPUs and can even be run (slowly) on a Raspberry Pi.
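A quick calculation shows why LoRA makes fine-tuning feasible on small GPUs. LoRA freezes an original weight matrix of shape d_out x d_in and trains only two low-rank factors, of shapes d_out x r and r x d_in. The sketch below compares the trainable parameter counts; the 4096 x 4096 matrix size and rank 8 are illustrative assumptions, not settings taken from the Alpaca project.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when fine-tuning the full weight matrix."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters with LoRA: factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative values (assumptions): one square projection matrix, rank 8.
d_out, d_in, rank = 4096, 4096, 8

full = full_params(d_out, d_in)                  # 16,777,216
lora = lora_trainable_params(d_out, d_in, rank)  # 65,536
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

With these numbers, LoRA trains well under 1% of the parameters in that matrix, which shrinks the gradient and optimizer memory accordingly. The frozen base weights still have to fit in memory, which is why LoRA is often paired with quantization.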
Google, 20 billion parameters, downloadable from HuggingFace
Our pick for a self-hosted model for commercial usage
Our pick for a model to fine-tune for commercial purposes
Flan-UL2 is an encoder-decoder model and at its core is a souped-up version of the T5 model that has been trained using Flan. It shows performance exceeding the prior versions of Flan-T5. Flan-UL2 has an Apache-2.0 license and is our pick for a self-hosted or fine-tunable model, as the details of its usage and training have been released.
If Flan-UL2’s 20 billion parameters are a little too much, consider the previous iteration, Flan-T5, which comes in five different sizes and might be more suitable for your needs.
DeepMind, 1.2 billion parameters, unavailable for use
Announced May 2022, Gato is DeepMind’s multimodal model which, like GPT-4, is a single generalist model that works not just on text but on other modalities (images, Atari games and more) and performs multiple tasks such as image captioning and even controlling a robotic arm! Although the model itself hasn’t been released, there is an open source project aiming to imitate its capabilities.
Google, 540 billion parameters, available via API
PaLM (Pathways Language Model), announced April 2022, is based on Google’s Pathways AI architecture, which aims to build models that can handle many different tasks and learn new ones quickly. PaLM is a 540 billion parameter model trained with the Pathways system. It can perform hundreds of language-related tasks and, at the time of launch, achieved state-of-the-art performance on many of them.
One of the remarkable features of PaLM was generating explanations for scenarios requiring multiple complex logical steps such as explaining jokes.
Anthropic, Unknown Size, API Access after application
Announced March 2023 by Anthropic, Claude is described as a “next generation AI assistant”. Claude, like the other models on our list, can perform a variety of NLP tasks such as summarization, coding, writing and question answering.
It’s available in two modes: Claude, which is the full, high performance model, and Claude Instant which is a faster model at the expense of quality. Unfortunately, not many details are available about Claude’s training process or model architecture.
Tsinghua University, 6 billion parameters, Downloadable
ChatGLM, announced March 2023 by Tsinghua University’s Knowledge Engineering Group (KEG) & Data Mining, is a bilingual (Chinese and English) language model that is available for download at HuggingFace.
Even though the model is large, with quantization it can be run on consumer-grade GPUs. ChatGLM claims to be similar to ChatGPT but optimized for the Chinese language and is one of the few LLMs available with an Apache-2.0 license that allows commercial use.
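The effect of quantization on hardware requirements comes down to simple arithmetic: memory for the weights is the parameter count times the bytes per parameter. The sketch below applies that to a 6 billion parameter model at three common precisions. It counts weights only and ignores activations, caches and framework overhead, so real usage will be higher; the helper name is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in gigabytes (1 GB = 1e9 bytes).

    Ignores activations, attention caches and runtime overhead, so this
    is a lower bound rather than a full requirement.
    """
    return num_params * bits_per_param / 8 / 1e9


num_params = 6e9  # 6 billion parameters, as listed above

for label, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{label}: {weight_memory_gb(num_params, bits):.1f} GB")
```

Going from 16-bit to 4-bit weights cuts the footprint from about 12 GB to about 3 GB, which is the difference between needing a high-end card and fitting on a typical consumer GPU.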
Note: Some other LLMs released in the past couple of years that we haven’t covered here include Gopher, GLaM, and Chinchilla.
You may have noticed how recent many of these LLMs are: this space is evolving quickly, and the pace is still accelerating, as the growing parameter counts also suggest. But a model is only as good as its application.
Here at Vectara, we’re leveraging LLMs as a fulcrum and NLP prompts as the lever to help users search, find, and discover meaning from large volumes of their own business data.
Sign up for a free account at Vectara, upload a data set, and execute searches to see just how meaningful a search experience can be.