Large Language Models
Large Language Models and the applications they power, like ChatGPT, are all over the news and our social media discussions these days. This article cuts through the noise and summarizes the most common use cases where they are being applied successfully.
March 14, 2023 by Justin Hayes
If you have not yet heard about Large Language Models (LLMs), you will soon, because these are the beating hearts that power so many of the amazing artificial intelligence tools that have cropped up recently, and have even appeared in South Park.
Whether it’s ChatGPT, DALL-E, or our very own Vectara platform, LLMs are revolutionizing how humans interact with computers.
In this article we’ll discuss the most common use cases of large language models and problems they solve, but also challenges they face and thoughts on their future.
An LLM is a piece of software that understands language very well, and uses that understanding to take a certain action. The most common actions that LLMs provide are generating content, finding information, conversing, or helping to organize your data. This post focuses on human language. But it is important to note that other domains also have languages and therefore also benefit from LLMs. A common example is genomics, whose language is DNA.
To make LLMs useful for some specific task, an application will accept one or more prompts from a user, then provide that as input to the LLM. These prompts are usually a question, an instruction, a description, or some other sequence of text.
The LLM then decides (i.e. predicts) what information should be returned to the user, and the application uses that information to craft a response, such as an answer or some novel generated content.
Technically, LLMs are specialized deep neural networks that have been trained primarily on text, although some use images, video, or audio as well. They are very robust and broad in terms of how they can be used, which helps them to achieve widespread adoption.
To better understand what these are, let’s deconstruct the LLM name itself:
LLMs understand language so well because they figure out the relationships between words. To do this they leverage one or more powerful techniques recently developed by the machine learning research community. A few of the most noteworthy ones are:
At this point many people have heard of a few of the most noteworthy LLMs and LLM-powered applications, such as ChatGPT and DALL-E (which generated the header image for this post from the prompt ‘oil painting of a computer with a chat dialog on the screen’). But there are many more. Some of these are open source while others are closed source, and some are software artifacts you must download and bundle into your application while others are services consumed via APIs.
One of the most common use cases of LLMs is to generate content based on one or more prompts from a user. The primary goal is to improve the efficiency of knowledge workers, or in some cases obviate the need to have a human in the loop if the task is rudimentary enough. Generative applications are numerous – conversational AI and chatbots, creation of marketing copy, code assistants, and even artistic inspiration.
With data volumes continuing to explode, especially as computer systems themselves generate more and more content, it becomes increasingly important to have good summaries so we humans can make sense of all those articles, podcasts, videos, and earnings calls. Thankfully, LLMs can do that too.
One flavor of this is abstractive summarization, where novel text is generated to represent the information contained in longer content. The other is extractive summarization, where relevant facts retrieved based on a prompt are extracted and summarized into a concise response/answer.
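To make the extractive flavor concrete, here is a minimal sketch. It assumes a simple word-overlap score as a stand-in for the relevance judgment an LLM would provide; the function names `tokenize` and `extractive_summary` are hypothetical and not part of any library.

```python
import re
from typing import List, Set

def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def extractive_summary(document: str, prompt: str, k: int = 2) -> List[str]:
    """Return the k sentences sharing the most words with the prompt.

    Assumption: word overlap stands in for the relevance score that a
    real system would obtain from a language model.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    query = tokenize(prompt)
    scored = [(len(tokenize(s) & query), i, s) for i, s in enumerate(sentences)]
    # Highest overlap first; ties are broken by original position.
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    # Restore document order so the extract reads naturally.
    return [s for _, _, s in sorted(top, key=lambda t: t[1])]

doc = (
    "Revenue grew ten percent this quarter. "
    "The office moved to a new building. "
    "Revenue growth was driven by cloud sales."
)
summary = extractive_summary(doc, "What drove revenue growth?")
```

Here the extract keeps the two revenue sentences and drops the one about the office move. An abstractive summarizer would instead generate new wording, which requires a generative model.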
It is very common to use LLMs to convert text from one form to another – these are based on transformers after all. This could be done to correct spelling/grammar errors or to redact content. Translation can also be considered as a form of rewriting.
Traditional search offerings usually use keyword-based algorithms, sometimes employing knowledge graphs or pagerank style approaches as well, to look up information that is (hopefully) relevant to what the user is asking for.
These are fast giving way to LLM-based techniques, such as “neural search”, which understand language much more deeply and are able to find more relevant results. This is especially important now, with people more commonly searching for information using long form queries, explicit questions, or conversational prompts.
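The difference between the two approaches can be sketched with a toy example. The three-dimensional vectors below are invented for illustration; a real neural search system would obtain embeddings from a trained model, and the helper names are hypothetical.

```python
import math
from typing import Dict, List

def keyword_score(query: str, doc: str) -> int:
    """Count the words the query and document share (keyword-style matching)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Assumption: hand-written vectors standing in for model-produced embeddings.
DOC_VECTORS: Dict[str, List[float]] = {
    "Steps to recover account credentials": [0.9, 0.1, 0.0],
    "Quarterly earnings call transcript": [0.0, 0.2, 0.9],
}
QUERY = "how do I reset my password"
QUERY_VECTOR = [0.8, 0.2, 0.1]

def neural_search(query_vector: List[float], doc_vectors: Dict[str, List[float]]) -> str:
    """Return the document whose vector is closest to the query vector."""
    return max(doc_vectors, key=lambda d: cosine(query_vector, doc_vectors[d]))

best = neural_search(QUERY_VECTOR, DOC_VECTORS)
```

The query shares no words with either document, so a pure keyword match scores both at zero, while the vector comparison still ranks the credentials document first.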
So the ubiquitous search box in websites and applications will get much smarter. But so will all the implicit usages of search which can enable capabilities such as recommendations, conversational AI, classification, and more.
Question answering is essentially a combination of “Search” and “Summarize”. The application first uses LLMs to understand what the user is looking for and return a relevant set of information. Then it uses another LLM to summarize that information into a singular answer.
This daisy-chaining of LLMs, where one model’s output is used as another model’s input, is a common design, as these models are usually built with composability in mind.
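A rough sketch of this chaining pattern follows. Both stages are toy placeholders, not real models: `toy_retriever` and `toy_summarizer` are hypothetical names, and the point is only that the first stage's output becomes the second stage's input.

```python
from typing import Callable, List

# Each stage is just a callable, so stages can be swapped independently.
Retriever = Callable[[str], List[str]]
Summarizer = Callable[[str, List[str]], str]

CORPUS = [
    "The free tier includes API access.",
    "Support is available by email.",
    "The web console lets you upload documents.",
]

def toy_retriever(question: str) -> List[str]:
    """Placeholder for a search model: keep passages sharing 2+ words with the question."""
    words = set(question.lower().strip("?").split())
    return [p for p in CORPUS if len(words & set(p.lower().strip(".").split())) >= 2]

def toy_summarizer(question: str, passages: List[str]) -> str:
    """Placeholder for a summarization model: concatenate the retrieved passages."""
    return f"Based on {len(passages)} passage(s): " + " ".join(passages)

def answer_question(question: str, retriever: Retriever, summarizer: Summarizer) -> str:
    """Chain the stages: the retriever's output is the summarizer's input."""
    passages = retriever(question)
    return summarizer(question, passages)

reply = answer_question("What does the free tier include?", toy_retriever, toy_summarizer)
```

Because the stages only agree on plain text and lists of text, either placeholder could be replaced by a call to an actual model without changing `answer_question`.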
Question answering capabilities can improve customer service and customer support outcomes, help analysts find insights more effectively, make sales teams more efficient, and make conversational AI systems more effective.
It is frequently useful to cluster documents, that is, to group them together based on the content they contain. This helps users organize or understand the data available to them, and it can help content providers increase engagement by surfacing content in an easy-to-consume manner. As with Search, this relies on understanding of the meaning inherent to the data. But instead of using that understanding as part of a retrieval operation, it is used to group the data together into similar buckets.
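One common way to form such buckets is to run k-means over document embeddings. The sketch below is an assumption about how this could be done, not a description of any particular product: the two-dimensional points are invented stand-ins for real embeddings, and the initialization is deliberately naive.

```python
import math
from typing import List, Sequence

def kmeans(points: Sequence[Sequence[float]], k: int, iterations: int = 10) -> List[int]:
    """Tiny k-means: return a cluster label for each point.

    Assumption: centroids start at the first k points, which keeps the
    example deterministic; real implementations use better seeding.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# Hypothetical 2-D "embeddings" for four documents on two topics.
embeddings = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
labels = kmeans(embeddings, k=2)
```

Documents 0 and 2 land in one bucket and documents 1 and 3 in the other, without any labels having been supplied in advance.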
Classification is similar to clustering, but instead of placing data into previously-unknown groupings, the groupings are known in advance. Examples include intent classification, sentiment detection, and prohibited behavior identification. This can be done via a traditional supervised learning approach, where the classifier is trained on the embeddings, or via a few-shot approach, where prompt engineering is used to provide examples to an LLM that then learns how to do the classification.
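The few-shot approach amounts to packing labeled examples into the prompt itself. Here is a minimal sketch of building such a prompt; the wording of the instruction, the `Review:`/`Sentiment:` layout, and the function name are illustrative assumptions, and the resulting string would still need to be sent to a model of your choice.

```python
from typing import List, Tuple

def build_few_shot_prompt(examples: List[Tuple[str, str]], text: str) -> str:
    """Assemble a sentiment-classification prompt from labeled examples.

    The final 'Sentiment:' line is left open for the model to complete.
    """
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for review, label in examples:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("The checkout process was quick and painless.", "positive"),
    ("My order arrived broken and support never replied.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Great value for the price.")
```

No model weights change in this approach; the examples steer the model only for the duration of that one request, which is why the labels and formatting in the prompt matter so much.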
As impressive as LLMs are, it’s still early days and there are serious challenges to overcome before we will see widespread adoption and acceptance. Some of these are intrinsic to LLMs themselves, and others have to do with the applications that use them. The Future of large language models section below offers perspectives on how some of these challenges can be mitigated or overcome.
Many of these challenges will undoubtedly be addressed in the coming years, while others will persist and be thorns in our sides for quite some time. In both cases the community of LLM Engineers, Software Developers, and Product Owners must be cognizant of these challenges, and build appropriate guardrails and transparency into the applications they create.
This is a very fast moving space, so it is difficult to predict the future. But even at this nascent stage there are strong indications that the following trends and developments will transpire.
With the rising popularity of LLMs – not to mention all the venture capital pouring into this space – we will see an explosion of new and derivative models. Advanced researchers will continue to push the envelope on the core LLMs while access to them will become democratized. And it will be much more common for them to be consumed within applications as opposed to in raw form.
But much like the recent NoSQL and big data booms, there will be too many LLM options, flavors, and vendors vying for our attention. They will all be very powerful, but most will not be very differentiated from each other, so variations in non-functional aspects such as size, license terms, cost, and ease of use will matter a lot.
Hence organizations will come to rely on a relatively small number of leading vendors and communities, who will help the average developer cut through all the noise and pick the right models and tools.
The capabilities provided by LLMs are frequently amazing. But there are equally amazing fails, as seen with ChatGPT and Bing in early 2023. This will cause people to have a healthy level of suspicion, to ensure that they bring the benefits of these capabilities to their organizations in a responsible, ethical, and legal manner.
For instance, these applications will be required to explain how they ended up with the answer or the content they provided. Table stakes will be something as simple as citations in generated answers, such as what Bing and Vectara can provide.
Additionally, end users – or potentially regulators – will require applications to be transparent about when artificial intelligence has generated a piece of information.
Shocker here… we will see a version of the familiar hype cycle, such as this 2022 report on AI, play out as the LLM space evolves. We can characterize this by the perspective of the people/teams that bring these into an organization. The market at large will progress along this maturity curve, and vendors will likely focus on one of the following steps.
LLM Engineers will develop new architectures that produce better results but with fewer parameters, resulting in faster and cheaper training. There will also be hardware and software improvements that let LLM Engineers get more mileage out of the silicon available to them, akin to what TPUs did for deep learning in the mid-2010s.
These lower costs will help expedite the evolution of the LLM ecosystem, and most importantly will result in lower costs to the end users. A recent example of this trend is seen with the model family used by ChatGPT being 10x cheaper than the previous version.
Many novel user experiences will be accessible using LLMs, but these will need to be packaged such that enterprises can safely use them. This means we’ll soon see investments made to provide all the boring-yet-hugely-important features like:
It truly is astonishing how quickly LLMs have become so powerful. We’re at the start of a revolution in how people interact with computers, where the amazing will become normal. In just a few years almost every application we use will in some way be powered by LLMs. That said, there is certainly a lot of work to be done to get there.
If you are anxious to get your hands dirty with an LLM, sign up with Vectara and then test out our APIs or web console. You can also learn about what you can build in our free tier. Happy discovering!