Natural Language will be the Dominant Computer Language of the Next Decade
Large language models are ushering in a new future for using natural language to interact with computing systems. Simple natural language interfaces may be at the core of almost all future human-computer interactions, presenting new possibilities to create applications, art, and stories and to surface interesting content and perspectives.
December 15, 2022 by Ed Albanese
Natural human language is the primary interface for ChatGPT. In some ways, it operates as the opposite of standard interactions with computing systems. For example, with ChatGPT, you can provide a natural language request in English or many other languages, and it returns a natural language response right back at you. Human language in and human language out, unless you expressly ask for a different form, such as code. If you want ChatGPT to switch from being a lookup tool to a storyteller, you just use natural language to ask a different question. You can ask for code, fiction, perspectives on interesting questions and more.
As Doug Turnbull notes about ChatGPT, “It’s a way of programming systems in natural language, including how to interact with them, and provides compelling results.” ChatGPT and other products that are powered by the same underlying technology of large language models (LLMs) have the potential to turn programming itself into a natural language construct. For example, you can use plain ol’ English to ask ChatGPT to write a “hello world” program in Python, and it will return that in Python code (see Figure 1). The request is made just as if you are talking to another person. The “hello world” example, in all its simplicity, sticks with me. It is a canonical reference to our longstanding attempt to introduce programming to other people in a simple way that simultaneously, and invitingly, makes the computer itself seem alive. We humans write a program that asks the computer to return a human response, “Hello World” (implying, hey, I’m real, and I’m here, and I’m ‘alive’). But it has also always felt hollow; an attempt at making something sound alive and animated that obviously isn’t. When you ask someone – a real person – to tell a story, both you (the requester) and the person telling the story feel, seem, and are in fact alive in the truest sense of that word.
Figure 1: ChatGPT responds with examples of Python code.
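For readers who have not seen it, the kind of program such a request typically yields is only a few lines long. The snippet below is an illustrative sketch, not a transcript of ChatGPT's actual output; the function name greet is chosen here purely for illustration.

```python
def greet() -> str:
    """Return the classic greeting used to introduce a programming language."""
    return "Hello, World!"


if __name__ == "__main__":
    # Print the greeting when the file is run as a script.
    print(greet())
```

The point is less the code than the contrast: the same result can now be requested in a plain English sentence.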
What makes ChatGPT seem so interesting is that for the first time and in a broadly available way, when you ask a computer – via this interface in natural language – to tell you a story, it does so without any programming required in a way that feels more human and natural than ever before. And yes, this includes all of the potential for fallacy, exaggeration and inaccuracy we humans are known for. ChatGPT bypasses the need to write code itself. Who needs “Hello World” when you can simply ask, “How come there aren’t more electric cars in Texas?” (see Figure 2) and truly feel like you are in a conversation with a knowledgeable and interesting friend?
Figure 2: ChatGPT responds with reasons why there aren’t more electric cars in Texas.
ChatGPT is powered by LLMs. Before ChatGPT, one of the few places LLMs had been widely deployed to great impact was within Google search. A reasonable person should marvel at the dynamism and utility of Google search in both how it receives input (natural language via voice, full statements and questions, context-aware requests) and the usefulness of the responses it promptly returns in human-readable form, thanks in some part to the trove of human-generated content crammed into HTML, PDF, etc. on billions of websites. Most people quickly notice the difference between a Google search and one in a different application or product that requires them to more deftly request information via “keywords” to find the document or snippet they need. For example, just try searching “the best soup in the world” on Wikipedia and the results are not good (Figure 3). Place that same query into Google and they are indeed much better (Figure 4). Even if you restrict Google to search only the Wikipedia site itself (site:en.wikipedia.org), the results are still better than those from the native search on Wikipedia (Figure 5).
Figure 3: Wikipedia search offers no results matching a query for the best soup in the world.
Figure 4: Google search offers results matching a query for the best soup in the world.
Figure 5: Google search offers results matching a query for the best soup in the world, limited to Wikipedia.
That is partly because keyword search – the technique used by most search engines today – is not a natural language prompt. It demands that users parse their question into a format that a keyword engine can understand, sometimes using their memory (“I think this unique/special word was in the doc”) or imagination (“a document like the one I’m looking for would probably contain this word”). It is not quite as specific as a programming language, but it is distinctly not a natural language exercise. This isn’t what most people want when they search. They actually have a real question. “What time will the next train leave?”; “How old is Michael J. Fox?”; “Who said ‘If you can’t explain it simply, you don’t understand it well enough’?” We experience Google achieving this using neural networks and knowledge graphs. But what about every other application and site on the planet?
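To make the gap concrete, here is a minimal sketch of a strict keyword matcher, assuming a toy rule that a document matches only if it contains every term in the query. The mini-corpus, the function names, and the matching rule are all invented for illustration; this is not how Wikipedia or any particular search engine is implemented.

```python
import re

# Hypothetical mini-corpus; the titles and text are invented for illustration.
DOCS = {
    "Soup": "Soup is a primarily liquid food, generally served warm or hot.",
    "Ramen": "Ramen is a Japanese noodle soup with many regional varieties.",
    "Train": "A train is a series of connected vehicles that run along a track.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query: str, docs: dict[str, str]) -> list[str]:
    """Return the titles of documents whose text contains every query term."""
    terms = tokenize(query)
    return sorted(title for title, body in docs.items() if terms <= tokenize(body))


if __name__ == "__main__":
    # A single well-chosen keyword finds the relevant documents.
    print(keyword_search("soup", DOCS))  # ['Ramen', 'Soup']
    # The natural language phrasing matches nothing, because no document
    # happens to contain every word of the question.
    print(keyword_search("the best soup in the world", DOCS))  # []
```

In this toy model the burden falls on the user to guess which single word the target document contains, which is exactly the translation step described above.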
How is this obviously different and obviously valuable interface for finding information being delivered to customers of products not named Google search? This is the potential of LLM-powered search (aka neural search). If I had said this last week, before so many had seen what ChatGPT can do, you might have been inclined to utter “meh”. But now that you have seen Google at its best and ChatGPT in its first couple of weeks or so in the wild, you know that this would be a mistake.
ChatGPT just opened the door and the minds of many people across the world to see the power of LLMs, not only because of the interesting or surprising answers being provided, but also because of the simplicity of the interface – natural language. This interface may be at the core of almost all future human-computer interactions, even those that benefit from compiled code, as the code itself may be produced or co-produced with the help of an LLM. Natural language has the very real potential to be the dominant computer language of the next decade.