Building An LLM-Powered Search Box with Vectara
A sample app demonstrating how to build a search experience powered by Vectara
April 04, 2023 by Ofer Mendelevitch & CJ Cenizal
Vectara is an API-based managed platform for building LLM-powered search applications. It’s easy to use and provides state-of-the-art relevance for search results.
When thinking about search, most developers think of traditional keyword-based search, but LLM-powered search is much more than that. It’s a substantial (not incremental) improvement in search relevance, since results are based on the meaning of your search query and are not hindered by typos or by polysemous or homograph keywords.
Applications of LLM-powered search include semantic search, question answering, conversational AI chatbots, product reviews, eCommerce, and many others.
Part of the strength of Vectara’s “search pipeline” platform is that we manage all the complexity for you, the developer, and you don’t have to stitch together various pieces of the search application like vector databases and embedding providers. The Search UI is one piece we don’t directly manage for you, since in most cases it needs to be connected to your internal application.
Today, we’re excited to launch our Search UI sample application. This application demonstrates how to build a modern search user interface that utilizes Vectara’s Search API while delivering super-fast results and displaying them for the user.
We built the Search UI application using NodeJS and React. We use Docker to package all the necessary code and components. You can run this Docker container locally, or on any cloud platform.
To get started, clone the Search-UI GitHub repository:
git clone https://github.com/vectara/Search-UI
And then go to the folder just created:
The repo is organized as follows:
- src: this sub-folder contains the Node and JS source code
- server: contains a simple Express server for the application
- public: contains various static assets
- config: this folder stores various configuration files. As we’ll discuss below, each configuration file can be used to create a custom search experience
- build.sh: shell script used to build the Docker container
- run.sh: shell script used to run the Search UI
Now let’s try to run the Search UI for searching the content of Vectara’s website. For this purpose, we’ve indexed all of the documents on www.vectara.com into a corpus with corpus_id=1. Now all we have to do is build the Docker container:
And then run it using the desired configuration file:
sh docker/run.sh config/vectara-website-search.yaml
The script opens up a web browser pointing to localhost:80. After a few seconds the Search UI app will load automatically:
As you can see, there’s a search box where you can type in any query, as well as four suggested queries underneath it, such as “what is neural search?” or “how do I index a document?”. Note that Vectara’s engine is multilingual, so search queries can also be, for example, in French, as shown here.
Now if you click on “what is neural search?” (or type it in), you’ll see the search results from Vectara:
The Configuration File
Let’s look more closely at the example YAML configuration file:
corpus_id: 1
customer_id: 1169579801
auth_api_key: "<A-VECTARA-SEARCH-API-KEY>"
search_title_pre: "Conversational search demo across"
search_title_inner: "Vectara website"
search_title_post: "content"
search_title_url: "https://www.vectara.com"
Q1: "what is neural search?"
Q2: "how do I index a document?"
Q3: "¿Quién es Amr?"
Q4: "who is Amr?"
We see that the configuration file controls various aspects of the Search UI:
- corpus_id, as you might expect, simply lists the Vectara Corpus ID where the documents to be searched reside. Change this to the corpus_id of one of your corpora as it appears in the Vectara console.
- customer_id is your Vectara customer id.
- auth_api_key is the API key that enables the search. If you haven’t already, generate an API key for search in the Vectara console (see here for details). Make sure the API key is associated with the correct corpus, and then copy its value here.
- search_title_pre, search_title_inner, search_title_post and search_title_url are used to control what is displayed on the top right of the search box (often used to describe the application). This area is composed of 3 text segments (pre, inner, post), concatenated together, with the inner part also including a hyperlink (defined in search_title_url).
- Q1, Q2, Q3 and Q4 correspond to the four “suggested queries” that are displayed below the search box.
Once you index your own documents with Vectara’s Indexing API you can create another configuration file similar to the one above, and change the parameters to fit the new corpus.
How is this configuration file used in Docker? The prepare_config.py Python script transforms all the configuration parameters into environment variables and adds the REACT_APP prefix to each one. These environment variables are then fed into the Docker container upon execution, so that the React app can read them and display the appropriate text in the user interface.
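To make the transformation concrete, here is a minimal Python sketch of the idea. It is not the repository's actual prepare_config.py: the function names config_to_env and to_docker_env_lines are hypothetical, the trailing underscore in REACT_APP_ is an assumption based on the usual Create React App naming convention, and keeping the original key casing is also an assumption.

```python
# Hypothetical sketch of the config-to-environment step.
# This is NOT the repository's prepare_config.py; names and casing are assumptions.


def config_to_env(config: dict, prefix: str = "REACT_APP_") -> dict:
    """Map each configuration key to a prefixed environment-variable name."""
    return {f"{prefix}{key}": str(value) for key, value in config.items()}


def to_docker_env_lines(env: dict) -> list:
    """Render NAME=value lines, the format read by `docker run --env-file`."""
    return [f"{name}={value}" for name, value in env.items()]


# A few of the parameters from the example configuration file.
config = {
    "corpus_id": 1,
    "customer_id": 1169579801,
    "search_title_inner": "Vectara website",
    "Q1": "what is neural search?",
}

env = config_to_env(config)
for line in to_docker_env_lines(env):
    print(line)
```

Writing the variables out as NAME=value lines is only one possible way to hand them to the container; the actual scripts may pass them differently.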
Customizing Vectara’s Search-UI
We wanted to highlight a few areas specific to Vectara in the codebase:
- As mentioned above, the configuration file specifies the corpus_id, customer_id and auth_api_key; these are then utilized in App.tsx with the call to sendSearchRequest
- The title parameters (search_title_XXX) are used inside Headers.tsx
- The Q1-Q4 parameters are used in LandingPage.tsx
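For readers curious how the three connection parameters might end up in a search request, the sketch below builds (but does not send) an HTTP query in Python. The real sendSearchRequest is part of the TypeScript source and may look quite different. The endpoint URL, header names and JSON field names shown here are assumptions based on Vectara's v1 REST query API and should be checked against the current API documentation; build_search_request is a hypothetical helper.

```python
import json
import urllib.request

# Assumed endpoint of Vectara's v1 REST query API; verify against the docs.
QUERY_URL = "https://api.vectara.io/v1/query"


def build_search_request(query, customer_id, corpus_id, api_key, num_results=10):
    """Build (but do not send) a POST request that queries a single corpus."""
    # Assumed request shape: one query entry targeting one corpus.
    body = {
        "query": [
            {
                "query": query,
                "numResults": num_results,
                "corpusKey": [{"customerId": customer_id, "corpusId": corpus_id}],
            }
        ]
    }
    # Assumed header names for API-key authentication.
    headers = {
        "Content-Type": "application/json",
        "customer-id": str(customer_id),
        "x-api-key": api_key,
    }
    return urllib.request.Request(
        QUERY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Values taken from the example configuration file.
request = build_search_request(
    query="what is neural search?",
    customer_id=1169579801,
    corpus_id=1,
    api_key="<A-VECTARA-SEARCH-API-KEY>",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to urllib.request.urlopen with a real API key; the sketch stops short of that so it can be run without credentials.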
Want to see additional features, or make some changes to the source code? We’d love to hear from you! Just hop on over to the Search-UI GitHub repository and submit a PR or create an issue.