Building An LLM-Powered Search Box with Vectara
A sample app demonstrating how to build a search experience powered by Vectara
April 4, 2023 by Ofer Mendelevitch & CJ Cenizal
Vectara is an API-based managed platform for building LLM-powered search applications. It’s easy to use and provides state-of-the-art relevance for search results.
When thinking about search, most developers think of traditional keyword-based search. LLM-powered search is much more than that: it's a substantial (not incremental) improvement in search relevance, since results are based on the meaning of your search query rather than being tripped up by polysemous or homograph keywords or typos.
Applications of LLM-powered search include semantic search, question answering, conversational AI chatbots, product reviews, eCommerce, and many others.
Part of the strength of Vectara’s “search pipeline” platform is that we manage all the complexity for you, the developer, and you don’t have to stitch together various pieces of the search application like vector databases and embedding providers. The Search UI is one piece we don’t directly manage for you, since in most cases it needs to be connected to your internal application.
Today, we’re excited to launch our Search UI sample application. This application demonstrates how to build a modern search user interface that utilizes Vectara’s Search API while delivering super-fast results and displaying them for the user.
We built the Search UI application using NodeJS and React. We use Docker to package all the necessary code and components. You can run this Docker container locally, or on any cloud platform.
To get started, clone the Search-UI GitHub repository:
git clone https://github.com/vectara/Search-UI
And then go to the folder just created:

cd Search-UI
The repo is organized as follows:
- src: this sub-folder contains the Node.js and JavaScript source code
- server: contains a simple Express server for the application
- public: contains various static assets
- config: this folder stores various configuration files. As we’ll discuss below, each configuration file can be used to create a custom search experience
- build.sh: shell script used to build the Docker container
- run.sh: shell script used to run the Search UI
Now let’s try to run the Search UI for searching the content of Vectara’s website. For this purpose, we’ve indexed all of the documents on www.vectara.com into a corpus with corpus_id=1. Now all we have to do is build the Docker container:

sh docker/build.sh
And then run it using the desired configuration file:
sh docker/run.sh config/vectara-website-search.yaml
The script opens up a web browser pointing to localhost:80. After a few seconds the Search UI app will load automatically:
Figure 1: The Vectara Answer UI’s initial state presents a search box and suggested queries.
As you can see, there’s a search box where you can type in any query, as well as four suggested queries underneath it, such as “What is neural search?” or “How do I index a document?”. Note that Vectara’s engine is multilingual, so search queries can be in other languages too; for example, the suggested query “¿Quién es Amr?” is in Spanish.
Now if you click on “what is neural search?” (or type it in), you’ll see the search results from Vectara:
Figure 2: Vectara Answer presents search results in response to queries.
The Configuration File
Let’s look more closely at the example YAML configuration file:
corpus_id: 1
customer_id: 1169579801
auth_api_key: "<A-VECTARA-SEARCH-API-KEY>"
search_title_pre: "Conversational search demo across"
search_title_inner: "Vectara website"
search_title_post: "content"
search_title_url: "https://www.vectara.com"
Q1: "what is neural search?"
Q2: "how do I index a document?"
Q3: "¿Quién es Amr?"
Q4: "who is Amr?"
We see that the configuration file controls various aspects of the Search UI:
- corpus_id, as you might expect, simply lists the Vectara Corpus ID where the documents to be searched reside. Change this to the corpus_id of one of your corpora as it appears in the Vectara console.
- customer_id is your Vectara customer id.
- auth_api_key is the API key that enables the search. If you haven’t already, generate an API key for search in the Vectara console (see here for details). Make sure the API key is associated with the correct corpus, and then just copy this value here.
- search_title_pre, search_title_inner, search_title_post and search_title_url control what is displayed at the top right of the search box (often used to describe the application). This area is composed of three text segments (pre, inner, post) concatenated together, with the inner part rendered as a hyperlink (pointing to search_title_url). With the example configuration above, the title reads “Conversational search demo across Vectara website content”, with “Vectara website” linking to https://www.vectara.com.
- Q1, Q2, Q3 and Q4 correspond to the four “suggested queries” displayed below the search box.
Once you index your own documents with Vectara’s Indexing API you can create another configuration file similar to the one above, and change the parameters to fit the new corpus.
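For instance, a configuration for your own corpus might look like the following sketch; the corpus_id, customer_id, API key, and titles here are placeholders you would replace with your own values:

```yaml
corpus_id: 7                      # your corpus ID from the Vectara console
customer_id: 123456789            # your Vectara customer ID
auth_api_key: "<YOUR-SEARCH-API-KEY>"
search_title_pre: "Search demo across"
search_title_inner: "my product docs"
search_title_post: "content"
search_title_url: "https://docs.example.com"
Q1: "how do I get started?"
Q2: "what are the system requirements?"
Q3: "how do I reset my password?"
Q4: "where can I find pricing?"
```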
How is this configuration file used in Docker? The prepare_config.py Python script transforms all the configuration parameters into environment variables, adding the REACT_APP prefix. These environment variables are then fed into the Docker container upon execution, so that the React app can read them at runtime and display the appropriate text in the user interface.
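To illustrate, here is a minimal sketch of this kind of config-to-environment-variable transformation. This is a hypothetical re-implementation for explanation purposes, not the actual prepare_config.py from the repo, which may differ in details:

```python
# Hypothetical sketch of the transformation prepare_config.py performs:
# map each YAML config key to a REACT_APP_-prefixed environment variable.
def config_to_env(config: dict) -> dict:
    """Turn config keys into REACT_APP_-prefixed env var names (values as strings)."""
    return {f"REACT_APP_{key.upper()}": str(value) for key, value in config.items()}

# Example input resembling the YAML config shown above
config = {
    "corpus_id": 1,
    "customer_id": 1169579801,
    "Q1": "what is neural search?",
}
env = config_to_env(config)
# env now contains keys like "REACT_APP_CORPUS_ID" and "REACT_APP_Q1",
# which the React app can read (e.g. via process.env.REACT_APP_CORPUS_ID).
```

In a Create React App build, only environment variables with the REACT_APP_ prefix are exposed to the client-side code, which is why the script adds it.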
Customizing Vectara’s Search-UI
We wanted to highlight a few areas specific to Vectara in the codebase:
- As mentioned above, the configuration file specifies the corpus_id, customer_id and auth_api_key; these are then utilized in App.tsx with the call to sendSearchRequest
- The title parameters (search_title_XXX) are used inside Headers.tsx
- The Q1-Q4 parameters are used in LandingPage.tsx
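To give a sense of what the sendSearchRequest call in App.tsx ultimately does, here is a hedged Python sketch of assembling a query for Vectara's Search API. The endpoint and payload shape are assumptions based on Vectara's v1 REST API and are not taken from the app's actual code:

```python
import json

# Assumed Vectara v1 query endpoint; verify against Vectara's API docs.
API_ENDPOINT = "https://api.vectara.io/v1/query"

def build_query(customer_id: int, corpus_id: int, api_key: str, text: str) -> dict:
    """Assemble headers and body for a single search query against one corpus."""
    headers = {
        "Content-Type": "application/json",
        "customer-id": str(customer_id),   # identifies the Vectara account
        "x-api-key": api_key,              # the search API key from the config
    }
    body = {
        "query": [{
            "query": text,
            "start": 0,
            "numResults": 10,
            "corpusKey": [{"customerId": customer_id, "corpusId": corpus_id}],
        }]
    }
    return {"url": API_ENDPOINT, "headers": headers, "data": json.dumps(body)}

# Using the values from the example config file:
request = build_query(1169579801, 1, "<A-VECTARA-SEARCH-API-KEY>", "what is neural search?")
# To actually send it: requests.post(**request), using the `requests` package.
```

The three config values (customer_id, corpus_id, auth_api_key) map directly onto the request, which is why changing the configuration file is all it takes to point the UI at a different corpus.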
Want to see additional features, or make some changes to the source code? We’d love to hear from you! Just hop on over to the Search-UI GitHub repository and submit a PR or create an issue.