Vectara

Hacker guide

Introduction

Welcome, hackers! The purpose of this guide is to give you a quick start, and a handy reference, to help you make the most of your time during the hackathon.

As we all know, time flies during a hackathon!

Vectara Overview

Vectara is the trusted GenAI platform. It’s designed to make it easy for you to build and deploy GenAI applications that generate text-based answers using your own data (this application pattern is known as retrieval-augmented generation, or RAG). You simply ingest your data and then build apps using the Query or Chat API.

Vectara provides RAG-as-a-service. What does that mean?

It means we take care of all the heavy lifting during data ingest (text extraction, chunking, embedding, and storing in the vector store) as well as during query and retrieval (embedding the query, neural or hybrid search, reranking, and calling the LLM) – so you don’t have to, and you can focus your efforts on building your application. Furthermore, Vectara can be deployed in your VPC or on-premises for enhanced data security and performance.
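As an illustration, the ingest-then-query flow boils down to two REST calls. This is a minimal sketch: the corpus key and API key below are placeholders, and the endpoint paths follow the REST API v2 layout (check the API docs for the authoritative schema).

```python
# Sketch of the two-step Vectara flow: index a document, then query it.
# CORPUS_KEY and the API key are placeholders for your own values.
import json

API_BASE = "https://api.vectara.io/v2"
CORPUS_KEY = "my-corpus"  # placeholder: a corpus created in the Console

headers = {
    "x-api-key": "YOUR_API_KEY",  # placeholder
    "Content-Type": "application/json",
}

# 1) Ingest: a "core" document is a list of pre-chunked parts; Vectara
#    handles embedding and storage for you.
index_request = {
    "id": "doc-001",
    "type": "core",
    "document_parts": [{"text": "Vectara provides RAG-as-a-service."}],
}
index_url = f"{API_BASE}/corpora/{CORPUS_KEY}/documents"

# 2) Query: retrieval plus a generated summary in one call.
query_request = {
    "query": "What does Vectara provide?",
    "search": {"limit": 10},
    "generation": {"max_used_search_results": 5},
}
query_url = f"{API_BASE}/corpora/{CORPUS_KEY}/query"

# To send these for real: requests.post(index_url, headers=headers,
# data=json.dumps(index_request)), and likewise for query_url.
print(query_url)
```
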

Use-cases include:

  • AI Assistants: provide direct answers to specific questions based on your data, and build chatbots that can hold full back-and-forth conversations with humans.
  • AI Agents: integrate querying capabilities with other custom tools to safely act on the user’s behalf by identifying, prioritizing, and executing an action plan.
  • Semantic (Neural) Search: build applications powered by fast and powerful semantic search that finds documents that match the user’s intent.

Building with vectara-agentic

vectara-agentic is an open-source Python package for building AI assistants and agents on top of Vectara.

You can look at the source code to learn more, or just read the documentation. It integrates seamlessly with Vectara and makes creating AI assistants quick and easy.
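A minimal sketch of an assistant, assuming the VectaraToolFactory/Agent interface described in the vectara-agentic documentation; the corpus key, API key, tool name, and description below are all placeholders.

```python
# Sketch of a minimal vectara-agentic assistant. Class and method names
# follow the package's documentation; the corpus key, API key, tool name,
# and description are placeholders. The import is guarded so the
# configuration below is inspectable even without the package installed.
tool_config = {
    "tool_name": "ask_ev_docs",  # hypothetical tool name
    "tool_description": "Answers questions about electric vehicles.",
}

try:
    from vectara_agentic.agent import Agent
    from vectara_agentic.tools import VectaraToolFactory

    factory = VectaraToolFactory(
        vectara_api_key="YOUR_API_KEY",  # placeholder
        vectara_corpus_key="my-corpus",  # placeholder
    )
    ask_docs = factory.create_rag_tool(**tool_config)
    agent = Agent(tools=[ask_docs], topic="electric vehicles")
    # print(agent.chat("What affects an EV's real-world range?"))
except ImportError:
    print("Install the package first: pip install vectara-agentic")
```
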

Here are a few demos we have built with vectara-agentic:

  • Finance-assistant – an AI assistant for analyzing financial documents such as 10-K filings.
  • EV-assistant – AI assistant based on electric vehicle information and data.

Getting Started with Vectara

You can get to know Vectara’s features in 5 minutes:

  • Sign up for a 30-day free Vectara trial.
  • Log in and take the 5-minute walk-through.

We have more resources to help you build apps with Vectara:

  • The Quick Start guide shows you how to use the Vectara Console.
  • Read the API recipes for common patterns with our APIs.
  • Our API playground shows you how Vectara’s API requests and responses are structured.
  • See “Additional resources” below for a comprehensive list.

General guidelines:

  • Vectara offers a 30-day free trial complete with nearly all of the enterprise features of the platform.
  • Use the Indexing API to ingest text data into a Vectara corpus or the File Upload API to upload files such as PDF, PPT, or DOC.
  • Use the Query API to run queries against the ingested data. A query can be used to retrieve search results, generate a summary, or add a new message to a multi-turn chat.
  • The Console provides a unified view of information about your Vectara account and corpora. You can also run example queries in the console for quick experimentation.
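For example, a File Upload request is a multipart POST. This is a sketch only: the corpus key, API key, file bytes, and metadata are placeholders, and field names follow the REST API v2 docs.

```python
# Sketch of a File Upload API request: Vectara extracts, chunks, and
# indexes the file server-side. The corpus key, API key, file bytes,
# and metadata are placeholders.
import json

API_BASE = "https://api.vectara.io/v2"
CORPUS_KEY = "my-corpus"  # placeholder

upload_url = f"{API_BASE}/corpora/{CORPUS_KEY}/upload_file"
headers = {"x-api-key": "YOUR_API_KEY"}  # placeholder

def build_multipart(filename: str, pdf_bytes: bytes, metadata: dict) -> dict:
    """Build the `files` mapping a client like `requests` would POST."""
    return {
        "file": (filename, pdf_bytes, "application/pdf"),
        "metadata": (None, json.dumps(metadata), "application/json"),
    }

parts = build_multipart("report.pdf", b"%PDF-...", {"source": "hackathon"})
# requests.post(upload_url, headers=headers, files=parts) would send it.
print(upload_url)
```
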

Common Questions

What is the Boomerang embedding model?

Boomerang is Vectara’s newest embedding model. It encodes text from your data as “vector embeddings” and powers the high-performance retrieval step of the RAG pipeline.

Read more about Boomerang and the importance of a good retrieval model for getting the best results from RAG.

What is Mockingbird?

Mockingbird is Vectara’s large language model (LLM), trained specifically to excel at RAG tasks. Using Mockingbird also increases the data security of the end-to-end pipeline because your data is never sent to a third-party LLM provider.

Check out our blog posts to learn more about Mockingbird’s features and evaluation metrics.

What is HHEM?

  • HHEM stands for “Hughes Hallucination Evaluation Model”; it is an open-source model created by Vectara that evaluates hallucinations in LLM-generated responses.
  • On Hugging Face, you can find the model card and the LLM Leaderboard.

HHEM is integrated into the Vectara platform as the Factual Consistency Score (FCS), which can be returned automatically with every Query API request.
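For example, a query can request the score via the generation options. The field names below follow the REST API v2 query schema, and the query text is a placeholder.

```python
# Sketch: asking the platform to score the generated answer for factual
# consistency (FCS, powered by HHEM). The query text is a placeholder.
query_request = {
    "query": "What is HHEM?",
    "search": {"limit": 10},
    "generation": {
        "max_used_search_results": 5,
        # Return an FCS with the generated summary; scores closer to 1.0
        # indicate an answer better grounded in the retrieved data.
        "enable_factual_consistency_score": True,
    },
}
print(query_request["generation"]["enable_factual_consistency_score"])
```
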

What is reranking and why is it important?

Reranking is the process of reordering the retrieved search results, and it occurs between the retrieval and generation steps. Among other benefits, reranking can increase the diversity of the search results used in the generated response, so that those results carry less redundant information and the answer is more comprehensive.
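As a sketch, an MMR (maximal marginal relevance) reranker can be added to the query's search options to trade relevance against diversity. Field names follow the REST API v2 query schema; the query text is a placeholder.

```python
# Sketch: adding an MMR (maximal marginal relevance) reranker between
# retrieval and generation. diversity_bias trades pure relevance (0.0)
# against result diversity (1.0); the query text is a placeholder.
query_request = {
    "query": "How do I onboard a new teammate?",
    "search": {
        "limit": 25,  # retrieve broadly...
        "reranker": {"type": "mmr", "diversity_bias": 0.3},
    },
    "generation": {"max_used_search_results": 5},  # ...summarize narrowly
}
print(query_request["search"]["reranker"])
```
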

To learn more about reranking and how to add a reranker to your application, check out this page.

Can I stream the RAG response?

Yes, Vectara supports streaming output. You can stream the response by adding “stream_response”: true to your query body.
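A minimal sketch, assuming the streamed response arrives as standard server-sent events; the event payload shown is illustrative, not the exact schema.

```python
# Sketch: enabling streaming on a query and decoding one server-sent
# event. The event payload below is illustrative, not the exact schema.
import json

query_request = {
    "query": "Summarize the onboarding steps.",
    "search": {"limit": 10},
    "generation": {"max_used_search_results": 5},
    "stream_response": True,  # ask for incremental output
}

def parse_sse_line(line: str):
    """Decode one 'data: {...}' SSE line into its JSON payload."""
    if line.startswith("data:"):
        return json.loads(line[len("data:"):].strip())
    return None  # comments, blank keep-alives, etc.

chunk = parse_sse_line('data: {"type": "generation_chunk", "generation_chunk": "Hello"}')
print(chunk["generation_chunk"])  # → Hello
```
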

Check out the API reference or our Query docs to learn more.

How can I switch from APIv1 to APIv2?

We’re glad you want to make the switch. Our APIv2 has many improvements, and we recommend that all users adopt it for all Vectara API calls. If you have used APIv1 in the past, check out our migration guide, blog post, and example notebook to learn about the changes and transition easily to the new version.

Does Vectara provide an SDK?

Yes, we offer both Python and TypeScript SDKs. Both are currently in beta.

Should I use RAG instead of fine-tuning?

Our experience shows that “Fine-tuning is for form, and RAG is for facts” as discussed here and here.

Additional Resources

API docs: https://docs.vectara.com/docs/
API playground: https://docs.vectara.com/docs/rest-api/
Getting help: Join our Discord server or forums if you have questions. If you have any feedback for us, we would be glad to hear it – please let us know in the forums or our Discord channel.
Open source
  • Vectara-ingest helps with data ingestion – crawling data sources and indexing them into Vectara.
  • Vectara-answer is a user interface for question answering – demonstrates a UI concept.
  • React-search is a React package that allows you to integrate Vectara semantic search in any React app with just a few lines of code.
  • React-chatbot is a React package that allows you to integrate Vectara Chat in any React app with just a few lines of code.
  • Create-UI is a fast way to generate a Vectara-powered sample codebase for a range of user interfaces.
Sample code

We have quite a few sample AI assistants; please check here.

In addition, we have AI assistant examples (such as finance-assistant and legal-assistant) using vectara-agentic.

Additionally, we published some sample code in these Jupyter notebooks.

Blog posts
Youtube
  • Flowise + Vectara Tutorial
  • LangFlow + Vectara Tutorial
  • Ask LangChain Docs Video
  • More here
Integrations
  • LangChain: https://blog.langchain.dev/langchain-vectara-better-together/
  • LlamaIndex: https://vectara.com/blog/llamaindex-vectara/
  • Airbyte: https://vectara.com/blog/vectara-and-airbyte/
  • Unstructured.IO: https://vectara.com/blog/building-genai-applications-with-vectara-and-unstructured/
  • DataVolo: https://vectara.com/blog/building-genai-enterprise-apps-with-vectara-and-datavolo/
Startup program: https://vectara.com/startups