Vectara introduces the new “InstantIndex” feature and how it complements batch and incremental indexing
February 28, 2023 by Tallat Shafaat
Considering the scale of data today, most data processing systems are designed to work efficiently in batch mode. This was the original premise of systems like Apache Hadoop and Apache Spark. But while these systems are efficient at processing data in batches, many applications depend on near real-time processing, which gave rise to systems like Apache Kafka and Spark Streaming.
In a search system, adding large amounts of data to an index at once – also known as batch indexing – enables several performance and cost optimizations. At the same time, many search use cases require new documents to be added, or existing documents to be updated, in the index frequently.
For example, a chat application may need new messages to become searchable immediately, or a product catalog may need products added and removed very quickly. Adding a stream of incoming documents to an existing index is called incremental indexing.
The challenge in incremental indexing is to support small updates to large indexes efficiently and cheaply, while also minimizing the delay between when a document is submitted for indexing and when it becomes available to serve queries.
Vectara supports both batch indexing and incremental indexing. This matters because the pre-processing pipelines that feed an index – for example, speech-to-text over unstructured audio – are often themselves batch-oriented, while the index itself must still be updatable in near real time. Consider Google’s search engine before the advent of Caffeine, which Google reported delivered results up to 50% fresher. With today’s common use cases around speech-to-text and unstructured text, instant indexing with immediate availability is key to meeting user demand. Within incremental indexing, Vectara supports two modes: regular incremental indexing, which can take a few minutes from indexing to serving, and instant indexing, which takes a few seconds in most cases. All modes are available to customers without any configuration needed, though not every mode is included in every pricing plan.
In incremental indexing, Vectara batches the stream of incoming documents as much as possible. It breaks the incoming stream down into chunks called journal entries. Each journal entry contains one or more document parts, so a single document can span multiple journal entries, and document parts belonging to different documents can share the same journal entry. These journal entries are then applied to the existing index, acting as small batch updates that coalesce as many changes as possible.
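To make the batching behavior concrete, here is a minimal sketch of how incoming document parts might be packed into journal entries. The names (`DocumentPart`, `JournalEntry`, `batch_into_journals`) and the size-based cutoff are illustrative assumptions, not Vectara's actual implementation; the point is that entries are filled greedily, so one document can span entries and one entry can mix documents.

```python
from dataclasses import dataclass, field

@dataclass
class DocumentPart:
    doc_id: str   # which document this part belongs to
    part_id: int  # position of the part within its document
    text: str

@dataclass
class JournalEntry:
    parts: list = field(default_factory=list)

def batch_into_journals(parts, max_parts_per_entry=3):
    """Greedily pack a stream of document parts into journal entries.

    A document's parts may span multiple entries, and a single entry
    may hold parts from different documents.
    """
    entries, current = [], JournalEntry()
    for part in parts:
        current.parts.append(part)
        if len(current.parts) >= max_parts_per_entry:
            entries.append(current)
            current = JournalEntry()
    if current.parts:  # flush the final, possibly partial, entry
        entries.append(current)
    return entries
```

With four parts from document A and two from document B and a cutoff of three, document A's last part lands in the same entry as document B's parts, illustrating both spanning and mixing.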
Our search nodes are designed to add data from journal entries to an existing index. Special care is taken during journal application so that reads on the index are not blocked while it is being updated. If reads were blocked, user queries would either be rejected or incur long latencies while waiting for journal writes to finish. Vectara ensures that user queries are not impacted by index updates.
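One common way to let reads proceed while a journal is being applied is to serve queries from an immutable snapshot and swap in a new snapshot atomically once the update is built. The sketch below is an illustrative assumption of that general technique, not Vectara's actual concurrency design: readers take no lock, and only writers serialize among themselves.

```python
import threading

class Index:
    """Toy index where journal application never blocks queries.

    Queries read whichever immutable snapshot is current; applying a
    journal builds a modified copy and swaps the reference atomically.
    """
    def __init__(self):
        self._snapshot = {}                  # doc_id -> text; treated as immutable
        self._write_lock = threading.Lock()  # serializes writers only

    def query(self, doc_id):
        # Readers take no lock: they see the snapshot as of this moment.
        return self._snapshot.get(doc_id)

    def apply_journal(self, entry):
        """entry: iterable of (doc_id, text) pairs from a journal."""
        with self._write_lock:
            new_snapshot = dict(self._snapshot)  # copy-on-write
            for doc_id, text in entry:
                new_snapshot[doc_id] = text
            self._snapshot = new_snapshot        # atomic reference swap
```

A query that arrives mid-update simply sees the old snapshot; it is never rejected and never waits on the journal write.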
You may notice that the journal creation and application process introduces a delay: data is buffered, and search nodes take time to notice new journals and apply them to the index. This can add up to a few minutes between when a document is sent to Vectara for indexing and when it is available on the search node to be part of query results. Vectara solves this problem by introducing InstantIndex.
InstantIndex is a mode where documents sent by users are instantly available to be queried. Vectara achieves this by short-circuiting the incremental indexing process: bypassing the event streaming system and journaling [see figure 1 above]. When a document is received for indexing, indexing servers create in-memory data structures that use the same format as journal entries. We call these in-memory journal entries. These in-memory journals do not wait for multiple documents to arrive. As soon as the document has been processed, its data is compiled into the in-memory journal and sent to the search node, which immediately applies the journal to the corresponding index. Note that Vectara is a multi-tenant system with several search nodes; each customer is assigned to a subset of them, based on the replication degree requested for that customer. The indexing server knows which search nodes a customer is assigned to, and it multicasts the in-memory journal to exactly that replica set.
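The fast path described above can be sketched as follows. This is a simplified model under stated assumptions: `SearchNode`, `IndexingServer`, and the customer-to-node assignment map are hypothetical names for illustration, and the "journal" here is just a dict standing in for the in-memory journal entry.

```python
class SearchNode:
    """Toy search node holding a per-node index (doc_id -> text)."""
    def __init__(self):
        self.index = {}

    def apply(self, journal):
        # Apply an in-memory journal entry immediately.
        self.index.update(journal)

class IndexingServer:
    """Sketch of the InstantIndex fast path: build an in-memory journal
    for a single document and multicast it directly to the customer's
    replica set, bypassing the event stream and on-disk journaling."""
    def __init__(self, nodes, assignments):
        self.nodes = nodes              # node_id -> SearchNode
        self.assignments = assignments  # customer -> list of node_ids

    def instant_index(self, customer, doc_id, text):
        journal = {doc_id: text}        # single-document in-memory journal
        # Multicast only to the replica set hosting this customer.
        for node_id in self.assignments[customer]:
            self.nodes[node_id].apply(journal)
```

Because the journal goes straight from the indexing server to the assigned replica set, the document is queryable on those nodes as soon as `instant_index` returns, and nodes serving other customers are untouched.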
InstantIndex reduces the time between when a document is submitted and when it can appear in query results, which matters for a range of business scenarios.
In today’s fast-paced world, waiting is never the ideal option. Companies want to leverage instant results to create strategic differentiation and stay ahead. In the world of indexing, data only becomes useful once it has been indexed, and for many use cases that indexing needs to happen immediately. Vectara supports both batch and incremental indexing, allowing users to leverage common stream processing engines or bypass them to speed delivery. Now, with the introduction of instant indexing, the time between when a document is submitted and when it is indexed can be matched to your use case requirements.
Want to learn more about Vectara? Create your free account today.