Web Crawler

A command-line web crawler that crawls either a single page or every page listed in a website's sitemap, and ingests the content into Vectara.

Turn a website or webpage into searchable content in Vectara

The web crawler currently has two modes of operation:

  1. Single URL
  2. Sitemap

In Single URL mode, provide the crawler with a URL and it will ingest that page into Vectara. In Sitemap mode, provide the crawler with a root page; it will retrieve the site's sitemap(s) and index every link they list.
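
To illustrate what Sitemap mode involves, here is a minimal sketch of extracting links from a sitemap document. It is not the crawler's actual code: the function name `parse_sitemap` is hypothetical, and the sketch assumes the sitemap follows the standard sitemaps.org XML format and has already been downloaded as text. A sitemap index (a sitemap that points to other sitemaps) is reported separately so the caller can fetch each nested sitemap in turn.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text):
    """Hypothetical helper: return (page_urls, nested_sitemap_urls).

    A <urlset> document lists pages to index; a <sitemapindex>
    document lists further sitemaps that still need to be fetched.
    """
    root = ET.fromstring(xml_text)
    # Collect every non-empty <loc> value, in document order.
    locs = [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    ]
    if root.tag == SITEMAP_NS + "sitemapindex":
        return [], locs
    return locs, []
```

A caller would pass each page URL to the same ingestion step used in Single URL mode, and call `parse_sitemap` again on the contents of each nested sitemap.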