We use our web crawler, known technically as Schemabot, for two purposes. First, it discovers schema markup on your website to power reports like our Analyzer and Trend Report. Second, if you are a Highlighter customer, the crawler can generate markup asynchronously.
Reason 1: In both scenarios, our crawler traverses your website to collect or generate the most current data. The crawler typically runs weekly and works through your entire website to refresh the data. Each crawl attempts to find a sitemap and queue all of its webpages. After that, the crawl starts on the home page and follows/queues the links discovered on each page.
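To make the process concrete, here is a minimal sketch of that queue-and-follow loop. This is an illustration, not Schemabot's actual code: the `SITE` dictionary stands in for real HTTP fetching and HTML parsing, and `fetch_links` is a hypothetical helper.

```python
from collections import deque
from urllib.parse import urljoin

# Hypothetical in-memory site: maps each page URL to the links found on it.
# A real crawler would fetch pages over HTTP and parse <a href> tags instead.
SITE = {
    "https://example.com/": ["/about", "/products"],
    "https://example.com/about": ["/"],
    "https://example.com/products": ["/products/widget", "/about"],
    "https://example.com/products/widget": [],
}

def crawl(start_url, fetch_links):
    """Breadth-first crawl: start at the home page, follow and queue
    every newly discovered link, and skip URLs already visited."""
    queue = deque([start_url])
    visited = set()
    while queue:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        for href in fetch_links(url):
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in visited:
                queue.append(absolute)
    return visited

pages = crawl("https://example.com/", lambda u: SITE.get(u, []))
# pages now contains all four reachable URLs
```

Sitemap discovery simply seeds the same queue up front, so pages not linked from anywhere still get visited.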
Other Reasons: Honestly, the web is vast, and we can't predict every scenario. We sometimes encounter broken relative links or GET parameters that create strange recursive URL patterns. While we have heuristics to avoid these, if the crawler looks stuck or underoptimized on your site, please let us know. We don't want to waste your resources or ours, so let's optimize the effort.
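For a sense of what such heuristics can look like (these are illustrative examples, not Schemabot's actual rules): canonicalizing query-parameter order makes `?a=1&b=2` and `?b=2&a=1` dedupe to a single URL, and capping path depth catches broken relative links that recurse forever.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

def normalize(url, max_depth=8):
    """Illustrative crawl heuristic: sort GET parameters into a canonical
    order so equivalent URLs dedupe, and reject suspiciously deep paths
    that usually indicate a recursive link loop."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) > max_depth:
        return None  # likely a broken relative link recursing on itself
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunparse((parts.scheme, parts.netloc, parts.path, "", query, ""))

# Two orderings of the same parameters collapse to one canonical URL:
a = normalize("https://example.com/page?b=2&a=1")
b = normalize("https://example.com/page?a=1&b=2")

# A runaway recursive path gets rejected outright:
loop = normalize("https://example.com/" + "a/" * 20)
```

Heuristics like these are trade-offs: too strict and legitimate deep pages get skipped, too loose and the crawler wastes its budget on duplicates, which is why we appreciate reports when something looks off.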