Sitemaps, and why crawlers love them

2026-05-20

A sitemap is a machine-readable list of a site's URLs, usually an XML file and usually advertised with a Sitemap: line in robots.txt. It exists so a crawler does not have to discover pages by following links one by one: the site simply hands over the catalog.
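To make that catalog concrete, here is a minimal sketch of generating one with Python's standard library. The function names build_sitemap and robots_line and the example.com URLs are placeholders for illustration, and real sitemaps often carry optional fields such as lastmod that are left out here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def robots_line(sitemap_url):
    """Return the robots.txt line that points crawlers at the sitemap."""
    return f"Sitemap: {sitemap_url}"


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(build_sitemap(["https://example.com/", "https://example.com/about"]))
    print(robots_line("https://example.com/sitemap.xml"))
```

The output of build_sitemap would be saved as something like sitemap.xml at the site root, and the robots_line output added to robots.txt so crawlers can find it.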

Crawlers tend to reward the courtesy. A new site with a submitted sitemap can be indexed in days rather than weeks, because discovery is the expensive half of crawling and the sitemap makes it close to free.

Watching who fetches the sitemap is quietly informative: it is one of the first things any serious crawler requests, which makes it a reliable early signal of a new visitor taking an interest in the whole site rather than one page.
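One way to do that watching is to scan the web server's access log for requests to the sitemap. The sketch below assumes the common "combined" log format and a sitemap served at /sitemap.xml. Both are assumptions about the setup, and the names sitemap_fetches and first_seen are made up for this example.

```python
import re

# Assumed layout: the "combined" access log format used by many web servers.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def sitemap_fetches(lines, sitemap_path="/sitemap.xml"):
    """Yield (time, ip, user_agent) for each request to the sitemap."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed format
        # Ignore any query string when comparing paths.
        path = m.group("path").split("?", 1)[0]
        if path == sitemap_path:
            yield m.group("time"), m.group("ip"), m.group("agent")


def first_seen(lines, sitemap_path="/sitemap.xml"):
    """Map each user agent to the time of its first sitemap request."""
    seen = {}
    for time, ip, agent in sitemap_fetches(lines, sitemap_path):
        seen.setdefault(agent, time)
    return seen


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '203.0.113.5 - - [20/May/2026:10:00:00 +0000] '
        '"GET /sitemap.xml HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '203.0.113.9 - - [20/May/2026:10:01:00 +0000] '
        '"GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    ]
    for agent, time in first_seen(sample).items():
        print(f"{agent} first fetched the sitemap at {time}")
```

User-agent strings are self-reported, so treat the result as a hint about who is visiting rather than proof.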
