The library

A short history of robots.txt
2026-05-02 — From a 1994 mailing-list convention to RFC 9309.
How Googlebot proves its identity
2026-05-04 — Anyone can claim to be Googlebot. Verification is what settles it.
The AI crawler roster: who fetches what, and why
2026-05-06 — Training crawlers, answer-engine crawlers, and user-triggered fetchers are different animals.
What is Web Bot Auth?
2026-05-08 — Cryptographic signatures are replacing IP lists as the gold standard of bot identity.
User-agent strings: a field guide
2026-05-10 — Why everything claims to be Mozilla, and how matching actually works.
llms.txt, explained
2026-05-12 — A proposed convention for offering AI systems a curated map of your site.
Reverse DNS verification, step by step
2026-05-14 — The two-lookup ritual that separates real crawlers from costumes.
IP ranges, ASNs, and who owns an address
2026-05-16 — The network's own paper trail, and what it can and cannot prove.
The crawl economics of a small site
2026-05-18 — Bandwidth has a price, and machine readers do not click ads.
Sitemaps, and why crawlers love them
2026-05-20 — The standing invitation that makes discovery cheap.
Spotting an impersonator
2026-05-22 — Failed verification is the loudest signal in bot traffic.
What a crawler sees in a day on a brand-new site
2026-05-26 — The first visitors to any fresh domain are not human.