Frenemy Lab
A small library about the machines that read the web — crawlers, fetchers, and agents — and how sites can tell them apart.

User-agent strings: a field guide

2026-05-10

Nearly every user-agent string opens with 'Mozilla/5.0', a fossil from the 1990s browser wars when servers sent different pages to different browsers and everyone lied to get the good version. The lie became load-bearing, and now it is permanent.

Real identity hides in the tokens that follow: 'Googlebot/2.1', 'GPTBot/1.2', 'curl/8.5.0'. Matching them is subtler than it looks — Telegram's preview fetcher contains the words 'like TwitterBot', so naive substring matching attributes its visits to the wrong company.

The practical rules: match case-insensitively, prefer the longest token that matches, and treat the whole string as a claim to be verified rather than a fact to be recorded.

← all articles