Check whether Googlebot can crawl a URL, with rules grouped by User-Agent. Free, private, runs in your browser.
This tool needs a server because the browser can’t fetch arbitrary cross-origin URLs. Our Cloudflare Worker fetches the public URL you typed and returns the parsed result — nothing is stored.
We’ll fetch /robots.txt from this site’s origin.
Test how a specific bot sees robots.txt — includes major search engines + 2026 AI crawlers.
Paths to evaluate against robots.txt for the User-Agent above.
How this works: a Cloudflare Worker at workers.convertful.app fetches the public robots.txt — nothing is stored. Pairs well with opengraph-preview and ssl-checker.
robots.txt is a plain-text file served from a website's root that tells crawlers which paths they're allowed to fetch. It's a polite request — well-behaved crawlers (Googlebot, Bingbot, AhrefsBot, every major search engine) honour it. Bad actors ignore it. The file groups rules by User-Agent, so you can give Googlebot different access from the generic *. A misconfigured robots.txt can completely deindex a site or, conversely, expose private endpoints to crawlers.
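For illustration, a minimal robots.txt with per-agent groups and a Sitemap declaration might look like this (the domain and paths are hypothetical):

```
# Rules for Googlebot only
User-agent: Googlebot
Disallow: /private/
Allow: /private/public.html

# Rules for everyone else
User-agent: *
Disallow: /api/

Sitemap: https://example.com/sitemap.xml
```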
When two rules conflict, Google's resolver picks 'longest matching rule wins; on equal length, Allow beats Disallow.' So Disallow: /private/ + Allow: /private/public.html means /private/secret is blocked but /private/public.html is allowed. We highlight the matched rule so you can see the verdict's source. Other crawlers may behave slightly differently — Bingbot is mostly Google-compatible; small crawlers vary.
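The resolution logic can be sketched in a few lines of Python. This is a simplified illustration, not this tool's actual code: it assumes the rules for the matched User-Agent group have already been parsed into (is_allow, path_prefix) pairs, and it uses plain prefix matching (wildcards are a separate concern):

```python
def decide(path, rules):
    """Google-style resolution: longest matching rule wins;
    on equal length, Allow beats Disallow.

    rules: list of (is_allow: bool, prefix: str) for one User-Agent group.
    Returns True if crawling `path` is allowed.
    """
    best = None  # (match_length, is_allow)
    for is_allow, prefix in rules:
        if path.startswith(prefix):
            # Tuple comparison: longer match wins; on a length tie,
            # True (Allow) compares greater than False (Disallow).
            candidate = (len(prefix), is_allow)
            if best is None or candidate > best:
                best = candidate
    # No rule matched: crawling is allowed by default.
    return True if best is None else best[1]


rules = [(False, "/private/"), (True, "/private/public.html")]
print(decide("/private/secret", rules))       # blocked by Disallow: /private/
print(decide("/private/public.html", rules))  # the longer Allow wins
```

The tuple comparison is the whole trick: sorting by (length, is_allow) encodes both "longest match wins" and "Allow beats Disallow on ties" in one key.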
robots.txt blocks crawling — it does NOT block indexing. If a URL has been indexed previously or is linked from elsewhere, Google can keep it in the index even after a Disallow lands. To remove a URL from the index, use a noindex meta tag (which requires the page to be crawlable so the tag can be read) or the Search Console URL removal tool. For private content, use HTTP authentication, not robots.txt.
Test the URLs you actually care about — your homepage, your highest-traffic blog post, your sitemap.xml URL, your /api endpoints. If a URL you want indexed shows Disallow, check whether it's intentional. If a URL you want private shows Allow, fix it. The tool also surfaces every Sitemap: declaration in the file so you can sanity-check that your XML sitemap is registered and discoverable.
We proxy the robots.txt fetch through a Cloudflare Worker at workers.convertful.app — the browser can't read another site's robots.txt directly because of CORS. The Worker fetches the public file, returns the raw text plus a status code, and we parse it client-side. Nothing about your test session is stored. Your test paths and User-Agent strings stay in your browser; the Worker only sees the site URL.
Yes — once. We fetch /robots.txt over HTTPS with a 5-second timeout and the User-Agent ConvertfulBot/1.0. The raw file and its parsed structure are returned to your browser; nothing is stored.
Browsers can't fetch a different origin's robots.txt directly because of CORS, so we proxy the request through our Cloudflare Worker. The Worker fetches the public file and returns the result. We don't store the URL, the file, or your test paths.
Google's resolution rule is 'longest match wins; on ties, Allow beats Disallow.' So Disallow: /private/ + Allow: /private/public.html means /private/secret is blocked but /private/public.html is allowed. We highlight the matched rule so you can see exactly which line decided the verdict.
Yes. * matches any sequence (Disallow: /*.pdf blocks every .pdf path) and $ anchors the end of the path (Disallow: /end$ blocks /end but not /end-of-line). These are Google's spec extensions; not every crawler honours them, but the major ones do.
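One common way to evaluate these wildcards is to translate each robots.txt pattern into a regular expression: escape everything literally, turn * into .*, and keep a trailing $ as an end anchor. A Python sketch under those assumptions (function names are illustrative, not this tool's implementation):

```python
import re

def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex.

    * matches any sequence of characters; a trailing $ anchors
    the end of the path; everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in core)
    # Patterns always match from the start of the path;
    # without $, anything may follow (prefix match).
    return re.compile("^" + body + ("$" if anchored else ""))

def matches(pattern, path):
    return bool(pattern_to_regex(pattern).match(path))


print(matches("/*.pdf", "/docs/file.pdf"))   # * spans "docs/file"
print(matches("/end$", "/end"))              # $ anchors the exact path
print(matches("/end$", "/end-of-line"))      # anchored, so no match
```

Note the escaping step: a literal . or ? in a path pattern must not be treated as a regex metacharacter, which is why each character is passed through re.escape before the wildcard substitution.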