Check whether Googlebot can crawl a URL, with rules grouped by User-Agent. Free, private, runs in your browser.
This tool needs a server because the browser can’t fetch arbitrary cross-origin URLs. Our Cloudflare Worker fetches the public URL you typed and returns the parsed result — nothing is stored.
We’ll fetch /robots.txt from this site’s origin.
Test how a specific bot sees robots.txt — includes major search engines + 2026 AI crawlers.
Paths to evaluate against robots.txt for the User-Agent above.
How this works: a Cloudflare Worker at workers.convertful.app fetches the public robots.txt — nothing is stored. Pairs well with opengraph-preview and ssl-checker.
robots.txt is a plain-text file served from a website's root that tells crawlers which paths they're allowed to fetch. It's a polite request — well-behaved crawlers (Googlebot, Bingbot, AhrefsBot, every major search engine) honour it. Bad actors ignore it. The file groups rules by User-Agent, so you can give Googlebot different access from the generic *. A misconfigured robots.txt can completely deindex a site or, conversely, expose private endpoints to crawlers.
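For illustration, a minimal robots.txt with per-agent groups and a Sitemap declaration might look like this (the domain and paths are hypothetical):

```
# Rules for Googlebot only
User-agent: Googlebot
Disallow: /private/
Allow: /private/public.html

# Rules for everyone else
User-agent: *
Disallow: /api/

Sitemap: https://example.com/sitemap.xml
```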
When two rules conflict, Google's resolver picks 'longest matching rule wins; on equal length, Allow beats Disallow.' So Disallow: /private/ + Allow: /private/public.html means /private/secret is blocked but /private/public.html is allowed. We highlight the matched rule so you can see the verdict's source. Other crawlers may behave slightly differently — Bingbot is mostly Google-compatible; small crawlers vary.
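The resolution logic can be sketched in a few lines of Python. This is a simplified illustration, not this tool's actual code: it assumes the rules for the matched User-Agent group have already been parsed into (is_allow, path_prefix) pairs, and it uses plain prefix matching (wildcards are a separate concern):

```python
def decide(path, rules):
    """Google-style resolution: longest matching rule wins;
    on equal length, Allow beats Disallow.

    rules: list of (is_allow: bool, prefix: str) for one User-Agent group.
    Returns True if crawling `path` is allowed.
    """
    best = None  # (match_length, is_allow)
    for is_allow, prefix in rules:
        if path.startswith(prefix):
            # Tuple comparison: longer match wins; on a length tie,
            # True (Allow) compares greater than False (Disallow).
            candidate = (len(prefix), is_allow)
            if best is None or candidate > best:
                best = candidate
    # No rule matched: crawling is allowed by default.
    return True if best is None else best[1]


rules = [(False, "/private/"), (True, "/private/public.html")]
print(decide("/private/secret", rules))       # blocked by Disallow: /private/
print(decide("/private/public.html", rules))  # the longer Allow wins
```

The tuple comparison is the whole trick: sorting by (length, is_allow) encodes both "longest match wins" and "Allow beats Disallow on ties" in one key.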
robots.txt blocks crawling — it does NOT block indexing. If a URL has been indexed previously or is linked from elsewhere, Google can keep it in the index even after a Disallow lands. To remove a URL from the index, use a noindex meta tag (which requires the page to be crawlable so the tag can be read) or the Search Console URL removal tool. For private content, use HTTP authentication, not robots.txt.
Test the URLs you actually care about — your homepage, your highest-traffic blog post, your sitemap.xml URL, your /api endpoints. If a URL you want indexed shows Disallow, check whether it's intentional. If a URL you want private shows Allow, fix it. The tool also surfaces every Sitemap: declaration in the file so you can sanity-check that your XML sitemap is registered and discoverable.
We proxy the robots.txt fetch through a Cloudflare Worker at workers.convertful.app — the browser can't read another site's robots.txt directly because of CORS. The Worker fetches the public file, returns the raw text plus a status code, and we parse it client-side. Nothing about your test session is stored. Your test paths and User-Agent strings stay in your browser; the Worker only sees the site URL.
Yes — once. We fetch /robots.txt over HTTPS with a 5-second timeout and the User-Agent ConvertfulBot/1.0. The raw file and its parsed structure are returned to your browser; nothing is stored.
Browsers can't fetch a different origin's robots.txt directly because of CORS, so we proxy the request through our Cloudflare Worker. The Worker fetches the public file and returns the result. We don't store the URL, the file, or your test paths.
Google's resolution rule is 'longest match wins; on ties, Allow beats Disallow.' So Disallow: /private/ + Allow: /private/public.html means /private/secret is blocked but /private/public.html is allowed. We highlight the matched rule so you can see exactly which line decided the verdict.
Yes. * matches any sequence (Disallow: /*.pdf blocks every .pdf path) and $ anchors the end of the path (Disallow: /end$ blocks /end but not /end-of-line). These are Google's spec extensions; not every crawler honours them, but the major ones do.
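One common way to evaluate these wildcards is to translate each robots.txt pattern into a regular expression: escape everything literally, turn * into .*, and keep a trailing $ as an end anchor. A Python sketch under those assumptions (function names are illustrative, not this tool's implementation):

```python
import re

def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex.

    * matches any sequence of characters; a trailing $ anchors
    the end of the path; everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in core)
    # Patterns always match from the start of the path;
    # without $, anything may follow (prefix match).
    return re.compile("^" + body + ("$" if anchored else ""))

def matches(pattern, path):
    return bool(pattern_to_regex(pattern).match(path))


print(matches("/*.pdf", "/docs/file.pdf"))   # * spans "docs/file"
print(matches("/end$", "/end"))              # $ anchors the exact path
print(matches("/end$", "/end-of-line"))      # anchored, so no match
```

Note the escaping step: a literal . or ? in a path pattern must not be treated as a regex metacharacter, which is why each character is passed through re.escape before the wildcard substitution.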