robots.txt Generator

robots.txt is a plain-text file at the root of your site that tells web crawlers which paths they can and can't access. It's one of the oldest web standards (1994) and still the primary way sites communicate crawler policy. Every major search engine and AI training crawler checks this file on its first visit to your site, and well-behaved scraping tools do too.
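The whole format is a handful of directives. A minimal file might look like this (the domain and paths here are placeholders, not values the tool produces):

```text
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
```

This tells every crawler to skip anything under /private/ and points them at the XML sitemap.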

Fill out the user-agent blocks on the left. Each block targets one bot (or all bots with *) and lists the paths that bot may crawl (Allow) or must not crawl (Disallow). Add one or more Sitemap URLs at the bottom. The preview on the right updates as you type. When you're happy, click the Download button to save the file as robots.txt, then upload it to your site's root so it's accessible at https://your-site.com/robots.txt.

Click Load XooCode example above the form to see a realistic two-block example: one User-agent: * block allowing everything except legacy WordPress paths, and one User-agent: GPTBot block disallowing everything (a common opt-out from OpenAI training). It also includes a Sitemap line pointing to XooCode's own sitemap.
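The resulting file has roughly this shape (the specific WordPress paths and the sitemap URL below are illustrative placeholders, not necessarily the exact values the button loads):

```text
User-agent: *
Allow: /
Disallow: /wp-admin/
Disallow: /wp-login.php

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
```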

If you also need to describe your site's content structure for AI crawlers (ChatGPT, Perplexity, Claude, Gemini), see the llms.txt Generator. The two files complement each other: robots.txt says which pages crawlers are allowed to fetch, llms.txt describes what's on those pages for AI context.

Everything runs in your browser. Your form data is never sent to XooCode's servers or logged anywhere. When you close the tab, nothing persists.

robots.txt form

User-agent blocks (1)

Bot name (* for all bots). Common values shown in the dropdown.

Paths this bot may crawl. Use / to allow everything. Crawling defaults to allowed for any path no Disallow rule matches.

Paths this bot must not crawl. Use / to block everything for this bot.
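Allow and Disallow interact by specificity: most modern crawlers, including Googlebot, apply the longest matching rule, so a more specific Allow can carve an exception out of a broader Disallow. For example (paths are illustrative):

```text
User-agent: *
Disallow: /downloads/
Allow: /downloads/free/
```

Here /downloads/report.pdf is blocked, but /downloads/free/sample.pdf is crawlable because the Allow rule matches a longer path.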

Seconds between requests. Honored by Bing, Yandex, and a few others. Googlebot ignores this — use Google Search Console for Googlebot rate limiting.
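For crawlers that honor it, the Crawl-delay line sits inside that bot's block. A sketch with a hypothetical 10-second delay for Bingbot:

```text
User-agent: Bingbot
Crawl-delay: 10
Disallow: /search/
```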

Absolute URLs to your XML sitemap files. Emitted as standalone 'Sitemap:' lines at the end of the file. Multiple sitemaps are allowed.

The canonical hostname for your site. A legacy directive, originally from Yandex, that some crawlers still read. Leave blank if unsure.
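Sitemap and Host are emitted as standalone lines rather than inside a user-agent block. With placeholder URLs, the end of the file might read:

```text
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-posts.xml
Host: example.com
```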

Live robots.txt preview

User-agent: *

Save this file as robots.txt in your site's root directory, so it's accessible at https://your-site.com/robots.txt. Crawlers check this path automatically on their first visit to your site. Everything runs in your browser. Nothing is sent to XooCode's servers.
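Before uploading, you can sanity-check that the file behaves as intended with Python's standard-library parser. This is a quick local check, not part of the tool; the robots.txt content and URLs below are made-up examples:

```python
from urllib.robotparser import RobotFileParser

# Paste the generated file's contents here (or read them from disk).
robots_txt = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a given bot may fetch a given URL under these rules.
print(parser.can_fetch("*", "https://example.com/public/page"))    # True
print(parser.can_fetch("*", "https://example.com/private/data"))   # False
print(parser.can_fetch("GPTBot", "https://example.com/anything"))  # False
```

If any answer surprises you, adjust the form's Allow/Disallow paths and re-download before deploying.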