Robots.txt Generator — Visual Builder with Bot Presets
Site owners, SEOs, and developers use this free robots.txt generator to control crawl access without hand-coding directives. Add rules visually, block AI crawlers in one click, set a crawl delay, and download a valid robots.txt file in seconds.
Bot Presets
Click a preset to add a disallow-all rule for that bot:
Rules
Path patterns (combined in the example below):
* = matches any sequence of characters
$ = end of URL (e.g. /*.pdf$ blocks PDF files)
/admin/ = blocks the /admin/ folder and all sub-paths
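For example, those patterns combine like this in a generated file (the paths are placeholders; lines starting with # are comments, which the protocol permits):

    User-agent: *
    # Blocks /admin/ and every path beneath it
    Disallow: /admin/
    # $ anchors the match, so only URLs ending in .pdf are blocked
    Disallow: /*.pdf$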
How to Use the Robots.txt Generator
- Enter your sitemap URL in Global Settings so search engines can find it directly from robots.txt.
- Click a bot preset to instantly add a disallow-all block for that specific crawler.
- Use "Block All AI Crawlers" to add GPTBot, CCBot, anthropic-ai, PerplexityBot, ChatGPT-User, and Google-Extended in one click.
- Click "+ Add User-Agent" to create a custom rule block, then set the path and directive (Allow or Disallow).
- Watch the live preview panel update on the right as you build your rules.
- Click Copy to Clipboard or Download robots.txt and upload the file to your domain root; a sample of the finished output follows these steps.
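For illustration, the steps above produce a file along these lines (example.com and /private/ are placeholders for your own sitemap URL and rule):

    Sitemap: https://example.com/sitemap.xml

    User-agent: *
    Disallow: /private/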
Key Features
- Visual rule builder — add user-agent blocks and Allow/Disallow directives without writing a single line of code.
- Bot presets — one-click rules for Googlebot, Bingbot, GPTBot, CCBot, anthropic-ai, and PerplexityBot.
- Block all AI crawlers — adds the major AI training bots in a single click, with no manual entry required.
- Crawl-delay support — set a delay between requests for bots that respect this directive (note: Googlebot ignores it).
- Live preview — the generated robots.txt syntax updates in real time as you configure each rule.
- Download as robots.txt — exports a properly named, ready-to-upload file with one click.
Use Cases
Block admin and staging paths from search engines
Add a wildcard rule for User-agent: * with Disallow: /admin/ and Disallow: /staging/ to keep sensitive internal pages out of crawl queues. Keep in mind that robots.txt blocks crawling, not indexing, and a crawler that cannot fetch a page cannot see a noindex tag on it; for pages that must never appear in search results, allow crawling and serve a noindex tag, or put the pages behind authentication.
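A minimal sketch of that rule block (adjust the paths to match your site):

    User-agent: *
    Disallow: /admin/
    Disallow: /staging/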
Prevent AI training bots from scraping your content
Click "Block All AI Crawlers" to immediately add disallow rules for GPTBot, CCBot, anthropic-ai, PerplexityBot, and Google-Extended. This is the fastest way to opt out of AI training data collection without affecting your organic search visibility.
Configure a robots.txt file for a new website launch
Before going live, build your robots.txt using the visual editor: allow Googlebot, block your /wp-admin/ path if you are on WordPress, add your sitemap URL, then download and upload the file to your root directory. Reference it in Google Search Console to confirm it is being read correctly.
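As a sketch, the launch file for that WordPress scenario might look like this (example.com is a placeholder; the admin-ajax.php exception is a common WordPress pattern, since front-end features call that endpoint):

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php

    Sitemap: https://example.com/sitemap.xml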
Reduce server load by limiting aggressive crawlers
If Bingbot or other crawlers are causing excessive server load, add a Crawl-delay directive in the appropriate user-agent block; a delay of 10–30 seconds is usually enough to throttle request rates. Remember this has no effect on Googlebot, which paces its own crawling; if Googlebot itself is overwhelming your server, temporarily returning 503 or 429 responses is the mechanism Google documents.
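For example, to ask Bingbot to wait ten seconds between requests (the value is illustrative; tune it to your server's capacity):

    User-agent: Bingbot
    Crawl-delay: 10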
Frequently Asked Questions
What is a robots.txt file?
A robots.txt file is placed at the root of your website and tells web crawlers which pages they are allowed to access. It follows the Robots Exclusion Protocol and is the first file most crawlers fetch when visiting your site.
Does robots.txt stop a page from being indexed?
No: robots.txt only blocks crawling, not indexing. If other sites link to a disallowed page, Google may still index the URL without visiting it. Use the 'noindex' meta tag or HTTP header to prevent a page from appearing in search results.
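For reference, the two forms look like this; the header variant is useful for PDFs and other non-HTML files:

    <meta name="robots" content="noindex">     (in the page's <head>)
    X-Robots-Tag: noindex                      (sent as an HTTP response header)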
What does Disallow: / do?
Disallow: / blocks the crawler from accessing your entire website. This is typically used for staging environments. An empty Disallow line means everything is allowed. Use with caution: a misplaced Disallow: / on a production site can effectively de-index your entire domain.
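Side by side, the two cases read:

    User-agent: *
    Disallow: /       # blocks the entire site

    User-agent: *
    Disallow:         # empty value: everything is allowed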
What are AI crawlers, and should I block them?
AI crawlers scrape web content to train language models. Examples include GPTBot (OpenAI), CCBot (Common Crawl), anthropic-ai (Anthropic), and PerplexityBot. Blocking them does not affect search engine rankings. Whether to block them is a personal decision about content licensing and training data consent.
What does Crawl-delay do?
Crawl-delay tells a crawler to wait N seconds between page requests, which is useful for servers struggling with bot traffic. Note: Googlebot ignores Crawl-delay, and Google has retired Search Console's manual crawl-rate setting; Googlebot paces itself automatically, and temporarily returning 503 or 429 responses is the documented way to slow it down.
How do I test my robots.txt file?
Google Search Console's robots.txt report (under Settings) shows whether Google can fetch and parse your file, and the URL Inspection tool reports whether a specific URL is blocked by your rules. You can also navigate directly to https://yourdomain.com/robots.txt in a browser to confirm the file is being served correctly. Bing Webmaster Tools offers a robots.txt tester for Bingbot.
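If you prefer to check rules programmatically, Python's standard-library robotparser can fetch and evaluate a live file. A minimal sketch, with example.com standing in for your domain:

    from urllib import robotparser

    # Point the parser at the live file and download it
    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # placeholder domain
    rp.read()

    # can_fetch() returns True if the named user-agent may crawl the URL
    print(rp.can_fetch("Googlebot", "https://example.com/admin/login"))
    print(rp.can_fetch("GPTBot", "https://example.com/blog/post"))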
Where does the robots.txt file go?
The robots.txt file must be at the root of your domain — exactly at https://example.com/robots.txt. It cannot live in a subdirectory. If your site uses subdomains, each subdomain needs its own robots.txt file at its respective root.
Can I use multiple User-agent blocks?
Yes. You can have as many User-agent blocks as needed. Each block applies its Allow and Disallow rules to the specified bot. Specific User-agent blocks take precedence over the wildcard (User-agent: *) catch-all block.
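For instance, in the file below Googlebot matches its own (empty) block and ignores the wildcard group entirely, so it may crawl /drafts/ while every other bot is barred from it:

    User-agent: *
    Disallow: /drafts/

    User-agent: Googlebot
    Disallow: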
A well-crafted robots.txt file is a small file with a significant job: it coordinates how dozens of different crawlers interact with your site every day. Get it wrong and you might accidentally block Googlebot from pages you want indexed, or leave your admin panel open to every bot on the web. This free robots.txt generator removes the risk by giving you a visual interface — no need to memorise Disallow syntax or worry about formatting errors. Add rules, block AI scrapers, set your sitemap, and download a clean file that is ready to upload straight to your domain root. Changes take effect as soon as the file is live and crawlers next request it.