Free tool

robots.txt generator

By Tiago Costa
Updated on July 01, 2026

Choose allow all, block all or build custom rules and see the file instantly. Copy or download it and publish it at your site's root. No sign up, right in your browser.

Your robots.txt

User-agent: *
Disallow:

Quick guide

Everything about robots.txt

The robots.txt generator builds the right file for your site in seconds. Choose allow all, block all or create custom rules, copy the result and publish it at your domain root. Below is a full guide to the syntax, the mistakes that hurt SEO and how to handle AI bots.

What robots.txt is and where it lives

robots.txt is a plain text file that tells search bots which parts of your site they may crawl. Crawlers such as Googlebot fetch it before they crawl the rest of your domain. Its rules guide crawling, but they do not replace a password: sensitive content should always stay behind authentication.

The file must live at the domain root and open at https://yoursite.com/robots.txt. The name is always robots.txt, all lowercase. Each subdomain and each protocol has its own file, so blog.yoursite.com uses a separate robots.txt from yoursite.com.
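As a rough illustration of the one-file-per-host rule, the sketch below derives the robots.txt address that governs a given page. The function name robots_txt_url is made up for this example, and yoursite.com is the same placeholder domain used in this guide.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep the protocol and the host (netloc also carries any port);
    # drop the path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yoursite.com/shop/item?id=3"))
print(robots_txt_url("https://blog.yoursite.com/post/hello"))
```

The two calls return different addresses because the subdomain is part of the host, which is why blog.yoursite.com needs its own file.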

robots.txt syntax and directives

A robots.txt file is made of blocks. Each block starts with the User-agent directive, which sets the bot the rules apply to, followed by one or more Disallow and Allow lines that block or open paths. The Sitemap line can appear anywhere and points to your sitemap.

Directive	What it does	Example
User-agent	Sets the bot that follows the block of rules. The asterisk applies to all.	User-agent: *
Disallow	Blocks crawling of a path or folder.	Disallow: /admin/
Allow	Opens a path inside a blocked folder.	Allow: /admin/public/
Sitemap	Points to the full site map URL.	Sitemap: https://yoursite.com/sitemap.xml
* (wildcard)	Stands for any sequence of characters in a path.	Disallow: /*?color=
$ (end of URL)	Marks the exact end of the URL.	Disallow: /*.pdf$
Disallow: /	Blocks the entire site at once.	Disallow: /
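To see how these directives combine, here is a minimal sketch that feeds a small file to Python's standard urllib.robotparser module and asks which paths are open. The file content and the helper name allowed are illustrative assumptions. One caveat: Python's parser applies the first rule that matches and does not understand the * and $ wildcards, while Google applies the most specific rule, so the Allow line is placed before the Disallow line to get the same answer from both.

```python
from urllib.robotparser import RobotFileParser

# Example file, made up for this sketch.
ROBOTS_TXT = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str) -> bool:
    """Check whether a generic bot may crawl the given path."""
    return parser.can_fetch("*", "https://yoursite.com" + path)


print(allowed("/admin/secret"))       # path under the blocked folder
print(allowed("/admin/public/page"))  # path opened again by Allow
print(allowed("/blog/"))              # path with no matching rule
print(parser.site_maps())             # Sitemap lines (Python 3.8+)
```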

The * and $ wildcards

The asterisk (*) stands for any sequence of characters, so Disallow: /*.pdf$ blocks every URL that ends in .pdf. The dollar sign ($) marks the end of the URL and keeps you from blocking similar addresses by mistake, such as a PDF address followed by a query string. Both work on Google and Bing, but not every bot understands wildcards, so use them with care.
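The wildcard behavior can be approximated by translating a pattern into a regular expression. This is a simplified sketch, not the matcher any search engine actually runs; the function names pattern_to_regex and matches are invented for the example.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and let each * stand for any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # A trailing $ pins the match to the exact end of the URL.
    return re.compile(body + (r"\Z" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, like a robots.txt path prefix.
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.pdf$", "/files/report.pdf"))
print(matches("/*.pdf$", "/files/report.pdf?download=1"))
print(matches("/*?color=", "/shoes?color=red"))
```

The first and third checks match; the second does not, because the $ requires the URL to end right after .pdf.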

Where to place the file

Save the result as robots.txt and upload it to the main folder of your server, so it opens at https://yoursite.com/robots.txt. To test it, just type that address in your browser. Google also offers a robots.txt report in Search Console that shows the latest version it read and flags errors.

Blocking crawling vs blocking indexing

This is the most common SEO mistake with robots.txt. Blocking a page in robots.txt stops crawling, but it does not remove the page from Google's index. If other sites link to it, the address can still appear in search with no title or description. To take a page out of results, let it be crawled and use the <meta name='robots' content='noindex'> tag in the HTML. robots.txt controls who gets in; noindex controls what stays in the index.
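If you want to confirm that a page really carries the noindex instruction, a small check along these lines can help. It is a sketch built on Python's standard html.parser module; the class and function names are made up here, and it only looks at the robots meta tag, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        # The content attribute is a comma-separated list of directives.
        directives = [item.strip() for item in content.split(",")]
        if name == "robots" and "noindex" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = "<html><head><meta name='robots' content='noindex'></head></html>"
print(has_noindex(page))
```

A crawler can only see this tag if robots.txt lets it fetch the page in the first place.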

Common mistakes that hurt SEO

Blocking the whole site by accident: a leftover Disallow: / line from staging stops Google from crawling any page.
Blocking CSS and JavaScript: Google needs those files to render the page. Closing /assets/ or /js/ can make it see a broken layout.
Using robots.txt to hide content: blocked pages can still show up in search if other pages link to them. The noindex tag exists for that.
Getting case wrong: paths are case sensitive, so /Admin and /admin are treated as different addresses.
Forgetting the Sitemap line: adding the Sitemap helps bots find every URL you want indexed.
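Some of the mistakes above can be caught automatically before you publish. The sketch below is a hypothetical mini linter, not a full validator: the function name lint_robots_txt and the list of asset folders it flags (/assets, /js, /css) are assumptions chosen for illustration.

```python
def lint_robots_txt(text: str) -> list[str]:
    """Return warnings for a few common robots.txt mistakes."""
    warnings = []
    agents = []          # user-agents named by the current block
    in_rules = False     # True once the current block has a rule line
    has_sitemap = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent after rules starts a new block
                agents, in_rules = [], False
            agents.append(value)
        elif field == "sitemap":
            has_sitemap = True
        elif field in ("allow", "disallow"):
            in_rules = True
            if field != "disallow":
                continue
            if value == "/" and "*" in agents:
                warnings.append("Disallow: / under User-agent: * blocks the whole site")
            # Assumed asset folders; adjust to your own site layout.
            if value.lower().startswith(("/assets", "/js", "/css")):
                warnings.append(f"Disallow: {value} may block CSS or JavaScript")
    if not has_sitemap:
        warnings.append("No Sitemap line found")
    return warnings


for warning in lint_robots_txt("User-agent: *\nDisallow: /\n"):
    print(warning)
```

Case mistakes are not covered, because the linter cannot know which spelling your server actually uses.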

How to control AI bots

The main AI bots respect robots.txt too. You decide whether your content feeds models like ChatGPT, Claude and Gemini or shows up in their answers. Each company uses its own User-agent, and you block or allow each one just like any other bot.

Bot	Company	What it does
GPTBot	OpenAI	Collects pages to train the ChatGPT models.
OAI-SearchBot	OpenAI	Indexes content that can appear in ChatGPT search.
ChatGPT-User	OpenAI	Visits a page when a user asks for it in ChatGPT.
ClaudeBot	Anthropic	Collects content to train Claude.
Google-Extended	Google	Controls use of your content in Gemini and Google AI.
PerplexityBot	Perplexity	Indexes pages to answer inside Perplexity.
CCBot	Common Crawl	Public dataset used by many AI models.

To block any of them, write a block with the bot's User-agent and a Disallow: / line. For example, User-agent: GPTBot followed by Disallow: / asks OpenAI's bot not to collect any page. Keep in mind this is a request: well-behaved bots obey it, but the block does not work as a technical barrier.
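The block-per-bot pattern is easy to generate. The following is a minimal sketch of the idea, not the code behind this generator; the function name build_robots_txt and its parameters are invented for the example.

```python
def build_robots_txt(blocked_agents, sitemap=None):
    """Build a robots.txt that blocks the listed bots and allows the rest."""
    # One block per blocked bot: its User-agent plus Disallow: /
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    # An empty Disallow leaves the site open to every other bot.
    blocks.append("User-agent: *\nDisallow:")
    if sitemap:
        blocks.append(f"Sitemap: {sitemap}")
    return "\n\n".join(blocks) + "\n"


print(build_robots_txt(["GPTBot", "CCBot"], "https://yoursite.com/sitemap.xml"))
```

Bot names are taken from the table above; any other User-agent string can be passed the same way.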

FAQ

Common questions about robots.txt

Is the robots.txt generator free?

Yes. It is 100% free, with no sign up and no usage limit. The file is built in your browser.

Where do I put the generated file?

At the domain root, so it opens at https://yoursite.com/robots.txt. Just save it as robots.txt and upload it to your server.

How do I block the entire site?

Use the Block all mode, which generates User-agent: * and Disallow: /. This asks bots not to crawl any page.

Do I need a robots.txt for my site?

It is not required. Without one, bots assume they may crawl everything. It helps when you want to block specific areas or point to your sitemap.

Does robots.txt hide a page from search?

Not reliably. It blocks crawling, but the page can still appear if other pages link to it. To remove it, use the noindex meta tag and leave the page crawlable so bots can read the tag.

How do I stop AI bots like GPTBot from using my content?

Add a block with the bot's User-agent (for example GPTBot) followed by a Disallow: / line. The main AI bots, such as GPTBot, ClaudeBot and Google-Extended, respect robots.txt.


Glossary terms

Noindex

Noindex is a directive that tells search engines not to include a page in the search results. It is applied through a robots meta tag in the HTML or through an HTTP header, and it makes Google drop the page from the index even when other sites link to it. Unlike robots.txt, which blocks crawling, noindex requires the page to stay crawlable so the search engine can read the instruction.

Crawler

A crawler is a robot program that travels the web from link to link, downloading and reading pages to feed a search engine's index. Also called a spider, robot or bot, the best known example is Googlebot. The crawler is the first stage of search: before a page can be indexed and ranked, it has to be found and read by one of these crawlers.

Robots.txt

Robots.txt is a plain text file, saved in the root of a domain, that tells search engine crawlers which parts of a site they can or cannot crawl. It follows the Robots Exclusion Protocol and controls crawling, not indexing, so it is not the right tool to hide a page from search results.

Crawl budget

Crawl budget is the number of pages a search engine like Google is willing to crawl on a site within a given period. It comes from the combination of how much your server can handle the robot's visits and how interested Google is in revisiting that content. On small sites it is rarely a problem, but on large sites every visit from the crawler becomes a scarce resource worth managing.
