Free tool

robots.txt checker

By Tiago Costa. Updated on July 01, 2026

Paste a site URL and see its robots.txt analyzed: User-agent groups, blocking rules, Sitemaps and warnings. Right in your browser, no sign up.

Quick guide

How to read and check robots.txt

The robots.txt file is a plain text file at the root of a site, at yoursite.com/robots.txt, that tells search crawlers which parts of the site they may crawl. A single wrong line, such as a stray Disallow: /, can stop crawling for the whole site, so it pays to check the file now and then.

How to read robots.txt line by line

The file is organized into groups. Each group starts with a User-agent (the crawler the rules apply to) and lists directives below it. The most common are Disallow (a path the crawler should not crawl) and Allow (an exception that opens a path inside a blocked area).

User-agent: * applies the rules to every crawler. You can also target a specific bot, such as Googlebot.
Disallow: /admin/ asks crawlers not to crawl anything inside /admin/.
Disallow: left empty opens the whole site for that group.
Sitemap: points to the full URL of your sitemap to help the engine find your pages.
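
The group structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the checker's actual code: the names Group and parse_robots are hypothetical, and the sketch only handles User-agent, Allow, Disallow and Sitemap lines.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Group:
    # Hypothetical container: one User-agent group and its rules.
    user_agents: list[str] = field(default_factory=list)
    rules: list[tuple[str, str]] = field(default_factory=list)  # (directive, path)


def parse_robots(text: str) -> tuple[list[Group], list[str]]:
    """Split robots.txt text into User-agent groups and Sitemap URLs."""
    groups: list[Group] = []
    sitemaps: list[str] = []
    current: Optional[Group] = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            # Consecutive User-agent lines share one group; a User-agent
            # line that follows rules starts a new group.
            if current is None or current.rules:
                current = Group()
                groups.append(current)
            current.user_agents.append(value)
        elif key in ("allow", "disallow") and current is not None:
            current.rules.append((key, value))
        elif key == "sitemap":
            # Sitemap lines are independent of any group.
            sitemaps.append(value)
    return groups, sitemaps
```

Calling parse_robots on a file with a User-agent: * group and a Sitemap line returns one Group whose rules list holds the Disallow and Allow entries, plus the sitemap URL in a separate list.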

Main directives and what they do

Directive	Example	What it does
User-agent	User-agent: *	Defines which crawler the group of rules applies to
Disallow	Disallow: /cart/	Asks the crawler not to crawl that path
Allow	Allow: /blog/	Opens a path inside a blocked area
Sitemap	Sitemap: https://yoursite.com/sitemap.xml	Points to where the site's sitemap lives
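
To see how these directives combine, here is a small sketch using Python's standard urllib.robotparser module. The file content and the helper name is_allowed are made up for illustration. One assumption to keep in mind: Python's parser applies the first matching rule in file order, while Google picks the most specific match, so the Allow line is placed before the Disallow line to get the same answer under both.

```python
from urllib.robotparser import RobotFileParser

# Example file content, invented for this sketch.
ROBOTS_TXT = """\
User-agent: *
Allow: /cart/help/
Disallow: /cart/
Sitemap: https://yoursite.com/sitemap.xml
"""


def is_allowed(robots_text: str, user_agent: str, url: str) -> bool:
    """Return True if robots_text lets user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


for path in ("/blog/post", "/cart/checkout", "/cart/help/returns"):
    url = "https://yoursite.com" + path
    print(path, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

With this file, /blog/post and /cart/help/returns come back as allowed and /cart/checkout as blocked.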

Common robots.txt mistakes

Mistake	Effect
Disallow: / for User-agent: *	Blocks search engines from crawling the entire site
Missing robots.txt	Google crawls everything by default, with no guidance from you
No Sitemap line	You lose an easy hint that speeds up page discovery
Blocking CSS and JS	Google may render the page incompletely
Sitemap with a relative URL	The Sitemap line needs the full URL to be read
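
Most of the mistakes in the table can be caught with a simple scan of the file. The sketch below is one possible approach, not the tool's own logic: find_warnings is a hypothetical name, the CSS and JS check is a rough heuristic that only looks at the end of the path, and a missing robots.txt is not covered because that requires fetching the file.

```python
from urllib.parse import urlsplit


def find_warnings(robots_text: str) -> list[str]:
    """Return warnings for a few common robots.txt mistakes."""
    warnings: list[str] = []
    agents: list[str] = []  # User-agent values of the group being read
    in_rules = False        # True once the current group has a rule
    has_sitemap = False
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:  # a new group starts after the previous rules
                agents, in_rules = [], False
            agents.append(value)
        elif key == "disallow":
            in_rules = True
            if value == "/" and "*" in agents:
                warnings.append("Disallow: / for User-agent: * blocks the whole site")
            # Heuristic only: catches paths such as /*.css$ or /static/app.js
            if value.lower().rstrip("$").endswith((".css", ".js")):
                warnings.append(f"Disallow: {value} may block CSS or JS needed for rendering")
        elif key == "allow":
            in_rules = True
        elif key == "sitemap":
            has_sitemap = True
            if urlsplit(value).scheme not in ("http", "https"):
                warnings.append(f"Sitemap: {value} is not a full URL")
    if not has_sitemap:
        warnings.append("No Sitemap line found")
    return warnings
```

A file with only User-agent: * and Disallow: /admin/ plus a full Sitemap URL produces an empty list, while a relative Sitemap path or a site-wide Disallow: / each adds one warning.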

Blocking crawling and blocking indexing

These are different things. Disallow in robots.txt asks the crawler not to crawl the page; it does not remove the page from the index. To remove a page from Google's index, use the noindex meta tag in the page's own <head>. Important detail: a page blocked in robots.txt can still show up in search, for example when other sites link to it, because Google cannot crawl the page and so never reads its noindex. To remove it from the index, open crawling and use noindex.
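
The interaction between the two signals can be sketched as a small check. This is an illustration under assumptions: the class and function names are hypothetical, the HTML is passed in as a string, and only the robots meta tag is inspected (not the equivalent HTTP header).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a noindex robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def indexing_advice(blocked_in_robots: bool, html: str) -> str:
    """Describe how a robots.txt block and a noindex tag interact."""
    noindex = has_noindex(html)
    if noindex and blocked_in_robots:
        return "conflict: crawlers cannot read the noindex while the page is blocked"
    if noindex:
        return "ok: the page is crawlable and asks not to be indexed"
    if blocked_in_robots:
        return "blocked from crawling, but the URL can still be indexed"
    return "crawlable and indexable"
```

The first branch is the case the paragraph above warns about: the noindex is on the page, but a Disallow rule keeps crawlers from ever seeing it.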

FAQ

robots.txt questions

Is the checker free?

Yes, free and no sign up. Paste the site URL and get the robots.txt analysis instantly.

Where is a site's robots.txt?

Always at the root of the host, at yoursite.com/robots.txt. Each subdomain needs its own file. The tool derives the origin from the URL you paste and fetches that file automatically.
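
Deriving the robots.txt location from any page URL takes only the scheme and host. The snippet below is a sketch of that step with Python's standard urllib.parse; the function name robots_url is hypothetical and this is not necessarily how the tool does it.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the origin of page_url."""
    parts = urlsplit(page_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected a full http(s) URL, got {page_url!r}")
    # Keep scheme and host (with port, if any); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://yoursite.com/blog/post?x=1#top"))
```

For the URL in the example, the function returns https://yoursite.com/robots.txt.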

What does Disallow: / mean?

It is the rule that blocks the whole site for that group's crawler. When it appears for User-agent: *, it applies to every crawler that has no more specific group of its own, so in most files no search crawler crawls the site. The tool flags it in red.

Does robots.txt stop indexing on Google?

Not directly. It controls crawling. To remove a page from the index, use the noindex meta tag on the page and keep crawling open so Google can read it.

Do I need to declare the Sitemap in robots.txt?

It is not required, but it helps. The Sitemap line tells search engines where your sitemap is, which speeds up page discovery. The tool warns you when it is missing.

Is my data stored?

The check runs on demand and the robots.txt content is not stored.

Glossary terms

Cloaking

Cloaking is a black hat SEO technique that consists of showing the search engine different content from what is displayed to the user, in order to manipulate the ranking. In practice, the server detects whether the visitor is Google's robot or a person and serves different versions of the same URL. Because it deceives both the search engine and the visitor, cloaking is forbidden by Google's guidelines and can lead to the page being removed from the results.

Noindex

Noindex is a directive that tells search engines not to include a page in the search results. It is applied through a robots meta tag in the HTML or through an HTTP header, and it makes Google drop the page from the index even when other sites link to it. Unlike robots.txt, which blocks crawling, noindex requires the page to stay crawlable so the search engine can read the instruction.

Crawler

A crawler is a robot program that travels the web from link to link, downloading and reading pages to feed a search engine's index. Also called a spider, robot or bot, the best known example is Googlebot. The crawler is the first stage of search: before a page can be indexed and ranked, it has to be found and read by one of these crawlers.

Robots.txt

Robots.txt is a plain text file, saved in the root of a domain, that tells search engine crawlers which parts of a site they can or cannot crawl. It follows the Robots Exclusion Protocol and controls crawling, not indexing, so it is not the right tool to hide a page from search results.
