What the X-Robots-Tag header is and how to control indexing

By Tiago Costa. Updated on July 2, 2026

Figure: A server response with the X-Robots-Tag noindex header blocking a PDF and an image from search.

Definition

The X-Robots-Tag is an HTTP header that controls how search engines index a URL. In practice, it:

  • travels in the server response, outside the page's HTML;
  • accepts the same directives as the meta robots tag, such as noindex and nofollow;
  • is the only way to apply noindex to PDFs, images and other non-HTML files;
  • lets you apply rules in bulk by file type on the server.

What the X-Robots-Tag header is

The X-Robots-Tag is an HTTP header, that is, a line of information the server sends along with a URL's response, even before the content itself. In that header, you can include indexing directives that tell search engines how to treat that page or file: whether it can be indexed, whether it should follow the links, whether it can show a cached version and so on.

The core difference from the well-known meta robots tag is where the instruction lives. The meta tag sits inside the HTML, in the page's head section. The X-Robots-Tag, on the other hand, sits in the server response, at a level before the content. Both achieve the same effect on indexing, but through different paths.

This distinction is more than a technical detail. It solves a real problem: how to prevent the indexing of files that have no HTML, such as a PDF or an image. Without a head section to receive a meta tag, these files can only be controlled by the HTTP header.

X-Robots-Tag vs meta robots tag: when to use each

The two tools deliver the same directives, so the choice depends on what you need to control and where it is more practical to apply the rule:

  • Meta robots tag: ideal for individual HTML pages. Just insert the line <meta name="robots" content="noindex"> in the document head. It is simple and does not require server access.
  • X-Robots-Tag: ideal for non-HTML files (PDFs, images, videos) and for applying rules in bulk. Since it sits on the server, you can, for example, send noindex to all the PDFs in a folder at once.

One rule confuses many people and is worth reinforcing: both the meta tag and the X-Robots-Tag only work if the search engine can crawl the URL and read the instruction. If you block the same page in robots.txt, the crawler never sees the header, so the noindex directive is never applied. Crawl blocking and an indexing directive are distinct things that should not be combined on the same URL.
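
The conflict described above can be checked before a rule goes live. The following is a minimal sketch using Python's standard urllib.robotparser, assuming you already have the robots.txt text at hand. The function name noindex_is_readable, the sample rules and the example.com URLs are illustrative assumptions. This parser follows the basic Robots Exclusion Protocol, so its answer may differ from a given search engine's handling of wildcard patterns.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_readable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the URL.

    Only a fetchable URL can have its X-Robots-Tag (or meta robots tag) read.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "https://example.com/private/report.pdf",
        "https://example.com/files/report.pdf",
    ):
        readable = noindex_is_readable(SAMPLE_ROBOTS, url)
        print(url, "-> header can be read" if readable else "-> blocked, header never seen")
```

A URL reported as blocked is one where a noindex header would have no effect until the robots.txt rule is lifted.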

Figure: Two paths to the same noindex: the meta robots tag in the HTML and the X-Robots-Tag in the HTTP header.

What values the X-Robots-Tag accepts

The X-Robots-Tag recognizes the same directives as the meta robots tag, and you can combine more than one, separated by a comma. The most common ones:

Value               What it tells the search engine
noindex             Do not include this URL in search results.
nofollow            Do not follow the links contained in that resource.
none                Shortcut for noindex and nofollow together.
noarchive           Do not show a cached version of the resource.
nosnippet           Do not show a text snippet or preview in the result.
unavailable_after   Stop indexing the URL after a defined date.

You can also target the directive at a specific robot by naming the agent before the value (for example, X-Robots-Tag: googlebot: noindex applies the rule only to Googlebot). This flexibility makes the X-Robots-Tag a powerful tool for anyone managing many different file types.
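
To make the value syntax concrete, here is a minimal sketch of a parser for a single X-Robots-Tag value: an optional agent prefix, then comma-separated directives, one of which (unavailable_after) carries a date. The function names parse_x_robots_tag and blocks_indexing are hypothetical, and the set of known directives is limited to the table above, so treat this as an illustration of the format rather than a complete implementation of any search engine's rules.

```python
from __future__ import annotations

import re

# Directives from the table above; unavailable_after is the one that carries a value.
KNOWN_DIRECTIVES = {
    "noindex", "nofollow", "none", "noarchive", "nosnippet", "unavailable_after",
}
AGENT_TOKEN = re.compile(r"^[A-Za-z0-9_-]+$")


def parse_x_robots_tag(value: str) -> tuple[str | None, dict[str, str | None]]:
    """Split one X-Robots-Tag value into (user_agent, {directive: argument})."""
    agent = None
    head, sep, rest = value.partition(":")
    head = head.strip()
    # A leading "name:" is a user agent unless the name is itself a directive.
    if sep and AGENT_TOKEN.match(head) and head.lower() not in KNOWN_DIRECTIVES:
        agent, value = head.lower(), rest

    directives: dict[str, str | None] = {}
    last = None
    for piece in value.split(","):
        name, has_arg, arg = piece.partition(":")
        name = name.strip().lower()
        if not name:
            continue
        if name not in KNOWN_DIRECTIVES and last is not None and directives[last] is not None:
            # Probably the tail of a date that contained a comma; glue it back.
            directives[last] += "," + piece
            continue
        directives[name] = arg.strip() if has_arg else None
        last = name
    return agent, directives


def blocks_indexing(directives: dict[str, str | None]) -> bool:
    """True when the directives include noindex, directly or through none."""
    return "noindex" in directives or "none" in directives


if __name__ == "__main__":
    print(parse_x_robots_tag("googlebot: noindex, nofollow"))
    print(parse_x_robots_tag("unavailable_after: 25 Jun 2010 15:00:00 PST"))
```

Keeping the agent separate from the directives makes it easy to ask whether a rule affects every crawler or only the one named.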

How to set the X-Robots-Tag on the server

Since the X-Robots-Tag lives in the HTTP response, the configuration is done on the web server, not in the content. The most common paths:

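  • Apache: in the configuration file or the .htaccess, a rule like Header set X-Robots-Tag "noindex" (which requires mod_headers) can be wrapped in an extension filter such as <FilesMatch "\.pdf$"> to target only the PDFs, for example.
  • Nginx: the directive add_header X-Robots-Tag "noindex"; placed directly in the server block applies to every response from that server, so to limit the rule, put it inside a location block that matches the URLs you want, such as location ~* \.pdf$.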
  • WordPress: SEO plugins let you mark content types with noindex, and some already handle the header for media files without you touching the server.

The point of attention is always the same: applying the rule to the right target. A poorly configured filter can end up sending noindex to important pages, so every change calls for a test before going to production. Getting this wrong drops entire pages from search without warning.
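
The targeting logic can also be illustrated outside Apache or Nginx. The sketch below is an application-level alternative, not the server configuration described above: a small WSGI middleware in Python that adds the header only when the request path matches an extension pattern. The class name XRobotsTagMiddleware and the pattern are assumptions chosen for illustration, and the same regular expression idea is what an extension filter expresses in a server config.

```python
import re

# Assumed pattern: PDFs, Word documents and spreadsheets. Adjust to your own files.
NOINDEX_PATTERN = re.compile(r"\.(pdf|docx?|xlsx?)$", re.IGNORECASE)


class XRobotsTagMiddleware:
    """Wrap a WSGI app and add X-Robots-Tag to responses for matching paths."""

    def __init__(self, app, pattern=NOINDEX_PATTERN, value="noindex"):
        self.app = app
        self.pattern = pattern
        self.value = value

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")

        def patched_start_response(status, headers, exc_info=None):
            # Only matching paths get the header; every other response is untouched.
            if self.pattern.search(path):
                headers = [*headers, ("X-Robots-Tag", self.value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


def paths_that_match(paths, pattern=NOINDEX_PATTERN):
    """Dry run: list which of the given paths the rule would mark as noindex."""
    return [p for p in paths if pattern.search(p)]


if __name__ == "__main__":
    sample = ["/files/manual.pdf", "/blog/post.html", "/pdf-guide/", "/data/report.XLSX"]
    print(paths_that_match(sample))
```

Running the dry-run helper against a list of real URLs before deploying is one way to catch a filter that is broader than intended.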

Use cases: PDFs, images and non-HTML files

The X-Robots-Tag shines with files that are not HTML pages, because they have no head section where a meta tag could go. Typical situations:

  • Internal PDFs: catalogs, manuals and materials you do not want competing in the results with the site's pages.
  • Sensitive images: files that should stay accessible to whoever has the link, but out of image search.
  • Spreadsheets and documents: exports and reports that generate duplicate or low-value content for search.
  • Download areas: entire folders that make more sense out of the index.

These directives are well established. According to the HTTP Archive's Web Almanac, in 2024 the noindex directive was present on around 4.7% of desktop pages and 3.9% of mobile pages, a sign that indexing control, whether by meta tag or by header, is common practice on the web. For images you do want indexed, alt text remains the way to describe them.

Figure: A PDF and an image with the X-Robots-Tag noindex header, kept out of the search results page.

How to check and avoid errors with the X-Robots-Tag

Since the header does not appear in the visible HTML, it is easy to apply it by mistake or forget to remove it. A routine to check that everything is right:

  • Inspect the HTTP headers: browser developer tools or header checkers show whether the X-Robots-Tag is present and with which value.
  • Make sure the URL is crawlable: confirm it is not blocked in robots.txt, otherwise the search engine will not read the directive.
  • Use URL inspection: the Search Console URL inspection tool indicates whether Google sees the page as excluded by a noindex directive.
  • Check the status code: confirm the URL returns the HTTP status you expect alongside the header, so the status and the directive do not send contradictory signals to the robot.
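
The first and last checks in this routine can be scripted. The following is a minimal sketch using only Python's standard library: it sends a HEAD request and returns the status code together with every X-Robots-Tag value in the response. The function names x_robots_values and fetch_x_robots are hypothetical, and the URL in the usage lines is a placeholder to replace with your own.

```python
from __future__ import annotations

from email.message import Message
from urllib.request import Request, urlopen


def x_robots_values(headers: Message) -> list[str]:
    """Return every X-Robots-Tag value in a header mapping.

    The header may appear more than once, so all occurrences are collected.
    Header names are matched case-insensitively.
    """
    return [value.strip() for value in (headers.get_all("X-Robots-Tag") or [])]


def fetch_x_robots(url: str, timeout: float = 10.0) -> tuple[int, list[str]]:
    """Send a HEAD request and return (status code, X-Robots-Tag values).

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses,
    so callers that want to inspect those should catch that exception.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.status, x_robots_values(response.headers)


if __name__ == "__main__":
    # Placeholder URL for illustration; replace it with a file on your own site.
    status, values = fetch_x_robots("https://example.com/files/manual.pdf")
    print(status, values or "no X-Robots-Tag header")
```

Running this against a handful of representative URLs after each infrastructure change gives a quick view of where the header is and is not being sent.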

The most expensive X-Robots-Tag mistake is the silent one: a forgotten noindex in a broad server rule can deindex entire sections without anyone noticing right away. Reviewing the headers after any infrastructure change is the best form of prevention.

Frequently asked questions

What is the X-Robots-Tag header?

The X-Robots-Tag is an HTTP header sent by the server with indexing directives for search engines, such as noindex and nofollow. It plays the same role as the meta robots tag, but works for any type of file, including PDFs and images, which have no HTML to receive a meta tag.

What is the difference between the X-Robots-Tag and the meta robots tag?

The two deliver the same indexing directives, but in different places. The meta robots tag sits inside the page's HTML, in the head. The X-Robots-Tag sits in the server's HTTP response. For HTML pages, the meta tag is simpler; for non-HTML files, the header is the only option.

What values does the X-Robots-Tag accept?

The X-Robots-Tag accepts the same directives as the meta robots tag, such as noindex (do not index), nofollow (do not follow links), none (both together), noarchive (no cache), nosnippet (no snippet) and unavailable_after (stop indexing after a date). You can combine several, separated by a comma.

How do I set the X-Robots-Tag on the server?

The configuration is done on the web server. On Apache, you use a Header set X-Robots-Tag rule, usually in the .htaccess. On Nginx, you use add_header X-Robots-Tag inside the server block or, to target specific URLs, inside a location block. On WordPress, SEO plugins can apply the header to content types and to media files.

How do I check whether the X-Robots-Tag is applied on a page?

You check the X-Robots-Tag by inspecting the HTTP headers of the response, either with the browser developer tools or an online header checker. The Search Console URL inspection also shows whether Google considers the page excluded because of a noindex directive.

Related concepts

  • Noindex: Noindex is a directive that tells search engines not to include a page in the search results. It is applied through a robots meta tag in the HTML or through an HTTP header, and it makes Google drop the page from the index even when other sites link to it. Unlike robots.txt, which blocks crawling, noindex requires the page to stay crawlable so the search engine can read the instruction.
  • Robots.txt: Robots.txt is a plain text file, saved in the root of a domain, that tells search engine crawlers which parts of a site they can or cannot crawl. It follows the Robots Exclusion Protocol and controls crawling, not indexing, so it is not the right tool to hide a page from search results.
  • Meta tags: Meta tags are snippets of code placed in the HTML head of a page that pass information about it to search engines and social networks, without appearing in the visible body of the text. They describe the title, summary, indexing directives, language and how the link should be displayed when shared. Some influence SEO indirectly, others control whether and how the page appears in results.
  • Nofollow: Nofollow is a link attribute, written as rel="nofollow" in the HTML code, that signals to the search engine not to transfer authority to the destination page. The link stays clickable and takes the user there normally, but it does not count as an SEO vote. It is used for paid links, user generated content and sources you do not want to endorse, helping keep a natural backlink profile within Google's guidelines.