Crawl budget: what it is and how to optimize for SEO

By Tiago Costa · Updated on July 2, 2026

Definition

Crawl budget is the number of pages a search engine is willing to crawl on a site in a period. It depends on two factors:

  • the crawl rate limit, tied to the speed and health of the server;
  • the crawl demand, tied to the popularity and update frequency of the content.

What crawl budget is

Crawl budget is the number of pages a search engine like Google is willing to crawl on a site within a given period. It is the practical limit on the attention that the crawler (the search engine's robot) dedicates to your domain before moving on to other sites.

This budget is not a fixed number published by Google. It arises from the combination of two forces: how much your server can be visited without slowing down and how much the search engine believes your content deserves to be revisited. On sites with few pages, the robot usually handles everything with room to spare. On sites with tens of thousands of URLs, each crawl becomes a contested resource.

It helps to separate two processes that are often confused: crawling is the robot visiting the page and reading the code, while indexing is deciding to keep that page in the index so it can rank. Crawl budget concerns the first step, which is the gateway to indexing.

How Google sets the crawl budget: crawl rate and crawl demand

Google describes crawl budget as the meeting of two components. Knowing what each one means helps you understand where you can act:

| Component | What it is and what it influences |
|---|---|
| Crawl rate limit | The maximum number of requests the robot makes without overloading the server. A fast, stable site gets the robot more often; a slow server full of errors makes Google ease off. |
| Crawl demand | How much Google wants to visit the site. Popular pages, with links pointing to them, and content that changes frequently spark more demand than idle, ignored pages. |

In practice, crawl budget is the smaller of the two: a powerful server is useless if Google sees no reason to return, and desirable content is useless if the server crashes on every visit. Improving crawling means taking care of both sides at once.

Figure: How crawl budget is formed (crawl rate and crawl demand) and the main sources of waste: duplicates, redirects, parameters and errors.

When you need to worry about crawl budget

For most sites, crawl budget is not a bottleneck. The robot crawls everything it needs and still has room left. The math changes when the site grows and starts generating many URLs. Pay attention if your case is one of these:

  • large sites, with more than ten thousand relevant pages;
  • stores and catalogs that create many URLs from filters and URL parameters;
  • news portals and sites that publish or update content all the time;
  • domains with many low quality pages competing for the robot's attention.

When crawling becomes a bottleneck, the effect is silent: new pages take a while to show up on Google and updated pages keep displaying the old version. According to Botify, on large, unoptimized sites Google crawls, on average, only around 40% of strategic URLs each month, leaving much of the important content without a robot visit.

What wastes your crawl budget

A good share of crawl budget is lost on pages that should not consume the robot's time. The most common drains are:

  • Duplicate content: several URLs with the same content make the robot crawl the same thing again. A well defined canonical URL concentrates the effort on the right version.
  • Redirect chains: a redirect chain forces the robot to take several hops to reach the destination, spending several requests on a single page.
  • Infinite parameter URLs: color, size and sorting filters multiply nearly identical addresses.
  • Error pages and soft 404s: broken pages, and soft 404s that return a success status for a page with no real content, consume visits without delivering anything useful.
  • Thin content: many thin pages dilute the robot's attention across addresses nobody searches for.

Each of these wasted visits is one fewer visit for the pages that really matter. Cutting the excess is the first step for the robot to spend the budget where it is worth it.
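
To get a feel for how much filter parameters inflate a URL set, the sketch below groups crawled URLs by a normalized form. It is only an illustration: the `IGNORED_PARAMS` set, the `shop.example` URLs and the function names are assumptions, and which parameters are safe to ignore depends on the site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that only filter or sort a listing.
# Which parameters can be ignored is a per-site decision.
IGNORED_PARAMS = {"color", "size", "sort"}


def canonical_form(url: str) -> str:
    """Drop ignored parameters and the fragment, and sort what remains."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_variants(urls):
    """Map each normalized URL to the crawled URLs that collapse into it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_form(url)].append(url)
    return dict(groups)


# Made-up sample of crawled URLs.
crawled = [
    "https://shop.example/shoes?color=red&size=42",
    "https://shop.example/shoes?sort=price",
    "https://shop.example/shoes",
    "https://shop.example/shoes?page=2&color=blue",
]

for canonical, variants in group_variants(crawled).items():
    print(canonical, len(variants))
# https://shop.example/shoes 3
# https://shop.example/shoes?page=2 1
```

Groups with many variants are candidates for a canonical URL or for a crawl block on the filter parameters.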

How to optimize crawl budget in practice

Optimizing crawling is, above all, guiding the robot to the right content and keeping it away from the junk. A good playbook:

  • Block low-value areas in robots.txt: use robots.txt to prevent crawling of areas with no search value, such as carts, filters and internal system pages.
  • Keep a clean sitemap: the XML sitemap should list only the URLs you want indexed, with no redirects or blocked pages.
  • Fix redirects and errors: shorten redirect chains and remove internal links that point to broken pages.
  • Speed up the server: a fast site raises the crawl rate limit; speed is part of technical SEO.
  • Strengthen internal links: good internal linking signals to the robot which pages are priorities and helps distribute crawling.
  • Consolidate weak content: merge or remove thin pages to concentrate the budget on what drives results.
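
For the redirect item in the list above, one possible approach is to export the redirect rules and compute where each one finally lands, so every rule (and every internal link) can point straight at the destination. The sketch below assumes the rules are available as a simple source-to-target mapping; the paths and function names are hypothetical.

```python
def resolve_chain(start, redirects, max_hops=10):
    """Follow `redirects` (source -> target) starting at `start`.

    Returns (final_url, hops). Raises ValueError on a loop or when the
    chain is longer than `max_hops`.
    """
    seen = {start}
    current, hops = start, 0
    while current in redirects:
        current = redirects[current]
        hops += 1
        if current in seen:
            raise ValueError(f"redirect loop at {current}")
        if hops > max_hops:
            raise ValueError(f"more than {max_hops} hops from {start}")
        seen.add(current)
    return current, hops


def flatten(redirects):
    """Rewrite every rule so that it points directly at the final URL."""
    return {source: resolve_chain(source, redirects)[0] for source in redirects}


# Hypothetical redirect rules exported from a server configuration.
rules = {
    "/old-page": "/2023/old-page",
    "/2023/old-page": "/blog/old-page",
    "/blog/old-page": "/blog/new-page",
    "/promo": "/offers",
}

print(resolve_chain("/old-page", rules))  # ('/blog/new-page', 3)
print(flatten(rules)["/2023/old-page"])   # /blog/new-page
```

Any rule with more than one hop is a chain worth shortening; after flattening, each old URL reaches its destination in a single redirect.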

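An important note: blocking a page in robots.txt saves crawling, but it does not remove the page from the index, because a blocked URL can still be indexed if other pages link to it. To remove a page from the results, the path is noindex, not blocking the crawl, and the page has to stay crawlable so that Google can read the directive.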
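
As an illustration of the robots.txt side, the sketch below parses a small rule set with Python's standard `urllib.robotparser` and checks which URLs a crawler would be allowed to fetch. The rules, paths and the `is_crawlable` helper are placeholders chosen for the example, not a recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for a store; the paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the parsed rules allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://shop.example/cart/checkout"))     # False
print(is_crawlable("https://shop.example/search?q=boots"))    # False
print(is_crawlable("https://shop.example/blog/crawl-budget")) # True
```

The standard-library parser does plain prefix matching, so rules that rely on wildcards may be evaluated differently than Google evaluates them; such rules are best confirmed with Google's own tooling.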

How to monitor crawling in Google Search Console

You do not need to guess how Google treats your site. Google Search Console offers the Crawl stats report, which shows how many requests the robot made per day, the average server response time and which file types and status codes it found.

It pays to watch a few signals: spikes of server errors, too much time spent on irrelevant pages and important pages crawled rarely. To check a specific URL, the URL Inspection tool shows when Google last crawled that page and whether it was indexed. With this data in hand, you can act with precision instead of in the dark.
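
The "important pages crawled rarely" signal can be turned into a simple check once last-crawl dates have been collected by hand, for example from URL inspection. The sketch below is a minimal version under that assumption; the URLs, the dates and the 30-day threshold are invented for the example.

```python
from datetime import date, timedelta


def stale_priority_urls(last_crawled, priority_urls, today, max_age_days=30):
    """Return priority URLs never crawled or not crawled within the window."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        url
        for url in priority_urls
        if last_crawled.get(url) is None or last_crawled[url] < cutoff
    ]


def coverage(last_crawled, priority_urls, today, max_age_days=30):
    """Share of priority URLs crawled within the window (0.0 to 1.0)."""
    if not priority_urls:
        return 0.0
    stale = stale_priority_urls(last_crawled, priority_urls, today, max_age_days)
    return 1 - len(stale) / len(priority_urls)


# Hypothetical data: last crawl date per URL, collected manually.
priority = ["/pricing", "/blog/crawl-budget", "/features", "/blog/new-post"]
last_crawled = {
    "/pricing": date(2026, 6, 28),
    "/blog/crawl-budget": date(2026, 4, 2),
    "/features": date(2026, 6, 10),
}
today = date(2026, 7, 2)

print(stale_priority_urls(last_crawled, priority, today))
# ['/blog/crawl-budget', '/blog/new-post']
print(coverage(last_crawled, priority, today))  # 0.5
```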

Frequently asked questions

What is crawl budget in SEO?

Crawl budget is the number of pages Google is willing to crawl on your site in a period. It combines your server's capacity to handle the robot's visits with the search engine's interest in revisiting that content. The better it is managed, the faster your important pages are discovered and updated in the index.

When should I worry about crawl budget?

Crawl budget becomes a priority on large sites, with many thousands of pages, on stores that generate many URLs from filters and on portals that publish or update content all the time. For a small or medium blog, Google usually crawls everything without difficulty, so the focus should be on the quality and structure of the content.

How do I check my site's crawl budget?

The main path is the Crawl stats report in Google Search Console, which shows how many requests the robot makes per day, the server response time and the status codes found. Analyzing the server log files offers an even more detailed view of which pages the robot actually visits.
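
As a rough idea of what log analysis can look like, the sketch below counts requests whose user agent mentions Googlebot, by status code and by path. It assumes the Combined Log Format that many Apache and Nginx setups write; the sample lines are made up, and the regular expression may need adjusting to your server's format.

```python
import re
from collections import Counter

# Assumed layout: Combined Log Format.
LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests with a Googlebot user agent by status and by path."""
    by_status, by_path = Counter(), Counter()
    for line in lines:
        match = LINE.match(line)
        if not match or "Googlebot" not in match["agent"]:
            continue
        by_status[match["status"]] += 1
        by_path[match["path"]] += 1
    return by_status, by_path


# Made-up sample lines for illustration.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (X11; Linux x86_64)"
sample = [
    f'66.249.66.1 - - [02/Jul/2026:10:00:00 +0000] "GET /blog/crawl-budget HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [02/Jul/2026:10:00:05 +0000] "GET /old-page HTTP/1.1" 404 310 "-" "{GOOGLEBOT}"',
    f'203.0.113.7 - - [02/Jul/2026:10:00:09 +0000] "GET / HTTP/1.1" 200 2048 "-" "{BROWSER}"',
    f'66.249.66.1 - - [02/Jul/2026:10:00:12 +0000] "GET /shoes?color=red HTTP/1.1" 200 4096 "-" "{GOOGLEBOT}"',
]

by_status, by_path = googlebot_hits(sample)
print(by_status.most_common())  # [('200', 2), ('404', 1)]
print(sorted(by_path))          # ['/blog/crawl-budget', '/old-page', '/shoes?color=red']
```

The user agent string can be spoofed, so a stricter analysis would also confirm that the requesting IP really belongs to Google, for example through a reverse DNS lookup.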

Do noindex pages use crawl budget?

Yes. To know that a page has the noindex directive, Google needs to crawl that page first, so it still consumes crawling. If the goal is to save budget on areas with no value, blocking in robots.txt avoids the visit; noindex serves to remove the page from the results, not to save crawling.
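
The point that a page must be fetched before noindex can be seen is visible in code: the directive lives in the page's HTML or in its response headers. The sketch below looks for it in both places with Python's standard `html.parser`; the class and function names are hypothetical, and the header handling is simplified (exact `X-Robots-Tag` key, no per-crawler prefixes).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in {"robots", "googlebot"}:
            for part in attributes.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html, headers=None):
    """Return True if the HTML or the X-Robots-Tag header carries noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header = (headers or {}).get("X-Robots-Tag", "")
    header_directives = {part.strip().lower() for part in header.split(",")}
    return "noindex" in parser.directives or "noindex" in header_directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))                                 # True
print(has_noindex("<html><body>Hello</body></html>"))    # False
print(has_noindex("", {"X-Robots-Tag": "noindex"}))      # True
```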

Is crawl budget a ranking factor?

Not directly. Crawl budget is not a score that improves a page's position. It affects how quickly Google discovers, crawls and updates your content. Good crawling is a prerequisite for ranking well, but what decides the position is the relevance, quality and authority of the content.

Related concepts

  • Crawler: A crawler is a robot program that travels the web from link to link, downloading and reading pages to feed a search engine's index. Also called a spider, robot or bot, the best known example is Googlebot. The crawler is the first stage of search: before a page can be indexed and ranked, it has to be found and read by one of these crawlers.
  • Indexing: Indexing is the process by which a search engine adds a page to its index, the huge database it consults to answer queries. After crawling and analyzing the content, Google decides whether to store the page in the index, and only what is indexed can appear in the results. In SEO, ensuring indexing is the mandatory step before any attempt to rank: a page outside the index is, in practice, invisible to searchers.
  • Robots.txt: Robots.txt is a plain text file, saved in the root of a domain, that tells search engine crawlers which parts of a site they can or cannot crawl. It follows the Robots Exclusion Protocol and controls crawling, not indexing, so it is not the right tool to hide a page from search results.
  • Technical SEO: Technical SEO is the set of optimizations made to a site's infrastructure so that search engines can crawl, understand, index and display its pages efficiently. While content takes care of what the page says, technical SEO takes care of the invisible foundation that supports everything: loading speed, URL structure, internal link architecture, mobile version, security, structured data, indexing and status codes. Without that foundation in order, even the best content may never appear in search.