Google Crawler (User Agent) Overview | Google Search Central  |  Documentation  |  Google for Developers (2024)

Google uses crawlers and fetchers to perform actions for its products, either automatically or triggered by user request.

"Crawler" (sometimes also called a "robot" or "spider") is a generic term for any program that is used to automatically discover and scan websites by following links from one web page to another. Google's main crawler used for Google Search is called Googlebot.

Fetchers act like a browser: they are tools that request a single URL when prompted by a user.

The following tables show the Google crawlers and fetchers used by various products and services, how they may appear in your referrer logs, and how to specify them in robots.txt. The lists are not exhaustive; they cover only the most common requestors that may show up in log files.

  • The user agent token is used in the User-agent: line in robots.txt to match a crawler type when writing crawl rules for your site. Some crawlers have more than one token, as shown in the table; you need to match only one crawler token for a rule to apply. This list is not complete, but covers most crawlers you might see on your website.
  • The full user agent string is a full description of the crawler, and appears in the HTTP request and your web logs.

Common crawlers

Google's common crawlers are used to find information for building Google's search indexes, to perform other product-specific crawls, and for analysis. They always obey robots.txt rules and generally crawl from the IP ranges published in the googlebot.json object.
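To verify that a request claiming to be a common crawler came from a published range, you can check its IP against the prefixes in the googlebot.json object. The sketch below is illustrative only: it hard-codes two sample prefixes in the shape of the published file rather than fetching the live object, and the helper name is our own.

```python
import ipaddress

# Illustrative sample in the shape of the published googlebot.json object.
# For real verification, fetch the live googlebot.json file instead.
GOOGLEBOT_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "66.249.64.0/27"},
        {"ipv6Prefix": "2001:4860:4801:10::/64"},
    ]
}

def is_in_googlebot_ranges(ip: str, ranges: dict = GOOGLEBOT_RANGES) -> bool:
    """Return True if `ip` falls inside any listed prefix."""
    addr = ipaddress.ip_address(ip)
    for entry in ranges["prefixes"]:
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        # Membership test is False when the IP versions differ,
        # so mixed v4/v6 prefixes are handled safely.
        if addr in ipaddress.ip_network(prefix):
            return True
    return False
```

The same approach works for the special-crawlers.json and user-triggered-fetchers.json objects mentioned later, since they share the same structure.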

Common Crawlers

Googlebot Smartphone

User agent token Googlebot
Full user agent string Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Googlebot Desktop

User agent token Googlebot
Full user agent strings
  • Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
  • Rarely:
    • Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
    • Googlebot/2.1 (+http://www.google.com/bot.html)

Googlebot Image

Used for crawling image URLs for Google Images and products dependent on images.

User agent tokens
  • Googlebot-Image
  • Googlebot
Full user agent string Googlebot-Image/1.0

Googlebot News

Googlebot News uses Googlebot for crawling news articles; however, it respects its historical user agent token, Googlebot-News.

User agent tokens
  • Googlebot-News
  • Googlebot
Full user agent string The Googlebot-News user agent uses the various Googlebot user agent strings.

Googlebot Video

Used for crawling video URLs for Google Video and products dependent on videos.

User agent tokens
  • Googlebot-Video
  • Googlebot
Full user agent string Googlebot-Video/1.0

Google StoreBot

Google StoreBot crawls through certain types of pages, including, but not limited to, product details pages, cart pages, and checkout pages.

User agent token Storebot-Google
Full user agent strings
  • Desktop agent:
    Mozilla/5.0 (X11; Linux x86_64; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Safari/537.36
  • Mobile agent:
    Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36

Google-InspectionTool

Google-InspectionTool is the crawler used by Search testing tools such as the Rich Result Test and URL inspection in Search Console. Apart from the user agent and user agent token, it mimics Googlebot.

User agent tokens
  • Google-InspectionTool
  • Googlebot
Full user agent string
  • Mobile
    Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0;)
  • Desktop
    Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)

GoogleOther

GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.

User agent token GoogleOther
Full user agent string
  • Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)
  • Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
  • GoogleOther

GoogleOther-Image

GoogleOther-Image is the version of GoogleOther optimized for fetching publicly accessible image URLs.

User agent tokens
  • GoogleOther-Image
  • GoogleOther
Full user agent string GoogleOther-Image/1.0

GoogleOther-Video

GoogleOther-Video is the version of GoogleOther optimized for fetching publicly accessible video URLs.

User agent tokens
  • GoogleOther-Video
  • GoogleOther
Full user agent string GoogleOther-Video/1.0

Google-CloudVertexBot

Google-CloudVertexBot crawls sites on the site owners' request when building Vertex AI Agents.

User agent tokens
  • Google-CloudVertexBot
  • Googlebot
User agent substring Google-CloudVertexBot

Google-Extended

Google-Extended is a standalone product token that web publishers can use to manage whether their sites help improve Gemini Apps and Vertex AI generative APIs, including future generations of models that power those products. Google-Extended does not impact a site's inclusion or ranking in Google Search.

User agent token Google-Extended
Full user agent string Google-Extended doesn't have a separate HTTP request user agent string. Crawling is done with existing Google user agent strings; the robots.txt user-agent token is used in a control capacity.
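For example, a site that wants to remain in Google Search but opt out of the uses controlled by Google-Extended could add a group like this to its robots.txt:

User-agent: Google-Extended
Disallow: /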

Special-case crawlers

The special-case crawlers are used by specific products where there's an agreement between the crawled site and the product about the crawl process. For example, AdsBot ignores the global robots.txt user agent (*) with the ad publisher's permission. Because the special-case crawlers may ignore robots.txt rules, they operate from a different IP range than the common crawlers. The IP ranges are published in the special-crawlers.json object.

Special-case crawlers

APIs-Google

Used by Google APIs to deliver push notification messages. Ignores the global user agent (*) in robots.txt.

User agent token APIs-Google
Full user agent string APIs-Google (+https://developers.google.com/webmasters/APIs-Google.html)

AdsBot Mobile Web

Checks mobile web page ad quality. Ignores the global user agent (*) in robots.txt.

User agent token AdsBot-Google-Mobile
Full user agent string Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)

AdsBot

Checks desktop web page ad quality. Ignores the global user agent (*) in robots.txt.

User agent token AdsBot-Google
Full user agent string AdsBot-Google (+http://www.google.com/adsbot.html)

AdSense

The AdSense crawler visits your site to determine its content in order to provide relevant ads. Ignores the global user agent (*) in robots.txt.

User agent token Mediapartners-Google
Full user agent string Mediapartners-Google

Mobile AdSense

The Mobile AdSense crawler visits your site to determine its content in order to provide relevant ads. Ignores the global user agent (*) in robots.txt.

User agent token Mediapartners-Google
Full user agent string (Various mobile device types) (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html)

Google-Safety

The Google-Safety user agent handles abuse-specific crawling, such as malware discovery for publicly posted links on Google properties. This user agent ignores robots.txt rules.

Full user agent string Google-Safety

User-triggered fetchers

User-triggered fetchers are initiated by users to perform a product specific fetching function. For example, Google Site Verifier acts on a user's request, or a site hosted on Google Cloud (GCP) has a feature that allows the site's users to retrieve an external RSS feed. Because the fetch was requested by a user, these fetchers generally ignore robots.txt rules. The IP ranges the user-triggered fetchers use are published in the user-triggered-fetchers.json and user-triggered-fetchers-google.json objects.

User-triggered fetchers

Feedfetcher

Feedfetcher is used for crawling RSS or Atom feeds for Google Podcasts, Google News, and PubSubHubbub.

User agent token FeedFetcher-Google
Full user agent string FeedFetcher-Google; (+http://www.google.com/feedfetcher.html)

Google Publisher Center

Fetches and processes feeds that publishers explicitly supplied through the Google Publisher Center to be used in Google News landing pages.

Full user agent string GoogleProducer; (+https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers#googleproducer)

Google Read Aloud

Upon user request, Google Read Aloud fetches and reads out web pages using text-to-speech (TTS).

Full user agent strings

Current agents:

  • Desktop agent:
    Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36 (compatible; Google-Read-Aloud; +https://support.google.com/webmasters/answer/1061943)
  • Mobile agent:
    Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36 (compatible; Google-Read-Aloud; +https://support.google.com/webmasters/answer/1061943)

Former agent (deprecated):

google-speakr

Google Site Verifier

Upon user request, Google Site Verifier fetches Search Console verification tokens.

Full user agent string Mozilla/5.0 (compatible; Google-Site-Verification/1.0)

A note about Chrome/W.X.Y.Z in user agents

Wherever you see the string Chrome/W.X.Y.Z in the user agent strings in the table, W.X.Y.Z is a placeholder for the version of the Chrome browser used by that user agent: for example, 41.0.2272.96. The version number increases over time to match the latest Chromium release version used by Googlebot.

If you are searching your logs or filtering your server for a user agent with this pattern, use wildcards for the version number rather than specifying an exact version number.
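As a sketch of that advice, the pattern below matches the Googlebot smartphone user agent for any Chrome version by wildcarding the W.X.Y.Z digits. The regex is our own illustrative example, not an official pattern.

```python
import re

# Match the Googlebot smartphone user agent regardless of the Chrome
# version it reports (the W.X.Y.Z placeholder in the documentation).
GOOGLEBOT_PATTERN = re.compile(
    r"Chrome/\d+\.\d+\.\d+\.\d+ Mobile Safari/537\.36 "
    r"\(compatible; Googlebot/2\.1; \+http://www\.google\.com/bot\.html\)"
)

ua = ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.6422.60 "
      "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
      "+http://www.google.com/bot.html)")

print(bool(GOOGLEBOT_PATTERN.search(ua)))  # True for any Chrome version
```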

User agents in robots.txt

Where several user agents are recognized in the robots.txt file, Google will follow the most specific. If you want all of Google to be able to crawl your pages, you don't need a robots.txt file at all. If you want to block or allow all of Google's crawlers from accessing some of your content, you can do this by specifying Googlebot as the user agent. For example, if you want all your pages to appear in Google Search, and if you want AdSense ads to appear on your pages, you don't need a robots.txt file. Similarly, if you want to block some pages from Google altogether, blocking the Googlebot user agent will also block all Google's other user agents.

But if you want more fine-grained control, you can get more specific. For example, you might want all your pages to appear in Google Search, but you don't want images in your personal directory to be crawled. In this case, use robots.txt to disallow the Googlebot-Image user agent from crawling the files in your personal directory (while allowing Googlebot to crawl all files), like this:

User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal

To take another example, say that you want ads on all your pages, but you don't want those pages to appear in Google Search. Here, you'd block Googlebot, but allow the Mediapartners-Google user agent, like this:

User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
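The matching rule behind these examples (a crawler obeys the group for its most specific matching token, falling back to the * group) can be sketched as follows. The function and group layout are illustrative, not Google's implementation; the groups mirror the first example above, where Googlebot may crawl everything but Googlebot-Image must stay out of /personal.

```python
def select_group(crawler_tokens, robots_groups):
    """Pick the robots.txt group a crawler obeys: the first (most
    specific) of its tokens that has a group, else the * group."""
    for token in crawler_tokens:  # tokens ordered most- to least-specific
        if token.lower() in robots_groups:
            return robots_groups[token.lower()]
    return robots_groups.get("*", [])

# Disallow rules per user agent token, keyed lowercase.
groups = {
    "googlebot": [],                   # Disallow: (nothing blocked)
    "googlebot-image": ["/personal"],  # Disallow: /personal
}

# Googlebot Image carries both tokens; the more specific one wins.
print(select_group(["Googlebot-Image", "Googlebot"], groups))  # ['/personal']
print(select_group(["Googlebot"], groups))                     # []
```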

Controlling crawl speed

Each Google crawler accesses sites for a specific purpose and at different rates. Google uses algorithms to determine the optimal crawl rate for each site. If a Google crawler is crawling your site too often, you can reduce the crawl rate.

Retired Google crawlers

The following Google crawlers are no longer in use, and are only noted here for historical reference.

Retired Google crawlers

Duplex on the web

Supported the Duplex on the web service.

User agent token DuplexWeb-Google
Full user agent string Mozilla/5.0 (Linux; Android 11; Pixel 2; DuplexWeb-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Mobile Safari/537.36

Web Light

Checked for the presence of the no-transform header whenever a user clicked your page in search under appropriate conditions. The Web Light user agent was used only for explicit browse requests of a human visitor, and so it ignored robots.txt rules, which are used to block automated crawling requests.

User agent token googleweblight
Full user agent string Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) Chrome/38.0.1025.166 Mobile Safari/535.19

AdsBot Mobile Web

Checked iPhone web page ad quality. Ignored the global user agent (*) in robots.txt.

User agent token AdsBot-Google-Mobile
Full user agent string Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)

Mobile Apps Android

Checked Android app page ad quality. Obeyed AdsBot-Google robots rules, but ignored the global user agent (*) in robots.txt.

User agent token AdsBot-Google-Mobile-Apps
Full user agent string AdsBot-Google-Mobile-Apps

Google Favicon

User agent tokens
  • Googlebot-Image
  • Googlebot
Full user agent string Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon
