Comprehensive List of AI Search Engine Crawlers

This is a comprehensive list of AI-related crawler user agents, including search engine crawlers known to feed AI systems.

Major AI Company Crawlers

OpenAI

Anthropic

Google

Microsoft/Bing

Meta

ByteDance

Other Commercial AI Crawlers

Amazon

Apple

Huawei

Cohere

Perplexity

Research & Data Collection Crawlers

Common Crawl

Data Collection

Additional AI Crawlers

Search & Analysis

User Agent Strings



# OpenAI
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0

# Microsoft/Bing
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0) Chrome/W.X.Y.Z Safari/537.36

# Google
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
User-agent: Googlebot
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: Google-CloudVertexBot

# Anthropic
User-agent: anthropic-ai
User-agent: ClaudeBot
User-agent: Claude-Web

# Meta/Facebook
User-agent: FacebookBot
User-agent: Meta-ExternalAgent
User-agent: Meta-ExternalFetcher

# Others
User-agent: Bytespider
User-agent: CCBot
User-agent: cohere-ai
User-agent: PerplexityBot
User-agent: ImagesiftBot
User-agent: img2dataset
User-agent: omgili
User-agent: omgilibot
User-agent: Diffbot
User-agent: YouBot
User-agent: Applebot-Extended
User-agent: AwarioRssBot
User-agent: AwarioSmartBot
User-agent: DataForSeoBot
User-agent: magpie-crawler
User-agent: peer39_crawler
User-agent: Seekr
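
The User-agent tokens above can be placed in a robots.txt file to ask these crawlers to stay away. What follows is a minimal Python sketch rather than an official tool. The names AI_CRAWLER_TOKENS and build_robots_txt are hypothetical, the token list is a subset copied from the list above, and disallowing the whole site is an assumed policy. Googlebot and bingbot are left out on purpose, because disallowing them also removes a site from ordinary search results.

```python
# Hypothetical subset of the tokens listed above; extend as needed.
AI_CRAWLER_TOKENS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "Google-Extended", "GoogleOther", "Google-CloudVertexBot",
    "anthropic-ai", "ClaudeBot", "Claude-Web",
    "FacebookBot", "Meta-ExternalAgent", "Meta-ExternalFetcher",
    "Bytespider", "CCBot", "cohere-ai", "PerplexityBot",
    "ImagesiftBot", "Applebot-Extended",
]


def build_robots_txt(tokens, disallow_path="/"):
    """Return one robots.txt group that applies a single Disallow rule
    to every user-agent token in `tokens`."""
    # Consecutive User-agent lines followed by rules form one group,
    # so the Disallow line below applies to all listed tokens.
    lines = [f"User-agent: {token}" for token in tokens]
    lines.append(f"Disallow: {disallow_path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(AI_CRAWLER_TOKENS), end="")
```

robots.txt is advisory: a crawler that does not honor it will not be stopped by these rules, so server-side filtering may still be needed.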

Most active AI-specific crawlers based on website access share:

Crawler         Website Access Share
Bytespider      40.40%
GPTBot          35.46%
ClaudeBot       11.17%
ImagesiftBot    8.75%
CCBot           2.14%
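
To see how these crawlers show up on your own site, you could tally User-Agent headers taken from your access logs. The sketch below is only an illustration: match_crawler and request_share are hypothetical names, the input is assumed to be a plain list of User-Agent strings that you have already extracted, and the result is a share of matched requests on a single site, which may not be the same metric as the figures above.

```python
from collections import Counter

# The five crawler tokens from the list above.
TOKENS = ["Bytespider", "GPTBot", "ClaudeBot", "ImagesiftBot", "CCBot"]


def match_crawler(user_agent, tokens=TOKENS):
    """Return the first token found in the User-Agent string
    (case-insensitive substring match), or None if none match."""
    ua = user_agent.lower()
    for token in tokens:
        if token.lower() in ua:
            return token
    return None


def request_share(user_agents, tokens=TOKENS):
    """Return {token: percentage of matched requests} for the given
    User-Agent strings. Unmatched requests are ignored."""
    counts = Counter()
    for ua in user_agents:
        token = match_crawler(ua, tokens)
        if token is not None:
            counts[token] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {token: 100.0 * n / total for token, n in counts.items()}
```

User-Agent headers are easy to spoof, so counts built this way are an estimate rather than proof of which crawler made a request.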