
AI Crawler Access Checker
How to use robots.txt to manage AI bots
Are AI crawlers blocked from accessing your site? Find out, instantly.
As AI-powered tools increasingly interact with online content, it’s important to understand how your website is being accessed. Our AI Crawler Access Checker allows you to see whether popular AI crawlers such as GPTBot, ClaudeBot and PerplexityBot can reach your website’s pages. Use this tool to gain visibility into your site’s AI accessibility, identify any restrictions and make informed decisions about which bots you want to allow or block. Unsure about this? We have an article exploring the conundrum of whether you should block AI bots from crawling your site.
Check your website’s robots.txt for AI crawler access
Enter your domain below (no need to include http:// or https://)
What is robots.txt?
robots.txt is a simple text file placed at the root of your website that tells crawlers which parts of your site they can or can’t access. Originally designed for search engines, it’s now also used to manage AI bots like GPTBot or ClaudeBot. By setting rules, you can allow or block specific AI crawlers from reading your content, giving you more control over how your site is used in AI training or AI-powered tools.
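As a rough illustration of the format (the paths, bot name and domain below are placeholders), rules are grouped under User-agent lines:

# Rules for all crawlers: keep them out of an example /private/ area
User-agent: *
Disallow: /private/

# Rules for one named AI crawler: block it from everything
User-agent: GPTBot
Disallow: /

# Optionally point crawlers at your sitemap
Sitemap: https://www.example.com/sitemap.xml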
How do you block AI bots using robots.txt?
You can block AI crawlers by adding specific rules to your robots.txt file. For example:
# Block OpenAI's GPTBot from the entire site
User-agent: GPTBot
Disallow: /

# Block Anthropic's ClaudeBot from the entire site
User-agent: ClaudeBot
Disallow: /
This tells those bots not to crawl any part of your site. If you later want to allow them back in, simply remove these lines from your robots.txt file.
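You can also grant access explicitly with an Allow rule, rather than relying on the absence of a Disallow, for example:

# Explicitly permit OpenAI's GPTBot to crawl the whole site
User-agent: GPTBot
Allow: /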
Should you block AI crawlers?
That depends. Blocking AI crawlers can limit how your content is used in training models or other tools, while allowing them may increase your visibility in AI-driven platforms.
Unfortunately, there is no universal right or wrong answer – it comes down to how you want your content to be discovered and used.
Read more on our blog:

To Block or Bot to Block… Should You Block AI Bots from Crawling Your Website?
As a wise playwright once said, to block or bot to block, that is the question. If Shakespeare were alive today, would he have been proud that Google’s AI chatbot began life named after a term by which he is commonly known? Sadly, we shall never know.
Similar MRS Digital resources:

Meta Length Checker
Our famous Meta Length Checker tool helps you quickly evaluate the length of your meta descriptions and visualise how your page might appear in search results.

LLMs.txt Validator
Our LLMs.txt Validator tool helps you check that your file is correctly structured, so AI crawlers can read your website as intended, giving your brand the attention it deserves.

CPA Calculator
Our CPA Calculator is designed to show how each metric affects your CPA, giving you the insights needed to optimise performance and hit your goals.
What are common AI User-Agents?
When you use our AI Crawler Access Checker, your results will list the crawlers that are allowed to access your site, identified by their User-Agent strings. The following lists show which bots are AI-powered and which are not:
AI-related User Agents
GPTBot – OpenAI web crawler
ChatGPT-User – OpenAI browsing activity
ClaudeBot – Anthropic AI crawler
Claude-Web – Anthropic web crawler
anthropic-ai – Anthropic agent
cohere-ai – Cohere model training bot
Bytespider – ByteDance AI crawler
Google-Extended – controls use of content in Google AI (Gemini) training
Google-CloudVertexBot – Google Vertex AI
PerplexityBot – Perplexity.ai crawler
Perplexity-User – Perplexity.ai browsing user
OAI-SearchBot – OpenAI search crawling bot
meta-externalagent – Meta AI training data collection
OpenAI – OpenAI agent
Non-AI bots
Amazonbot – Amazon indexing
Applebot-Extended – Apple crawler control, used to opt content out of Apple AI training
FacebookExternalHit – Facebook link preview
CCBot – Common Crawl, supports AI training
Scrapy – open-source scraping framework
TurnitinBot – plagiarism detection
magpie-crawler – Brandwatch social media monitoring
omgili / omgilibot – forums/thread indexing
Twitterbot – Twitter link previews
PetalBot – Huawei Petal Search indexing
YandexAdditional / YandexAdditionalBot – Yandex indexing
AI crawler FAQs
What is an AI crawler?
An AI crawler is an automated bot used by AI tools to access, read and analyse website content. Unlike traditional search engine crawlers, which index pages for search results, AI crawlers may collect content for model training, summarisation or AI-powered search.
How do AI crawlers work?
AI crawlers are automated programs that visit websites to access and process content for AI systems. They follow standard web crawling rules (such as robots.txt and meta tags) to decide which pages to crawl. Once collected, they analyse text, images and structured data to help AI models understand and generate responses from that content.
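As a simplified sketch of that politeness check (the crawler name and URL here are hypothetical, and real crawlers are far more sophisticated), written in Python:

from urllib.parse import urlparse
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleAICrawler"  # hypothetical user-agent string

def polite_fetch(url):
    """Fetch a page only if the site's robots.txt permits this user agent."""
    parts = urlparse(url)
    rules = RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rules.read()  # download and parse the site's robots.txt
    if not rules.can_fetch(USER_AGENT, url):
        return None  # blocked: a well-behaved crawler skips the page
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request) as response:
        return response.read()

page = polite_fetch("https://www.example.com/")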
What is the difference between a bot and a crawler?
A bot is any automated software that performs online tasks, such as chatbots, monitoring tools or spam detection. A crawler (or spider) is a type of bot specifically designed to navigate websites, read content and then index it. Essentially, all crawlers are bots, but not all bots are crawlers.
How can I check if AI crawlers can access my site?
You can use our AI Crawler Access Checker tool to instantly see which AI bots are allowed or blocked by your site’s robots.txt file.
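If you’d rather script a quick check yourself, here is a minimal sketch using Python’s standard library (the domain and the short list of user agents are illustrative; our tool checks a much fuller list):

from urllib.robotparser import RobotFileParser

# A few common AI user agents to test (see the list above for more)
AI_AGENTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"]

def check_ai_access(domain):
    """Return whether each AI user agent may fetch the site's homepage."""
    parser = RobotFileParser()
    parser.set_url(f"https://{domain}/robots.txt")
    parser.read()  # download and parse the robots.txt file
    homepage = f"https://{domain}/"
    return {agent: parser.can_fetch(agent, homepage) for agent in AI_AGENTS}

for agent, allowed in check_ai_access("example.com").items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")

Note that Python’s parser treats a missing robots.txt file as allowing everything, which matches how well-behaved crawlers interpret an absent file.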
Are AI web crawlers legal?
Yes, AI web crawlers are generally legal, provided they comply with web standards, terms of service and privacy regulations. Reputable AI crawlers respect robots.txt rules and site permissions.
How often should I check AI crawler access?
We recommend regularly monitoring AI crawler access, especially if you update your site, publish new content or make changes to your robots.txt file. Frequent checks help ensure your access preferences remain correct.