The robots.txt file acts as a gatekeeper for SEO: before a well-behaved bot crawls your website, it first fetches the robots.txt file to learn which pages it is allowed to crawl and which it is not.
A robots.txt file tells crawlers such as Googlebot which URLs they may access on your website.
Example of a robots.txt File
You can view our robots.txt file at this URL: https://www.geeksforgeeks.org/robots.txt
User-agent: *
Disallow: /wp-admin/
Disallow: /community/
Disallow: /wp-content/plugins/
Disallow: /content-override.php
User-agent: ChatGPT-User
Disallow: /
Components of a robots.txt File
Now let's walk through the code above:
- User-agent specifies which bots the rules that follow apply to.
- * means all bots.
- Disallow tells those bots not to crawl any URL whose path begins with the given value.
For example:
The rule Disallow: /wp-admin/ blocks https://www.geeksforgeeks.org/wp-admin/image.jpg from being crawled, because its path begins with /wp-admin/. It does not block https://www.geeksforgeeks.org/news/wp-admin/image.jpg, because matching runs against the start of the path rather than anywhere inside it (and https://www.geeksforgeeks.org/news stays crawlable as well). To block /wp-admin/ wherever it appears in a path, you would need a wildcard rule such as Disallow: /*/wp-admin/, which Google's crawlers support. The short Python sketch below demonstrates the prefix behavior.
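Here is a minimal sketch using Python's built-in urllib.robotparser to sanity-check such rules; the rule set is parsed from an inline string rather than fetched from the live site, and the image URLs are purely illustrative:

from urllib.robotparser import RobotFileParser

# A minimal rule set, parsed from a string instead of a live fetch
rules = """
User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The path starts with /wp-admin/, so the rule matches and crawling is blocked
print(parser.can_fetch("*", "https://www.geeksforgeeks.org/wp-admin/image.jpg"))       # False
# /wp-admin/ appears later in the path, so the prefix rule does not match
print(parser.can_fetch("*", "https://www.geeksforgeeks.org/news/wp-admin/image.jpg"))  # True

Note that urllib.robotparser implements plain prefix matching only; it does not understand wildcard patterns such as /*/wp-admin/, which Google's own crawlers do support.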
- User-agent: ChatGPT-User together with Disallow: / blocks the ChatGPT-User bot (used by ChatGPT) from crawling the whole website.
User-agent: *
Disallow: /
The above code blocks all web crawlers from visiting any page of the website.
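As a quick check of how different bots are treated under the example file shown earlier, the same urllib.robotparser approach can be reused; the agent names are illustrative, and Googlebot here simply stands for any bot other than ChatGPT-User:

from urllib.robotparser import RobotFileParser

# The example file from above: everyone is blocked from /wp-admin/,
# and ChatGPT-User is blocked from everything
rules = """
User-agent: *
Disallow: /wp-admin/

User-agent: ChatGPT-User
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# ChatGPT-User matches its own group, which disallows the whole site
print(parser.can_fetch("ChatGPT-User", "https://www.geeksforgeeks.org/"))  # False
# Googlebot falls back to the * group, so the homepage is allowed
print(parser.can_fetch("Googlebot", "https://www.geeksforgeeks.org/"))     # True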
Note: If you want a URL deindexed from Google Search quickly, you can submit a removal request through the Removals tool in your Google Search Console (GSC) account.