The robots.txt file acts as a gatekeeper for SEO: before a well-behaved bot crawls your website, it first fetches the robots.txt file to learn which pages it is allowed to crawl and which it is not.
A robots.txt file tells crawlers such as Googlebot which URLs they may access on your website.
Example of a robots.txt File
You can view our robots.txt file at this URL: https://www.geeksforgeeks.org/robots.txt
User-agent: *
Disallow: /wp-admin/
Disallow: /community/
Disallow: /wp-content/plugins/
Disallow: /content-override.php
User-agent: ChatGPT-User
Disallow: /
Components of a robots.txt File
Now let's walk through the code above:
- User-agent specifies which bots the rules that follow apply to.
- * means all bots.
- Disallow tells those bots not to crawl any URL whose path begins with the given value.
For example:
The rule Disallow: /wp-admin/ blocks https://www.geeksforgeeks.org/wp-admin/image.jpg from being crawled, because its path begins with /wp-admin/. It does not block https://www.geeksforgeeks.org/news/wp-admin/image.jpg, because matching runs against the start of the path rather than anywhere inside it (and https://www.geeksforgeeks.org/news stays crawlable as well). To block /wp-admin/ wherever it appears in a path, you would need a wildcard rule such as Disallow: /*/wp-admin/, which Google's crawlers support. The short Python sketch below demonstrates the prefix behavior.
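Here is a minimal sketch using Python's built-in urllib.robotparser to sanity-check such rules; the rule set is parsed from an inline string rather than fetched from the live site, and the image URLs are purely illustrative:

from urllib.robotparser import RobotFileParser

# A minimal rule set, parsed from a string instead of a live fetch
rules = """
User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The path starts with /wp-admin/, so the rule matches and crawling is blocked
print(parser.can_fetch("*", "https://www.geeksforgeeks.org/wp-admin/image.jpg"))       # False
# /wp-admin/ appears later in the path, so the prefix rule does not match
print(parser.can_fetch("*", "https://www.geeksforgeeks.org/news/wp-admin/image.jpg"))  # True

Note that urllib.robotparser implements plain prefix matching only; it does not understand wildcard patterns such as /*/wp-admin/, which Google's own crawlers do support.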
- User-agent: ChatGPT-User together with Disallow: / blocks the ChatGPT-User bot (used by ChatGPT) from crawling the whole website.
User-agent: *
Disallow: /
The above code blocks all web crawlers from visiting any page of the website.
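As a quick check of how different bots are treated under the example file shown earlier, the same urllib.robotparser approach can be reused; the agent names are illustrative, and Googlebot here simply stands for any bot other than ChatGPT-User:

from urllib.robotparser import RobotFileParser

# The example file from above: everyone is blocked from /wp-admin/,
# and ChatGPT-User is blocked from everything
rules = """
User-agent: *
Disallow: /wp-admin/

User-agent: ChatGPT-User
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# ChatGPT-User matches its own group, which disallows the whole site
print(parser.can_fetch("ChatGPT-User", "https://www.geeksforgeeks.org/"))  # False
# Googlebot falls back to the * group, so the homepage is allowed
print(parser.can_fetch("Googlebot", "https://www.geeksforgeeks.org/"))     # True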
Note: If you want a URL deindexed from Google Search quickly, you can submit a removal request through the Removals tool in your Google Search Console (GSC) account.