Pattern should either be empty, start with "/" or "*"
The robots.txt file tells search engines which of your site's pages they can crawl. An invalid robots.txt configuration can cause two types of problems:
- It can keep search engines from crawling public pages, causing your content to show up less often in search results.
- It can cause search engines to crawl pages you may not want shown in search results.
Expand the robots.txt is not valid audit in your report to learn what's wrong with your robots.txt file.
Common errors include:
- No user-agent specified
- Pattern should either be empty, start with "/" or "*"
- Unknown directive
- Invalid sitemap URL
- $ should only be used at the end of the pattern
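As a hypothetical illustration of the two pattern-related errors above (the paths are made up), a file like this would fail validation: the first rule does not start with "/" or "*", and the second uses "$" before the end of the pattern:
User-agent: *
Disallow: private/
Disallow: /*.zip$old
Rewriting each pattern so it starts with "/" or "*" and keeps "$" only at the very end fixes both errors:
User-agent: *
Disallow: /private/
Disallow: /*.zip$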
Keep robots.txt smaller than 500 KiB
Search engines may stop processing robots.txt midway through if the file is larger than 500 KiB. This can confuse the search engine, leading to incorrect crawling of your site.
To keep robots.txt small, focus less on individually excluded pages and more on broader patterns. For example, if you need to block crawling of PDF files, don't disallow each individual file. Instead, disallow all URLs containing .pdf by using disallow: /*.pdf.
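For instance, rather than listing every PDF one by one (the file names below are only illustrative), a single wildcard rule keeps the file short. Instead of:
User-agent: *
Disallow: /reports/2021-report.pdf
Disallow: /reports/2022-report.pdf
Disallow: /reports/2023-report.pdf
use:
User-agent: *
Disallow: /*.pdf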
Invalid sitemap URL
You can link to a sitemap from your robots.txt file. It must be a full (absolute) URL. For example, https://www.example.com/sitemap.xml is an absolute URL.
If you use a relative URL instead, for example:
User-agent: *
Allow: /
Sitemap: /sitemap.xml
this will cause the error. To fix it, change the sitemap reference to an absolute URL:
User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml