WebPagetest Forums
Sitemap contains urls which are blocked by robots.txt - Help needed




Sitemap contains urls which are blocked by robots.txt - Help needed - bkgroup - 02-01-2014 08:18 PM

Hello experts,
I recently added my Android blog to Google Webmaster Tools.

I have submitted my sitemaps (an image sitemap and an XML sitemap), but the image sitemap shows an error. It says:

Issue : Url blocked by robots.txt.

Description : Sitemap contains urls which are blocked by robots.txt.

Issue count : 279

I am attaching a screenshot for your reference.
Blog URL is http://www.theandroidportal.com


RE: Sitemap contains urls which are blocked by robots.txt - Help needed - robzilla - 02-03-2014 02:14 AM

Your current robots.txt file tells Googlebot it's not allowed to access those pages.

Quote:
User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /archives/
disallow: /*?*
Disallow: /page/
Disallow: /comments/feed/
Disallow: /index.php
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: *?wptheme
Disallow: /tag
Disallow: /author
Disallow: /trackback
Disallow: /*trackback
Disallow: /*trackback*
Disallow: /*/trackback
Disallow: /*?*
Disallow: /*.html/$
Disallow: /feed/
Disallow: /xmlrpc.php
Disallow: *?nomobile
Disallow: ?comments=*
Disallow: /search?

A /*?* rule will match /?p=222, for example. This isn't strictly a problem, since you seem to be using static URLs like "/install-external-apps-games-android-smartphones/" rather than "/?p=222", but apparently Google encountered URLs of the "/?p=222"-kind in your sitemap.xml file. Judging from the current look of your sitemap file, however, it looks like you've fixed this by now?
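
To illustrate why the /*?* rule catches "/?p=222" but not the static URLs, here is a minimal Python sketch. It assumes Google-style wildcard semantics ("*" matches any run of characters, a trailing "$" anchors the end of the path) and only looks at Disallow rules; Allow rules and longest-match precedence are ignored. The helper names rule_to_regex and is_blocked and the shortened RULES list are made up for this example.

```python
import re

def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path. Everything else is taken literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path, disallow_rules):
    """Return True if any Disallow rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)

# A few of the rules quoted above (hypothetical subset for the demo).
RULES = ["/*?*", "/page/", "/tag"]

for path in ["/?p=222", "/install-external-apps-games-android-smartphones/"]:
    print(path, "blocked" if is_blocked(path, RULES) else "allowed")
```

Running the paths from your sitemap through a check like this before submitting it should show which entries Webmaster Tools will flag.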


RE: Sitemap contains urls which are blocked by robots.txt - Help needed - bkgroup - 02-03-2014 03:26 PM

Thanks dear robzilla,
It seems your suggestion worked.