Google Webmaster Tools says my XML sitemap "appears to be an HTML page"
We're running a lot of sites and we've started to get a lot of these errors in Webmaster Tools:
Sitemap is HTML
Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead.
One of the problematic sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.same_domain.co.uk/folder/file1.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://www.same_domain.co.uk/folder/file2.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://www.same_domain.co.uk/folder/file3.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://www.same_domain.co.uk/folder/file4.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
</urlset>
Why would GWTs think this is anything but XML?
(Server: IIS)
Edit:
"This document was successfully checked as well-formed XML!" -W3C Validator.
Edit:
I resubmitted two problematic sitemaps: one unchanged, and one with a couple of extra lines to ensure it's treated as XML. I then ran the "Fetch as Googlebot" diagnostic tool, and both are fine now. I'm just going to resubmit all sitemaps that show the "Sitemap is HTML" error.
The question remains:
Why did this happen? Why did GWTs think these XML sitemaps were HTML?
4 Comments
Check whether there are any issues on the web server side, or whether Google's IP addresses are blocked. If you use a log-based tracking system, analyse the Googlebot activity. We recently had the same issue and found that, after Google changed its IP addresses, we were no longer allowing Googlebot to crawl the site because of rules meant to stop data mining. Once we found that, the issue was resolved.
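As a rough illustration of analysing Googlebot activity in the logs, here is a minimal Python sketch. It assumes IIS is writing W3C extended logs with a "#Fields:" header that includes cs-uri-stem, sc-status and cs(User-Agent). The function name googlebot_sitemap_hits and the default path suffix "sitemap.xml" are made up for this example, and matching on the user-agent string alone does not prove a request really came from Google.

```python
def googlebot_sitemap_hits(log_lines, path_suffix="sitemap.xml"):
    """Return (date, time, status) for each Googlebot request to the sitemap.

    Assumes W3C extended log lines (the IIS default), where a "#Fields:"
    directive names the space-separated columns of the rows that follow.
    """
    fields = []
    hits = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Column names can change mid-file, so re-read them every time.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split(" ")))
        agent = row.get("cs(User-Agent)", "").lower()
        path = row.get("cs-uri-stem", "").lower()
        if "googlebot" in agent and path.endswith(path_suffix):
            hits.append((row.get("date"), row.get("time"), row.get("sc-status")))
    return hits
```

Any status other than 200 in the result (a 403 from an IP block, or a 500 from a failing script) would be worth a closer look.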
Farseeker's suggestion is a good first step in troubleshooting (a text/html content-type would certainly produce this result) - Google Webmaster Tools should display a different error message if the sitemap file contains invalid XML.
Given the temporary nature of the issue, have you checked your server logs to determine whether an error page was produced on Google's prior requests?
If you are dynamically generating sitemap files, a scripting error, database timeout, or other issue could produce an HTML error page intermittently.
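To catch an intermittent HTML error page, one option is to check what the sitemap URL actually returned rather than what the file on disk contains. The sketch below is only an illustration: classify_body is a hypothetical helper that takes the raw response bytes and reports whether they parse as a sitemap, look like HTML, or are something else. How the bytes are fetched and how often is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
SITEMAP_ROOTS = {f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex"}

def classify_body(body):
    """Classify raw response bytes as 'sitemap', 'html', 'other-xml' or 'invalid'."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Typical HTML error pages are not well-formed XML, so sniff the start.
        head = body.lstrip()[:200].lower()
        if head.startswith(b"<!doctype html") or b"<html" in head:
            return "html"
        return "invalid"
    if root.tag in SITEMAP_ROOTS:
        return "sitemap"
    # Well-formed XHTML parses fine, so also check the root element name.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    return "html" if local_name == "html" else "other-xml"
```

Running this against responses saved at different times of day could show whether the generator occasionally returns an error page instead of the sitemap.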
You could extend the header to include the schema stuff:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
and then validate online
If it passes that it must be Google's problem.
Most likely because of the Content-Type header the server is sending. Inspect the response with your favourite tool (Firebug, etc.) and see what it returns.
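If you would rather script the header check than use a browser tool, something along these lines might do. It is a sketch under assumptions: the helper names are invented, the User-Agent string is a placeholder, and treating only application/xml and text/xml as acceptable is my assumption, not something Google documents in this thread.

```python
from urllib.request import Request, urlopen

# Assumption: these are the media types a sitemap should be served with.
XML_TYPES = {"application/xml", "text/xml"}

def media_type(content_type_header):
    """Return the bare media type, lowercased, without parameters like charset."""
    return content_type_header.split(";", 1)[0].strip().lower()

def looks_like_xml(content_type_header):
    """True if the Content-Type header names one of the XML media types above."""
    return media_type(content_type_header) in XML_TYPES

def fetch_content_type(url, user_agent="sitemap-header-check"):
    """Send a HEAD request and return the Content-Type header ('' if absent)."""
    request = Request(url, method="HEAD", headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")
```

For example, looks_like_xml(fetch_content_type(url)) on each sitemap URL would flag any that come back as text/html. Note that urlopen raises an exception on 4xx and 5xx responses, which is itself a useful signal here.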