
Google Webmaster Tools says my XML sitemap "appears to be an HTML page"

@Berryessa370

Posted in: #GoogleSearchConsole #Html #Iis #Sitemap #Xml

We're running a lot of sites and we've started to get a lot of these errors in Webmaster Tools:


Sitemap is HTML
Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead.


One of the problematic sitemaps:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.same_domain.co.uk/folder/file1.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://www.same_domain.co.uk/folder/file2.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://www.same_domain.co.uk/folder/file3.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>http://www.same_domain.co.uk/folder/file4.shtml</loc>
<lastmod>2011-05-11</lastmod>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
</urlset>


Why would Google Webmaster Tools (GWT) think this is anything but XML?

(Server: IIS)



Edit:


"This document was successfully checked as well-formed XML!" -W3C Validator.




Edit:

I resubmitted two of the problematic sitemaps: one unchanged, and one with a couple of extra lines to ensure it's treated as XML. I then ran the "Fetch as Googlebot" diagnostic tool, and both are fine now. I'm just going to resubmit all the sitemaps that show the "Sitemap is HTML" error.

The question remains:

Why did this happen? Why did Google Webmaster Tools think these XML sitemaps were HTML?


4 Comments


 

@Alves908

Check whether there are any issues on the web server side, or whether Google's IP addresses are being blocked. If you use a log-based tracking system, analyse the Googlebot activity in it. We recently had the same issue and found that, after Google changed its IP addresses, our protection against data mining was no longer letting Googlebot crawl the site. Once we found that, the issue was resolved.
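As a rough illustration of analysing Googlebot activity from logs, here is a minimal Python sketch. It assumes the IIS logs are in W3C extended format with a `#Fields:` directive and that `cs-uri-stem`, `cs(User-Agent)` and `sc-status` are among the logged fields. The function name and the sample log lines are made up for the example.

```python
from collections import Counter


def googlebot_sitemap_statuses(log_lines, path_fragment="sitemap"):
    """Count HTTP status codes for Googlebot requests whose path contains path_fragment."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns used by the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed or truncated entries
        entry = dict(zip(fields, values))
        agent = entry.get("cs(User-Agent)", "")
        path = entry.get("cs-uri-stem", "")
        if "googlebot" in agent.lower() and path_fragment in path.lower():
            counts[entry.get("sc-status", "?")] += 1
    return counts


# Hypothetical log excerpt; IIS writes spaces in the user agent as '+'.
sample = [
    "#Fields: date time cs-method cs-uri-stem c-ip cs(User-Agent) sc-status",
    "2011-05-11 10:00:00 GET /sitemap.xml 66.249.66.1 Mozilla/5.0+(compatible;+Googlebot/2.1) 200",
    "2011-05-12 10:00:00 GET /sitemap.xml 66.249.66.1 Mozilla/5.0+(compatible;+Googlebot/2.1) 403",
    "2011-05-12 10:05:00 GET /index.shtml 10.0.0.5 Mozilla/5.0+(Windows) 200",
]
print(googlebot_sitemap_statuses(sample))
```

Anything other than 200 for the sitemap URL (a 403 from a firewall rule, a 500 from a script error) would be worth a closer look. Note that the user-agent string can be spoofed, so this only shows what claimed to be Googlebot.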



 

@Angie530

Farseeker's suggestion is a good first step in troubleshooting: a text/html Content-Type would certainly produce this result. If the sitemap file contained invalid XML, Google Webmaster Tools should display a different error message.

Given the temporary nature of the issue, have you checked your server logs to determine whether an error page was produced on Google's prior requests?

If you are dynamically generating sitemap files, a scripting error, database timeout, or other issue could produce an HTML error page intermittently.
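To catch an intermittent error page, one option is to fetch the sitemap periodically and classify what actually comes back. The sketch below is one way to do that in Python. `classify_body` and its return labels are names invented for this example, and the HTML sniffing is a simple heuristic rather than anything Google documents.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
SITEMAP_ROOTS = {f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex"}


def classify_body(body: bytes) -> str:
    """Return 'sitemap', 'html', 'other-xml', or 'not-xml' for a response body."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Typical HTML error pages are not well-formed XML, so sniff for markup.
        head = body[:1024].lower()
        if b"<html" in head or b"<!doctype html" in head:
            return "html"
        return "not-xml"
    if root.tag in SITEMAP_ROOTS:
        return "sitemap"
    # Strip any namespace before comparing, so XHTML is also reported as HTML.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    return "html" if local_name == "html" else "other-xml"


good = (b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b'<url><loc>http://www.example.com/</loc></url></urlset>')
error_page = b"<!DOCTYPE html><html><body><p>Database timeout<br></body></html>"
print(classify_body(good), classify_body(error_page))
```

Logging the result alongside a timestamp would show whether the HTML responses line up with database timeouts or other server-side errors.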



 

@Sue5673885

You could extend the root element to include the schema declarations:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">


and then validate it online.

If it passes, it must be a problem on Google's end.



 

@Murphy175

Because of the Content-Type header that the server is spitting out. Inspect it with your favourite tool (Firebug, etc.) and see what it's sending.
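If you would rather script the check than use a browser tool, something like the following Python sketch could work. It assumes `application/xml` and `text/xml` are the media types you want to see for a sitemap. The helper names and the `sitemap-check/0.1` user agent are invented for this example.

```python
import urllib.request

# Assumption: these are the media types expected for an XML sitemap.
XML_TYPES = {"application/xml", "text/xml"}


def is_xml_content_type(header_value):
    """True if a Content-Type header value names an XML media type."""
    # Drop parameters such as '; charset=utf-8' before comparing.
    media = (header_value or "").split(";", 1)[0].strip().lower()
    return media in XML_TYPES


def fetch_content_type(url, user_agent="sitemap-check/0.1"):
    """Return the Content-Type header the server sends for url."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")


print(is_xml_content_type("text/xml; charset=UTF-8"))
print(is_xml_content_type("text/html"))
```

Calling `is_xml_content_type(fetch_content_type(url))` with your own sitemap URL reports what the server sends to this client. A server that varies its response by user agent or IP address may still send Googlebot something different.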


