Putting HTTPS links in my sitemap confuses Google?
In Google Webmaster Tools, "https" is one of my top content words.
Clicking on the keyword, I see this is because "https" appears 160 times in my sitemap (but nowhere else on my site).
My sitemap consists of 160 entries that look like this:
<url>
<loc>https://www.myurl.com/descriptive-page-title/2</loc>
<lastmod>2015-08-19</lastmod>
<changefreq>daily</changefreq>
<priority>0.5</priority>
</url>
Should my loc tags be using relative URLs? I used https URLs because I didn't want to pay the performance penalty of the 301 redirect from http to https that happens for every single page on my site.
2 Comments
Google has a bad habit of indexing sitemap XML files. I have encountered this phenomenon as well and asked a question about how to stop it: Prevent XML sitemaps from showing up in Google search results. You can follow the advice in the answers there and add the X-Robots-Tag: noindex header to the HTTP response that serves your sitemap XML, which prevents Google from indexing it.
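For illustration, here is a minimal sketch of serving a sitemap with that header from a small Python WSGI app. The names `app` and `serve`, the placeholder sitemap body, and port 8000 are assumptions made up for the example; on a real site you would more likely set the header in your web server configuration.

```python
from wsgiref.simple_server import make_server

# Placeholder sitemap body (an assumption for this sketch).
SITEMAP_XML = (
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    b"</urlset>\n"
)


def app(environ, start_response):
    """Serve /sitemap.xml with an X-Robots-Tag: noindex response header."""
    if environ.get("PATH_INFO") == "/sitemap.xml":
        headers = [
            ("Content-Type", "application/xml; charset=utf-8"),
            # Asks search engines not to index the sitemap file itself.
            ("X-Robots-Tag", "noindex"),
            ("Content-Length", str(len(SITEMAP_XML))),
        ]
        start_response("200 OK", headers)
        return [SITEMAP_XML]

    body = b"Not found"
    start_response(
        "404 Not Found",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def serve(port=8000):
    """Run the sketch locally (hypothetical helper, not called here)."""
    with make_server("", port, app) as server:
        server.serve_forever()
```

Calling `serve()` and requesting `/sitemap.xml` should show the `X-Robots-Tag: noindex` header in the response, while the sitemap body itself stays unchanged.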
It is also worth noting that the content keywords are not worth worrying about unless you see spam in them. Google provides that report so that you can notice if Google is indexing your site for "viagra" or "casinos". If you see keywords like that, you know your site has been hacked and your content has been compromised. As long as the keywords are not spammy, that report can be ignored. Just because something appears on that report doesn't mean you will rank for it. Conversely, you can rank for keywords that do not appear on that report. The order of keywords on that report has no relation to how Google's search algorithm chooses pages for keywords.
The issue here is that the term "sitemap" has been used in webmastery to mean two different but similar things, and I think this is where the confusion may lie.
The sitemap.xml file should not be linked to through a hyperlink on your website. The purpose of the sitemap.xml file is to list every page on your site and make it easier for search engines to index your site. However, if you link to it directly with a hyperlink, the crawlers will interpret it as a plain text file rather than a sitemap file. Crawlers will automatically look under the root domain name for a file named sitemap.xml and, if it is found, will use it to help identify additional pages on your site that they may not have known about yet due to the lack of links.
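To make the "list of every page" role concrete, here is a rough sketch of how a program might read the page URLs out of a sitemap.xml document using Python's standard library. The `SAMPLE` document, the example.com URLs, and the `list_urls` name are assumptions for illustration; this is not how any particular search engine actually processes sitemaps.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol's <urlset> element.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical two-entry sitemap used only for this example.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2015-08-19</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/about</loc>
  </url>
</urlset>
"""


def list_urls(sitemap_xml):
    """Return the text of every <loc> element found under <url> entries."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in list_urls(SAMPLE):
        print(url)
```

The file is plain structured data with no navigation or styling, which is why it is of little use to a human visitor.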
The sitemap that you find linked to within the site itself should be a separate but similar file. It should be written in HTML and should contain hyperlinks to the different sections of your site. This sitemap is to help users navigate your site.
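As a contrast with the XML file, a user-facing sitemap can be as simple as a list of hyperlinks. The following is only a sketch: the `render_html_sitemap` name and the section titles and paths are made up for the example.

```python
from html import escape


def render_html_sitemap(sections):
    """Render (title, url) pairs as an HTML list of hyperlinks."""
    items = "\n".join(
        '  <li><a href="{}">{}</a></li>'.format(
            escape(url, quote=True), escape(title)
        )
        for title, url in sections
    )
    return "<ul>\n{}\n</ul>".format(items)


if __name__ == "__main__":
    # Hypothetical site sections.
    print(render_html_sitemap([
        ("Home", "/"),
        ("About & Contact", "/about"),
    ]))
```

A page built this way lists only the main sections a visitor would want to reach, rather than every URL on the site.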
As an example I will use the Monash University website...
The sitemap.xml file is located at www.monash.edu/sitemap.xml, but the user sitemap is located at www.monash.edu/sitemap. As you can see, the two sitemaps are different. The XML file lists all the pages on the site and is extremely long, while the sitemap web page is designed for users and has links to the various sections.
This answer may be a generalisation, and what you add to your two sitemaps is entirely up to you. The point, however, is that you should not link directly to the sitemap.xml file from within your website, as it is a background file that is only useful to search engine spiders.