Sitemap completeness

@Connie744

Posted in: #Sitemap #XmlSitemap

I have a couple of questions about sitemaps:


Does every page need to exist in the sitemap?
What about (ir)relevant parameters on which the content may depend?
What about pages that depend on cookies (e.g. a cart overview)?


I hope someone can answer them!

Edit:

By (ir)relevance I mean, for example, domain.com/some-product/reviews?page=x, where the content depends entirely on the page parameter.

In domain.com/some-product?affiliate_id=y, by contrast, it doesn't, and one probably wants to tell the search engines that the page is the same regardless of whether affiliate_id is present.


3 Comments


 

@Sarah324

I recently went to a Google seminar in New Jersey and this question was asked. This is what the Google gurus had to say about it.

A sitemap and a robots.txt file go hand in hand: you use the sitemap to tell the crawlers what on your site needs to be crawled, and robots.txt to tell them what not to crawl (among other uses).

Pages like terms and conditions, privacy policy, and return policy have no content that supports your brand or product, and very little SEO value. A solid fundamental technique for effective crawling is as follows:


Go to Google and search for your site by typing site:domain.com
Inspect every link reported to Google and make a list of all the 404s and irrelevant pages.
Include ALL your REAL pages in your sitemap (no 404s).
Disallow all your irrelevant pages (terms of use, privacy policy, etc.) in your robots.txt, as in the sketch below.
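
A minimal robots.txt sketch along those lines (the paths here are hypothetical placeholders, not from the original post; adjust them to your site's actual URLs):

User-agent: *
Disallow: /terms-of-use
Disallow: /privacy-policy
Disallow: /return-policy

Sitemap: https://domain.com/sitemap.xml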


Wait a week or two, then search for your site again. The results should be reduced to only the content you want the spiders to see.



 

@RJPawlick198

You should read these resources. They are not the most basic manuals, but they have some references that may be useful. Of course, I assume that you have already read basic information and manuals about sitemaps.


sitemaps.org, which has the essential information about the subject.
Google's article on duplicate content caused by URL parameters, which has some references and may help with the doubts about the affiliate parameter.


Sitemaps are references for crawlers about which pages you want indexed. Of course, crawlers may, and will, discover more pages on your site than the ones in your sitemap if it is not exhaustive. So, if you want to make the process of getting the whole site indexed easier, include every single page in it.
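For reference, a minimal sitemap following the sitemaps.org protocol might look like this, using the URLs from the question as placeholders:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://domain.com/some-product</loc>
  </url>
  <url>
    <loc>https://domain.com/some-product/reviews?page=1</loc>
  </url>
</urlset>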

Pages whose query parameters determine which content is shown should also be in the sitemap. If you have content that may appear under more than one URL, you can use the canonical link element to tell Google which page/URL is the important one of the set. But even if you don't do that, Google will aggregate all the duplicates into the same group and show only one in the results, even if all the variants are crawled and indexed.
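For the affiliate_id case from the question, a sketch of how that looks: placing this link element in the <head> of domain.com/some-product?affiliate_id=y tells search engines that the parameter-free URL is the preferred one:

<link rel="canonical" href="https://domain.com/some-product" />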

Pages that depend on cookies shouldn't be in the sitemap, but the page that originates the process should be. So, using your cart example, you should include this:
www.example.com/cart.php

but not these:
www.example.com/checkout.php
www.example.com/cart_review_your_products.php
www.example.com/cart_promotions_for_buying_many_items.php
www.example.com/cart_discounts.php
...


Some people may argue that even the cart itself shouldn't be included, but others consider it a page too: it has some default content, so they want it indexed. I wouldn't include it.



 

@Goswami781

The basic intention of sitemaps is to inform crawlers about the structure of the website along with its page URLs.

A sitemap should be based on which pages you want indexed by search engines. Using the sitemap, a crawler like Googlebot learns the structure of your website and picks up the keywords for indexing.

It's up to you which pages you make easy to index by letting the bot know your site.

Note: This answer is not complete; it needs some expert's opinion.


