Prevent Bing from crawling thousands of essentially identical pages?
I have a web page with a dozen tables of data on it, each with half a dozen columns. Every table can be sorted by a column by clicking the relevant header, and the chosen sort columns get appended to the query string.
For example, a page with three tables sorted by columns 4 and 6, and by column 3 descending:
page.html?s1=4&s2=6&s3=-3
etc.
I have nofollow links on the column headers, and
<link rel="canonical" href="page.html">
on the page.
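The header links are ordinary anchors that carry the sort parameter, roughly like this (a simplified sketch):
<!-- header link that sorts the first table by column 4 -->
<a href="page.html?s1=4" rel="nofollow">Column 4</a>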
But Bing still crawls its way through thousands of combinations: 5772 of them yesterday!
I've marked s1/s2/s3/s4... as parameters to ignore (a long time ago), but that hasn't helped.
How can I prevent it from doing this? It's unnecessary server load for no gain.
You can tell Bing, and other web crawlers, what to crawl and what to ignore using a file called robots.txt in the root of your website. It lets you tell specific crawlers, or all of them, to ignore specific URLs.
In your case:
User-Agent: *
Disallow: /*?s1=*&s2=*&s3=*
You might need to make small changes to the Disallow line depending on the parameters used on your site.
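Note that the pattern above only matches URLs where s1, s2, and s3 all appear in that exact order. Bing supports the * wildcard in Disallow rules, so a more thorough sketch (assuming the sort parameters are named s1 through s4, as in the question) would block each parameter whether it appears as the first parameter (?s1=) or a later one (&s1=):
User-agent: *
# Block any URL containing a sort parameter, regardless of its position in the query string
Disallow: /*?s1=
Disallow: /*&s1=
Disallow: /*?s2=
Disallow: /*&s2=
Disallow: /*?s3=
Disallow: /*&s3=
Disallow: /*?s4=
Disallow: /*&s4=
Bear in mind that once these URLs are disallowed, Bing will stop fetching them and so will never see the canonical tag on those pages; since they are duplicates of page.html anyway, that is usually the desired outcome.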
More on robots.txt files here.