Complex Disallow pattern in robots.txt

I have a URL like this:
example.com/freelance-jobs-new-york

I had a problem, and many duplicate pages were created like this:
example.com/freelance-jobs-new-york-php-php
www.example.com/freelance-jobs-new-york-php-php-php
example.com/freelance-jobs-new-york-php-php-php-php

And so on. Those pages have the same content as the main one, so to fix it I redirected every page with the keyword php two or more times in the URL to the main URL.

But I did it late, so Google now has to follow redirects for perhaps more than 20,000 pages that have already been crawled.

So I want to set up a Disallow rule in robots.txt to stop Google from spending resources on those URLs.

So my question is: what pattern should I use to disallow pages with the keyword php two or more times in the URL?

Will Disallow: /*php*php* work as expected? I am asking because I don't want to accidentally block good URLs.


@Heady270

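You can simply use:

Disallow: /freelance-jobs-new-york-php-php

Disallow rules match by prefix, so this blocks every URL whose path starts with that string and no trailing wildcard is needed. A rule ending in */ would only match URLs that contain another slash after the prefix.

See this Google page:
support.google.com/webmasters/answer/6062596?hl=en&ref_topic=6061961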


@Heady270

Googlebot does support wildcards in robots.txt. Google announced this on its blog:
googlewebmastercentral.blogspot.com/2008/06/improving-on-robots-exclusion-protocol.html
Other crawlers don't necessarily support wildcards, so that syntax is not universal.

However, putting URLs into robots.txt does not prevent Googlebot from indexing them. It only stops crawling, so Googlebot never sees the redirect or canonical tag on a blocked URL. Your solution of the canonical tag sounds like a much better idea to get them out of the index. 301 redirects would also work.
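If you still want to check what a wildcard rule would catch before deploying it, here is a rough Python sketch. It only approximates the documented Googlebot matching rules (prefix match from the start of the path, * for any run of characters, a trailing $ to anchor the end); it is not Google's parser. The helper names and the last two sample paths are made up for illustration.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Googlebot-style robots.txt path pattern into a regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and everything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the path, mirroring prefix matching.
    return robots_pattern_to_regex(pattern).match(path) is not None


RULE = "/*php*php*"

# The first three paths come from the question; the last two are hypothetical.
paths = [
    "/freelance-jobs-new-york",
    "/freelance-jobs-new-york-php-php",
    "/freelance-jobs-new-york-php-php-php",
    "/php-developer-jobs",   # a single "php"
    "/php-jobs/index.php",   # "php" twice in an otherwise legitimate URL
]

for path in paths:
    print(path, "->", "blocked" if is_blocked(RULE, path) else "allowed")
```

Under these assumptions the rule leaves the main URL alone, but it also catches any legitimate URL that happens to contain php twice, so run your real URL list through something like this first.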
