
Why are new pages not being indexed while old pages stay in the index?

@Hamaas447

Posted in: #Google #Indexing

I currently have a site that was recently restructured, causing much of its content to be reposted, creating new URLs for each page.

To avoid duplicates, all of the old pages were added to the robots.txt file.

That said, it has now been over a week - I know Google has recrawled the site - and when I search for term X, it is still the old page that is ranking, with the new one nowhere to be seen. I'm assuming it's a cached version, but why are so many of the old pages still appearing in the index?

Furthermore, all "tags" pages (it's a Q&A site, like this one) were also added to robots.txt a few months ago, yet I think they are all still appearing in the index.

Anyone got any ideas about why this is happening, and how I can get my new pages indexed?


2 Comments


@Pierce454

You should not add the old pages to your robots.txt file. Doing so will only tell Google to stop crawling those pages, not to remove them from their index. In fact, disallowing a page already indexed by Google via robots.txt pretty much guarantees that Google will not drop that page from their index as long as any other site links to it, at least unless you use their manual URL removal tool.
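
For context, a robots.txt disallow rule looks like the following (the path is just a placeholder). It only stops compliant crawlers from fetching those URLs; it says nothing about removing them from the index:

User-agent: *
# Blocks crawling of everything under /old-section/, but does not deindex it
Disallow: /old-section/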

If you just want Google to remove the old pages from their index, you should first make sure that they're not disallowed via robots.txt. After that, there are a few things you can do:


Just remove the pages and let your webserver return 404 Not Found responses for them. Eventually, Google will decide that the pages aren't coming back and will delist them.
Configure your webserver to return 410 Gone responses for those pages instead. That will have the same effect, but Google will react a bit faster.
Add a robots meta tag with the value noindex to the old pages, or configure your webserver to include the X-Robots-Tag HTTP header with the same value. This will tell Google to delist the pages even if they still exist on your site (see the sketch after this list).

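For illustration, here is a minimal .htaccess sketch of the 410 and noindex approaches, assuming Apache with mod_alias and mod_headers enabled; the file names are placeholders for your own old pages:

# Return 410 Gone for a removed page (mod_alias)
Redirect gone /path/to/removed-page.htm

# Or keep a page online but ask search engines not to index it (mod_headers)
<Files "old-tag-page.htm">
Header set X-Robots-Tag "noindex"
</Files>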

However, as the other answer here has already noted, there's an even better solution for pages that haven't really been removed, but have only moved to a new URL: set your webserver up to return 301 redirects from the old URLs to the new ones. That way, both search engines and human visitors will be automatically redirected to the new location of the pages. This keeps your users happy and makes sure that your Google ranking won't suffer because of the reorganization.


@Caterina187

You should create 301 redirects from the old pages to the new ones, as this will help Google understand what has happened to your site.

If other sites link to your old pages, Google will continue to look for the old URLs. The 301 redirect basically tells search engines that the page has permanently moved and where it has moved to.

If you're using Apache, you can create a .htaccess file in the root of your site containing lines like this:

Redirect 301 /path/to/old-url.htm http://www.yourdomain.com/path/to/new-url.htm
Redirect 301 /path/to/another-old-url.htm http://www.yourdomain.com/path/to/another-new-url.htm


Use one line per redirect; the first URL should not contain the domain name, but the second should be a full URL including the scheme and domain.

As @DisgruntledGoat points out, if the old URLs follow a pattern, e.g.

www.domain.com/media/news/name-of-post

then you can use Apache's mod_rewrite feature to redirect all of them with a short rule placed in the .htaccess file.
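
As a rough sketch, assuming the old URLs all live under /media/news/ and the new ones under a hypothetical /news/ directory, a rule like this in the same .htaccess file would catch them all (adjust the pattern and target to your actual structure):

RewriteEngine On
# 301-redirect everything under the old /media/news/ path to the new /news/ path
RewriteRule ^media/news/(.*)$ http://www.yourdomain.com/news/$1 [R=301,L]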

If you don't have a Google Webmaster Tools account, sign up for one; it will help you understand how your site is ranking and which URLs Google is looking for.
