Fox8124981

Rewritten URLs stopped getting indexed a month ago

@Fox8124981

Posted in: #Dynamic #Google #ModRewrite #Static #UrlRewriting

I'm working on a website for a client who sells products online. There is an item.php page which displays the item for sale. I switched all item pages/links from dynamic URLs to static ones, and Google indexed the pages just fine. However, about a month ago, Google stopped indexing all newly added pages. All pages older than a month are still in the search results, in their rewritten URL format.

Things to note:


There is a sitemap.xml which lists all item links in their static (rewritten) format. Newly added products are at the start of the sitemap, so it runs from newer to older items.
Google Webmaster Tools knows about the sitemap (I added it manually).
Sitemap tests show no errors.
The sitemap is referenced in robots.txt.
All new products are also linked from the site's home page in a "new items" list (so bots can find them there as well, without the sitemap).
New items are always added to the sitemap.
Google's indexed count for the sitemap seems to be stuck, for example at 2500 submitted but only 2350 indexed.
When items are no longer for sale, they are removed from the sitemap and from all pages on the website. If someone tries to access such a page, it returns 410 Gone to browsers and spiders.
Google doesn't show any site errors in Webmaster Tools.
Google's fetch utility fetches all the pages and links successfully, which makes it even more weird!
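One way to double-check the first point in the list above is to scan the sitemap for entries that are not in the rewritten format. The following is only a rough sketch: the helper name `non_static_locs`, the sample XML, and the assumption that item IDs are purely numeric (as in /buy-used/some-slug/57923) are illustrative and not taken from the real site.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: a rewritten item URL looks like /buy-used/<slug>/<numeric id>.
ITEM_URL = re.compile(r"^https?://[^/]+/buy-used/[^/]+/\d+$")


def non_static_locs(sitemap_xml):
    """Return every <loc> value that does not match the rewritten format."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.findall("sm:url/sm:loc", NS)]
    return [url for url in locs if not ITEM_URL.match(url)]


# Hypothetical two-entry sitemap: one rewritten URL, one dynamic URL.
SAMPLE = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.ahparts.com/buy-used/1991-Acura-NSX-HOOD/57923</loc></url>"
    "<url><loc>http://www.ahparts.com/buy-used/item.php?id=57924</loc></url>"
    "</urlset>"
)

print(non_static_locs(SAMPLE))
# -> ['http://www.ahparts.com/buy-used/item.php?id=57924']
```

An empty list would suggest the sitemap contains only rewritten URLs. It says nothing about whether Google has indexed them.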


Why I believe this has to do with the URL rewriting: I also have newly added categories which are NOT rewritten, so their URLs look like example.com/category.php?id=Shoes, and Google always finds those newer pages! Yet it does not find the new items, which have rewritten URLs.

I'm thinking of switching back to dynamic URLs, since it seems static URLs are no longer being added to the index...

Any ideas? Thank you all.

This is the .htaccess

AddHandler application/x-httpd-php53s .php
ErrorDocument 404 /404.shtml
RewriteEngine On
RewriteRule ^buy-used/[^/]+/(.*)$ buy-used/item.php?id=$1 [L]

<Files 403.shtml>
order allow,deny
allow from all
</Files>

deny from 5.52.34.246
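To see what the RewriteRule above does to an incoming item URL, the match can be imitated outside Apache. This is a minimal sketch, assuming the substitution is meant to end in `id=$1` (the captured last path segment) and that the .htaccess file sits in the site root. The `rewrite` helper is hypothetical and only mimics this single rule, not mod_rewrite in general.

```python
import re

# Pattern copied from the RewriteRule. In a per-directory (.htaccess)
# context Apache matches it against the path without the leading slash.
PATTERN = re.compile(r"^buy-used/[^/]+/(.*)$")


def rewrite(path):
    """Return the internal target for a request path, or None if no match."""
    match = PATTERN.match(path.lstrip("/"))
    if match is None:
        return None
    # Assumption: the substitution ends in "$1", the first capture group.
    return f"buy-used/item.php?id={match.group(1)}"


print(rewrite("/buy-used/1991-Acura-NSX-HOOD-BLACK-60100-SL0-A90ZZ/57923"))
# -> buy-used/item.php?id=57923
print(rewrite("/buy-used/item.php"))
# -> None
```

The second call shows that the rewritten target does not match the pattern again, so the rule should not loop on its own output.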


robots.txt file here:

# robots.txt generated at www.mcanerin.com
User-agent: *
Disallow: /cgi-bin/
Sitemap: www.ahparts.com/sitemap.xml
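As a quick sanity check that these rules do not block the item pages, they can be fed to Python's standard `urllib.robotparser`. This is only a sketch: the `http://` scheme on the Sitemap line is an assumption (the post shows the URL without one), and Google's own parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The rules as posted, one directive per line.
# Assumption: the Sitemap URL carries an explicit http:// scheme.
ROBOTS_LINES = [
    "# robots.txt generated at www.mcanerin.com",
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Sitemap: http://www.ahparts.com/sitemap.xml",
]


def load(lines):
    """Build a parser from an in-memory list of robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = load(ROBOTS_LINES)
item = "http://www.ahparts.com/buy-used/1991-Acura-NSX-HOOD-BLACK-60100-SL0-A90ZZ/57923"

print(parser.can_fetch("Googlebot", item))                                   # -> True
print(parser.can_fetch("Googlebot", "http://www.ahparts.com/cgi-bin/test"))  # -> False
print(parser.site_maps())  # -> ['http://www.ahparts.com/sitemap.xml']
```

`site_maps()` requires Python 3.8 or later.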


@Turnbaugh106

Google has indexed your site as expected, and rather quickly too, as mentioned in the comment. Unless you search with site:yourdomain.com your-url-or-keyword-here, Google treats your query as a broad search and returns whatever it considers the best result for the searcher, which often includes pages from other sites. At the moment I am seeing pages from your site being returned; if a page doesn't show up, narrow the search.

Taking your own page as an example: www.ahparts.com/buy-used/1991-Acura-NSX-HOOD-BLACK-60100-SL0-A90ZZ/57923
I searched for: ahparts.com HOOD BLACK 60100-SL0-A90ZZ

[Screenshot: Google Search Webmasters, www.bybe.net/downloads/google-search-webmasters01.jpg]
As you can see above, you were hoping that the URL of the actual black hood would be returned, but instead Google has returned the parent page. This is because Google believes the parent page is more valuable to the searcher: the hood page is being treated as 'thin content', with some pictures and very little text for Google to go on. So if you want these pages to rank better, it's time to write more content on them, since the parent pages currently send better signals to Google and are preferred. (A broad search such as ahparts.com will often return only 1-2 results from your site.)

Now let's try doing this search with site:ahparts.com HOOD BLACK 60100-SL0-A90ZZ

[Screenshot: Google Search Webmasters, www.bybe.net/downloads/google-search-webmasters02.jpg]
As you can see, both pages are now found. This is because Google now knows that you only want results from your domain, so it disregards its own judgment of what is best for you and simply matches everything possible on the site.

Fatter Content

If you want your pages to rank better, it's simply a case of beefing them up. Google prefers the parent page over the hood page, which you would expect it to favor, because of the hood page's thin content. You can have a million pictures and Google will still prefer the page with text content, since it is better able to verify what the page is about. I understand there's only so much you can write about a hood, but maybe list things like the metal type and other details from the stickers on the back. It's not my field so I wouldn't have a clue, but beefing up the pages will surely only increase your rankings.

Page Problems

You also have some major page problems which could be causing Google to rank the pages incorrectly. On the front page, for example, you have JavaScript before the HTML and HEAD tags, which can cause all kinds of problems for search engines, never mind the end customers. This problem actually gets worse throughout the site:

Take a look at the source of the example page we used:

<html>
<head>
<title>Buy 0 1991 Acura NSX trim(HOOD BLACK 60100-SL0-A90ZZ) Sale # 57923</title>
<LINK REL='SHORTCUT ICON' HREF='http://www.ahparts.com/images/ahparts.ico'>

<meta name='description' content='57923 For sale - Buy 1991 Acura NSX HOOD BLACK 60100-SL0-A90ZZ - Used Category: Hoods - Online OEM Honda & Acura auto wrecker recyclers! We search our scrap yard & pull for free. Computerized dismantlers inventory & WORLDWIDE SHIPPING! From body to mechanical parts, we salvage almost anything. Buy from our junkyard & save big by avoiding dealer pricing!'>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD>
<LINK REL='SHORTCUT ICON' HREF='http://www.ahparts.com/images/ahparts.ico'>
<title>Used Honda Acura Parts 877-859-0023 Auto Wreckers Recyclers - Discounted Cheap</title>
<meta name="description" content="Your specialized OEM Honda & Acura auto wrecker recyclers! We search our scrap yard & pull for free. Computerized dismantlers inventory & WORLDWIDE SHIPPING! From body to mechanical parts, we salvage almost anything. Buy discounted & cheap from our junkyard. Save big by avoiding dealer pricing!">
<meta name="keywords" content="specialized, used, OEM, Honda, Acura, auto wreckers, recyclers, scrap yard, dismantlers, inventory, WORLDWIDE SHIPPING, body, mechanical parts, salvage, junkyard, save, avoid dealer pricing">


You should never have more than one head, and content must always be wrapped within the HTML tag. You should also take W3C validation very seriously: not only can it help search engines rank your pages more easily, it also ensures that your end customers rarely run into problems. One or two errors are generally not a problem as long as they are not major ones, but yours are major, and at present you have over 298 errors, which you should get resolved.
validator.w3.org/check?uri=http%3A%2F%2Fwww.ahparts.com%2Fused-parts.php%3Fid%3D13015&charset=%28detect+automatically%29&doctype=Inline&group=0
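
The duplicated head and misplaced doctype described above can also be spotted with a few lines of script before running the full validator. This is a rough sketch using Python's standard `html.parser`. The `HeadAudit` class, the `audit` helper, and the shortened sample markup are illustrative only, and this is no substitute for W3C validation.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Counts structural tags and notes whether the doctype comes first."""

    def __init__(self):
        super().__init__()
        self.counts = {"html": 0, "head": 0, "title": 0}
        self.tags_seen = 0
        self.doctype_is_first = None  # None means no doctype was found

    def handle_starttag(self, tag, attrs):
        self.tags_seen += 1
        if tag in self.counts:
            self.counts[tag] += 1

    def handle_decl(self, decl):
        # Record only the first doctype and whether any tag preceded it.
        if decl.lower().startswith("doctype") and self.doctype_is_first is None:
            self.doctype_is_first = self.tags_seen == 0


def audit(markup):
    """Return a list of human-readable structural problems."""
    parser = HeadAudit()
    parser.feed(markup)
    parser.close()
    problems = [f"{n} <{tag}> tags" for tag, n in parser.counts.items() if n > 1]
    if parser.doctype_is_first is False:
        problems.append("doctype appears after other markup")
    return problems


# Shortened version of the page source quoted above.
SAMPLE = """<html>
<head>
<title>Buy 1991 Acura NSX HOOD BLACK</title>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD>
<title>Used Honda Acura Parts</title>
"""

print(audit(SAMPLE))
# -> ['2 <html> tags', '2 <head> tags', '2 <title> tags', 'doctype appears after other markup']
```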
