Why is Google still not indexing my #! website?
I have been working on a website which uses #! (2minutecv.com), but even after six weeks of the site being up and running and conforming to the Google hashbang guidelines stated here, Google still hasn't indexed the site. For example, if you use Google to search for "2MinuteCV.com benefits", it does not find this page, which is linked from the homepage.
Can anyone tell me why Google isn't indexing this website?
Update:
Thanks for all the help with this answer. Just to make sure I understand what is wrong: according to the answers, Google never actually indexes the pages after the JavaScript has run. I need to create a "shadow site" that Google indexes (which Google calls HTML snapshots). If I am right in thinking this, then I can pick a winner for the bounty.
Update 2:
Since this was posted we have switched to serving static HTML files from our server. This is simple, fast, and gets indexed properly by Google.
EDIT: This answer is very similar to Anthony Hatzopoulos's.
Are you aware that there is NO content on your rendered pages (sorry for yet another answer, but these are different suggestions)? This has nothing to do with load time either.
Google reads the HTML source of a web page; however, if you go to www.2minutecv.com/#!/en_us/benefits and view the source (right-click the page and choose View Source), you will see that all you have is empty DIV tags, a few JavaScript commands, and an "all rights reserved" footer. Although I can see that your page has content and sections (such as "Stay focused on the content"), a search of the View Source output does not turn up that text. According to the source file, you actually have an HTML page with zero content!
I think Google is seeing your page, but (I assume) because it can't see any content in the served output, which is the only thing Google reads, it doesn't index it.
The reason Google is not following your hashbang (#!) links is that when the page initially loads they do not exist; they are nowhere to be found in the source code. In other words, with JavaScript disabled you do not have a single <a> anchor tag in the HTML source of your page. The only thing that will be indexed is a blank page with a copyright notice. The Home, Benefits, How it works, and FAQ links get loaded via JavaScript. Disable JavaScript and you get this (which is what Google gets):
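(A sketch of roughly what that looks like; the element names below are placeholders of mine, not the site's actual markup.)

<!-- illustrative sketch of the served source, not the actual markup -->
<body>
  <div id="header"></div>   <!-- empty: nav links injected by JavaScript -->
  <div id="content"></div>  <!-- empty: page text injected by JavaScript -->
  <div id="footer">All rights reserved</div>
</body>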
Google will not index what it cannot crawl, and neither will other search engines. Google can run JavaScript, but don't bank on it being used for crawling content (yet). It will parse some JavaScript and AJAX links; in your case, your page source has none.
So you need to add static tags on your page linking to these #! pages, and it wouldn't hurt to add a sitemap.xml as well, which, by the way, currently and strangely points to an entirely different 'youcaneat.at' website: www.2minutecv.com/sitemap.xml
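For the static links, something like this in the served HTML would give the crawler real anchors to follow before any script runs (the Benefits path is copied from your site; the other paths are my guesses from your menu labels):

<ul id="nav"> <!-- the id and all paths except /#!/en_us/benefits are hypothetical -->
  <li><a href="/#!/en_us/home">Home</a></li>
  <li><a href="/#!/en_us/benefits">Benefits</a></li>
  <li><a href="/#!/en_us/how_it_works">How it works</a></li>
  <li><a href="/#!/en_us/faq">FAQ</a></li>
</ul>

Googlebot recognizes #! anchors under its AJAX crawling scheme, so once these exist as real anchor tags in the source it can at least discover those URLs.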
And if you can avoid it, stop asynchronously loading everything after page load. Your site is not fancy enough to need AJAX, and in your case there is no real benefit to the tactic.
The main reason your pages are not being indexed is that there are no HTML links.
You're providing JavaScript links to the other pages, and while the #! denotes that each should be treated as a different page, you're not upholding your end of Google's JavaScript crawling agreement:
An agreement between crawler and server

In order to make your AJAX application crawlable, your site needs to abide by a new agreement. This agreement rests on the following: The site adopts the AJAX crawling scheme. For each URL that has dynamically produced content, your server provides an HTML snapshot, which is the content a user (with a browser) sees. Often, such URLs will be AJAX URLs, that is, URLs containing a hash fragment, for example example.com/index.html#key=value, where #key=value is the hash fragment. An HTML snapshot is all the content that appears on the page after the JavaScript has been executed. The search engine indexes the HTML snapshot and serves your original AJAX URLs in search results.

(quoted from developers.google.com on 17 February 2012)
Since you do not provide an HTML fallback by which the crawler can determine what is static versus what is JavaScript, it will most likely refuse to crawl the content.
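For what it's worth, under that scheme Googlebot turns www.2minutecv.com/#!/en_us/benefits into www.2minutecv.com/?_escaped_fragment_=/en_us/benefits, and your server is expected to answer that request with the pre-rendered snapshot. Here is a minimal sketch of the idea as Apache rewrite rules in an .htaccess at the site root; the /snapshots/ directory and file layout are assumptions of mine, not anything Google prescribes:

RewriteEngine On
# If Googlebot asks for the escaped-fragment version of the homepage...
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=/(.*)$
# ...serve the matching pre-rendered snapshot file (the trailing ? drops the query string)
RewriteRule ^$ /snapshots/%1.html? [L]

How you produce the snapshots (a headless browser, or the same templates that feed your JavaScript) is up to you; the point is that the escaped-fragment URL must return the full post-JavaScript HTML.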
Secondly, since the non-#! URLs all point to some 'youcaneat.at' page which bears no resemblance to your site, Google's bot is quite likely to assume it's a 'spam' attack, which will definitely not improve your chances of getting your JavaScript indexed.
A rule of thumb to keep in mind: stick with HTML when you can, because the most Google promises is that it might index JavaScript.
Jochem.
The reason is that you're using the hash symbol, which tells the bot that the link points to a location on the same page.
www.2minutecv.com/#!/en_us/benefits
An example of this is the Stack Exchange website. Go to the FAQ section; the links on the right are local HTML links, each one the section name prefixed by the hash symbol.
webmasters.stackexchange.com/faq#reputation
Google will not index #reputation because it's just part of the page and webmasters.stackexchange.com/faq has already been indexed.
I would assume that, because of your link URLs, Google thinks www.2minutecv.com/#!/en_us/benefits is the same page as www.2minutecv.com/ and, regardless of whether it can crawl it or not, is not indexing it separately.
Do you have any other sites pointing to it? Ironically, the fact that you've added a link to it from this site will help ensure it does get indexed (not 100% certain, but I would put money on it).
Anyway, it is indexed:
Google Link
Also, your code is poor... You have this code (as an example - this is copied from your site):
<img src="/images/arrow_to_login2.png" style="z-index: 3; top:292px; left: 315px; position:absolute;"></img>
There is no closing </img> tag in HTML; img is a self-closing (void) element... This is just one example. If your site is not coded well, then Google may struggle or fail, or index it only in part. I strongly suggest you put your website through the W3C Markup Validator and correct the errors. This will help.
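For that line, the valid form would be as follows (alt added because the validator will flag its absence too):

<img src="/images/arrow_to_login2.png" alt="" style="z-index: 3; top:292px; left: 315px; position:absolute;">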
From the Google FAQ:

Q: My site isn't indexed yet!

A: Crawling and indexing are processes which can take some time and which rely on many factors. In general, we cannot make predictions or guarantees about when or if your URLs will be crawled or indexed. When looking at a site's indexing in Webmaster Tools, make sure that you have both the "www" and the "non-www" versions (like "www.example.com" and "example.com") verified and have set a preferred domain. Keep in mind that while a Sitemap file can help us learn about your site, it does not guarantee indexing or increase your site's ranking.