How to introduce a large number of pages (that are only accessible through an AJAX-driven search mechanism) to search engines

We've got a web application whose main purpose is searching through a set of records. The search mechanism itself is completely AJAX based. A typical search URL looks something like this:
/search#arg0=v0&arg1=v1 (there can be up to 20 arguments)
Currently about 1000 records are added to the database daily. Within those, groups of 30 to 50 records may have similar (but not identical) page titles. I would like to get those records indexed.
My solution is to add links for certain predefined search queries to the sitemap. We don't know how frequently crawling happens, or how many records will be displayed as search results (which means some records may be missed). We could check the user agent string, strip out the AJAX-driven parts, and send a snapshot of the response to search engines. To solve the missed-records issue, I thought of storing the time of the last crawl somewhere and showing results based on that value, but that creates a security issue: if someone changes their user agent value and sends a request, that crawl-time value can easily be tampered with.
Any ideas? Is there a better, more straightforward way than this?
Changing the content on the page for search engines based on user agent is called cloaking. Using cloaking is against Google's webmaster content guidelines and can get your site removed from the search results.
Instead you should use crawlable AJAX, where you replace the # in the URL with #! and serve server-side snapshots through the _escaped_fragment_ URL parameter.
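As a minimal sketch (assuming a Node/Express server; renderSearchSnapshot and the file paths are placeholders, not your actual code), the server would detect the _escaped_fragment_ parameter that crawlers send in place of the #! fragment and return a static HTML snapshot, while normal visitors still get the AJAX page:

import express from "express";

const app = express();

// Placeholder: run the search server-side and return plain HTML markup.
async function renderSearchSnapshot(params: Record<string, string>): Promise<string> {
  return `<html><body><!-- static markup for ${JSON.stringify(params)} --></body></html>`;
}

app.get("/search", async (req, res) => {
  const fragment = req.query["_escaped_fragment_"];
  if (typeof fragment === "string") {
    // Crawler requested /search?_escaped_fragment_=arg0=v0&arg1=v1
    // (the original URL was /search#!arg0=v0&arg1=v1).
    const params = Object.fromEntries(new URLSearchParams(fragment));
    res.send(await renderSearchSnapshot(params));
    return;
  }
  // Regular visitors get the normal AJAX-driven page.
  res.sendFile("index.html", { root: "public" });
});

app.listen(3000);

Because the snapshot contains the same content the AJAX page would show, this is not cloaking: both crawlers and users see the same results, just rendered differently.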
It is also problematic that you are proposing the crawling and indexing of search results. Google runs its own search engine and wants to direct users to the content, not to additional search results. They call that search results in search results. Having your search results pages crawled and indexed is another thing that can get your site removed from Google.
Instead you should create pages for your actual content and put those pages into the sitemap directly. Just having content listed in your sitemap is not enough to get it ranked well; see the sitemap paradox. You also need to figure out how to link your content together. You should be able to click and navigate to all the content on your site without using search. "Related content" lists, like the one on the right of this Stack Exchange page, are one very popular way to do so.
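As a rough illustration (assuming each record gets its own detail page at /records/<id>, which is a hypothetical URL scheme, and that the sitemap is regenerated as records are added), the sitemap can be built directly from the records table rather than from search queries:

interface RecordSummary {
  id: string;
  updatedAt: Date;
}

// Build a sitemap listing one <url> entry per record detail page.
function buildSitemap(records: RecordSummary[], baseUrl: string): string {
  const urls = records
    .map(
      (r) =>
        `  <url>\n` +
        `    <loc>${baseUrl}/records/${encodeURIComponent(r.id)}</loc>\n` +
        `    <lastmod>${r.updatedAt.toISOString().slice(0, 10)}</lastmod>\n` +
        `  </url>`
    )
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n</urlset>`
  );
}

With roughly 1000 new records a day, regenerating this file (or an indexed set of sitemap files) on a schedule keeps crawlers pointed at the content pages themselves instead of at search result URLs.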