How to make Google index files retrieved from database?
We use Joomla with Remository to store and manage publications (don't ask me why). Files (PDF) are stored in a database and can be accessed via dynamic, rewritten links of the form
domain.de/some/path/filename.html
Here is an example: some file
Current browsers reliably detect that they get a PDF. wget saves the file under the .html name, but after renaming I get a working PDF file. curl behaves similarly; piping its output into a (suitably named) file gives a working file. All this leads me to believe that, against all odds, one might say, the data our system provides is generally valid and understandable for clients.
However, Google does not seem to index PDF files referenced by such links. Our publication list is indexed, but the PDFs linked there are not (they don't show up in web and Scholar searches).
How can we tell search robots to retrieve our files and index them?
More posts by @Murray155
1 Comments
You cannot tell them, but you can give them a strong hint by providing a sitemap. Google may or may not index these files even with a sitemap, but it will tell you how many of the URLs in the sitemap were indexed. You need a Google Webmaster Tools account (the service is now called Google Search Console) and have to register your website there. Once that is done, sitemap submissions and index status appear in the reports.
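As a rough illustration of the sitemap hint, here is a minimal sketch that writes a sitemap in the sitemaps.org XML format using only the Python standard library. The function name `build_sitemap` and the example URL are assumptions for illustration; in practice the list of links would come from wherever Remository keeps them, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical link, following the URL pattern from the question.
    pdf_links = ["https://domain.de/some/path/filename.html"]
    print(build_sitemap(pdf_links))
```

The output can be saved as sitemap.xml on the site and submitted through the webmaster account. Whether Google then indexes the listed files is still up to Google.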
From a search engine's perspective it really does not matter where the data comes from, only that it is accessible. You may be doing something fancy that Google does not like but it is not the fact your documents are in the database.
Following the link you provided, I see that clicking it automatically starts a download. That may count as an unwanted drive-by download, so be careful; it is also a poor user experience. If the link is meant to be a download, then there are too many pages involved. Check your MIME types too, as they may simply be confusing the Google crawler.
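To check the MIME type suggestion, a small script can look at the response headers the server sends for one of the links. This is only a sketch: the helper names `diagnose_pdf_headers` and `fetch_headers` are made up for illustration, and the two checks (a Content-Type of application/pdf, and no forced download via Content-Disposition) are assumptions about what is worth verifying, not a statement of what Google requires.

```python
import urllib.request


def diagnose_pdf_headers(headers):
    """Return a list of warnings for a dict of HTTP response headers."""
    normalized = {key.lower(): value for key, value in headers.items()}
    warnings = []

    # Strip parameters such as "; charset=utf-8" before comparing.
    content_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    if content_type != "application/pdf":
        shown = content_type or "missing"
        warnings.append("Content-Type is " + shown + ", expected application/pdf")

    # "attachment" asks the client to download instead of displaying inline.
    disposition = normalized.get("content-disposition", "").strip().lower()
    if disposition.startswith("attachment"):
        warnings.append("Content-Disposition is attachment, which forces a download")

    return warnings


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())
```

Calling `diagnose_pdf_headers(fetch_headers(url))` with one of the rewritten .html links would list anything suspicious. An empty list means both checks passed. Some servers answer HEAD requests differently from GET, so the result is a first hint rather than proof.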