Google Structured Data Testing Tool gives "The URL is unreachable" error
The SDTT will not read my pages. No matter which subpage I try to fetch, I get the error:
The URL is unreachable. Ensure robots.txt is accessible and the server is responding with a 200 status code.
An example page is: www.rsvp.dk/events/del-popolo-secret-adventures
All pages are loading fine (200 status). Other sections of my site will test fine, but all pages under /events (the only ones I really care about!) will not load. I didn't have a robots.txt file, but I tried adding one, and testing it for those URLs shows they're allowed.
The page at /events used to be an Angular app that loaded list data and details pages asynchronously, but it was changed about a month ago to plain old HTML pages. The new pages are being indexed somewhat, but not as well as I'd want them to, and structured data (JSON-LD) is not being picked up. I don't know if the previous Angular page could still somehow be affecting Google?
For what it's worth, if I copy/paste the page source into the SDTT's code snippet field, it validates fine, so it seems that Google for some reason just can't access the URLs themselves?
UPDATE: Since some sections of the site, such as /blog, can be accessed fine, it can't be an SSL or blocking issue. The strange thing is that I can create a new URL for the same content, replacing /events with /popups, and this new path still can't be reached. I'm really lost here...
2 Comments
Case solved. For anyone who runs into a similar issue later: I had an i18n culture-resolving mechanism based on cookies. If no preference cookie is set, it does a geolocation lookup on the requesting IP address and sets a cookie according to the country of origin of the request.
I'm not sure if it's the simple act of setting a cookie on the response that caused the issue for SDTT, or some kind of problem resolving the IP address and associated country, but not setting a cookie in the response solved it, so that might be the issue.
I had a fallback in place, so a strange response from the geolocation lookup should have been handled. I also tried running it locally while spoofing a number of Googlebot IP addresses (according to chceme.info/ips/), but these all behaved as expected and returned 200. I can't be certain there isn't a bug in it somewhere, but right now my money's on the response including a cookie.
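For anyone trying to picture the setup, here's a minimal sketch of that kind of cookie-based culture resolver with a geolocation fallback. It is not my actual code: it uses Flask purely for illustration, and the cookie name, culture mapping, and lookup_country_for_ip helper are all placeholders.

```python
# A minimal sketch (not my actual code) of a cookie-based culture resolver
# with a geolocation fallback. Flask, the cookie name, the culture mapping
# and lookup_country_for_ip are all illustrative placeholders.
from flask import Flask, g, request

app = Flask(__name__)

CULTURE_COOKIE = "culture"                            # placeholder cookie name
COUNTRY_TO_CULTURE = {"DK": "da-DK", "SE": "sv-SE"}   # placeholder mapping
DEFAULT_CULTURE = "en-GB"                             # placeholder fallback


def lookup_country_for_ip(ip: str):
    """Placeholder for a real GeoIP lookup; returns an ISO country code or None."""
    return None


@app.before_request
def resolve_culture():
    # Prefer the visitor's preference cookie if one is already set.
    culture = request.cookies.get(CULTURE_COOKIE)
    if culture is None:
        # Otherwise geolocate the requesting IP and map country -> culture.
        country = lookup_country_for_ip(request.remote_addr or "")
        culture = COUNTRY_TO_CULTURE.get(country, DEFAULT_CULTURE)
    g.culture = culture


@app.after_request
def persist_culture(response):
    # Setting the cookie on every response to a request that arrives without a
    # preference cookie is the behaviour that coincided with the SDTT failures;
    # dropping this step (only setting the cookie on an explicit user choice)
    # fixed the "URL is unreachable" error for me.
    if CULTURE_COOKIE not in request.cookies:
        response.set_cookie(CULTURE_COOKIE, g.culture)
    return response
```

The relevant part is just the last block: the unconditional Set-Cookie on responses to cookie-less requests is what I removed.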
I tested your site with the command-line tool curl (version 7.16.2), both with no user-agent string and with the Googlebot user-agent string, and it produced the following results:
curl: (60) SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
More details here: curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle"
of Certificate Authority (CA) public keys (CA certs). The default
bundle is named curl-ca-bundle.crt; you can specify an alternate file
using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
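If you want to reproduce the check without curl, here is a rough Python equivalent using the requests library. The URL and Googlebot user-agent string below are just examples, not the exact commands I ran.

```python
# Rough equivalent of the curl test: fetch the page with certificate
# verification enabled (the default), once without a custom user agent and
# once with a Googlebot user-agent string, and report the status or SSL error.
import requests

URL = "https://www.rsvp.dk/events/del-popolo-secret-adventures"  # example URL
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

for label, headers in [("default client", {}),
                       ("googlebot UA", {"User-Agent": GOOGLEBOT_UA})]:
    try:
        resp = requests.get(URL, headers=headers, timeout=10)
        print(f"{label}: HTTP {resp.status_code}")
    except requests.exceptions.SSLError as exc:
        # This is the same failure curl reports as "certificate verify failed".
        print(f"{label}: SSL certificate verification failed: {exc}")
```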
What I would suggest, then, is to make sure you are using valid SSL certificates and that your web pages load fast from data centers near Google; you can check the latter at webpagetest.org.
If that fails, I'd recommend not using SSL for your generic informative pages and using it only for pages that involve sensitive data, such as when users are logged in or are filling out a form.