How is HTTP 418 treated by Google and others, as it is not a "real" error?

@Caterina187

Posted in: #Google #HttpHeaders #Seo

I was wondering if you know how Google and other search engines treat a website with HTTP status code 418 I'm a teapot.

According to this Wikipedia article, it can be used as a client error code (4xx). I would like to use this error code for an easter egg website, which should, nevertheless, be found by the search engines.

According to this 4 year old blog post, status 418 will be ignored by Google. Do you have any more recent information about this topic? How do the other search engines react to status 418 (mainly because it is a 4xx code)?
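For reference, a page like the one described could be served with Python's standard library; the HTML body is still delivered alongside the 418 status, so human visitors see the easter egg even if crawlers treat the status as an error. This is only a minimal sketch; the handler name and port are illustrative.

```python
# Minimal sketch: serve an easter-egg page with an HTTP 418 status.
# (TeapotHandler and the port number are illustrative, not from the question.)
from http.server import BaseHTTPRequestHandler, HTTPServer

class TeapotHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<h1>418 I'm a teapot</h1>"
        self.send_response(418, "I'm a teapot")  # 4xx status, but with a body
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To run: HTTPServer(("127.0.0.1", 8418), TeapotHandler).serve_forever()
```

A browser will render the body normally, but HTTP clients that raise on 4xx responses (such as Python's `urllib`) will report the request as an error.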


1 Comment


@Ogunnowo487

If you use the "Fetch as Google" tool in Google Search Console on a page that returns a "418 I'm a Teapot" status, then it simply reports an "Error", and indexing cannot be requested for the page.

In the screenshot below, the circled "Error"s are the result of requesting a page that returns a 418 status. No further information is available at this stage.

[Screenshot: "Fetch as Google" results in Google Search Console showing "Error" statuses]

According to my access log, both Googlebot and Search Console have visited this page, but it has not yet appeared in the index.

Just to clarify, this is a new page, not previously indexed. It is linked from a page that is indexed, which has also been resubmitted (together with "linked pages") for indexing - seen in the screenshot above. I have also submitted an XML sitemap that contains this page (although the "Indexed" count is not yet being reported - SEE UPDATE BELOW). To be honest, I don't hold out much hope - I would be surprised if it did get indexed: not just because it's a 4xx error code, but because it's not a 2xx success code.

Ordinarily, you can do a "Fetch as Google" test and then request the page be indexed. This is usually very quick ("instant") for a single page - but this option is not available on the above page.


According to this 4 year old blog post, status 418 will be ignored by Google.


By "ignored", they mean it is treated as 200 OK status. (Which isn't really the same as being "ignored" in my book, unless it was literally ignored and Google did "nothing"?) The "problem" with that blog post, is that they are testing an already indexed page. Returning a 4xx status wouldn't necessarily make the page drop from the index anyway, at least not for a considerable time (depending on crawl rate), although they did reportedly wait "a few weeks". They also make no mention of reported crawl errors in Google Webmaster Tools (since changed to Google Search Console).


it is not a "real" error


Or is it? It may have been implemented as a "joke" in the beginning, however, it does arguably indicate an "error state". I think it would be more contradictory for a 4xx code to not be treated as an "error state". And it's still "current". The original RFC 2324 from 1998 that defined this status code was even updated in 2014 with RFC 7168.

Most tools will see the 418 status as an error, or will only see 200 as success: "Apache log viewer" and "Screaming Frog SEO Spider" certainly treat the 418 code as an error.
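That matches the usual bucketing by status class. A rough sketch of how such a tool might classify response codes (the function name is illustrative, not from any of the tools mentioned): 418 falls into the 4xx "client error" bucket, the same as 404.

```python
# Sketch: status-class bucketing as commonly applied by crawlers and
# log-analysis tools. (status_class is an illustrative name.)
def status_class(code: int) -> str:
    if 100 <= code < 200:
        return "informational"
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"  # 418 lands here, alongside 404
    if 500 <= code < 600:
        return "server error"
    return "unknown"
```

Under this scheme a tool that only treats the 2xx class as success has no reason to special-case 418; it is just another client error.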

Some web servers reportedly implement the 418 status code:

stackoverflow.com/questions/24018008/is-there-a-server-that-implements-http-status-code-418

Stack Exchange even make use of this HTTP status code when detecting CSRF violations:

meta.stackexchange.com/questions/185426/stack-overflow-returning-http-error-code-418-im-a-teapot

UPDATE 2017-03-31 (2+ weeks later): The page that returns a 418 HTTP status code is not indexed by Google. The XML sitemap report in GSC now shows that only one of the two URLs submitted in the sitemap is indexed (one URL returns a 200 and is indexed; the other returns a 418 and is not indexed).

Incidentally, it took GSC almost 2 weeks to report on the index status of the URLs in the sitemap, but this does not relate to when the page(s) were actually indexed. For example, one page was already indexed at the time the sitemap was submitted; however, looking at the sitemap report alone, it appears the page was only indexed 13 days after the sitemap was submitted.

The URL that returns a 418 is now reported as a "Crawl Error" under Crawl > Crawl Errors, with 418 stated as the response code. According to the report, this was "detected" on 2017-03-16 (the day after submitting the index request above); however, it was some time before this was reported in GSC.
