
Is it considered blackhat to show structured data to search engine bots but not humans?

@Caterina187

Posted in: #Blackhat #Cloaking #JsonLd #Seo #StructuredData

Is it considered blackhat if the JSON-LD structured data is only shown to search engines? The content is still present on the website for users, but it is only marked up as structured data when we detect that the visitor is a bot. Can we get penalized for this?

This is to add a layer of protection against people trying to scrape the site.

(I changed the title of the question to reiterate my point. There are instances when showing one type of content to search engines and another type to humans is permissible; one example is sites that use Flash. I'm wondering whether an exception exists for structured data.)

Example of the data

Here's one example of structured data that's inside the source code that the bots are supposed to read. This same content has to be present on the page in visible format as well so the humans can read it.

{
  "@context": "http://schema.org",
  "@type": "Person",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Good city",
    "addressRegion": "Great State",
    "postalCode": "47918",
    "streetAddress": "701 N Nice St"
  },
  "name": "Firstname LastName",
  "telephone": "765-764-1111"
}


@Jamie184

Cloaking is when you show content to a search engine that is different from what you show to a user.

Google tests web pages from outside of its own network, and you will never know when it does. If there is a difference between what you show Googlebot and what you show users, Google will spot-check more pages for differences. If enough pages appear to have significant enough differences, the penalty is applied.

It is as simple as that.

To answer your question, it can be cloaking if the content change is significant enough. Never show different content to search engines than to users. Just keep it all simple.

[Update]

Thank you for the example of what you are specifically asking.

Cloaking is what I defined earlier. However, there is a bit of tolerance, especially in light of desktop versus mobile. In the early days, cloaking could be determined simply by capturing the page twice, once via the crawler and once from outside the crawler (often from another network), and comparing the checksums of the two captures. These days, with desktop versus mobile, it is not so simple.
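To make the old checksum approach concrete, here is a minimal sketch in Python. It only illustrates the naive idea of hashing two captures of the same URL and comparing them; it is not a description of what any search engine actually runs. The function names and the sample HTML strings are made up for the example.

```python
import hashlib


def page_checksum(html: str) -> str:
    """Return a SHA-256 hex digest of the raw page source."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def captures_differ(crawler_html: str, visitor_html: str) -> bool:
    """Naive check: any byte-level difference counts as a mismatch."""
    return page_checksum(crawler_html) != page_checksum(visitor_html)


# Hypothetical captures of the same URL, one fetched as a crawler
# and one fetched as an ordinary visitor from another network.
visitor_capture = "<html><body><p>Firstname LastName</p></body></html>"
crawler_capture = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Person", "name": "Firstname LastName"}'
    "</script></head><body><p>Firstname LastName</p></body></html>"
)

if captures_differ(crawler_capture, visitor_capture):
    print("captures differ")
```

Note that a check this crude would flag the bot-only JSON-LD case immediately, because the extra script block changes the page source even though the visible text is identical.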

We know that Google can fetch a series of pages and determine templated content versus page content rather easily. In light of the state of the web these days, I would have to assume that some level of analysis takes place to compare the content portion of the page and possibly the template portion of the page separately. How pages are analyzed for cloaking these days will likely remain a mystery. However, it is reasonable to assume that some minor differences in the non-content portion of the page are to be expected in some cases.

The next question is: is it wise to present JSON data to crawlers only?
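As a rough illustration of comparing the content portion separately, the sketch below extracts only the human-visible text from two captures (ignoring script and style blocks) and compares that instead of the raw source. This is purely my assumption about what such an analysis might look like, not a known Google method; the class name, function name, and sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            # Normalize whitespace so layout changes do not matter.
            self.parts.append(" ".join(data.split()))


def visible_text(html: str) -> str:
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical captures: the crawler version carries JSON-LD, the user version does not.
user_html = "<html><body><p>Firstname LastName</p><p>701 N Nice St</p></body></html>"
bot_html = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Person", "name": "Firstname LastName"}'
    "</script></head><body><p>Firstname LastName</p><p>701 N Nice St</p></body></html>"
)

same_visible_content = visible_text(user_html) == visible_text(bot_html)
```

Under this kind of comparison, the two versions in the question would look identical to a human reader even though their source differs, which is why the outcome depends entirely on how the search engine chooses to compare them.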

No one can say specifically whether a search engine (Google in particular, since Bing seems to be rather tolerant) will see the omission of JSON as deceptive. It has to be recognized as a risk, even if the risk appears small and the omission seems a reasonable thing to do. As a recommendation, I would say to serve the JSON data to both users and crawlers to avoid any issues. Why? Because cloaking is not a small violation, at least in Google's eyes. If cloaking is detected, Google will spot-check the site and then apply the penalty. This is an automated process. Once the penalty is applied, it can take quite a while to have it removed, and it is likely a knock on the site's trust metrics, affecting search even after the penalty is lifted.
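For what it is worth, serving the same markup to everyone is simple to do. The following Python sketch renders the JSON-LD from the question into a script tag and appends it to the page for every request, deliberately ignoring the user agent. The function names and the `user_agent` parameter are hypothetical and only there to show that no bot detection is involved.

```python
import json

# The example record from the question.
PERSON = {
    "@context": "http://schema.org",
    "@type": "Person",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Good city",
        "addressRegion": "Great State",
        "postalCode": "47918",
        "streetAddress": "701 N Nice St",
    },
    "name": "Firstname LastName",
    "telephone": "765-764-1111",
}


def render_json_ld(data: dict) -> str:
    """Wrap structured data in a JSON-LD script tag."""
    # Escape "</" so a value can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


def render_page(visible_html: str, data: dict, user_agent: str = "") -> str:
    """Return the page source; user_agent is intentionally unused."""
    return f"{visible_html}\n{render_json_ld(data)}"


page_for_bot = render_page("<p>Firstname LastName</p>", PERSON, user_agent="Googlebot")
page_for_user = render_page("<p>Firstname LastName</p>", PERSON, user_agent="Mozilla/5.0")
```

Both calls return the same source, so there is nothing for a cloaking check to find. The trade-off is that scrapers see the markup too, but the same data is already on the page in visible form.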
