
GWMT Shows Non-Existent Backlinks "Via Intermediate Link"

@Ann8826881

Posted in: #Backlinks #DuplicateContent #GoogleSearchConsole #Links

I run two sites, Site A and Site B. Both are ecommerce sites selling the same products, with Site A built first. They are hosted separately, use separate CMSs (one Magento and one Pinnacle), and do not use duplicated content. We sell chemicals, which matters because we make the labels and MSDSs for those chemicals available for download (as required by law). These are legal documents written by the manufacturers for the consumer's safety.

My problem is that my GWMT account shows over 16k backlinks going from Site B to Site A, and all 16k can be traced to 24 PDF files on Site A that were simply copied onto Site B's server. GWMT reports them as backlinks "via intermediate" links, which should indicate both a link somewhere and a redirect somewhere, neither of which exists. There are no links and no redirects anywhere in my code, so these should not be showing up in the first place. I have no unnatural link warnings. Is anyone else having this issue, or does anyone have a solution? Why is Google showing these as links in the first place? Am I being penalized for duplicate content?





1 Comment


 

@Megan663

Google is certainly recognizing the duplicate content, but unless you are seeing ranking drops, it is unlikely that you are being penalized. What is likely happening is:


You publish the PDF on one site: site1.example.com/docs/chemical-foo-safety-data.pdf
You publish the PDF on the other site: site2.example.com/pdf/safety-data-chemical-foo.pdf
Googlebot recognizes that the two documents are identical.
Google chooses site1.example.com/docs/chemical-foo-safety-data.pdf as the canonical document.
When Google sees a link to site2.example.com/pdf/safety-data-chemical-foo.pdf, it treats it as if it were a link to the canonical document.
You get reports that mention redirects ("via intermediate link") because of how Google handles that canonicalization.


While you have a legal obligation to publish this content and make it available to your users, you don't have to make it crawlable. There are several reasons you might want to block it in robots.txt (see the sketch after this list):


It was not authored by you and can likely be found on other sites.
You have it on both of your sites, and Google is clearly seeing your two sites as related because of it.
You wouldn't expect users to come to your site when searching for words or phrases contained in these documents. Even if the documents do attract visitors, you would rather have users land on a page where they could purchase products.
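
If you do decide to block the PDFs, a robots.txt rule on each site along the lines below would keep Googlebot from crawling them. This is only a sketch: the /docs/ and /pdf/ paths are taken from the example URLs above and are assumptions about where the PDFs actually live on your servers.

    # robots.txt on site1.example.com (path assumed from the example above)
    User-agent: *
    Disallow: /docs/

    # robots.txt on site2.example.com
    User-agent: *
    Disallow: /pdf/

Note that robots.txt only stops crawling; Google can still index a blocked URL if other pages link to it. Blocking should, however, keep Googlebot from reading the duplicated PDFs and treating the two sites' copies as the same document.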


For more information on when Google does and does not penalize for duplicate content, read "What is duplicate content and how can I avoid being penalized for it on my site?"


