
After site was hacked, how to remove foreign pages from Google's index?

@Angie530

Posted in: #GoogleSearchConsole #HackedSite #Htaccess

I'm dealing with a site that was hacked a while back. Google indexed thousands of spam pages with Japanese content. I have used my robots.txt file to disallow everything except the pages that actually exist on my site, and used .htaccess to return 404s for pages that don't exist.

Google continues to show sitelinks (in Japanese) to pages on my site. If I check Webmaster Tools, there are still thousands of pages indexed, and the content keywords are mainly Japanese terms.

There is no Japanese version of the website and no Japanese text on it.

What is different about this issue is that Google is showing the sitelink text in Japanese while now linking to my top pages, which do exist. I can't disallow these pages. I also need to get rid of all the foreign content Google holds in its index, which still contains URLs that don't exist on the site.

The URLs in the snippets Google serves all return a 404, but they are still in the index.

Why is this content still indexed?


@Kimberly868

Don't block them from being crawled. Blocking doesn't remove them from the index; it only stops Googlebot from looking at them, so it never gets to see that they now return a 404.
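To see which of the hacked URLs your current robots.txt is hiding from the crawler, something like the sketch below might help. It uses Python's standard `urllib.robotparser`; the function name `blocked_urls` and the sample rules are made up for illustration, and Python's parser only approximates how Googlebot itself evaluates rules (it applies the first matching line, so the `Allow` lines are listed first here).

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text forbids user_agent to fetch."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical rules: allow one real section, disallow everything else.
    sample_rules = "User-agent: *\nAllow: /about/\nDisallow: /\n"
    sample_urls = [
        "http://www.example.com/about/",
        "http://www.example.com/bad-page/",
    ]
    for url in blocked_urls(sample_rules, sample_urls):
        print("blocked from crawling:", url)
```

Any spam URL this reports as blocked is one the crawler is not allowed to re-fetch, so the status code it returns goes unnoticed.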

Normally, the fastest way is to use the Search Console removal tool. For the numbers you're talking about, that doesn't sound practical, as the URLs have to be entered one by one.

The next fastest, in my experience, is to create a sitemap that does an alternate-language mapping. Sitemaps are crawled and processed very soon after being submitted. If you tell Google that each of the bad URIs is a Japanese-language page (rel="alternate" hreflang="ja") and then put in real URIs as the "en" alternatives, this should replace them in English-language results. You can use the same real URI multiple times.

Example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>http://www.example.com/bad-japanese-page/</loc>
<xhtml:link
rel="alternate"
hreflang="ja"
href="http://www.example.com/bad-japanese-page/"
/>
<xhtml:link
rel="alternate"
hreflang="en"
href="http://www.example.com/good-page/"
/>
</url>
</urlset>
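
With thousands of URLs you would not want to write those entries by hand. Below is a minimal sketch of generating them with Python's standard `xml.etree.ElementTree`. The function name `build_sitemap`, its parameters, and the idea of feeding it a dict of bad URL to real URL are my own assumptions for illustration; the language code of the spam pages is passed in as `spam_lang`.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(mapping, spam_lang, real_lang="en"):
    """Build sitemap XML from a dict of {bad URL: real URL to show instead}."""
    # Serialize with the usual prefixes: default namespace plus xhtml:.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for bad_url, good_url in mapping.items():
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = bad_url
        # One alternate for the spam language, one for the real page.
        for lang, href in ((spam_lang, bad_url), (real_lang, good_url)):
            ET.SubElement(
                url,
                "{%s}link" % XHTML_NS,
                rel="alternate",
                hreflang=lang,
                href=href,
            )
    return ET.tostring(urlset, encoding="unicode")
```

The returned string can be written to a file and submitted in Search Console. The same real URL may appear as the value for many bad URLs.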


Make sure each of these pages is returning a 410 (Gone) status rather than a 404. A 410 doesn't just tell Google the server can't find the content; it categorically says the content is no longer there. The pages will be dropped from the index faster.
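
To confirm the server really answers 410 for the removed URLs, a small check along these lines could be run against the list. This is only a sketch using Python's standard `urllib`; the names `fetch_status` and `not_gone` are hypothetical, and it sends HEAD requests and follows redirects, which may not match exactly what Googlebot does.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends for url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return err.code


def not_gone(urls, fetch=fetch_status):
    """Return (url, status) pairs for every URL that does not answer 410."""
    results = []
    for url in urls:
        status = fetch(url)
        if status != 410:
            results.append((url, status))
    return results
```

Anything `not_gone` returns (for example a URL still answering 404 or 200) is a page whose server response still needs fixing.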
