
Remove all content not related to index.html from the web and Google

@Cody1181609

Posted in: #301Redirect #Htaccess

I have seen many suggestions on various redirects, but none were simple and many had no accepted answer.

I have a site that I want to remove from Google completely, leaving only my homepage available.

The homepage uses image, CSS and JS files, so those should of course not be redirected.

My plan was to redirect every .html and .php page other than the root /index.html to the root /index.html.

Of course / should also be allowed.

So /js, /css, /img and /images should be left alone.

Any other PHP or HTML page I thought I wanted to 301 to /index.html.

This worked but, as pointed out in a comment, it does not tell Google that the content it indexed is no longer supposed to be there.

Stackoverflow: how-to-redirect-all-pages-only-to-index-html-using-htaccess-file-and-not-redirect

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/index\.html$
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule .* /index.html [L,R=301]


So my amended question is:

How do I tell Google that my content is gone and redirect all requests for that content (bookmarked pages or external links, for example) to /index.html?

Update

ErrorDocument 410 /error-docs/error410.html

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !=/
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule .* - [G]


This almost works.

But I want to return 410 for all files in 4 subfolders and whatever is under them.

I have:

/index.html
/images/
/js/
/css/
/unwantedfolder/with/stuff/imagesandhtml
/anotherunwantedfolder/with/stuff/imagesandhtml


For now I want to return 410 for all requests to anywhere in the unwanted folders.

If I add

RewriteRule ^unwantedfolder - [G]


like this

ErrorDocument 410 /error-docs/error410.html

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !=/
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule .* - [G]

RewriteRule ^unwantedfolder - [G]


nothing happens to
www.myserver.com/unwantedfolder/bla/images/someimage.png

It is served as normal, without the error page, likely due to the earlier !\.(gif|jpe?g|png|css|js)$ condition,
whereas
www.myserver.com/unwantedfolder/bla/somepage.html


does get the error410 page.


2 Comments


@Ann8826881

Following on from comments... since you want to completely remove these pages from Google's index, simply redirecting (301) them (as requested in your original question) is not necessarily the correct thing to do. A redirect says that the page has moved. Yes, Google is likely to drop the original page from the index eventually, but that could take some time. Trying to preserve PageRank by redirecting all pages to the homepage is unlikely to provide the SEO benefit you might hope for, and it is generally confusing for users.

I would suggest serving a custom 410 (Gone) for these pages, with a prominent link to the homepage (if you wish), rather than sending the user to the homepage directly - unless your homepage is your 410!?

Modifying your current .htaccess rules:

ErrorDocument 410 /error-docs/e410.html

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule . - [G]


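The single hyphen (-) in the RewriteRule substitution passes the URL through unchanged. The G (GONE) flag returns a 410 status code, which results in your custom 410 document being served. An exception for the /error-docs/ folder is also required; otherwise the request for the error document itself would match the rule and also be answered with a 410.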

CHANGE: Note that I've changed the RewriteRule pattern from .* (meaning "anything", including nothing) to a single period . (meaning "something", i.e. at least one character). This is an alternative to specifying an additional RewriteCond directive for the root URL. So, the following is unnecessary:

RewriteCond %{REQUEST_URI} !=/
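To illustrate why that condition becomes redundant, here is a rough sketch. It assumes the rules live in the .htaccess file in the document root; in that per-directory context mod_rewrite strips the leading slash before testing the RewriteRule pattern. The page.html name in the comments is hypothetical, and the fragment is trimmed to the relevant lines, so the other RewriteCond directives from above still belong in the real file:

```apache
# Assumption: this is the .htaccess file in the document root.
#   Request URL      String tested by the RewriteRule pattern
#   /                ""  (empty)
#   /page.html       "page.html"   (hypothetical page)
#
# ".*" matches the empty string, so a rule using it also fires for "/"
# and needs "RewriteCond %{REQUEST_URI} !=/" to spare the homepage.
# "." needs at least one character, so the rule below skips "/" unaided.
RewriteCond %{REQUEST_URI} !=/index.html
RewriteRule . - [G]
```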


Serving a 410 should ensure that Google removes these pages as soon as possible. You should also be able to see confirmation of this as a crawl error report in Google Webmaster Tools (GWT); yes, it is a crawl error, but it is intentional. It also gives users a meaningful message and should encourage them to update or delete their bookmarks as required.

UPDATE: As mentioned in comments, the above rules still permit all the "gallery" images (in a subfolder) to be accessed. To block the gallery images as well, we can add another RewriteRule following the directives above:

# (Above directives go here...)

RewriteRule ^gallery\d - [G]


This will block all URLs (including images) that start with /gallery1, /gallery2, etc. (Note that the / prefix is intentionally omitted from the RewriteRule pattern.) However, the directives at the top will still allow all the other images that are needed to build your homepage.

Note that this second RewriteRule is entirely separate from the previous RewriteRule and RewriteCond directives above. RewriteCond directives only apply to the single RewriteRule that follows them. So, the RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$ does not apply to this second RewriteRule.

Summary

The following is the complete set of rules:

ErrorDocument 410 /error-docs/e410.html

RewriteEngine on

# Serve 410 to all files except:
# error documents, /index.html, / (root) and static assets (images, CSS, JS)
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule . - [G]

# Serve 410 to EVERYTHING within the /unwantedfolder
# >>> including images <<<
RewriteRule ^unwantedfolder - [G]

# Serve 410 to EVERYTHING within the /anotherunwantedfolder
RewriteRule ^anotherunwantedfolder - [G]
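
One possible refinement, offered as a sketch rather than a required change: the pattern ^unwantedfolder is a prefix match, so it would also catch a sibling path such as /unwantedfolder-old (a hypothetical name). Assuming the two folder names are literal, as in the question, the two folder rules could be merged and anchored at the directory boundary:

```apache
# Replaces the two separate folder rules above.
# (/|$) ends the match at a slash or at the end of the path, so only the
# named folders and everything beneath them return 410.
RewriteRule ^(unwantedfolder|anotherunwantedfolder)(/|$) - [G]
```

As with the separate rules, no RewriteCond precedes it, so images inside these folders are included.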


@Debbie626

How about:

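RewriteRule \.(html|php)$ http://www.example.com/ [R=301,L]


This matches all requests whose path ends in .html or .php and redirects them to www.example.com. (The substitution needs the http:// scheme; without it, mod_rewrite treats www.example.com as a relative file path.)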
