Remove all content not related to index.html from the web and Google
I have seen many suggestions on various redirects, but none were simple and many had no accepted answer.
I have a site that I wish to remove completely from Google, leaving only my homepage available.
The homepage uses image, CSS and JS files, so those of course should not be redirected.
My plan was to redirect every .html and .php page other than /index.html in the root to /index.html in the root.
Of course / should also be allowed.
So /js, /css, /img and /images should be left alone.
For any other PHP or HTML page, I thought I wanted a 301 to /index.html.
This worked but, as pointed out in a comment, it does not tell Google that the content it indexed is no longer supposed to be there.
Stackoverflow: how-to-redirect-all-pages-only-to-index-html-using-htaccess-file-and-not-redirect
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/index\.html$
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule .* /index.html [L,R=301]
So my amended question is:
How do I tell Google that my content is gone, and redirect all requests for that content (for example from bookmarked pages or external links) to /index.html?
Update
ErrorDocument 410 /error-docs/error410.html
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !=/
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule .* - [G]
This almost works,
but I want to return 410 for all files in 4 subfolders and whatever is under them.
I have:
/index.html
/images/
/js/
/css/
/unwantedfolder/with/stuff/imagesandhtml
/anotherunwantedfolder/with/stuff/imagesandhtml
For now, I want to return 410 for every request to anywhere in the unwanted folders.
If I add
RewriteRule ^unwantedfolder - [G]
like this
ErrorDocument 410 /error-docs/error410.html
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !=/
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule .* - [G]
RewriteRule ^unwantedfolder - [G]
then nothing happens to
www.myserver.com/unwantedfolder/bla/images/someimage.png
It is still shown without any 410, likely due to the !\.(gif|jpe?g|png|css|js)$ condition earlier,
whereas
www.myserver.com/unwantedfolder/bla/somepage.html
does get the error410 page.
2 Comments
Following on from comments... since you are wanting to completely remove these pages from Google's index then simply redirecting (301) them (as requested in your original question) is not necessarily the correct thing to do. Redirection is saying that the page has moved. Yes, Google is likely to drop the original page from the index... eventually, but that could take some time. Trying to preserve PR by redirecting all pages to the homepage is unlikely to provide the SEO benefit you might hope for, and this is generally confusing for users.
I would suggest serving a custom 410 (Gone) for these pages, with a prominent link to the homepage (if you wish) and not actually send the user to the homepage directly - unless your homepage is your 410!?
Modifying your current .htaccess rules:
ErrorDocument 410 /error-docs/e410.html
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule . - [G]
The single hyphen (-) in the RewriteRule substitution passes the URL through unchanged. The G (GONE) flag returns a 410 status code and results in your custom 410 being served. An exception for the /error-docs/ folder is also required.
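The /error-docs/ exception is needed because the custom 410 page is itself fetched through the same .htaccess file, so without the exception it would also be answered with a 410 and never displayed. As a rough sketch of an alternative way to express that exception, a pass-through rule can be placed first instead of a RewriteCond. The paths are the same ones used above; the early-exit rule is an illustrative variation, not part of the original rules:

```apache
ErrorDocument 410 /error-docs/e410.html
RewriteEngine on

# Stop rewriting for anything under /error-docs/ so the custom 410 page
# itself is never answered with another 410.
RewriteRule ^error-docs/ - [L]

# Everything else except the homepage and static assets is "Gone".
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule . - [G]
```

Both forms should behave the same here. The early-exit rule has the side effect of also exempting /error-docs/ from any rules added further down.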
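CHANGE: Note, I've changed the RewriteRule pattern from .* (meaning "anything") to simply . (single period) (meaning "something"). In .htaccess the pattern is matched against the URL-path without its leading slash, so a request for the root (/) is an empty string and does not match a single period. This is an alternative to specifying an additional RewriteCond directive for the root URL. So, the following is unnecessary: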
RewriteCond %{REQUEST_URI} !=/
This should ensure that Google removes these pages as soon as possible. You should also be able to see confirmation of this as a crawl error report in Google Webmaster Tools (yes, it is a crawl error, but it is intentional). This also provides a meaningful message to users and should encourage them to update or delete their bookmarks as required.
UPDATE: As mentioned in comments, the above rules still permit all the "gallery" images to be accessed (in a sub folder). In order to prevent the gallery images, we can add another RewriteRule following the directives above:
# (Above directives go here...)
RewriteRule ^gallery\d - [G]
This will block all URLs (including images) that start /gallery1, /gallery2, etc. (Note that the / prefix is intentionally omitted from the RewriteRule pattern.) However, the directives at the top will still allow all the other images, necessary to build your homepage.
Note that this second RewriteRule is entirely separate from the previous RewriteRule and RewriteCond directives above. RewriteCond directives only apply to the single RewriteRule that follows them. So, the RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$ does not apply to this second RewriteRule.
Summary
The following is the complete set of rules:
ErrorDocument 410 /error-docs/e410.html
RewriteEngine on
# Serve 410 to all files except:
# error documents, /index.html, / (root) and static assets (images, CSS, JS)
RewriteCond %{REQUEST_URI} !^/error-docs/
RewriteCond %{REQUEST_URI} !=/index.html
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|css|js)$
RewriteRule . - [G]
# Serve 410 to EVERYTHING within the /unwantedfolder
# >>> including images <<<
RewriteRule ^unwantedfolder - [G]
# Serve 410 to EVERYTHING within the /anotherunwantedfolder
RewriteRule ^anotherunwantedfolder - [G]
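As written, ^unwantedfolder is a prefix match, so it would also catch a sibling such as /unwantedfolder2 if one existed. A slightly tighter sketch, assuming only the two folder names from the question, merges the two folder rules into one and requires the name to end at a slash or at the end of the path:

```apache
# Possible replacement for the two folder rules above (illustrative only).
# (/|$) means the folder name must be followed by "/" or be the whole path,
# so /unwantedfolder/a.png matches but /unwantedfolder2/a.png does not.
RewriteRule ^(unwantedfolder|anotherunwantedfolder)(/|$) - [G]
```

Like the separate rules, this has no RewriteCond in front of it, so it applies to images as well as pages.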
How about:
RewriteRule \.(html|php)$ http://www.example.com/ [R=301,L]
This matches all requests whose path ends in .html or .php and redirects them to the homepage.
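If the redirect route is still preferred, a more defensive sketch of this kind of rule might look like the following. It assumes the target is /index.html on the same host, as in the question, and excludes that file so the redirect cannot loop back onto itself. Keep in mind that a 301 only tells Google the pages have moved, not that the content is gone:

```apache
RewriteEngine on

# Leave the homepage file itself alone to avoid a redirect loop.
RewriteCond %{REQUEST_URI} !=/index.html
# Match only paths that end in a literal ".html" or ".php".
RewriteRule \.(html|php)$ /index.html [R=301,L]
```

Images, CSS and JS do not match the pattern, so the assets used by the homepage are left untouched.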