Redirect $.html to $/ mod_rewrite rule clash

@Harper822

Posted in: #Apache #Apache2 #ModRewrite #UrlRewriting

I have already remapped URLs such as /sub/test/ to test.html, but I also want to redirect /sub/test.html to /sub/test/. However, there seems to be a clash between the rules.

The .htaccess file is in the /sub directory and should apply only to /sub and any subdirectories under it.

Here's the .htaccess file:

<IfModule mod_rewrite.c>
Options -MultiViews -Indexes
RewriteEngine On
RewriteBase /sub

# redirect urls without trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !index.html
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule (.*)$ $1/ [L,R=301]

# redirect .html to url
#RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{REQUEST_FILENAME} -f
#RewriteRule (.*).html$ $1/ [L,R=301]

# remap url to a .html file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*)/$ $1.html [L]

</IfModule>


The logic is that the /sub/test.html path is redirected to /sub/test/, which is then remapped back to /sub/test.html internally.

Enabling the commented-out lines causes a redirect loop to /sub/test/, which is the desired URL. If only those lines are left in place, there is no redirect loop, so it seems that there is a conflict between the rules. What is causing the redirect loop?

Non-existent file redirect loop:

How should redirect loops for non-existent files be handled? For example, /web.html redirects to /web/, while the non-existent /web2.html ends up looking like /web.html.html.html....


@Kaufman445

Here is the final .htaccess that works almost perfectly.

<IfModule mod_rewrite.c>
Options -MultiViews -Indexes
RewriteEngine On
#RewriteBase /sub

# Assign the accessed subdirectory to an environment variable
RewriteCond %{REQUEST_URI} ^(/[^/]+)/
RewriteRule ^ - [E=SUBDIR:%1]

ErrorDocument 404 /%{ENV:SUBDIR}/404.html

# redirect urls without trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !index.html
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule (.*)$ %{ENV:SUBDIR}/$1/ [L,R=301]

# redirect index.html to sub root
RewriteCond %{THE_REQUEST} index.html
RewriteRule ^index.html$ /%{ENV:SUBDIR}/ [L,R=301]

# redirect .html to url
RewriteCond %{THE_REQUEST} (.*)\ /%{ENV:SUBDIR}/.+\.html
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule (.*).html$ $1/ [L,R=301]

# remap url to a .html file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/%{ENV:SUBDIR}/$1.html -f
RewriteRule (.*)/$ $1.html [L]

</IfModule>


The only change that would be nice is to make it more portable by making the RewriteBase dynamic. I am sure there is a way to get the /sub directory dynamically and use it instead of setting the RewriteBase. That would mean that the /sub directory could be a directory with any name, or even multiple subdirectories.


@Alves908

What is causing the redirect loop?


When the "last directive" executes, the rewriting process doesn't suddenly stop completely, only the current pass stops. The whole rewriting process starts again from the top! The process only stops completely when the URL passes through unchanged (or when it hits the END flag in Apache 2.4, as Ivo van der Veeken mentions in his answer).

The "last directive" is either the last directive in the file, or a directive that gets processed with the L (last) flag.


RewriteRule (.*)/$ $1.html [L]


So, when the above directive executes, the rewriting process starts again, but this time the request is caught by the (commented-out) directives above, which strip the .html (just added) and redirect. On the next request, the .html is added again (for the internal rewrite), but the rewriting process starts again, strips the .html and redirects, and so on.

To break the loop you can either use the END flag (Apache 2.4+) instead of L, as mentioned above, or add a condition (RewriteCond directive) to your .html redirect that checks against THE_REQUEST. This server variable holds the request line of the initial request (as sent by the client), not the rewritten request, so it fails to match once the request has been rewritten, thus breaking the "loop". This works on all versions of Apache. So, try something like:

# redirect .html to url
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /.+\.html\ HTTP/
#RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule (.+).html$ $1/ [L,R=301]


The above condition makes sure that the .html exists on the initial request only, not the rewritten request. I don't think you really need to check that it's not a directory, unless you have directories that are named <something>.html?!

THE_REQUEST looks something like:

GET /sub/test.html HTTP/1.1
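
To see why the THE_REQUEST condition breaks the loop, the two rules can be modelled in a few lines of Python. This is only a rough sketch, not how Apache is implemented: the function names (one_pass, handle), the EXISTING_FILES set and the check_the_request switch are all made up for illustration, and Python's re module stands in for Apache's regex engine.

```python
import re

# Hypothetical site layout: one real file under /sub
EXISTING_FILES = {"/sub/test.html"}
THE_REQUEST_RE = re.compile(r"^[A-Z]+ /.+\.html HTTP/")

def one_pass(path, the_request, check_the_request):
    """Run both rules once against the current path.

    Returns (kind, path) with kind in {'redirect', 'rewrite', 'none'}.
    """
    # "redirect .html to url": /sub/foo.html -> /sub/foo/ (external)
    m = re.fullmatch(r"(.+)\.html", path)
    if m and path in EXISTING_FILES:
        if not check_the_request or THE_REQUEST_RE.search(the_request):
            return "redirect", m.group(1) + "/"
    # "remap url to a .html file": /sub/foo/ -> /sub/foo.html (internal)
    m = re.fullmatch(r"(.*)/", path)
    if m and path not in EXISTING_FILES:
        return "rewrite", m.group(1) + ".html"
    return "none", path

def handle(request_path, check_the_request, max_redirects=5):
    """Return the path finally served, or None if redirects never settle."""
    for _ in range(max_redirects):
        # THE_REQUEST stays fixed for the lifetime of one client request
        the_request = f"GET {request_path} HTTP/1.1"
        kind, path = "rewrite", request_path
        while kind == "rewrite":
            # [L] ends the current pass; a changed path triggers another
            kind, path = one_pass(path, the_request, check_the_request)
        if kind == "none":
            return path
        request_path = path  # 301: the client asks again with the new URL
    return None

print(handle("/sub/test/", check_the_request=False))
print(handle("/sub/test/", check_the_request=True))
```

Without the check, the model never settles for /sub/test/ and gives up after max_redirects. With the check, the second pass sees that the original request line did not end in .html and leaves the internally rewritten path alone.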



Non-existent file redirect loop:

# remap url to a .html file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*)/$ $1.html [L]


Your rewrite directives append .html to the request whenever the requested file does not exist. So, does-not-exist/ gets rewritten to does-not-exist.html which gets externally redirected (by your first rule) to does-not-exist.html/ which gets rewritten to does-not-exist.html.html, etc. etc.

You can include an additional check to make sure that the rewritten file would exist before actually rewriting to it, e.g. RewriteCond %{DOCUMENT_ROOT}/sub/$1.html -f. The additional complexity is because your URLs end with a slash. In context:

# remap url to a .html file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/sub/$1.html -f
RewriteRule (.*)/$ $1.html [L]
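
The effect of that extra file-exists condition can be illustrated with a small Python model of the trailing-slash redirect and the remap rule. It is a simplified sketch under stated assumptions: one_pass, redirect_chain, EXISTING_FILES and the require_target switch are hypothetical names, directories are not modelled, and an internal rewrite followed by a redirect is treated as a single new request.

```python
# Hypothetical site layout: one real file under /sub
EXISTING_FILES = {"/sub/test.html"}

def one_pass(path, require_target):
    """Apply the two rules once; return (kind, path)."""
    # "redirect urls without trailing slash" (external redirect)
    if path not in EXISTING_FILES and not path.endswith("/"):
        return "redirect", path + "/"
    # "remap url to a .html file" (internal rewrite)
    if path.endswith("/") and path not in EXISTING_FILES:
        target = path[:-1] + ".html"
        # The extra condition: only rewrite if the target file exists
        if not require_target or target in EXISTING_FILES:
            return "rewrite", target
    return "none", path

def redirect_chain(path, require_target, limit=3):
    """Collect the external redirects issued, up to `limit` of them."""
    chain = []
    while len(chain) < limit:
        kind, path = one_pass(path, require_target)
        if kind == "none":
            break
        if kind == "redirect":
            chain.append(path)
    return chain

print(redirect_chain("/sub/nope/", require_target=False))
print(redirect_chain("/sub/nope/", require_target=True))
```

Without the condition, the model keeps appending .html to a non-existent URL on every round trip. With it, the request for /sub/nope/ is left alone and would simply fall through to a 404.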


EDIT: To make this more generic (to avoid having to include the /sub subdirectory), you could try changing the 3rd condition to:

RewriteCond %{REQUEST_FILENAME}.html -f


This relies on Apache internally stripping trailing slashes from the request.


make it more portable by making the RewriteBase dynamic


Providing this .htaccess file is in the same directory that you are specifying for the RewriteBase directive (which it is in this instance), then you could dynamically assign the subdirectory to an environment variable and use that in your RewriteRule substitutions.

# Removed...
#RewriteBase /sub

# Assign the accessed subdirectory to an environment variable
RewriteCond %{REQUEST_URI} ^(/[^/]+)/
RewriteRule ^ - [E=SUBDIR:%1]

# Use this environment variable as a prefix to your substitutions
:
RewriteRule (.*)$ %{ENV:SUBDIR}/$1/ [L,R=301]


Incidentally, this is only required for your external redirects. You do not need this for internal rewrites in per-directory .htaccess files, since the directory-prefix is automatically added back to relative substitutions.


That would mean that the /sub directory could be a directory with any name, or even multiple subdirectories.


For multiple subdirectories you could change the CondPattern with something like:

RewriteCond %{REQUEST_URI} ^(/.+)/


By default the regex is greedy, so everything between the first and last slash is captured. However, this has potential to break on some URLs/servers and could possibly open you up to an XSS attack, so I would code for your specific requirement, rather than being too generic.
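
The difference between the two CondPatterns can be checked quickly outside Apache. The snippet below uses Python's re module as a stand-in for Apache's regex engine (an assumption, although both treat these simple patterns the same way), and the sample URL /a/b/c/test/ and the helper name capture are made up for illustration.

```python
import re

def capture(pattern, uri):
    """Return what the first capture group would put into SUBDIR."""
    m = re.match(pattern, uri)
    return m.group(1) if m else None

uri = "/a/b/c/test/"  # hypothetical request under nested subdirectories

# Single-directory pattern: stops at the second slash
print(capture(r"^(/[^/]+)/", uri))

# Greedy pattern: runs up to the last slash in the URL
print(capture(r"^(/.+)/", uri))
print(capture(r"^(/.+)/", "/a/b/c/test.html"))
```

Note that for a URL ending in a slash the greedy capture also swallows the final path segment (test here), not just the directories above it, which is one way the generic pattern can give an unexpected prefix.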


@Samaraweera270

First off, RewriteCond %{REQUEST_FILENAME} !index.html won't work, you'll need !"index.html", with the quotes. Otherwise it's not a valid regex.

According to the docs, you should use the [END] flag instead of [L], to prevent further request processing (internally, your rewrite is reevaluated by the Apache redirect rules). Apparently, there's a bug with [END] that's only been fixed in 2.4.9 though, so you may have to find another way.

I'd also switch the order of the statements. Most of your links should point to the clean URL, so this should be a bit faster:

# remap url to a .html file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*)/$ $1.html [END]

# redirect .html to url
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule (.*).html$ $1/ [L,R=301]
