
Remove extension from URL using a rewrite without resulting in a redirect loop

@Martha676

Posted in: #301Redirect #Htaccess #ModRewrite #UrlRewriting

On the server, I have a file (on the filesystem) called page.html, which I want to be accessible as site.com/page. If someone goes to site.com/page.html, it should 301 redirect to site.com/page.

I've seen rewrite rules that will handle rewriting /page -> /page.html internally, but forcing it to 301 redirect /page.html -> /page as well causes a redirect loop for me.

The END flag looks like it can be used to do what I want, but it is not yet supported.

I've also tried using ENV as follows:

RewriteRule ^page$ /page.html [L,E=END:1]
RewriteCond %{ENV:END} !1
RewriteRule ^page.html$ /page [R=301,L]


But that results in a redirect loop as well.


3 Comments


@Ann8826881

A common way to prevent the redirect loop is to check the THE_REQUEST server variable, which contains the first line of the original HTTP request, rather than the URL-path that the RewriteRule pattern matches against, which changes as the URL is rewritten.

THE_REQUEST value does not change as the URL is internally rewritten, and contains a string of the form:

GET /page.html HTTP/1.1


So this could be rewritten as:

# Redirect direct requests for /page.html to the canonical URL
RewriteCond %{THE_REQUEST} \.html
RewriteRule ^(page)\.html$ /$1 [R=301,L]

# Rewrite the "pretty" URL to the actual filesystem path
RewriteRule ^(page)$ $1.html [L]


In this case, THE_REQUEST condition simply makes sure that .html is present in the initially requested URL. The check for page.html is left to the RewriteRule pattern - which is more efficient (since this is checked first).

It's always preferable to have any external redirects before the internal rewrites.
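
The same idea can be extended from the single page to every .html file. The following is only a sketch: it assumes the .htaccess file sits in the document root and that every .html file should get an extensionless URL. The stricter THE_REQUEST pattern is an assumption added here. It anchors the check to the URL-path, so a query string that happens to end in .html does not trigger the redirect.

```apache
RewriteEngine On

# Redirect direct requests for /<name>.html to /<name>.
# The condition looks only at the path part of the original request line,
# e.g. "GET /page.html HTTP/1.1" or "GET /page.html?x=1 HTTP/1.1".
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*\.html[?\s]
RewriteRule ^(.+)\.html$ /$1 [R=301,L]

# Internally rewrite /<name> to <name>.html, but only if that file exists.
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+)$ $1.html [L]
```

Because the second rule only fires when the matching .html file exists, requests for other resources (images, directories and so on) pass through untouched.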


@Carla537

The problem is that, when you use mod_rewrite in an .htaccess file or a <Directory> section, every RewriteRule that rewrites the URL (even an internal one) causes the request to be restarted internally, so the whole rewrite ruleset is processed again.

Thus, what's happening is that, when the user visits /page, your internal RewriteRule matches and rewrites the URL to /page.html. But that makes Apache restart the request processing and run your ruleset again, causing the external rewrite rule to match and trigger a 301 redirect back to /page.

A quick and dirty (but effective!) fix is to make your internal rewrite rule append a dummy parameter like redirect=no to the URL, and check for that parameter in the external rewrite rule. Here's an example based on this answer I wrote for a similar question on Stack Overflow:

RewriteEngine On
RewriteBase /

# Externally redirect page.html -> page, unless query includes redirect=no:
RewriteCond %{QUERY_STRING} !(^|&)redirect=no(&|$)
RewriteRule ^(page)\.html$ /$1 [NS,R=301,L]

# Internally rewrite page -> page.html, add redirect=no to query:
RewriteRule ^(page)$ $1.html?redirect=no [NS,QSA]


(Of course, feel free to replace redirect=no with something else if it conflicts with an actual URL parameter you might be using.)
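
If the server runs Apache 2.4 or later, the END flag mentioned in the question may be a cleaner alternative, since it also stops rewrite processing for the restarted request. A minimal sketch, assuming END is available on your server:

```apache
RewriteEngine On
RewriteBase /

# Externally redirect page.html -> page
RewriteRule ^(page)\.html$ /$1 [R=301,L]

# Internally rewrite page -> page.html; END prevents the ruleset
# from being run again on the rewritten URL
RewriteRule ^(page)$ $1.html [END]
```

With this variant no dummy query parameter is needed, at the cost of requiring a newer Apache version.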


@BetL925

I asked this same question on Stack Overflow. To get it to work properly, you have to use environment variables:

RewriteRule ^page$ /page.html [L,E=LOOP:1]
RewriteCond %{ENV:REDIRECT_LOOP} !1
RewriteRule ^page\.html$ /page [R=301,L]


This works because mod_rewrite makes multiple passes through your rules. During the first pass, the first rule sets the LOOP environment variable. By the time the rewritten request goes through the second pass, the variable's name has been prefixed with REDIRECT_, so the condition has to read it as REDIRECT_LOOP.
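
A related variant avoids the custom variable altogether by testing REDIRECT_STATUS, which Apache leaves empty on the initial pass and sets once the request has been internally rewritten. This is a sketch under that assumption and has not been tested against the setup in the question:

```apache
RewriteEngine On

# Only redirect when this is the initial request (REDIRECT_STATUS is empty),
# not when /page has already been rewritten to /page.html
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^page\.html$ /page [R=301,L]

# Internally rewrite /page to /page.html
RewriteRule ^page$ /page.html [L]
```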
