Ogunnowo487

URL-path with encoded question mark results in incorrect redirect when copied to target URL

@Ogunnowo487

Posted in: #301Redirect #Htaccess #ModRewrite #Redirects

I have a simple RewriteRule, appending a string to some URLs:

RewriteRule ^labels/([^/]+)/?$ /labels/$1/releases/ [R=301,L,NC]


The links are database driven, and this is working very well.
Well, apart from one of them.
The name contains ?! at the end, let's say label?!.

The links generated are correct, ie:

/labels/label%3F%21


But the redirect is applied twice, and obviously doesn't consider the encoded question mark as part of the URL. The resulting link should be:

/labels/label%3F%21/releases/


But we get instead:

/labels/label/releases/?!/releases/


I can see the rule is actually applied twice, but I am sure this can be easily solved if I get over my first problem:
Why does rewriterules see the encoded %2F as an actual query string delimiter? How can I mitigate this case?

Thank you for any hint!


@Ogunnowo487

You need the B flag to escape the backreference, and the NE (noescape) flag to prevent the resulting substitution (i.e. the escaped backreference) from being doubly encoded. For example:

RewriteRule ^labels/([^/]+)/?$ /labels/$1/releases/ [B,NE,R=301,L,NC]


You will need to clear your browser cache, as the previous (erroneous) 301 will have been cached.




Why does rewriterules see the encoded %2F as an actual query string delimiter?


Slight typo there I think... you mean %3F. Yes, you end up getting 2 redirects because...


When you request /labels/label%3F%21, the RewriteRule pattern matches against the %-decoded URL-path, i.e. /labels/label?!. According to your rule, label?! is then copied into the substitution, resulting in a redirect to /labels/label?!/releases/ (the ?! does not get re-encoded automatically). The client parses that target as a URL-path of /labels/label and a query string of !/releases/. That's where the query string comes from.
On the redirected request, /labels/label matches your RewriteRule pattern (the query string is ignored at this stage). This time just label is copied into the substitution, which becomes /labels/label/releases/. The query string from the request is then passed through to the substitution, resulting in a second redirect to /labels/label/releases/?!/releases/.
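
To make the two hops easier to follow, here is a rough Python model of them. It is only a sketch of the behaviour described above, not how mod_rewrite is implemented: the function name redirect_target is made up for illustration, and it assumes a per-directory (.htaccess) context where the pattern is tested against the decoded path without its leading slash.

```python
import re
from urllib.parse import unquote, urlsplit

# Rough model of: RewriteRule ^labels/([^/]+)/?$ /labels/$1/releases/ [R=301,L,NC]
PATTERN = re.compile(r"^labels/([^/]+)/?$", re.IGNORECASE)  # NC = case-insensitive


def redirect_target(request_url):
    """Return the redirect target the rule would produce, or None if no match."""
    parts = urlsplit(request_url)
    # The pattern sees the %-decoded path only; the query string is ignored here.
    decoded_path = unquote(parts.path).lstrip("/")
    match = PATTERN.match(decoded_path)
    if match is None:
        return None
    # The backreference is copied as-is, so "?" and "!" are not re-encoded.
    target = "/labels/" + match.group(1) + "/releases/"
    if "?" in target:
        # The substitution now carries its own query string.
        return target
    # Otherwise the query string of the request is passed through unchanged.
    return target + ("?" + parts.query if parts.query else "")


first = redirect_target("/labels/label%3F%21")
second = redirect_target(first)
third = redirect_target(second)
print(first)   # /labels/label?!/releases/
print(second)  # /labels/label/releases/?!/releases/
print(third)   # None: the path no longer matches, so the chain stops
```

The third call returns None because /labels/label/releases/ contains an extra path segment that ([^/]+) cannot match, which is why you see exactly two redirects and not a loop.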


The B flag escapes the backreference (the captured sub-pattern) before it is copied into the substitution, e.g. label?! is escaped to become label%3F%21.

And the NE flag prevents the % being encoded as %25 (ie. effectively doubly encoding the backreference). eg. label%3F%21 would otherwise be encoded as label%253F%2521.
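
The two encoding steps can be reproduced with Python's urllib.parse.quote. This only approximates what the flags do (the exact set of characters mod_rewrite escapes is not claimed here); it is meant to show where %253F%2521 comes from.

```python
from urllib.parse import quote

captured = "label?!"  # what ([^/]+) captures from the %-decoded URL-path

# Roughly what B does: percent-encode the special characters in the backreference.
escaped = quote(captured, safe="")

# Roughly what happens without NE: the "%" itself is encoded again as %25.
double_escaped = quote(escaped, safe="")

print(escaped)         # label%3F%21
print(double_escaped)  # label%253F%2521
```

With both B and NE set, the redirect target keeps the first form, which is the /labels/label%3F%21/releases/ URL you wanted.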

Aside: mod_rewrite does not automatically encode the first ? in the substitution since it is assumed this starts the query string. However, subsequent ? will get automatically encoded. eg. Given RewriteRule ^foo$ /bar??? [R,L], a request for /foo results in a redirect to /bar?%3f%3f (note the second two ? are URL encoded, but the first one isn't).
