URL-path with encoded question mark results in incorrect redirect when copied to target URL
I have a simple RewriteRule, appending a string to some URLs:
RewriteRule ^labels/([^/]+)/?$ /labels/$1/releases/ [R=301,L,NC]
The links are database driven, and this is working very well.
Well, apart from one of them.
The name contains ?! at the end, let's say label?!.
The links generated are correct, ie:
/labels/label%3F%21
But the redirect is applied twice, and obviously doesn't consider the encoded question mark as part of the URL. The resulting link should be:
/labels/label%3F%21/releases/
But we get instead:
/labels/label/releases/?!/releases/
I can see the rule is actually applied twice, but I am sure this can be easily solved if I get over my first problem:
Why does rewriterules see the encoded %2F as an actual query string delimiter? How can I mitigate this case?
Thank you for any hint!
More posts by @Ogunnowo487
1 Comments
You need the B flag to escape the backreference and the NE (noescape) to prevent the resulting substitution (ie. the backreference) being doubly encoded. For example:
RewriteRule ^labels/([^/]+)/?$ /labels/$1/releases/ [B,NE,R=301,L,NC]
You will need to clear your browser cache, as the previous (erroneous) 301 will have been cached.
Why does rewriterules see the encoded %2F as an actual query string delimiter?
Slight typo there I think... you mean %3F. Yes, you end up getting 2 redirects because...
when you request /labels/label%3F%21, the RewriteRule pattern matches against the %-decoded URL-path, i.e. /labels/label?!. According to your rule, label?! is then copied into the substitution, resulting in a redirect to /labels/label?!/releases/ (the ?! does not get re-encoded automatically). That URL is parsed as a URL-path of /labels/label and a query string of !/releases/, which is where the query string comes from.
On the redirected request, /labels/label matches your RewriteRule pattern (the query string is ignored at this stage). This time just label is copied into the substitution, to become /labels/label/releases/. And then the query string from the request is passed through to the substitution, to result in a second redirect to /labels/label/releases/?!/releases/.
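The two-step redirect described above can be reproduced with a rough simulation. This is only a sketch in Python, not how mod_rewrite is implemented: the function name simulate_redirect is hypothetical, the regex stands in for the RewriteRule pattern in a per-directory (.htaccess) context, and urlsplit/unquote stand in for Apache's own URL parsing and decoding.

```python
import re
from urllib.parse import unquote, urlsplit

# Rough stand-in for (assumed rule, per-directory context):
#   RewriteRule ^labels/([^/]+)/?$ /labels/$1/releases/ [R=301,L,NC]
PATTERN = re.compile(r"^labels/([^/]+)/?$", re.IGNORECASE)


def simulate_redirect(request_url):
    """Return the redirect target for request_url, or None if the rule does not match."""
    parts = urlsplit(request_url)
    # The pattern is tested against the %-decoded URL-path only,
    # without the leading slash and without the query string.
    decoded_path = unquote(parts.path).lstrip("/")
    match = PATTERN.match(decoded_path)
    if match is None:
        return None
    # The captured text is copied into the substitution as-is (no re-encoding).
    target = "/labels/" + match.group(1) + "/releases/"
    # The substitution has no query string of its own, so the query
    # string of the request is passed through unchanged.
    if parts.query:
        target += "?" + parts.query
    return target


first = simulate_redirect("/labels/label%3F%21")
second = simulate_redirect(first)
third = simulate_redirect(second)

print(first)   # /labels/label?!/releases/
print(second)  # /labels/label/releases/?!/releases/
print(third)   # None (the rule no longer matches, so the chain stops)
```

The second call shows why the unencoded ? matters: once it appears literally in the redirect target, everything after it is treated as the query string rather than as part of the URL-path.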
The B flag escapes the captured pattern. eg. label?! is escaped to become label%3F%21.
And the NE flag prevents the % being encoded as %25 (ie. effectively doubly encoding the backreference). eg. label%3F%21 would otherwise be encoded as label%253F%2521.
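The effect of the two flags on the example value can be illustrated with Python's urllib.parse.quote. This is only an analogy: quote is used here as a stand-in for the escaping Apache performs, and the exact set of characters Apache escapes may differ.

```python
from urllib.parse import quote

captured = "label?!"  # the %-decoded backreference, as the pattern sees it

# Roughly what B does: escape the backreference before it is substituted.
escaped_once = quote(captured, safe="")

# Roughly what happens without NE: the redirect target is escaped again,
# so each % introduced by the first pass becomes %25.
escaped_twice = quote(escaped_once, safe="")

print(escaped_once)   # label%3F%21
print(escaped_twice)  # label%253F%2521
```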
Aside: mod_rewrite does not automatically encode the first ? in the substitution since it is assumed this starts the query string. However, subsequent ? will get automatically encoded. eg. Given RewriteRule ^foo$ /bar??? [R,L], a request for /foo results in a redirect to /bar?%3f%3f (note the second two ? are URL encoded, but the first one isn't).