A more specific question about a tricky redirect case

@Vandalay111

Posted in: #ModRewrite #UrlRewriting

Me again. :) I will try to keep this more to the point.

Synopsis: I am trying to redirect URLs like this:
desktopscenes.com/Scenes from Yellowstone (2003)/slides/Morning Glory Pool.html

to this:

desktopscenes.com/Scenes_from_Yellowstone_-_2003/slides/Morning_Glory_Pool.htm

And I mostly have code that does this. And even inserts the "slides" part if it is left out. Here's the code.

RewriteEngine On
RewriteCond %{HTTP_HOST} ^(desktopscenes\.com|www\.desktopscenes\.com)$ [NC]
RewriteRule ^(Scenes.*)\s(.*)$ $1_$2 [N,NE]
RewriteRule ^(Scenes.*)\((.*)$ $1-_$2 [NE]
RewriteRule ^(Scenes.*)\)(.*)/slides/(.*)html$ http://www.desktopscenes.com/$1$2/slides/$3htm [NE,R=301,L]
RewriteRule ^(Scenes.*)\)(.*)/slides/(.*)jpg$ http://www.desktopscenes.com/$1$2/slides/$3jpg [NE,R=301,L]
RewriteRule ^(Scenes.*)\)/([^/]*)jpg$ http://www.desktopscenes.com/$1/slides/$2jpg [NE,R=301,L]
RewriteRule ^(Scenes.*)\)/([^/]*)html$ http://www.desktopscenes.com/$1/slides/$2htm [NE,R=301,L]


It works fine IF I get a "well-formed" request passed. The problem is that when I pass a URL that doesn't match one of my "L" patterns, I get an endless loop. I thought that it would just fall off the end of the RewriteRules and return whatever the URL was at that point. Was I wrong about that?

For example, if I take the "(2003)" out of the request like so:

www.desktopscenes.com/Scenes from Yellowstone/Morning Glory Pool.jpg


Then the rewrite trace log fills with zillions of entries showing that it is creating cascading junk like this:

/Scenes from Yellowstone/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg/Morning Glory_Pool.jpg ...


Each line has another entry.

So, clearly I am doing something very wrong here. Does anyone know what it is?

All I want is that if the URL matches my pattern it does the rewrite, and otherwise, it just leaves it alone. :)

Thanks in advance for your patience and help!

Something I tried

I put this after the other lines:

RewriteRule ^(.*)$ $1 [L]


Figuring it would just be "if you got here, then stop." Didn't work.

I am missing something fundamental here. Halp! :)

Info from Log

Here's a line from the rewrite log (IP omitted)

[15/Sep/2015:22:06:57 -0400] [www.desktopscenes.com/sid#80111ced0][rid#80a25a0a0/initial] (3) [perdir {my user folder}] applying pattern '^(Scenes.*)\s(.*)$' to uri 'Scenes from Yellowstone/Morning Glory_Pool.html/Morning Glory_Pool.html/Morning Glory_Pool.html/Morning Glory_Pool.html/Morning Glory_Pool.html/Morning Glory_Pool.html/Morning Glory_Pool.html


..... this "/Morning Glory_Pool.html" is appended hundreds of times.

This suggests it is getting stuck on the first RewriteRule. Even though it's not actually replacing all the spaces. Bear in mind this works fine if the URL is in the expected format. None of this makes sense to me, anyone else?

More Info from Another Experiment

I enabled all the rules except the first one, which replaces whitespace ("\s") with underscores. With that done, the problematic URL no longer causes an infinite loop. Of course, the rest of the script no longer does what's necessary, but that's a piece of info.

This seems like it should be the simplest part of this. I'm baffled.

And Another Try

Apparently you can limit the "N" flag with a number, so I set it to "N=20". Still locks up with an infinite loop.

More Oddness

Even if I remove the "N" from the initial rule, which means it should only replace ONE instance of a space with an underscore, the rule is called repeatedly and they all get replaced. Except sometimes I then get a server error. The rewrite_log shows "internal redirects". I am lost.

Yet Another Experiment

To try to simplify this down as much as possible, I stripped everything down to just this:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^(desktopscenes\.com|www\.desktopscenes\.com)$ [NC]
RewriteRule ^(Scenes.*)\s(.*)$ $1_$2 [L]


In theory, this should replace one instance of white space with an underscore and stop.

As a test, I put this in my browser:
www.desktopscenes.com/Scenes%20from%20Nowhere/Test%20This%20Else.jpg

And what I got was this error message:

"The requested URL /Scenes_from_Nowhere/Test_This_Else.jpg was not found on this server."


Notice all the spaces have been turned to underscores!

This one line also generated 46 lines in a rewrite_log. I saved them to this file: www.desktopscenes.com/rewrite_log_one_request.txt
Notice how it says "INTERNAL REDIRECT" several times, and keeps running the same rule over and over again. It's even ignoring the "[L]" flag.

Slowly Learning Here

Okay, I now understand that this is inherently iterative, so it will keep calling the same code over and over regardless.
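
A minimal sketch of one common way to keep a rule from firing again on those later passes, assuming the .htaccess sits in the document root. The REDIRECT_STATUS environment variable is empty on the initial request and is set once Apache performs an internal redirect, so a condition on it limits the rule to the first pass. This is an illustration under that assumption, not something verified against this site's configuration.

```apache
RewriteEngine On

# REDIRECT_STATUS is empty only on the initial request; after the first
# internal redirect it is set, so the rule below is skipped on re-entry.
RewriteCond %{ENV:REDIRECT_STATUS} ^$

# Replace a single whitespace character with an underscore, then stop
# processing rules for this pass.
RewriteRule ^(Scenes.*)\s(.*)$ $1_$2 [L]
```

With this guard in place, the [L] rule should behave the way the stripped-down test expected: one substitution, then no further rewriting of the already-rewritten URL.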

I guess what I don't understand is why in the error cases it is attaching another copy of the text after the last "/" character, which is causing the endless looping.

Man this is confusing.

Add Path Info Postfix?

This is being done over and over in the case where the errors occur. It is what is causing the infinite looping. But it only happens sometimes. How do I stop it?

Someone on another thread mentioned "Options -MultiViews" but I am not sure what other effects that would have.
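
Separately from the MultiViews suggestion, mod_rewrite has a DPI (discardpath) flag aimed at this symptom. In per-directory context the old PATH_INFO is appended to the URI before each rule is applied, so a rule that copies the whole URI into its substitution can pick up another copy on every pass. A hedged sketch, assuming the server runs Apache 2.2.12 or later (where DPI is available) and that the rest of the rule set stays as posted:

```apache
RewriteEngine On

# N restarts the rule set after each substitution so every space gets
# replaced; DPI discards the PATH_INFO left over from the original
# filesystem mapping so it is not appended again on the next pass.
RewriteRule ^(Scenes.*)\s(.*)$ $1_$2 [N,NE,DPI]
```

If the "add path info postfix" lines disappear from the rewrite log after this change, that would point to the repeated PATH_INFO as the source of the cascading "/Morning Glory_Pool.jpg" copies.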

Thanks.


@Chiappetta492

I would approach this another way since .htaccess is gonna be messy. Part of creating code is keeping it maintainable. No-one is ever gonna touch that code again.

I'd use the help of another language like PHP (or another language you like) for this:

# Redirect if the URL isn't what I'd like (it contains whitespace)
RewriteCond %{REQUEST_URI} \s
RewriteRule ^(.*) pretty_url.php?url=$1 [L]


The contents of pretty_url.php:

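<?php
// Replace each run of whitespace with a single underscore.
$url = preg_replace('/\s+/', '_', $_GET['url']);
$url = strtolower($url);
// Assumes the .htaccess sits in the document root, so the captured path has no leading slash.
header("Location: /" . ltrim($url, "/"), true, 301); // permanent redirect
exit;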


The permanent redirect tells the SEO bots that "This will be the URL used for this page from now on."
