
Handling bots that request URLs in paths

@LarsenBagley505

Posted in: #Connections #HttpHeaders #Proxy

In my server log, I found at least one IP address requesting a full URL where I would expect a path. For example, the request line the client sends to my server is this:

GET www.3rdpartysite.com/file.php HTTP/1.1


I was expecting the request to look more like this:

GET /path/to/file.php HTTP/1.1
Host: example.com


This made me think hackers are trying to break into my website, but www.w3.org/Protocols/rfc2616/rfc2616-sec5.html says that the first form of GET request is valid for proxies.

My server has cPanel and WHM installed, but I don't use proxies for my website. My question, then: if I force Apache to return an error or a redirect for all HTTP requests beginning with...

GET

...and I require remote systems to send requests in this format...

GET /path/to/resource HTTP/x.x
Host: example.com


...would that work with all web browsers, or would at least one legitimate browser break?

I just have a feeling some hacker is using my server to connect to another.


2 Comments


 

@Eichhorn148

The HTTP/1.1 spec is very clear that

GET /path/to/resource HTTP/1.1
Host: example.com


and

GET http://example.com/path/to/resource HTTP/1.1


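are equivalent requests. This is because the request starts with a Request-Line, which RFC 2616 defines as Method SP Request-URI SP HTTP-Version CRLF, and the Request-URI may be absolute: Request-URI = "*" | absoluteURI | abs_path | authority. The same section says that all HTTP/1.1 servers MUST accept the absoluteURI form, even though HTTP/1.1 clients only generate it in requests to proxies.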
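
To make the four Request-URI forms concrete, here is a minimal Python sketch that classifies a request line. The function name classify_request_line and the "invalid" label are made up for illustration, and the parsing is deliberately loose rather than a full RFC 2616 implementation.

```python
from urllib.parse import urlsplit

def classify_request_line(line):
    """Return (form, host, path) for an HTTP/1.x request line.

    Hypothetical helper: a loose illustration of the Request-URI forms,
    not a complete or strict parser.
    """
    method, target, version = line.split()
    if target == "*":
        return ("asterisk", None, None)        # e.g. OPTIONS * HTTP/1.1
    if target.startswith("/"):
        return ("abs_path", None, target)      # the form browsers normally send
    if "://" in target:
        parts = urlsplit(target)
        # The host is carried in the Request-URI itself.
        return ("absoluteURI", parts.hostname, parts.path or "/")
    if "/" not in target:
        return ("authority", target, None)     # host:port, used by CONNECT
    return ("invalid", None, None)             # matches none of the four forms
```

With this sketch, both request lines shown above resolve to the same path, /path/to/resource. The only difference is whether the host comes from the Request-URI or from the Host header.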

You should not try to configure your web server to respond differently to the different formats of requests. You would be breaking the spec. While browsers today typically use the former request format, there is no guarantee that they will continue to do so in the future. You don't want your website to suddenly stop working with the latest version of some browser.



You should instead ensure that your server does not serve content for unknown hosts. A request for any third-party site should return 404 Not Found (or possibly even 400 Bad Request). Bots that request third-party sites are typically testing for open proxy servers.

One way to configure your web server to do so is to configure the first (default) virtual host to return a 404 page. Every legitimate site would be in a later virtual host directive.
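
The "default host returns 404" idea can be sketched outside Apache as well. The following is a small Python illustration of the decision logic only, not Apache configuration. KNOWN_HOSTS, effective_host, status_for and the port are assumptions chosen for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Assumption: the hostnames this server legitimately serves.
KNOWN_HOSTS = {"example.com", "www.example.com"}

def effective_host(request_target, host_header):
    """Host named by the request: from an absolute URI if present, else from Host."""
    try:
        if "://" in request_target:
            return urlsplit(request_target).hostname or ""
        # Prefix "//" so urlsplit treats the header value as host[:port].
        return urlsplit("//" + (host_header or "").strip()).hostname or ""
    except ValueError:
        return ""  # malformed host, treat as unknown

def status_for(request_target, host_header):
    """200 for a known host, 404 for anything else (the 'default virtual host')."""
    return 200 if effective_host(request_target, host_header) in KNOWN_HOSTS else 404

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path holds the request target exactly as the client sent it.
        status = status_for(self.path, self.headers.get("Host"))
        body = b"ok\n" if status == 200 else b"not found\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8080):
    """Run the sketch locally; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Note that both request formats for a known host are treated the same way here, so nothing breaks for a client that sends an absolute URI for your own site. Only requests naming a host you do not serve get the 404.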



 

@LarsenBagley505

These happen all the time. I see this at least a dozen times a day in my server logs. Your best bet is to block the connection at the firewall or gateway so that it never reaches your server. Otherwise, if it isn't causing you much hassle and you aren't seeing other errors related to this connection, you can pretty safely ignore it.
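
If you do want to block at the firewall, you first need the offending addresses. Here is a rough Python sketch that counts them from an access log, assuming the log is in Apache's common or combined format. The function name proxy_probe_ips is hypothetical and the regular expression is only an approximation of that format.

```python
import re
from collections import Counter

# Assumption: common/combined log format, e.g.
# 203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET http://host/x HTTP/1.1" 404 196
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "([A-Z]+) (\S+)[^"]*"')

def proxy_probe_ips(lines):
    """Count, per client IP, requests whose target is neither a path nor "*"."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        ip, method, target = m.groups()
        if target != "*" and not target.startswith("/"):
            hits[ip] += 1
    return hits
```

This also counts absolute-URI requests for your own site, so check the list by hand before feeding any address to a firewall rule.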
