
What does it mean when a User-Agent has another User-Agent inside it?

@Reiling115

Posted in: #Browsers #CrossBrowser #HttpHeaders #InternetExplorer6 #UserAgent

Basically, sometimes a user agent will have its normal string displayed, and then at the end it will have the "User-Agent: " tag, followed immediately by another user agent. Sometimes the second user agent is just appended to the first one without the "User-Agent: " tag.

Here are some samples I've seen. The first few contain the "User-Agent: " tag somewhere in the middle, and I've changed its font to make it easier to see.


Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0; GTB6;
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1);
SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6;
MRA 5.10 (build 5339); User-agent: Mozilla/4.0 (compatible; MSIE 6.0;
Windows NT 5.1; SV1); .NET CLR 1.1.4322; .NET CLR 2.0.50727)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0;
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1);
.NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0;
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1);
.NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152)


Here are some without the "User-Agent: " tag in the middle, just two user agents that seem stitched together.


Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0;
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); .NET CLR
3.5.30729)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6;
IPMS/6568080A-04A5AD839A9; TCO_20090713170733; Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; SV1); InfoPath.2)


Now, just to add a few notes to this. I understand that "User-Agent: " is normally a header name, and that what follows it is the actual user agent sent to servers. Normally the "User-Agent: " string itself should not be part of the user agent; it is more like a prefix, a tag indicating that what follows is the actual user agent.
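For reference, this is roughly what the relevant part of a raw HTTP request looks like (host and value here are illustrative): the "User-Agent:" name belongs to the HTTP framing, and only what follows it on the line is the value the server records.

```
GET / HTTP/1.1
Host: example.com
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)
```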

Additionally, I had thought these might just be two user agents pasted together, but on closer inspection you realize they are not. In all of these dual user agent listings, if you look at the opening bracket "(" just before the "compatible" keyword, you realize its matching pair ")" is actually at the very end, after the second user agent. So the first user agent's closing bracket ")" never occurs before the second user agent begins; it is always right at the end, and therefore the second user agent is more like one of the features of the first user agent, like "Trident/4.0" or "GTB6".
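The bracket observation can be checked mechanically. This is my own quick sketch (not from the post), using the third sample quoted above: find the ")" that matches the first "(" and confirm it is the last character of the string, i.e. the second user agent is nested inside the first one's parenthesised feature list.

```python
# Third sample from above, joined back onto one line.
ua = ("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; "
      "User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); "
      ".NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)")

def close_of_first_paren(s):
    """Return the index of the ')' that matches the first '(' in s, or -1."""
    depth = 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1

# The matching ")" is the final character: the entire second user agent
# sits inside the first one's bracket, just like "Trident/4.0" or "GTB6".
print(close_of_first_paren(ua) == len(ua) - 1)  # True
```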

The other thing to note is that the second user agent is always MSIE 6.0 (Internet Explorer 6.0), which is interesting.

I had initially thought it was some sort of virtual machine displaying both the browser in use and the browser that is installed, but then I wondered what the point of that would be.

Finally, right now I am thinking it's probably some sort of "Compatibility View" type thing: even if MSIE 7.0 or 8.0 is installed, when my hypothetical "Display in Internet Explorer 6.0" mode is turned on, the user agent changes to something like this. That is, IE 8.0 is installed but is rendering everything as IE 6.0 would.

Is there or was there such a feature in Internet Explorer? Am I on to something here? What are your thoughts on this? If you have any other ideas, please feel free to let us know.

At the moment, I'm just trying to understand whether these are valid user agents or invalid ones. In a list of about 44,000 user agents, I've seen this type of dual user agent about 400 times. I've closely inspected 40 of them, and in all but one, the "second" user agent was MSIE 6.0 and the first was a higher version of MSIE, such as 7 or 8. In the single exception, both user agents were MSIE 8.0; here it is:


Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0;
Mozilla/4.0 (compatible; MSIE 8.0; Win32; GMX); GTB0.0)


This occurred once in my 40 "close" inspections. I estimated the 400 in 44,000 by taking a sample of the first 4,400 user agents, finding 40 of these among the MSIE/Windows user agents, and extrapolating that to 400. Similar things also occurred for non-MSIE user agents, where there were two Mozillas in one user agent; the non-MSIE ones would probably add another 30% on top of the ones I've noted. I can show samples of them if anyone would like.

There we have it, this is where I'm at, what do you guys think?

4 Comments

 

@Angela700

A "User-Agent:" tag inside the string is not standard; I haven't found any justification for it, so I personally block it. I've spent tons of hours analyzing IPs, agent strings, referrers, etc.

If it helps others:

I use cascaded procedures to isolate the bad traffic, keeping verification time and server resource use as low as possible:

1) ROBOTS.txt

With the most complete list of bad bots I could get, to Disallow them and cut down bad traffic.
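For example, a fragment along these lines (the bot name here is just an illustration, and Disallow only deters crawlers that actually honor robots.txt):

```
# Hypothetical robots.txt excerpt
User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow: /wp-admin/
```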

2) SESSION_Start

Routine with arrays of bad stuff:


URLs meant to hack directly: "xmlrpc|wordpress|wp-admin|..."
Good bots: "baidu|bingpreview|bingbot|adidxbot|googlebot.com|..."
Common bad strings in Agent: "user-agent:|genieo|majestic12|..."
Large list of NOT common bad strings: "abcdatos|almaden|amsu.ru|chaos|..."
(this list can have 3,000 strings easily)


a) If the request is a bad URL, the user gets error 401.3. This traps wp-admin, xmlrpc.xml and other URL attacks.

b) If a "Good bots" string is in the User-Agent, bypass the rest of the checks.

c) Loop over the short list of known bad agents; if one is found, send error 403.1.

d) If nothing matched the short list, loop over the large list of bad strings and send 401.3
(end of the road for that user).
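A minimal sketch of steps b) through d), in Python rather than whatever the poster's server actually runs; step a), the bad-URL check, is omitted, and the arrays are abbreviated versions of the lists described above.

```python
# Abbreviated versions of the arrays from the post; the real lists are
# much longer (the bad-strings list "can have 3,000 strings easily").
GOOD_BOTS = ("baidu", "bingpreview", "bingbot", "adidxbot", "googlebot")
BAD_AGENTS = ("user-agent:", "genieo", "majestic12")
BAD_STRINGS = ("abcdatos", "almaden", "amsu.ru", "chaos")

def classify(user_agent):
    """Return an HTTP error code to send, or None to let the request through."""
    ua = user_agent.lower()
    if any(bot in ua for bot in GOOD_BOTS):
        return None   # step b: recognised good bots bypass the rest
    if any(bad in ua for bad in BAD_AGENTS):
        return 403    # step c: short list of known bad agents -> 403.1
    if any(bad in ua for bad in BAD_STRINGS):
        return 401    # step d: long list of bad strings -> 401.3
    return None       # nothing matched: allow

# An embedded "User-agent:" tag trips the short bad-agent list:
print(classify("Mozilla/4.0 (compatible; MSIE 8.0; "
               "User-agent: Mozilla/4.0 (compatible; MSIE 6.0))"))  # 403
```

Checking the whitelist first is what keeps the cost low: the expensive long-list scan only runs for traffic that is neither a known good bot nor an already-known bad agent.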

3) MONITOR page

When a string is found, the IP, country, string, URL, referrer and agent are kept in memory and shown in red on a real-time monitor page, to be reviewed.

Example: a match on "puffin" or "user-agent:" may be a mistake, so each is researched to see whether it is part of a legitimate browser, software, TV, device, tablet or phone string, and the arrays are corrected accordingly.

Having a huge list of browsers and versions helps to verify the agent string.

CONCLUSION:

Among other things, I found that the embedded "User-Agent:" strings happen at certain hours of the day; the IPs belong mostly to the US and cluster around the same networks and geo-locations, and they correlate closely with the same browser type, version and compatibility flags.

After denying "User-Agent:" for a week, the traffic stops coming for days or weeks at a time; it seems that whatever is doing it got the 403.1 Forbidden and stopped insisting.

Analyzing the string at www.useragentstring.com/index.php shows "user-agent:" as an unknown feature, which, among other clues, helped determine that this is neither a common user nor a good spider.



 

@Cofer257

Almost certainly this is a bug in the software that is accessing your web site - a browser, a browser plugin, or a script/command-line program like wget. It looks like someone tried to modify the User-Agent header but instead pasted a complete header line inside the existing value.

Unless the user's particular IP is causing problems overloading the server, there is no real action you can take.



 

@Rambettina238

Take a look at the accepted answer on Stack Overflow; you will find links to the standard definitions, RFC 2616 and RFC 1945.

From my own experiments while developing a visitor tracking plugin, I can add that you will find lots and lots of wrong and fake user-agent headers in your logs.

Example: if someone spiders your site with command-line tools like wget, there is a user-agent-string parameter that lets them send anything they like. Lots of SEO spider tools also let you set individual strings, if only to hide themselves. Some privacy plugins fake headers (or let you change them). Script-kiddie hacker tools fake them too. Oh, yes, and if you have votes on your site that are important (say, people can win something, or it creates some hype), you will see lots of fakes :-)
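To illustrate how trivially any client can claim any user agent (the same idea as wget's --user-agent option), here is a Python sketch using only the standard library; the URL and the UA string are made up for illustration.

```python
import urllib.request

# Build a request claiming an arbitrary User-Agent; nothing is sent here,
# we just show that the header is whatever the client chooses to put in it.
req = urllib.request.Request(
    "http://example.com/",
    headers={"User-Agent": "Mozilla/4.0 (totally made up string)"},
)
print(req.get_header("User-agent"))  # Mozilla/4.0 (totally made up string)
```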



 

@Pope3001725

Check these pages out:
www.user-agents.org/
whatsmyuseragent.com/CommonUserAgents.asp
They show the user agents you've listed above with a description of what they are.


