Interpretation of empty User-agent

How should I interpret an empty User-agent? I have some custom analytics code that must analyze only human traffic. I have a working list of User-agents denoting human traffic and bot traffic, but the empty User-agent is proving problematic, and I am getting a lot of traffic with an empty User-agent: about 10%.

Additionally, I built my human-versus-bot User-agent list by analyzing my current logs, so I might be missing a lot of entries. Is there a well-maintained list of User-agents denoting bot traffic or, conversely, a list of User-agents denoting human traffic?



2 Comments


@Ogunnowo487

I work for a security company and among other things we monitor Bad Bot traffic.

In my experience, seemingly human visits with blank user-agent data indicate scraping or spamming attempts (usually scraping) made by "headless browser" bots.

These visitors can sometimes execute JS, so they will appear in GA. Still, this does not make them human :)

Apologies for the "plug", but please know that, if needed, we offer free Bad Bot protection services, coupled with CDN acceleration and other goodies.

In this specific case our system would recognize the visit as "suspicious", verify it against known attack vectors and, if still unsure, perform further tests and challenges. These challenges are performed seamlessly, without causing any delay to the session.


@Phylliss660

If you want to analyze only "human traffic", I would not count requests with an empty or missing User-agent string. In my experience almost every browser always sends one. Even most privacy plugins or extensions fake the UA string (e.g. report a different OS or client name), "normalize" it (e.g. drop release numbers), or randomize it (e.g. sometimes FF, sometimes IE strings) rather than remove it completely, since removing it might cause problems with sites that rely on it, even if relying on it is not a good idea.
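As a rough illustration of that rule, here is a minimal Python sketch that drops requests whose User-Agent header is missing, empty, or only whitespace before any further human/bot matching. The function name `is_candidate_human` and the plain-mapping representation of headers are assumptions for the example, not part of any particular analytics tool.

```python
from typing import Mapping


def is_candidate_human(headers: Mapping[str, str]) -> bool:
    """Return False when the User-Agent header is missing, empty, or blank.

    Hypothetical helper: a True result only means the request is worth
    checking against your human/bot lists, not that it is human.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent")
    return bool(user_agent and user_agent.strip())
```

A request that passes this check still has to be matched against your own lists; the sketch only excludes the empty-UA bucket from the human count.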

A simple request with no UA can be done like this:

wget --user-agent="" example.com

As you can see, you can set anything you want. Sites that store and publish UAs found "in the wild" are not of great use, as they collect lots of crap.

Maybe someone just recursively fetched your content, or used some SEO tool to analyze your site (some let users change the header manually, others do so with the intent to ignore a robots.txt line). Things like that. In those situations the UA header is often faked to hide the client and its purpose.

If these requests keep coming, it might be helpful to analyze the headers further (proxies?) or the IPs (a certain block? a privacy-concerned company or proxy?).
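One way to check whether the empty-UA requests come from "a certain block" is to count them per network. The sketch below assumes you have already parsed your logs into `(ip, user_agent)` pairs with IPv4 addresses; the function name `empty_ua_blocks` and the /24 grouping are illustrative choices, not a standard.

```python
import ipaddress
from collections import Counter
from typing import Iterable, Optional, Tuple


def empty_ua_blocks(
    requests: Iterable[Tuple[str, Optional[str]]], prefix: int = 24
) -> Counter:
    """Count requests with an empty or missing UA per IPv4 network block."""
    counts: Counter = Counter()
    for ip, user_agent in requests:
        # Skip requests that carry a non-blank User-Agent.
        if user_agent and user_agent.strip():
            continue
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts


# Example with addresses from the reserved documentation ranges.
sample = [
    ("192.0.2.10", ""),
    ("192.0.2.77", None),
    ("198.51.100.5", "Mozilla/5.0"),
    ("203.0.113.9", "  "),
]
blocks = empty_ua_blocks(sample)
```

Calling `blocks.most_common()` then lists the busiest blocks first, which gives you a starting point for looking up who owns them.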

