[cabfpub] More changes to proposed policy update

Bjørn Vermo bv at norbionics.com
Sat May 26 12:59:07 MST 2012



Saturday, May 26, 2012, 6:27:25 PM, Ryan Hurst wrote:

> Wen-Cheng Wang,



> Back when I was at CyberSafe and Valicert (mid to late 90s) I saw stuff like
> that too, I do not doubt there are a handful of people out there running
> stuff from this era still but thank goodness it’s just a handful.



> One of the coolest parts of the Internet in 2012 is we now have real and
> accurate statistics
> <http://gs.statcounter.com/#browser-ww-monthly-201104-201204-bar>  on what
> is out there (we really did not back then), for example in the case of
> browsers on the Internet (which is what “publicly trusted” certificates
> are about) we know that less than .7% of the internet uses “Other” types
> (not Firefox, Safari, Opera, Chrome IE).
...

I think you are overly optimistic if you believe we have anything near
"real and accurate" statistics of web client usage.

There are many factors that can cause such statistics to be off by up
to an order of magnitude in some cases, especially for the smaller or
more specialised clients. This can also teach us a thing or two about
the consequences of using a field for something other than what it was
intended for.

The main reason such statistics are crap is a phenomenon called
spoofing. The User-Agent string in HTTP worked fine as long as it just
said which version of Mosaic the request came from and the NCSA servers
logged that. Then, with Netscape, we got a commercial browser. They also
made servers and tools, and had a business model based on adding value
through extensions to the standard. Therefore, their servers parsed the
user agent string to see if they were dealing with one of their own
"enhanced" clients.

Enter Internet Explorer. In order to work with those browser-sniffing
servers, it identified itself as Netscape (compatible). From there, it
only got worse.
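
As a rough illustration of why this breaks the statistics, here is a
minimal Python sketch. The naive_sniff function is hypothetical, not
code from any real server, and the sample strings are only illustrative
of the historical format: Netscape announced itself with the token
"Mozilla", and Internet Explorer sent the same token followed by
"(compatible; MSIE ...)".

```python
from collections import Counter


def naive_sniff(user_agent: str) -> str:
    """Hypothetical sniffing rule: anything starting with "Mozilla/" is Netscape."""
    if user_agent.startswith("Mozilla/"):
        return "Netscape"
    if user_agent.startswith("NCSA_Mosaic/"):
        return "Mosaic"
    return "Other"


# Illustrative User-Agent strings (assumed examples, not measured data).
samples = {
    "Netscape Navigator": "Mozilla/3.0 (Win95; I)",
    "Internet Explorer": "Mozilla/2.0 (compatible; MSIE 3.02; Windows 95)",
    "Mosaic": "NCSA_Mosaic/2.0 (Windows 3.1)",
}

# Tally what a log analyser using the naive rule would report.
tally = Counter(naive_sniff(ua) for ua in samples.values())

for client, ua in samples.items():
    print(f"{client:20} counted as {naive_sniff(ua)}")
```

With this rule, the Internet Explorer request lands in the Netscape
column, so any statistic built on the tally overstates Netscape and
understates everything that imitates it.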

While new methods have been added for client identification, they are
all getting spoofed because some website operators, in their infinite
wisdom, decide that they will not serve their precious content except to
a couple of clients that they happen to know work.

The result of this is that today we identify the client as whatever the
website wants to hear, using lots of trickery to guess what that might
be. Embedded browsers generally pretend to be whatever was most popular
at their time of launch.

In addition, there are gateways, e.g. those run by mobile or TV
operators, that will strip or modify this information.

Given this, the 0.7% "other" only covers the minority of clients that
did not implement effective spoofing, and is certainly a lower bound on
the real share.
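
To put rough numbers on that lower bound, here is a small Python sketch
under a deliberately simple assumption: if a fraction s of the traffic
from "other" clients is spoofed or rewritten so that it is counted as a
major browser, the observed share is the true share times (1 - s). The
function name and the spoofing fractions are hypothetical; only the
0.7% figure comes from the statistics quoted above.

```python
def true_share(observed_share: float, spoof_fraction: float) -> float:
    """Estimate the real share of "other" clients from the observed share.

    Assumes observed = true * (1 - spoof_fraction), i.e. spoofed traffic
    is counted entirely under some other browser.
    """
    if not 0.0 <= spoof_fraction < 1.0:
        raise ValueError("spoof_fraction must be in [0, 1)")
    return observed_share / (1.0 - spoof_fraction)


# Hypothetical spoofing fractions, applied to the observed 0.7%.
for s in (0.0, 0.5, 0.9):
    print(f"{s:.0%} spoofing -> true share {true_share(0.7, s):.1f}%")
```

Under this model, a spoofing fraction of 90% would put the real share at
about 7%, which is the order-of-magnitude error mentioned earlier.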

This should not change the conclusion, though, since the spoofing that
bloats statistics for IE and Mozilla will in most cases come from
clients with the same functionality.

-- 
Best regards,
 Bjørn                            mailto:bv at norbionics.com


