Interesting People mailing list archives

Critique of the Rimm study -- from Brian Reid


From: David Farber <farber@central.cis.upenn.edu>
Date: Thu, 6 Jul 1995 00:47:07 -0400

Date: Wed, 05 Jul 95 20:30:49 -0700
From: Brian Reid <reid@pa.dec.com>




I have read a preprint of the Rimm study of pornography and I am so
distressed by its lack of scientific credibility that I don't even know
where to begin critiquing it. Normally when I am sent a publication for
review, if I find a flaw in it I can identify it and say "here, in this
paragraph, you are making some unwarranted assumptions". In this study
I have trouble finding measurement techniques that are *not* flawed.
The writer appears to me not to have a glimmer of an understanding even
of basic statistical measurement technique, let alone of the
application of that technique to something as elusive and ill-defined
as USENET.


I have been measuring USENET readership, analyzing USENET content,
and publishing studies of what I find since April 1986. I have spent
years refining the measurement techniques and the data processing
algorithms. Despite those 9 years of working on the problem, I still
do not believe that it is possible to get measurements whose accuracy
is within a factor of 10 of the truth. In other words, if I measure
something that seems to be 79, the truth might be 790 or 7.9 or
anywhere in between. Despite this inaccuracy, the measurements are
interesting, because whatever unknowns they are actually measuring,
those unknowns are similar from one month to the next, so the study
of trends is meaningful. As long as you are aware of what it is that
you are taking the ratio of, it is also meaningful to compare USENET
measurements: whatever the errors might be, they are often similar in
two numbers from the same measurement set, and they are
multiplicative, so they tend to cancel out in the quotient.
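
To make the arithmetic concrete, here is a tiny illustration, written
in Python with invented figures that do not come from my measurement
software, of why a shared multiplicative error disappears in a
quotient:

    # Invented numbers, purely for illustration.
    true_a, true_b = 400.0, 100.0   # hypothetical true readership counts
    bias = 7.0                      # unknown multiplicative measurement error

    measured_a = true_a * bias      # reported as 2800 -- wrong by a factor of 7
    measured_b = true_b * bias      # reported as  700 -- wrong by the same factor

    print(measured_a / measured_b)  # 4.0 -- the true ratio survives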


In other words, in the results that I publish, the two kinds of measurements
that are meaningful enough to pay attention to for serious scholarship
are the normalized month-to-month trends in the readership percentages
of a given newsgroup, and the within-the-same-month ratio of the
readership of one newsgroup to the readership of another. The reason
that I publish the numbers is primarily to enable trend analysis; it is
not reasonable to take a single-point measurement seriously.
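
Again with invented figures rather than my published data, the two
derived quantities look like this:

    # Hypothetical monthly "reader" counts for two newsgroups.
    group_a = {"Sep": 2400, "Oct": 2000, "Jan": 3000}
    group_b = {"Sep": 1200, "Oct": 1000, "Jan": 1000}

    # 1. Normalized month-to-month trend within a single newsgroup.
    trend_a = {month: count / group_a["Sep"]
               for month, count in group_a.items()}

    # 2. Within-the-same-month ratio of one newsgroup to another.
    ratio = {month: group_a[month] / group_b[month] for month in group_a}

    print(trend_a)  # {'Sep': 1.0, 'Oct': 0.833..., 'Jan': 1.25}
    print(ratio)    # {'Sep': 2.0, 'Oct': 2.0, 'Jan': 3.0}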


No matter what level of accuracy you are seeking, it is imperative
that you understand what it is that you are measuring. Whenever you
cannot measure an entire population, you must find and measure a
sample, and the error in your measurement will be magnified if your
sample is not a representative sample. A small error in understanding
the nature of the sample population will lead to an error like the
famous "Dewey defeats Truman" headline in the 1948 US Presidential
election. A large error in understanding the nature of the sample
population can lead to results that are completely meaningless, such as
measuring pregnancy rates in a population whose age and sex are unknown.


Rimm has made three "beginner's errors" that, in my opinion, when taken
together, render his numbers completely meaningless:


    1. He has selected a very homogeneous population to measure. While
       he has chosen not to identify his population, he has included
       enough of his sample data to allow me to correlate his numbers
       with my own numbers for the same measurement period. His data
       correlate exactly with my numbers for Pittsburgh newsgroups in
       that measurement period; only his own university (Carnegie-Mellon)
       has widespread enough campus networking to make it possible for
       him to sample that large a population. It is therefore almost
       certain that he has measured his own university. I received my
       Ph.D. in Computer Science from Carnegie-Mellon University, and I
       am very aware that it is predominantly male and predominantly
       a technology school.  The behavior of computer-using students at
       a high-tech urban engineering school might not be very similar
       to the behavior of other student populations, let alone
       non-student populations.


    2. He has measured only one time period, January 1995. Having lived
       at Carnegie-Mellon University for a number of years, I know
       first-hand that student interests in January are extremely
       different from student interests in September or April. When
       measuring human behavior about which very little is known, it is
       important to take numerous measurements over time and to look for
       time series. Taking the last few years' worth of my data and
       doing a trend analysis in the newsgroups that he has named as
       pornographic shows an average 3:1 seasonal trend change between
       low-readership months (November and April) and high-readership
       months (September and January). But the trends are different in
       different newsgroups. A single-point measurement is not nearly
       as meaningful as a series of measurements.


    3. He makes the assumption that by seeing a data reference to an
       image or a file, it is possible to tell what the individual did
       with the file. We in the network measurement business are very
       careful to explain what it is that our measurements mean. Here
       is the standard explanation that I publish with my monthly
       measurements to talk about the number that Rimm calls "number
       of downloads".


          To "read" a newsgroup means to have been presented with the
          opportunity to look at at least one message in it. Going
          through a newsgroup with the "n" key counts as reading it.
          For a news site, "user X reads group Y" means that user
          X's .newsrc file has marked at least one unexpired message
          in Y.


       Rimm used my network measurement software tools to take his data,
       and he did not anywhere in his article state that he had made changes
       to them, so I must conclude that his numbers and my numbers are
       derived from the same software. But the number that he is using for
       "number of downloads" is the same number that I call "number of
       readers" by the above definition. It has nothing to do with the
       number of downloads. In fact, it is not possible for this
       measurement system to tell whether or not a file has been downloaded;
       it can tell whether or not a person has been presented with
       the opportunity to download a file, but it cannot tell whether
       the user answered "yes" or "no". (A simplified sketch of the
       kind of test involved appears just after this list.)
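
For readers who want to see the shape of that test, here is a
simplified sketch, in Python and emphatically not the code in my
actual tools, of the "user X reads group Y" determination from a
single .newsrc line:

    # Simplified sketch, NOT actual measurement code.  A .newsrc line
    # looks like "alt.foo: 1-2010,2012" (subscribed; the numbers are
    # article ranges the news reader has marked as seen) or
    # "alt.foo! 1-5" (unsubscribed).  A real tool would also check the
    # marked ranges against the group's unexpired article numbers.

    def reads_group(newsrc_line):
        """True if the line marks at least one message as seen."""
        name, sep, marks = newsrc_line.partition(":")
        if sep != ":":              # no colon: unsubscribed or malformed
            return False
        return marks.strip() != ""  # any marked range counts as "reading"

    print(reads_group("alt.binaries.pictures.erotica: 1-2010,2012"))  # True
    print(reads_group("comp.arch:"))                                  # False

Nothing in such a record says whether any article body was ever
fetched, viewed, or saved.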


In summary, I do not consider Rimm's analysis to have enough technical rigor
to be worthy of publication in a scholarly journal.


Brian Reid, Ph.D.
Director, Network Systems Laboratory
Digital Equipment Corporation
Palo Alto, California
reid@pa.dec.com
http://www.research.digital.com/nsl/people/reid/bio.html

