Interesting People mailing list archives

Why Good Data Can Be Hard to Find


From: David Farber <dave () farber net>
Date: Sat, 19 Apr 2008 06:35:16 -0700


________________________________________
From: Kurt Albershardt [kurt () nv net]
Sent: Saturday, April 19, 2008 2:34 AM
To: David Farber
Subject: Why Good Data Can Be Hard to Find

<http://blogs.wsj.com/numbersguy/why-good-data-can-be-hard-to-find-322/>
April 18, 2008, 5:12 pm

Here's a data dilemma for companies: Are the customers and users they engage with representative of the population at 
large? That question is common online, where it's cheaper to find people but harder to ensure they represent Internet 
users as a whole — let alone all Americans.

Two examples surfaced this week. An Internet-monitoring firm's estimate of Google's first-quarter performance appeared 
to have been contradicted by Google's earnings report. Meanwhile, an Amazon unit, which provides popular but widely 
criticized measures of online traffic, updated its algorithm because the people it was tracking weren't typical of the 
Web at large.

The Google discrepancy sent research firm comScore's stock tumbling in after-hours trading Thursday, though it 
rebounded somewhat Friday. At issue was comScore's report earlier in the week that the number of clicks on ads by U.S. 
users of Google's search engine had dropped 9.3% in the first quarter compared with the fourth quarter, and had risen 
just 1.8% over the prior year.

In its earnings release, Google reported that world-wide paid clicks had risen 4% from the fourth quarter, and 20% from a 
year earlier. "Paid clicks growth is much higher than has been speculated by third parties," CEO Eric Schmidt said on a 
conference call.

ComScore attempted Friday to reconcile its estimate with Google's results by pointing out that Google's U.S. growth in 
paid clicks apparently was much slower than the overall growth. But part of the error may arise from comScore's 
methodology, as Slate's Chris Wilson has argued. The company recruits Internet users to install software to monitor 
their Web behavior. What if those who are asked to participate, and agree, aren't typical of all Google users?

"We are always checking for sample biases and trying to correct for them," comScore Chief Executive Magid Abraham told 
me. "So, while I cannot entirely rule out a bias, I do not think it is the major factor here." ComScore uses a 
telephone survey to check that its panel is in line with all Internet users — though that survey itself, Mr. Abraham 
notes, is also subject to potential error.

Meanwhile, Amazon's Alexa unit announced this week that it was changing its rankings of Web sites by popularity. The 
company tracks the Internet habits of users of its browser toolbar, and until this week, that was the sole basis for 
Alexa rankings. These rankings have long been criticized by Web administrators, including Peter Norvig, Google's 
director of research, because Alexa users may not behave like the Internet as a whole. (Alexa countered that site logs 
also are a flawed traffic measure.)

Now Alexa is incorporating other sources of data — though it says the prior ranking "wasn't wrong before, but it was 
different." Some sites saw big changes in their rankings following Alexa's move: The tech blog TechCrunch said it fell 
far from its prior position in Drudge Report territory (rarefied air in Web-traffic terms). On Friday afternoon, Drudge 
Report ranked 545th, compared with TechCrunch's ranking of 1,784th, according to Alexa's new math.

...
