Interesting People mailing list archives
IP: Size of the web, and attempts to filter it
From: Dave Farber <farber () cis upenn edu>
Date: Wed, 26 Jan 2000 09:17:24 -0500
Date: Wed, 26 Jan 2000 08:39:36 -0500
From: Jamie McCarthy <jamie () mccarthy org>
Subject: Size of the web, and attempts to filter it
To: farber () cis upenn edu
X-Mailer: Mailsmith 1.1.4 (Bluto)

Hi Dave,

The Censorware Project today released a "dynamic essay" on the size of the world-wide web. Accurate reports on its size are few and far between. But using what we do know and applying a bit of extrapolation, our Michael Sims has set up a webpage that gives a daily estimate -- and uses it to put into context the concept of "filtering the web."

http://censorware.org/web_size/

Here are some excerpts...

So, as of today (these figures are dynamically generated on a daily basis), the web has roughly:

    1,570,000,000 pages;
    29,400,000,000,000 bytes of text;
    353,000,000 images; and
    5,880,000,000,000 bytes of image data.

In just the last 24 hours, the web has added:

    3,180,000 new pages;
    59,700,000,000 new bytes of text;
    716,000 new images; and
    11,900,000,000 new bytes of image data.

And of course, any web page can be changed or removed at any time. Changes may be minor, major, or total. According to Alexa, which is striving valiantly to create archive snapshots of major portions of the web, the average lifespan of a webpage is about 44 days, which means that in the last 24 hours, about:

    35,600,000 pages changed; and
    8,020,000 images changed.

...

Okay, so we've established the target that censorware companies have to shoot at. Millions of pages being created and changing every single day. In fact, to keep up with the changes, you'd need to download about 873,000,000,000 bytes of information per day, which would mean you'd need a connection capable of downloading 10,100,000 bytes per second.

...

...you'd need just 20,200 reviewers working every day to keep up. If your company kept five-day work weeks, you'd need 28,300 reviewers working Monday through Friday, no vacations, no holidays, no coffee breaks.
Even at a measly seven dollars per hour, this is going to cost your company hundreds of millions of dollars per year ($396,000,000 just for straight salaries, if you're keeping track), just for personnel costs. To compare, N2H2, the company behind Bess, had only $3.1 million in total revenues for 1998, just a little bit short of $396,000,000.

The very concept is a joke - the censorware companies employ anywhere from zero (several companies) to 2 full-time (Logon Data/makers of X-Stop) to 15 full-time and 58 part-time (N2H2/makers of Bess) website reviewers, nowhere near enough to review all the pages that changed on any given day, let alone the rest of the web. None of them employs even one one-thousandth of the number of workers required. So what do they do?

--
Jamie McCarthy
jamie () mccarthy org
http://jamie.mccarthy.org/
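
The arithmetic behind the quoted figures can be roughly reproduced from the numbers in the message. The Python sketch below is one reading that happens to land close to the quoted totals, not necessarily the method the Censorware Project page actually uses. Assumptions not stated in the message: every page and image is treated as changing once per 44-day lifespan, the full content is re-downloaded on each change, and a reviewer is paid for 2,000 hours per year (8 hours a day, 250 days). The function and parameter names are made up for illustration. Because the inputs are already rounded to three significant digits, the results differ slightly from the quoted ones.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_churn(pages, text_bytes, images, image_bytes,
                new_text_bytes, new_image_bytes, lifespan_days):
    """Estimate how much of the web changes, and must be fetched, per day."""
    # Assumption: each item changes once per average lifespan.
    pages_changed = pages / lifespan_days
    images_changed = images / lifespan_days
    # Assumption: a changed item is downloaded again in full.
    changed_bytes = (text_bytes + image_bytes) / lifespan_days
    bytes_per_day = changed_bytes + new_text_bytes + new_image_bytes
    return {
        "pages_changed": pages_changed,
        "images_changed": images_changed,
        "bytes_per_day": bytes_per_day,
        "bytes_per_second": bytes_per_day / SECONDS_PER_DAY,
    }


def staffing(reviewers_every_day, hourly_wage, paid_hours_per_year=2000):
    """Convert a seven-day headcount to a five-day one and price it."""
    # Seven days of work spread over five working days.
    weekday_reviewers = reviewers_every_day * 7 / 5
    # Assumption: 2,000 paid hours per reviewer per year.
    annual_salaries = weekday_reviewers * hourly_wage * paid_hours_per_year
    return {
        "weekday_reviewers": weekday_reviewers,
        "annual_salaries": annual_salaries,
    }


if __name__ == "__main__":
    churn = daily_churn(
        pages=1_570_000_000,
        text_bytes=29_400_000_000_000,
        images=353_000_000,
        image_bytes=5_880_000_000_000,
        new_text_bytes=59_700_000_000,
        new_image_bytes=11_900_000_000,
        lifespan_days=44,
    )
    staff = staffing(reviewers_every_day=20_200, hourly_wage=7)
    for name, value in {**churn, **staff}.items():
        print(f"{name}: {value:,.0f}")
```

The 20,200-reviewer figure is taken as given, since the excerpt elides how it was derived.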