Interesting People mailing list archives

Roll Your Own Google


From: David Farber <dave () farber net>
Date: Wed, 14 Dec 2005 19:26:55 -0500



Begin forwarded message:

From: Dewayne Hendricks <dewayne () warpspeed com>
Date: December 14, 2005 10:50:44 AM EST
To: Dewayne-Net Technology List <dewayne-net () warpspeed com>
Subject: [Dewayne-Net] Roll Your Own Google
Reply-To: dewayne () warpspeed com

[Note:  This item comes from reader Randy Burge.  DLH]

From: Randy Burge <burge () proactiveteams com>
Date: December 13, 2005 8:46:05 AM PST
To: Dewayne Hendricks <dewayne () dandin com>
Subject: Roll Your Own Google


To access the live links in the article:
<http://www.wired.com/news/technology/0,1282,69817,00.html?tw=wn_tophead_2>

Roll Your Own Google

By Jeff MacIntyre

02:00 AM Dec. 13, 2005 PT

In a move with potentially far-reaching implications for the search market, Alexa Internet is opening up its huge web crawler to any programmer who wants paid access to its rich trove of internet data.

Alexa, a subsidiary of Amazon.com that is best known for its traffic rankings, on Monday unveiled Alexa Web Search Platform, a set of online tools for searching, indexing, computing, storing and publishing vast quantities of net data.

Alexa claims it's the first time that developers, students and startups will be given inexpensive access to an industrial-scale web crawler -- the same technology used by industry giants like Yahoo (Yahoo Slurp) and Google (Googlebot).

"It sounds innocuous but it's big," said Alexa CEO Bruce Gilliat. "We're giving access to billions of pages and computing resources.... Users have never had this opportunity before. Big industry has ruled search, because it was the only player with access to the tools."

Alexa spiders 4 billion to 5 billion pages a month and archives 1 terabyte of data a day. The new platform will allow developers to build their own search engines.

"If it is what they claim it is, it strikes me that this is nontrivial news," said search industry pundit and author John Battelle. "Anyone can crawl the web, but crawling and maintaining an index at scale is very difficult and very expensive. They are providing convenient access to something that was very dear."

Battelle said the move, if it pans out as promised, could have a big impact on the search industry, and could possibly lessen Google's growing dominance in web search.

Alexa's offering may help "create an ecosystem (in search) where something can occur outside the Googleverse," he said.

To illustrate the new service's potential, Alexa developed a photo search engine that allows users to query photo metadata normally hidden from standard keyword searches, such as the date the photo was taken or the camera used.

Musipedia, another Alexa prototype, provides users with the ability to search the web by melody. Give the engine a keyword or melodic contour, and it returns similar music. Musipedia allows users to input their own whistling as a query.

Gilliat predicted that Alexa's inexpensive services will spawn numerous creative results from users ranging from computer scientists to web hobbyists. Services are priced at $1 per transaction, with a transaction ranging from a CPU hour of computing time to gigabytes of uploads and downloads. Gilliat said a complete web snapshot should cost a "couple thousand" dollars.

[snip]

<http://www.wired.com/news/technology/0,1282,69817,00.html?tw=wn_tophead_2>


Weblog at: <http://weblog.warpspeed.com>



-------------------------------------

Archives at: http://www.interesting-people.org/archives/interesting-people/

