Interesting People mailing list archives
Re: New whitehouse.gov robots.txt file
From: David Farber <dave () farber net>
Date: Wed, 21 Jan 2009 10:37:57 -0500
Begin forwarded message:

From: "David P. Reed" <dpreed () reed com>
Date: January 21, 2009 10:20:53 AM EST
To: dave () farber net
Cc: ip <ip () v2 listbox com>
Subject: Re: [IP] New whitehouse.gov robots.txt file

Who decided that "whitehouse.gov", which is by definition a PR site, not a news site or an information source, has "the country's" robots.txt file? Obama is not "the country". He's just another elected official, presenting his side of the story.
That he wants the entire site to be indexed by spidering is not "amazing". If you are pushing propaganda, that's exactly what you would do.
What's more interesting is what the executive branch chooses to reveal about, say, its actual ongoing surveillance activities, the names of the prisoners at Guantanamo and in CIA rendition sites. In other words, what The Sunlight Foundation would index, if only it could get to it, or what the National Security Archive would archive, if only ...
Or what lobbyists are meeting in the West Wing each day, and the subject matter of those meetings.
There will be nothing on whitehouse.gov that is likely to change these matters. High symbolism, but empty symbolism of that sort sets an expectation that is unlikely to be met unless *we* as Americans look beyond the cheap symbolism and demand transparency and sunlight.
David Farber wrote:
Begin forwarded message:

From: Joseph Lorenzo Hall <joehall () gmail com>
Date: January 21, 2009 8:03:57 AM EST
To: Dave Farber <dave () farber net>
Subject: New whitehouse.gov robots.txt file

(see here: http://www.kottke.org/09/01/the-countrys-new-robotstxt-file via Aaron Burstein)

Hi Dave,

Here's another fascinating sign of increased transparency in the new administration: The whitehouse.gov robots.txt file -- a file that specifies which areas of a web site web spiders may crawl[1] -- has gone from 2400 lines to just two lines:

User-agent: *
Disallow: /includes/

This means that most of whitehouse.gov will now be available to search engines and other web resources that use automated crawlers to retrieve, index, etc. content.

best, Joe

[1]: http://en.wikipedia.org/wiki/Robots.txt

--
Joseph Lorenzo Hall
ACCURATE Postdoctoral Research Associate
UC Berkeley School of Information
Princeton Center for Information Technology Policy
http://josephhall.org/

-------------------------------------------
Archives: https://www.listbox.com/member/archive/247/=now
RSS Feed: https://www.listbox.com/member/archive/rss/247/
Powered by Listbox: http://www.listbox.com
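For readers who want to see what the two-line file quoted above permits, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The crawler name `ExampleBot` and the three sample paths are hypothetical, chosen only for illustration; the robots.txt text itself is the one quoted in the message. The rules are parsed from a string, so nothing is fetched from the network.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted in the message above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
"""

# Hypothetical values, for illustration only.
BASE = "http://www.whitehouse.gov"
CRAWLER = "ExampleBot"
SAMPLE_PATHS = ("/", "/blog/", "/includes/header.html")


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

for path in SAMPLE_PATHS:
    allowed = parser.can_fetch(CRAWLER, BASE + path)
    print(f"{path}: {'allowed' if allowed else 'disallowed'}")
```

Under these rules only paths beginning with /includes/ are off limits to a compliant crawler. Note that robots.txt is advisory: it tells well-behaved crawlers what to skip, and says nothing about what a site chooses to publish in the first place.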