nanog mailing list archives

Re: news from Google


From: JC Dill <jcdill.lists () gmail com>
Date: Fri, 11 Dec 2009 19:38:18 -0800

Seth Mattinen wrote:
JC Dill wrote:
Seth Mattinen wrote:
Hell, all you gmail users on this list right now are feeding the machine with all our data.

The part that gets me: everyone seems happy with this.

This list has public archives that are already crawled and archived by Google. For example:

http://www.merit.edu/mail.archives/nanog/threads.html
http://seclists.org/nanog/2009/Dec/434

Subscribing to the list with a gmail account doesn't change anything about what Google knows about the list or list members.


Those URLs don't seem to include "google.com" in them. Maybe I'm misreading them.

I *found* them by searching with Google. I found the second link by searching for a unique phrase from your email:

http://www.google.com/search?q=nanog+%22feeding+the+machine

A mere 1 hour after you emailed it to the NANOG list, Google web search has that email archived from the website on seclists.org.

Crawlers can be excluded with robots.txt if the site owner so chooses, so long as Google respects said file.

Google does respect that file, but you are counting on other subscribers respecting the site owner's wishes regarding web archives. In my experience, this has become a futile fight. If the list doesn't have a web-accessible archive, it's likely that one of the list's subscribers will start their own archive or have it archived with one of the many archive sites, e.g. gmane.
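For anyone who wants to see what the robots.txt exclusion discussed above amounts to, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents and the may_fetch helper are hypothetical, made up for illustration and not copied from any real archive site. It only shows how a well-behaved crawler would interpret the file; nothing in it stops a crawler, or a subscriber running their own archive, from ignoring it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a list-archive site (an assumption for
# illustration, not the actual file served by any site in this thread).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nanog/

User-agent: *
Disallow:
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    archived_post = "http://seclists.org/nanog/2009/Dec/434"
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, may_fetch(ROBOTS_TXT, agent, archived_post))
```

With this hypothetical file, a compliant Googlebot would skip everything under /nanog/, while any other crawler is still allowed in, which is the weak spot described above.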

Some lists also respect a "no archive" header that some people choose to include with their messages.

If you are emailing a publicly archived mailing list that you know is web archived and likely spidered by Google, a "no archive" header is mostly useless. When someone replies to your email (as I'm doing now), your quoted text in the reply will be archived, preserving what you posted to the list. At best, the "no archive" header merely messes up threading. The "no archive" header idea never really worked in the first place - witness all the old Usenet posts that ended up on dejagoogle even when the posts had "no archive" headers.

Preventing my email to gmail from entering their vast database of whatever they track doesn't have any such control features that I'm aware of.

Preventing any email you send to anyone from being leaked out to the public is something you have no control over. See, e.g., the CRU hacked email controversy. If you don't want what you write to be posted on or archived on the internet and findable with web searches, don't use the internet to write or transmit it. Even then, you are at risk of someone scanning and posting what you write. As a NANOG subscriber you should be clueful enough to know all of this already. So what's the big issue here?

jc


