nanog mailing list archives

Re: looking for hostname geographic hint validation


From: Bradley Huffaker <bhuffake () caida org>
Date: Wed, 28 Aug 2013 12:16:14 -0700

On Wed, Aug 28, 2013 at 04:07:05PM +0100, Ben wrote:
> Dear Bradley,
>
> So basically you're asking others to do your homework for you ?   ;-)

Actually no, I'm asking people to do something which I cannot.

While it is true I could test against a manual inference, I would simply
be checking one inference against another. Agreement would only prove
that the algorithm does what I expect. Only the operators, who actually
know what they are doing, can give me the ground truth I need to test my
inferences against reality.

> For example, picking one example from your list ....
>
> <iata>([^a-z]+[a-z]+\d*){3}.ic.ac.uk
>
> Far from being IATA codes, the intermediate subdomains actually refer to
> departments (DepartmentOfComputing and CHemistry in the two I quoted).
>
> Sorry to rain on your parade, but someone had to say it.  ;-)
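
To make the quoted rule concrete, here is a minimal sketch of how such a
template could be applied to a hostname. It assumes that "<iata>" is a
placeholder for a three-letter token, which is expanded here into a named
group; the rest of the pattern is copied as written. The function name and
the hostnames are hypothetical and chosen only for illustration.

```python
import re

# Rule as quoted in the thread. Assumption: "<iata>" stands for a
# three-letter token, expanded below into a named capture group.
RULE = r"<iata>([^a-z]+[a-z]+\d*){3}.ic.ac.uk"
PATTERN = re.compile(RULE.replace("<iata>", "(?P<iata>[a-z]{3})"))


def iata_hint(hostname):
    """Return the token the rule would treat as an IATA code, or None."""
    match = PATTERN.fullmatch(hostname.lower())
    return match.group("iata") if match else None


# Hypothetical hostnames, made up for this example.
print(iata_hint("lhr-gw1.net2.core3.ic.ac.uk"))
print(iata_hint("www.ic.ac.uk"))
```

The match is purely syntactic: any three-letter token in that position is
returned, whether or not it names an airport, which is why the rule needs
checking against ground truth.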

You are most likely right, but I am not looking for perfection.  I am
hoping for an inference that will get me within 10 km of the actual
city most of the time.

Given the validation I have so far, of the 19,611 hostnames for which a
location is inferred and for which I have validation data, we infer the
city correctly 93% of the time.
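
As a rough sketch of how an accuracy figure like this could be computed,
the snippet below scores inferred locations against ground truth using
great-circle distance. It assumes that "correct" means the inferred point
lies within 10 km of the validated one; the post does not say how the 93%
was measured. The function names and coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def fraction_within(pairs, threshold_km=10.0):
    """Share of (inferred, truth) pairs no more than threshold_km apart."""
    if not pairs:
        return 0.0
    hits = sum(
        1
        for inferred, truth in pairs
        if haversine_km(*inferred, *truth) <= threshold_km
    )
    return hits / len(pairs)


# Made-up sample: one inference close to the truth, one in the wrong city.
sample = [
    ((51.5074, -0.1278), (51.5000, -0.1200)),
    ((51.5074, -0.1278), (48.8566, 2.3522)),
]
print(fraction_within(sample))
```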

While there is work left to do, it is far from the lost cause you
present.

-- 
    the value of a world model is not how accurately it captures reality
    but how often it leads us to take appropriate action

