nanog mailing list archives

Re: dns and software, was Re: Reliable Cloud host ?


From: Joe Greco <jgreco () ns sol net>
Date: Thu, 1 Mar 2012 09:22:07 -0600 (CST)

On 03/01/2012 06:26 AM, William Herrin wrote:
On Thu, Mar 1, 2012 at 7:20 AM, Owen DeLong<owen () delong com>  wrote:
The simpler approach and perfectly viable without mucking
up what is already implemented and working:

Don't keep returns from GAI/GNI around longer than it takes
to cycle through your connect() loop immediately after the GAI/GNI call.
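
A minimal sketch of that pattern, assuming a blocking TCP client and POSIX
getaddrinfo(). The name connect_by_name is made up for illustration and is
not an existing API. The resolver results live only for the duration of the
loop and are freed before returning, so nothing is cached past its TTL.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a standard call): resolve and connect in one
 * step. Returns a connected socket descriptor, or -1 on failure. */
int connect_by_name(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;
    int rc;

    assert(host != NULL && service != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    /* Try each returned address in order until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                   /* connected; stop here */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);               /* results are discarded, not kept */
    return fd;
}
```

A caller would use it as connect_by_name("www.example.com", "80") and call
it again for every new connection rather than reusing old addresses. Note
that each failed connect() here still waits for the system default timeout.
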
The even simpler approach: create an AF_NAME with a sockaddr struct
that contains a hostname instead of an IPvX address. Then let
connect() figure out the details of caching, TTLs, protocol and
address selection, etc.  Such a connect() could even support a revised
TCP stack which is able to retry with the other addresses at the first
subsecond timeout rather than camping on each address in sequence for
the typical system default of two minutes.

The effect of what you're recommending is to move all of this
into the kernel, and in the process greatly expand its scope. Also:
even if you did this, you'd be saddled with the same problem because
nothing existing would use an AF_NAME.

The real issue is that gethostbyxxx has been inadequate for a very
long time. Moving it across the kernel boundary solves nothing and
most likely causes even more trouble: what if I want, say, asynchronous
name resolution? What if I want to use SRV records? What if a new DNS
RR comes around -- do I have to recompile the kernel? It's for these
reasons and probably a whole lot more that connect just confuses the
actual issues.

When I was writing the first version of DKIM I used a library that I scraped
off the net called ARES. It worked adequately for me, but the most notable
thing was the very fact that I had to scrape it off the net at all. As far as
I could tell, standard distros don't have libraries with lower level access to
DNS (in my case, it needed to not block). Before positing a super-deluxe
gethostbyxxx that does address picking, etc, etc, it would be better to
lobby all of the distros to settle on a decomposed resolver library from
which that and more could be built.

It's deeper than just that, though.  The whole paradigm is messy, from
the point of view of someone who just wants to get stuff done.  The
examples are (almost?) all fatally flawed.  The code that actually gets
at least some of it right ends up being too complex and too hard for
people to understand why things are done the way they are.

Even in the "old days", before IPv6, geez, look at this:

bcopy(host->h_addr_list[n], (char *)&addr->sin_addr.s_addr, sizeof(addr->sin_addr.s_addr));

That's real comprehensible - and it's essentially the data interface 
between the resolver library and the system's addressing structures
for syscalls.
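
For comparison, here is a sketch of the same copy with the checks that the
one-liner leaves implicit, assuming IPv4 and a struct hostent as returned by
gethostbyname(). The function name hostent_to_sockaddr_in is made up for
illustration and is not part of any resolver library.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: copy the n-th IPv4 address of a hostent into a
 * sockaddr_in. Returns 0 on success, -1 if the entry is unusable. */
int hostent_to_sockaddr_in(const struct hostent *host, int n,
                           unsigned short port, struct sockaddr_in *addr)
{
    int i;

    assert(host != NULL && addr != NULL);

    /* The resolver hands back raw bytes, so check family and length. */
    if (host->h_addrtype != AF_INET ||
        host->h_length != (int)sizeof(addr->sin_addr))
        return -1;
    if (n < 0 || host->h_addr_list == NULL)
        return -1;

    /* h_addr_list is NULL-terminated; make sure entry n exists. */
    for (i = 0; i <= n; i++)
        if (host->h_addr_list[i] == NULL)
            return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);    /* port goes in network byte order */
    memcpy(&addr->sin_addr, host->h_addr_list[n], sizeof(addr->sin_addr));
    return 0;
}
```

Even in this form the caller still has to know about address families,
byte order, and a NULL-terminated array of untyped pointers.
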

On one hand, it's "great" that they wanted to abstract the dirty details
of DNS away from users, but I'd say they failed pretty much even at that.

... JG
-- 
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I
won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN)
With 24 million small businesses in the US alone, that's way too many apples.

