nanog mailing list archives

Re: Why the US Government has so many data centers


From: George Herbert <george.herbert () gmail com>
Date: Fri, 18 Mar 2016 22:21:01 -0700


So...

Before I go on, I have not been in Todd's shoes, neither serving in nor directly supporting an org like that.

However, I have indirectly supported orgs like that and consulted at or supported literally hundreds of commercial and 
a few educational and nonprofit orgs over the last 30 years. 

There are corner cases where distributed resilience is paramount, including a lot of field operations (of all sorts) on 
ships (and aircraft and spacecraft), or places where the net really is unstable.  Any generalizations that wrap those 
legitimate exceptions in are overreaching their valid descriptive range.

That said, in the vast bulk of normal-world environments, individuals make justifications like Todd's and argue for 
distributed services, private servers, etc.  And then they do not run them reliably, with patches, backups, central 
security management, asset tracking, redundancy, DR plans, etc.

And then they break, and in some cases are and will forever be lost.  In other cases they will "merely" take 2, 5, 10, 
or in one case more than 100 times longer to repair, and cost more money to recover, than they should have.

Statistically, these are very, very poor operational practices.  Not so much because of location (though that is some 
of it) but because of lack of care and quality management when they get distributed and lost out of IT's view.

Statistically, several hundred clients in and a hundred or so organizational assessments in, if I find servers that 
matter under desks, you have about a 2% chance that your IT org can handle supporting and managing them appropriately.

If you think that 98% of servers in a particular category being at high risk of unrecoverable or very difficult 
recovery when problems crop up is acceptable, your successor may be hiring me or someone else who consults a lot for a 
very bad day's cleanup.

I have literally been at a billion dollar IT disaster and at tens of smaller multimillion dollar ones trying to clean 
them up.  This is a very sad type of work.

I am not nearly as cheap for recoveries as for preventive management and proactive fixes. 


George William Herbert
Sent from my iPhone

On Mar 18, 2016, at 9:28 PM, Todd Crane <todd.crane () n5tech com> wrote:

I was trying to resist the urge to chime in on this one, but this discussion has continued for much longer than I had 
anticipated... So here it goes

I spent 5 years in the Marines (out now) in which one of my MANY duties was to manage these "data centers" (a part of 
me just died as I used that word to describe these server rooms). I can't get into what exactly I did or with what 
systems on such a public forum, but I'm pretty sure that most of the servers I managed would be exempted from this 
paper/policy.

Anyways, I came across a lot of servers in my time, but I never came across one that I felt should've been located 
elsewhere. People have brought up the case of personal share drives, but what about the combat camera (think public 
relations) that has to store large quantities (100s of 1000s) of high resolution photos and retain them for years? 
Should I remove that COTS (commercial off the shelf) NAS underneath the Boss' desk and put it in a data center 4 miles 
down the road, and force all that traffic down a network that was designed for light to moderate web browsing and 
email traffic, just so I can check a box for some politician's reelection campaign ads on how they made the government 
"more efficient"?

Better yet, what about the backhoe operator who didn't call before he dug, and cut my line to the datacenter? Now we 
cannot respond effectively to a natural disaster in the Asian Pacific or a bombing in the Middle East or a platoon 
that has come under fire and will die if they can't get air support, all because my watch officer can't even log in to 
his machine since I can no longer have a backup domain controller on-site.

These seem very far-fetched to most civilian network operators, but to anybody who has maintained military systems, 
this is a very real scenario. As mentioned, I'm pretty sure my systems would be exempted, but most would not. When 
these systems are vital to national security and life & death situations, it can become a very real problem. I 
realize that this policy was intended for more run of the mill scenarios, but the military is almost always grouped 
in with everyone else anyways. 

Furthermore, I don't think most people realize the scale of these networks. NMCI, the network that the Navy and 
Marine Corps used (when I was in), had over 500,000 active users in the AD forest. When you have a network that size, 
you have to be intentional about every decision, and you should not leave it up to a political appointee who has 
trouble even checking their email. 

When you read about how much money the US military hemorrhages, just remember.... 
- The multimillion-dollar storage array, combined with a complete network overhaul and multiple redundant 100G+ DWDM 
links, was "more efficient" than a couple of NAS units that we picked up off of Amazon for maybe $300, sitting under a 
desk connected to the local switch. 
- Using an old machine that would otherwise be collecting dust to ensure that users can log in to their computers 
despite conditions outside of our control is apparently akin to treason and should be dealt with accordingly.
</rant>


--Todd

Sent from my iPad

On Mar 14, 2016, at 11:01 AM, George Metz <george.metz () gmail com> wrote:

On Mon, Mar 14, 2016 at 12:44 PM, Lee <ler762 () gmail com> wrote:


Yes, *sigh*, another what kind of people _do_ we have running the govt
story.  Altho, looking on the bright side, it could have been much
worse than a final summing up of "With the current closing having been
reported to have saved over $2.5 billion it is clear that inroads are
being made, but ... one has to wonder exactly how effective the
initiative will be at achieving a more effective and efficient use of
government monies in providing technology services."

Best Regards,
Lee

That's most likely an inaccurate cost savings figure, though; it probably doesn't
take into account the impacts of the consolidation on other items. As a
personal example, we're in the middle of upgrading my site from an OC-3 to
an OC-12, because we're running routinely at 95+% utilization on the OC-3
with 4,000+ seats at the site. The reason we're running that high is
because several years ago, they "consolidated" our file storage, so instead
of file storage (and, actually, dot1x authentication though that's
relatively minor) being local, everyone has to hit a datacenter some 500+
miles away over that OC-3 every time they have to access a file share. And
since they're supposed to save everything to their personal share drive
instead of the actual machine they're sitting at, the results are
predictable.

So how much is it going to cost for the OC-12 over the OC-3 annually? Is
that difference higher or lower than the cost to run a couple of storage
servers on-site? I don't know the math personally, but I do know that if we
had storage (and RADIUS auth and hell, even a shell server) on site, we
wouldn't be needing to upgrade to an OC-12.
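
A rough sketch of the arithmetic behind that question, not anyone's actual
numbers. The OC-3 and OC-12 line rates are the standard SONET values, and the
95% utilization and 4,000 seats come from the message above. The function
names and the circuit prices passed to breakeven_storage_budget are
hypothetical and only illustrate the comparison.

```python
# Standard SONET line rates in Mbit/s (payload overhead ignored for simplicity).
OC3_MBPS = 155.52
OC12_MBPS = 622.08


def per_seat_kbps(line_rate_mbps: float, utilization: float, seats: int) -> float:
    """Average throughput available per seat, in kbit/s, at a given utilization."""
    return line_rate_mbps * utilization * 1000.0 / seats


def breakeven_storage_budget(oc3_annual_cost: float, oc12_annual_cost: float) -> float:
    """Annual amount that on-site storage could cost before the circuit
    upgrade becomes the cheaper option. Both inputs are assumed figures."""
    return oc12_annual_cost - oc3_annual_cost


if __name__ == "__main__":
    seats = 4000        # from the message above
    utilization = 0.95  # from the message above

    print("OC-3 per seat (kbit/s): ", round(per_seat_kbps(OC3_MBPS, utilization, seats), 1))
    print("OC-12 per seat (kbit/s):", round(per_seat_kbps(OC12_MBPS, utilization, seats), 1))

    # Hypothetical annual circuit prices, purely for illustration.
    print("Break-even storage budget:", breakeven_storage_budget(100_000.0, 250_000.0))
```

At 95% utilization an OC-3 shared by 4,000 seats works out to roughly 37
kbit/s per seat on average, which is why pushing every file share access over
it hurts. The break-even function only frames the question asked above; real
circuit and storage costs would have to be filled in.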

