Interesting People mailing list archives

IP: Complete ATC power failure in the U.S. Northwest ^-- from RISKS


From: Dave Farber <farber () cis upenn edu>
Date: Wed, 03 Feb 1999 03:49:54 -0500



Date: Sun, 31 Jan 1999 22:36:22 -0800 
From: "Paul Cox" <pcox () eskimo com> 

Some time back, I wrote about some of the various risks involved with 
air-traffic control computer-communication systems. I mentioned how, even 
with backup systems, controllers are still extremely vulnerable to the power 
going out.
Well, it happened. On 15 Jan 1999, at 2 pm, the power failed at Seattle 
Center, an en-route ATC facility that covers nearly 300,000 square miles of 
the NW United States. I had the unusually good luck *not* to be at work at 
the time.
The power failed during a normal, routine quarterly test procedure on the 
power supply units. To be honest, I don't understand the technical side of 
how/why the power failed; I had it explained and I still didn't get it. But 
the gist of it was that during the test, a circuit board didn't do what it 
was supposed to, and there was a very brief (less than a second) 
interruption of the power to our systems.
Unfortunately, our systems cannot handle any interruption, and thus all the 
computers had to be rebooted and recalibrated. Our communication system 
failed completely, as it is made up of totally computer-dependent, 
digitized, touch-screen interface modules. It took over half an hour to 
reload.
Our main radar displays all went out as well, and the backup display system 
failed too. It took anywhere from 45 to 75 minutes for any radar displays 
to come back at various sectors in the building.
Recall my description of how frantic controllers are when suddenly the 
separation requirements for aircraft go from 5 miles to 20 miles (radar 
environment to non-radar)? That is exactly what happened.
The only system that DID work is a hard-wired, emergency radio backup system 
that only needs power to be supplied to it, but which is old-fashioned 
enough that it doesn't have computers running it. Ironically, we had to 
fight and fight and fight to have this system installed when our recent 
upgrades took place.
Without it, we would have been completely helpless to communicate with any 
aircraft in the area. With it, we were able to restore limited ATC service 
within a couple of minutes, falling back on the "good old days" method of 
pilots staying on specific airways, reporting their progress over certain 
points on the ground, and using paper strips, pencils, and our heads to 
figure out whether anyone was in conflict with anyone else.
This failure simply drives home yet again that backup systems are only as 
good as the main systems IF those backups depend on the same power 
supply. In fact, our backup communications system and backup radar display 
systems were essentially worthless to us, because they failed at the same 
instant as the main system did when the power died.
As long as you have a single point of failure in any system, it doesn't 
matter how many backups you have downstream if they are dependent on that 
point.
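
The point can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the dependency map below is made up
for the example and is not the facility's real architecture, and the names
(DEPENDS_ON, affected_by, surviving) are hypothetical. The emergency radio is
modeled as having no dependency on uninterrupted power, a simplification of
the fact that it rode through the brief interruption without a reboot.

```python
def affected_by(failed, depends_on):
    """Return every system that transitively depends on `failed`,
    including `failed` itself."""
    down = {failed}
    changed = True
    while changed:
        changed = False
        for system, deps in depends_on.items():
            # A system goes down if any of its dependencies is down.
            if system not in down and down & deps:
                down.add(system)
                changed = True
    return down


def surviving(group, failed, depends_on):
    """List the members of a redundancy group still up after `failed` fails."""
    down = affected_by(failed, depends_on)
    return [s for s in group if s not in down]


# Hypothetical dependency map (an assumption for illustration only).
DEPENDS_ON = {
    "uninterrupted_power": set(),
    "computers": {"uninterrupted_power"},
    "main_comms": {"computers"},
    "backup_comms": {"computers"},
    "main_radar_display": {"computers"},
    "backup_radar_display": {"computers"},
    "emergency_radio": set(),  # simplification: tolerates a brief power glitch
}

COMMS = ["main_comms", "backup_comms", "emergency_radio"]
RADAR = ["main_radar_display", "backup_radar_display"]

print(surviving(COMMS, "uninterrupted_power", DEPENDS_ON))
print(surviving(RADAR, "uninterrupted_power", DEPENDS_ON))
```

In this toy model, adding more computer-based backups to either group changes
nothing; only a member that does not sit downstream of the shared dependency
survives the failure.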
Fortunately, we still had the old-fashioned radios, not to mention the Mark 
I Human Brain Wet Computers working for us.
Paul Cox ZSE

