Full Disclosure mailing list archives
RE: Windoze almost managed to 200x repeat 9/11
From: "joe" <mvp () joeware net>
Date: Fri, 24 Sep 2004 11:50:32 -0400
From the article:

"The servers are timed to shut down after 49.7 days of use in order to prevent a data overload, a union official told the LA Times. To avoid this automatic shutdown, technicians are required to restart the system manually every 30 days. An improperly trained employee failed to reset the system, leading it to shut down without warning, the official said."

And

"Soon after installation, however, the FAA discovered that the system design could lead to a radio system shutdown, and put the maintenance procedure into place as a workaround, the LA Times said. The FAA reportedly said it has been working on a permanent fix but has only eliminated the problem in Seattle. The FAA is now planning to institute a second workaround - an alert that will warn controllers well before the software shuts down."

It would appear that the VSCS shut down, not the system. Further, it would appear that someone failed to reboot the system and caused this, not that the system hung or died mid-restart. This article, combined with other discussions, makes it sound like the app itself had an issue; the system didn't crash or drop, kernel memory wasn't corrupted, etc.

The fact that they want it rebooted, and the time frame mentioned (49.7 days, which happens to coincide perfectly with when the 32-bit DWORD output from GetTickCount has to roll over to 0), means they are probably basing some timing info off the output of GetTickCount and can't properly handle the rollover. GetTickCount is based on the time elapsed since system start.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sysinfo/base/gettickcount.asp

Options are to have a thread managing your own timer values based on some floating point type or 64-bit integer, or to use 64-bit high resolution timers (all of which just move the problem further out, and all of which are available right now and have been for some time), or to properly handle the datatype used.

A popular option which is even worse is to base things off the system clock.
While you don't have to worry about a rollover for a long, long time with Windows FILETIME (64 bit), or with the epoch if using ctime, at that point you start getting all sorts of timing issues due to time correction software or the user changing the time.

Anyway, had they used high resolution timers (QueryPerformanceCounter/QueryPerformanceFrequency) instead of GetTickCount, they would have been working with an API available since like NT 3.1/Win9x and would have been using 64-bit INTs. If I recall correctly, they wouldn't have had an issue until the system had been up for something like 100 years (200 if using unsigned), which obviously could NEVER happen with a Windows system. It's been a while since I worked out the details of those functions. Anyway, many coders avoid them because they don't like working with 64-bit INTs.

joe

-----Original Message-----
From: Barry Fitzgerald [mailto:bkfsec () sdf lonestar org]
Sent: Friday, September 24, 2004 10:15 AM
To: joe
Cc: full-disclosure () lists netsys com
Subject: Re: [Full-disclosure] Windoze almost managed to 200x repeat 9/11

joe wrote:
Where issues like this relate to the OS is in the fact that the OS itself shouldn't be brought down by a poorly designed app. Of course, you can shoot yourself in the foot in any OS, but an overflow in a local app should never take down the kernel. Unfortunately, memory management in MS Windows (though it's gotten better over time) is still not up to par and that is what causes a number of these issues. Not to mention poor system architecture and design on the part of MS.

Was it MS Windows that actually held the code that brought the system down? Well, that depends on how far down you want to drill and where you place the burden of OS stability. If you place it on the OS, then Windows is fair game. If you place the burden of OS stability on the app, then you're foolish and don't understand OS design concepts. :) (said in jest, but then, so is most truth)

The article doesn't make the situation entirely clear. Did the app intentionally restart the system and foul it? Did the restart occur because the app crashed? I'm skeptical because technical details like this are usually confused, mislabeled, or misreported... even (especially?) in tech rags.

So, who holds the burden in this case depends on the answers to the questions above.

-Barry

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html
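
For anyone who wants the rollover arithmetic from joe's reply spelled out, here is a rough sketch. It is a Python model of a 32-bit millisecond counter, not Win32 code and not anything taken from the VSCS software: every function name is hypothetical, and the 1 GHz frequency in the 64-bit example is an assumed round number, not a measured QueryPerformanceFrequency value. It shows where the 49.7 days comes from, why comparing absolute tick values breaks near the wrap, and why working with elapsed time (subtraction modulo 2**32, which is what unsigned DWORD subtraction does in C) survives one rollover.

```python
# Sketch only: models a 32-bit millisecond tick counter such as the DWORD
# returned by GetTickCount. All names below are hypothetical.

TICK_MODULUS = 2 ** 32              # a 32-bit counter wraps to 0 after this many ms
MS_PER_DAY = 24 * 60 * 60 * 1000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def rollover_days():
    """Days until a 32-bit millisecond counter wraps back to 0."""
    return TICK_MODULUS / MS_PER_DAY


def rollover_years_signed64(ticks_per_second):
    """Years until a signed 64-bit counter wraps at the given tick rate."""
    return (2 ** 63) / ticks_per_second / SECONDS_PER_YEAR


def elapsed_ms(start, now):
    """Elapsed ms between two readings, correct across a single wrap.

    The modulo mirrors unsigned 32-bit subtraction in C. It is only valid
    while the real interval is shorter than one full counter period.
    """
    return (now - start) % TICK_MODULUS


def expired_naive(now, deadline):
    """Broken pattern: compares absolute tick values, wrong around a wrap."""
    return now >= deadline


def expired_safe(start, now, timeout_ms):
    """Compares elapsed time instead of absolute tick values."""
    return elapsed_ms(start, now) >= timeout_ms


if __name__ == "__main__":
    print(round(rollover_days(), 1))             # days until the 32-bit wrap

    start = TICK_MODULUS - 500                   # 500 ms before the wrap
    timeout = 1000
    deadline = (start + timeout) % TICK_MODULUS  # the deadline itself wraps to 500

    now = start + 100                            # only 100 ms have passed
    print(expired_naive(now, deadline))          # reports expired far too early
    print(expired_safe(start, now, timeout))     # still waiting, as it should be

    now = 600                                    # counter has wrapped; 1100 ms passed
    print(expired_safe(start, now, timeout))     # expired

    # Assumed 1 GHz counter, purely for illustration.
    print(round(rollover_years_signed64(1_000_000_000)))
```

The 64-bit figure depends entirely on the counter frequency, so this only illustrates that a wider counter moves the problem further out, as joe says, rather than removing it.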