Full Disclosure mailing list archives

RE: Windoze almost managed to 200x repeat 9/11


From: "joe" <mvp () joeware net>
Date: Fri, 24 Sep 2004 16:06:55 -0400

It says right in the article they were running Windows 2000 Advanced Server.
The systems were not impacted by the Win95 hang bug. Almost certainly
Windows was fine... period. The communication software puked on the same
tick count API function that the Windows 95 dev guys screwed up. The value
rolls over and either the application software detects that and shuts down
the application or the application crashes because of poor exception
handling.
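
For what it's worth, here is a minimal sketch of the rollover described above, assuming the value in question is a 32-bit unsigned millisecond tick count like the one GetTickCount returns. The function names are made up for illustration and this is not the vendor's code, just the general pattern, written in Python with the 32-bit wrap emulated by a mask.

```python
MASK32 = 0xFFFFFFFF  # assumption: the tick count is a 32-bit unsigned value


def naive_elapsed(start_ticks, now_ticks):
    """Plain subtraction; goes negative once the counter has wrapped."""
    return now_ticks - start_ticks


def wrap_safe_elapsed(start_ticks, now_ticks):
    """Subtract modulo 2**32, as unsigned 32-bit arithmetic would.

    Gives the right answer across one rollover, provided the real
    interval is shorter than 2**32 milliseconds.
    """
    return (now_ticks - start_ticks) & MASK32


# Hypothetical readings: 256 ms before the rollover, 256 ms after it.
before_wrap = 0xFFFFFF00
after_wrap = 0x00000100

print(naive_elapsed(before_wrap, after_wrap))      # negative, nonsense interval
print(wrap_safe_elapsed(before_wrap, after_wrap))  # 512
```

An app that does the first kind of subtraction, or compares raw tick values to decide whether a timeout has expired, can misbehave at the rollover even though the OS underneath is perfectly healthy.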

If Windows crashed of its own accord, then yes, MS needs to share some
blame. If what actually happened is that a crappy app died and the OS was
fine the whole time, the responsibility rests with the application vendor
and the design/implementation team.

Should technicians be rebooting boxes as fixes? Absolutely not. However,
before assuming it is an OS issue, understand why they are rebooting it. In
this case I expect it was to reset the tick count for the application
itself. If it is because the app is eating all the memory up, that is one
hellacious memory leak they need to work on in the app. 


  joe



-----Original Message-----
From: full-disclosure-admin () lists netsys com
[mailto:full-disclosure-admin () lists netsys com] On Behalf Of Michal Zalewski
Sent: Friday, September 24, 2004 2:32 PM
To: ASB
Cc: full-disclosure () lists netsys com
Subject: Re: [Full-disclosure] Windoze almost managed to 200x repeat 9/11

On Fri, 24 Sep 2004, ASB wrote:

"The servers are timed to shut down after 49.7 days of use in order to 
prevent a data overload, a union official told the LA Times."

How you managed to read "OS failure" into this is rather astounding...

The statement above, whether cleverly disguised by the authorities or
mangled by the press, does ring a bell. It is not about applications eating
up too much memory and hence requiring an occasional reboot, oh no.

Windows 9x had a problem (fixed by Microsoft, by the way) that caused them
to hang or crash after a jiffie counter in the kernel overflowed:

  http://support.microsoft.com/support/kb/articles/q216/6/41.asp

It would happen precisely after 49.7 days. Coincidence? Not very likely.
It seems that the system was running on unpatched Windows 95 or 98, and
rather than deploying a patch, they came up with a maintenance procedure
requiring a scheduled reboot every 30 days.
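
The 49.7-day figure is what you would expect from a 32-bit counter of milliseconds wrapping around. A quick check of the arithmetic, assuming that counter width and resolution (the variable names are illustrative only):

```python
# Assumption: the counter is 32 bits wide and counts milliseconds.
TICK_VALUES = 2 ** 32
MS_PER_DAY = 24 * 60 * 60 * 1000

days_until_wrap = TICK_VALUES / MS_PER_DAY

# Break the same interval down into days, hours, minutes and seconds.
whole_days, rem_ms = divmod(TICK_VALUES, MS_PER_DAY)
hours, rem_ms = divmod(rem_ms, 60 * 60 * 1000)
minutes, rem_ms = divmod(rem_ms, 60 * 1000)
seconds = rem_ms // 1000

print(f"{days_until_wrap:.1f} days")  # 49.7 days
print(whole_days, hours, minutes, seconds)  # 49 17 2 47
```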

This is one hell of a ridiculous idea, and any attempt to blame a failure on
a technician who failed to reboot the box is really pushing it.

It is not uncommon for telecommunications, medical, flight control, banking
and other mission-critical applications to run on terribly ancient software
(and with a clause that requires them NOT to be updated, because the
software is not certified against those patches).

In the end, the OS and the decision-makers who implemented the system and
established ill-conceived workarounds should split the blame.

/mz

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html


