Full Disclosure mailing list archives

Re: "IO wait chains" in Linux??


From: Valdis.Kletnieks () vt edu
Date: Mon, 07 Feb 2011 15:06:00 -0500

On Mon, 07 Feb 2011 18:29:04 GMT, "Cal Leeming [Simplicity Media Ltd]" said:

21000 - /usr/local/sbin/nginx - [D]
 - /tmp/.somefile
    - other PIDs waiting on this file (not just children of the parent)
        - 51283 - /usr/local/sbin/apache (4.6 seconds)
        - 31028 - /usr/local/sbin/python2.6 (1.9 seconds)

Sadly, I don't know much about how the kernel and the IO schedulers handle
these things behind the scenes, so what I'm asking for may be impossible
(apart from your other suggestion using watchdog+dmesg).

You need to distinguish between I/O that's supposed to complete quickly (for
instance, local disk I/O), and stuff that can reasonably take a while (reads
from a network connection, waiting for a file lock, etc).  The 'D' state is for
stuff that's supposed to be fast, if it's taking more than milliseconds you
have a big problem - as in "you have hit an actual kernel bug or hardware
has failed on you".  More often, you'll block inside open(), flock(), poll(),
and similar places, resulting in a 'sleeping' status.

So the big question is "what you're trying to accomplish" rather than "is there
a CLI tool that does XYZ" - most of the problems won't be found by checking for
XYZ at all...

Attachment: _bin
Description:

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/

Current thread: