Vulnerability Development mailing list archives
Re: BitchX /ignore bug
From: friedl () MTNDEW COM (Stephen J. Friedl)
Date: Tue, 4 Jul 2000 19:08:36 -0700
BlueBoar asked that I post this to the list as a whole, so I've expanded on it a bit. BlueBoar wrote:
I've seen a number of these print string vulnerabilities pop up lately. I gather that the programmer writes their printf or equiv wrong, and these attacks are getting interpreted as formatting strings somehow.
The printf() functions (with siblings fprintf and sprintf) take a string parameter that is a format string, and it contains little % tokens that represent the type and interpretation of the parameters that follow. Unlike many other languages, C permits variadic functions and the I/O is not built into the language.

    printf("string=%s, number=%d, float=%f\n", "hello, world", 1234, 3.1415);

Except on the most unbelievably bizarre platforms, these parameters are generally passed on the stack in the usual order for that architecture. On the Intel machines, for instance, the params are pushed right to left and the stack grows down. Other architectures can and do either of these differently. So this call would be:

    push high word of double        (a slice of pi?)
    push low word of double
    push 1234
    push addr of "hello, world"
    push addr of "string=%s, number=%d, float=%f\n"
    call printf
    add  sp, $20

Once inside printf(), it essentially takes the address of the one fixed parameter -- the format string -- and walks up the stack getting parameters as suggested by the % tokens. Fetch a string (by pointer), a double, or whatever. The stack looks something like this after the call to printf:

            +------------+
            | local var  |  local to calling function, but not touched here
            +------------+
            | high of pi |
            +------------+
            | low of pi  |
            +------------+
            |    1234    |
            +------------+
            | stringaddr | --------> "hello, world"
            +------------+
    FMT->   | stringaddr | --------> "string=%s, number=%d, float=%f\n"
            +------------+
            |   old PC   |  (aka "return address")
            +------------+
            |old frameptr|  (on Intel, it's EBP)
            +------------+
            | local #1   |  local variable of *printf* function
            +------------+
            | local #2   |
            +------------+
            | ,,,,,,,,,  |

What's important here is that (generally speaking) printf has no way of knowing how many parameters were *actually* pushed onto the stack. It has to trust the format string, and it's entirely possible to make a mistake:

    printf("hello, world = %s\n");

    push address of "hello, world = %s\n"

Oops!
The argument pointer will be looking at random data for the string:

            +------------+
            | local var  |  local to calling function  * IS THIS A STRING? *
            +------------+
    FMT->   | stringaddr | --------> "hello, world = %s\n"
            +------------+
            |   old PC   |  (aka "return address")
            +------------+
            |old frameptr|  (on Intel, it's EBP)
            +------------+
            | local #1   |  local variable of *printf* function
            +------------+
            | local #2   |
            +------------+
            | ,,,,,,,,,  |

When printf sees the %s in the format string, it looks next up the stack and grabs the next word found there. This word has nothing to do with anything interesting -- it's essentially random -- but printf treats it like a pointer to a string (an array of characters). Then it follows it until it finds a NUL byte, and you usually either get garbage or the program core dumps. The latter comes from dereferencing either a zero address (NULL pointer) or memory you're not allowed to read.

If you have the source you can see what the last local variable is and make some guesses about what value might be used, but this won't be easy to determine without the source, and it's even harder to find a way to do something with it. But in no case does this permit arbitrary code execution. Printf does have internal buffers, but it's smart enough to never overflow them.

What's happening with these recent bugs is that some extremely sloppy programmer has passed an unchecked string to printf *as the format parameter*, and this is simply wrong. You NEVER EVER do something like this:

    printf("Enter your name: ");
    gets(namebuf);
    printf(namebuf);        /* BOOM */

Here your formats are at the mercy of the user, and it's simply never done. This is much, much sloppier than the buffer overflow problem, because doing this correctly is so easy:

    printf("%s", namebuf);

gets it right every time.
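As a minimal sketch of the right way around (the helper names read_line and greet are made up for illustration, and snprintf assumes a C99-style library): read the name with a bounded fgets() rather than gets(), and hand the user's text to the printf family only as an *argument* to a constant format string, never as the format itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (at most size-1 chars)
 * and drop the trailing newline, if any.
 * Returns 0 on success, -1 on EOF or error. */
static int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';     /* cut at the first newline */
    return 0;
}

/* Hypothetical helper: format untrusted text as DATA.
 * The format string is a constant, so any % tokens inside `untrusted`
 * are copied through literally and never interpreted.
 * Returns what snprintf returns: the length the complete output
 * would have had, even if it was truncated to fit outsize. */
static int greet(char *out, size_t outsize, const char *untrusted)
{
    return snprintf(out, outsize, "Hello, %s!", untrusted);
}
```

A caller would do something like read_line(stdin, namebuf, sizeof namebuf) followed by greet(msg, sizeof msg, namebuf). A name such as "%s%s%s%s" then comes out as those literal characters instead of walking up the stack.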
In the printf() case (and fprintf, which writes to a FILE pointer) there is simply no opportunity for anything other than random DoS, though you can increase your chances by putting lots of %s in the string: "%s%s%s%s%s" will follow five strings, not just one, and you're much more likely to break things this way.

Now it is technically possible to get somewhere with the sprintf() function, which formats to a string buffer instead of a file. What would happen here is that the random garbage would be copied to the user's buffer, and random garbage is always much longer than any string buffer that you could find. If the buffer is on the stack, you have a buffer overflow exploit, but you have very little control over the random garbage.

But in any case, I think the sprintf() case is really unlikely. Though the "printf"-as-"print-string" idiom is common (but wrong), sprintf is just not used as a copy-string idiom. That's what strcpy() is for, and I don't think I've ever seen sprintf used in this wrong way.

Finally we get to the user-defined formatting functions built on vsprintf() and friends. This is where you write your own printf-like function that integrates into your application:

    logprintf(LEVEL5, "hello, world = %d", n);

This might format into a fixed-length buffer before sending to a logfile or something, and this lends itself to the mistaken use of:

    logprintf(LEVEL5, userbuf);

You are much more likely to get an overflow with the *contents* of userbuf than by including some % token and hoping that random data will happen to be shellcode.

What's more likely is the fishing of data out of an application that its authors don't want visible. Since these phantom % tokens are walking up the stack picking up local variables, if any of them are interesting you can possibly see them.
On my Intel Linux box:

    $ cat secret.c
    #include <stdio.h>

    int main()
    {
        char    userbuf[254];
        double  pi = 3.1415;
        int     magicnum = 1234;
        char    *secret = "secret password!";

        gets(userbuf);
        printf(userbuf);    // user-controlled format string! Bad!

        return 0;
    }
    $ cc -o secret secret.c
    $ ./secret
    me->     secret="%s" magic=%d pi=%f
    output-> secret="secret password!" magic=1234 pi=3.141500

You can always use %d markers to skip past values you don't care about:

    %s              is first string interesting?
    %d %s           how about the second one?
    %d %d %s        ... and so on

Using "0x%08lx" prints the values in hex, which might ring some bells.

I guess upon reflection somebody will inevitably find a real shellcode exploit for this, but it's going to be way, way harder than "regular" buffer overflows.

Steve

Stephen J. Friedl / Software Consultant / Tustin, CA / 714-544-6561