oss-sec mailing list archives

Re: Out-of-bounds read & write in the glibc's qsort()


From: Solar Designer <solar () openwall com>
Date: Mon, 5 Feb 2024 18:23:45 +0100

On Mon, Feb 05, 2024 at 03:56:41PM +0000, Qualys Security Advisory wrote:
On Sun, Feb 04, 2024 at 05:35:20PM +0100, Solar Designer wrote:
It's so invasive I cannot easily tell whether qsort() remained robust
after it or not.  There's no longer a "tmp_ptr != base_ptr &&" check.
So, lacking known-working tests in glibc tree, we don't know about glibc
2.39's status with respect to this issue.

The "tmp_ptr != base_ptr" bounds check was originally added to the
_quicksort() function, but is not needed anymore in glibc 2.39 because
the old fallback to quick sort (the _quicksort() function) has been
completely removed and replaced by a fallback to heap sort.

Note, just in case: we have not reviewed the implementation of this new
fallback to heap sort.

Oh, I should have spent a bit more time looking at the latest glibc
before posting.  I just did.  So it indeed did not reintroduce this same
issue.  That's great.

Regarding the tests, I now see that one of them explicitly calls
heapsort_r(), so it does test that fallback code.  However, the rest
simply call qsort() or qsort_r(), so they only test non-fallback code.
It'd improve code coverage of these tests if they first do what they do
now, and then repeat the same after setting RLIMIT_AS to 0.

On Mon, Feb 05, 2024 at 05:02:52PM +0800, Alexander E. Patrakov wrote:
On Mon, Feb 5, 2024 at 4:45 PM Alexander E. Patrakov <patrakov () gmail com> wrote:
On Mon, Feb 5, 2024 at 4:40 PM Alexander E. Patrakov <patrakov () gmail com> wrote:
On Mon, Feb 5, 2024 at 12:36 AM Solar Designer <solar () openwall com> wrote:
I don't have a glibc 2.39 build handy.  Perhaps someone on a distro that
has already updated can run the attached test program and let us know?

Here you go: no output on Arch Linux.

[aep@aep-haswell tmp]$ gcc ./glibc-qualys-rocky-qsort-test.c
[aep@aep-haswell tmp]$ ./a.out
[aep@aep-haswell tmp]$ /lib64/libc.so.6
GNU C Library (GNU libc) stable release version 2.39.

Sorry, I should have followed the instructions.

[aep@aep-haswell tmp]$ while true; do n=$((RANDOM*64+RANDOM+1));
prlimit --as=$((n*4/2*3)) ./a.out $n; done

This results in a mix of these outputs:

PASSED
./a.out: error while loading shared libraries: libc.so.6: failed to
map segment from shared object
Segmentation fault

Upon investigation, I have to add: the segmentation faults come from
code that runs before main(), so they do not indicate a problem in
qsort().

Sorry, I should have included usage instructions.  It's like this:

gcc glibc-qualys-rocky-qsort-test.c -o glibc-qualys-rocky-qsort-test -O2
while true; do n=$((RANDOM*64+RANDOM+1)); echo $n; ./glibc-qualys-rocky-qsort-test $n; done

In other words, almost the same as Qualys', but with prlimit omitted because
the program itself now takes care of it.  With our current patched glibc
in Rocky Linux SIG/Security, the output is like this:

396121
PASSED
77207
PASSED
683895
PASSED
1402983
PASSED

and so on.  No crashes anymore.  Before the one-line patch, it would hit
the test program's abort() within seconds, like Qualys had observed:

153916
PASSED
990497
PASSED
1501673
PASSED
1344354
PASSED
176197
PASSED
326004
Aborted (core dumped)
1892398
Aborted (core dumped)
834837
PASSED
2066676
PASSED
589237
Aborted (core dumped)

As to the occasional segfaults when you do use prlimit, I also saw them
on Rocky Linux 9.  They appeared to come from the kernel right after
execve() fails and kind of returns control back to prlimit.  I think
they're a symptom of execve() concluding it ran out of memory too late
for it to allow the original program to continue running.  As I recall
from patching this code in the kernel many years ago, such conditions
did and probably still do exist.  That's kind of fine.

Alexander

