Bugtraq mailing list archives

Re: Linux kernels DoSable by file-max limit


From: Andrea Arcangeli <andrea () suse de>
Date: Wed, 10 Jul 2002 23:07:41 +0200

On Sun, Jul 07, 2002 at 10:54:44PM +0200, Paul Starzetz wrote:
Hi,

the recently mentioned problem in BSD kernels concerning the global
limit of open files seems to be present in the Linux kernel too. However,
as mentioned in the advisory about the BSD-specific problem, the Linux
kernel keeps some additional file slots reserved for the root user. This
code can be found in the fs/file_table.c source file (2.4.18):

struct file * get_empty_filp(void)
{
   static int old_max = 0;
   struct file * f;

   file_list_lock();
   if (files_stat.nr_free_files > NR_RESERVED_FILES) {
   used_one:
       f = list_entry(free_list.next, struct file, f_list);

[...]

   /*
    * Use a reserved one if we're the superuser
    */
[*]  if (files_stat.nr_free_files && !current->euid)
       goto used_one;


Grepping the source code (2.4.18) reveals that the limit is pretty low:

./include/linux/fs.h:#define NR_RESERVED_FILES 10 /* reserved for root */
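
On a running Linux system, the global limit and the current usage can be
inspected through procfs (on 2.4 the second field of file-nr is the
nr_free_files counter discussed below):

```shell
# Current global limit on file handles (fs.file-max):
cat /proc/sys/fs/file-max
# Allocated handles, free handles, and the limit, space-separated:
cat /proc/sys/fs/file-nr
```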

well, that's not really a security measure in the first place, I mean there's
nothing to exploit; it's more a hack to try to improve the chances of
keeping a usable machine as root after you hit file-max, but it's not
guaranteed to work at all, regardless of malicious or non-malicious
workloads. Linux never enforces keeping nr_free_files at a level >=
NR_RESERVED_FILES, it just tries to do that lazily, so it's not
guaranteed you will have any nr_free_files when you happen to need them.

For example, if you keep opening files from boot onwards and never
execute a single close() or exit() syscall, no file ever returns to the
free list, so no matter who you are (root or not), you will
never pass this test "if (files_stat.nr_free_files && !current->euid)"
because nr_free_files will always be zero.

Furthermore, that part of the VFS file allocation management needs a
rewrite (I hope it will happen in 2.5), and the file-max limit should go
away just as inode-max went away in 2.3. At the moment released
files can never be freed dynamically, and that's not good. There
should be a proper slab cache, and fput() should call kmem_cache_free()
instead of putting the file onto the unshrinkable
"file_table.c::free_list". But this is more a linux-kernel topic...

Once we make it possible to shrink the pool of released files, the
file-max limit can go away (we need it now, or all of RAM could be pinned
in this unshrinkable "free_list"). Then, if you allocate all of RAM into
files, the machine will go out of memory at some point. That moves the
DoS issue elsewhere: into the memory management area, where it becomes a
generic problem, no longer specific to file allocations. After you hit
the OOM point, even if you could still allocate the struct file from a
root-reserved pool, you might not be able to allocate the dentry and the
inode afterwards.

Anyway, regardless of the possible memory-management OOM DoSes (when
running out of RAM resources), removing file-max is a good thing because
it makes Linux much more usable: if you need lots of files during a
temporary spike of load, you won't be left with a huge pile of file
structures hanging around, and the VM will shrink them as you need more
RAM later. And if you hit OOM, it's very likely (though not guaranteed,
also considering the different algorithms for handling OOM conditions,
some deadlock-prone, some not) that the offending task will be killed
too, making any malicious attack much less reproducible than it is now.

[..]
Exploitability to get uid=0 has not been confirmed yet but seems possible.

If that's the case, it's a userspace bug in the suid applications you're
executing; it's certainly not a kernel issue.

Andrea

