oss-sec mailing list archives

Re: CVE-2019-5736: runc container breakout (all versions)


From: Aleksa Sarai <asarai () suse de>
Date: Wed, 13 Feb 2019 01:37:08 +1100

On 2019-02-12, Florian Weimer <fweimer () redhat com> wrote:
+   memfd = memfd_create(MEMFD_COMMENT, MFD_CLOEXEC|MFD_ALLOW_SEALING);
+   if (memfd < 0)
+           goto err_binfd;

Is it really necessary to use a memfd_create here?  Do you really need
sealing?  It's a bit odd to add a new system call dependency in a
security update.  The ability fexecve a memfd descriptor is also rather
odd.  I wouldn't have expected execute permissions on memfd descriptors,
so this sounds like a kernel bug (which now can't be fixed).

The benefit of memfd_create is that you can make sure it's a memfd and
that it's sealed -- which means that you don't end up in a situation
where someone has configured their setup such that you think it's safe
when it isn't.

I don't agree that memfd execution is necessarily a kernel bug --
fexec(2) only gives you ETXTBSY if the file is open for writing. But
when a memfd is sealed it's no longer possible to open it for writing
(with mapping_deny_writable). It's just like having any other tmpfs file
and execing it.

I saw some other patch with a O_TMPFILE replacement.  Does this really
work?  It's possible to create a new name with linkat, so that's not a
real win security-wise.

I'm not sure what you mean by "not a real win security-wise". Yes,
someone could linkat(2) the O_TMPFILE on the host and then execute it,
but I don't see what the exploit vector is (you'd need to have a process
on the host that decides to find the O_TMPFILE fd and linkat(2) it --
and you'd have to linkat onto the same tmpfs filesystem anyway). It's
also definitely a win from the perspective that the vulnerability is
fixed.

I don't like O_TMPFILE because you can't differentiate between O_TMPFILE
and an unlinked file -- but it's much better than nothing.

Could you just make a copy, under a different owner, and not care how
it is going to be modified?

That won't work for rootless (read: unprivileged) containers, since you
can't change the owner. We could add it for the privileged case, but now
we will have 3 different fallbacks and I really am not a fan of that.

And rootless containers with a mapping for the unprivileged user to root
(where the binary is owned by the user) are vulnerable. memfd_create (or
O_TMPFILE) protects against all of these worries, and doesn't require
any cleanup of resources after-the-fact.

I'm also not sure it'll make a difference since the container user has
kuid=0 anyway, though I'm not sure if it has CAP_DAC_OVERRIDE. We could
chown the O_TMPFILE...

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description:


Current thread: