oss-sec mailing list archives

Re: CVE-2014-8159 kernel: infiniband: uverbs: unprotected physical memory access


From: Solar Designer <solar () openwall com>
Date: Fri, 3 Apr 2015 02:20:14 +0300

Guys,

Can you please remove oss-security from CC on further replies that also
go to linux-kernel?  oss-security is meant to be focused on security,
whereas threads on linux-kernel tend to wander off.  In fact, I'd
appreciate it if we avoid this combination of CC's in the future - that
is, if linux-kernel is CC'ed, please don't CC oss-security on the same
message (as the resulting thread is extremely likely to be at least
partially off-topic for oss-security).  If something relevant to
oss-security comes up in a discussion on LKML, it's better to post a
summary of it to oss-security separately.  Thanks!

Alexander

Over-quoting to illustrate what I mean:

On Thu, Apr 02, 2015 at 06:12:58PM +0000, Haggai Eran wrote:
On Thursday, April 2, 2015 7:44 PM, Shachar Raindel wrote:
-----Original Message-----
From: Yann Droneaud [mailto:ydroneaud () opteya com]
Sent: Thursday, April 02, 2015 7:35 PM
To: Haggai Eran
Cc: Shachar Raindel; Sagi Grimberg; oss-security () lists openwall com;
<linux-rdma () vger kernel org> (linux-rdma () vger kernel org); linux-
kernel () vger kernel org; stable () vger kernel org
Subject: Re: CVE-2014-8159 kernel: infiniband: uverbs: unprotected
physical memory access

Hi Haggai,

Le jeudi 02 avril 2015 ? 18:18 +0300, Haggai Eran a ?crit :
On 02/04/2015 16:30, Yann Droneaud wrote:
Hi,

Le jeudi 02 avril 2015 ? 10:52 +0000, Shachar Raindel a ?crit :
-----Original Message-----
From: Yann Droneaud [mailto:ydroneaud () opteya com]
Sent: Thursday, April 02, 2015 1:05 PM
Le mercredi 18 mars 2015 ? 17:39 +0000, Shachar Raindel a ?crit :

+      /*
+       * If the combination of the addr and size requested for this
memory
+       * region causes an integer overflow, return error.
+       */
+      if ((PAGE_ALIGN(addr + size) <= size) ||
+          (PAGE_ALIGN(addr + size) <= addr))
+              return ERR_PTR(-EINVAL);
+

Can access_ok() be used here ?

         if (!access_ok(writable ? VERIFY_WRITE : VERIFY_READ,
                        addr, size))
                  return ERR_PTR(-EINVAL);


No, this will break the current ODP semantics.

ODP allows the user to register memory that is not accessible yet.
This is a critical design feature, as it allows avoiding holding
a registration cache. Adding this check will break the behavior,
forcing memory to be all accessible when registering an ODP MR.


Where's the check for the range being in userspace memory space,
especially for the ODP case ?

For non ODP case (eg. plain old behavior), does get_user_pages()
ensure the requested pages fit in userspace region on all
architectures ? I think so.

Yes, get_user_pages will return a smaller amount of pages than
requested
if it encounters an unmapped region (or a region without write
permissions for write requests). If this happens, the loop in
ib_umem_get calls get_user_pages again with the next set of pages, and
this time if it the first page still cannot be mapped an error is
returned.


In ODP case, I'm not sure such check is ever done ?

In ODP, we also call get_user_pages, but only when a page fault occurs
(see ib_umem_odp_map_dma_pages()). This allows the user to pre-
register
a memory region that contains unmapped virtual space, and then mmap
different files into that area without needing to re-register.


OK, thanks for the description.

...

Another related question: as the large memory range could be registered
by user space with ibv_reg_mr(pd, base, size, IB_ACCESS_ON_DEMAND),
what's prevent the kernel to map a file as the result of mmap(0, ...)
in this  region, making it available remotely through IBV_WR_RDMA_READ /
IBV_WR_RDMA_WRITE ?


This is not a bug. This is a feature.

Exposing a file through RDMA, using ODP, can be done exactly like this.
Given that the application explicitly requested this behavior, I don't
see why it is a problem. Actually, some of our tests use such flows.
The mmu notifiers mechanism allow us to do this safely. When the page is
written back to disk, it is removed from the ODP mapping. When it is
accessed by the HCA, it is brought back to RAM.



I want to add that we would like to see users registering a very large memory region (perhaps the entire process 
address space) for local access, and then enabling remote access only to specific regions using memory windows. 
However, this isn't supported yet by our driver. Still, there are valid cases where you would still want the results 
of an mmap(0,...) call to be remotely accessible, in cases where there is enough trust between the local process and 
the remote process. It may help a middleware communication library register a large portion of the address space in 
advance, and still work with random pointers given to it by another application module.


Regards,
Haggai


Current thread: