author | Jens Axboe <axboe@kernel.dk> | 2024-03-13 03:24:21 +0100 |
---|---|---|
committer | Jens Axboe <axboe@kernel.dk> | 2024-04-15 16:10:26 +0200 |
commit | 87585b05757dc70545efb434669708d276125559 (patch) | |
tree | d3020002d23692f431baf61797e51d439312950f /io_uring/io_uring.h | |
parent | io_uring/kbuf: vmap pinned buffer ring (diff) | |
io_uring/kbuf: use vm_insert_pages() for mmap'ed pbuf ring
Rather than use remap_pfn_range() for this and manually free later,
switch to using vm_insert_pages() and have it Just Work.
This requires a bit of effort on the mmap lookup side, as the ctx
uring_lock isn't held, which otherwise protects buffer_lists from being
torn down, and it's not safe to grab from mmap context, as that would
introduce an ABBA deadlock between the mmap lock and the ctx uring_lock.
Instead, look up the buffer_list under RCU, as the list is RCU freed
already. Use the existing reference count to determine whether it's
possible to safely grab a reference to it (e.g. if it's not zero already),
and drop that reference when done with the mapping. If the mmap
reference is the last one, the buffer_list and the associated memory can
go away, since the vma insertion has references to the inserted pages at
that point.
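
The "grab a reference only if the count is not already zero" step can be illustrated with a minimal userspace sketch using C11 atomics. It is an analogy, not the code from this patch: `struct demo_buf_list`, `demo_get_unless_zero()` and `demo_put()` are hypothetical names, and the RCU protection that keeps the object's memory valid during the lookup is not modeled here. In the kernel, a helper such as atomic_inc_not_zero() would typically serve this purpose.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a buffer list. */
struct demo_buf_list {
	atomic_int refs;
};

/*
 * Take a reference only if the count is still nonzero. A count of zero
 * means teardown has already begun, so the caller must not use the object.
 */
static bool demo_get_unless_zero(struct demo_buf_list *bl)
{
	int old = atomic_load(&bl->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&bl->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller held the last one. */
static bool demo_put(struct demo_buf_list *bl)
{
	return atomic_fetch_sub(&bl->refs, 1) == 1;
}
```

A caller in the position of the mmap path would call demo_get_unless_zero() after the lookup, do its work only if that returned true, and then call demo_put(), freeing the object if that returned true.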
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Diffstat (limited to 'io_uring/io_uring.h')
-rw-r--r-- | io_uring/io_uring.h | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 75230d914007..dec996a1c789 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -109,8 +109,10 @@ bool __io_alloc_req_refill(struct io_ring_ctx *ctx);
 bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
 			bool cancel_all);
 
-void *io_mem_alloc(size_t size);
-void io_mem_free(void *ptr);
+void *io_pages_map(struct page ***out_pages, unsigned short *npages,
+		   size_t size);
+void io_pages_unmap(void *ptr, struct page ***pages, unsigned short *npages,
+		    bool put_pages);
 
 enum {
 	IO_EVENTFD_OP_SIGNAL_BIT,