From 68c034fefe20eaf7d5569aae84584b07987ce50a Mon Sep 17 00:00:00 2001 From: Yoshihiro YUNOMAE Date: Tue, 23 Jul 2013 11:30:49 +0930 Subject: virtio/console: Quit from splice_write if pipe->nrbufs is 0 Quit from splice_write if pipe->nrbufs is 0 for avoiding oops in virtio-serial. When an application was doing splice from a kernel buffer to virtio-serial on a guest, the application received signal(SIGINT). This situation will normally happen, but the kernel executed a kernel panic by oops as follows: BUG: unable to handle kernel paging request at ffff882071c8ef28 IP: [] sg_init_table+0x2f/0x50 PGD 1fac067 PUD 0 Oops: 0000 [#1] SMP Modules linked in: lockd sunrpc bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd microcode virtio_balloon virtio_net pcspkr soundcore i2c_piix4 i2c_core uinput floppy CPU: 1 PID: 908 Comm: trace-cmd Not tainted 3.10.0+ #49 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880071c64650 ti: ffff88007bf24000 task.ti: ffff88007bf24000 RIP: 0010:[] [] sg_init_table+0x2f/0x50 RSP: 0018:ffff88007bf25dd8 EFLAGS: 00010286 RAX: 0000001fffffffe0 RBX: ffff882071c8ef28 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880071c8ef48 RBP: ffff88007bf25de8 R08: ffff88007fd15d40 R09: ffff880071c8ef48 R10: ffffea0001c71040 R11: ffffffff8139c555 R12: 0000000000000000 R13: ffff88007506a3c0 R14: ffff88007c862500 R15: ffff880071c8ef00 FS: 00007f0a3646c740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff882071c8ef28 CR3: 000000007acbb000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff880071c8ef48 ffff88007bf25e20 ffff88007bf25e88 ffffffff8139d6fa ffff88007bf25e28 ffffffff8127a3f4 0000000000000000 0000000000000000 ffff880071c8ef48 0000100000000000 0000000000000003 ffff88007bf25e08 Call Trace: [] port_fops_splice_write+0xaa/0x130 [] ? selinux_file_permission+0xc4/0x120 [] ? wait_port_writable+0x1b0/0x1b0 [] do_splice_from+0xa0/0x110 [] SyS_splice+0x5ff/0x6b0 [] system_call_fastpath+0x16/0x1b Code: c1 e2 05 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 31 f6 48 89 5d f0 48 89 fb e8 8d ce ff ff 41 8d 44 24 ff 48 c1 e0 05 48 01 c3 <48> 8b 03 48 83 e0 fe 48 83 c8 02 48 89 03 48 8b 5d f0 4c 8b 65 RIP [] sg_init_table+0x2f/0x50 RSP CR2: ffff882071c8ef28 ---[ end trace 86323505eb42ea8f ]--- It seems to induce pagefault in sg_init_tabel() when pipe->nrbufs is equal to zero. This may happen in a following situation: (1) The application normally does splice(read) from a kernel buffer, then does splice(write) to virtio-serial. (2) The application receives SIGINT when is doing splice(read), so splice(read) is failed by EINTR. However, the application does not finish the operation. (3) The application tries to do splice(write) without pipe->nrbufs. (4) The virtio-console driver tries to touch scatterlist structure sgl in sg_init_table(), but the region is out of bound. To avoid the case, a kernel should check whether pipe->nrbufs is empty or not when splice_write is executed in the virtio-console driver. V3: Add Reviewed-by lines and stable@ line in sign-off area. Signed-off-by: Yoshihiro YUNOMAE Reviewed-by: Amit Shah Reviewed-by: Masami Hiramatsu Cc: Amit Shah Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: stable@vger.kernel.org Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 1b456fe9b87a..8722656cdebf 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -932,6 +932,13 @@ static ssize_t port_fops_splice_write(struct pipe_inode_info *pipe, if (is_rproc_serial(port->out_vq->vdev)) return -EINVAL; + /* + * pipe->nrbufs == 0 means there are no data to transfer, + * so this returns just 0 for no data. + */ + if (!pipe->nrbufs) + return 0; + ret = wait_port_writable(port, filp->f_flags & O_NONBLOCK); if (ret < 0) return ret; -- cgit v1.2.3 From 2b4fbf029dff5a28d9bf646346dea891ec43398a Mon Sep 17 00:00:00 2001 From: Yoshihiro YUNOMAE Date: Tue, 23 Jul 2013 11:30:49 +0930 Subject: virtio/console: Add pipe_lock/unlock for splice_write Add pipe_lock/unlock for splice_write to avoid oops by following competition: (1) An application gets fds of a trace buffer, virtio-serial, pipe. (2) The application does fork() (3) The processes execute splice_read(trace buffer) and splice_write(virtio-serial) via same pipe. get fds of a trace buffer, virtio-serial, pipe | fork()----------create--------+ | | splice(read) | ---+ splice(write) | +-- no competition | splice(read) | | splice(write) ---+ | | splice(read) | splice(write) splice(read) ------ competition | splice(write) Two processes share a pipe_inode_info structure. If the child execute splice(read) when the parent tries to execute splice(write), the structure can be broken. Existing virtio-serial driver does not get lock for the structure in splice_write, so this competition will induce oops. BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [] splice_from_pipe_feed+0x6f/0x130 PGD 7223e067 PUD 72391067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: lockd bnep bluetooth rfkill sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr virtio_net virtio_balloon i2c_piix4 i2c_core microcode uinput floppy CPU: 0 PID: 1072 Comm: compete-test Not tainted 3.10.0ws+ #55 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880071b98000 ti: ffff88007b55e000 task.ti: ffff88007b55e000 RIP: 0010:[] [] splice_from_pipe_feed+0x6f/0x130 RSP: 0018:ffff88007b55fd78 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88007b55fe20 RCX: 0000000000000000 RDX: 0000000000001000 RSI: ffff88007a95ba30 RDI: ffff880036f9e6c0 RBP: ffff88007b55fda8 R08: 00000000000006ec R09: ffff880077626708 R10: 0000000000000003 R11: ffffffff8139ca59 R12: ffff88007a95ba30 R13: 0000000000000000 R14: ffffffff8139dd00 R15: ffff880036f9e6c0 FS: 00007f2e2e3a0740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 0000000071bd1000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffff8139ca59 ffff88007b55fe20 ffff880036f9e6c0 ffffffff8139dd00 ffff8800776266c0 ffff880077626708 ffff88007b55fde8 ffffffff811a6e8e ffff88007b55fde8 ffffffff8139ca59 ffff880036f9e6c0 ffff88007b55fe20 Call Trace: [] ? alloc_buf.isra.13+0x39/0xb0 [] ? virtcons_restore+0x100/0x100 [] __splice_from_pipe+0x7e/0x90 [] ? alloc_buf.isra.13+0x39/0xb0 [] port_fops_splice_write+0xe9/0x140 [] ? selinux_file_permission+0xc4/0x120 [] ? wait_port_writable+0x1b0/0x1b0 [] do_splice_from+0xa0/0x110 [] SyS_splice+0x5ff/0x6b0 [] tracesys+0xdd/0xe2 Code: 49 8b 87 80 00 00 00 4c 8d 24 d0 8b 53 04 41 8b 44 24 0c 4d 8b 6c 24 10 39 d0 89 03 76 02 89 13 49 8b 44 24 10 4c 89 e6 4c 89 ff 50 18 85 c0 0f 85 aa 00 00 00 48 89 da 4c 89 e6 4c 89 ff 41 RIP [] splice_from_pipe_feed+0x6f/0x130 RSP CR2: 0000000000000018 ---[ end trace 24572beb7764de59 ]--- V2: Fix a locking problem for error V3: Add Reviewed-by lines and stable@ line in sign-off area Signed-off-by: Yoshihiro YUNOMAE Reviewed-by: Amit Shah Reviewed-by: Masami Hiramatsu Cc: Amit Shah Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: stable@vger.kernel.org Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 8722656cdebf..8a15af3e1a9d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -936,16 +936,21 @@ static ssize_t port_fops_splice_write(struct pipe_inode_info *pipe, * pipe->nrbufs == 0 means there are no data to transfer, * so this returns just 0 for no data. */ - if (!pipe->nrbufs) - return 0; + pipe_lock(pipe); + if (!pipe->nrbufs) { + ret = 0; + goto error_out; + } ret = wait_port_writable(port, filp->f_flags & O_NONBLOCK); if (ret < 0) - return ret; + goto error_out; buf = alloc_buf(port->out_vq, 0, pipe->nrbufs); - if (!buf) - return -ENOMEM; + if (!buf) { + ret = -ENOMEM; + goto error_out; + } sgl.n = 0; sgl.len = 0; @@ -953,12 +958,17 @@ static ssize_t port_fops_splice_write(struct pipe_inode_info *pipe, sgl.sg = buf->sg; sg_init_table(sgl.sg, sgl.size); ret = __splice_from_pipe(pipe, &sd, pipe_to_sg); + pipe_unlock(pipe); if (likely(ret > 0)) ret = __send_to_port(port, buf->sg, sgl.n, sgl.len, buf, true); if (unlikely(ret <= 0)) free_buf(buf, true); return ret; + +error_out: + pipe_unlock(pipe); + return ret; } static unsigned int port_fops_poll(struct file *filp, poll_table *wait) -- cgit v1.2.3 From 057b82be3ca3d066478e43b162fc082930a746c9 Mon Sep 17 00:00:00 2001 From: Amit Shah Date: Mon, 29 Jul 2013 14:16:13 +0930 Subject: virtio: console: fix race with port unplug and open/close There's a window between find_port_by_devt() returning a port and us taking a kref on the port, where the port could get unplugged. Fix it by taking the reference in find_port_by_devt() itself. Problem reported and analyzed by Mateusz Guzik. CC: Reported-by: Mateusz Guzik Signed-off-by: Amit Shah Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 8a15af3e1a9d..3beea9d478bc 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -272,9 +272,12 @@ static struct port *find_port_by_devt_in_portdev(struct ports_device *portdev, unsigned long flags; spin_lock_irqsave(&portdev->ports_lock, flags); - list_for_each_entry(port, &portdev->ports, list) - if (port->cdev->dev == dev) + list_for_each_entry(port, &portdev->ports, list) { + if (port->cdev->dev == dev) { + kref_get(&port->kref); goto out; + } + } port = NULL; out: spin_unlock_irqrestore(&portdev->ports_lock, flags); @@ -1036,14 +1039,10 @@ static int port_fops_open(struct inode *inode, struct file *filp) struct port *port; int ret; + /* We get the port with a kref here */ port = find_port_by_devt(cdev->dev); filp->private_data = port; - /* Prevent against a port getting hot-unplugged at the same time */ - spin_lock_irq(&port->portdev->ports_lock); - kref_get(&port->kref); - spin_unlock_irq(&port->portdev->ports_lock); - /* * Don't allow opening of console port devices -- that's done * via /dev/hvc -- cgit v1.2.3 From 671bdea2b9f210566610603ecbb6584c8a201c8c Mon Sep 17 00:00:00 2001 From: Amit Shah Date: Mon, 29 Jul 2013 14:17:13 +0930 Subject: virtio: console: fix race in port_fops_open() and port unplug Between open() being called and processed, the port can be unplugged. Check if this happened, and bail out. A simple test script to reproduce this is: while true; do for i in $(seq 1 100); do echo $i > /dev/vport0p3; done; done; This opens and closes the port a lot of times; unplugging the port while this is happening triggers the bug. CC: Signed-off-by: Amit Shah Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 3beea9d478bc..ffa7e46faff9 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1041,6 +1041,10 @@ static int port_fops_open(struct inode *inode, struct file *filp) /* We get the port with a kref here */ port = find_port_by_devt(cdev->dev); + if (!port) { + /* Port was unplugged before we could proceed */ + return -ENXIO; + } filp->private_data = port; /* -- cgit v1.2.3 From ea3768b4386a8d1790f4cc9a35de4f55b92d6442 Mon Sep 17 00:00:00 2001 From: Amit Shah Date: Mon, 29 Jul 2013 14:20:29 +0930 Subject: virtio: console: clean up port data immediately at time of unplug We used to keep the port's char device structs and the /sys entries around till the last reference to the port was dropped. This is actually unnecessary, and resulted in buggy behaviour: 1. Open port in guest 2. Hot-unplug port 3. Hot-plug a port with the same 'name' property as the unplugged one This resulted in hot-plug being unsuccessful, as a port with the same name already exists (even though it was unplugged). This behaviour resulted in a warning message like this one: -------------------8<--------------------------------------- WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted) Hardware name: KVM sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:04.0/virtio0/virtio-ports/vport0p1' Call Trace: [] ? warn_slowpath_common+0x87/0xc0 [] ? warn_slowpath_fmt+0x46/0x50 [] ? sysfs_add_one+0xc9/0x130 [] ? create_dir+0x68/0xb0 [] ? sysfs_create_dir+0x39/0x50 [] ? kobject_add_internal+0xb9/0x260 [] ? kobject_add_varg+0x38/0x60 [] ? kobject_add+0x44/0x70 [] ? get_device_parent+0xf4/0x1d0 [] ? device_add+0xc9/0x650 -------------------8<--------------------------------------- Instead of relying on guest applications to release all references to the ports, we should go ahead and unregister the port from all the core layers. Any open/read calls on the port will then just return errors, and an unplug/plug operation on the host will succeed as expected. This also caused buggy behaviour in case of the device removal (not just a port): when the device was removed (which means all ports on that device are removed automatically as well), the ports with active users would clean up only when the last references were dropped -- and it would be too late then to be referencing char device pointers, resulting in oopses: -------------------8<--------------------------------------- PID: 6162 TASK: ffff8801147ad500 CPU: 0 COMMAND: "cat" #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b #1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322 #2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50 #3 [ffff88011b9d5bf0] die at ffffffff8100f26b #4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2 #5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5 [exception RIP: strlen+2] RIP: ffffffff81272ae2 RSP: ffff88011b9d5d00 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880118901c18 RCX: 0000000000000000 RDX: ffff88011799982c RSI: 00000000000000d0 RDI: 3a303030302f3030 RBP: ffff88011b9d5d38 R8: 0000000000000006 R9: ffffffffa0134500 R10: 0000000000001000 R11: 0000000000001000 R12: ffff880117a1cc10 R13: 00000000000000d0 R14: 0000000000000017 R15: ffffffff81aff700 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d #7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551 #8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb #9 [ffff88011b9d5de0] device_del at ffffffff813440c7 -------------------8<--------------------------------------- So clean up when we have all the context, and all that's left to do when the references to the port have dropped is to free up the port struct itself. CC: Reported-by: chayang Reported-by: YOGANANTH SUBRAMANIAN Reported-by: FuXiangChun Reported-by: Qunfang Zhang Reported-by: Sibiao Luo Signed-off-by: Amit Shah Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index ffa7e46faff9..4e684faee10b 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1518,14 +1518,6 @@ static void remove_port(struct kref *kref) port = container_of(kref, struct port, kref); - sysfs_remove_group(&port->dev->kobj, &port_attribute_group); - device_destroy(pdrvdata.class, port->dev->devt); - cdev_del(port->cdev); - - kfree(port->name); - - debugfs_remove(port->debugfs_file); - kfree(port); } @@ -1583,6 +1575,14 @@ static void unplug_port(struct port *port) */ port->portdev = NULL; + sysfs_remove_group(&port->dev->kobj, &port_attribute_group); + device_destroy(pdrvdata.class, port->dev->devt); + cdev_del(port->cdev); + + kfree(port->name); + + debugfs_remove(port->debugfs_file); + /* * Locks around here are not necessary - a port can't be * opened after we removed the port struct from ports_list -- cgit v1.2.3 From 92d3453815fbe74d539c86b60dab39ecdf01bb99 Mon Sep 17 00:00:00 2001 From: Amit Shah Date: Mon, 29 Jul 2013 14:21:32 +0930 Subject: virtio: console: fix raising SIGIO after port unplug SIGIO should be sent when a port gets unplugged. It should only be sent to prcesses that have the port opened, and have asked for SIGIO to be delivered. We were clearing out guest_connected before calling send_sigio_to_port(), resulting in a sigio not getting sent to processes. Fix by setting guest_connected to false after invoking the sigio function. CC: Signed-off-by: Amit Shah Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 4e684faee10b..e4845f1c9a0b 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1551,12 +1551,14 @@ static void unplug_port(struct port *port) spin_unlock_irq(&port->portdev->ports_lock); if (port->guest_connected) { + /* Let the app know the port is going down. */ + send_sigio_to_port(port); + + /* Do this after sigio is actually sent */ port->guest_connected = false; port->host_connected = false; - wake_up_interruptible(&port->waitqueue); - /* Let the app know the port is going down. */ - send_sigio_to_port(port); + wake_up_interruptible(&port->waitqueue); } if (is_console_port(port)) { -- cgit v1.2.3 From 96f97a83910cdb9d89d127c5ee523f8fc040a804 Mon Sep 17 00:00:00 2001 From: Amit Shah Date: Mon, 29 Jul 2013 14:23:21 +0930 Subject: virtio: console: return -ENODEV on all read operations after unplug If a port gets unplugged while a user is blocked on read(), -ENODEV is returned. However, subsequent read()s returned 0, indicating there's no host-side connection (but not indicating the device went away). This also happened when a port was unplugged and the user didn't have any blocking operation pending. If the user didn't monitor the SIGIO signal, they won't have a chance to find out if the port went away. Fix by returning -ENODEV on all read()s after the port gets unplugged. write() already behaves this way. CC: Signed-off-by: Amit Shah Signed-off-by: Rusty Russell --- drivers/char/virtio_console.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) (limited to 'drivers/char') diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index e4845f1c9a0b..fc45567ad3ac 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -749,6 +749,10 @@ static ssize_t port_fops_read(struct file *filp, char __user *ubuf, port = filp->private_data; + /* Port is hot-unplugged. */ + if (!port->guest_connected) + return -ENODEV; + if (!port_has_data(port)) { /* * If nothing's connected on the host just return 0 in @@ -765,7 +769,7 @@ static ssize_t port_fops_read(struct file *filp, char __user *ubuf, if (ret < 0) return ret; } - /* Port got hot-unplugged. */ + /* Port got hot-unplugged while we were waiting above. */ if (!port->guest_connected) return -ENODEV; /* -- cgit v1.2.3