summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* fbdev: fix divide error in fb_var_to_videomodeShile Zhang2019-04-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To fix following divide-by-zero error found by Syzkaller: divide error: 0000 [#1] SMP PTI CPU: 7 PID: 8447 Comm: test Kdump: loaded Not tainted 4.19.24-8.al7.x86_64 #1 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 RIP: 0010:fb_var_to_videomode+0xae/0xc0 Code: 04 44 03 46 78 03 4e 7c 44 03 46 68 03 4e 70 89 ce d1 ee 69 c0 e8 03 00 00 f6 c2 01 0f 45 ce 83 e2 02 8d 34 09 0f 45 ce 31 d2 <41> f7 f0 31 d2 f7 f1 89 47 08 f3 c3 66 0f 1f 44 00 00 0f 1f 44 00 RSP: 0018:ffffb7e189347bf0 EFLAGS: 00010246 RAX: 00000000e1692410 RBX: ffffb7e189347d60 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb7e189347c10 RBP: ffff99972a091c00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000100 R13: 0000000000010000 R14: 00007ffd66baf6d0 R15: 0000000000000000 FS: 00007f2054d11740(0000) GS:ffff99972fbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f205481fd20 CR3: 00000004288a0001 CR4: 00000000001606a0 Call Trace: fb_set_var+0x257/0x390 ? lookup_fast+0xbb/0x2b0 ? fb_open+0xc0/0x140 ? chrdev_open+0xa6/0x1a0 do_fb_ioctl+0x445/0x5a0 do_vfs_ioctl+0x92/0x5f0 ? __alloc_fd+0x3d/0x160 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x5b/0x190 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f20548258d7 Code: 44 00 00 48 8b 05 b9 15 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 15 2d 00 f7 d8 64 89 01 48 It can be triggered easily with following test code: #include <linux/fb.h> #include <fcntl.h> #include <sys/ioctl.h> int main(void) { struct fb_var_screeninfo var = {.activate = 0x100, .pixclock = 60}; int fd = open("/dev/fb0", O_RDWR); if (fd < 0) return 1; if (ioctl(fd, FBIOPUT_VSCREENINFO, &var)) return 1; return 0; } Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Fredrik Noring <noring@nocrew.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled displayYifeng Li2019-04-012-16/+44
| | | | | | | | | | | | | | | | | | | Loongson MIPS netbooks use 1024x600 LCD panels, which is the original target platform of this driver, but nearly all old x86 laptops have 1024x768. Lighting 768 panels using 600's timings would partially garble the display. Since it's not possible to distinguish them reliably, we change the default to 768, but keep 600 as-is on MIPS. Further, earlier laptops, such as IBM Thinkpad 240X, has a 800x600 LCD panel, this driver would probably garbled those display. As we don't have one for testing, the original behavior of the driver is kept as-is, but the problem has been documented is the comments. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix support for 1024x768-16 modeYifeng Li2019-04-011-0/+59
| | | | | | | | | | | | | | | In order to support the 1024x600 panel on Yeeloong Loongson MIPS laptop, the original 1024x768-16 table was modified to 1024x600-16, without leaving the original. It causes problem on x86 laptop as the 1024x768-16 support was still claimed but not working. Fix it by introducing the 1024x768-16 mode. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix crashes and garbled display during DPMS modesettingYifeng Li2019-04-011-26/+38
| | | | | | | | | | | | | | | | | | | | On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), blanking the display or starting the X server will crash and freeze the system, or garble the display. Experiments showed this problem can mostly be solved by adjusting the order of register writes. Also, sm712fb failed to consider the difference of clock frequency when unblanking the display, and programs the clock for SM712 to SM720. Fix them by adjusting the order of register writes, and adding an additional check for SM720 for programming the clock frequency. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping VRAMYifeng Li2019-04-012-9/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), running fbtest or X will crash the machine instantly, because the VRAM/framebuffer is not mapped correctly. On SM712, the framebuffer starts at the beginning of address space, but SM720's framebuffer starts at the 1 MiB offset from the beginning. However, sm712fb fails to take this into account, as a result, writing to the framebuffer will destroy all the registers and kill the system immediately. Another problem is the driver assumes 8 MiB of VRAM for SM720, but some SM720 system, such as this IBM Thinkpad, only has 4 MiB of VRAM. Fix this problem by removing the hardcoded VRAM size, adding a function to query the amount of VRAM from register MCR76 on SM720, and adding proper framebuffer offset. Please note that the memory map may have additional problems on Big-Endian system, which is not available for testing by myself. But I highly suspect that the original code is also broken on Big-Endian machines for SM720, so at least we are not making the problem worse. More, the driver also assumed SM710/SM712 has 4 MiB of VRAM, but it has a 2 MiB version as well, and used in earlier laptops, such as IBM Thinkpad 240X, the driver would probably crash on them. I've never seen one of those machines and cannot fix it, but I have documented these problems in the comments. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGAYifeng Li2019-04-011-1/+5
| | | | | | | | | | | | | | When the machine is booted in VGA mode, loading sm712fb would cause a glitch of random pixels shown on the screen. To prevent it from happening, we first clear the entire framebuffer, and we also need to stop calling smtcfb_setmode() during initialization, the fbdev layer will call it for us later when it's ready. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75Yifeng Li2019-04-011-1/+3
| | | | | | | | | | | | | | | | | | | On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), the amount of Video RAM is not detected correctly by the xf86-video-siliconmotion driver. This is because sm712fb overwrites the GPR71 Scratch Pad Register, which is set by BIOS on x86 and used to indicate amount of VRAM. Other Scratch Pad Registers, including GPR70/74/75, don't have the same side-effect, but overwriting to them is still questionable, as they are not related to modesetting. Stop writing to SR70/71/74/75 (a.k.a GPR70/71/74/75). Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix brightness control on reboot, don't set SR30Yifeng Li2019-04-011-2/+2
| | | | | | | | | | | | | | | | | | | | | On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), rebooting with sm712fb framebuffer driver would cause the role of brightness up/down button to swap. Experiments showed the FPR30 register caused this behavior. Moreover, even if this register don't have side-effect on other systems, over- writing it is also highly questionable, since it was originally configurated by the motherboard manufacturer by hardwiring pull-down resistors to indicate the type of LCD panel. We should not mess with it. Stop writing to the SR30 (a.k.a FPR30) register. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3FYifeng Li2019-04-011-1/+5
| | | | | | | | | | | | | | | | | | | | | | | On a Thinkpad s30 (Pentium III / i440MX, Lynx3DM), rebooting with sm712fb framebuffer driver would cause a white screen of death on the next POST, presumably the proper timings for the LCD panel was not reprogrammed properly by the BIOS. Experiments showed a few CRTC Scratch Registers, including CRT3D, CRT3E and CRT3F may be used internally by BIOS as some flags. CRT3B is a hardware testing register, we shouldn't mess with it. CRT3C has blanking signal and line compare control, which is not needed for this driver. Stop writing to CR3B-CR3F (a.k.a CRT3B-CRT3F) registers. Even if these registers don't have side-effect on other systems, writing to them is also highly questionable. Signed-off-by: Yifeng Li <tomli@tomli.me> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Teddy Wang <teddy.wang@siliconmotion.com> Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video: imsttfb: fix potential NULL pointer dereferencesKangjie Lu2019-04-011-0/+5
| | | | | | | | | | | | | In case ioremap fails, the fix releases resources and returns -ENOMEM to avoid NULL pointer dereferences. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Cc: Aditya Pakki <pakki001@umn.edu> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Rob Herring <robh@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [b.zolnierkie: minor patch summary fixup] Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video: hgafb: fix potential NULL pointer dereferenceKangjie Lu2019-04-011-0/+2
| | | | | | | | | | | When ioremap fails, hga_vram should not be dereferenced. The fix check the failure to avoid NULL pointer dereference. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Cc: Aditya Pakki <pakki001@umn.edu> Cc: Ferenc Bakonyi <fero@drama.obuda.kando.hu> [b.zolnierkie: minor patch summary fixup] Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: list all pci memory bars as conflicting aperturesGerd Hoffmann2019-04-011-4/+25
| | | | | | | | | | | | | Simply add all pci memory bars to struct apertures_struct in remove_conflicting_pci_framebuffers(), without depending on the res_id parameter. The plan is to drop the res_id parameter later on. For now keep the parameter, use it for sanity-checking and warn on inconsistencies. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* drivers: video: fbdev: Kconfig: pedantic cleanupsEnrico Weigelt, metux IT consult2019-04-016-190/+188
| | | | | | | | | Formatting of Kconfig files doesn't look so pretty, so let the Great White Handkerchief come around and clean it up. Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net> [b.zolnierkie: add missing patch description] Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* omapfb: Fix potential NULL pointer dereference in kmallocAditya Pakki2019-04-011-0/+2
| | | | | | | | | | Memory allocated, using kmalloc, for new_compat may fail. This patch checks for such an error and prevents potential NULL pointer dereference. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Cc: Kangjie Lu <kjlu@umn.edu> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* xen, fbfront: mark expected switch fall-throughGustavo A. R. Silva2019-04-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/video/fbdev/xen-fbfront.c: In function ‘xenfb_backend_changed’: drivers/video/fbdev/xen-fbfront.c:678:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (dev->state == XenbusStateClosed) ^ drivers/video/fbdev/xen-fbfront.c:681:2: note: here case XenbusStateClosing: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* udlfb: introduce a rendering mutexMikulas Patocka2019-04-012-11/+31
| | | | | | | | | | | | | | | | | Rendering calls may be done simultaneously from the workqueue, dlfb_ops_write, dlfb_ops_ioctl, dlfb_ops_set_par and dlfb_dpy_deferred_io. The code is robust enough so that it won't crash on concurrent rendering. However, concurrent rendering may cause display corruption if the same pixel is simultaneously being rendered. In order to avoid this corruption, this patch adds a mutex around the rendering calls. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Bernie Thompson <bernie@plugable.com> Cc: Ladislav Michl <ladis@linux-mips.org> Cc: <stable@vger.kernel.org> [b.zolnierkie: replace "dlfb:" with "uldfb:" in the patch summary] Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* udlfb: fix sleeping inside spinlockMikulas Patocka2019-04-012-3/+59
| | | | | | | | | | | | | | | | | | | If a framebuffer device is used as a console, the rendering calls (copyarea, fillrect, imageblit) may be done with the console spinlock held. On udlfb, these function call dlfb_handle_damage that takes a blocking semaphore before acquiring an URB. In order to fix the bug, this patch changes the calls copyarea, fillrect and imageblit to offload USB work to a workqueue. A side effect of this patch is 3x improvement in console scrolling speed because the device doesn't have to be updated after each copyarea call. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Bernie Thompson <bernie@plugable.com> Cc: Ladislav Michl <ladis@linux-mips.org> Cc: <stable@vger.kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* udlfb: delete the unused parameter for dlfb_handle_damageMikulas Patocka2019-04-011-12/+9
| | | | | | | | | | Remove the unused parameter "data" and unused variable "ret". Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Bernie Thompson <bernie@plugable.com> Cc: Ladislav Michl <ladis@linux-mips.org> Cc: <stable@vger.kernel.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video: fbdev: atmel_lcdfb: drop AVR and platform_data supportAlexandre Belloni2019-04-012-113/+7
| | | | | | | | | | Make the driver OF only as since AVR32 has been removed from the kernel, there are only OF enabled platform using it. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* omapfb: add missing of_node_put after of_device_is_availableJulia Lawall2019-04-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an of_node_put when a tested device node is not available. The semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ identifier f; local idexpression e; expression x; @@ e = f(...); ... when != of_node_put(e) when != x = e when != e = x when any if (<+...of_device_is_available(e)...+>) { ... when != of_node_put(e) ( return e; | + of_node_put(e); return ...; ) } // </smpl> Fixes: f76ee892a99e6 ("omapfb: copy omapdss & displays for omapfb") Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video/macfb: Always initialize DAFB colour table pointer registerFinn Thain2019-04-011-1/+1
| | | | | | | | | Don't skip the framebuffer CLUT pointer register initialization when the first dafb_setpalette() invocation has regno equal to zero. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video/macfb: Call fb_invert_cmaps()Finn Thain2019-04-011-2/+1
| | | | | | | | The 'inverse' parameter has no effect otherwise. Remove set-but-unused variable. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: atafb: Modernize printing of kernel messagesGeert Uytterhoeven2019-04-011-12/+13
| | | | | | | | | | | Now the driver has been converted to a platform driver, the legacy printk() calls without any log level can be replaced by proper dev_*() calls. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: atafb: Fix broken frame buffer after kexecGeert Uytterhoeven2019-04-011-4/+30
| | | | | | | | | | | | | | | | | | | | | On Falcon, Atari frame buffer initialization relies on preprogrammed register values. These register values are changed to blank the console. Hence when using kexec to boot into a new kernel while the console is blanked, the new kernel cannot determine the current video configuration, and the Atari frame buffer driver fails to initialize: atafb: phys_screen_base 6ce000 screen_len 4096 Fix this by doing a straight-forward conversion of the driver to a platform device driver, and adding a shutdown handler that unblanks the display. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: atafb: Remove obsolete module supportGeert Uytterhoeven2019-04-015-107/+1
| | | | | | | | | | | | | CONFIG_FB_ATARI is bool, hence the Atari frame buffer driver cannot be built as a module. In addition, the module support code refers to a function atafb_deinit(), which never existed. Replace module_init() by device_initcall(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: atafb: Stop printing virtual screen_baseGeert Uytterhoeven2019-04-011-2/+2
| | | | | | | | | Printing (hashed) virtual addresses is useless. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video/macfb: Remove redundant codeFinn Thain2019-04-011-24/+0
| | | | | | | | | | | | | | | | The value of info->var.bits_per_pixel get checked in macfb_setcolreg(). Remove additional checks as they are redundant. macfb_defined.activate gets initialized to FB_ACTIVATE_NOW by the struct initializer. Remove redundant assignments. macfb_defined.bits_per_pixel, .width and .height all get assigned unconditionally. Remove redundant initializers. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Stan Johnson <userm57@yahoo.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video: fbdev: savage: fix indentation issueColin Ian King2019-04-011-3/+3
| | | | | | | | | The indentation in the if statement is not indented correctly, fix this with extra level of indentation. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* video: fbdev: vesafb: fix indentation issueColin Ian King2019-04-011-2/+2
| | | | | | | | | There are a couple of statements that are indented too deeply, fix this by removing tabs. Also add a space after a comma to clean up a cppcheck warning. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* fbdev: mxsfb: implement FB_PRE_INIT_FB optionMelchior Franz2019-04-012-1/+13
| | | | | | | | | | | | | | | | | The FB_PRE_INIT_FB option keeps the kernel from reinitializing the display and prevents flickering during the transition from a bootloader splash screen to the kernel logo screen. Make this option available for the mxsfb driver. Signed-off-by: Melchior Franz <melchior.franz@ginzinger.com> Signed-off-by: Martin Kepplinger <martin.kepplinger@ginzinger.com> Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
* Merge tag 'v5.1-rc3' of ↵Bartlomiej Zolnierkiewicz2019-04-0111790-260612/+515958
|\ | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into fbdev-for-next Linux 5.1-rc3 Sync with upstream (which now contains fbdev-v5.1 changes) to prepare a base for fbdev-v5.2 changes.
| * Linux 5.1-rc3v5.1-rc3Linus Torvalds2019-03-311-1/+1
| |
| * Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2019-03-3160-201/+409
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Paolo Bonzini: "A collection of x86 and ARM bugfixes, and some improvements to documentation. On top of this, a cleanup of kvm_para.h headers, which were exported by some architectures even though they not support KVM at all. This is responsible for all the Kbuild changes in the diffstat" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION KVM: doc: Document the life cycle of a VM and its resources KVM: selftests: complete IO before migrating guest state KVM: selftests: disable stack protector for all KVM tests KVM: selftests: explicitly disable PIE for tests KVM: selftests: assert on exit reason in CR4/cpuid sync test KVM: x86: update %rip after emulating IO x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init kvm/x86: Move MSR_IA32_ARCH_CAPABILITIES to array emulated_msrs KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts kvm: don't redefine flags as something else kvm: mmu: Used range based flushing in slot_handle_level_range KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported KVM: x86: remove check on nr_mmu_pages in kvm_arch_commit_memory_region() kvm: nVMX: Add a vmentry check for HOST_SYSENTER_ESP and HOST_SYSENTER_EIP fields KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation) KVM: Reject device ioctls from processes other than the VM's creator KVM: doc: Fix incorrect word ordering regarding supported use of APIs KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size' KVM: nVMX: Do not inherit quadrant and invalid for the root shadow EPT ...
| | * Merge tag 'kvmarm-fixes-for-5.1' of ↵Paolo Bonzini2019-03-289-75/+133
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/ARM fixes for 5.1 - Fix THP handling in the presence of pre-existing PTEs - Honor request for PTE mappings even when THPs are available - GICv4 performance improvement - Take the srcu lock when writing to guest-controlled ITS data structures - Reset the virtual PMU in preemptible context - Various cleanups
| | | * KVM: arm/arm64: Comments cleanup in mmu.cZenghui Yu2019-03-281-14/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some comments in virt/kvm/arm/mmu.c are outdated. Update them to reflect the current state of the code. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * KVM: arm/arm64: vgic-its: Make attribute accessors staticYueHaibing2019-03-201-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix sparse warnings: arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1732:5: warning: symbol 'vgic_its_has_attr_regs' was not declared. Should it be static? arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1753:5: warning: symbol 'vgic_its_attr_regs_access' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> [maz: fixed subject] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * KVM: arm/arm64: Fix handling of stage2 huge mappingsSuzuki K Poulose2019-03-202-16/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We rely on the mmu_notifier call backs to handle the split/merge of huge pages and thus we are guaranteed that, while creating a block mapping, either the entire block is unmapped at stage2 or it is missing permission. However, we miss a case where the block mapping is split for dirty logging case and then could later be made block mapping, if we cancel the dirty logging. This not only creates inconsistent TLB entries for the pages in the the block, but also leakes the table pages for PMD level. Handle this corner case for the huge mappings at stage2 by unmapping the non-huge mapping for the block. This could potentially release the upper level table. So we need to restart the table walk once we unmap the range. Fixes : ad361f093c1e31d ("KVM: ARM: Support hugetlbfs backed huge pages") Reported-by: Zheng Xiang <zhengxiang9@huawei.com> Cc: Zheng Xiang <zhengxiang9@huawei.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * KVM: arm/arm64: Enforce PTE mappings at stage2 when neededSuzuki K Poulose2019-03-191-22/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings") made the checks to skip huge mappings, stricter. However it introduced a bug where we still use huge mappings, ignoring the flag to use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE. Also, the checks do not cover the PUD huge pages, that was under review during the same period. This patch fixes both the issues. Fixes : 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslotsMarc Zyngier2019-03-191-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calling kvm_is_visible_gfn() implies that we're parsing the memslots, and doing this without the srcu lock is frown upon: [12704.164532] ============================= [12704.164544] WARNING: suspicious RCU usage [12704.164560] 5.1.0-rc1-00008-g600025238f51-dirty #16 Tainted: G W [12704.164573] ----------------------------- [12704.164589] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [12704.164602] other info that might help us debug this: [12704.164616] rcu_scheduler_active = 2, debug_locks = 1 [12704.164631] 6 locks held by qemu-system-aar/13968: [12704.164644] #0: 000000007ebdae4f (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [12704.164691] #1: 000000007d751022 (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [12704.164726] #2: 00000000219d2706 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164761] #3: 00000000a760aecd (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164794] #4: 000000000ef8e31d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164827] #5: 000000007a872093 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164861] stack backtrace: [12704.164878] CPU: 2 PID: 13968 Comm: qemu-system-aar Tainted: G W 5.1.0-rc1-00008-g600025238f51-dirty #16 [12704.164887] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [12704.164896] Call trace: [12704.164910] dump_backtrace+0x0/0x138 [12704.164920] show_stack+0x24/0x30 [12704.164934] dump_stack+0xbc/0x104 [12704.164946] lockdep_rcu_suspicious+0xcc/0x110 [12704.164958] gfn_to_memslot+0x174/0x190 [12704.164969] kvm_is_visible_gfn+0x28/0x70 [12704.164980] vgic_its_check_id.isra.0+0xec/0x1e8 [12704.164991] vgic_its_save_tables_v0+0x1ac/0x330 [12704.165001] vgic_its_set_attr+0x298/0x3a0 [12704.165012] kvm_device_ioctl_attr+0x9c/0xd8 [12704.165022] kvm_device_ioctl+0x8c/0xf8 [12704.165035] do_vfs_ioctl+0xc8/0x960 [12704.165045] ksys_ioctl+0x8c/0xa0 [12704.165055] __arm64_sys_ioctl+0x28/0x38 [12704.165067] el0_svc_common+0xd8/0x138 [12704.165078] el0_svc_handler+0x38/0x78 [12704.165089] el0_svc+0x8/0xc Make sure the lock is taken when doing this. Fixes: bf308242ab98 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memoryMarc Zyngier2019-03-194-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When halting a guest, QEMU flushes the virtual ITS caches, which amounts to writing to the various tables that the guest has allocated. When doing this, we fail to take the srcu lock, and the kernel shouts loudly if running a lockdep kernel: [ 69.680416] ============================= [ 69.680819] WARNING: suspicious RCU usage [ 69.681526] 5.1.0-rc1-00008-g600025238f51-dirty #18 Not tainted [ 69.682096] ----------------------------- [ 69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [ 69.683225] [ 69.683225] other info that might help us debug this: [ 69.683225] [ 69.683975] [ 69.683975] rcu_scheduler_active = 2, debug_locks = 1 [ 69.684598] 6 locks held by qemu-system-aar/4097: [ 69.685059] #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [ 69.686087] #1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [ 69.686919] #2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.687698] #3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.688475] #4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.689978] #5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.690729] [ 69.690729] stack backtrace: [ 69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty #18 [ 69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [ 69.692831] Call trace: [ 69.694072] lockdep_rcu_suspicious+0xcc/0x110 [ 69.694490] gfn_to_memslot+0x174/0x190 [ 69.694853] kvm_write_guest+0x50/0xb0 [ 69.695209] vgic_its_save_tables_v0+0x248/0x330 [ 69.695639] vgic_its_set_attr+0x298/0x3a0 [ 69.696024] kvm_device_ioctl_attr+0x9c/0xd8 [ 69.696424] kvm_device_ioctl+0x8c/0xf8 [ 69.696788] do_vfs_ioctl+0xc8/0x960 [ 69.697128] ksys_ioctl+0x8c/0xa0 [ 69.697445] __arm64_sys_ioctl+0x28/0x38 [ 69.697817] el0_svc_common+0xd8/0x138 [ 69.698173] el0_svc_handler+0x38/0x78 [ 69.698528] el0_svc+0x8/0xc The fix is to obviously take the srcu lock, just like we do on the read side of things since bf308242ab98. One wonders why this wasn't fixed at the same time, but hey... Fixes: bf308242ab98 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabledMarc Zyngier2019-03-192-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The normal interrupt flow is not to enable the vgic when no virtual interrupt is to be injected (i.e. the LRs are empty). But when a guest is likely to use GICv4 for LPIs, we absolutely need to switch it on at all times. Otherwise, VLPIs only get delivered when there is something in the LRs, which doesn't happen very often. Reported-by: Nianyao Tang <tangnianyao@huawei.com> Tested-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * KVM: arm64: Reset the PMU in preemptible contextMarc Zyngier2019-03-191-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've become very cautious to now always reset the vcpu when nothing is loaded on the physical CPU. To do so, we now disable preemption and do a kvm_arch_vcpu_put() to make sure we have all the state in memory (and that it won't be loaded behind out back). This now causes issues with resetting the PMU, which calls into perf. Perf itself uses mutexes, which clashes with the lack of preemption. It is worth realizing that the PMU is fully emulated, and that no PMU state is ever loaded on the physical CPU. This means we can perfectly reset the PMU outside of the non-preemptible section. Fixes: e761a927bc9a ("KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded") Reported-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGIONPaolo Bonzini2019-03-281-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The documentation does not mention how to delete a slot, add the information. Reported-by: Nathaniel McCallum <npmccallum@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: doc: Document the life cycle of a VM and its resourcesSean Christopherson2019-03-281-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The series to add memcg accounting to KVM allocations[1] states: There are many KVM kernel memory allocations which are tied to the life of the VM process and should be charged to the VM process's cgroup. While it is correct to account KVM kernel allocations to the cgroup of the process that created the VM, it's technically incorrect to state that the KVM kernel memory allocations are tied to the life of the VM process. This is because the VM itself, i.e. struct kvm, is not tied to the life of the process which created it, rather it is tied to the life of its associated file descriptor. In other words, kvm_destroy_vm() is not invoked until fput() decrements its associated file's refcount to zero. A simple example is to fork() in Qemu and have the child sleep indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file descriptor *and* the rogue child is killed. The allocations are guaranteed to be *accounted* to the process which created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the ioctl() with -EIO if kvm->mm != current->mm. I.e. the child can keep the VM "alive" but can't do anything useful with its reference. Note that because 'struct kvm' also holds a reference to the mm_struct of its owner, the above behavior also applies to userspace allocations. Given that mucking with a VM's file descriptor can lead to subtle and undesirable behavior, e.g. memcg charges persisting after a VM is shut down, explicitly document a VM's lifecycle and its impact on the VM's resources. Alternatively, KVM could aggressively free resources when the creating process exits, e.g. via mmu_notifier->release(). However, mmu_notifier isn't guaranteed to be available, and freeing resources when the creator exits is likely to be error prone and fragile as KVM would need to ensure that it only freed resources that are truly out of reach. In practice, the existing behavior shouldn't be problematic as a properly configured system will prevent a child process from being moved out of the appropriate cgroup hierarchy, i.e. prevent hiding the process from the OOM killer, and will prevent an unprivileged user from being able to to hold a reference to struct kvm via another method, e.g. debugfs. [1]https://patchwork.kernel.org/patch/10806707/ Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: selftests: complete IO before migrating guest stateSean Christopherson2019-03-283-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Documentation/virtual/kvm/api.txt states: NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding operations are complete (and guest state is consistent) only after userspace has re-entered the kernel with KVM_RUN. The kernel side will first finish incomplete operations and then check for pending signals. Userspace can re-enter the guest with an unmasked signal pending to complete pending operations. Because guest state may be inconsistent, starting state migration after an IO exit without first completing IO may result in test failures, e.g. a proposed change to KVM's handling of %rip in its fast PIO handling[1] will cause the new VM, i.e. the post-migration VM, to have its %rip set to the IN instruction that triggered KVM_EXIT_IO, leading to a test assertion due to a stage mismatch. For simplicitly, require KVM_CAP_IMMEDIATE_EXIT to complete IO and skip the test if it's not available. The addition of KVM_CAP_IMMEDIATE_EXIT predates the state selftest by more than a year. [1] https://patchwork.kernel.org/patch/10848545/ Fixes: fa3899add1056 ("kvm: selftests: add basic test for state save and restore") Reported-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: selftests: disable stack protector for all KVM testsSean Christopherson2019-03-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 4.8.3, gcc has enabled -fstack-protector by default. This is problematic for the KVM selftests as they do not configure fs or gs segments (the stack canary is pulled from fs:0x28). With the default behavior, gcc will insert a stack canary on any function that creates buffers of 8 bytes or more. As a result, ucall() will hit a triple fault shutdown due to reading a bad fs segment when inserting its stack canary, i.e. every test fails with an unexpected SHUTDOWN. Fixes: 14c47b7530e2d ("kvm: selftests: introduce ucall") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: selftests: explicitly disable PIE for testsSean Christopherson2019-03-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM selftests embed the guest "image" as a function in the test itself and extract the guest code at runtime by manually parsing the elf headers. The parsing is very simple and doesn't supporting fancy things like position independent executables. Recent versions of gcc enable pie by default, which results in triple fault shutdowns in the guest due to the virtual address in the headers not matching up with the virtual address retrieved from the function pointer. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: selftests: assert on exit reason in CR4/cpuid sync testSean Christopherson2019-03-281-16/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...so that the test doesn't end up in an infinite loop if it fails for whatever reason, e.g. SHUTDOWN due to gcc inserting stack canary code into ucall() and attempting to derefence a null segment. Fixes: ca359066889f7 ("kvm: selftests: add cr4_cpuid_sync_test") Cc: Wei Huang <wei@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: x86: update %rip after emulating IOSean Christopherson2019-03-282-10/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most (all?) x86 platforms provide a port IO based reset mechanism, e.g. OUT 92h or CF9h. Userspace may emulate said mechanism, i.e. reset a vCPU in response to KVM_EXIT_IO, without explicitly announcing to KVM that it is doing a reset, e.g. Qemu jams vCPU state and resumes running. To avoid corruping %rip after such a reset, commit 0967b7bf1c22 ("KVM: Skip pio instruction when it is emulated, not executed") changed the behavior of PIO handlers, i.e. today's "fast" PIO handling to skip the instruction prior to exiting to userspace. Full emulation doesn't need such tricks becase re-emulating the instruction will naturally handle %rip being changed to point at the reset vector. Updating %rip prior to executing to userspace has several drawbacks: - Userspace sees the wrong %rip on the exit, e.g. if PIO emulation fails it will likely yell about the wrong address. - Single step exits to userspace for are effectively dropped as KVM_EXIT_DEBUG is overwritten with KVM_EXIT_IO. - Behavior of PIO emulation is different depending on whether it goes down the fast path or the slow path. Rather than skip the PIO instruction before exiting to userspace, snapshot the linear %rip and cancel PIO completion if the current value does not match the snapshot. For a 64-bit vCPU, i.e. the most common scenario, the snapshot and comparison has negligible overhead as VMCS.GUEST_RIP will be cached regardless, i.e. there is no extra VMREAD in this case. All other alternatives to snapshotting the linear %rip that don't rely on an explicit reset announcenment suffer from one corner case or another. For example, canceling PIO completion on any write to %rip fails if userspace does a save/restore of %rip, and attempting to avoid that issue by canceling PIO only if %rip changed then fails if PIO collides with the reset %rip. Attempting to zero in on the exact reset vector won't work for APs, which means adding more hooks such as the vCPU's MP_STATE, and so on and so forth. Checking for a linear %rip match technically suffers from corner cases, e.g. userspace could theoretically rewrite the underlying code page and expect a different instruction to execute, or the guest hardcodes a PIO reset at 0xfffffff0, but those are far, far outside of what can be considered normal operation. Fixes: 432baf60eee3 ("KVM: VMX: use kvm_fast_pio_in for handling IN I/O") Cc: <stable@vger.kernel.org> Reported-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | x86/kvm/hyper-v: avoid spurious pending stimer on vCPU initVitaly Kuznetsov2019-03-281-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When userspace initializes guest vCPUs it may want to zero all supported MSRs including Hyper-V related ones including HV_X64_MSR_STIMERn_CONFIG/ HV_X64_MSR_STIMERn_COUNT. With commit f3b138c5d89a ("kvm/x86: Update SynIC timers on guest entry only") we began doing stimer_mark_pending() unconditionally on every config change. The issue I'm observing manifests itself as following: - Qemu writes 0 to STIMERn_{CONFIG,COUNT} MSRs and marks all stimers as pending in stimer_pending_bitmap, arms KVM_REQ_HV_STIMER; - kvm_hv_has_stimer_pending() starts returning true; - kvm_vcpu_has_events() starts returning true; - kvm_arch_vcpu_runnable() starts returning true; - when kvm_arch_vcpu_ioctl_run() gets into (vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED) case: - kvm_vcpu_block() gets in 'kvm_vcpu_check_block(vcpu) < 0' and returns immediately, avoiding normal wait path; - -EAGAIN is returned from kvm_arch_vcpu_ioctl_run() immediately forcing userspace to retry. So instead of normal wait path we get a busy loop on all secondary vCPUs before they get INIT signal. This seems to be undesirable, especially given that this happens even when Hyper-V extensions are not used. Generally, it seems to be pointless to mark an stimer as pending in stimer_pending_bitmap and arm KVM_REQ_HV_STIMER as the only thing kvm_hv_process_stimers() will do is clear the corresponding bit. We may just not mark disabled timers as pending instead. Fixes: f3b138c5d89a ("kvm/x86: Update SynIC timers on guest entry only") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>