linux - linux

	Commit message (Collapse)	Author	Files	Lines
2016-07-29	documentation: da9052: Update regulator bindings names to match DA9052/53 ↵	Steve Twiss	1	-11/+11
	DTS expectations Buck and LDO binding name changes. The binding names for the regulators have been changed to match the current expectation from existing device tree source files. This fix rectifies the disparity between what currently exists in some .dts[i] board files and what is listed in this binding document. This change re-aligns those differences and also brings the binding document in-line with the expectations of the product datasheet from Dialog Semiconductor. Bucks and LDOs now follow the expected notation: { buck1, buck2, buck3, buck4 } { ldo1, ldo2, ldo3, ldo4, ldo5, ldo6, ldo7, ldo8, ldo9, ldo10 } Signed-off-by: Steve Twiss <stwiss.opensource@diasemi.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-07-29	Revert "vfs: add lookup_hash() helper"	Linus Torvalds	2	-30/+5
	This reverts commit 3c9fe8cdff1b889a059a30d22f130372f2b3885f. As Miklos points out in commit c1b2cc1a765a, the "lookup_hash()" helper is now unused, and in fact, with the hash salting changes, since the hash of a dentry name now depends on the directory dentry it is in, the helper function isn't even really likely to be useful. So rather than keep it around in case somebody else might end up finding a use for it, let's just remove the helper and not trick people into thinking it might be a useful thing. For example, I had obviously completely missed how the helper didn't follow the normal dentry hashing patterns, and how the hash salting patch broke overlayfs. Things would quietly build and look sane, but not work. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-29	sparc64: Trim page tables for 8M hugepages	Nitin Gupta	6	-68/+129
	For PMD aligned (8M) hugepages, we currently allocate all four page table levels which is wasteful. We now allocate till PMD level only which saves memory usage from page tables. Also, when freeing page table for 8M hugepage backed region, make sure we don't try to access non-existent PTE level. Orabug: 22630259 Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-29	objtool: Un-capitalize "Warning" for out-of-sync instruction decoder	Josh Poimboeuf	1	-1/+1
	Change "Warning" to "warning" to make it look more like a GCC warning. Hopefully that will be enough to help the 0-day bot or other automated tools catch this warning earlier before it ends up in Linus's tree. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b1669f391a5db91040427fd9f8e1e79db18f9709.1469751119.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-29	objtool: Resync x86 instruction decoder with the kernel's	Josh Poimboeuf	5	-101/+220
	This fixes the following warning: Warning: objtool: x86 instruction decoder differs from kernel Unfortunately we have three identical copies of the x86 instruction decoder in the kernel tree that have to be manually kept in sync. It's on my TODO list to at least library-ize the ones in the tools subdir so we'd only have two of them instead of three. In the meantime, here's another manual sync. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c61f4d5ebaf0 ("perf tools: Add AVX-512 support to the instruction decoder used by Intel PT") Link: http://lkml.kernel.org/r/d7f74b4d91fed25b0be33cd5c86f5131fa1a7529.1469751119.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-29	objtool: Support new GCC 6 switch jump table pattern	Josh Poimboeuf	1	-52/+88
	This fixes some false positive objtool warnings seen with gcc 6.1.1: kernel/trace/ring_buffer.o: warning: objtool: ring_buffer_read_page()+0x36c: sibling call from callable instruction with changed frame pointer arch/x86/kernel/reboot.o: warning: objtool: native_machine_emergency_restart()+0x139: sibling call from callable instruction with changed frame pointer lib/xz/xz_dec_stream.o: warning: objtool: xz_dec_run()+0xc2: sibling call from callable instruction with changed frame pointer With GCC 6, a new code pattern is sometimes used to access a switch statement jump table in .rodata, which objtool doesn't yet recognize: mov [rodata addr],%reg1 ... some instructions ... jmpq *(%reg1,%reg2,8) Add support for detecting that pattern. The detection code is rather crude, but it's still effective at weeding out false positives and catching real warnings. It can be refined later once objtool starts reading DWARF CFI. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b8c9503b4ad8c8a827cc5400db4c1b40a3ea07bc.1469751119.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-29	fuse: use filemap_check_errors()	Miklos Szeredi	1	-12/+2
	Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	mm: export filemap_check_errors() to modules	Miklos Szeredi	2	-1/+3
	Can be used by fuse, btrfs and f2fs to replace opencoded variants. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	fuse: fix wrong assignment of ->flags in fuse_send_init()	Wei Fang	1	-1/+1
	FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo. Signed-off-by: Wei Fang <fangwei1@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 69fe05c90ed5 ("fuse: add missing INIT flags") Cc: <stable@vger.kernel.org>
2016-07-29	fuse: fuse_flush must check mapping->flags for errors	Maxim Patlasov	1	-0/+9
	fuse_flush() calls write_inode_now() that triggers writeback, but actual writeback will happen later, on fuse_sync_writes(). If an error happens, fuse_writepage_end() will set error bit in mapping->flags. So, we have to check mapping->flags after fuse_sync_writes(). Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Cc: <stable@vger.kernel.org> # v3.15+
2016-07-29	fuse: fsync() did not return IO errors	Alexey Kuznetsov	1	-0/+15
	Due to implementation of fuse writeback filemap_write_and_wait_range() does not catch errors. We have to do this directly after fuse_sync_writes() Signed-off-by: Alexey Kuznetsov <kuznet@virtuozzo.com> Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Cc: <stable@vger.kernel.org> # v3.15+
2016-07-29	x86/power/64: Fix hibernation return address corruption	Josh Poimboeuf	1	-3/+1
	In kernel bug 150021, a kernel panic was reported when restoring a hibernate image. Only a picture of the oops was reported, so I can't paste the whole thing here. But here are the most interesting parts: kernel tried to execute NX-protected page - exploit attempt? (uid: 0) BUG: unable to handle kernel paging request at ffff8804615cfd78 ... RIP: ffff8804615cfd78 RSP: ffff8804615f0000 RBP: ffff8804615cfdc0 ... Call Trace: do_signal+0x23 exit_to_usermode_loop+0x64 ... The RIP is on the same page as RBP, so it apparently started executing on the stack. The bug was bisected to commit ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S), which in retrospect seems quite dangerous, since that code saves and restores the stack pointer from a global variable ('saved_context'). There are a lot of moving parts in the hibernate save and restore paths, so I don't know exactly what caused the panic. Presumably, a FRAME_END was executed without the corresponding FRAME_BEGIN, or vice versa. That would corrupt the return address on the stack and would be consistent with the details of the above panic. [ rjw: One major problem is that by the time the FRAME_BEGIN in restore_registers() is executed, the stack pointer value may not be valid any more. Namely, the stack area pointed to by it previously may have been overwritten by some image memory contents and that page frame may now be used for whatever different purpose it had been allocated for before hibernation. In that case, the FRAME_BEGIN will corrupt that memory. ] Instead of doing the frame pointer save/restore around the bounds of the affected functions, just do it around the call to swsusp_save(). That has the same effect of ensuring that if swsusp_save() sleeps, the frame pointers will be correct. It's also a much more obviously safe way to do it than the original patch. And objtool still doesn't report any warnings. Fixes: ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S) Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021 Cc: 4.6+ <stable@vger.kernel.org> # 4.6+ Reported-by: Andre Reinke <andre.reinke@mailbox.org> Tested-by: Andre Reinke <andre.reinke@mailbox.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-29	ovl: simplify empty checking	Miklos Szeredi	1	-29/+21
	The empty checking logic is duplicated in ovl_check_empty_and_clear() and ovl_remove_and_whiteout(), except the condition for clearing whiteouts is different: ovl_check_empty_and_clear() checked for being upper ovl_remove_and_whiteout() checked for merge OR lower Move the intersection of those checks (upper AND merge) into ovl_check_empty_and_clear() and simplify ovl_remove_and_whiteout(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	qstr: constify instances in overlayfs	Al Viro	1	-1/+1
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: clear nlink on rmdir	Miklos Szeredi	1	-2/+6
	To make delete notification work on fa/inotify. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: disallow overlayfs as upperdir	Miklos Szeredi	1	-1/+2
	This does not work and does not make sense. So instead of fixing it (probably not hard) just disallow. Reported-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-07-29	ovl: fix warning	Miklos Szeredi	1	-1/+1
	There's a superfluous newline in the warning message in ovl_d_real(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: remove duplicated include from super.c	Wei Yongjun	1	-1/+0
	Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: append MAY_READ when diluting write checks	Vivek Goyal	1	-1/+4
	Right now we remove MAY_WRITE/MAY_APPEND bits from mask if realfile is on lower/. This is done as files on lower will never be written and will be copied up. But to copy up a file, mounter should have MAY_READ permission otherwise copy up will fail. So set MAY_READ in mask when MAY_WRITE is reset. Dan Walsh noticed this when he did access(lowerfile, W_OK) and it returned True (context mounts) but when he tried to actually write to file, it failed as mounter did not have permission on lower file. [SzM] don't set MAY_READ if only MAY_APPEND is set without MAY_WRITE; this won't trigger a copy-up. Reported-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: dilute permission checks on lower only if not special file	Vivek Goyal	1	-1/+1
	Right now if file is on lower/, we remove MAY_WRITE/MAY_APPEND bits from mask as lower/ will never be written and file will be copied up. But this is not true for special files. These files are not copied up and are opened in place. So don't dilute the checks for these types of files. Reported-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: fix POSIX ACL setting	Miklos Szeredi	4	-11/+104
	Setting POSIX ACL needs special handling: 1) Some permission checks are done by ->setxattr() which now uses mounter's creds ("ovl: do operations on underlying file system in mounter's context"). These permission checks need to be done with current cred as well. 2) Setting ACL can fail for various reasons. We do not need to copy up in these cases. In the mean time switch to using generic_setxattr. [Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr() doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is disabled, and I could not come up with an obvious way to do it. This instead avoids the link error by defining two sets of ACL operations and letting the compiler drop one of the two at compile time depending on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code, also leading to smaller code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: share inode for hard link	Miklos Szeredi	4	-48/+100
	Inode attributes are copied up to overlay inode (uid, gid, mode, atime, mtime, ctime) so generic code using these fields works correcty. If a hard link is created in overlayfs separate inodes are allocated for each link. If chmod/chown/etc. is performed on one of the links then the inode belonging to the other ones won't be updated. This patch attempts to fix this by sharing inodes for hard links. Use inode hash (with real inode pointer as a key) to make sure overlay inodes are shared for hard links on upper. Hard links on lower are still split (which is not user observable until the copy-up happens, see Documentation/filesystems/overlayfs.txt under "Non-standard behavior"). The inode is only inserted in the hash if it is non-directoy and upper. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: store real inode pointer in ->i_private	Miklos Szeredi	5	-37/+40
	To get from overlay inode to real inode we currently use 'struct ovl_entry', which has lifetime connected to overlay dentry. This is okay, since each overlay dentry had a new overlay inode allocated. Following patch will break that assumption, so need to leave out ovl_entry. This patch stores the real inode directly in i_private, with the lowest bit used to indicate whether the inode is upper or lower. Lifetime rules remain, using ovl_inode_real() must only be done while caller holds ref on overlay dentry (and hence on real dentry), or within RCU protected regions. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: permission: return ECHILD instead of ENOENT	Miklos Szeredi	1	-1/+1
	The error is due to RCU and is temporary. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: update atime on upper	Miklos Szeredi	4	-5/+37
	Fix atime update logic in overlayfs. This patch adds an i_op->update_time() handler to overlayfs inodes. This forwards atime updates to the upper layer only. No atime updates are done on lower layers. Remove implicit atime updates to underlying files and directories with O_NOATIME. Remove explicit atime update in ovl_readlink(). Clear atime related mnt flags from cloned upper mount. This means atime updates are controlled purely by overlayfs mount options. Reported-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: fix sgid on directory	Miklos Szeredi	1	-4/+27
	When creating directory in workdir, the group/sgid inheritance from the parent dir was omitted completely. Fix this by calling inode_init_owner() on overlay inode and using the resulting uid/gid/mode to create the file. Unfortunately the sgid bit can be stripped off due to umask, so need to reset the mode in this case in workdir before moving the directory in place. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: simplify permission checking	Miklos Szeredi	3	-53/+1
	The fact that we always do permission checking on the overlay inode and clear MAY_WRITE for checking access to the lower inode allows cruft to be removed from ovl_permission(). 1) "default_permissions" option effectively did generic_permission() on the overlay inode with i_mode, i_uid and i_gid updated from underlying filesystem. This is what we do by default now. It did the update using vfs_getattr() but that's only needed if the underlying filesystem can change (which is not allowed). We may later introduce a "paranoia_mode" that verifies that mode/uid/gid are not changed. 2) splitting out the IS_RDONLY() check from inode_permission() also becomes unnecessary once we remove the MAY_WRITE from the lower inode check. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: do not require mounter to have MAY_WRITE on lower	Vivek Goyal	1	-0/+2
	Now we have two levels of checks in ovl_permission(). overlay inode is checked with the creds of task while underlying inode is checked with the creds of mounter. Looks like mounter does not have to have WRITE access to files on lower/. So remove the MAY_WRITE from access mask for checks on underlying lower inode. This means task should still have the MAY_WRITE permission on lower inode and mounter is not required to have MAY_WRITE. It also solves the problem of read only NFS mounts being used as lower. If __inode_permission(lower_inode, MAY_WRITE) is called on read only NFS, it fails. By resetting MAY_WRITE, check succeeds and case of read only NFS shold work with overlay without having to specify any special mount options (default permission). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: do operations on underlying file system in mounter's context	Vivek Goyal	2	-37/+72
	Given we are now doing checks both on overlay inode as well underlying inode, we should be able to do checks and operations on underlying file system using mounter's context. So modify all operations to do checks/operations on underlying dentry/inode in the context of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: modify ovl_permission() to do checks on two inodes	Vivek Goyal	1	-4/+14
	Right now ovl_permission() calls __inode_permission(realinode), to do permission checks on real inode and no checks are done on overlay inode. Modify it to do checks both on overlay inode as well as underlying inode. Checks on overlay inode will be done with the creds of calling task while checks on underlying inode will be done with the creds of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: define ->get_acl() for overlay inodes	Vivek Goyal	4	-0/+28
	Now we are planning to do DAC permission checks on overlay inode itself. And to make it work, we will need to make sure we can get acls from underlying inode. So define ->get_acl() for overlay inodes and this in turn calls into underlying filesystem to get acls, if any. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: move some common code in a function	Vivek Goyal	1	-8/+12
	ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some common code which can be moved into a separate function. No functionality change. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: store ovl_entry in inode->i_private for all inodes	Andreas Gruenbacher	1	-37/+11
	Previously this was only done for directory inodes. Doing so for all inodes makes for a nice cleanup in ovl_permission at zero cost. Inodes are not shared for hard links on the overlay, so this works fine. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: use generic_delete_inode	Miklos Szeredi	1	-0/+1
	No point in keeping overlay inodes around since they will never be reused. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29	ovl: check mounter creds on underlying lookup	Miklos Szeredi	1	-4/+9
	The hash salting changes meant that we can no longer reuse the hash in the overlay dentry to look up the underlying dentry. Instead of lookup_hash(), use lookup_one_len_unlocked() and swith to mounter's creds (like we do for all other operations later in the series). Now the lookup_hash() export introduced in 4.6 by 3c9fe8cdff1b ("vfs: add lookup_hash() helper") is unused and can possibly be removed; its usefulness negated by the hash salting and the idea that mounter's creds should be used on operations on underlying filesystems. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 8387ff2577eb ("vfs: make the string hashes salt the hash")
2016-07-29	avr32: off by one in at32_init_pio()	Dan Carpenter	1	-1/+1
	The pio_dev[] array has MAX_NR_PIO_DEVICES elements so the > should be >=. Fixes: 5f97f7f9400d ('[PATCH] avr32 architecture') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2016-07-29	avr32: fixup code style in unistd.h and syscall_table.S	Hans-Christian Noren Egtvedt	2	-654/+654
	This patch swaps the mix of tabs and space for alignment of comment after code to use spaces only. Also document why recvmmsg was defined twice in the syscall_table.S table, but only once in unistd.h. In short, wired in the table by generic arch patch, but forgotten in unistd.h (review slip).
2016-07-29	avr32: wire up preadv2 and pwritev2 syscalls	Hans-Christian Noren Egtvedt	3	-0/+22
	This patch wires up the new preadv2 and pwritev2 syscall on AVR32. On AVR32, all parameters beyond the 5th are passed on the stack. System calls don't use the stack -- they borrow a callee-saved register instead. This means that syscalls that take 6 parameters must be called through a stub that pushes the last parameter on the stack. Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
2016-07-29	mmc: rtsx_pci: Remove deprecated create_singlethread_workqueue	Bhaktipriya Shridhar	1	-10/+2
	The workqueue "workq" provides support for sd/mmc async request, which makes next request do dma_map_sg() while previous request transferring data. The workqueue has a single workitem(&host->work) and hence doesn't require ordering. Also, it is not being used on a memory reclaim path. Hence, the singlethreaded workqueue has been replaced with the use of system_wq. System workqueues have been able to handle high level of concurrency for a long time now and hence it's not required to have a singlethreaded workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue created with create_singlethread_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantee unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. Work item has been flushed in rtsx_pci_sdmmc_drv_remove() to ensure that there are no pending tasks while disconnecting the driver. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29	mmc: rtsx_pci: Enable MMC_CAP_ERASE to allow erase/discard/trim requests	Ulf Hansson	1	-1/+1
	Cc: Micky Ching <micky_ching@realsil.com.cn> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Mauro Santos <registo.mailling@gmail.com>
2016-07-29	mmc: rtsx_pci: Use the provided busy timeout from the mmc core	Ulf Hansson	1	-1/+1
	The rtsx_pci driver is using a fixed 3s timeout for R1B responses, which in some cases isn't suffient. For example, erase/discard requests may require longer timeouts. Instead of always using a fixed timeout, let's use the per request calculated busy timeout from the mmc core. Cc: Micky Ching <micky_ching@realsil.com.cn> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Mauro Santos <registo.mailling@gmail.com>
2016-07-29	mmc: sdhci-pltfm: Drop define for SDHCI_PLTFM_PMOPS	Ulf Hansson	9	-9/+8
	Due to previous changes this define has no longer a purpose. Instead move the sdhci-pltfm drivers over to use the exported struct sdhci_pltfm_pmops. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29	mmc: sdhci-pltfm: Convert to use the SET_SYSTEM_SLEEP_PM_OPS	Ulf Hansson	2	-8/+3
	Move the system PM callbacks within #ifdef CONFIG_PM_SLEEP as to avoid them being build when not used. This also allows us to use the SET_SYSTEM_SLEEP_PM_OPS macro which simplifies the code. Within this context it also makes sense to move the declaration of the struct sdhci_pltfm_pmops, outside the #ifdef CONFIG_PM as the SET_SYSTEM_SLEEP_PM_OPS deals with this. This further simplifies the code. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29	mmc: sdhci-pltfm: Make sdhci_pltfm_suspend\|resume() static	Ulf Hansson	2	-6/+2
	There are no users left of these exported APIs, so let's make them static. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29	mmc: sdhci-esdhc-imx: Use common sdhci_suspend\|resume_host()	Ulf Hansson	1	-2/+4
	To prepare to make the sdhci_pltfm_suspend\|resume() static functions, move sdhci-esdhc-imx over to use the sdhci_suspend\|resume_host(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
2016-07-29	mmc: sdhci-esdhc-imx: Assign system PM ops within #ifdef CONFIG_PM_SLEEP	Ulf Hansson	1	-1/+3
	The system PM callbacks isn't used unless CONFIG_PM_SLEEP is set, thus it triggers a compiler warning about unused functions. Avoid this by changing from CONFIG_PM to CONFIG_PM_SLEEP. Reported-by: Arnd Bergmann <arnd@arndb.de> Fixes: b70d0b3b5b29 ("mmc: sdhci-esdhc-imx: add esdhc specific suspend resume callback") Cc: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
2016-07-29	sparc64 mm: Fix base TSB sizing when hugetlb pages are used	Mike Kravetz	6	-15/+19
	do_sparc64_fault() calculates both the base and huge page RSS sizes and uses this information in calls to tsb_grow(). The calculation for base page TSB size is not correct if the task uses hugetlb pages. hugetlb pages are not accounted for in RSS, therefore the call to get_mm_rss(mm) does not include hugetlb pages. However, the number of pages based on huge_pte_count (which does include hugetlb pages) is subtracted from this value. This will result in an artificially small and often negative RSS calculation. The base TSB size is then often set to max_tsb_size as the passed RSS is unsigned, so a negative value looks really big. THP pages are also accounted for in huge_pte_count, and THP pages are accounted for in RSS so the calculation in do_sparc64_fault() is correct if a task only uses THP pages. A single huge_pte_count is not sufficient for TSB sizing if both hugetlb and THP pages can be used. Instead of a single counter, use two: one for hugetlb and one for THP. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-29	mm, compaction: simplify contended compaction handling	Vlastimil Babka	4	-101/+17
	Async compaction detects contention either due to failing trylock on zone->lock or lru_lock, or by need_resched(). Since 1f9efdef4f3f ("mm, compaction: khugepaged should not give up due to need_resched()") the code got quite complicated to distinguish these two up to the __alloc_pages_slowpath() level, so different decisions could be taken for khugepaged allocations. After the recent changes, khugepaged allocations don't check for contended compaction anymore, so we again don't need to distinguish lock and sched contention, and simplify the current convoluted code a lot. However, I believe it's also possible to simplify even more and completely remove the check for contended compaction after the initial async compaction for costly orders, which was originally aimed at THP page fault allocations. There are several reasons why this can be done now: - with the new defaults, THP page faults no longer do reclaim/compaction at all, unless the system admin has overridden the default, or application has indicated via madvise that it can benefit from THP's. In both cases, it means that the potential extra latency is expected and worth the benefits. - even if reclaim/compaction proceeds after this patch where it previously wouldn't, the second compaction attempt is still async and will detect the contention and back off, if the contention persists - there are still heuristics like deferred compaction and pageblock skip bits in place that prevent excessive THP page fault latencies Link: http://lkml.kernel.org/r/20160721073614.24395-9-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-29	mm, compaction: introduce direct compaction priority	Vlastimil Babka	4	-35/+40
	In the context of direct compaction, for some types of allocations we would like the compaction to either succeed or definitely fail while trying as hard as possible. Current async/sync_light migration mode is insufficient, as there are heuristics such as caching scanner positions, marking pageblocks as unsuitable or deferring compaction for a zone. At least the final compaction attempt should be able to override these heuristics. To communicate how hard compaction should try, we replace migration mode with a new enum compact_priority and change the relevant function signatures. In compact_zone_order() where struct compact_control is constructed, the priority is mapped to suitable control flags. This patch itself has no functional change, as the current priority levels are mapped back to the same migration modes as before. Expanding them will be done next. Note that !CONFIG_COMPACTION variant of try_to_compact_pages() is removed, as the only caller exists under CONFIG_COMPACTION. Link: http://lkml.kernel.org/r/20160721073614.24395-8-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-29	mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations	Vlastimil Babka	7	-25/+30
	After the previous patch, we can distinguish costly allocations that should be really lightweight, such as THP page faults, with __GFP_NORETRY. This means we don't need to recognize khugepaged allocations via PF_KTHREAD anymore. We can also change THP page faults in areas where madvise(MADV_HUGEPAGE) was used to try as hard as khugepaged, as the process has indicated that it benefits from THP's and is willing to pay some initial latency costs. We can also make the flags handling less cryptic by distinguishing GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from GFP_TRANSHUGE (only direct reclaim, khugepaged default). Adding __GFP_NORETRY or __GFP_KSWAPD_RECLAIM is done where needed. The patch effectively changes the current GFP_TRANSHUGE users as follows: * get_huge_zero_page() - the zero page lifetime should be relatively long and it's shared by multiple users, so it's worth spending some effort on it. We use GFP_TRANSHUGE, and __GFP_NORETRY is not added. This also restores direct reclaim to this allocation, which was unintentionally removed by commit e4a49efe4e7e ("mm: thp: set THP defrag by default to madvise and add a stall-free defrag option") * alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency is not an issue. So if khugepaged "defrag" is enabled (the default), do reclaim via GFP_TRANSHUGE without __GFP_NORETRY. We can remove the PF_KTHREAD check from page alloc. As a side-effect, khugepaged will now no longer check if the initial compaction was deferred or contended. This is OK, as khugepaged sleep times between collapsion attempts are long enough to prevent noticeable disruption, so we should allow it to spend some effort. * migrate_misplaced_transhuge_page() - already was masking out __GFP_RECLAIM, so just convert to GFP_TRANSHUGE_LIGHT which is equivalent. * alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise) are now allocating without __GFP_NORETRY. Other vma's keep using __GFP_NORETRY if direct reclaim/compaction is at all allowed (by default it's allowed only for madvised vma's). The rest is conversion to GFP_TRANSHUGE(_LIGHT). [mhocko@suse.com: suggested GFP_TRANSHUGE_LIGHT] Link: http://lkml.kernel.org/r/20160721073614.24395-7-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>