summaryrefslogtreecommitdiffstats
path: root/fs/namespace.c (unfollow)
Commit message (Collapse)AuthorFilesLines
2018-06-04MAINTAINERS: Add Andreas Gruenbacher as a maintainer for gfs2Bob Peterson1-1/+1
Add Andreas Gruenbacher as a maintainer for the gfs2 file system and remove Steve Whitehouse. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Iomap cleanups and improvementsAndreas Gruenbacher4-95/+126
Clean up gfs2_iomap_alloc and gfs2_iomap_get. Document how gfs2_iomap_alloc works: it now needs to be called separately after gfs2_iomap_get where necessary; this will be used later by iomap write. Move gfs2_iomap_ops into bmap.c. Introduce a new gfs2_iomap_get_alloc helper and use it in fallocate_chunk: gfs2_iomap_begin will become unsuitable for fallocate with proper iomap write support. In gfs2_block_map and fallocate_chunk, zero-initialize struct iomap. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Remove ordered write mode handling from gfs2_trans_add_dataAndreas Gruenbacher5-28/+30
In journaled data mode, we need to add each buffer head to the current transaction. In ordered write mode, we only need to add the inode to the ordered inode list. So far, both cases are handled in gfs2_trans_add_data. This makes the code look misleading and is inefficient for small block sizes as well. Handle both cases separately instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: gfs2_stuffed_write_end cleanupAndreas Gruenbacher1-31/+18
First, change the sanity check in gfs2_stuffed_write_end to check for the actual write size instead of the requested write size. Second, use the existing teardown code in gfs2_write_end instead of duplicating it in gfs2_stuffed_write_end. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: hole_size improvementAndreas Gruenbacher1-57/+153
Reimplement function hole_size based on a generic function for walking the metadata tree and rename hole_size to gfs2_hole_size. While previously, multiple invocations of hole_size were sometimes needed to walk across the entire hole, the new implementation always returns the entire hole at once (provided that the caller is interested in the total size). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04GFS2: gfs2_free_extlen can return an extent that is too longBob Peterson2-1/+2
Function gfs2_free_extlen calculates the length of an extent of free blocks that may be reserved. The end pointer was calculated as end = start + bh->b_size but b_size is incorrect because the bitmap usually stops prior to the end of the buffer data on the last bitmap. What this means is that when you do a write, you can reserve a chunk of blocks that runs off the end of the last bitmap. For example, I've got a file system where there is only one bitmap for each rgrp, so ri_length==1. I saw cases in which iozone tried to do a big write, grabbed a large block reservation, chose rgrp 5464152, which has ri_data0 5464153 and ri_data 8188. So 5464153 + 8188 = 5472341 which is the end of the rgrp. When it grabbed a reservation it got back: 5470936, length 7229. But 5470936 + 7229 = 5478165. So the reservation starts inside the rgrp but runs 5824 blocks past the end of the bitmap. This patch fixes the calculation so it won't exceed the last bitmap. It also adds a BUG_ON to guard against overflows in the future. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04GFS2: Fix allocation error bug with recursive rgrp glockingAndreas Gruenbacher1-5/+8
Before this patch function gfs2_write_begin, upon discovering an error, called gfs2_trim_blocks while the rgrp glock was still held. That's because gfs2_inplace_release is not called until later. This patch reorganizes the logic a bit so gfs2_inplace_release is called to release the lock prior to the call to gfs2_trim_blocks, thus preventing the glock recursion. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Update find_metapath commentAndreas Gruenbacher1-3/+2
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-03Linux 4.17v4.17Linus Torvalds1-1/+1
2018-06-03Revert "fs: fold open_check_o_direct into do_dentry_open"Al Viro3-19/+33
This reverts commit cab64df194667dc5d9d786f0a895f647f5501c0d. Having vfs_open() in some cases drop the reference to struct file combined with error = vfs_open(path, f, cred); if (error) { put_filp(f); return ERR_PTR(error); } return f; is flat-out wrong. It used to be error = vfs_open(path, f, cred); if (!error) { /* from now on we need fput() to dispose of f */ error = open_check_o_direct(f); if (error) { fput(f); f = ERR_PTR(error); } } else { put_filp(f); f = ERR_PTR(error); } and sure, having that open_check_o_direct() boilerplate gotten rid of is nice, but not that way... Worse, another call chain (via finish_open()) is FUBAR now wrt FILE_OPENED handling - in that case we get error returned, with file already hit by fput() *AND* FILE_OPENED not set. Guess what happens in path_openat(), when it hits if (!(opened & FILE_OPENED)) { BUG_ON(!error); put_filp(file); } The root cause of all that crap is that the callers of do_dentry_open() have no way to tell which way did it fail; while that could be fixed up (by passing something like int *opened to do_dentry_open() and have it marked if we'd called ->open()), it's probably much too late in the cycle to do so right now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-03blk-mq: update nr_requests when switching to 'none' schedulerMing Lei1-0/+1
Now we setup q->nr_requests when switching to one new scheduler, but not do it for 'none', then q->nr_requests may not be correct for 'none'. This patch fixes this issue by always updating 'nr_requests' when switching to 'none'. Cc: Marco Patalano <mpatalan@redhat.com> Cc: "Ewan D. Milne" <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-03block: don't use blocking queue entered for recursive bio submitsJens Axboe3-1/+15
If we end up splitting a bio and the queue goes away between the initial submission and the later split submission, then we can block forever in blk_queue_enter() waiting for the reference to drop to zero. This will never happen, since we already hold a reference. Mark a split bio as already having entered the queue, so we can just use the live non-blocking queue enter variant. Thanks to Tetsuo Handa for the analysis. Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-03dm-crypt: fix warning in shutdown pathKent Overstreet1-4/+3
The counter for the number of allocated pages includes pages in the mempool's reserve, so checking that the number of allocated pages is 0 needs to happen after we exit the mempool. Fixes: 6f1c819c219f ("dm: convert to bioset_init()/mempool_init()") Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Reported-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Mike Snitzer <snitzer@redhat.com> Fixed to always just use percpu_counter_sum() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-03CIFS: Add support for direct pages in wdataLong Li3-4/+17
Add a function to allocate wdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer to write to. wdata is reponsible for free those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-03CIFS: Use offset when reading pagesLong Li5-19/+45
With offset defined in rdata, transport functions need to look at this offset when reading data into the correct places in pages. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-03CIFS: Add support for direct pages in rdataLong Li2-4/+21
Add a function to allocate rdata without allocating pages for data transfer. This gives the caller an option to pass a number of pages that point to the data buffer. rdata is still reponsible for free those pages after it's done. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-03cifs: update multiplex loop to handle compounded responsesRonnie Sahlberg4-5/+39
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2018-06-02mm: fix the NULL mapping case in __isolate_lru_page()Hugh Dickins1-1/+1
George Boole would have noticed a slight error in 4.16 commit 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page"). Fix it, to match both the comment above it, and the original behaviour. Although anonymous pages are not marked PageDirty at first, we have an old habit of calling SetPageDirty when a page is removed from swap cache: so there's a category of ex-swap pages that are easily migratable, but were inadvertently excluded from compaction's async migration in 4.16. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils Fixes: 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Ivan Kalvachev <ikalvachev@gmail.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-02mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()Hugh Dickins1-1/+1
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very eager, but I'm not sure that is relevant) soon hung uninterruptibly, waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most often when "cp -a" was trying to write to a smallish file. Debug showed that the page in question was not locked, and page->mapping NULL by now, but page->index consistent with having been in a huge page before. Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added in; but took hours to reproduce on a 4.17 kernel (no idea why). The culprit proved to be the __ClearPageDirty() on tails beyond i_size in __split_huge_page(): the non-atomic __bitoperation may have been safe when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()") introduced it, but liable to erase PageWaiters after 4.10's 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit"). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-02Revert "vfio/type1: Improve memory pinning process for raw PFN mapping"Alex Williamson1-15/+10
Bisection by Amadeusz Sławiński implicates this commit leading to bad page state issues after VM shutdown, likely due to unbalanced page references. The original commit was intended only as a performance improvement, therefore revert for offline rework. Link: https://lkml.org/lkml/2018/6/2/97 Fixes: 356e88ebe447 ("vfio/type1: Improve memory pinning process for raw PFN mapping") Cc: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Reported-by: Amadeusz Sławiński <amade@asmblr.net> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-06-02bpf: fix uapi hole for 32 bit compat applicationsDaniel Borkmann2-0/+4
In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the case of struct bpf_map_info but also struct bpf_prog_info. In net-next commit b85fab0e67b ("bpf: Add gpl_compatible flag to struct bpf_prog_info") added a bitfield into it to expose some flags related to programs. Thus, add an unnamed __u32 bitfield for both so that alignment keeps the same in both 32 and 64 bit cases, and can be naturally extended from there as in b85fab0e67b. Before: # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ __u64 netns_dev; /* 44 8 */ __u64 netns_ino; /* 52 8 */ /* size: 64, cachelines: 1, members: 10 */ /* padding: 4 */ }; After (same as on 64 bit): # file test.o test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped # pahole test.o struct bpf_map_info { __u32 type; /* 0 4 */ __u32 id; /* 4 4 */ __u32 key_size; /* 8 4 */ __u32 value_size; /* 12 4 */ __u32 max_entries; /* 16 4 */ __u32 map_flags; /* 20 4 */ char name[16]; /* 24 16 */ __u32 ifindex; /* 40 4 */ /* XXX 4 bytes hole, try to pack */ __u64 netns_dev; /* 48 8 */ __u64 netns_ino; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ /* size: 64, cachelines: 1, members: 10 */ /* sum members: 60, holes: 1, sum holes: 4 */ }; Reported-by: Dmitry V. Levin <ldv@altlinux.org> Reported-by: Eugene Syromiatnikov <esyr@redhat.com> Fixes: 52775b33bb507 ("bpf: offload: report device information about offloaded maps") Fixes: 675fc275a3a2d ("bpf: offload: report device information for offloaded programs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-01perf tools intel-pt-decoder: Update insn.h from the kernel sourcesArnaldo Carvalho de Melo1-0/+18
To pick up the changes in: ee6a7354a362 ("kprobes/x86: Prohibit probing on exception masking instructions") That doesn't entail changes in tooling, but silences this perf build warning: Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-o3wfwjnyh7r8l0gi9q3y9f44@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01tools headers: Sync x86 cpufeatures.h with the kernel sourcesArnaldo Carvalho de Melo1-6/+14
To pick up changes found in these csets: 11fb0683493b x86/speculation: Add virtualized speculative store bypass disable support d1035d971829 x86/cpufeatures: Add FEATURE_ZEN 52817587e706 x86/cpufeatures: Disentangle SSBD enumeration 7eb8956a7fec x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS e7c587da1252 x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP 9f65fb29374e x86/bugs: Rename _RDS to _SSBD 764f3c21588a x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested 24f7fc83b920 x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation 0cc5fa00b0a8 x86/cpufeatures: Add X86_FEATURE_RDS c456442cd3a5 x86/bugs: Expose /sys/../spec_store_bypass The usage of this file in tools doesn't use the newly added X86_FEATURE_ defines: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o LD /tmp/build/perf/bench/perf-in.o LD /tmp/build/perf/perf-in.o Silencing this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mrwyauyov8c7s048abg26khg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01tools headers: Synchronize prctl.h ABI headerArnaldo Carvalho de Melo1-0/+12
To pick up changes from: $ git log --oneline -2 -i include/uapi/linux/prctl.h 356e4bfff2c5 prctl: Add force disable speculation b617cfc85816 prctl: Add speculation control prctls $ tools/perf/trace/beauty/prctl_option.sh > before.c $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after.c $ diff -u before.c after.c --- before.c 2018-06-01 10:39:53.834073962 -0300 +++ after.c 2018-06-01 10:42:11.307985394 -0300 @@ -35,6 +35,8 @@ [42] = "GET_THP_DISABLE", [45] = "SET_FP_MODE", [46] = "GET_FP_MODE", + [52] = "GET_SPECULATION_CTRL", + [53] = "SET_SPECULATION_CTRL", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", $ This will be used by 'perf trace' to show these strings when beautifying the prctl syscall args. At some point we'll be able to say something like: 'perf trace --all-cpus -e prctl(option=*SPEC*)' To filter by arg by name. This silences this warning when building tools/perf: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zztsptwhc264r8wg44tqh5gp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01perf trace beauty prctl: Default header_dir to cwd to work without parmsArnaldo Carvalho de Melo1-1/+1
Useful when checking the effects of header synchs for the files it uses as a input to generate string tables, in retrospect this is how it should've been done from day 1, not requiring the header_dir to be set on the Makefile, will change everything later, so that the only parm, common to all generators will be $(srctree) and $(beauty_outdir). So, to see what it generates, just call it without any parameters: $ tools/perf/trace/beauty/prctl_option.sh static const char *prctl_options[] = { [1] = "SET_PDEATHSIG", [2] = "GET_PDEATHSIG", [3] = "GET_DUMPABLE", [4] = "SET_DUMPABLE", [5] = "GET_UNALIGN", [6] = "SET_UNALIGN", [7] = "GET_KEEPCAPS", [8] = "SET_KEEPCAPS", [9] = "GET_FPEMU", [10] = "SET_FPEMU", [11] = "GET_FPEXC", [12] = "SET_FPEXC", [13] = "GET_TIMING", [14] = "SET_TIMING", [15] = "SET_NAME", [16] = "GET_NAME", [19] = "GET_ENDIAN", [20] = "SET_ENDIAN", [21] = "GET_SECCOMP", [22] = "SET_SECCOMP", [25] = "GET_TSC", [26] = "SET_TSC", [27] = "GET_SECUREBITS", [28] = "SET_SECUREBITS", [29] = "SET_TIMERSLACK", [30] = "GET_TIMERSLACK", [35] = "SET_MM", [36] = "SET_CHILD_SUBREAPER", [37] = "GET_CHILD_SUBREAPER", [38] = "SET_NO_NEW_PRIVS", [39] = "GET_NO_NEW_PRIVS", [40] = "GET_TID_ADDRESS", [41] = "SET_THP_DISABLE", [42] = "GET_THP_DISABLE", [45] = "SET_FP_MODE", [46] = "GET_FP_MODE", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", [2] = "END_CODE", [3] = "START_DATA", [4] = "END_DATA", [5] = "START_STACK", [6] = "START_BRK", [7] = "BRK", [8] = "ARG_START", [9] = "ARG_END", [10] = "ENV_START", [11] = "ENV_END", [12] = "AUXV", [13] = "EXE_FILE", [14] = "MAP", [15] = "MAP_SIZE", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-qtotspuztydjttxi7k6mec6h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01net: usb: cdc_mbim: add flag FLAG_SEND_ZLPDaniele Palmas1-1/+1
Testing Telit LM940 with ICMP packets > 14552 bytes revealed that the modem needs FLAG_SEND_ZLP to properly work, otherwise the cdc mbim data interface won't be anymore responsive. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01ip6_tunnel: remove magic mtu value 0xFFF8Nicolas Dichtel2-5/+11
I don't know where this value comes from (probably a copy and paste and paste and paste ...). Let's use standard values which are a bit greater. Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01ip_tunnel: restore binding to ifaces with a large mtuNicolas Dichtel1-4/+4
After commit f6cc9c054e77, the following conf is broken (note that the default loopback mtu is 65536, ie IP_MAX_MTU + 1): $ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev lo add tunnel "gre0" failed: Invalid argument $ ip l a type dummy $ ip l s dummy1 up $ ip l s dummy1 mtu 65535 $ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev dummy1 add tunnel "gre0" failed: Invalid argument dev_set_mtu() doesn't allow to set a mtu which is too large. First, let's cap the mtu returned by ip_tunnel_bind_dev(). Second, remove the magic value 0xFFF8 and use IP_MAX_MTU instead. 0xFFF8 seems to be there for ages, I don't know why this value was used. With a recent kernel, it's also possible to set a mtu > IP_MAX_MTU: $ ip l s dummy1 mtu 66000 After that patch, it's also possible to bind an ip tunnel on that kind of interface. CC: Petr Machata <petrm@mellanox.com> CC: Ido Schimmel <idosch@mellanox.com> Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a Fixes: f6cc9c054e77 ("ip_tunnel: Emit events for post-register MTU changes") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01hwmon: (asus_atk0110) Make use of device managed memoryBastian Germann1-36/+8
Use devm_* variants of kstrdup and kzalloc. Get rid of kfree cleanups. Signed-off-by: Bastian Germann <bastiangermann@fishpost.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01hwmon: (asus_atk0110) Replace deprecated device register callBastian Germann1-50/+25
Make the asus_atk0110 driver use hwmon_device_register_with_groups instead of the deprecated hwmon_device_register. Construct the expected attribute_group array from the sensor list which contains all needed attributes. Remove the manual sysfs file creation and deletion that are now taken care of by the (un)register calls via the attribute_group array. Signed-off-by: Bastian Germann <bastiangermann@fishpost.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01hwmon: (k10temp) Make function get_raw_temp staticColin Ian King1-1/+1
The function get_raw_temp is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/hwmon/k10temp.c:149:14: warning: symbol 'get_raw_temp' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01net: dsa: b53: Add BCM5389 supportDamien Thébault4-1/+19
This patch adds support for the BCM5389 switch connected through MDIO. Signed-off-by: Damien Thébault <damien.thebault@vitec.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01lightnvm: pblk: take bitmap alloc. out of critical sectionJavier González1-41/+56
pblk allocates line bitmaps within the line lock unnecessarily. In order to take pressure out of the fast patch, allocate line bitmaps outside of this lock and refactor accordingly. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: kick writer on new flush pointsHans Holmberg4-7/+10
Unless we kick the writer directly when setting a new flush point, the user risks having to wait for up to one second (the default timeout for the write thread to be kicked) for the IO to complete. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: only try to recover lines with written smetaHans Holmberg1-9/+21
When switching between different lun configurations, there is no guarantee that all lines that contain closed/open chunks have some valid data to recover. Check that the smeta chunk has been written to instead. Also skip bad lines (that does not have enough good chunks). Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: remove unnecessary bio_get/putJavier González1-37/+28
In the read path, pblk gets a reference to the incoming bio and puts it after ending the bio. Though this behavior is correct, it is unnecessary since pblk is the one putting the bio, therefore, it cannot disappear underneath it. Removing this reference, allows to clean up rqd->bio and avoids pointer bouncing for the different read paths. Now, the incoming bio always resides in the read context and pblk's internal bios (if any) reside in rqd->bio. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: add possibility to set write buffer size manuallyMarcin Dziegielewski1-2/+12
In some cases, users can want set write buffer size manually, e.g. to adjust it to specific workload. This patch provides the possibility to set write buffer size via module parameter feature. Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: fix partial read error pathIgor Konopko1-4/+4
When error occurs during bio_add_page on partial read path, pblk tries to free pages twice. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: proper error handling for pblk_bio_add_pagesIgor Konopko1-2/+4
Currently in case of error caused by bio_pc_add_page in pblk_bio_add_pages two issues occur when calling from pblk_rb_read_to_bio(). First one is in pblk_bio_free_pages, since we are trying to free pages not allocated from our mempool. Second one is the warn from dma_pool_free, that we are trying to free NULL pointer dma. This commit fix both issues. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: fix smeta write error pathHans Holmberg1-3/+4
Smeta write errors were previously ignored. Skip these lines instead and throw them back on the free list, so the chunks will go through a reset cycle before we attempt to use the line again. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: garbage collect lines with failed writesHans Holmberg7-64/+200
Write failures should not happen under normal circumstances, so in order to bring the chunk back into a known state as soon as possible, evacuate all the valid data out of the line and let the fw judge if the block can be written to in the next reset cycle. Do this by introducing a new gc list for lines with failed writes, and ensure that the rate limiter allocates a small portion of the write bandwidth to get the job done. The lba list is saved in memory for use during gc as we cannot gurantee that the emeta data is readable if a write error occurred. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: rework write error recovery pathHans Holmberg5-229/+181
The write error recovery path is incomplete, so rework the write error recovery handling to do resubmits directly from the write buffer. When a write error occurs, the remaining sectors in the chunk are mapped out and invalidated and the request inserted in a resubmit list. The writer thread checks if there are any requests to resubmit, scans and invalidates any lbas that have been overwritten by later writes and resubmits the failed entries. Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01kcm: Fix use-after-free caused by clonned socketsKirill Tkhai1-1/+1
(resend for properly queueing in patchwork) kcm_clone() creates kernel socket, which does not take net counter. Thus, the net may die before the socket is completely destructed, i.e. kcm_exit_net() is executed before kcm_done(). Reported-by: syzbot+5f1a04e374a635efc426@syzkaller.appspotmail.com Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01cifs: remove header_preamble_size where it is always 0Ronnie Sahlberg3-68/+48
Since header_preamble_size is 0 for SMB2+ we can remove it in those code paths that are only invoked from SMB2. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-01cifs: remove struct smb2_hdrRonnie Sahlberg7-55/+41
struct smb2_hdr is now just a wrapper for smb2_sync_hdr. We can thus get rid of smb2_hdr completely and access the sync header directly. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-01CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session ↵Mark Syms1-2/+3
expiry, status STATUS_NETWORK_SESSION_EXPIRED, however the server can also respond with STATUS_USER_SESSION_DELETED in cases where the session has been idle for some time and the server reaps the session to recover resources. Handle this additional status in the same way as SESSION_EXPIRED. Signed-off-by: Mark Syms <mark.syms@citrix.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2018-06-01lightnvm: pblk: remove dead functionJavier González2-8/+0
Remove dead function for manual sync. I/O Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pass flag on graceful teardown to targetsJavier González6-18/+35
If the namespace is unregistered before the LightNVM target is removed (e.g., on hot unplug) it is too late for the target to store any metadata on the device - any attempt to write to the device will fail. In this case, pass on a "gracefull teardown" flag to the target to let it know when this happens. In the case of pblk, we pad the open line (close all open chunks) to improve data retention. In the event of an ungraceful shutdown, avoid this part and just clean up. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: check for chunk size before allocating itJavier González1-3/+3
Do the check for the chunk state after making sure that the chunk type is supported. Fixes: 32ef9412c114 ("lightnvm: pblk: implement get log report chunk") Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01lightnvm: pblk: remove unnecessary argumentJavier González3-5/+5
Remove unnecessary argument on pblk_line_free() Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>