summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* proc_sysctl: fix oops caused by incorrect command parametersXiaoming Ni2021-01-241-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The process_sysctl_arg() does not check whether val is empty before invoking strlen(val). If the command line parameter () is incorrectly configured and val is empty, oops is triggered. For example: "hung_task_panic=1" is incorrectly written as "hung_task_panic", oops is triggered. The call stack is as follows: Kernel command line: .... hung_task_panic ...... Call trace: __pi_strlen+0x10/0x98 parse_args+0x278/0x344 do_sysctl_args+0x8c/0xfc kernel_init+0x5c/0xf4 ret_from_fork+0x10/0x30 To fix it, check whether "val" is empty when "phram" is a sysctl field. Error codes are returned in the failure branch, and error logs are generated by parse_args(). Link: https://lkml.kernel.org/r/20210118133029.28580-1-nixiaoming@huawei.com Fixes: 3db978d480e2843 ("kernel/sysctl: support setting sysctl parameters from kernel command line") Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: <stable@vger.kernel.org> [5.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* powerpc/mm/highmem: use __set_pte_at() for kmap_local()Thomas Gleixner2021-01-241-0/+2
| | | | | | | | | | | | | | | | | | | | The original PowerPC highmem mapping function used __set_pte_at() to denote that the mapping is per CPU. This got lost with the conversion to the generic implementation. Override the default map function. Link: https://lkml.kernel.org/r/20210112170411.281464308@linutronix.de Fixes: 47da42b27a56 ("powerpc/mm/highmem: Switch to generic kmap atomic") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mips/mm/highmem: use set_pte() for kmap_local()Thomas Gleixner2021-01-241-0/+1
| | | | | | | | | | | | | | | | | | | | set_pte_at() on MIPS invokes update_cache() which might recurse into kmap_local(). Use set_pte() like the original MIPS highmem implementation did. Link: https://lkml.kernel.org/r/20210112170411.187513575@linutronix.de Fixes: a4c33e83bca1 ("mips/mm/highmem: Switch to generic kmap atomic") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Paul Cercueil <paul@crapouillou.net> Reported-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/highmem: prepare for overriding set_pte_at()Thomas Gleixner2021-01-241-1/+6
| | | | | | | | | | | | | | | | | | The generic kmap_local() map function uses set_pte_at(), but MIPS requires set_pte() and PowerPC wants __set_pte_at(). Provide arch_kmap_local_set_pte() and default it to set_pte_at(). Link: https://lkml.kernel.org/r/20210112170411.056306194@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sparc/mm/highmem: flush cache and TLBThomas Gleixner2021-01-241-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm/highmem: Fix fallout from generic kmap_local conversions". The kmap_local conversion wreckaged sparc, mips and powerpc as it missed some of the details in the original implementation. This patch (of 4): The recent conversion to the generic kmap_local infrastructure failed to assign the proper pre/post map/unmap flush operations for sparc. Sparc requires cache flush before map/unmap and tlb flush afterwards. Link: https://lkml.kernel.org/r/20210112170136.078559026@linutronix.de Link: https://lkml.kernel.org/r/20210112170410.905976187@linutronix.de Fixes: 3293efa97807 ("sparc/mm/highmem: Switch to generic kmap atomic") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Andreas Larsson <andreas@gaisler.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: fix page reference leak in soft_offline_page()Dan Williams2021-01-241-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | The conversion to move pfn_to_online_page() internal to soft_offline_page() missed that the get_user_pages() reference taken by the madvise() path needs to be dropped when pfn_to_online_page() fails. Note the direct sysfs-path to soft_offline_page() does not perform a get_user_pages() lookup. When soft_offline_page() is handed a pfn_valid() && !pfn_to_online_page() pfn the kernel hangs at dax-device shutdown due to a leaked reference. Link: https://lkml.kernel.org/r/161058501210.1840162.8108917599181157327.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ubsan: disable unsigned-overflow check for i386Arnd Bergmann2021-01-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building ubsan kernels even for compile-testing introduced these warnings in my randconfig environment: crypto/blake2b_generic.c:98:13: error: stack frame size of 9636 bytes in function 'blake2b_compress' [-Werror,-Wframe-larger-than=] static void blake2b_compress(struct blake2b_state *S, crypto/sha512_generic.c:151:13: error: stack frame size of 1292 bytes in function 'sha512_generic_block_fn' [-Werror,-Wframe-larger-than=] static void sha512_generic_block_fn(struct sha512_state *sst, u8 const *src, lib/crypto/curve25519-fiat32.c:312:22: error: stack frame size of 2180 bytes in function 'fe_mul_impl' [-Werror,-Wframe-larger-than=] static noinline void fe_mul_impl(u32 out[10], const u32 in1[10], const u32 in2[10]) lib/crypto/curve25519-fiat32.c:444:22: error: stack frame size of 1588 bytes in function 'fe_sqr_impl' [-Werror,-Wframe-larger-than=] static noinline void fe_sqr_impl(u32 out[10], const u32 in1[10]) Further testing showed that this is caused by -fsanitize=unsigned-integer-overflow, but is isolated to the 32-bit x86 architecture. The one in blake2b immediately overflows the 8KB stack area architectures, so better ensure this never happens by disabling the option for 32-bit x86. Link: https://lkml.kernel.org/r/20210112202922.2454435-1-arnd@kernel.org Link: https://lore.kernel.org/lkml/20201230154749.746641-1-arnd@kernel.org/ Fixes: d0a3ac549f38 ("ubsan: enable for all*config builds") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Marco Elver <elver@google.com> Cc: George Popescu <georgepope@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan, mm: fix resetting page_alloc tags for HW_TAGSAndrey Konovalov2021-01-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A previous commit added resetting KASAN page tags to kernel_init_free_pages() to avoid false-positives due to accesses to metadata with the hardware tag-based mode. That commit did reset page tags before the metadata access, but didn't restore them after. As the result, KASAN fails to detect bad accesses to page_alloc allocations on some configurations. Fix this by recovering the tag after the metadata access. Link: https://lkml.kernel.org/r/02b5bcd692e912c27d484030f666b350ad7e4ae4.1611074450.git.andreyknvl@google.com Fixes: aa1ef4d7b3f6 ("kasan, mm: reset tags when accessing metadata") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan, mm: fix conflicts with init_on_alloc/freeAndrey Konovalov2021-01-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A few places where SLUB accesses object's data or metadata were missed in a previous patch. This leads to false positives with hardware tag-based KASAN when bulk allocations are used with init_on_alloc/free. Fix the false-positives by resetting pointer tags during these accesses. (The kasan_reset_tag call is removed from slab_alloc_node, as it's added into maybe_wipe_obj_freeptr.) Link: https://linux-review.googlesource.com/id/I50dd32838a666e173fe06c3c5c766f2c36aae901 Link: https://lkml.kernel.org/r/093428b5d2ca8b507f4a79f92f9929b35f7fada7.1610731872.git.andreyknvl@google.com Fixes: aa1ef4d7b3f67 ("kasan, mm: reset tags when accessing metadata") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan: fix HW_TAGS boot parametersAndrey Konovalov2021-01-242-66/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initially proposed KASAN command line parameters are redundant. This change drops the complex "kasan.mode=off/prod/full" parameter and adds a simpler kill switch "kasan=off/on" instead. The new parameter together with the already existing ones provides a cleaner way to express the same set of features. The full set of parameters with this change: kasan=off/on - whether KASAN is enabled kasan.fault=report/panic - whether to only print a report or also panic kasan.stacktrace=off/on - whether to collect alloc/free stack traces Default values: kasan=on kasan.fault=report kasan.stacktrace=on (if CONFIG_DEBUG_KERNEL=y) kasan.stacktrace=off (otherwise) Link: https://linux-review.googlesource.com/id/Ib3694ed90b1e8ccac6cf77dfd301847af4aba7b8 Link: https://lkml.kernel.org/r/4e9c4a4bdcadc168317deb2419144582a9be6e61.1610736745.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan: fix incorrect arguments passing in kasan_add_zero_shadowLecopzer Chen2021-01-241-2/+1
| | | | | | | | | | | | | | | | kasan_remove_zero_shadow() shall use original virtual address, start and size, instead of shadow address. Link: https://lkml.kernel.org/r/20210103063847.5963-1-lecopzer@gmail.com Fixes: 0207df4fa1a86 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN") Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kasan: fix unaligned address is unhandled in kasan_remove_zero_shadowLecopzer Chen2021-01-241-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During testing kasan_populate_early_shadow and kasan_remove_zero_shadow, if the shadow start and end address in kasan_remove_zero_shadow() is not aligned to PMD_SIZE, the remain unaligned PTE won't be removed. In the test case for kasan_remove_zero_shadow(): shadow_start: 0xffffffb802000000, shadow end: 0xffffffbfbe000000 3-level page table: PUD_SIZE: 0x40000000 PMD_SIZE: 0x200000 PAGE_SIZE: 4K 0xffffffbf80000000 ~ 0xffffffbfbdf80000 will not be removed because in kasan_remove_pud_table(), kasan_pmd_table(*pud) is true but the next address is 0xffffffbfbdf80000 which is not aligned to PUD_SIZE. In the correct condition, this should fallback to the next level kasan_remove_pmd_table() but the condition flow always continue to skip the unaligned part. Fix by correcting the condition when next and addr are neither aligned. Link: https://lkml.kernel.org/r/20210103135621.83129-1-lecopzer@gmail.com Fixes: 0207df4fa1a86 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN") Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: YJ Chiang <yj.chiang@mediatek.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: fix numa stats for thp migrationShakeel Butt2021-01-241-11/+12
| | | | | | | | | | | | | | | | | | | | | | Currently the kernel is not correctly updating the numa stats for NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY and NR_ZONE_WRITE_PENDING, although at the moment there is no need to handle THP migration as kernel still does not have write support for file THP but to be more future proof, this patch adds the THP support for those stats as well. Link: https://lkml.kernel.org/r/20210108155813.2914586-2-shakeelb@google.com Fixes: e71769ae52609 ("mm: enable thp migration for shmem thp") Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: memcg: fix memcg file_dirty numa statShakeel Butt2021-01-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | The kernel updates the per-node NR_FILE_DIRTY stats on page migration but not the memcg numa stats. That was not an issue until recently the commit 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2") exposed numa stats for the memcg. So fix the file_dirty per-memcg numa stat. Link: https://lkml.kernel.org/r/20210108155813.2914586-1-shakeelb@google.com Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for cgroup v2") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: memcg/slab: optimize objcg stock drainingRoman Gushchin2021-01-241-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Imran Khan reported a 16% regression in hackbench results caused by the commit f2fe7b09a52b ("mm: memcg/slab: charge individual slab objects instead of pages"). The regression is noticeable in the case of a consequent allocation of several relatively large slab objects, e.g. skb's. As soon as the amount of stocked bytes exceeds PAGE_SIZE, drain_obj_stock() and __memcg_kmem_uncharge() are called, and it leads to a number of atomic operations in page_counter_uncharge(). The corresponding call graph is below (provided by Imran Khan): |__alloc_skb | | | |__kmalloc_reserve.isra.61 | | | | | |__kmalloc_node_track_caller | | | | | | | |slab_pre_alloc_hook.constprop.88 | | | obj_cgroup_charge | | | | | | | | | |__memcg_kmem_charge | | | | | | | | | | | |page_counter_try_charge | | | | | | | | | |refill_obj_stock | | | | | | | | | | | |drain_obj_stock.isra.68 | | | | | | | | | | | | | |__memcg_kmem_uncharge | | | | | | | | | | | | | | | |page_counter_uncharge | | | | | | | | | | | | | | | | | |page_counter_cancel | | | | | | | | | | | |__slab_alloc | | | | | | | | | |___slab_alloc | | | | | | | | |slab_post_alloc_hook Instead of directly uncharging the accounted kernel memory, it's possible to refill the generic page-sized per-cpu stock instead. It's a much faster operation, especially on a default hierarchy. As a bonus, __memcg_kmem_uncharge_page() will also get faster, so the freeing of page-sized kernel allocations (e.g. large kmallocs) will become faster. A similar change has been done earlier for the socket memory by the commit 475d0487a2ad ("mm: memcontrol: use per-cpu stocks for socket memory uncharging"). Link: https://lkml.kernel.org/r/20210106042239.2860107-1-guro@fb.com Fixes: f2fe7b09a52b ("mm: memcg/slab: charge individual slab objects instead of pages") Signed-off-by: Roman Gushchin <guro@fb.com> Reported-by: Imran Khan <imran.f.khan@oracle.com> Tested-by: Imran Khan <imran.f.khan@oracle.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Michal Koutn <mkoutny@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: fix initialization of struct page for holes in memory layoutMike Rapoport2021-01-241-34/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There could be struct pages that are not backed by actual physical memory. This can happen when the actual memory bank is not a multiple of SECTION_SIZE or when an architecture does not register memory holes reserved by the firmware as memblock.memory. Such pages are currently initialized using init_unavailable_mem() function that iterates through PFNs in holes in memblock.memory and if there is a struct page corresponding to a PFN, the fields if this page are set to default values and the page is marked as Reserved. init_unavailable_mem() does not take into account zone and node the page belongs to and sets both zone and node links in struct page to zero. On a system that has firmware reserved holes in a zone above ZONE_DMA, for instance in a configuration below: # grep -A1 E820 /proc/iomem 7a17b000-7a216fff : Unknown E820 type 7a217000-7bffffff : System RAM unset zone link in struct page will trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page); because there are pages in both ZONE_DMA32 and ZONE_DMA (unset zone link in struct page) in the same pageblock. Update init_unavailable_mem() to use zone constraints defined by an architecture to properly setup the zone link and use node ID of the adjacent range in memblock.memory to set the node link. Link: https://lkml.kernel.org/r/20210111194017.22696-3-rppt@kernel.org Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Hildenbrand <david@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qian Cai <cai@lca.pw> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86/setup: don't remove E820_TYPE_RAM for pfn 0Mike Rapoport2021-01-241-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: fix initialization of struct page for holes in memory layout", v3. Commit 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") exposed several issues with the memory map initialization and these patches fix those issues. Initially there were crashes during compaction that Qian Cai reported back in April [1]. It seemed back then that the problem was fixed, but a few weeks ago Andrea Arcangeli hit the same bug [2] and there was an additional discussion at [3]. [1] https://lore.kernel.org/lkml/8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw [2] https://lore.kernel.org/lkml/20201121194506.13464-1-aarcange@redhat.com [3] https://lore.kernel.org/mm-commits/20201206005401.qKuAVgOXr%akpm@linux-foundation.org This patch (of 2): The first 4Kb of memory is a BIOS owned area and to avoid its allocation for the kernel it was not listed in e820 tables as memory. As the result, pfn 0 was never recognised by the generic memory management and it is not a part of neither node 0 nor ZONE_DMA. If set_pfnblock_flags_mask() would be ever called for the pageblock corresponding to the first 2Mbytes of memory, having pfn 0 outside of ZONE_DMA would trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page); Along with reserving the first 4Kb in e820 tables, several first pages are reserved with memblock in several places during setup_arch(). These reservations are enough to ensure the kernel does not touch the BIOS area and it is not necessary to remove E820_TYPE_RAM for pfn 0. Remove the update of e820 table that changes the type of pfn 0 and move the comment describing why it was done to trim_low_memory_range() that reserves the beginning of the memory. Link: https://lkml.kernel.org/r/20210111194017.22696-2-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Hildenbrand <david@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qian Cai <cai@lca.pw> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'mtd/fixes' of ↵Linus Torvalds2021-01-235-16/+27
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull mtd fixes from Miquel Raynal. * 'mtd/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: omap: Use BCH private fields in the specific OOB layout mtd: spinand: Fix MTD_OPS_AUTO_OOB requests mtd: rawnand: intel: check the mtd name only after setting the variable mtd: rawnand: nandsim: Fix the logic when selecting Hamming soft ECC engine mtd: rawnand: gpmi: fix dst bit offset when extracting raw payload
| * mtd: rawnand: omap: Use BCH private fields in the specific OOB layoutMiquel Raynal2021-01-201-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The OMAP driver may leverage software BCH logic to locate errors while using its own hardware to detect the presence of errors. This is achieved with a "mixed" mode which initializes manually the software BCH internal logic while providing its own OOB layout. The issue here comes from the fact that the BCH driver has been updated to only use generic NAND objects, and no longer depend on raw NAND structures as it is usable from SPI-NAND as well. However, at the end of the BCH context initialization, the driver checks the validity of the OOB layout. At this stage, the raw NAND fields have not been populated yet while being used by the layout helpers, leading to an invalid layout. The chosen solution here is to include the BCH structure definition and to refer to the BCH fields directly (de-referenced as a const pointer here) to know as early as possible the number of steps and ECC bytes which have been chosen. Note: I don't know which commit exactly triggered the error, but the entire migration to a generic BCH driver got merged in one go, so this should not be a problem for stable backports. Reported-by: Adam Ford <aford173@gmail.com> Fixes: 80fe603160a4 ("mtd: nand: ecc-bch: Stop using raw NAND structures") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit-28.dts Link: https://lore.kernel.org/linux-mtd/20210119155510.5655-1-miquel.raynal@bootlin.com
| * mtd: spinand: Fix MTD_OPS_AUTO_OOB requestsMiquel Raynal2021-01-141-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial change breaking the logic is commit 3d1f08b032dc ("mtd: spinand: Use the external ECC engine logic") It inadvertently dropped proper OOB support while doing something else. Shortly later, half of it got re-integrated by commit 868cbe2a6dce ("mtd: spinand: Fix OOB read") (pointing by the way to a more early change which had nothing to do with the issue). Problem is, this commit failed to revert the faulty change entirely and missed the logic handling MTD_OPS_AUTO_OOB requests. Let's fix this mess by re-inserting the missing part now. Fixes: 868cbe2a6dce ("mtd: spinand: Fix OOB read") Reported-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210107083813.24283-1-miquel.raynal@bootlin.com
| * mtd: rawnand: intel: check the mtd name only after setting the variableMartin Blumenstingl2021-01-141-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Move the check for mtd->name after the mtd variable has actually been initialized. While here, also drop the NULL assignment to the mtd variable as it's overwritten later on anyways and the NULL value is never read. Fixes: 0b1039f016e8a3 ("mtd: rawnand: Add NAND controller support on Intel LGM SoC") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210106140943.98072-1-martin.blumenstingl@googlemail.com
| * mtd: rawnand: nandsim: Fix the logic when selecting Hamming soft ECC engineMiquel Raynal2021-01-141-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | I have been fooled by the logic picking the right ECC engine which is spread across two functions: *init_module() and *_attach(). I thought this driver was not impacted by the recent changes around the ECC engines DT parsing logic but in fact it is. Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210104093057.31178-1-miquel.raynal@bootlin.com
| * mtd: rawnand: gpmi: fix dst bit offset when extracting raw payloadSean Nyekjaer2021-01-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Re-add the multiply by 8 to "step * eccsize" to correct the destination bit offset when extracting the data payload in gpmi_ecc_read_page_raw(). Fixes: e5e5631cc889 ("mtd: rawnand: gpmi: Use nand_extract_bits()") Cc: stable@vger.kernel.org Reported-by: Martin Hundebøll <martin@geanix.com> Signed-off-by: Sean Nyekjaer <sean@geanix.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201221100013.2715675-1-sean@geanix.com
* | Merge branch 'i2c/for-current' of ↵Linus Torvalds2021-01-235-5/+44
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Another bunch of driver fixes" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: sprd: depend on COMMON_CLK to fix compile tests Revert "i2c: imx: Remove unused .id_table support" i2c: octeon: check correct size of maximum RECV_LEN packet i2c: tegra: Create i2c_writesl_vi() to use with VI I2C for filling TX FIFO i2c: bpmp-tegra: Ignore unknown I2C_M flags i2c: tegra: Wait for config load atomically while in ISR
| * | i2c: sprd: depend on COMMON_CLK to fix compile testsKrzysztof Kozlowski2021-01-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The I2C_SPRD uses Common Clock Framework thus it cannot be built on platforms without it (e.g. compile test on MIPS with LANTIQ): /usr/bin/mips-linux-gnu-ld: drivers/i2c/busses/i2c-sprd.o: in function `sprd_i2c_probe': i2c-sprd.c:(.text.sprd_i2c_probe+0x254): undefined reference to `clk_set_parent' Fixes: 4a2d5f663dab ("i2c: Enable compile testing for more drivers") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Baolin Wang <baolin.wang7@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
| * | Revert "i2c: imx: Remove unused .id_table support"Fabio Estevam2021-01-221-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Coldfire platforms are non-DT users of this driver, so keep the .id_table support. This reverts commit c610199cd392e6e2d41811ef83d85355c1b862b3. Fixes: c610199cd392 (i2c: imx: Remove unused .id_table support") Reported-by: Sascha Hauer <sha@pengutronix.de> Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
| * | i2c: octeon: check correct size of maximum RECV_LEN packetWolfram Sang2021-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I2C_SMBUS_BLOCK_MAX defines already the maximum number as defined in the SMBus 2.0 specs. No reason to add one to it. Fixes: 886f6f8337dd ("i2c: octeon: Support I2C_M_RECV_LEN") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Robert Richter <rric@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
| * | i2c: tegra: Create i2c_writesl_vi() to use with VI I2C for filling TX FIFOSowjanya Komatineni2021-01-171-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VI I2C controller has known hardware bug where immediate multiple writes to TX_FIFO register gets stuck. Recommended software work around is to read I2C register after each write to TX_FIFO register to flush out the data. This patch implements this work around for VI I2C controller. Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
| * | i2c: bpmp-tegra: Ignore unknown I2C_M flagsMikko Perttunen2021-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to not to start returning errors when new I2C_M flags are added, change behavior to just ignore all flags that we don't know about. This includes the I2C_M_DMA_SAFE flag that already exists but causes -EINVAL to be returned for valid transactions. Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
| * | i2c: tegra: Wait for config load atomically while in ISRMikko Perttunen2021-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upon a communication error, the interrupt handler can call tegra_i2c_disable_packet_mode. This causes a sleeping poll to happen unless the current transaction was marked atomic. Fix this by making the poll happen atomically if we are in an IRQ. This matches the behavior prior to the patch mentioned in the Fixes tag. Fixes: ede2299f7101 ("i2c: tegra: Support atomic transfers") Cc: stable@vger.kernel.org Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
* | | Merge tag 'scsi-fixes' of ↵Linus Torvalds2021-01-239-42/+90
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Twelve minor fixes, all in drivers or doc. Most of the fixes are pretty obvious (although we had two goes to get the UFS sysfs doc right) and the biggest change is in the ufs driver which they've extensively tested" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ibmvfc: Set default timeout to avoid crash during migration scsi: target: tcmu: Fix use-after-free of se_cmd->priv scsi: fnic: Fix memleak in vnic_dev_init_devcmd2 scsi: libfc: Avoid invoking response handler twice if ep is already completed scsi: scsi_transport_srp: Don't block target in failfast state scsi: docs: ABI: sysfs-driver-ufs: Rectify table formatting scsi: ufs: Fix tm request when non-fatal error happens scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns() scsi: ibmvfc: Fix missing cast of ibmvfc_event pointer to u64 handle scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEM scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regression scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode
| * | | scsi: ibmvfc: Set default timeout to avoid crash during migrationBrian King2021-01-151-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing live partition mobility, we have observed occasional crashes of the Linux partition. What we've seen is that during the live migration, for specific configurations with large amounts of memory, slow network links, and workloads that are changing memory a lot, the partition can end up being suspended for 30 seconds or longer. This resulted in the following scenario: CPU 0 CPU 1 ------------------------------- ---------------------------------- scsi_queue_rq migration_store -> blk_mq_start_request -> rtas_ibm_suspend_me -> blk_add_timer -> on_each_cpu(rtas_percpu_suspend_me _______________________________________V | V -> IPI from CPU 1 -> rtas_percpu_suspend_me -> __rtas_suspend_last_cpu -- Linux partition suspended for > 30 seconds -- -> for_each_online_cpu(cpu) plpar_hcall_norets(H_PROD -> scsi_dispatch_cmd -> scsi_times_out -> scsi_abort_command -> queue_delayed_work -> ibmvfc_queuecommand_lck -> ibmvfc_send_event -> ibmvfc_send_crq - returns H_CLOSED <- returns SCSI_MLQUEUE_HOST_BUSY -> __blk_mq_requeue_request -> scmd_eh_abort_handler -> scsi_try_to_abort_cmd - returns SUCCESS -> scsi_queue_insert Normally, the SCMD_STATE_COMPLETE bit would protect against the command completion and the timeout, but that doesn't work here, since we don't check that at all in the SCSI_MLQUEUE_HOST_BUSY path. In this case we end up calling scsi_queue_insert on a request that has already been queued, or possibly even freed, and we crash. The patch below simply increases the default I/O timeout to avoid this race condition. This is also the timeout value that nearly all IBM SAN storage recommends setting as the default value. Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: target: tcmu: Fix use-after-free of se_cmd->privShin'ichiro Kawasaki2021-01-151-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a35129024e88 ("scsi: target: tcmu: Use priv pointer in se_cmd") modified tcmu_free_cmd() to set NULL to priv pointer in se_cmd. However, se_cmd can be already freed by work queue triggered in target_complete_cmd(). This caused BUG KASAN use-after-free [1]. To fix the bug, do not touch priv pointer in tcmu_free_cmd(). Instead, set NULL to priv pointer before target_complete_cmd() calls. Also, to avoid unnecessary priv pointer change in tcmu_queue_cmd(), modify priv pointer in the function only when tcmu_free_cmd() is not called. [1] BUG: KASAN: use-after-free in tcmu_handle_completions+0x1172/0x1770 [target_core_user] Write of size 8 at addr ffff88814cf79a40 by task cmdproc-uio0/14842 CPU: 2 PID: 14842 Comm: cmdproc-uio0 Not tainted 5.11.0-rc2 #1 Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.2 11/22/2019 Call Trace: dump_stack+0x9a/0xcc ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] print_address_description.constprop.0+0x18/0x130 ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] kasan_report.cold+0x7f/0x10e ? tcmu_handle_completions+0x1172/0x1770 [target_core_user] tcmu_handle_completions+0x1172/0x1770 [target_core_user] ? queue_tmr_ring+0x5d0/0x5d0 [target_core_user] tcmu_irqcontrol+0x28/0x60 [target_core_user] uio_write+0x155/0x230 ? uio_vma_fault+0x460/0x460 ? security_file_permission+0x4f/0x440 vfs_write+0x1ce/0x860 ksys_write+0xe9/0x1b0 ? __ia32_sys_read+0xb0/0xb0 ? syscall_enter_from_user_mode+0x27/0x70 ? trace_hardirqs_on+0x1c/0x110 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fcf8b61905f Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 b9 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 0c fd ff ff 48 RSP: 002b:00007fcf7b3e6c30 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf8b61905f RDX: 0000000000000004 RSI: 00007fcf7b3e6c78 RDI: 000000000000000c RBP: 00007fcf7b3e6c80 R08: 0000000000000000 R09: 00007fcf7b3e6aa8 R10: 000000000b01c000 R11: 0000000000000293 R12: 00007ffe0c32a52e R13: 00007ffe0c32a52f R14: 0000000000000000 R15: 00007fcf7b3e7640 Allocated by task 383: kasan_save_stack+0x1b/0x40 ____kasan_kmalloc.constprop.0+0x84/0xa0 kmem_cache_alloc+0x142/0x330 tcm_loop_queuecommand+0x2a/0x4e0 [tcm_loop] scsi_queue_rq+0x12ec/0x2d20 blk_mq_dispatch_rq_list+0x30a/0x1db0 __blk_mq_do_dispatch_sched+0x326/0x830 __blk_mq_sched_dispatch_requests+0x2c8/0x3f0 blk_mq_sched_dispatch_requests+0xca/0x120 __blk_mq_run_hw_queue+0x93/0xe0 process_one_work+0x7b6/0x1290 worker_thread+0x590/0xf80 kthread+0x362/0x430 ret_from_fork+0x22/0x30 Freed by task 11655: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0xec/0x120 slab_free_freelist_hook+0x53/0x160 kmem_cache_free+0xf4/0x5c0 target_release_cmd_kref+0x3ea/0x9e0 [target_core_mod] transport_generic_free_cmd+0x28b/0x2f0 [target_core_mod] target_complete_ok_work+0x250/0xac0 [target_core_mod] process_one_work+0x7b6/0x1290 worker_thread+0x590/0xf80 kthread+0x362/0x430 ret_from_fork+0x22/0x30 Last potentially related work creation: kasan_save_stack+0x1b/0x40 kasan_record_aux_stack+0xa3/0xb0 insert_work+0x48/0x2e0 __queue_work+0x4e8/0xdf0 queue_work_on+0x78/0x80 tcmu_handle_completions+0xad0/0x1770 [target_core_user] tcmu_irqcontrol+0x28/0x60 [target_core_user] uio_write+0x155/0x230 vfs_write+0x1ce/0x860 ksys_write+0xe9/0x1b0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Second to last potentially related work creation: kasan_save_stack+0x1b/0x40 kasan_record_aux_stack+0xa3/0xb0 insert_work+0x48/0x2e0 __queue_work+0x4e8/0xdf0 queue_work_on+0x78/0x80 tcm_loop_queuecommand+0x1c3/0x4e0 [tcm_loop] scsi_queue_rq+0x12ec/0x2d20 blk_mq_dispatch_rq_list+0x30a/0x1db0 __blk_mq_do_dispatch_sched+0x326/0x830 __blk_mq_sched_dispatch_requests+0x2c8/0x3f0 blk_mq_sched_dispatch_requests+0xca/0x120 __blk_mq_run_hw_queue+0x93/0xe0 process_one_work+0x7b6/0x1290 worker_thread+0x590/0xf80 kthread+0x362/0x430 ret_from_fork+0x22/0x30 The buggy address belongs to the object at ffff88814cf79800 which belongs to the cache tcm_loop_cmd_cache of size 896. Link: https://lore.kernel.org/r/20210113024508.1264992-1-shinichiro.kawasaki@wdc.com Fixes: a35129024e88 ("scsi: target: tcmu: Use priv pointer in se_cmd") Cc: stable@vger.kernel.org # v5.9+ Acked-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: fnic: Fix memleak in vnic_dev_init_devcmd2Dinghao Liu2021-01-131-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ioread32() returns 0xFFFFFFFF, we should execute cleanup functions like other error handling paths before returning. Link: https://lore.kernel.org/r/20201225083520.22015-1-dinghao.liu@zju.edu.cn Acked-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: libfc: Avoid invoking response handler twice if ep is already completedJaved Hasan2021-01-131-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A race condition exists between the response handler getting called because of exchange_mgr_reset() (which clears out all the active XIDs) and the response we get via an interrupt. Sequence of events: rport ba0200: Port timeout, state PLOGI rport ba0200: Port entered PLOGI state from PLOGI state xid 1052: Exchange timer armed : 20000 msecs  xid timer armed here rport ba0200: Received LOGO request while in state PLOGI rport ba0200: Delete port rport ba0200: work event 3 rport ba0200: lld callback ev 3 bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200 bnx2fc: ba0200 - rport not created Yet!! /* Here we reset any outstanding exchanges before freeing rport using the exch_mgr_reset() */ xid 1052: Exchange timer canceled /* Here we got two responses for one xid */ xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 Skip the response if the exchange is already completed. Link: https://lore.kernel.org/r/20201215194731.2326-1-jhasan@marvell.com Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: scsi_transport_srp: Don't block target in failfast stateMartin Wilck2021-01-131-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the port is in SRP_RPORT_FAIL_FAST state when srp_reconnect_rport() is entered, a transition to SDEV_BLOCK would be illegal, and a kernel WARNING would be triggered. Skip scsi_target_block() in this case. Link: https://lore.kernel.org/r/20210111142541.21534-1-mwilck@suse.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: docs: ABI: sysfs-driver-ufs: Rectify table formattingLukas Bulwahn2021-01-131-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0b2894cd0fdf ("scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode") adds new entries in tables of sysfs-driver-ufs ABI documentation, but formatted the table incorrectly. Hence, make htmldocs warns: ./Documentation/ABI/testing/sysfs-driver-ufs:{915,956}: WARNING: Malformed table. Text in column margin in table line 15. Rectify table formatting for DeepSleep power mode. Link: https://lore.kernel.org/r/20210111102212.19377-1-lukas.bulwahn@gmail.com Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: ufs: Fix tm request when non-fatal error happensJaegeuk Kim2021-01-081-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When non-fatal error like line-reset happens, ufshcd_err_handler() starts to abort tasks by ufshcd_try_to_abort_task(). When it tries to issue a task management request, we hit two warnings: WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c After fixing the above warnings we hit another tm_cmd timeout which may be caused by unstable controller state: __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out Then, ufshcd_err_handler() enters full reset, and kernel gets stuck. It turned out ufshcd_print_trs() printed too many messages on console which requires CPU locks. Likewise hba->silence_err_logs, we need to avoid too verbose messages. This is actually not an error case. Link: https://lore.kernel.org/r/20210107185316.788815-3-jaegeuk@kernel.org Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns()Jaegeuk Kim2021-01-081-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When gate_work/ungate_work experience an error during hibern8_enter or exit we can livelock: ufshcd_err_handler() ufshcd_scsi_block_requests() ufshcd_reset_and_restore() ufshcd_clear_ua_wluns() -> stuck ufshcd_scsi_unblock_requests() In order to avoid this, ufshcd_clear_ua_wluns() can be called per recovery flows such as suspend/resume, link_recovery, and error_handler. Link: https://lore.kernel.org/r/20210107185316.788815-2-jaegeuk@kernel.org Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd resets") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: ibmvfc: Fix missing cast of ibmvfc_event pointer to u64 handleTyrel Datwyler2021-01-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2aa0102c6688 ("scsi: ibmvfc: Use correlation token to tag commands") sets the vfcFrame correlation token to the pointer handle of the associated ibmvfc_event. However, that commit failed to cast the pointer to an appropriate type which in this case is a u64. As such sparse warnings are generated for both correlation token assignments. ibmvfc.c:2375:36: sparse: incorrect type in argument 1 (different base types) ibmvfc.c:2375:36: sparse: expected unsigned long long [usertype] val ibmvfc.c:2375:36: sparse: got struct ibmvfc_event *[assigned] evt Add the appropriate u64 casts when assigning an ibmvfc_event as a correlation token. Link: https://lore.kernel.org/r/20210106203721.1054693-1-tyreld@linux.ibm.com Fixes: 2aa0102c6688 ("scsi: ibmvfc: Use correlation token to tag commands") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEMRandy Dunlap2021-01-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building ufshcd-pltfrm.c on arch/s390/ has a linker error since S390 does not support IOMEM, so add a dependency on HAS_IOMEM. s390-linux-ld: drivers/scsi/ufs/ufshcd-pltfrm.o: in function `ufshcd_pltfrm_init': ufshcd-pltfrm.c:(.text+0x38e): undefined reference to `devm_platform_ioremap_resource' where that devm_ function is inside an #ifdef CONFIG_HAS_IOMEM/#endif block. Link: lore.kernel.org/r/202101031125.ZEFCUiKi-lkp@intel.com Link: https://lore.kernel.org/r/20210106040822.933-1-rdunlap@infradead.org Fixes: 03b1781aa978 ("[SCSI] ufs: Add Platform glue driver for ufshcd") Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Avri Altman <avri.altman@wdc.com> Cc: linux-scsi@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regressionArnd Bergmann2021-01-081-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Phil Oester reported that a fix for a possible buffer overrun that I sent caused a regression that manifests in this output: Event Message: A PCI parity error was detected on a component at bus 0 device 5 function 0. Severity: Critical Message ID: PCI1308 The original code tried to handle the sense data pointer differently when using 32-bit 64-bit DMA addressing, which would lead to a 32-bit dma_addr_t value of 0x11223344 to get stored 32-bit kernel: 44 33 22 11 ?? ?? ?? ?? 64-bit LE kernel: 44 33 22 11 00 00 00 00 64-bit BE kernel: 00 00 00 00 44 33 22 11 or a 64-bit dma_addr_t value of 0x1122334455667788 to get stored as 32-bit kernel: 88 77 66 55 ?? ?? ?? ?? 64-bit kernel: 88 77 66 55 44 33 22 11 In my patch, I tried to ensure that the same value is used on both 32-bit and 64-bit kernels, and picked what seemed to be the most sensible combination, storing 32-bit addresses in the first four bytes (as 32-bit kernels already did), and 64-bit addresses in eight consecutive bytes (as 64-bit kernels already did), but evidently this was incorrect. Always storing the dma_addr_t pointer as 64-bit little-endian, i.e. initializing the second four bytes to zero in case of 32-bit addressing, apparently solved the problem for Phil, and is consistent with what all 64-bit little-endian machines did before. I also checked in the history that in previous versions of the code, the pointer was always in the first four bytes without padding, and that previous attempts to fix 64-bit user space, big-endian architectures and 64-bit DMA were clearly flawed and seem to have introduced made this worse. Link: https://lore.kernel.org/r/20210104234137.438275-1-arnd@kernel.org Fixes: 381d34e376e3 ("scsi: megaraid_sas: Check user-provided offsets") Fixes: 107a60dd71b5 ("scsi: megaraid_sas: Add support for 64bit consistent DMA") Fixes: 94cd65ddf4d7 ("[SCSI] megaraid_sas: addded support for big endian architecture") Fixes: 7b2519afa1ab ("[SCSI] megaraid_sas: fix 64 bit sense pointer truncation") Reported-by: Phil Oester <kernel@linuxace.com> Tested-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power modeAdrian Hunter2021-01-081-14/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update sysfs documentation for addition of DeepSleep power mode. Link: https://lore.kernel.org/r/20210104155026.16417-1-adrian.hunter@intel.com Fixes: fe1d4c2ebcae ("scsi: ufs: Add DeepSleep feature") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | | Merge tag 'linux-kselftest-kunit-fixes-5.11-rc5' of ↵Linus Torvalds2021-01-236-94/+141
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit fixes from Shuah : "Five fixes to the kunit tool and documentation from Daniel Latypov and David Gow" * tag 'linux-kselftest-kunit-fixes-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: tool: move kunitconfig parsing into __init__, make it optional kunit: tool: fix minor typing issue with None status kunit: tool: surface and address more typing issues Documentation: kunit: include example of a parameterized test kunit: tool: Fix spelling of "diagnostic" in kunit_parser
| * | | | kunit: tool: move kunitconfig parsing into __init__, make it optionalDaniel Latypov2021-01-162-28/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LinuxSourceTree will unceremoniously crash if the user doesn't call read_kunitconfig() first in a number of functions. And currently every place we create an instance, the caller also calls create_kunitconfig() and read_kunitconfig(). Move these instead into __init__() so they can't be forgotten and to reduce copy-paste. The https://github.com/google/pytype type-checker complained that _config wasn't initialized. With this, kunit_tool now type checks under both pytype and mypy. Add an optional boolean that can be used to disable this for use cases in the future where we might not need/want to load the config. Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
| * | | | kunit: tool: fix minor typing issue with None statusDaniel Latypov2021-01-161-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to handle aggregating statuses didn't check that the status actually got set to some non-None value. Default the value to SUCCESS instead of adding a bunch of `is None` checks. This sorta follows the precedent in commit 3fc48259d525 ("kunit: Don't fail test suites if one of them is empty"). Also slightly simplify the code and add type annotations. Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
| * | | | kunit: tool: surface and address more typing issuesDaniel Latypov2021-01-165-52/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The authors of this tool were more familiar with a different type-checker, https://github.com/google/pytype. That's open source, but mypy seems more prevalent (and runs faster). And unlike pytype, mypy doesn't try to infer types so it doesn't check unanotated functions. So annotate ~all functions in kunit tool to increase type-checking coverage. Note: per https://www.python.org/dev/peps/pep-0484/, `__init__()` should be annotated as `-> None`. Doing so makes mypy discover a number of new violations. Exclude main() since we reuse `request` for the different types of requests, which mypy isn't happy about. This commit fixes all but one error, where `TestSuite.status` might be None. Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Acked-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
| * | | | Documentation: kunit: include example of a parameterized testDaniel Latypov2021-01-161-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit fadb08e7c750 ("kunit: Support for Parameterized Testing") introduced support but lacks documentation for how to use it. This patch builds on commit 1f0e943df68a ("Documentation: kunit: provide guidance for testing many inputs") to show a minimal example of the new feature. Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Acked-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
| * | | | kunit: tool: Fix spelling of "diagnostic" in kunit_parserDavid Gow2021-01-161-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various helper functions were misspelling "diagnostic" in their names. It finally got annoying, so fix it. Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* | | | | Merge tag 'for-5.11/dm-fixes-2' of ↵Linus Torvalds2021-01-224-11/+54
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM integrity crash if "recalculate" used without "internal_hash" - Fix DM integrity "recalculate" support to prevent recalculating checksums if we use internal_hash or journal_hash with a key (e.g. HMAC). Use of crypto as a means to prevent malicious corruption requires further changes and was never a design goal for dm-integrity's primary usecase of detecting accidental corruption. - Fix a benign dm-crypt copy-and-paste bug introduced as part of a fix that was merged for 5.11-rc4. - Fix DM core's dm_get_device() to avoid filesystem lookup to get block device (if possible). * tag 'for-5.11/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: avoid filesystem lookup in dm_get_dev_t() dm crypt: fix copy and paste bug in crypt_alloc_req_aead dm integrity: conditionally disable "recalculate" feature dm integrity: fix a crash if "recalculate" used without "internal_hash"