summaryrefslogtreecommitdiffstats
path: root/include (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'soc-fixes-6.13-4' of ↵Linus Torvalds8 hours1-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Two last minute fixes: one build issue on TI soc drivers, and a regression in the renesas reset controller driver" * tag 'soc-fixes-6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: soc: ti: pruss: Fix pruss APIs reset: rzg2l-usbphy-ctrl: Assign proper of node to the allocated device
| * Merge tag 'ti-driver-soc-for-v6.14' of ↵Arnd Bergmann3 days1-6/+6
| |\ | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/ti/linux into arm/fixes TI SoC driver updates for v6.14 - Build fixup when CONFIG_TI_PRUSS is disabled.
| | * soc: ti: pruss: Fix pruss APIsMD Danish Anwar2025-01-021-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PRUSS APIs in pruss_driver.h produce lots of compilation errors when CONFIG_TI_PRUSS is not set. The errors and warnings, warning: returning 'void *' from a function with return type 'int' makes integer from pointer without a cast [-Wint-conversion] error: expected identifier or '(' before '{' token Fix these warnings and errors by fixing the return type of pruss APIs as well as removing the misplaced semicolon from pruss_cfg_xfr_enable() Fixes: 0211cc1e4fbb ("soc: ti: pruss: Add helper functions to set GPI mode, MII_RT_event and XFR") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20241220100508.1554309-2-danishanwar@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
* | | Merge tag 'net-6.13-rc8' of ↵Linus Torvalds38 hours4-24/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Notably this includes fixes for a few regressions spotted very recently. No known outstanding ones. Current release - regressions: - core: avoid CFI problems with sock priv helpers - xsk: bring back busy polling support - netpoll: ensure skb_pool list is always initialized Current release - new code bugs: - core: make page_pool_ref_netmem work with net iovs - ipv4: route: fix drop reason being overridden in ip_route_input_slow - udp: make rehash4 independent in udp_lib_rehash() Previous releases - regressions: - bpf: fix bpf_sk_select_reuseport() memory leak - openvswitch: fix lockup on tx to unregistering netdev with carrier - mptcp: be sure to send ack when mptcp-level window re-opens - eth: - bnxt: always recalculate features after XDP clearing, fix null-deref - mlx5: fix sub-function add port error handling - fec: handle page_pool_dev_alloc_pages error Previous releases - always broken: - vsock: some fixes due to transport de-assignment - eth: - ice: fix E825 initialization - mlx5e: fix inversion dependency warning while enabling IPsec tunnel - gtp: destroy device along with udp socket's netns dismantle. - xilinx: axienet: Fix IRQ coalescing packet count overflow" * tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) netdev: avoid CFI problems with sock priv helpers net/mlx5e: Always start IPsec sequence number from 1 net/mlx5e: Rely on reqid in IPsec tunnel mode net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel net/mlx5: Clear port select structure when fail to create net/mlx5: SF, Fix add port error handling net/mlx5: Fix a lockdep warning as part of the write combining test net/mlx5: Fix RDMA TX steering prio net: make page_pool_ref_netmem work with net iovs net: ethernet: xgbe: re-add aneg to supported features in PHY quirks net: pcs: xpcs: actively unset DW_VR_MII_DIG_CTRL1_2G5_EN for 1G SGMII net: pcs: xpcs: fix DW_VR_MII_DIG_CTRL1_2G5_EN bit being set for 1G SGMII w/o inband selftests: net: Adapt ethtool mq tests to fix in qdisc graft net: fec: handle page_pool_dev_alloc_pages error net: netpoll: ensure skb_pool list is always initialized net: xilinx: axienet: Fix IRQ coalescing packet count overflow nfp: bpf: prevent integer overflow in nfp_bpf_event_output() selftests: mptcp: avoid spurious errors on disconnect mptcp: fix spurious wake-up on under memory pressure mptcp: be sure to send ack when mptcp-level window re-opens ...
| * | | net: make page_pool_ref_netmem work with net iovsPavel Begunkov2 days1-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | page_pool_ref_netmem() should work with either netmem representation, but currently it casts to a page with netmem_to_page(), which will fail with net iovs. Use netmem_get_pp_ref_count_ref() instead. Fixes: 8ab79ed50cf1 ("page_pool: devmem support") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/20250108220644.3528845-2-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | xsk: Bring back busy polling supportStanislav Fomichev7 days3-23/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 86e25f40aa1e ("net: napi: Add napi_config") moved napi->napi_id assignment to a later point in time (napi_hash_add_with_id). This breaks __xdp_rxq_info_reg which copies napi_id at an earlier time and now stores 0 napi_id. It also makes sk_mark_napi_id_once_xdp and __sk_mark_napi_id_once useless because they now work against 0 napi_id. Since sk_busy_loop requires valid napi_id to busy-poll on, there is no way to busy-poll AF_XDP sockets anymore. Bring back the ability to busy-poll on XSK by resolving socket's napi_id at bind time. This relies on relatively recent netif_queue_set_napi, but (assume) at this point most popular drivers should have been converted. This also removes per-tx/rx cycles which used to check and/or set the napi_id value. Confirmed by running a busy-polling AF_XDP socket (github.com/fomichev/xskrtt) on mlx5 and looking at BusyPollRxPackets from /proc/net/netstat. Fixes: 86e25f40aa1e ("net: napi: Add napi_config") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250109003436.2829560-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | | Merge tag 'seccomp-v6.13-rc8' of ↵Linus Torvalds3 days1-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fix from Kees Cook: "Fix a randconfig failure: - Unconditionally define stub for !CONFIG_SECCOMP (Linus Walleij)" * tag 'seccomp-v6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Stub for !CONFIG_SECCOMP
| * | | | seccomp: Stub for !CONFIG_SECCOMPLinus Walleij9 days1-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using !CONFIG_SECCOMP with CONFIG_GENERIC_ENTRY, the randconfig bots found the following snag: kernel/entry/common.c: In function 'syscall_trace_enter': >> kernel/entry/common.c:52:23: error: implicit declaration of function '__secure_computing' [-Wimplicit-function-declaration] 52 | ret = __secure_computing(NULL); | ^~~~~~~~~~~~~~~~~~ Since generic entry calls __secure_computing() unconditionally, fix this by moving the stub out of the ifdef clause for CONFIG_HAVE_ARCH_SECCOMP_FILTER so it's always available. Link: https://lore.kernel.org/oe-kbuild-all/202501061240.Fzk9qiFZ-lkp@intel.com/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20250108-seccomp-stub-2-v2-1-74523d49420f@linaro.org Signed-off-by: Kees Cook <kees@kernel.org>
* | | | | Merge tag 'mm-hotfixes-stable-2025-01-13-00-03' of ↵Linus Torvalds5 days3-2/+15
|\ \ \ \ \ | |_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "18 hotfixes. 11 are cc:stable. 13 are MM and 5 are non-MM. All patches are singletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2025-01-13-00-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: fs/proc: fix softlockup in __read_vmcore (part 2) mm: fix assertion in folio_end_read() mm: vmscan : pgdemote vmstat is not getting updated when MGLRU is enabled. vmstat: disable vmstat_work on vmstat_cpu_down_prep() zram: fix potential UAF of zram table selftests/mm: set allocated memory to non-zero content in cow test mm: clear uffd-wp PTE/PMD state on mremap() module: fix writing of livepatch relocations in ROX text mm: zswap: properly synchronize freeing resources during CPU hotunplug Revert "mm: zswap: fix race between [de]compression and CPU hotunplug" hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inode mm: fix div by zero in bdi_ratio_from_pages x86/execmem: fix ROX cache usage in Xen PV guests filemap: avoid truncating 64-bit offset to 32 bits tools: fix atomic_set() definition to set the value correctly mm/mempolicy: count MPOL_WEIGHTED_INTERLEAVE to "interleave_hit" scripts/decode_stacktrace.sh: fix decoding of lines with an additional info mm/kmemleak: fix percpu memory leak detection failure
| * | | | mm: clear uffd-wp PTE/PMD state on mremap()Ryan Roberts5 days1-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When mremap()ing a memory region previously registered with userfaultfd as write-protected but without UFFD_FEATURE_EVENT_REMAP, an inconsistency in flag clearing leads to a mismatch between the vma flags (which have uffd-wp cleared) and the pte/pmd flags (which do not have uffd-wp cleared). This mismatch causes a subsequent mprotect(PROT_WRITE) to trigger a warning in page_table_check_pte_flags() due to setting the pte to writable while uffd-wp is still set. Fix this by always explicitly clearing the uffd-wp pte/pmd flags on any such mremap() so that the values are consistent with the existing clearing of VM_UFFD_WP. Be careful to clear the logical flag regardless of its physical form; a PTE bit, a swap PTE bit, or a PTE marker. Cover PTE, huge PMD and hugetlb paths. Link: https://lkml.kernel.org/r/20250107144755.1871363-2-ryan.roberts@arm.com Co-developed-by: Mikołaj Lenczewski <miko.lenczewski@arm.com> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Closes: https://lore.kernel.org/linux-mm/810b44a8-d2ae-4107-b665-5a42eae2d948@arm.com/ Fixes: 63b2d4174c4a ("userfaultfd: wp: add the writeprotect API to userfaultfd ioctl") Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | | | module: fix writing of livepatch relocations in ROX textPetr Pavlu5 days1-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A livepatch module can contain a special relocation section .klp.rela.<objname>.<secname> to apply its relocations at the appropriate time and to additionally access local and unexported symbols. When <objname> points to another module, such relocations are processed separately from the regular module relocation process. For instance, only when the target <objname> actually becomes loaded. With CONFIG_STRICT_MODULE_RWX, when the livepatch core decides to apply these relocations, their processing results in the following bug: [ 25.827238] BUG: unable to handle page fault for address: 00000000000012ba [ 25.827819] #PF: supervisor read access in kernel mode [ 25.828153] #PF: error_code(0x0000) - not-present page [ 25.828588] PGD 0 P4D 0 [ 25.829063] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI [ 25.829742] CPU: 2 UID: 0 PID: 452 Comm: insmod Tainted: G O K 6.13.0-rc4-00078-g059dd502b263 #7820 [ 25.830417] Tainted: [O]=OOT_MODULE, [K]=LIVEPATCH [ 25.830768] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014 [ 25.831651] RIP: 0010:memcmp+0x24/0x60 [ 25.832190] Code: [...] [ 25.833378] RSP: 0018:ffffa40b403a3ae8 EFLAGS: 00000246 [ 25.833637] RAX: 0000000000000000 RBX: ffff93bc81d8e700 RCX: ffffffffc0202000 [ 25.834072] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 00000000000012ba [ 25.834548] RBP: ffffa40b403a3b68 R08: ffffa40b403a3b30 R09: 0000004a00000002 [ 25.835088] R10: ffffffffffffd222 R11: f000000000000000 R12: 0000000000000000 [ 25.835666] R13: ffffffffc02032ba R14: ffffffffc007d1e0 R15: 0000000000000004 [ 25.836139] FS: 00007fecef8c3080(0000) GS:ffff93bc8f900000(0000) knlGS:0000000000000000 [ 25.836519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 25.836977] CR2: 00000000000012ba CR3: 0000000002f24000 CR4: 00000000000006f0 [ 25.837442] Call Trace: [ 25.838297] <TASK> [ 25.841083] __write_relocate_add.constprop.0+0xc7/0x2b0 [ 25.841701] apply_relocate_add+0x75/0xa0 [ 25.841973] klp_write_section_relocs+0x10e/0x140 [ 25.842304] klp_write_object_relocs+0x70/0xa0 [ 25.842682] klp_init_object_loaded+0x21/0xf0 [ 25.842972] klp_enable_patch+0x43d/0x900 [ 25.843572] do_one_initcall+0x4c/0x220 [ 25.844186] do_init_module+0x6a/0x260 [ 25.844423] init_module_from_file+0x9c/0xe0 [ 25.844702] idempotent_init_module+0x172/0x270 [ 25.845008] __x64_sys_finit_module+0x69/0xc0 [ 25.845253] do_syscall_64+0x9e/0x1a0 [ 25.845498] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 25.846056] RIP: 0033:0x7fecef9eb25d [ 25.846444] Code: [...] [ 25.847563] RSP: 002b:00007ffd0c5d6de8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 25.848082] RAX: ffffffffffffffda RBX: 000055b03f05e470 RCX: 00007fecef9eb25d [ 25.848456] RDX: 0000000000000000 RSI: 000055b001e74e52 RDI: 0000000000000003 [ 25.848969] RBP: 00007ffd0c5d6ea0 R08: 0000000000000040 R09: 0000000000004100 [ 25.849411] R10: 00007fecefac7b20 R11: 0000000000000246 R12: 000055b001e74e52 [ 25.849905] R13: 0000000000000000 R14: 000055b03f05e440 R15: 0000000000000000 [ 25.850336] </TASK> [ 25.850553] Modules linked in: deku(OK+) uinput [ 25.851408] CR2: 00000000000012ba [ 25.852085] ---[ end trace 0000000000000000 ]--- The problem is that the .klp.rela.<objname>.<secname> relocations are processed after the module was already formed and mod->rw_copy was reset. However, the code in __write_relocate_add() calls module_writable_address() which translates the target address 'loc' still to 'loc + (mem->rw_copy - mem->base)', with mem->rw_copy now being 0. Fix the problem by returning directly 'loc' in module_writable_address() when the module is already formed. Function __write_relocate_add() knows to use text_poke() in such a case. Link: https://lkml.kernel.org/r/20250107153507.14733-1-petr.pavlu@suse.com Fixes: 0c133b1e78cd ("module: prepare to handle ROX allocations for text") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reported-by: Marek Maslanka <mmaslanka@google.com> Closes: https://lore.kernel.org/linux-modules/CAGcaFA2hdThQV6mjD_1_U+GNHThv84+MQvMWLgEuX+LVbAyDxg@mail.gmail.com/ Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | | | hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inodeMuchun Song5 days1-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hugetlb_file_setup() will pass a NULL @dir to hugetlbfs_get_inode(), so we will access a NULL pointer for @dir. Fix it and set __entry->dr to 0 if @dir is NULL. Because ->i_ino cannot be 0 (see get_next_ino()), there is no confusing if user sees a 0 inode number. Link: https://lkml.kernel.org/r/20250106033118.4640-1-songmuchun@bytedance.com Fixes: 318580ad7f28 ("hugetlbfs: support tracepoint") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reported-by: Cheung Wall <zzqq0103.hey@gmail.com> Closes: https://lore.kernel.org/linux-mm/02858D60-43C1-4863-A84F-3C76A8AF1F15@linux.dev/T/# Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Cc: cheung wall <zzqq0103.hey@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds5 days1-4/+2
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: "The largest part here is for KVM/PPC, where a NULL pointer dereference was introduced in the 6.13 merge window and is now fixed. There's some "holiday-induced lateness", as the s390 submaintainer put it, but otherwise things looks fine. s390: - fix a latent bug when the kernel is compiled in debug mode - two small UCONTROL fixes and their selftests arm64: - always check page state in hyp_ack_unshare() - align set_id_regs selftest with the fact that ASIDBITS field is RO - various vPMU fixes for bugs that only affect nested virt PPC e500: - Fix a mostly impossible (but just wrong) case where IRQs were never re-enabled - Observe host permissions instead of mapping readonly host pages as guest-writable. This fixes a NULL-pointer dereference in 6.13 - Replace brittle VMA-based attempts at building huge shadow TLB entries with PTE lookups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: e500: perform hugepage check after looking up the PFN KVM: e500: map readonly host pages for read KVM: e500: track host-writability of pages KVM: e500: use shadow TLB entry as witness for writability KVM: e500: always restore irqs KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftest KVM: s390: selftests: Add ucontrol gis routing test KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs KVM: s390: selftests: Add ucontrol flic attr selftests KVM: s390: Reject setting flic pfault attributes on ucontrol VMs KVM: s390: vsie: fix virtual/physical address in unpin_scb() KVM: arm64: Only apply PMCR_EL0.P to the guest range of counters KVM: arm64: nv: Reload PMU events upon MDCR_EL2.HPME change KVM: arm64: Use KVM_REQ_RELOAD_PMU to handle PMCR_EL0.E change KVM: arm64: Add unified helper for reprogramming counters by mask KVM: arm64: Always check the state from hyp_ack_unshare() KVM: arm64: Fix set_id_regs selftest for ASIDBITS becoming unwritable
| * | | | | Merge tag 'kvm-s390-master-6.13-1' of ↵Paolo Bonzini6 days15-63/+168
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: three small bugfixes Fix a latent bug when the kernel is compiled in debug mode. Two small UCONTROL fixes and their selftests.
| * | | | | Merge tag 'kvmarm-fixes-6.13-3' of ↵Paolo Bonzini6 days1-4/+2
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.13, part #3 - Always check page state in hyp_ack_unshare() - Align set_id_regs selftest with the fact that ASIDBITS field is RO - Various vPMU fixes for bugs that only affect nested virt
| | * | | | | KVM: arm64: Add unified helper for reprogramming counters by maskOliver Upton2024-12-181-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Having separate helpers for enabling/disabling counters provides the wrong abstraction, as the state of each counter needs to be evaluated independently and, in some cases, use a different global enable bit. Collapse the enable/disable accessors into a single, common helper that reconfigures every counter set in @mask, leaving the complexity of determining if an event is actually enabled in kvm_pmu_counter_is_enabled(). Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241217175513.3658056-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* | | | | | | Merge tag 'soc-fixes-6.13-3' of ↵Linus Torvalds7 days1-1/+1
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Over the Christmas break a couple of devicetree fixes came in for Rockchips, Qualcomm and NXP/i.MX. These add some missing board specific properties, address build time warnings, The USB/TOG supoprt on X1 Elite regressed, so two earlier DT changes get reverted for now. Aside from the devicetree fixes, there is One build fix for the stm32 firewall driver, and a defconfig change to enable SPDIF support for i.MX" * tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firewall: remove misplaced semicolon from stm32_firewall_get_firewall arm64: dts: rockchip: add hevc power domain clock to rk3328 arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S arm64: dts: qcom: sa8775p: fix the secure device bootup issue Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers" Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports" arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports" ARM: dts: imxrt1050: Fix clocks for mmc ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF arm64: dts: imx95: correct the address length of netcmix_blk_ctrl arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B arm64: dts: rockchip: add reset-names for combphy on rk3568 arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions
| * | | | | | | firewall: remove misplaced semicolon from stm32_firewall_get_firewallguanjing8 days1-1/+1
| | |_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove misplaced colon in stm32_firewall_get_firewall() which results in a syntax error when the code is compiled without CONFIG_STM32_FIREWALL. Fixes: 5c9668cfc6d7 ("firewall: introduce stm32_firewall framework") Signed-off-by: guanjing <guanjing@cmss.chinamobile.com> Reviewed-by: Gatien Chevallier <gatien.chevallier@foss.st.com> Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | | | | | | Merge tag 'vfs-6.13-rc7.fixes.2' of ↵Linus Torvalds8 days3-26/+20
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "afs: - Fix the maximum cell name length - Fix merge preference rule failure condition fuse: - Fix fuse_get_user_pages() so it doesn't risk misleading the caller to think pages have been allocated when they actually haven't - Fix direct-io folio offset and length calculation netfs: - Fix async direct-io handling - Fix read-retry for filesystems that don't provide a ->prepare_read() method vfs: - Prevent truncating 64-bit offsets to 32-bits in iomap - Fix memory barrier interactions when polling - Remove MNT_ONRB to fix concurrent modification of @mnt->mnt_flags leading to MNT_ONRB to not be raised and invalid access to a list member" * tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: poll: kill poll_does_not_wait() sock_poll_wait: kill the no longer necessary barrier after poll_wait() io_uring_poll: kill the no longer necessary barrier after poll_wait() poll_wait: kill the obsolete wait_address check poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() afs: Fix merge preference rule failure condition netfs: Fix read-retry for fs with no ->prepare_read() netfs: Fix kernel async DIO fs: kill MNT_ONRB iomap: avoid avoid truncating 64-bit offset to 32 bits afs: Fix the maximum cell name length fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure fuse: fix direct io folio offset and length calculation
| * \ \ \ \ \ \ Merge branch 'vfs-6.14.poll' into vfs.fixesChristian Brauner8 days2-24/+19
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring in the fixes for __pollwait() and waitqueue_active() interactions. Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | | | | | | poll: kill poll_does_not_wait()Oleg Nesterov8 days1-13/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It no longer has users. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162743.GA18947@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | | | | | | sock_poll_wait: kill the no longer necessary barrier after poll_wait()Oleg Nesterov8 days1-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that poll_wait() provides a full barrier we can remove smp_mb() from sock_poll_wait(). Also, the poll_does_not_wait() check before poll_wait() just adds the unnecessary confusion, kill it. poll_wait() does the same "p && p->_qproc" check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162736.GA18944@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | | | | | | poll_wait: kill the obsolete wait_address checkOleg Nesterov8 days1-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This check is historical and no longer needed, wait_address is never NULL. These days we rely on the poll_table->_qproc check. NULL if select/poll is not going to sleep, or it already has a data to report, or all waiters have already been registered after the 1st iteration. However, poll_table *p can be NULL, see p9_fd_poll() for example, so we can't remove the "p != NULL" check. Link: https://lore.kernel.org/all/20250106180325.GF7233@redhat.com/ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162724.GA18926@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | | | | | | poll_wait: add mb() to fix theoretical race between waitqueue_active() and ↵Oleg Nesterov8 days1-1/+9
| | | |_|_|_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | .poll() As the comment above waitqueue_active() explains, it can only be used if both waker and waiter have mb()'s that pair with each other. However __pollwait() is broken in this respect. This is not pipe-specific, but let's look at pipe_poll() for example: poll_wait(...); // -> __pollwait() -> add_wait_queue() LOAD(pipe->head); LOAD(pipe->head); In theory these LOAD()'s can leak into the critical section inside add_wait_queue() and can happen before list_add(entry, wq_head), in this case pipe_poll() can race with wakeup_pipe_readers/writers which do smp_mb(); if (waitqueue_active(wq_head)) wake_up_interruptible(wq_head); There are more __pollwait()-like functions (grep init_poll_funcptr), and it seems that at least ep_ptable_queue_proc() has the same problem, so the patch adds smp_mb() into poll_wait(). Link: https://lore.kernel.org/all/20250102163320.GA17691@redhat.com/ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162717.GA18922@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| * | | | | | | Merge tag 'vfs-6.14-rc7.mount.fixes'Christian Brauner9 days1-2/+1
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring in the fix for the mount namespace rbtree. It is used as the base for the vfs mount work for this cycle and so shouldn't be applied directly. Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | | | | | | fs: kill MNT_ONRBChristian Brauner9 days1-2/+1
| | |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move mnt->mnt_node into the union with mnt->mnt_rcu and mnt->mnt_llist instead of keeping it with mnt->mnt_list. This allows us to use RB_CLEAR_NODE(&mnt->mnt_node) in umount_tree() as well as list_empty(&mnt->mnt_node). That in turn allows us to remove MNT_ONRB. This also fixes the bug reported in [1] where seemingly MNT_ONRB wasn't set in @mnt->mnt_flags even though the mount was present in the mount rbtree of the mount namespace. The root cause is the following race. When a btrfs subvolume is mounted a temporary mount is created: btrfs_get_tree_subvol() { mnt = fc_mount() // Register the newly allocated mount with sb->mounts: lock_mount_hash(); list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts); unlock_mount_hash(); } and registered on sb->s_mounts. Later it is added to an anonymous mount namespace via mount_subvol(): -> mount_subvol() -> mount_subtree() -> alloc_mnt_ns() mnt_add_to_ns() vfs_path_lookup() put_mnt_ns() The mnt_add_to_ns() call raises MNT_ONRB in @mnt->mnt_flags. If someone concurrently does a ro remount: reconfigure_super() -> sb_prepare_remount_readonly() { list_for_each_entry(mnt, &sb->s_mounts, mnt_instance) { } all mounts registered in sb->s_mounts are visited and first MNT_WRITE_HOLD is raised, then MNT_READONLY is raised, and finally MNT_WRITE_HOLD is removed again. The flag modification for MNT_WRITE_HOLD/MNT_READONLY and MNT_ONRB race so MNT_ONRB might be lost. Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree") Cc: <stable@kernel.org> # v6.8+ Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-1-fd55922c4af8@kernel.org Link: https://lore.kernel.org/r/ec6784ed-8722-4695-980a-4400d4e7bd1a@gmx.com [1] Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | | | | | Merge tag 'regulator-fix-v6.13-rc6' of ↵Linus Torvalds8 days1-45/+32
|\ \ \ \ \ \ \ \ | |_|_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of fixes for !REGULATOR and !OF configurations, adding missing stubs" * tag 'regulator-fix-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Move OF_ API declarations/definitions outside CONFIG_REGULATOR regulator: Guard of_regulator_bulk_get_all() with CONFIG_OF
| * | | | | | | regulator: Move OF_ API declarations/definitions outside CONFIG_REGULATORManivannan Sadhasivam12 days1-46/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since these are hidden inside CONFIG_REGULATOR, building the consumer drivers without CONFIG_REGULATOR will result in the following build error: >> drivers/pci/pwrctrl/slot.c:39:15: error: implicit declaration of function 'of_regulator_bulk_get_all'; did you mean 'regulator_bulk_get'? [-Werror=implicit-function-declaration] 39 | ret = of_regulator_bulk_get_all(dev, dev_of_node(dev), | ^~~~~~~~~~~~~~~~~~~~~~~~~ | regulator_bulk_get cc1: some warnings being treated as errors This also removes the duplicated definitions that were possibly added to fix the build issues. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501020407.HmQQQKa0-lkp@intel.com/ Fixes: 27b9ecc7a9ba ("regulator: Add of_regulator_bulk_get_all") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://patch.msgid.link/20250104115058.19216-3-manivannan.sadhasivam@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
| * | | | | | | regulator: Guard of_regulator_bulk_get_all() with CONFIG_OFManivannan Sadhasivam12 days1-8/+9
| | |_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the definition is in drivers/regulator/of_regulator.c and compiled only if CONFIG_OF is enabled, building the consumer driver without CONFIG_OF and with CONFIG_REGULATOR will result in below build error: ERROR: modpost: "of_regulator_bulk_get_all" [drivers/pci/pwrctrl/pci-pwrctl-slot.ko] undefined! Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412181640.12Iufkvd-lkp@intel.com/ Fixes: 27b9ecc7a9ba ("regulator: Add of_regulator_bulk_get_all") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://patch.msgid.link/20250104115058.19216-2-manivannan.sadhasivam@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
* | | | | | | Merge tag 'net-6.13-rc7' of ↵Linus Torvalds8 days1-1/+1
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, Bluetooth and WPAN. No outstanding fixes / investigations at this time. Current release - new code bugs: - eth: fbnic: revert HWMON support, it doesn't work at all and revert is similar size as the fixes Previous releases - regressions: - tcp: allow a connection when sk_max_ack_backlog is zero - tls: fix tls_sw_sendmsg error handling Previous releases - always broken: - netdev netlink family: - prevent accessing NAPI instances from another namespace - don't dump Tx and uninitialized NAPIs - net: sysctl: avoid using current->nsproxy, fix null-deref if task is exiting and stick to opener's netns - sched: sch_cake: add bounds checks to host bulk flow fairness counts Misc: - annual cleanup of inactive maintainers" * tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy sctp: sysctl: udp_port: avoid using current->nsproxy sctp: sysctl: auth_enable: avoid using current->nsproxy sctp: sysctl: rto_min/max: avoid using current->nsproxy sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy mptcp: sysctl: blackhole timeout: avoid using current->nsproxy mptcp: sysctl: sched: avoid using current->nsproxy mptcp: sysctl: avail sched: remove write access MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET MAINTAINERS: remove Ying Xue from TIPC MAINTAINERS: remove Mark Lee from MediaTek Ethernet MAINTAINERS: mark stmmac ethernet as an Orphan MAINTAINERS: remove Andy Gospodarek from bonding MAINTAINERS: update maintainers for Microchip LAN78xx MAINTAINERS: mark Synopsys DW XPCS as Orphan net/mlx5: Fix variable not being completed when function returns rtase: Fix a check for error in rtase_alloc_msix() net: stmmac: dwmac-tegra: Read iommu stream id from device tree ...
| * | | | | | | tcp/dccp: allow a connection when sk_max_ack_backlog is zeroZhongqiu Duan14 days1-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the backlog of listen() is set to zero, sk_acceptq_is_full() allows one connection to be made, but inet_csk_reqsk_queue_is_full() does not. When the net.ipv4.tcp_syncookies is zero, inet_csk_reqsk_queue_is_full() will cause an immediate drop before the sk_acceptq_is_full() check in tcp_conn_request(), resulting in no connection can be made. This patch tries to keep consistent with 64a146513f8f ("[NET]: Revert incorrect accept queue backlog changes."). Link: https://lore.kernel.org/netdev/20250102080258.53858-1-kuniyu@amazon.com/ Fixes: ef547f2ac16b ("tcp: remove max_qlen_log") Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250102171426.915276-1-dzq.aishenghu0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | | | | | | Merge tag 'for-6.13-rc6-tag' of ↵Linus Torvalds9 days1-0/+10
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes. Besides the one-liners in Btrfs there's fix to the io_uring and encoded read integration (added in this development cycle). The update to io_uring provides more space for the ongoing command that is then used in Btrfs to handle some cases. - io_uring and encoded read: - provide stable storage for io_uring command data - make a copy of encoded read ioctl call, reuse that in case the call would block and will be called again - properly initialize zlib context for hardware compression on s390 - fix max extent size calculation on filesystems with non-zoned devices - fix crash in scrub on crafted image due to invalid extent tree" * tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path btrfs: zoned: calculate max_extent_size properly on non-zoned setup btrfs: avoid NULL pointer dereference if no valid extent tree btrfs: don't read from userspace twice in btrfs_uring_encoded_read() io_uring: add io_uring_cmd_get_async_data helper io_uring/cmd: add per-op data to struct io_uring_cmd_data io_uring/cmd: rename struct uring_cache to io_uring_cmd_data
| * | | | | | | | io_uring: add io_uring_cmd_get_async_data helperMark Harmstone12 days1-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a helper function in include/linux/io_uring/cmd.h to read the async_data pointer from a struct io_uring_cmd. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | | | | io_uring/cmd: add per-op data to struct io_uring_cmd_dataJens Axboe12 days1-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case an op handler for ->uring_cmd() needs stable storage for user data, it can allocate io_uring_cmd_data->op_data and use it for the duration of the request. When the request gets cleaned up, uring_cmd will free it automatically. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | | | | | io_uring/cmd: rename struct uring_cache to io_uring_cmd_dataJens Axboe12 days1-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for making this more generically available for ->uring_cmd() usage that needs stable command data, rename it and move it to io_uring/cmd.h instead. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>
* | | | | | | | | Merge tag 'scsi-fixes' of ↵Linus Torvalds9 days1-2/+0
|\ \ \ \ \ \ \ \ \ | |_|_|_|/ / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four driver fixes in UFS, mostly to do with power management" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: qcom: Power down the controller/device during system suspend for SM8550/SM8650 SoCs scsi: ufs: qcom: Allow passing platform specific OF data scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()
| * | | | | | | | scsi: ufs: qcom: Power off the PHY if it was already powered on in ↵Manivannan Sadhasivam2025-01-021-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ufs_qcom_power_up_sequence() PHY might already be powered on during ufs_qcom_power_up_sequence() in a couple of cases: 1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk 2. Resuming from spm_lvl = 5 suspend In those cases, it is necessary to call phy_power_off() and phy_exit() in ufs_qcom_power_up_sequence() function to power off the PHY before calling phy_init() and phy_power_on(). Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is not handled. So to satisfy both cases, call phy_power_off() and phy_exit() if the phy_count is non-zero. And with this change, the reinit_notify() callback is no longer needed. This fixes the below UFS resume failure with spm_lvl = 5: ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5 ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5 ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5 Cc: stable@vger.kernel.org # 6.3 Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device") Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* | | | | | | | | Merge tag 'vfs-6.13-rc7.fixes' of ↵Linus Torvalds12 days2-5/+4
|\ \ \ \ \ \ \ \ \ | |_|_|_|_|_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Relax assertions on failure to encode file handles The ->encode_fh() method can fail for various reasons. None of them warrant a WARN_ON(). - Fix overlayfs file handle encoding by allowing encoding an fid from an inode without an alias - Make sure fuse_dir_open() handles FOPEN_KEEP_CACHE. If it's not specified fuse needs to invaludate the directory inode page cache - Fix qnx6 so it builds with gcc-15 - Various fixes for netfslib and ceph and nfs filesystems: - Ignore silly rename files from afs and nfs when building header archives - Fix read result collection in netfslib with multiple subrequests - Handle ENOMEM for netfslib buffered reads - Fix oops in nfs_netfs_init_request() - Parse the secctx command immediately in cachefiles - Remove a redundant smp_rmb() in netfslib - Handle recursion in read retry in netfslib - Fix clearing of folio_queue - Fix missing cancellation of copy-to_cache when the cache for a file is temporarly disabled in netfslib - Sanity check the hfs root record - Fix zero padding data issues in concurrent write scenarios - Fix is_mnt_ns_file() after converting nsfs to path_from_stashed() - Fix missing declaration of init_files - Increase I/O priority when writing revoke records in jbd2 - Flush filesystem device before updating tail sequence in jbd2 * tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) ovl: support encoding fid from inode with no alias ovl: pass realinode to ovl_encode_real_fh() instead of realdentry fuse: respect FOPEN_KEEP_CACHE on opendir netfs: Fix is-caching check in read-retry netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled netfs: Fix ceph copy to cache on write-begin netfs: Work around recursion by abandoning retry if nothing read netfs: Fix missing barriers by using clear_and_wake_up_bit() netfs: Remove redundant use of smp_rmb() cachefiles: Parse the "secctx" immediately nfs: Fix oops in nfs_netfs_init_request() when copying to cache netfs: Fix enomem handling in buffered reads netfs: Fix non-contiguous donation between completed reads kheaders: Ignore silly-rename files fs: relax assertions on failure to encode file handles fs: fix missing declaration of init_files fs: fix is_mnt_ns_file() iomap: fix zero padding data issue in concurrent append writes iomap: pass byte granular end position to iomap_add_to_ioend jbd2: flush filesystem device before updating tail sequence ...
| * | | | | | | | netfs: Fix is-caching check in read-retryDavid Howells2024-12-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | netfs: Fix is-caching check in read-retry The read-retry code checks the NETFS_RREQ_COPY_TO_CACHE flag to determine if there might be failed reads from the cache that need turning into reads from the server, with the intention of skipping the complicated part if it can. The code that set the flag, however, got lost during the read-side rewrite. Fix the check to see if the cache_resources are valid instead. The flag can then be removed. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/3752048.1734381285@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
| * | | | | | | | netfs: Work around recursion by abandoning retry if nothing readDavid Howells2024-12-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzkaller reported recursion with a loop of three calls (netfs_rreq_assess, netfs_retry_reads and netfs_rreq_terminated) hitting the limit of the stack during an unbuffered or direct I/O read. There are a number of issues: (1) There is no limit on the number of retries. (2) A subrequest is supposed to be abandoned if it does not transfer anything (NETFS_SREQ_NO_PROGRESS), but that isn't checked under all circumstances. (3) The actual root cause, which is this: if (atomic_dec_and_test(&rreq->nr_outstanding)) netfs_rreq_terminated(rreq, ...); When we do a retry, we bump the rreq->nr_outstanding counter to prevent the final cleanup phase running before we've finished dispatching the retries. The problem is if we hit 0, we have to do the cleanup phase - but we're in the cleanup phase and end up repeating the retry cycle, hence the recursion. Work around the problem by limiting the number of retries. This is based on Lizhi Xu's patch[1], and makes the following changes: (1) Replace NETFS_SREQ_NO_PROGRESS with NETFS_SREQ_MADE_PROGRESS and make the filesystem set it if it managed to read or write at least one byte of data. Clear this bit before issuing a subrequest. (2) Add a ->retry_count member to the subrequest and increment it any time we do a retry. (3) Remove the NETFS_SREQ_RETRYING flag as it is superfluous with ->retry_count. If the latter is non-zero, we're doing a retry. (4) Abandon a subrequest if retry_count is non-zero and we made no progress. (5) Use ->retry_count in both the write-side and the read-size. [?] Question: Should I set a hard limit on retry_count in both read and write? Say it hits 50, we always abandon it. The problem is that these changes only mitigate the issue. As long as it made at least one byte of progress, the recursion is still an issue. This patch mitigates the problem, but does not fix the underlying cause. I have patches that will do that, but it's an intrusive fix that's currently pending for the next merge window. The oops generated by KASAN looks something like: BUG: TASK stack guard page was hit at ffffc9000482ff48 (stack is ffffc90004830000..ffffc90004838000) Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN NOPTI ... RIP: 0010:mark_lock+0x25/0xc60 kernel/locking/lockdep.c:4686 ... mark_usage kernel/locking/lockdep.c:4646 [inline] __lock_acquire+0x906/0x3ce0 kernel/locking/lockdep.c:5156 lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5825 local_lock_acquire include/linux/local_lock_internal.h:29 [inline] ___slab_alloc+0x123/0x1880 mm/slub.c:3695 __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908 __slab_alloc_node mm/slub.c:3961 [inline] slab_alloc_node mm/slub.c:4122 [inline] kmem_cache_alloc_noprof+0x2a7/0x2f0 mm/slub.c:4141 radix_tree_node_alloc.constprop.0+0x1e8/0x350 lib/radix-tree.c:253 idr_get_free+0x528/0xa40 lib/radix-tree.c:1506 idr_alloc_u32+0x191/0x2f0 lib/idr.c:46 idr_alloc+0xc1/0x130 lib/idr.c:87 p9_tag_alloc+0x394/0x870 net/9p/client.c:321 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_zc_rpc.constprop.0+0x105/0x880 net/9p/client.c:793 p9_client_read_once+0x443/0x820 net/9p/client.c:1570 p9_client_read+0x13f/0x1b0 net/9p/client.c:1534 v9fs_issue_read+0x115/0x310 fs/9p/vfs_addr.c:74 netfs_retry_read_subrequests fs/netfs/read_retry.c:60 [inline] netfs_retry_reads+0x153a/0x1d00 fs/netfs/read_retry.c:232 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 ... netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_dispatch_unbuffered_reads fs/netfs/direct_read.c:103 [inline] netfs_unbuffered_read fs/netfs/direct_read.c:127 [inline] netfs_unbuffered_read_iter_locked+0x12f6/0x19b0 fs/netfs/direct_read.c:221 netfs_unbuffered_read_iter+0xc5/0x100 fs/netfs/direct_read.c:256 v9fs_file_read_iter+0xbf/0x100 fs/9p/vfs_file.c:361 do_iter_readv_writev+0x614/0x7f0 fs/read_write.c:832 vfs_readv+0x4cf/0x890 fs/read_write.c:1025 do_preadv fs/read_write.c:1142 [inline] __do_sys_preadv fs/read_write.c:1192 [inline] __se_sys_preadv fs/read_write.c:1187 [inline] __x64_sys_preadv+0x22d/0x310 fs/read_write.c:1187 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Closes: https://syzkaller.appspot.com/bug?extid=1fc6f64c40a9d143cfb6 Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241108034020.3695718-1-lizhi.xu@windriver.com/ [1] Link: https://lore.kernel.org/r/20241213135013.2964079-9-dhowells@redhat.com Tested-by: syzbot+885c03ad650731743489@syzkaller.appspotmail.com Suggested-by: Lizhi Xu <lizhi.xu@windriver.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Reported-by: syzbot+885c03ad650731743489@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| * | | | | | | | iomap: fix zero padding data issue in concurrent append writesLong Li2024-12-111-1/+1
| | |_|_|_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During concurrent append writes to XFS filesystem, zero padding data may appear in the file after power failure. This happens due to imprecise disk size updates when handling write completion. Consider this scenario with concurrent append writes same file: Thread 1: Thread 2: ------------ ----------- write [A, A+B] update inode size to A+B submit I/O [A, A+BS] write [A+B, A+B+C] update inode size to A+B+C <I/O completes, updates disk size to min(A+B+C, A+BS)> <power failure> After reboot: 1) with A+B+C < A+BS, the file has zero padding in range [A+B, A+B+C] |< Block Size (BS) >| |DDDDDDDDDDDDDDDD0000000000000000| ^ ^ ^ A A+B A+B+C (EOF) 2) with A+B+C > A+BS, the file has zero padding in range [A+B, A+BS] |< Block Size (BS) >|< Block Size (BS) >| |DDDDDDDDDDDDDDDD0000000000000000|00000000000000000000000000000000| ^ ^ ^ ^ A A+B A+BS A+B+C (EOF) D = Valid Data 0 = Zero Padding The issue stems from disk size being set to min(io_offset + io_size, inode->i_size) at I/O completion. Since io_offset+io_size is block size granularity, it may exceed the actual valid file data size. In the case of concurrent append writes, inode->i_size may be larger than the actual range of valid file data written to disk, leading to inaccurate disk size updates. This patch modifies the meaning of io_size to represent the size of valid data within EOF in an ioend. If the ioend spans beyond i_size, io_size will be trimmed to provide the file with more accurate size information. This is particularly useful for on-disk size updates at completion time. After this change, ioends that span i_size will not grow or merge with other ioends in concurrent scenarios. However, these cases that need growth/merging rarely occur and it seems no noticeable performance impact. Although rounding up io_size could enable ioend growth/merging in these scenarios, we decided to keep the code simple after discussion [1]. Another benefit is that it makes the xfs_ioend_is_append() check more accurate, which can reduce unnecessary end bio callbacks of xfs_end_bio() in certain scenarios, such as repeated writes at the file tail without extending the file size. Link [1]: https://patchwork.kernel.org/project/xfs/patch/20241113091907.56937-1-leo.lilong@huawei.com Fixes: ae259a9c8593 ("fs: introduce iomap infrastructure") # goes further back than this Signed-off-by: Long Li <leo.lilong@huawei.com> Link: https://lore.kernel.org/r/20241209114241.3725722-3-leo.lilong@huawei.com Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | | | | | Merge tag 'mm-hotfixes-stable-2025-01-04-18-02' of ↵Linus Torvalds13 days4-22/+86
|\ \ \ \ \ \ \ \ | |_|_|_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "25 hotfixes. 16 are cc:stable. 18 are MM and 7 are non-MM. The usual bunch of singletons and two doubletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits) MAINTAINERS: change Arınç _NAL's name and email address scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity mm/util: make memdup_user_nul() similar to memdup_user() mm, madvise: fix potential workingset node list_lru leaks mm/damon/core: fix ignored quota goals and filters of newly committed schemes mm/damon/core: fix new damon_target objects leaks on damon_commit_targets() mm/list_lru: fix false warning of negative counter vmstat: disable vmstat_work on vmstat_cpu_down_prep() mm: shmem: fix the update of 'shmem_falloc->nr_unswapped' mm: shmem: fix incorrect index alignment for within_size policy percpu: remove intermediate variable in PERCPU_PTR() mm: zswap: fix race between [de]compression and CPU hotunplug ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit kcov: mark in_softirq_really() as __always_inline docs: mm: fix the incorrect 'FileHugeMapped' field mailmap: modify the entry for Mathieu Othacehe mm/kmemleak: fix sleeping function called from invalid context at print message mm: hugetlb: independent PMD page table shared count maple_tree: reload mas before the second call for mas_empty_area ...
| * | | | | | | percpu: remove intermediate variable in PERCPU_PTR()Gal Pressman2024-12-311-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intermediate variable in the PERCPU_PTR() macro results in a kernel panic on boot [1] due to a compiler bug seen when compiling the kernel (+ KASAN) with gcc 11.3.1, but not when compiling with latest gcc (v14.2)/clang(v18.1). To solve it, remove the intermediate variable (which is not needed) and keep the casting that resolves the address space checks. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 0 UID: 0 PID: 547 Comm: iptables Not tainted 6.13.0-rc1_external_tested-master #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:nf_ct_netns_do_get+0x139/0x540 Code: 03 00 00 48 81 c4 88 00 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 4d 8d 75 08 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 27 03 00 00 41 8b 45 08 83 c0 RSP: 0018:ffff888116df75e8 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: 1ffff11022dbeebe RCX: ffffffff839a2382 RDX: 0000000000000003 RSI: 0000000000000008 RDI: ffff88842ec46d10 RBP: 0000000000000002 R08: 0000000000000000 R09: fffffbfff0b0860c R10: ffff888116df75e8 R11: 0000000000000001 R12: ffffffff879d6a80 R13: 0000000000000016 R14: 000000000000001e R15: ffff888116df7908 FS: 00007fba01646740(0000) GS:ffff88842ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055bd901800d8 CR3: 00000001205f0003 CR4: 0000000000172eb0 Call Trace: <TASK> ? die_addr+0x3d/0xa0 ? exc_general_protection+0x144/0x220 ? asm_exc_general_protection+0x22/0x30 ? __mutex_lock+0x2c2/0x1d70 ? nf_ct_netns_do_get+0x139/0x540 ? nf_ct_netns_do_get+0xb5/0x540 ? net_generic+0x1f0/0x1f0 ? __create_object+0x5e/0x80 xt_check_target+0x1f0/0x930 ? textify_hooks.constprop.0+0x110/0x110 ? pcpu_alloc_noprof+0x7cd/0xcf0 ? xt_find_target+0x148/0x1e0 find_check_entry.constprop.0+0x6c0/0x920 ? get_info+0x380/0x380 ? __virt_addr_valid+0x1df/0x3b0 ? kasan_quarantine_put+0xe3/0x200 ? kfree+0x13e/0x3d0 ? translate_table+0xaf5/0x1750 translate_table+0xbd8/0x1750 ? ipt_unregister_table_exit+0x30/0x30 ? __might_fault+0xbb/0x170 do_ipt_set_ctl+0x408/0x1340 ? nf_sockopt_find.constprop.0+0x17b/0x1f0 ? lock_downgrade+0x680/0x680 ? lockdep_hardirqs_on_prepare+0x284/0x400 ? ipt_register_table+0x440/0x440 ? bit_wait_timeout+0x160/0x160 nf_setsockopt+0x6f/0xd0 raw_setsockopt+0x7e/0x200 ? raw_bind+0x590/0x590 ? do_user_addr_fault+0x812/0xd20 do_sock_setsockopt+0x1e2/0x3f0 ? move_addr_to_user+0x90/0x90 ? lock_downgrade+0x680/0x680 __sys_setsockopt+0x9e/0x100 __x64_sys_setsockopt+0xb9/0x150 ? do_syscall_64+0x33/0x140 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fba015134ce Code: 0f 1f 40 00 48 8b 15 59 69 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b1 0f 1f 00 f3 0f 1e fa 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 0a c3 66 0f 1f 84 00 00 00 00 00 48 8b 15 21 RSP: 002b:00007ffd9de6f388 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 000055bd9017f490 RCX: 00007fba015134ce RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 0000000000000500 R08: 0000000000000560 R09: 0000000000000052 R10: 000055bd901800e0 R11: 0000000000000246 R12: 000055bd90180140 R13: 000055bd901800e0 R14: 000055bd9017f498 R15: 000055bd9017ff10 </TASK> Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc mlx4_ib mlx4_en mlx4_core rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi fuse ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_core ---[ end trace 0000000000000000 ]--- [akpm@linux-foundation.org: simplification, per Uros] Link: https://lkml.kernel.org/r/20241219121828.2120780-1-gal@nvidia.com Fixes: dabddd687c9e ("percpu: cast percpu pointer in PERCPU_PTR() via unsigned long") Signed-off-by: Gal Pressman <gal@nvidia.com> Closes: https://lore.kernel.org/all/7590f546-4021-4602-9252-0d525de35b52@nvidia.com Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Bill Wendling <morbo@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | | | | | | mm: hugetlb: independent PMD page table shared countLiu Shixin2024-12-312-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The folio refcount may be increased unexpectly through try_get_folio() by caller such as split_huge_pages. In huge_pmd_unshare(), we use refcount to check whether a pmd page table is shared. The check is incorrect if the refcount is increased by the above caller, and this can cause the page table leaked: BUG: Bad page state in process sh pfn:109324 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x66 pfn:0x109324 flags: 0x17ffff800000000(node=0|zone=2|lastcpupid=0xfffff) page_type: f2(table) raw: 017ffff800000000 0000000000000000 0000000000000000 0000000000000000 raw: 0000000000000066 0000000000000000 00000000f2000000 0000000000000000 page dumped because: nonzero mapcount ... CPU: 31 UID: 0 PID: 7515 Comm: sh Kdump: loaded Tainted: G B 6.13.0-rc2master+ #7 Tainted: [B]=BAD_PAGE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 Call trace: show_stack+0x20/0x38 (C) dump_stack_lvl+0x80/0xf8 dump_stack+0x18/0x28 bad_page+0x8c/0x130 free_page_is_bad_report+0xa4/0xb0 free_unref_page+0x3cc/0x620 __folio_put+0xf4/0x158 split_huge_pages_all+0x1e0/0x3e8 split_huge_pages_write+0x25c/0x2d8 full_proxy_write+0x64/0xd8 vfs_write+0xcc/0x280 ksys_write+0x70/0x110 __arm64_sys_write+0x24/0x38 invoke_syscall+0x50/0x120 el0_svc_common.constprop.0+0xc8/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x34/0x128 el0t_64_sync_handler+0xc8/0xd0 el0t_64_sync+0x190/0x198 The issue may be triggered by damon, offline_page, page_idle, etc, which will increase the refcount of page table. 1. The page table itself will be discarded after reporting the "nonzero mapcount". 2. The HugeTLB page mapped by the page table miss freeing since we treat the page table as shared and a shared page table will not be unmapped. Fix it by introducing independent PMD page table shared count. As described by comment, pt_index/pt_mm/pt_frag_refcount are used for s390 gmap, x86 pgds and powerpc, pt_share_count is used for x86/arm64/riscv pmds, so we can reuse the field as pt_share_count. Link: https://lkml.kernel.org/r/20241216071147.3984217-1-liushixin2@huawei.com Fixes: 39dde65c9940 ("[PATCH] shared page table for hugetlb page") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Ken Chen <kenneth.w.chen@intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | | | | | | mm: reinstate ability to map write-sealed memfd mappings read-onlyLorenzo Stoakes2024-12-312-18/+54
| | |_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "mm: reinstate ability to map write-sealed memfd mappings read-only". In commit 158978945f31 ("mm: perform the mapping_map_writable() check after call_mmap()") (and preceding changes in the same series) it became possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour") unintentionally undid this logic by moving the mapping_map_writable() check before the shmem_mmap() hook is invoked, thereby regressing this change. This series reworks how we both permit write-sealed mappings being mapped read-only and disallow mprotect() from undoing the write-seal, fixing this regression. We also add a regression test to ensure that we do not accidentally regress this in future. Thanks to Julian Orth for reporting this regression. This patch (of 2): In commit 158978945f31 ("mm: perform the mapping_map_writable() check after call_mmap()") (and preceding changes in the same series) it became possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. This was previously unnecessarily disallowed, despite the man page documentation indicating that it would be, thereby limiting the usefulness of F_SEAL_WRITE logic. We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE seal (one which disallows future writes to the memfd) to also be used for F_SEAL_WRITE. For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a read-only mapping to disallow mprotect() from overriding the seal - an operation performed by seal_check_write(), invoked from shmem_mmap(), the f_op->mmap() hook used by shmem mappings. By extending this to F_SEAL_WRITE and critically - checking mapping_map_writable() to determine if we may map the memfd AFTER we invoke shmem_mmap() - the desired logic becomes possible. This is because mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will have cleared. Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour") unintentionally undid this logic by moving the mapping_map_writable() check before the shmem_mmap() hook is invoked, thereby regressing this change. We reinstate this functionality by moving the check out of shmem_mmap() and instead performing it in do_mmap() at the point at which VMA flags are being determined, which seems in any case to be a more appropriate place in which to make this determination. In order to achieve this we rework memfd seal logic to allow us access to this information using existing logic and eliminate the clearing of VM_MAYWRITE from seal_check_write() which we are performing in do_mmap() instead. Link: https://lkml.kernel.org/r/99fc35d2c62bd2e05571cf60d9f8b843c56069e0.1732804776.git.lorenzo.stoakes@oracle.com Fixes: 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Julian Orth <ju.orth@gmail.com> Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZxrYpQ@mail.gmail.com/ Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | | | | | Merge tag 'net-6.13-rc6' of ↵Linus Torvalds2025-01-034-29/+45
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireles and netfilter. Nothing major here. Over the last two weeks we gathered only around two-thirds of our normal weekly fix count, but delaying sending these until -rc7 seemed like a really bad idea. AFAIK we have no bugs under investigation. One or two reverts for stuff for which we haven't gotten a proper fix will likely come in the next PR. Current release - fix to a fix: - netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext - eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup Previous releases - regressions: - net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets - mptcp: - fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust - fix TCP options overflow - prevent excessive coalescing on receive, fix throughput - net: fix memory leak in tcp_conn_request() if map insertion fails - wifi: cw1200: fix potential NULL dereference after conversion to GPIO descriptors - phy: micrel: dynamically control external clock of KSZ PHY, fix suspend behavior Previous releases - always broken: - af_packet: fix VLAN handling with MSG_PEEK - net: restrict SO_REUSEPORT to inet sockets - netdev-genl: avoid empty messages in NAPI get - dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X - eth: - gve: XDP fixes around transmit, queue wakeup etc. - ti: icssg-prueth: fix firmware load sequence to prevent time jump which breaks timesync related operations Misc: - netlink: specs: mptcp: add missing attr and improve documentation" * tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init net: ti: icssg-prueth: Fix firmware load sequence. mptcp: prevent excessive coalescing on receive mptcp: don't always assume copied data in mptcp_cleanup_rbuf() mptcp: fix recvbuffer adjust on sleeping rcvmsg ila: serialize calls to nf_register_net_hooks() af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK af_packet: fix vlan_get_tci() vs MSG_PEEK net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init() net: restrict SO_REUSEPORT to inet sockets net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets net: sfc: Correct key_len for efx_tc_ct_zone_ht_params net: wwan: t7xx: Fix FSM command timeout issue sky2: Add device ID 11ab:4373 for Marvell 88E8075 mptcp: fix TCP options overflow. net: mv643xx_eth: fix an OF node reference leak gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup eth: bcmsysport: fix call balance of priv->clk handling routines net: llc: reset skb->transport_header netlink: specs: mptcp: fix missing doc ...
| * | | | | | | af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEKEric Dumazet2025-01-031-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found by syzbot. Rework vlan_get_protocol_dgram() to not touch skb at all, so that it can be used from many cpus on the same skb. Add a const qualifier to skb argument. [1] skbuff: skb_under_panic: text:ffffffff8a8ccd05 len:29 put:14 head:ffff88807fc8e400 data:ffff88807fc8e3f4 tail:0x11 end:0x140 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5892 Comm: syz-executor883 Not tainted 6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0b 8d 48 c7 c6 86 d5 25 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 5a 69 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc900038d7638 EFLAGS: 00010282 RAX: 0000000000000087 RBX: dffffc0000000000 RCX: 609ffd18ea660600 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff88802483c8d0 R08: ffffffff817f0a8c R09: 1ffff9200071ae60 R10: dffffc0000000000 R11: fffff5200071ae61 R12: 0000000000000140 R13: ffff88807fc8e400 R14: ffff88807fc8e3f4 R15: 0000000000000011 FS: 00007fbac5e006c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbac5e00d58 CR3: 000000001238e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 vlan_get_protocol_dgram+0x165/0x290 net/packet/af_packet.c:585 packet_recvmsg+0x948/0x1ef0 net/packet/af_packet.c:3552 sock_recvmsg_nosec net/socket.c:1033 [inline] sock_recvmsg+0x22f/0x280 net/socket.c:1055 ____sys_recvmsg+0x1c6/0x480 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x426/0xab0 net/socket.c:2940 __sys_recvmmsg net/socket.c:3014 [inline] __do_sys_recvmmsg net/socket.c:3037 [inline] __se_sys_recvmmsg net/socket.c:3030 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3030 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 79eecf631c14 ("af_packet: Handle outgoing VLAN packets without hardware offloading") Reported-by: syzbot+74f70bb1cb968bf09e4f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c485.050a0220.2f3838.04c5.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Chengen Du <chengen.du@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241230161004.2681892-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | | | | | Merge tag 'nf-24-12-25' of ↵Jakub Kicinski2024-12-271-2/+5
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains one Netfilter fix for net: 1) Fix unaligned atomic read on struct nft_set_ext in nft_set_hash backend that causes an alignment failure splat on aarch64. This is related to a recent fix and it has been reported via the regressions mailing list. * tag 'nf-24-12-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext ==================== Link: https://patch.msgid.link/20241224233109.361755-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * | | | | | | netfilter: nft_set_hash: unaligned atomic read on struct nft_set_extPablo Neira Ayuso2024-12-251-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Access to genmask field in struct nft_set_ext results in unaligned atomic read: [ 72.130109] Unable to handle kernel paging request at virtual address ffff0000c2bb708c [ 72.131036] Mem abort info: [ 72.131213] ESR = 0x0000000096000021 [ 72.131446] EC = 0x25: DABT (current EL), IL = 32 bits [ 72.132209] SET = 0, FnV = 0 [ 72.133216] EA = 0, S1PTW = 0 [ 72.134080] FSC = 0x21: alignment fault [ 72.135593] Data abort info: [ 72.137194] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000 [ 72.142351] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 72.145989] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 72.150115] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000237d27000 [ 72.154893] [ffff0000c2bb708c] pgd=0000000000000000, p4d=180000023ffff403, pud=180000023f84b403, pmd=180000023f835403, +pte=0068000102bb7707 [ 72.163021] Internal error: Oops: 0000000096000021 [#1] SMP [...] [ 72.170041] CPU: 7 UID: 0 PID: 54 Comm: kworker/7:0 Tainted: G E 6.13.0-rc3+ #2 [ 72.170509] Tainted: [E]=UNSIGNED_MODULE [ 72.170720] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-stable202302-for-qemu 03/01/2023 [ 72.171192] Workqueue: events_power_efficient nft_rhash_gc [nf_tables] [ 72.171552] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 72.171915] pc : nft_rhash_gc+0x200/0x2d8 [nf_tables] [ 72.172166] lr : nft_rhash_gc+0x128/0x2d8 [nf_tables] [ 72.172546] sp : ffff800081f2bce0 [ 72.172724] x29: ffff800081f2bd40 x28: ffff0000c2bb708c x27: 0000000000000038 [ 72.173078] x26: ffff0000c6780ef0 x25: ffff0000c643df00 x24: ffff0000c6778f78 [ 72.173431] x23: 000000000000001a x22: ffff0000c4b1f000 x21: ffff0000c6780f78 [ 72.173782] x20: ffff0000c2bb70dc x19: ffff0000c2bb7080 x18: 0000000000000000 [ 72.174135] x17: ffff0000c0a4e1c0 x16: 0000000000003000 x15: 0000ac26d173b978 [ 72.174485] x14: ffffffffffffffff x13: 0000000000000030 x12: ffff0000c6780ef0 [ 72.174841] x11: 0000000000000000 x10: ffff800081f2bcf8 x9 : ffff0000c3000000 [ 72.175193] x8 : 00000000000004be x7 : 0000000000000000 x6 : 0000000000000000 [ 72.175544] x5 : 0000000000000040 x4 : ffff0000c3000010 x3 : 0000000000000000 [ 72.175871] x2 : 0000000000003a98 x1 : ffff0000c2bb708c x0 : 0000000000000004 [ 72.176207] Call trace: [ 72.176316] nft_rhash_gc+0x200/0x2d8 [nf_tables] (P) [ 72.176653] process_one_work+0x178/0x3d0 [ 72.176831] worker_thread+0x200/0x3f0 [ 72.176995] kthread+0xe8/0xf8 [ 72.177130] ret_from_fork+0x10/0x20 [ 72.177289] Code: 54fff984 d503201f d2800080 91003261 (f820303f) [ 72.177557] ---[ end trace 0000000000000000 ]--- Align struct nft_set_ext to word size to address this and documentation it. pahole reports that this increases the size of elements for rhash and pipapo in 8 bytes on x86_64. Fixes: 7ffc7481153b ("netfilter: nft_set_hash: skip duplicated elements pending gc run") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | | | | | netlink: specs: mptcp: clearly mention attributesMatthieu Baerts (NGI0)2024-12-271-26/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rendered version of the MPTCP events [1] looked strange, because the whole content of the 'doc' was displayed in the same block. It was then not clear that the first words, not even ended by a period, were the attributes that are defined when such events are emitted. These attributes have now been moved to the end, prefixed by 'Attributes:' and ended with a period. Note that '>-' has been added after 'doc:' to allow ':' in the text below. The documentation in the UAPI header has been auto-generated by: ./tools/net/ynl/ynl-regen.sh Link: https://docs.kernel.org/networking/netlink_spec/mptcp_pm.html#event-type [1] Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241221-net-mptcp-netlink-specs-pm-doc-fixes-v2-2-e54f2db3f844@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>