summaryrefslogtreecommitdiffstats
path: root/fs/ocfs2 (follow)
Commit message (Collapse)AuthorAgeFilesLines
* ocfs2: check dir i_size in ocfs2_find_entrySu Yue2 days1-4/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syz reports an out of bounds read: ================================================================== BUG: KASAN: slab-out-of-bounds in ocfs2_match fs/ocfs2/dir.c:334 [inline] BUG: KASAN: slab-out-of-bounds in ocfs2_search_dirblock+0x283/0x6e0 fs/ocfs2/dir.c:367 Read of size 1 at addr ffff88804d8b9982 by task syz-executor.2/14802 CPU: 0 UID: 0 PID: 14802 Comm: syz-executor.2 Not tainted 6.13.0-rc4 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Sched_ext: serialise (enabled+all), task: runnable_at=-10ms Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x229/0x350 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x164/0x530 mm/kasan/report.c:489 kasan_report+0x147/0x180 mm/kasan/report.c:602 ocfs2_match fs/ocfs2/dir.c:334 [inline] ocfs2_search_dirblock+0x283/0x6e0 fs/ocfs2/dir.c:367 ocfs2_find_entry_id fs/ocfs2/dir.c:414 [inline] ocfs2_find_entry+0x1143/0x2db0 fs/ocfs2/dir.c:1078 ocfs2_find_files_on_disk+0x18e/0x530 fs/ocfs2/dir.c:1981 ocfs2_lookup_ino_from_name+0xb6/0x110 fs/ocfs2/dir.c:2003 ocfs2_lookup+0x30a/0xd40 fs/ocfs2/namei.c:122 lookup_open fs/namei.c:3627 [inline] open_last_lookups fs/namei.c:3748 [inline] path_openat+0x145a/0x3870 fs/namei.c:3984 do_filp_open+0xe9/0x1c0 fs/namei.c:4014 do_sys_openat2+0x135/0x1d0 fs/open.c:1402 do_sys_open fs/open.c:1417 [inline] __do_sys_openat fs/open.c:1433 [inline] __se_sys_openat fs/open.c:1428 [inline] __x64_sys_openat+0x15d/0x1c0 fs/open.c:1428 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf6/0x210 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f01076903ad Code: c3 e8 a7 2b 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f01084acfc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 00007f01077cbf80 RCX: 00007f01076903ad RDX: 0000000000105042 RSI: 0000000020000080 RDI: ffffffffffffff9c RBP: 00007f01077cbf80 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000001ff R11: 0000000000000246 R12: 0000000000000000 R13: 00007f01077cbf80 R14: 00007f010764fc90 R15: 00007f010848d000 </TASK> ================================================================== And a general protection fault in ocfs2_prepare_dir_for_insert: ================================================================== loop0: detected capacity change from 0 to 32768 JBD2: Ignoring recovery information on journal ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 UID: 0 PID: 5096 Comm: syz-executor792 Not tainted 6.11.0-rc4-syzkaller-00002-gb0da640826ba #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:ocfs2_find_dir_space_id fs/ocfs2/dir.c:3406 [inline] RIP: 0010:ocfs2_prepare_dir_for_insert+0x3309/0x5c70 fs/ocfs2/dir.c:4280 Code: 00 00 e8 2a 25 13 fe e9 ba 06 00 00 e8 20 25 13 fe e9 4f 01 00 00 e8 16 25 13 fe 49 8d 7f 08 49 8d 5f 09 48 89 f8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 bd 23 00 00 48 89 d8 48 c1 e8 03 42 0f RSP: 0018:ffffc9000af9f020 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000009 RCX: ffff88801e27a440 RDX: 0000000000000000 RSI: 0000000000000400 RDI: 0000000000000008 RBP: ffffc9000af9f830 R08: ffffffff8380395b R09: ffffffff838090a7 R10: 0000000000000002 R11: ffff88801e27a440 R12: dffffc0000000000 R13: ffff88803c660878 R14: f700000000000088 R15: 0000000000000000 FS: 000055555a677380(0000) GS:ffff888020800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560bce569178 CR3: 000000001de5a000 CR4: 0000000000350ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ocfs2_mknod+0xcaf/0x2b40 fs/ocfs2/namei.c:292 vfs_mknod+0x36d/0x3b0 fs/namei.c:4088 do_mknodat+0x3ec/0x5b0 __do_sys_mknodat fs/namei.c:4166 [inline] __se_sys_mknodat fs/namei.c:4163 [inline] __x64_sys_mknodat+0xa7/0xc0 fs/namei.c:4163 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f2dafda3a99 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffe336a6658 EFLAGS: 00000246 ORIG_RAX: 0000000000000103 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2dafda3a99 RDX: 00000000000021c0 RSI: 0000000020000040 RDI: 00000000ffffff9c RBP: 00007f2dafe1b5f0 R08: 0000000000004480 R09: 000055555a6784c0 R10: 0000000000000103 R11: 0000000000000246 R12: 00007ffe336a6680 R13: 00007ffe336a68a8 R14: 431bde82d7b634db R15: 00007f2dafdec03b </TASK> ================================================================== The two reports are all caused invalid negative i_size of dir inode. For ocfs2, dir_inode can't be negative or zero. Here add a check in which is called by ocfs2_check_dir_for_entry(). It fixes the second report as ocfs2_check_dir_for_entry() must be called before ocfs2_prepare_dir_for_insert(). Also set a up limit for dir with OCFS2_INLINE_DATA_FL. The i_size can't be great than blocksize. Link: https://lkml.kernel.org/r/20250106140640.92260-1-glass.su@suse.com Reported-by: Jiacheng Xu <stitch@zju.edu.cn> Link: https://lore.kernel.org/ocfs2-devel/17a04f01.1ae74.19436d003fc.Coremail.stitch@zju.edu.cn/T/#u Reported-by: syzbot+5a64828fcc4c2ad9b04f@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005894f3062018caf1@google.com/T/ Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ocfs2: fix slab-use-after-free due to dangling pointer dqi_privDennis Lam2024-12-312-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When mounting ocfs2 and then remounting it as read-only, a slab-use-after-free occurs after the user uses a syscall to quota_getnextquota. Specifically, sb_dqinfo(sb, type)->dqi_priv is the dangling pointer. During the remounting process, the pointer dqi_priv is freed but is never set as null leaving it to be accessed. Additionally, the read-only option for remounting sets the DQUOT_SUSPENDED flag instead of setting the DQUOT_USAGE_ENABLED flags. Moreover, later in the process of getting the next quota, the function ocfs2_get_next_id is called and only checks the quota usage flags and not the quota suspended flags. To fix this, I set dqi_priv to null when it is freed after remounting with read-only and put a check for DQUOT_SUSPENDED in ocfs2_get_next_id. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20241218023924.22821-2-dennis.lamerice@gmail.com Fixes: 8f9e8f5fcc05 ("ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotas") Signed-off-by: Dennis Lam <dennis.lamerice@gmail.com> Reported-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com Tested-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6731d26f.050a0220.1fb99c.014b.GAE@google.com/T/ Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ocfs2: fix the space leak in LA when releasing LAHeming Zhao2024-12-191-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | Commit 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()") introduced an issue, the ocfs2_sync_local_to_main() ignores the last contiguous free bits, which causes an OCFS2 volume to lose the last free clusters of LA window during the release routine. Please note, because commit dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume") was reverted, this commit is a replacement fix for commit dfe6c5692fb5. Link: https://lkml.kernel.org/r/20241205104835.18223-3-heming.zhao@suse.com Fixes: 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ocfs2: revert "ocfs2: fix the la space leak when unmounting an ocfs2 volume"Heming Zhao2024-12-191-19/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "Revert ocfs2 commit dfe6c5692fb5 and provide a new fix". SUSE QA team detected a mistake in my commit dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume"). I am very sorry for my error. (If my eyes are correct) From the mailling list mails, this patch shouldn't be applied to 4.19 5.4 5.10 5.15 6.1 6.6, and these branches should perform a revert operation. Reason for revert: In commit dfe6c5692fb5, I mistakenly wrote: "This bug has existed since the initial OCFS2 code.". The statement is wrong. The correct introduction commit is 30dd3478c3cd. IOW, if the branch doesn't include 30dd3478c3cd, dfe6c5692fb5 should also not be included. This reverts commit dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume"). In commit dfe6c5692fb5, the commit log "This bug has existed since the initial OCFS2 code." is wrong. The correct introduction commit is 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()"). The influence of commit dfe6c5692fb5 is that it provides a correct fix for the latest kernel. however, it shouldn't be pushed to stable branches. Let's use this commit to revert all branches that include dfe6c5692fb5 and use a new fix method to fix commit 30dd3478c3cd. Link: https://lkml.kernel.org/r/20241205104835.18223-1-heming.zhao@suse.com Link: https://lkml.kernel.org/r/20241205104835.18223-2-heming.zhao@suse.com Fixes: dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ocfs2: update seq_file index in ocfs2_dlm_seq_nextWengang Wang2024-12-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | The following INFO level message was seen: seq_file: buggy .next function ocfs2_dlm_seq_next [ocfs2] did not update position index Fix: Update *pos (so m->index) to make seq_read_iter happy though the index its self makes no sense to ocfs2_dlm_seq_next. Link: https://lkml.kernel.org/r/20241119174500.9198-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ocfs2: free inode when ocfs2_get_init_inode() failsTetsuo Handa2024-12-061-1/+3
| | | | | | | | | | | | | | | | | | | | syzbot is reporting busy inodes after unmount, for commit 9c89fe0af826 ("ocfs2: Handle error from dquot_initialize()") forgot to call iput() when new_inode() succeeded and dquot_initialize() failed. Link: https://lkml.kernel.org/r/e68c0224-b7c6-4784-b4fa-a9fc8c675525@I-love.SAKURA.ne.jp Fixes: 9c89fe0af826 ("ocfs2: Handle error from dquot_initialize()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot+0af00f6a2cba2058b5db@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0af00f6a2cba2058b5db Tested-by: syzbot+0af00f6a2cba2058b5db@syzkaller.appspotmail.com Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Merge tag 'mm-nonmm-stable-2024-11-24-02-05' of ↵Linus Torvalds2024-11-267-58/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - The series "resource: A couple of cleanups" from Andy Shevchenko performs some cleanups in the resource management code - The series "Improve the copy of task comm" from Yafang Shao addresses possible race-induced overflows in the management of task_struct.comm[] - The series "Remove unnecessary header includes from {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a small fix to the list_sort library code and to its selftest - The series "Enhance min heap API with non-inline functions and optimizations" also from Kuan-Wei Chiu optimizes and cleans up the min_heap library code - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi finishes off nilfs2's folioification - The series "add detect count for hung tasks" from Lance Yang adds more userspace visibility into the hung-task detector's activity - Apart from that, singelton patches in many places - please see the individual changelogs for details * tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) gdb: lx-symbols: do not error out on monolithic build kernel/reboot: replace sprintf() with sysfs_emit() lib: util_macros_kunit: add kunit test for util_macros.h util_macros.h: fix/rework find_closest() macros Improve consistency of '#error' directive messages ocfs2: fix uninitialized value in ocfs2_file_read_iter() hung_task: add docs for hung_task_detect_count hung_task: add detect count for hung tasks dma-buf: use atomic64_inc_return() in dma_buf_getfile() fs/proc/kcore.c: fix coccinelle reported ERROR instances resource: avoid unnecessary resource tree walking in __region_intersects() ocfs2: remove unused errmsg function and table ocfs2: cluster: fix a typo lib/scatterlist: use sg_phys() helper checkpatch: always parse orig_commit in fixes tag nilfs2: convert metadata aops from writepage to writepages nilfs2: convert nilfs_recovery_copy_block() to take a folio nilfs2: convert nilfs_page_count_clean_buffers() to take a folio nilfs2: remove nilfs_writepage nilfs2: convert checkpoint file to be folio-based ...
| * ocfs2: fix uninitialized value in ocfs2_file_read_iter()Dmitry Antipov2024-11-122-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot has reported the following KMSAN splat: BUG: KMSAN: uninit-value in ocfs2_file_read_iter+0x9a4/0xf80 ocfs2_file_read_iter+0x9a4/0xf80 __io_read+0x8d4/0x20f0 io_read+0x3e/0xf0 io_issue_sqe+0x42b/0x22c0 io_wq_submit_work+0xaf9/0xdc0 io_worker_handle_work+0xd13/0x2110 io_wq_worker+0x447/0x1410 ret_from_fork+0x6f/0x90 ret_from_fork_asm+0x1a/0x30 Uninit was created at: __alloc_pages_noprof+0x9a7/0xe00 alloc_pages_mpol_noprof+0x299/0x990 alloc_pages_noprof+0x1bf/0x1e0 allocate_slab+0x33a/0x1250 ___slab_alloc+0x12ef/0x35e0 kmem_cache_alloc_bulk_noprof+0x486/0x1330 __io_alloc_req_refill+0x84/0x560 io_submit_sqes+0x172f/0x2f30 __se_sys_io_uring_enter+0x406/0x41c0 __x64_sys_io_uring_enter+0x11f/0x1a0 x64_sys_call+0x2b54/0x3ba0 do_syscall_64+0xcd/0x1e0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Since an instance of 'struct kiocb' may be passed from the block layer with 'private' field uninitialized, introduce 'ocfs2_iocb_init_rw_locked()' and use it from where 'ocfs2_dio_end_io()' might take care, i.e. in 'ocfs2_file_read_iter()' and 'ocfs2_file_write_iter()'. Link: https://lkml.kernel.org/r/20241029091736.1501946-1-dmantipov@yandex.ru Fixes: 7cdfc3a1c397 ("ocfs2: Remember rw lock level during direct io") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+a73e253cca4f0230a5a5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a73e253cca4f0230a5a5 Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * ocfs2: remove unused errmsg function and tableDr. David Alan Gilbert2024-11-062-55/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dlm_errmsg() has been unused since 2010's commit 0016eedc4185 ("ocfs2_dlmfs: Use the stackglue.") Remove dlm_errmsg() and the message table it indexes. Link: https://lkml.kernel.org/r/20241022002543.302606-1-linux@treblig.org Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * ocfs2: cluster: fix a typoAndrew Kreimer2024-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a typo: panicing -> panicking. Via codespell. Link: https://lkml.kernel.org/r/20241027133540.22090-1-algonell@gmail.com Signed-off-by: Andrew Kreimer <algonell@gmail.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * ocfs2: fix typo in commentMohammed Anees2024-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix "Allcate" -> "Allocate" Link: https://lkml.kernel.org/r/20240917185156.10580-1-pvmohammedanees2003@gmail.com Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * ocfs2: remove unused declaration in header fileZhang Zekun2024-11-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The definition of ocfs2_global_read_dquot() has been removed since commit fb8dd8d78014 ("ocfs2: Fix quota locking"). Let's remove the empty declartion Link: https://lkml.kernel.org/r/20240906055742.105024-1-zhangzekun11@huawei.com Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2024-11-181-14/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull 'struct fd' class updates from Al Viro: "The bulk of struct fd memory safety stuff Making sure that struct fd instances are destroyed in the same scope where they'd been created, getting rid of reassignments and passing them by reference, converting to CLASS(fd{,_pos,_raw}). We are getting very close to having the memory safety of that stuff trivial to verify" * tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits) deal with the last remaing boolean uses of fd_file() css_set_fork(): switch to CLASS(fd_raw, ...) memcg_write_event_control(): switch to CLASS(fd) assorted variants of irqfd setup: convert to CLASS(fd) do_pollfd(): convert to CLASS(fd) convert do_select() convert vfs_dedupe_file_range(). convert cifs_ioctl_copychunk() convert media_request_get_by_fd() convert spu_run(2) switch spufs_calls_{get,put}() to CLASS() use convert cachestat(2) convert do_preadv()/do_pwritev() fdget(), more trivial conversions fdget(), trivial conversions privcmd_ioeventfd_assign(): don't open-code eventfd_ctx_fdget() o2hb_region_dev_store(): avoid goto around fdget()/fdput() introduce "fd_pos" class, convert fdget_pos() users to it. fdget_raw() users: switch to CLASS(fd_raw) convert vmsplice() to CLASS(fd) ...
| * | fdget(), more trivial conversionsAl Viro2024-11-031-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | all failure exits prior to fdget() leave the scope, all matching fdput() are immediately followed by leaving the scope. [xfs_ioc_commit_range() chunk moved here as well] Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | o2hb_region_dev_store(): avoid goto around fdget()/fdput()Al Viro2024-11-031-6/+5
| | | | | | | | | | | | | | | | | | Preparation for CLASS(fd) conversion. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge tag 'vfs-6.13.misc' of ↵Linus Torvalds2024-11-182-1/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Fixup and improve NLM and kNFSD file lock callbacks Last year both GFS2 and OCFS2 had some work done to make their locking more robust when exported over NFS. Unfortunately, part of that work caused both NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send lock notifications to clients This in itself is not a huge problem because most NFS clients will still poll the server in order to acquire a conflicted lock It's important for NLM and kNFSD that they do not block their kernel threads inside filesystem's file_lock implementations because that can produce deadlocks. We used to make sure of this by only trusting that posix_lock_file() can correctly handle blocking lock calls asynchronously, so the lock managers would only setup their file_lock requests for async callbacks if the filesystem did not define its own lock() file operation However, when GFS2 and OCFS2 grew the capability to correctly handle blocking lock requests asynchronously, they started signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting posix_lock_file() was inadvertently dropped, so now most filesystems no longer produce lock notifications when exported over NFS Fix this by using an fop_flag which greatly simplifies the problem and grooms the way for future uses by both filesystems and lock managers alike - Add a sysctl to delete the dentry when a file is removed instead of making it a negative dentry Commit 681ce8623567 ("vfs: Delete the associated dentry when deleting a file") introduced an unconditional deletion of the associated dentry when a file is removed. However, this led to performance regressions in specific benchmarks, such as ilebench.sum_operations/s, prompting a revert in commit 4a4be1ad3a6e ("Revert "vfs: Delete the associated dentry when deleting a file""). This reintroduces the concept conditionally through a sysctl - Expand the statmount() system call: * Report the filesystem subtype in a new fs_subtype field to e.g., report fuse filesystem subtypes * Report the superblock source in a new sb_source field * Add a new way to return filesystem specific mount options in an option array that returns filesystem specific mount options separated by zero bytes and unescaped. This allows caller's to retrieve filesystem specific mount options and immediately pass them to e.g., fsconfig() without having to unescape or split them * Report security (LSM) specific mount options in a separate security option array. We don't lump them together with filesystem specific mount options as security mount options are generic and most users aren't interested in them The format is the same as for the filesystem specific mount option array - Support relative paths in fsconfig()'s FSCONFIG_SET_STRING command - Optimize acl_permission_check() to avoid costly {g,u}id ownership checks if possible - Use smp_mb__after_spinlock() to avoid full smp_mb() in evict() - Add synchronous wakeup support for ep_poll_callback. Currently, epoll only uses wake_up() to wake up task. But sometimes there are epoll users which want to use the synchronous wakeup flag to give a hint to the scheduler, e.g., the Android binder driver. So add a wake_up_sync() define, and use wake_up_sync() when sync is true in ep_poll_callback() Fixes: - Fix kernel documentation for inode_insert5() and iget5_locked() - Annotate racy epoll check on file->f_ep - Make F_DUPFD_QUERY associative - Avoid filename buffer overrun in initramfs - Don't let statmount() return empty strings - Add a cond_resched() to dump_user_range() to avoid hogging the CPU - Don't query the device logical blocksize multiple times for hfsplus - Make filemap_read() check that the offset is positive or zero Cleanups: - Various typo fixes - Cleanup wbc_attach_fdatawrite_inode() - Add __releases annotation to wbc_attach_and_unlock_inode() - Add hugetlbfs tracepoints - Fix various vfs kernel doc parameters - Remove obsolete TODO comment from io_cancel() - Convert wbc_account_cgroup_owner() to take a folio - Fix comments for BANDWITH_INTERVAL and wb_domain_writeout_add() - Reorder struct posix_acl to save 8 bytes - Annotate struct posix_acl with __counted_by() - Replace one-element array with flexible array member in freevxfs - Use idiomatic atomic64_inc_return() in alloc_mnt_ns()" * tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits) statmount: retrieve security mount options vfs: make evict() use smp_mb__after_spinlock instead of smp_mb statmount: add flag to retrieve unescaped options fs: add the ability for statmount() to report the sb_source writeback: wbc_attach_fdatawrite_inode out of line writeback: add a __releases annoation to wbc_attach_and_unlock_inode fs: add the ability for statmount() to report the fs_subtype fs: don't let statmount return empty strings fs:aio: Remove TODO comment suggesting hash or array usage in io_cancel() hfsplus: don't query the device logical block size multiple times freevxfs: Replace one-element array with flexible array member fs: optimize acl_permission_check() initramfs: avoid filename buffer overrun fs/writeback: convert wbc_account_cgroup_owner to take a folio acl: Annotate struct posix_acl with __counted_by() acl: Realign struct posix_acl to save 8 bytes epoll: Add synchronous wakeup support for ep_poll_callback coredump: add cond_resched() to dump_user_range mm/page-writeback.c: Fix comment of wb_domain_writeout_add() mm/page-writeback.c: Update comment for BANDWIDTH_INTERVAL ...
| * \ \ Merge patch series "Fixup NLM and kNFSD file lock callbacks"Christian Brauner2024-10-022-1/+2
| |\ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Benjamin Coddington <bcodding@redhat.com> says: Last year both GFS2 and OCFS2 had some work done to make their locking more robust when exported over NFS. Unfortunately, part of that work caused both NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send lock notifications to clients. This in itself is not a huge problem because most NFS clients will still poll the server in order to acquire a conflicted lock, but now that I've noticed it I can't help but try to fix it because there are big advantages for setups that might depend on timely lock notifications, and we've supported that as a feature for a long time. Its important for NLM and kNFSD that they do not block their kernel threads inside filesystem's file_lock implementations because that can produce deadlocks. We used to make sure of this by only trusting that posix_lock_file() can correctly handle blocking lock calls asynchronously, so the lock managers would only setup their file_lock requests for async callbacks if the filesystem did not define its own lock() file operation. However, when GFS2 and OCFS2 grew the capability to correctly handle blocking lock requests asynchronously, they started signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting posix_lock_file() was inadvertently dropped, so now most filesystems no longer produce lock notifications when exported over NFS. I tried to fix this by simply including the old check for lock(), but the resulting include mess and layering violations was more than I could accept. There's a much cleaner way presented here using an fop_flag, which while potentially flag-greedy, greatly simplifies the problem and grooms the way for future uses by both filesystems and lock managers alike. * patches from https://lore.kernel.org/r/cover.1726083391.git.bcodding@redhat.com: exportfs: Remove EXPORT_OP_ASYNC_LOCK NLM/NFSD: Fix lock notifications for async-capable filesystems gfs2/ocfs2: set FOP_ASYNC_LOCK fs: Introduce FOP_ASYNC_LOCK NFS: trace: show TIMEDOUT instead of 0x6e nfsd: use system_unbound_wq for nfsd_file_gc_worker() nfsd: count nfsd_file allocations nfsd: fix refcount leak when file is unhashed after being found nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire nfsd: add list_head nf_gc to struct nfsd_file Link: https://lore.kernel.org/r/cover.1726083391.git.bcodding@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | exportfs: Remove EXPORT_OP_ASYNC_LOCKBenjamin Coddington2024-10-011-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that GFS2 and OCFS2 are signalling async ->lock() support with FOP_ASYNC_LOCK and checks for support are converted, we can remove EXPORT_OP_ASYNC_LOCK. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/0a114db814fec3086f937ae3d44a086f13b8de26.1726083391.git.bcodding@redhat.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * | gfs2/ocfs2: set FOP_ASYNC_LOCKBenjamin Coddington2024-09-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both GFS2 and OCFS2 use DLM locking, which will allow async lock requests. Signal this support by setting FOP_ASYNC_LOCK. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/fc4163dbbf33c58e5a8b8ee8cb8c57e555f53ce5.1726083391.git.bcodding@redhat.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | ocfs2: uncache inode which has failed entering the groupDmitry Antipov2024-11-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot has reported the following BUG: kernel BUG at fs/ocfs2/uptodate.c:509! ... Call Trace: <TASK> ? __die_body+0x5f/0xb0 ? die+0x9e/0xc0 ? do_trap+0x15a/0x3a0 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? do_error_trap+0x1dc/0x2c0 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? __pfx_do_error_trap+0x10/0x10 ? handle_invalid_op+0x34/0x40 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? exc_invalid_op+0x38/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? ocfs2_set_new_buffer_uptodate+0x2e/0x160 ? ocfs2_set_new_buffer_uptodate+0x144/0x160 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ocfs2_group_add+0x39f/0x15a0 ? __pfx_ocfs2_group_add+0x10/0x10 ? __pfx_lock_acquire+0x10/0x10 ? mnt_get_write_access+0x68/0x2b0 ? __pfx_lock_release+0x10/0x10 ? rcu_read_lock_any_held+0xb7/0x160 ? __pfx_rcu_read_lock_any_held+0x10/0x10 ? smack_log+0x123/0x540 ? mnt_get_write_access+0x68/0x2b0 ? mnt_get_write_access+0x68/0x2b0 ? mnt_get_write_access+0x226/0x2b0 ocfs2_ioctl+0x65e/0x7d0 ? __pfx_ocfs2_ioctl+0x10/0x10 ? smack_file_ioctl+0x29e/0x3a0 ? __pfx_smack_file_ioctl+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x43d/0x780 ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10 ? __pfx_ocfs2_ioctl+0x10/0x10 __se_sys_ioctl+0xfb/0x170 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... </TASK> When 'ioctl(OCFS2_IOC_GROUP_ADD, ...)' has failed for the particular inode in 'ocfs2_verify_group_and_input()', corresponding buffer head remains cached and subsequent call to the same 'ioctl()' for the same inode issues the BUG() in 'ocfs2_set_new_buffer_uptodate()' (trying to cache the same buffer head of that inode). Fix this by uncaching the buffer head with 'ocfs2_remove_from_cache()' on error path in 'ocfs2_group_add()'. Link: https://lkml.kernel.org/r/20241114043844.111847-1-dmantipov@yandex.ru Fixes: 7909f2bf8353 ("[PATCH 2/2] ocfs2: Implement group add for online resize") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+453873f1588c2d75b447@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=453873f1588c2d75b447 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Dmitry Antipov <dmantipov@yandex.ru> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | | ocfs2: fix UBSAN warning in ocfs2_verify_volume()Dmitry Antipov2024-11-121-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot has reported the following splat triggered by UBSAN: UBSAN: shift-out-of-bounds in fs/ocfs2/super.c:2336:10 shift exponent 32768 is too large for 32-bit type 'int' CPU: 2 UID: 0 PID: 5255 Comm: repro Not tainted 6.12.0-rc4-syzkaller-00047-gc2ee9f594da8 #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x241/0x360 ? __pfx_dump_stack_lvl+0x10/0x10 ? __pfx__printk+0x10/0x10 ? __asan_memset+0x23/0x50 ? lockdep_init_map_type+0xa1/0x910 __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 ocfs2_fill_super+0xf9c/0x5750 ? __pfx_ocfs2_fill_super+0x10/0x10 ? __pfx_validate_chain+0x10/0x10 ? __pfx_validate_chain+0x10/0x10 ? validate_chain+0x11e/0x5920 ? __lock_acquire+0x1384/0x2050 ? __pfx_validate_chain+0x10/0x10 ? string+0x26a/0x2b0 ? widen_string+0x3a/0x310 ? string+0x26a/0x2b0 ? bdev_name+0x2b1/0x3c0 ? pointer+0x703/0x1210 ? __pfx_pointer+0x10/0x10 ? __pfx_format_decode+0x10/0x10 ? __lock_acquire+0x1384/0x2050 ? vsnprintf+0x1ccd/0x1da0 ? snprintf+0xda/0x120 ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_lock+0x14f/0x370 ? __pfx_snprintf+0x10/0x10 ? set_blocksize+0x1f9/0x360 ? sb_set_blocksize+0x98/0xf0 ? setup_bdev_super+0x4e6/0x5d0 mount_bdev+0x20c/0x2d0 ? __pfx_ocfs2_fill_super+0x10/0x10 ? __pfx_mount_bdev+0x10/0x10 ? vfs_parse_fs_string+0x190/0x230 ? __pfx_vfs_parse_fs_string+0x10/0x10 legacy_get_tree+0xf0/0x190 ? __pfx_ocfs2_mount+0x10/0x10 vfs_get_tree+0x92/0x2b0 do_new_mount+0x2be/0xb40 ? __pfx_do_new_mount+0x10/0x10 __se_sys_mount+0x2d6/0x3c0 ? __pfx___se_sys_mount+0x10/0x10 ? do_syscall_64+0x100/0x230 ? __x64_sys_mount+0x20/0xc0 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f37cae96fda Code: 48 8b 0d 51 ce 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1e ce 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007fff6c1aa228 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007fff6c1aa240 RCX: 00007f37cae96fda RDX: 00000000200002c0 RSI: 0000000020000040 RDI: 00007fff6c1aa240 RBP: 0000000000000004 R08: 00007fff6c1aa280 R09: 0000000000000000 R10: 00000000000008c0 R11: 0000000000000206 R12: 00000000000008c0 R13: 00007fff6c1aa280 R14: 0000000000000003 R15: 0000000001000000 </TASK> For a really damaged superblock, the value of 'i_super.s_blocksize_bits' may exceed the maximum possible shift for an underlying 'int'. So add an extra check whether the aforementioned field represents the valid block size, which is 512 bytes, 1K, 2K, or 4K. Link: https://lkml.kernel.org/r/20241106092100.2661330-1-dmantipov@yandex.ru Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+56f7cd1abe4b8e475180@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=56f7cd1abe4b8e475180 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | | ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove()Andrew Kanner2024-11-071-2/+1
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzkaller is able to provoke null-ptr-dereference in ocfs2_xa_remove(): [ 57.319872] (a.out,1161,7):ocfs2_xa_remove:2028 ERROR: status = -12 [ 57.320420] (a.out,1161,7):ocfs2_xa_cleanup_value_truncate:1999 ERROR: Partial truncate while removing xattr overlay.upper. Leaking 1 clusters and removing the entry [ 57.321727] BUG: kernel NULL pointer dereference, address: 0000000000000004 [...] [ 57.325727] RIP: 0010:ocfs2_xa_block_wipe_namevalue+0x2a/0xc0 [...] [ 57.331328] Call Trace: [ 57.331477] <TASK> [...] [ 57.333511] ? do_user_addr_fault+0x3e5/0x740 [ 57.333778] ? exc_page_fault+0x70/0x170 [ 57.334016] ? asm_exc_page_fault+0x2b/0x30 [ 57.334263] ? __pfx_ocfs2_xa_block_wipe_namevalue+0x10/0x10 [ 57.334596] ? ocfs2_xa_block_wipe_namevalue+0x2a/0xc0 [ 57.334913] ocfs2_xa_remove_entry+0x23/0xc0 [ 57.335164] ocfs2_xa_set+0x704/0xcf0 [ 57.335381] ? _raw_spin_unlock+0x1a/0x40 [ 57.335620] ? ocfs2_inode_cache_unlock+0x16/0x20 [ 57.335915] ? trace_preempt_on+0x1e/0x70 [ 57.336153] ? start_this_handle+0x16c/0x500 [ 57.336410] ? preempt_count_sub+0x50/0x80 [ 57.336656] ? _raw_read_unlock+0x20/0x40 [ 57.336906] ? start_this_handle+0x16c/0x500 [ 57.337162] ocfs2_xattr_block_set+0xa6/0x1e0 [ 57.337424] __ocfs2_xattr_set_handle+0x1fd/0x5d0 [ 57.337706] ? ocfs2_start_trans+0x13d/0x290 [ 57.337971] ocfs2_xattr_set+0xb13/0xfb0 [ 57.338207] ? dput+0x46/0x1c0 [ 57.338393] ocfs2_xattr_trusted_set+0x28/0x30 [ 57.338665] ? ocfs2_xattr_trusted_set+0x28/0x30 [ 57.338948] __vfs_removexattr+0x92/0xc0 [ 57.339182] __vfs_removexattr_locked+0xd5/0x190 [ 57.339456] ? preempt_count_sub+0x50/0x80 [ 57.339705] vfs_removexattr+0x5f/0x100 [...] Reproducer uses faultinject facility to fail ocfs2_xa_remove() -> ocfs2_xa_value_truncate() with -ENOMEM. In this case the comment mentions that we can return 0 if ocfs2_xa_cleanup_value_truncate() is going to wipe the entry anyway. But the following 'rc' check is wrong and execution flow do 'ocfs2_xa_remove_entry(loc);' twice: * 1st: in ocfs2_xa_cleanup_value_truncate(); * 2nd: returning back to ocfs2_xa_remove() instead of going to 'out'. Fix this by skipping the 2nd removal of the same entry and making syzkaller repro happy. Link: https://lkml.kernel.org/r/20241103193845.2940988-1-andrew.kanner@gmail.com Fixes: 399ff3a748cf ("ocfs2: Handle errors while setting external xattr values.") Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com> Reported-by: syzbot+386ce9e60fa1b18aac5b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/671e13ab.050a0220.2b8c0f.01d0.GAE@google.com/T/ Tested-by: syzbot+386ce9e60fa1b18aac5b@syzkaller.appspotmail.com Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | ocfs2: pass u64 to ocfs2_truncate_inline maybe overflowEdward Adam Davis2024-10-291-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot reported a kernel BUG in ocfs2_truncate_inline. There are two reasons for this: first, the parameter value passed is greater than ocfs2_max_inline_data_with_xattr, second, the start and end parameters of ocfs2_truncate_inline are "unsigned int". So, we need to add a sanity check for byte_start and byte_len right before ocfs2_truncate_inline() in ocfs2_remove_inode_range(), if they are greater than ocfs2_max_inline_data_with_xattr return -EINVAL. Link: https://lkml.kernel.org/r/tencent_D48DB5122ADDAEDDD11918CFB68D93258C07@qq.com Fixes: 1afc32b95233 ("ocfs2: Write support for inline data") Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reported-by: syzbot+81092778aac03460d6b7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=81092778aac03460d6b7 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | fs: Fix uninitialized value issue in from_kuid and from_kgidAlessandro Zanni2024-10-171-3/+6
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_setattr() uses attr->ia_mode, attr->ia_uid and attr->ia_gid in a trace point even though ATTR_MODE, ATTR_UID and ATTR_GID aren't set. Initialize all fields of newattrs to avoid uninitialized variables, by checking if ATTR_MODE, ATTR_UID, ATTR_GID are initialized, otherwise 0. Reported-by: syzbot+6c55f725d1bdc8c52058@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6c55f725d1bdc8c52058 Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com> Link: https://lore.kernel.org/r/20241017120553.55331-1-alessandro.zanni87@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | ocfs2: fix uninit-value in ocfs2_get_block()Joseph Qi2024-09-261-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot reported an uninit-value BUG: BUG: KMSAN: uninit-value in ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159 ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159 do_mpage_readpage+0xc45/0x2780 fs/mpage.c:225 mpage_readahead+0x43f/0x840 fs/mpage.c:374 ocfs2_readahead+0x269/0x320 fs/ocfs2/aops.c:381 read_pages+0x193/0x1110 mm/readahead.c:160 page_cache_ra_unbounded+0x901/0x9f0 mm/readahead.c:273 do_page_cache_ra mm/readahead.c:303 [inline] force_page_cache_ra+0x3b1/0x4b0 mm/readahead.c:332 force_page_cache_readahead mm/internal.h:347 [inline] generic_fadvise+0x6b0/0xa90 mm/fadvise.c:106 vfs_fadvise mm/fadvise.c:185 [inline] ksys_fadvise64_64 mm/fadvise.c:199 [inline] __do_sys_fadvise64 mm/fadvise.c:214 [inline] __se_sys_fadvise64 mm/fadvise.c:212 [inline] __x64_sys_fadvise64+0x1fb/0x3a0 mm/fadvise.c:212 x64_sys_call+0xe11/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:222 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f This is because when ocfs2_extent_map_get_blocks() fails, p_blkno is uninitialized. So the error log will trigger the above uninit-value access. The error log is out-of-date since get_blocks() was removed long time ago. And the error code will be logged in ocfs2_extent_map_get_blocks() once ocfs2_get_cluster() fails, so fix this by only logging inode and block. Link: https://syzkaller.appspot.com/bug?extid=9709e73bae885b05314b Link: https://lkml.kernel.org/r/20240925090600.3643376-1-joseph.qi@linux.alibaba.com Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: syzbot+9709e73bae885b05314b@syzkaller.appspotmail.com Tested-by: syzbot+9709e73bae885b05314b@syzkaller.appspotmail.com Cc: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | ocfs2: fix deadlock in ocfs2_get_system_file_inodeMohammed Anees2024-09-261-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot has found a possible deadlock in ocfs2_get_system_file_inode [1]. The scenario is depicted here, CPU0 CPU1 lock(&ocfs2_file_ip_alloc_sem_key); lock(&osb->system_file_mutex); lock(&ocfs2_file_ip_alloc_sem_key); lock(&osb->system_file_mutex); The function calls which could lead to this are: CPU0 ocfs2_mknod - lock(&ocfs2_file_ip_alloc_sem_key); . . . ocfs2_get_system_file_inode - lock(&osb->system_file_mutex); CPU1 - ocfs2_fill_super - lock(&osb->system_file_mutex); . . . ocfs2_read_virt_blocks - lock(&ocfs2_file_ip_alloc_sem_key); This issue can be resolved by making the down_read -> down_read_try in the ocfs2_read_virt_blocks. [1] https://syzkaller.appspot.com/bug?extid=e0055ea09f1f5e6fabdd Link: https://lkml.kernel.org/r/20240924093257.7181-1-pvmohammedanees2003@gmail.com Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: <syzbot+e0055ea09f1f5e6fabdd@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=e0055ea09f1f5e6fabdd Tested-by: syzbot+e0055ea09f1f5e6fabdd@syzkaller.appspotmail.com Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | ocfs2: reserve space for inline xattr before attaching reflink treeGautham Ananthakrishna2024-09-262-12/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of our customers reported a crash and a corrupted ocfs2 filesystem. The crash was due to the detection of corruption. Upon troubleshooting, the fsck -fn output showed the below corruption [EXTENT_LIST_FREE] Extent list in owner 33080590 claims 230 as the next free chain record, but fsck believes the largest valid value is 227. Clamp the next record value? n The stat output from the debugfs.ocfs2 showed the following corruption where the "Next Free Rec:" had overshot the "Count:" in the root metadata block. Inode: 33080590 Mode: 0640 Generation: 2619713622 (0x9c25a856) FS Generation: 904309833 (0x35e6ac49) CRC32: 00000000 ECC: 0000 Type: Regular Attr: 0x0 Flags: Valid Dynamic Features: (0x16) HasXattr InlineXattr Refcounted Extended Attributes Block: 0 Extended Attributes Inline Size: 256 User: 0 (root) Group: 0 (root) Size: 281320357888 Links: 1 Clusters: 141738 ctime: 0x66911b56 0x316edcb8 -- Fri Jul 12 06:02:30.829349048 2024 atime: 0x66911d6b 0x7f7a28d -- Fri Jul 12 06:11:23.133669517 2024 mtime: 0x66911b56 0x12ed75d7 -- Fri Jul 12 06:02:30.317552087 2024 dtime: 0x0 -- Wed Dec 31 17:00:00 1969 Refcount Block: 2777346 Last Extblk: 2886943 Orphan Slot: 0 Sub Alloc Slot: 0 Sub Alloc Bit: 14 Tree Depth: 1 Count: 227 Next Free Rec: 230 ## Offset Clusters Block# 0 0 2310 2776351 1 2310 2139 2777375 2 4449 1221 2778399 3 5670 731 2779423 4 6401 566 2780447 ....... .... ....... ....... .... ....... The issue was in the reflink workfow while reserving space for inline xattr. The problematic function is ocfs2_reflink_xattr_inline(). By the time this function is called the reflink tree is already recreated at the destination inode from the source inode. At this point, this function reserves space for inline xattrs at the destination inode without even checking if there is space at the root metadata block. It simply reduces the l_count from 243 to 227 thereby making space of 256 bytes for inline xattr whereas the inode already has extents beyond this index (in this case up to 230), thereby causing corruption. The fix for this is to reserve space for inline metadata at the destination inode before the reflink tree gets recreated. The customer has verified the fix. Link: https://lkml.kernel.org/r/20240918063844.1830332-1-gautham.ananthakrishna@oracle.com Fixes: ef962df057aa ("ocfs2: xattr: fix inlined xattr reflink") Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'fs_for_v6.12-rc1' of ↵Linus Torvalds2024-09-231-4/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota and isofs updates from Jan Kara: "A few small cleanups in quota and isofs" * tag 'fs_for_v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: isofs: Annotate struct SL_component with __counted_by() quota: remove unnecessary error code translation in dquot_quota_enable quota: remove redundant return at end of void function quota: remove unneeded return value of register_quota_format quota: avoid missing put_quota_format when DQUOT_SUSPENDED is passed
| * | quota: remove unneeded return value of register_quota_formatKemeng Shi2024-07-221-4/+2
| |/ | | | | | | | | | | | | | | | | | | The register_quota_format always returns 0, simply remove unneeded return value. Link: https://patch.msgid.link/20240715130534.2112678-3-shikemeng@huaweicloud.com Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Jan Kara <jack@suse.cz>
* | Merge tag 'pull-stable-struct_fd' of ↵Linus Torvalds2024-09-231-3/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull 'struct fd' updates from Al Viro: "Just the 'struct fd' layout change, with conversion to accessor helpers" * tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: add struct fd constructors, get rid of __to_fd() struct fd: representation change introduce fd_file(), convert all accessors to it.
| * | introduce fd_file(), convert all accessors to it.Al Viro2024-08-131-3/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For any changes of struct fd representation we need to turn existing accesses to fields into calls of wrappers. Accesses to struct fd::flags are very few (3 in linux/file.h, 1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in explicit initializers). Those can be dealt with in the commit converting to new layout; accesses to struct fd::file are too many for that. This commit converts (almost) all of f.file to fd_file(f). It's not entirely mechanical ('file' is used as a member name more than just in struct fd) and it does not even attempt to distinguish the uses in pointer context from those in boolean context; the latter will be eventually turned into a separate helper (fd_empty()). NOTE: mass conversion to fd_empty(), tempting as it might be, is a bad idea; better do that piecewise in commit that convert from fdget...() to CLASS(...). [conflicts in fs/fhandle.c, kernel/bpf/syscall.c, mm/memcontrol.c caught by git; fs/stat.c one got caught by git grep] [fs/xattr.c conflict] Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of ↵Linus Torvalds2024-09-218-47/+40
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Many singleton patches - please see the various changelogs for details. Quite a lot of nilfs2 work this time around. Notable patch series in this pull request are: - "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64() to provide (much) more accurate results. The current implementation was causing Uwe some issues in the PWM drivers. - "xz: Updates to license, filters, and compression options" from Lasse Collin. Miscellaneous maintenance and kinor feature work to the xz decompressor. - "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee. Fixes and enhancements to the gdb scripts. - "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this. - "nilfs2: add support for some common ioctls" from Ryusuke Konishi. Adds various commonly-available ioctls to nilfs2. - "This series fixes a number of formatting issues in kernel doc comments" from Ryusuke Konishi does that. - "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi. Fix issues where -ENOENT was being unintentionally and inappropriately returned to userspace. - "nilfs2: assorted cleanups" from Huang Xiaojia. - "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke Konishi fixes some issues which can occur on corrupted nilfs2 filesystems. - "scripts/decode_stacktrace.sh: improve error reporting and usability" from Luca Ceresoli does those things" * tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (103 commits) list: test: increase coverage of list_test_list_replace*() list: test: fix tests for list_cut_position() proc: use __auto_type more treewide: correct the typo 'retun' ocfs2: cleanup return value and mlog in ocfs2_global_read_info() nilfs2: remove duplicate 'unlikely()' usage nilfs2: fix potential oob read in nilfs_btree_check_delete() nilfs2: determine empty node blocks as corrupted nilfs2: fix potential null-ptr-deref in nilfs_btree_insert() user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation tools/mm: rm thp_swap_allocator_test when make clean squashfs: fix percpu address space issues in decompressor_multi_percpu.c lib: glob.c: added null check for character class nilfs2: refactor nilfs_segctor_thread() nilfs2: use kthread_create and kthread_stop for the log writer thread nilfs2: remove sc_timer_task nilfs2: do not repair reserved inode bitmap in nilfs_new_inode() nilfs2: eliminate the shared counter and spinlock for i_generation nilfs2: separate inode type information from i_state field nilfs2: use the BITS_PER_LONG macro ...
| * | ocfs2: cleanup return value and mlog in ocfs2_global_read_info()Joseph Qi2024-09-101-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return 0 instead of sizeof(ocfs2_global_disk_dqinfo) that .quota_read returns in normal case. Also cleanup mlog to make code more readable. Link: https://lkml.kernel.org/r/20240904071004.2067695-2-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: use max() to improve ocfs2_dlm_seq_show()Thorsten Blum2024-09-021-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the max() macro to simplify the ocfs2_dlm_seq_show() function and improve its readability. Link: https://lkml.kernel.org/r/20240820021605.97887-3-thorsten.blum@toblux.com Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: fix shift-out-of-bounds UBSAN bug in ocfs2_verify_volume()qasdev2024-09-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses a shift-out-of-bounds error in the ocfs2_verify_volume() function, identified by UBSAN. The bug was triggered by an invalid s_clustersize_bits value (e.g., 1548), which caused the expression "1 << le32_to_cpu(di->id2.i_super.s_clustersize_bits)" to exceed the limits of a 32-bit integer, leading to an out-of-bounds shift. Link: https://lkml.kernel.org/r/ZsPvwQAXd5R/jNY+@hostname Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reported-by: syzbot <syzbot+f3fff775402751ebb471@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=f3fff775402751ebb471 Tested-by: syzbot <syzbot+f3fff775402751ebb471@syzkaller.appspotmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: fix unexpected zeroing of virtual diskChi Zhiling2024-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a guest virtual machine, we found that there is unexpected data zeroing problem detected occassionly: XFS (vdb): Mounting V5 Filesystem XFS (vdb): Ending clean mount XFS (vdb): Metadata CRC error detected at xfs_refcountbt_read_verify+0x2c/0xf0, xfs_refcountbt block 0x200028 XFS (vdb): Unmount and run xfs_repair XFS (vdb): First 128 bytes of corrupted metadata buffer: 00000000e0cd2f5e: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000cafd57f5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000d0298d7d: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000f0698484: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000adb789a7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000005292b878: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000885b4700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000fd4b4df7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ XFS (vdb): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x200028 len 8 error 74 XFS (vdb): Error -117 recovering leftover CoW allocations. XFS (vdb): xfs_do_force_shutdown(0x8) called from line 994 of file fs/xfs/xfs_mount.c. Return address = 000000003a53523a XFS (vdb): Corruption of in-memory data detected. Shutting down filesystem XFS (vdb): Please umount the filesystem and rectify the problem(s) It turns out that the root cause is from the physical host machine. More specifically, it is caused by the ocfs2. when the page_size is 64k, the block should advance by 16 each time instead of 1. This will lead to a wrong mapping from the page to the disk, which will zero some adjacent part of the disk. Link: https://lkml.kernel.org/r/20240815092141.1223238-1-chizhiling@163.com Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Suggested-by: Shida Zhang <zhangshida@kylinos.cn> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: remove custom swap functions in favor of built-in sort swapKuan-Wei Chiu2024-09-023-33/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The custom swap functions used in ocfs2 do not perform any special operations and can be replaced with the built-in swap function of sort. This change not only reduces code size but also improves efficiency, especially in scenarios where CONFIG_RETPOLINE is enabled, as it makes indirect function calls more expensive. By using the built-in swap, we avoid these costly indirect function calls, leading to better performance. Link: https://lkml.kernel.org/r/20240810195316.186504-1-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: fix the la space leak when unmounting an ocfs2 volumeHeming Zhao2024-09-021-0/+19
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bug has existed since the initial OCFS2 code. The code logic in ocfs2_sync_local_to_main() is wrong, as it ignores the last contiguous free bits, which causes an OCFS2 volume to lose the last free clusters of LA window on each umount command. Link: https://lkml.kernel.org/r/20240719114310.14245-1-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'mm-hotfixes-stable-2024-09-19-00-31' of ↵Linus Torvalds2024-09-193-7/+12
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "12 hotfixes, 11 of which are cc:stable. Four fixes for longstanding ocfs2 issues and the remainder address random MM things" * tag 'mm-hotfixes-stable-2024-09-19-00-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/madvise: process_madvise() drop capability check if same mm mm/huge_memory: ensure huge_zero_folio won't have large_rmappable flag set mm/hugetlb.c: fix UAF of vma in hugetlb fault pathway mm: change vmf_anon_prepare() to __vmf_anon_prepare() resource: fix region_intersects() vs add_memory_driver_managed() zsmalloc: use unique zsmalloc caches names mm/damon/vaddr: protect vma traversal in __damon_va_thre_regions() with rcu read lock mm: vmscan.c: fix OOM on swap stress test ocfs2: cancel dqi_sync_work before freeing oinfo ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate ocfs2: remove unreasonable unlock in ocfs2_read_blocks ocfs2: fix null-ptr-deref when journal load failed.
| * | ocfs2: cancel dqi_sync_work before freeing oinfoJoseph Qi2024-09-101-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_global_read_info() will initialize and schedule dqi_sync_work at the end, if error occurs after successfully reading global quota, it will trigger the following warning with CONFIG_DEBUG_OBJECTS_* enabled: ODEBUG: free active (active state 0) object: 00000000d8b0ce28 object type: timer_list hint: qsync_work_fn+0x0/0x16c This reports that there is an active delayed work when freeing oinfo in error handling, so cancel dqi_sync_work first. BTW, return status instead of -1 when .read_file_info fails. Link: https://syzkaller.appspot.com/bug?extid=f7af59df5d6b25f0febd Link: https://lkml.kernel.org/r/20240904071004.2067695-1-joseph.qi@linux.alibaba.com Fixes: 171bf93ce11f ("ocfs2: Periodic quota syncing") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reported-by: syzbot+f7af59df5d6b25f0febd@syzkaller.appspotmail.com Tested-by: syzbot+f7af59df5d6b25f0febd@syzkaller.appspotmail.com Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodateLizhi Xu2024-09-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing cleanup, if flags without OCFS2_BH_READAHEAD, it may trigger NULL pointer dereference in the following ocfs2_set_buffer_uptodate() if bh is NULL. Link: https://lkml.kernel.org/r/20240902023636.1843422-3-joseph.qi@linux.alibaba.com Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside") Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Heming Zhao <heming.zhao@suse.com> Suggested-by: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> [4.20+] Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: remove unreasonable unlock in ocfs2_read_blocksLizhi Xu2024-09-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "Misc fixes for ocfs2_read_blocks", v5. This series contains 2 fixes for ocfs2_read_blocks(). The first patch fix the issue reported by syzbot, which detects bad unlock balance in ocfs2_read_blocks(). The second patch fixes an issue reported by Heming Zhao when reviewing above fix. This patch (of 2): There was a lock release before exiting, so remove the unreasonable unlock. Link: https://lkml.kernel.org/r/20240902023636.1843422-1-joseph.qi@linux.alibaba.com Link: https://lkml.kernel.org/r/20240902023636.1843422-2-joseph.qi@linux.alibaba.com Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside") Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: syzbot+ab134185af9ef88dfed5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ab134185af9ef88dfed5 Tested-by: syzbot+ab134185af9ef88dfed5@syzkaller.appspotmail.com Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> [4.20+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | ocfs2: fix null-ptr-deref when journal load failed.Julian Sun2024-09-101-3/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the mounting process, if journal_reset() fails because of too short journal, then lead to jbd2_journal_load() fails with NULL j_sb_buffer. Subsequently, ocfs2_journal_shutdown() calls jbd2_journal_flush()->jbd2_cleanup_journal_tail()-> __jbd2_update_log_tail()->jbd2_journal_update_sb_log_tail() ->lock_buffer(journal->j_sb_buffer), resulting in a null-pointer dereference error. To resolve this issue, we should check the JBD2_LOADED flag to ensure the journal was properly loaded. Additionally, use journal instead of osb->journal directly to simplify the code. Link: https://syzkaller.appspot.com/bug?extid=05b9b39d8bdfe1a0861f Link: https://lkml.kernel.org/r/20240902030844.422725-1-sunjunchao2870@gmail.com Fixes: f6f50e28f0cb ("jbd2: Fail to load a journal if it is too short") Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Reported-by: syzbot+05b9b39d8bdfe1a0861f@syzkaller.appspotmail.com Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'vfs-6.12.file' of ↵Linus Torvalds2024-09-163-3/+12
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs file updates from Christian Brauner: "This is the work to cleanup and shrink struct file significantly. Right now, (focusing on x86) struct file is 232 bytes. After this series struct file will be 184 bytes aka 3 cacheline and a spare 8 bytes for future extensions at the end of the struct. With struct file being as ubiquitous as it is this should make a difference for file heavy workloads and allow further optimizations in the future. - struct fown_struct was embedded into struct file letting it take up 32 bytes in total when really it shouldn't even be embedded in struct file in the first place. Instead, actual users of struct fown_struct now allocate the struct on demand. This frees up 24 bytes. - Move struct file_ra_state into the union containg the cleanup hooks and move f_iocb_flags out of the union. This closes a 4 byte hole we created earlier and brings struct file to 192 bytes. Which means struct file is 3 cachelines and we managed to shrink it by 40 bytes. - Reorder struct file so that nothing crosses a cacheline. I suspect that in the future we will end up reordering some members to mitigate false sharing issues or just because someone does actually provide really good perf data. - Shrinking struct file to 192 bytes is only part of the work. Files use a slab that is SLAB_TYPESAFE_BY_RCU and when a kmem cache is created with SLAB_TYPESAFE_BY_RCU the free pointer must be located outside of the object because the cache doesn't know what part of the memory can safely be overwritten as it may be needed to prevent object recycling. That has the consequence that SLAB_TYPESAFE_BY_RCU may end up adding a new cacheline. So this also contains work to add a new kmem_cache_create_rcu() function that allows the caller to specify an offset where the freelist pointer is supposed to be placed. Thus avoiding the implicit addition of a fourth cacheline. - And finally this removes the f_version member in struct file. The f_version member isn't particularly well-defined. It is mainly used as a cookie to detect concurrent seeks when iterating directories. But it is also abused by some subsystems for completely unrelated things. It is mostly a directory and filesystem specific thing that doesn't really need to live in struct file and with its wonky semantics it really lacks a specific function. For pipes, f_version is (ab)used to defer poll notifications until a write has happened. And struct pipe_inode_info is used by multiple struct files in their ->private_data so there's no chance of pushing that down into file->private_data without introducing another pointer indirection. But pipes don't rely on f_pos_lock so this adds a union into struct file encompassing f_pos_lock and a pipe specific f_pipe member that pipes can use. This union of course can be extended to other file types and is similar to what we do in struct inode already" * tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (26 commits) fs: remove f_version pipe: use f_pipe fs: add f_pipe ubifs: store cookie in private data ufs: store cookie in private data udf: store cookie in private data proc: store cookie in private data ocfs2: store cookie in private data input: remove f_version abuse ext4: store cookie in private data ext2: store cookie in private data affs: store cookie in private data fs: add generic_llseek_cookie() fs: use must_set_pos() fs: add must_set_pos() fs: add vfs_setpos_cookie() s390: remove unused f_version ceph: remove unused f_version adi: remove unused f_version mm: Removed @freeptr_offset to prevent doc warning ...
| * | ocfs2: store cookie in private dataChristian Brauner2024-09-123-3/+12
| |/ | | | | | | | | | | | | | | | | | | Store the cookie to detect concurrent seeks on directories in file->private_data. Link: https://lore.kernel.org/r/20240830-vfs-file-f_version-v1-13-6d3e4816aa7b@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | buffer: Convert __block_write_begin() to take a folioMatthew Wilcox (Oracle)2024-08-071-1/+1
| | | | | | | | | | | | | | | | | | Almost all callers have a folio now, so change __block_write_begin() to take a folio and remove a call to compound_head(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | ocfs2: Convert ocfs2_write_zero_page to use a folioMatthew Wilcox (Oracle)2024-08-071-8/+9
| | | | | | | | | | | | | | | | | | Removes a conversion of folio to page, and then two hidden conversions of page back to folio. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | fs: Convert aops->write_begin to take a folioMatthew Wilcox (Oracle)2024-08-073-9/+9
| | | | | | | | | | | | | | | | | | | | Convert all callers from working on a page to working on one page of a folio (support for working on an entire folio can come later). Removes a lot of folio->page->folio conversions. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | fs: Convert aops->write_end to take a folioMatthew Wilcox (Oracle)2024-08-071-1/+1
|/ | | | | | | | | | Most callers have a folio, and most implementations operate on a folio, so remove the conversion from folio->page->folio to fit through this interface. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
* Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of ↵Linus Torvalds2024-07-228-44/+67
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - In the series "treewide: Refactor heap related implementation", Kuan-Wei Chiu has significantly reworked the min_heap library code and has taught bcachefs to use the new more generic implementation. - Yury Norov's series "Cleanup cpumask.h inclusion in core headers" reworks the cpumask and nodemask headers to make things generally more rational. - Kuan-Wei Chiu has sent along some maintenance work against our sorting library code in the series "lib/sort: Optimizations and cleanups". - More library maintainance work from Christophe Jaillet in the series "Remove usage of the deprecated ida_simple_xx() API". - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the series "nilfs2: eliminate the call to inode_attach_wb()". - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB command error". - Plus the usual shower of singleton patches all over the place. Please see the relevant changelogs for details. * tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits) ia64: scrub ia64 from poison.h watchdog/perf: properly initialize the turbo mode timestamp and rearm counter tsacct: replace strncpy() with strscpy() lib/bch.c: use swap() to improve code test_bpf: convert comma to semicolon init/modpost: conditionally check section mismatch to __meminit* init: remove unused __MEMINIT* macros nilfs2: Constify struct kobj_type nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro math: rational: add missing MODULE_DESCRIPTION() macro lib/zlib: add missing MODULE_DESCRIPTION() macro fs: ufs: add MODULE_DESCRIPTION() lib/rbtree.c: fix the example typo ocfs2: add bounds checking to ocfs2_check_dir_entry() fs: add kernel-doc comments to ocfs2_prepare_orphan_dir() coredump: simplify zap_process() selftests/fpu: add missing MODULE_DESCRIPTION() macro compiler.h: simplify data_race() macro build-id: require program headers to be right after ELF header resource: add missing MODULE_DESCRIPTION() ...