summaryrefslogtreecommitdiffstats
path: root/fs/gfs2/glock.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* gfs2: Avoid dequeuing GL_ASYNC glock holders twiceAndreas Gruenbacher2022-12-061-0/+8
| | | | | | | | | | | When a locking request fails, the associated glock holder is automatically dequeued from the list of active and waiting holders. For GL_ASYNC locking requests, this will obviously happen asynchronously and it can race with attempts to cancel that locking request via gfs2_glock_dq(). Therefore, don't forget to check if a locking request has already been dequeued in gfs2_glock_dq(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Make gfs2_glock_hold return its glock argumentAndreas Gruenbacher2022-12-061-3/+3
| | | | | | This allows code like 'gl = gfs2_glock_hold(...)'. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Merge branch 'for-next.nopid' into for-nextAndreas Gruenbacher2022-10-091-10/+203
|\ | | | | | | | | | | | | | | | | | | Resolves a conflict in gfs2_inode_lookup() between the following commits: gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodes gfs2: Mark the remaining process-independent glock holders as GL_NOPID Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * gfs2: Add GL_NOPID flag for process-independent glock holdersAndreas Gruenbacher2022-06-291-10/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a GL_NOPID flag to indicate that once a glock holder has been acquired, it won't be associated with the current process anymore. This is useful for iopen and flock glocks which are associated with open files, as well as journal glock holders and similar which are associated with the filesystem. Once GL_NOPID is used for all applicable glocks (see the next patches), processes will no longer be falsely reported as holding glocks which they are not actually holding in the glocks dump file. Unlike before, when a process is reported as having "(ended)", this will indicate an actual bug. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * gfs2: Add flocks to glockfd debugfs fileAndreas Gruenbacher2022-06-291-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | Include flock glocks in the "glockfd" debugfs file. Those are similar to the iopen glocks; while an open file is holding an flock, it is holding the file's flock glock. We cannot take f_fl_mutex in gfs2_glockfd_seq_show_flock() or else dumping the "glockfd" file would block on flock operations. Instead, use the file->f_lock spin lock to protect the f_fl_gh.gh_gl glock pointer. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * gfs2: Add glockfd debugfs fileAndreas Gruenbacher2022-06-291-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a process has a gfs2 file open, the file is keeping a reference on the underlying gfs2 inode, and the inode is keeping the inode's iopen glock held in shared mode. In other words, the process depends on the iopen glock of each open gfs2 file. Expose those dependencies in a new "glockfd" debugfs file. The new debugfs file contains one line for each gfs2 file descriptor, specifying the tgid, file descriptor number, and glock name, e.g., 1601 6 5/816d This list is compiled by iterating all tasks on the system using find_ge_pid(), and all file descriptors of each task using task_lookup_next_fd_rcu(). To make that work from gfs2, export those two functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | gfs2: Clear flags when withdraw prevents xmoteBob Peterson2022-08-251-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are a couple places in function do_xmote where normal processing is circumvented due to withdraws in progress. However, since we bypass most of do_xmote() we bypass telling dlm to lock the dlm lock, which means dlm will never respond with a completion callback. Since the completion callback ordinarily clears GLF_LOCK, this patch changes function do_xmote to handle those situations more gracefully so the file system may be unmounted after withdraw. A very similar situation happens with the GLF_DEMOTE_IN_PROGRESS flag, which is cleared by function finish_xmote(). Since the withdraw causes us to skip the majority of do_xmote, it therefore also skips the call to finish_xmote() so the DEMOTE_IN_PROGRESS flag needs to be cleared manually. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | gfs2: Dequeue waiters when withdrawnBob Peterson2022-08-251-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a withdraw occurs, ordinary (not system) glocks may not be granted anymore. Later, when the file system is unmounted, gfs2_gl_hash_clear() tries to clear out all the glocks, but these un-grantable pending waiters prevent some glocks from being freed. So the unmount hangs, at least for its ten-minute timeout period. This patch takes measures to remove any pending waiters from the glocks that will never be granted. This allows the unmount to proceed in a reasonable period of time. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodesBob Peterson2022-08-251-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, delete_work_func() would check for the GLF_DEMOTE flag on the iopen glock and if set, it would perform special processing. However, there was a race whereby the GLF_DEMOTE flag could be set by another process after the check. Then when it called gfs2_lookup_by_inum() which calls gfs2_inode_lookup(), it tried to lock the iopen glock in SH mode, but the GLF_DEMOTE flag prevented the request from being granted. But the iopen glock could never be demoted because that happens when the inode is evicted, and the evict was never completed because of the failed lookup. To fix that, change function gfs2_inode_lookup() so that when GFS2_BLKST_UNLINKED inodes are searched, it uses the LM_FLAG_TRY flag for the iopen glock. If the locking request fails, fail gfs2_inode_lookup() with -EAGAIN so that delete_work_func() can retry the operation later. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | Merge tag 'gfs2-v5.19-rc4-fixes' of ↵Linus Torvalds2022-08-061-123/+77
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Instantiate glocks ouside of the glock state engine, in the contect of the process taking the glock. This moves unnecessary complexity out of the core glock code. Clean up the instantiate logic to be more sensible. - In gfs2_glock_async_wait(), cancel pending locking request upon failure. Make sure all glocks are left in a consistent state. - Various other minor cleanups and fixes. * tag 'gfs2-v5.19-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: List traversal in do_promote is safe gfs2: do_promote glock holder stealing fix gfs2: Use better variable name gfs2: Make go_instantiate take a glock gfs2: Add new go_held glock operation gfs2: Revert 'Fix "truncate in progress" hang' gfs2: Instantiate glocks ouside of glock state engine gfs2: Fix up gfs2_glock_async_wait gfs2: Minor gfs2_glock_nq_m cleanup gfs2: Fix spelling mistake in comment gfs2: Rewrap overlong comment in do_promote gfs2: Remove redundant NULL check before kfree
| * \ Merge part of branch 'for-next.instantiate' into for-nextAndreas Gruenbacher2022-08-051-115/+72
| |\ \
| | * | gfs2: List traversal in do_promote is safeAndreas Gruenbacher2022-06-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In do_promote(), we're never removing the current entry from the list and so the list traversal is actually safe. Switch back to list_for_each_entry(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| | * | gfs2: do_promote glock holder stealing fixBob Peterson2022-06-291-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In do_promote(), when the glock had no strong holders, we were accidentally calling demote_incompat_holders() with new_gh == NULL, so no weak holders were considered incompatible. Instead, the new holder should have been passed in. For doing that, the HIF_HOLDER flag needs to be set in new_gh to prevent may_grant() from complaining. This means that the new holder will now be recognized as a current holder, so skip over it explicitly in demote_incompat_holders() to prevent it from being dequeued. To further clarify things, we can now rename new_gh to current_gh in demote_incompat_holders(); after all, the HIF_HOLDER flag is already set, which means the new holder is already a current holder. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| | * | gfs2: Use better variable nameAndreas Gruenbacher2022-06-291-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In do_promote() and add_to_queue(), use current_gh as the variable name for the first strong holder we could find: this matches the variable name is may_grant(), and more clearly indicates that we're interested in one (any) of the current strong holders. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| | * | gfs2: Make go_instantiate take a glockAndreas Gruenbacher2022-06-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Make go_instantiate take a glock instead of a glock holder as its argument: this handler is supposed to instantiate the object associated with the glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| | * | gfs2: Add new go_held glock operationAndreas Gruenbacher2022-06-291-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now, inode_go_instantiate() contains functionality that relates to how a glock is held rather than the glock itself, like waiting for pending direct I/O to complete and completing interrupted truncates. This code is meant to be run each time a holder is acquired, but go_instantiate is actually only called once, when the glock is instantiated. To fix that, introduce a new go_held glock operation that is called each time a glock holder is acquired. Move the holder specific code in inode_go_instantiate() over to inode_go_held(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| | * | gfs2: Revert 'Fix "truncate in progress" hang'Andreas Gruenbacher2022-06-291-36/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that interrupted truncates are completed in the context of the process taking the glock, there is no need for the glock state engine to delegate that task to gfs2_quotad or for quotad to perform those truncates anymore. Get rid of the obsolete associated infrastructure. Reverts commit 813e0c46c9e2 ("GFS2: Fix "truncate in progress" hang"). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| | * | gfs2: Instantiate glocks ouside of glock state engineAndreas Gruenbacher2022-06-291-33/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instantiate glocks outside of the glock state engine: there is no real reason for instantiating them inside the glock state engine; it only complicates the code. Instead, instantiate them in gfs2_glock_wait() and gfs2_glock_async_wait() using the new gfs2_glock_holder_ready() helper. On top of that, the only other place that acquires a glock without using gfs2_glock_wait() or gfs2_glock_async_wait() is gfs2_upgrade_iopen_glock(), so call gfs2_glock_holder_ready() there as well. If a dinode has a pending truncate, the glock-specific instantiate function for inodes wakes up the truncate function in the quota daemon. Waiting for the completion of the truncate was previously done by the glock state engine, but we now need to wait in inode_go_instantiate(). This also means that gfs2_instantiate() will now no longer return any "special" error codes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| | * | gfs2: Fix up gfs2_glock_async_waitAndreas Gruenbacher2022-06-291-38/+15
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 1fc05c8d8426 ("gfs2: cancel timed-out glock requests"), a pending locking request can be canceled by calling gfs2_glock_dq() on the pending holder. In gfs2_glock_async_wait(), when we time out, use that to cancel the remaining locking requests and dequeue the locking requests already granted. That's simpler as well as more efficient than waiting for all locking requests to eventually be granted and dequeuing them then. In addition, gfs2_glock_async_wait() promises that by the time the function completes, all glocks are either granted or dequeued, but the implementation doesn't keep that promise if individual locking requests fail. Fix that as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | gfs2: Minor gfs2_glock_nq_m cleanupAndreas Gruenbacher2022-06-281-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add state and flags arguments to gfs2_rlist_alloc() to make it somewhat more obvious which state and flags an rlist uses. With that, stop knocking off flags in gfs2_glock_nq_m() and its nq_m_sync() helper that are never set in the first place. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * | gfs2: Rewrap overlong comment in do_promoteBob Peterson2022-06-091-3/+4
| |/ | | | | | | | | | | | | Rewrap the comment to keep the line length below 80 characters. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* / mm: shrinkers: provide shrinkers with namesRoman Gushchin2022-07-041-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently shrinkers are anonymous objects. For debugging purposes they can be identified by count/scan function names, but it's not always useful: e.g. for superblock's shrinkers it's nice to have at least an idea of to which superblock the shrinker belongs. This commit adds names to shrinkers. register_shrinker() and prealloc_shrinker() functions are extended to take a format and arguments to master a name. In some cases it's not possible to determine a good name at the time when a shrinker is allocated. For such cases shrinker_debugfs_rename() is provided. The expected format is: <subsystem>-<shrinker_type>[:<instance>]-<id> For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair. After this change the shrinker debugfs directory looks like: $ cd /sys/kernel/debug/shrinker/ $ ls dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42 mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43 mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44 rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49 sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13 sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36 sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19 sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10 sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9 sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37 sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38 sb-dax-11 sb-proc-45 sb-tmpfs-35 sb-debugfs-7 sb-proc-46 sb-tmpfs-40 [roman.gushchin@linux.dev: fix build warnings] Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Dave Chinner <dchinner@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* gfs2: Use container_of() for gfs2_glock(aspace)Kees Cook2022-05-241-16/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Clang's structure layout randomization feature gets upset when it sees struct address_space (which is randomized) cast to struct gfs2_glock. This is due to seeing the mapping pointer as being treated as an array of gfs2_glock, rather than "something else, before struct address_space": In file included from fs/gfs2/acl.c:23: fs/gfs2/meta_io.h:44:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct gfs2_glock *' return (((struct gfs2_glock *)mapping) - 1)->gl_name.ln_sbd; ^ Replace the instances of open-coded pointer math with container_of() usage, and update the allocator to match. Some cleanups and conversion of gfs2_glock_get() and gfs2_glock_dealloc() by Andreas. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/202205041550.naKxwCBj-lkp@intel.com Cc: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bill Wendling <morbo@google.com> Cc: cluster-devel@redhat.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Initialize gh_error in gfs2_glock_nqAndreas Gruenbacher2022-02-151-1/+1
| | | | | | | | | | | The gh_error field if a glock holder is initialized to zero in gfs2_holder_init(). When a locking operation fails, gh_error is set to an error code; when it succeeds, the gh_error value is left unchanged. The field isn't initialized in gfs2_holder_reinit(), which is a problem. Instead of fixing that directly, initialize gh_error in gfs2_glock_nq(). That also obsoletes the assignment in do_flock(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Make use of list_is_firstAndreas Gruenbacher2022-02-151-1/+1
| | | | | | Use list_is_first() instead of open-coding it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: cancel timed-out glock requestsAndreas Gruenbacher2022-02-151-0/+10
| | | | | | | | | | The gfs2 evict code tries to upgrade the iopen glock from SH to EX. If the attempt to upgrade times out, gfs2 needs to tell dlm to cancel the lock request or it can deadlock. We also need to wake up the process waiting for the lock when dlm sends its AST back to gfs2. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* Revert "gfs2: check context in gfs2_glock_put"Andreas Gruenbacher2022-02-111-3/+0
| | | | | | | | | | | | | It turns out that the might_sleep() call that commit 660a6126f8c3 adds is triggering occasional data corruption in testing. We're not sure about the root cause yet, but since this commit was added as a debugging aid only, revert it for now. This reverts commit 660a6126f8c3208f6df8d552039cda078a8426d1. Fixes: 660a6126f8c3 ("gfs2: check context in gfs2_glock_put") Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix gfs2_instantiate descriptionAndreas Gruenbacher2021-12-041-1/+1
| | | | | | | The description of gfs2_instantiate accidentally lists a glock argument, but the function takes a glock holder. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix __gfs2_holder_init function name in kernel-doc commentAndreas Gruenbacher2021-12-041-1/+1
| | | | | | | | The function name in the kernel-doc comment wasn't updated when the function was renamed. Fixes: b016d9a84abd ("gfs2: Save ip from gfs2_glock_nq_init") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix remote demote of weak glock holdersAndreas Gruenbacher2021-12-021-3/+7
| | | | | | | | | | | When we mock up a temporary holder in gfs2_glock_cb to demote weak holders in response to a remote locking conflict, we don't set the HIF_HOLDER flag. This causes function may_grant to BUG. Fix by setting the missing HIF_HOLDER flag in the mock glock holder. In addition, define the mock glock holder where it is used. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix "Introduce flag for glock holder auto-demotion"Andreas Gruenbacher2021-11-081-2/+2
| | | | | | | | | | Function demote_incompat_holders iterates over the list of glock holders with list_for_each_entry, and it then sometimes removes the current holder from the list. This will get the loop stuck; we must use list_for_each_entry_safe instead. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix atomic bug in gfs2_instantiateAndreas Gruenbacher2021-11-051-6/+2
| | | | | | | Replace test_bit() + set_bit() with test_and_set_bit() where we need an atomic operation. Use clear_and_wake_up_bit() instead of open coding it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: check context in gfs2_glock_putAlexander Aring2021-10-251-0/+3
| | | | | | | | | Add a might_sleep call into gfs2_glock_put which can sleep in DLM when the last reference is released. This will show problems earlier, and not only when the last reference is put. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix glock_hash_walk bugsAndreas Gruenbacher2021-10-251-10/+12
| | | | | | | | | | | | | | | | | So far, glock_hash_walk took a reference on each glock it iterated over, and it was the examiner's responsibility to drop those references. Dropping the final reference to a glock can sleep and the examiners are called in a RCU critical section with spin locks held, so examiners that didn't need the extra reference had to drop it asynchronously via gfs2_glock_queue_put or similar. This wasn't done correctly in thaw_glock which did call gfs2_glock_put, and not at all in dump_glock_func. Change glock_hash_walk to not take glock references at all. That way, the examiners that don't need them won't have to bother with slow asynchronous puts, and the examiners that do need references can take them themselves. Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Cancel remote delete work asynchronouslyAndreas Gruenbacher2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In gfs2_inode_lookup and gfs2_create_inode, we're calling gfs2_cancel_delete_work which currently cancels any remote delete work (delete_work_func) synchronously. This means that if the work is currently running, it will wait for it to finish. We're doing this to pevent a previous instance of an inode from having any influence on the next instance. However, delete_work_func uses gfs2_inode_lookup internally, and we can end up in a deadlock when delete_work_func gets interrupted at the wrong time. For example, (1) An inode's iopen glock has delete work queued, but the inode itself has been evicted from the inode cache. (2) The delete work is preempted before reaching gfs2_inode_lookup. (3) Another process recreates the inode (gfs2_create_inode). It tries to cancel any outstanding delete work, which blocks waiting for the ongoing delete work to finish. (4) The delete work calls gfs2_inode_lookup, which blocks waiting for gfs2_create_inode to instantiate and unlock the new inode => deadlock. It turns out that when the delete work notices that its inode has been re-instantiated, it will do nothing. This means that it's safe to cancel the delete work asynchronously. This prevents the kind of deadlock described above. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* gfs2: fix GL_SKIP node_scope problemsBob Peterson2021-10-251-4/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, when a glock was locked, the very first holder on the queue would unlock the lockref and call the go_instantiate glops function (if one existed), unless GL_SKIP was specified. When we introduced the new node-scope concept, we allowed multiple holders to lock glocks in EX mode and share the lock. But node-scope introduced a new problem: if the first holder has GL_SKIP and the next one does NOT, since it is not the first holder on the queue, the go_instantiate op was not called. Eventually the GL_SKIP holder may call the instantiate sub-function (e.g. gfs2_rgrp_bh_get) but there was still a window of time in which another non-GL_SKIP holder assumes the instantiate function had been called by the first holder. In the case of rgrp glocks, this led to a NULL pointer dereference on the buffer_heads. This patch tries to fix the problem by introducing two new glock flags: GLF_INSTANTIATE_NEEDED, which keeps track of when the instantiate function needs to be called to "fill in" or "read in" the object before it is referenced. GLF_INSTANTIATE_IN_PROG which is used to determine when a process is in the process of reading in the object. Whenever a function needs to reference the object, it checks the GLF_INSTANTIATE_NEEDED flag, and if set, it sets GLF_INSTANTIATE_IN_PROG and calls the glops "go_instantiate" function. As before, the gl_lockref spin_lock is unlocked during the IO operation, which may take a relatively long amount of time to complete. While unlocked, if another process determines go_instantiate is still needed, it sees GLF_INSTANTIATE_IN_PROG is set, and waits for the go_instantiate glop operation to be completed. Once GLF_INSTANTIATE_IN_PROG is cleared, it needs to check GLF_INSTANTIATE_NEEDED again because the other process's go_instantiate operation may not have been successful. Functions that previously called the instantiate sub-functions now call directly into gfs2_instantiate so the new bits are managed properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: split glock instantiation off from do_promoteBob Peterson2021-10-251-3/+17
| | | | | | | | | | Before this patch, function do_promote had a section of code that did the actual instantiation. This patch splits that off into its own function, gfs2_instantiate, which prepares us for the next patch that will use that function. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: further simplify do_promoteBob Peterson2021-10-251-20/+23
| | | | | | | | | This patch further simplifies function do_promote by eliminating some redundant code in favor of using a lock_released flag. This is just prep work for a future patch. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: re-factor function do_promoteBob Peterson2021-10-251-36/+35
| | | | | | | | This patch simply re-factors function do_promote to reduce the indents. The logic should be unchanged. This makes future patches more readable. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Remove 'first' trace_gfs2_promote argumentAndreas Gruenbacher2021-10-251-2/+2
| | | | | | | | | Remove the 'first' argument of trace_gfs2_promote: with GL_SKIP, the 'first' holder isn't the one that instantiates the glock (gl_instantiate), which is what the 'first' flag was apparently supposed to indicate. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: change go_lock to go_instantiateBob Peterson2021-10-251-2/+2
| | | | | | | | | | | Before this patch, the go_lock glock operations (glops) did not do any actual locking. They were used to instantiate objects, like reading in dinodes and rgrps from the media. This patch renames the functions to go_instantiate for clarity. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Save ip from gfs2_glock_nq_initAndreas Gruenbacher2021-10-251-4/+4
| | | | | | | | | | | | Before this patch, when a glock was locked by function gfs2_glock_nq_init, it initialized the holder gh_ip (return address) as gfs2_glock_nq_init. That made it extremely difficult to track down problems because many functions call gfs2_glock_nq_init. This patch changes the function so that it saves gh_ip from the caller of gfs2_glock_nq_init, which makes it easy to backtrack which holder took the lock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* gfs2: move GL_SKIP check from glops to do_promoteBob Peterson2021-10-251-12/+14
| | | | | | | | | | | | | Before this patch, each individual "go_lock" glock operation (glop) checked the GL_SKIP flag, and if set, would skip further processing. This patch changes the logic so the go_lock caller, function go_promote, checks the GL_SKIP flag before calling the go_lock op in the first place. This avoids having to unnecessarily unlock gl_lockref.lock only to re-lock it again. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Add GL_SKIP holder flag to dump_holderBob Peterson2021-10-251-0/+2
| | | | | | | | | | Somehow, the GL_SKIP flag was missed when dumping glock holders. This patch adds it to function hflags2str. I added it at the end because I wanted Holder and Skip flags together to read "Hs" rather than "sH" to avoid confusion with "Shared" ("SH") holder state. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Introduce flag for glock holder auto-demotionBob Peterson2021-10-201-36/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that will allow glocks to be demoted automatically on locking conflicts. When a locking request comes in that isn't compatible with the locking state of an active holder and that holder has the HIF_MAY_DEMOTE flag set, the holder will be demoted before the incoming locking request is granted. Note that this mechanism demotes active holders (with the HIF_HOLDER flag set), while before we were only demoting glocks without any active holders. This allows processes to keep hold of locks that may form a cyclic locking dependency; the core glock logic will then break those dependencies in case a conflicting locking request occurs. We'll use this to avoid giving up the inode glock proactively before faulting in pages. Processes that allow a glock holder to be taken away indicate this by calling gfs2_holder_allow_demote(), which sets the HIF_MAY_DEMOTE flag. Later, they call gfs2_holder_disallow_demote() to clear the flag again, and then they check if their holder is still queued: if it is, they are still holding the glock; if it isn't, they can re-acquire the glock (or abort). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Clean up function may_grantAndreas Gruenbacher2021-10-201-50/+69
| | | | | | | | | | | | | Pass the first current glock holder into function may_grant and deobfuscate the logic there. While at it, switch from BUG_ON to GLOCK_BUG_ON in may_grant. To make that build cleanly, de-constify the may_grant arguments. We're now using function find_first_holder in do_promote, so move the function's definition above do_promote. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Remove redundant check from gfs2_glock_dqBob Peterson2021-08-201-6/+5
| | | | | | | | | | In function gfs2_glock_dq, it checks to see if this is the fast path. Before this patch, it checked both "find_first_holder(gl) == NULL" and list_empty(&gl->gl_holders), which is redundant. If gl_holders is empty then find_first_holder must return NULL. This patch removes the redundancy. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* gfs2: Eliminate vestigial HIF_FIRSTBob Peterson2021-08-201-2/+0
| | | | | | Holder flag HIF_FIRST is no longer used or needed, so remove it. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* gfs2: Use list_move_tail instead of list_del/list_add_tailBaokun Li2021-06-281-2/+1
| | | | | | | | Using list_move_tail() instead of list_del() + list_add_tail(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* gfs2: Fix use-after-free in gfs2_glock_shrink_scanHillf Danton2021-05-311-1/+1
| | | | | | | | | | | | | | | | | The GLF_LRU flag is checked under lru_lock in gfs2_glock_remove_from_lru() to remove the glock from the lru list in __gfs2_glock_put(). On the shrink scan path, the same flag is cleared under lru_lock but because of cond_resched_lock(&lru_lock) in gfs2_dispose_glock_lru(), progress on the put side can be made without deleting the glock from the lru list. Keep GLF_LRU across the race window opened by cond_resched_lock(&lru_lock) to ensure correct behavior on both sides - clear GLF_LRU after list_del under lru_lock. Reported-by: syzbot <syzbot+34ba7ddbf3021981a228@syzkaller.appspotmail.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>