path: root/src/cls
Commit message [Author, Age, Files, Lines]
* Merge pull request #59884 from cbodley/wip-51786 [Casey Bodley, 2024-10-07, 1 file, -4/+4]
    cls/user: reset stats only returns marker when truncated

    Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
| * cls/user: reset stats only returns marker when truncated [Casey Bodley, 2024-09-19, 1 file, -4/+4]
    the returned marker is a bucket name. when bucket names are long, the response can overflow the 64-byte limit on responses to write operations with librados::OPERATION_RETURNVEC. this leads to errors like:

    > ERROR: could not reset user stats: (75) Value too large for defined data type

    however, the client only needs this marker string to resume listing after a truncated response. if the listing is not truncated, we can omit the marker to save space

    in general, users will have less than MAX_ENTRIES=1000 buckets, so won't get truncated listings

    Fixes: https://tracker.ceph.com/issues/51786
    Signed-off-by: Casey Bodley <cbodley@redhat.com>
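The marker rule described in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the cls_user code: `ResetStatsReply` and `reset_stats_page` are hypothetical names, and the bucket list is assumed to be sorted by name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reply type for illustration; not the actual cls_user structure.
struct ResetStatsReply {
  bool truncated = false;
  std::string marker;  // stays empty unless the page was truncated
};

// Visit up to max_entries bucket names that sort after `marker`.
// The marker is only filled in when more entries remain, which keeps
// the reply small in the common, non-truncated case.
ResetStatsReply reset_stats_page(const std::vector<std::string>& sorted_buckets,
                                 const std::string& marker,
                                 std::size_t max_entries) {
  ResetStatsReply reply;
  std::size_t seen = 0;
  std::string last;
  for (const auto& name : sorted_buckets) {
    if (name <= marker) continue;  // already handled by an earlier page
    if (seen == max_entries) {     // at least one more entry exists
      reply.truncated = true;
      break;
    }
    last = name;
    ++seen;
  }
  if (reply.truncated) {
    reply.marker = last;  // the client resumes after this bucket name
  }
  return reply;
}
```

A client would keep calling with the returned marker until `truncated` is false.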
* | cls/rgw: cls_rgw_obj_chain uses vector instead of list [Casey Bodley, 2024-09-25, 1 file, -6/+5]
    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* | Merge pull request #59611 from cbodley/wip-cls-rgw-busy-resharding [Casey Bodley, 2024-09-19, 8 files, -156/+267]
    cls/rgw: duplicate reshard checks in all cls_rgw write operations

    Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
| * | cls/rgw: duplicate reshard checks in all cls_rgw write operations [Casey Bodley, 2024-09-05, 3 files, -26/+95]
    with the addition of the reshard log, all write ops now have to read the omap header themselves. this read duplicates the one in cls_rgw_guard_bucket_resharding(), leading to a minor performance regression

    this commit prepares a long-term fix for this regression by duplicating the reshard check done by cls_rgw_guard_bucket_resharding() in each write operation. this will allow the cls_rgw_guard_bucket_resharding() calls to be removed from rgw two releases later when we can guarantee that all OSDs are performing this check for all write operations

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
| * | cls/rgw: remove unused cls_rgw_bi_get_vals() [Casey Bodley, 2024-09-05, 5 files, -115/+0]
    Signed-off-by: Casey Bodley <cbodley@redhat.com>
| * | cls/rgw: add bulk cls_rgw_bi_put_entries() op for reshard [Casey Bodley, 2024-09-05, 6 files, -0/+156]
    adds a bulk api for reshard to write entries to the target index shard object. this takes care of the bucket stats updates so that rgw's reshard logic doesn't have to worry about it

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
| * | cls/rgw: remove rgw_bucket_dir_entry_meta default ctor [Casey Bodley, 2024-09-05, 1 file, -7/+4]
    to allow aggregate initialization

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
| * | cls/rgw/client: expose cls_rgw_bucket_init_index2() [Casey Bodley, 2024-09-05, 2 files, -3/+9]
    Signed-off-by: Casey Bodley <cbodley@redhat.com>
| * | cls/rgw: rgw_cls_bi_entry::get_info() is const [Casey Bodley, 2024-09-05, 2 files, -2/+2]
    Signed-off-by: Casey Bodley <cbodley@redhat.com>
| * | cls/rgw: remove rgw_cls_bi_entry default ctor [Casey Bodley, 2024-09-05, 1 file, -3/+1]
    to allow aggregate initialization

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* | | Merge pull request #59417 from nbalacha/wip-nbalacha-ns-mirroring [Ilya Dryomov, 2024-09-18, 3 files, -0/+99]
    rbd-mirror: allow mirroring to a different namespace

    Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
| * | rbd-mirror: allow mirroring to a different namespace [N Balachandran, 2024-09-17, 3 files, -0/+99]
    Allows a namespace in a pool to be mirrored to a differently named namespace in the secondary cluster.

    Signed-off-by: N Balachandran <nibalach@redhat.com>
* | | Merge pull request #59107 from nbalacha/wip-nbalacha-async-sorted-snaps [Ilya Dryomov, 2024-09-10, 2 files, -22/+62]
    librbd: make "group snap list" async and optionally sorted by snap creation time

    Reviewed-by: Ramana Raja <rraja@redhat.com>
    Reviewed-by: Mykola Golub <mgolub@suse.com>
    Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
| * | cls/rbd: async methods for group snap list [N Balachandran, 2024-08-30, 2 files, -22/+62]
    Adds methods for asynchronous cls group_snap_list and group_snap_list_order, and a helper class which will list group snaps asynchronously. The helper class also takes try_to_sort and fail_if_not_sorted arguments. It will attempt to sort the group snaps listing in order of creation if try_to_sort is true. If sorting fails and fail_if_not_sorted is true, an error is returned. Otherwise the unsorted list is returned.

    librbd::Group::group_snap_list() now uses the async helper function with try_to_sort set to true and fail_if_not_sorted to false so it will attempt to return a sorted listing but will not fail if it cannot.

    Credit: Mykola Golub <mgolub@suse.com>
    Fixes: https://tracker.ceph.com/issues/51686
    Signed-off-by: N Balachandran <nibalach@redhat.com>
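The interaction of the two flags can be sketched synchronously as below. The names `GroupSnap` and `list_group_snaps` are hypothetical, and two details are assumptions rather than librbd behavior: sorting is treated as failing when a snapshot has no order key, and `-EINVAL` stands in for whatever error the real helper returns.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a group snapshot record.
struct GroupSnap {
  std::string id;
};

// Returns 0 on success or a negative errno value.
// `order` maps snap id -> creation order key; a missing key is the
// (assumed) reason sorting can fail.
int list_group_snaps(std::vector<GroupSnap>& snaps,
                     const std::map<std::string, uint64_t>& order,
                     bool try_to_sort, bool fail_if_not_sorted) {
  if (!try_to_sort) {
    return 0;  // caller asked for the unsorted listing
  }
  for (const auto& snap : snaps) {
    if (order.count(snap.id) == 0) {
      // Sorting is impossible: either report it or fall back to unsorted.
      return fail_if_not_sorted ? -EINVAL : 0;
    }
  }
  std::sort(snaps.begin(), snaps.end(),
            [&order](const GroupSnap& a, const GroupSnap& b) {
              return order.at(a.id) < order.at(b.id);
            });
  return 0;
}
```

With `try_to_sort = true` and `fail_if_not_sorted = false`, the combination the commit message describes for `librbd::Group::group_snap_list()`, the call never fails just because ordering information is missing.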
* | | Merge pull request #56597 from liangmingyuanneo/optimize-reshard [Casey Bodley, 2024-09-05, 8 files, -124/+674]
    rgw reshard: optimize reshard process to minimum blocking time

    Reviewed-by: Casey Bodley <cbodley@redhat.com>
| * | | cls/rgw: add a helper function for calls to cls_cxx_map_remove_key() [liangmingyuan, 2024-09-04, 3 files, -227/+88]
    Add some testing cases and do cleanup too.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | reshard: limiting the number of log to be recorded [liangmingyuan, 2024-07-26, 3 files, -43/+129]
    When the bucket's index shards are already overloaded, avoid adding too many extra keys to the reshard log. Limit the size of the reshard log to `rgw_reshardlog_threshold`: if an index write operation during the logrecord stage would exceed that limit, return the ERR_BUSY_RESHARDING error early.

    This is tracked with reshardlog_entries in `rgw_bucket_dir_header`: writes to a shard add to reshardlog_entries. Deletes do not need to add to it, because the number of index entries is reduced at the same time.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
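A minimal sketch of the threshold check described above, assuming a per-shard counter and a caller-supplied limit. `DirHeader`, `LogResult` and `record_reshard_log` are illustrative names only, not the cls_rgw types, and the exact comparison used by the real code is not stated in the commit message.

```cpp
#include <cstdint>

// Illustrative result type; the real op returns an error code instead.
enum class LogResult { Recorded, BusyResharding };

// Hypothetical stand-in for the counter kept in rgw_bucket_dir_header.
struct DirHeader {
  uint32_t reshardlog_entries = 0;
};

// Called for an index write during the logrecord stage. Refuses early
// once the log has reached the threshold (assumed to be an inclusive cap).
LogResult record_reshard_log(DirHeader& header, uint32_t threshold) {
  if (header.reshardlog_entries >= threshold) {
    return LogResult::BusyResharding;  // stands in for ERR_BUSY_RESHARDING
  }
  ++header.reshardlog_entries;
  return LogResult::Recorded;
}
```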
| * | | reshard: guarantee no duplicated index entries exist before starting reshard [liangmingyuan, 2024-07-26, 4 files, -0/+76]
    Duplicated index entries can remain after a failed reshard, which can lead to redundant copies in a new reshard process. What's more, if the duplicated entry is a delete operation and the same entry is written again before a new reshard, the dst index entry may be deleted wrongly. So duplicated index entries should be cleared automatically after a failed reshard and before a new reshard. For convenience, rgw-admin can also list and purge reshard logs manually.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | reshard: small fix and cleanup [liangmingyuan, 2024-07-22, 3 files, -5/+5]
    At the end of each stage, finish() is called once to guarantee that all entries are flushed to the dst shards. This may take a long time if the number of dst shards is large, especially in the second stage, so the reshard_lock needs to be renewed.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | rgw/reshard: Backward Compatibility [liangmingyuan, 2024-07-22, 5 files, -4/+76]
    The previous release has only one reshard phase: the progress phase, which blocks client writes. Because this release contains that phase and the process is the same, it is a superset of the previous release. So when a previous rgw initiates a reshard, it executes as before.

    When an updated rgw initiates a reshard, it first enters the logrecord phase, which previous releases do not implement. That means nodes that have not been upgraded would handle client write operations without recording logs, which may lead to some of these index entries being missed. So we forbid this scenario by adding `cls_rgw_set_bucket_resharding2()` and `cls_rgw_bucket_init_index2()` to control source and target versions: old osds fail the request with -EOPNOTSUPP.

    radosgw can therefore start by trying that on all shards. If there are no errors, it can safely proceed with the new scheme. If any of the osds return -EOPNOTSUPP, rgw falls back to the current resharding scheme where writes are blocked the whole time.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
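The probe-then-fall-back decision described above might look roughly like this. It is a sketch only: `choose_scheme`, `ReshardScheme` and the `try_shard` callback are hypothetical, with the callback standing in for issuing the versioned op (for example `cls_rgw_set_bucket_resharding2()`) against one shard and returning 0 or a negative errno.

```cpp
#include <cerrno>
#include <functional>
#include <vector>

enum class ReshardScheme { LogRecord, BlockingOnly };

// Probe every shard with the new versioned op. Any -EOPNOTSUPP means at
// least one OSD is too old, so the whole reshard uses the blocking scheme.
// Other errors are passed through to the caller unchanged.
int choose_scheme(const std::vector<int>& shards,
                  const std::function<int(int)>& try_shard,
                  ReshardScheme& scheme) {
  for (int shard : shards) {
    const int r = try_shard(shard);
    if (r == -EOPNOTSUPP) {
      scheme = ReshardScheme::BlockingOnly;
      return 0;
    }
    if (r < 0) {
      return r;
    }
  }
  scheme = ReshardScheme::LogRecord;
  return 0;
}
```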
| * | | rgw/reshard: take into account the object stats of dest shards [liangmingyuan, 2024-07-21, 6 files, -5/+148]
    In the progress state, some index entries that were already copied to dest shards in the logrecord state will be copied again, so we should first subtract their stats in the dest shards.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | rgw/reshard: copy the index entries to dest shards. [liangmingyuan, 2024-07-21, 5 files, -15/+107]
    In the logrecord state, copy inventoried index entries to dest shards and record a log for each newly written entry. In the progress state, block writes, list the logs written in the logrecord state, then get the corresponding index entries and copy them to dest shards.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | rgw/reshard: record a duplicated index entry copy together with version bucket writing operations. [liangmingyuan, 2024-07-20, 1 file, -59/+113]
    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | rgw/reshard: record a duplicated index entry copy together with prepare and complete. [liangmingyuan, 2024-07-20, 1 file, -2/+50]
    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | rgw/reshard: Define the operation to record a duplicated index entry. [liangmingyuan, 2024-07-20, 3 files, -6/+111]
    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
| * | | rgw/reshard: Add logrecord phase in resharding [liangmingyuan, 2024-07-20, 1 file, -2/+15]
    Define a new status for reshard named IN_LOGRECORD, which will be used later for recording the index ops written when a bucket is resharding.

    Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
* | | | Merge pull request #57878 from Suyashd999/fix-uam4 [Yuval Lifshitz, 2024-09-04, 1 file, -2/+2]
    cls: avoid reusing moved-from buffers in cls_queue_src.cc
| * | | | clang tidy generates use-after-move warning [Suyash Dongre, 2024-07-03, 1 file, -2/+2]
    clang-tidy original warning:

    /home/suyash/ceph/src/cls/queue/cls_queue_src.cc:330:50: warning: 'bl' used after it was moved [bugprone-use-after-move]
        uint64_t entry_start_offset = start_offset - bl.length();
                                                     ^
    /home/suyash/ceph/src/cls/queue/cls_queue_src.cc:333:14: note: move occurred here
        bl_chunk = std::move(bl);
                 ^
    /home/suyash/ceph/src/cls/queue/cls_queue_src.cc:330:50: note: the use happens in a later loop iteration than the move
        uint64_t entry_start_offset = start_offset - bl.length();
                                                     ^

    Fixes: https://tracker.ceph.com/issues/66356
    Signed-off-by: Suyash Dongre <suyashd999@gmail.com>
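The warning pattern (a buffer moved at the end of one loop iteration and read again in the next) can be reproduced with standard types. The sketch below uses `std::string` in place of Ceph's bufferlist and shows one common remedy, resetting the moved-from object before it is reused; it is not necessarily the change this commit made, and `entry_starts` is a hypothetical function.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Computes the start offset of each chunk while handing the buffer off
// with std::move, similar in shape to the loop named in the warning.
std::vector<uint64_t> entry_starts(const std::vector<std::string>& reads) {
  std::vector<uint64_t> starts;
  std::string bl;        // reused across iterations
  std::string bl_chunk;  // receives the data each iteration
  uint64_t start_offset = 0;
  for (const auto& data : reads) {
    bl.append(data);
    start_offset += data.size();
    starts.push_back(start_offset - bl.size());
    bl_chunk = std::move(bl);
    bl.clear();  // put the moved-from buffer back into a known, empty state
  }
  return starts;
}
```

Without the `clear()`, the next iteration would read a moved-from string, whose contents are valid but unspecified.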
* | | | | integer being interpreted as a character code when assigning to a string [Suyash Dongre, 2024-08-31, 1 file, -6/+6]
    clang-tidy original warning:

    `src/cls/rgw/cls_rgw.cc: warning: an integer is interpreted as a character code when assigning it to a string; if this is intended, cast the integer to the appropriate character type; if you want a string representation, use the appropriate conversion facility [bugprone-string-integer-assignment]
        key = BI_PREFIX_CHAR;
              ^
    /home/suyash/ceph/src/cls/rgw/cls_rgw.cc:51:24: note: expanded from macro 'BI_PREFIX_CHAR'
    #define BI_PREFIX_CHAR 0x80`

    **On the following lines we are getting the warning:**
    1. 138
    2. 319
    3. 342
    4. 2875
    5. 2966
    6. 3294
    7. 3304
    8. 3307
    9. 3418
    10. 3421

    Signed-off-by: Suyash Dongre <suyashd999@gmail.com>
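The two remedies the warning text offers can be illustrated in isolation. This sketch uses a hypothetical constant `kPrefixChar` with the value 0x80 quoted in the warning; it shows the general technique (an explicit cast when a character code is intended, a conversion function when a decimal string is intended), not the exact edit applied to cls_rgw.cc.

```cpp
#include <string>

// Hypothetical stand-in for the BI_PREFIX_CHAR macro value in the warning.
constexpr int kPrefixChar = 0x80;

// Intent: a one-byte string holding the character code 0x80.
std::string prefix_as_char() {
  std::string key;
  key = static_cast<char>(kPrefixChar);  // explicit cast states the intent
  return key;
}

// Intent: the decimal text of the number instead.
std::string prefix_as_text() {
  return std::to_string(kPrefixChar);
}
```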
* | | | Merge pull request #58911 from yuvalif/wip-yuval-67229 [Yuval Lifshitz, 2024-08-15, 1 file, -2/+2]
    test/cls_2pc_queue: prevent list+remove race between consumers

    Reviewed-By: Casey Bodley <cbodley@ibm.com>
| * | | | test/cls_2pc_queue: prevent list+remove race between consumers [Yuval Lifshitz, 2024-08-05, 1 file, -2/+2]
    * make sure that the queue full condition is covered
    * add cls debugging to test

    Fixes: https://tracker.ceph.com/issues/67229
    Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
* | | | | Merge pull request #58448 from cbodley/wip-rgw-lc-async [Casey Bodley, 2024-08-14, 2 files, -67/+61]
    cls/rgw: define lc ops in terms of ObjectOperation instead of IoCtx

    Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
| * | | | cls/rgw: define lc ops in terms of ObjectOperation instead of IoCtx [Casey Bodley, 2024-07-23, 2 files, -67/+61]
    clean up the lc functions that were supposed to be hidden by CLS_CLIENT_HIDE_IOCTX

    this allows rgw to use them asynchronously with rgw_rados_operate() and optional_yield, and warn about blocking calls that should be async

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* | | | Merge pull request #58002 from nbalacha/wip-nbalacha-sorted-snaps [Ilya Dryomov, 2024-08-07, 3 files, -6/+158]
    cls/rbd: add group_snap_list_order method to enable sorting snapshots in creation order

    Reviewed-by: Ramana Raja <rraja@redhat.com>
    Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
| * | | cls/rbd: add functions to get group snap orders [N Balachandran, 2024-08-06, 3 files, -2/+87]
    Added functions to get the group snap order keys.

    Signed-off-by: N Balachandran <nibalach@redhat.com>
| * | | cls/rbd: save max group snap order [N Balachandran, 2024-08-05, 1 file, -27/+13]
    Save the last used group snap order value to speed up snapshot creation.

    Signed-off-by: N Balachandran <nibalach@redhat.com>
| * | | cls/rbd: save group snapshot creation order in a new key [Mykola Golub, 2024-08-05, 1 file, -4/+82]
    In order to be able to list group snapshots in creation order, a new snapshot order key is stored on snapshot creation.

    [nbalacha: removed the group_snap_unsorted function, reverted the group_snap_list function changes]

    Signed-off-by: Mykola Golub <mgolub@suse.com>
    Signed-off-by: N Balachandran <nibalach@redhat.com>
| * | | cls/rbd: make group_snap_list return error if it failed [Mykola Golub, 2024-07-24, 1 file, -1/+4]
    Signed-off-by: Mykola Golub <mgolub@suse.com>
    Signed-off-by: N Balachandran <nibalach@redhat.com>
* / / cls/rgw: gc_list uses ObjectOperation instead of IoCtx [Casey Bodley, 2024-07-23, 2 files, -14/+15]
    clean up the only gc function that was hidden with CLS_CLIENT_HIDE_IOCTX

    this allows rgw to use it asynchronously with rgw_rados_operate() and optional_yield, and warn about blocking calls that should be async

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* | cls/rgw: bump cls_rgw_reshard_entry decode version to match encode [Casey Bodley, 2024-07-02, 1 file, -1/+1]
    9302fbb3f5416871c1978af5d45f3bf568c2c190 bumped the version in ENCODE_START() but missed DECODE_START(). i don't think that would cause any decode failures, unless we later raise ENCODE_START's compat_v above 2

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* | rgw/multisite: use bilog_flags in cls_rgw_bucket_unlink_instance as well. [Shilpa Jagannath, 2024-06-28, 4 files, -20/+9]
    Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
* | rgw/multisite: add a flag to bilog_flags and use it to set/unset null version [Shilpa Jagannath, 2024-06-28, 5 files, -11/+29]
    Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
* | librbd: disallow group snap rollback if memberships don't match [Ilya Dryomov, 2024-06-17, 1 file, -0/+1]
    Before proceeding with group rollback, ensure that the set of images that took part in the group snapshot matches the set of images that are currently part of the group. Otherwise, because we preserve affected snapshots when an image is removed from the group, data loss can ensue where an image gets rolled back while part of another group or not part of any group but long repurposed for something else.

    Similarly, ensure that the group snapshot is complete.

    Fixes: https://tracker.ceph.com/issues/66300
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
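The precondition described above amounts to a completeness check plus a set comparison. A minimal sketch, assuming images are identified by string ids; `check_rollback_allowed` is a hypothetical function and `-EINVAL` is a placeholder error code, not necessarily what librbd returns.

```cpp
#include <cerrno>
#include <set>
#include <string>

// Returns 0 if rollback may proceed, or a negative errno value otherwise.
// snap_images:    ids of the images captured in the group snapshot
// current_images: ids of the images currently in the group
int check_rollback_allowed(const std::set<std::string>& snap_images,
                           const std::set<std::string>& current_images,
                           bool snap_complete) {
  if (!snap_complete) {
    return -EINVAL;  // an incomplete group snapshot cannot be rolled back
  }
  if (snap_images != current_images) {
    return -EINVAL;  // membership changed since the snapshot was taken
  }
  return 0;
}
```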
* rgw: track initiator of reshard queue entries [J. Eric Ivancich, 2024-05-31, 2 files, -4/+38]
    The logic for managing the reshard queue (log) can vary depending on whether the entry was added by an admin or by dynamic resharding. For example, if it's a reshard reduction, dynamic resharding won't overwrite the queue entry so as not to disrupt the reduction wait period. On the other hand, an admin should be able to overwrite the entry at will.

    So we now track the initiator of each entry on the queue. This adds another field to that at-rest data structure, and it updates the logic to make use of it.

    Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
* cls/rgw: adding an entry to reshard queue has O_CREAT option [J. Eric Ivancich, 2024-05-30, 4 files, -9/+38]
    Adds the ability to prevent overwriting a reshard queue (log) entry for a given bucket with a newer entry. This adds a flag to the op, so it will either CREATE or make no changes. If an entry already exists when this flag is set, -EEXIST will be returned.

    This is a preparatory step to adding shard reduction to dynamic resharding.

    Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
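The create-or-fail semantics described here can be sketched with an in-memory map standing in for the reshard queue. `ReshardEntry` and `reshard_queue_add` are hypothetical names for illustration; only the `-EEXIST` result for an existing entry comes from the commit message.

```cpp
#include <cerrno>
#include <map>
#include <string>

// Hypothetical, simplified queue entry.
struct ReshardEntry {
  std::string bucket;
  int new_num_shards = 0;
};

// With create == true the entry is only added if none exists for the
// bucket; an existing entry is left untouched and -EEXIST is returned.
// With create == false the entry is written unconditionally.
int reshard_queue_add(std::map<std::string, ReshardEntry>& queue,
                      const ReshardEntry& entry, bool create) {
  if (create) {
    const bool inserted = queue.emplace(entry.bucket, entry).second;
    return inserted ? 0 : -EEXIST;
  }
  queue.insert_or_assign(entry.bucket, entry);
  return 0;
}
```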
* rgw: switch back to boost::asio for spawn() and yield_context [Casey Bodley, 2024-05-13, 1 file, -10/+5]
    a fork of boost::asio::spawn() was introduced in 2020 with spawn::spawn() from #31580. this fork enabled rgw to customize how the coroutine stacks are allocated in order to avoid stack overflows in frontend request coroutines. this customization was based on a StackAllocator concept from the boost::context library

    in boost 1.80, that same StackAllocator overload was added to boost::asio::spawn(), along with other improvements like per-op cancellation. now that boost has everything we need, switch back and drop the spawn submodule

    this required switching a lot of async functions from async_completion<> to async_initiate<>. similar changes were necessary to enable the c++20 coroutine token boost::asio::use_awaitable

    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* rados/test: Remove cls_remote_reade since gather deprecated [nmordech@redhat.com, 2024-05-12, 2 files, -96/+1]
    https://tracker.ceph.com/issues/64258

    Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
* cls/user: add interfaces to index user account resources [Casey Bodley, 2024-04-10, 7 files, -3/+681]
    Signed-off-by: Casey Bodley <cbodley@redhat.com>
* cls/cas/cls_cas_internal: Initialize 'hash' value before decoding into it [nmordech@redhat.com, 2024-03-12, 1 file, -1/+1]
    In the decode function for chunk_refs_by_hash_t, initialize the variable 'hash' of type ceph_le32 to zero before its first use. This prevents the variable from containing dirty (uninitialized) values, which could lead to unexpected behavior later in the code.

    Fixes: https://tracker.ceph.com/issues/64854
    Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>