cls/user: reset stats only returns marker when truncated
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
the returned marker is a bucket name. when bucket names are long, the
response can overflow the 64-byte limit on responses to write operations
with librados::OPERATION_RETURNVEC. this leads to errors like:
> ERROR: could not reset user stats: (75) Value too large for defined data type
however, the client only needs this marker string to resume listing
after a truncated response. if the listing is not truncated, we can omit
the marker to save space.
in general, users will have fewer than MAX_ENTRIES=1000 buckets, so they
won't get truncated listings.
Fixes: https://tracker.ceph.com/issues/51786
Signed-off-by: Casey Bodley <cbodley@redhat.com>
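A minimal sketch of the idea, with a hypothetical reply struct (the real types live under src/cls/user/); only a truncated listing carries the marker:

```cpp
#include <string>
#include <utility>

// hypothetical reply shape for illustration, not the actual cls_user type
struct reset_stats_reply {
  bool truncated = false;
  std::string marker; // bucket name to resume from; can be long
};

reset_stats_reply make_reply(bool truncated, std::string last_bucket) {
  reset_stats_reply r;
  r.truncated = truncated;
  if (truncated) {
    // the client only needs the marker to resume a truncated listing
    r.marker = std::move(last_bucket);
  }
  // when not truncated, the empty marker keeps the encoded reply under
  // the 64-byte OPERATION_RETURNVEC limit even for long bucket names
  return r;
}
```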
Signed-off-by: Casey Bodley <cbodley@redhat.com>
cls/rgw: duplicate reshard checks in all cls_rgw write operations
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
with the addition of the reshard log, all write ops now have to read the
omap header themselves. this read duplicates the one in
cls_rgw_guard_bucket_resharding(), leading to a minor performance
regression.
this commit prepares a long-term fix for this regression by duplicating
the reshard check done by cls_rgw_guard_bucket_resharding() in each
write operation. this will allow the cls_rgw_guard_bucket_resharding()
calls to be removed from rgw two releases later, when we can guarantee
that all OSDs are performing this check for all write operations.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
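A sketch of the duplicated guard, with hypothetical stand-in types; the point is that each write op already holds the omap header, so it can test the reshard state itself:

```cpp
#include <cerrno>

// hypothetical stand-ins for the cls_rgw types, for illustration only
struct bucket_header {
  bool resharding = false; // set while the bucket is being resharded
};
constexpr int ERR_BUSY_RESHARDING = 2300; // illustrative value

// shape of a write op after the change: it re-checks the reshard state
// from the omap header it already had to read for the reshard log
int write_op(const bucket_header& header) {
  if (header.resharding)
    return -ERR_BUSY_RESHARDING; // same failure the guard call produces
  // ... apply the index write ...
  return 0;
}
```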
Signed-off-by: Casey Bodley <cbodley@redhat.com>
adds a bulk api for reshard to write entries to the target index shard
object. this takes care of the bucket stats updates so that rgw's
reshard logic doesn't have to worry about it
Signed-off-by: Casey Bodley <cbodley@redhat.com>
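A sketch of the call's shape, with hypothetical entry and stats types: one bulk op writes the batch and folds the stats update into the same call, so rgw's reshard loop doesn't track stats itself.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// hypothetical types for illustration, not the actual cls_rgw structures
struct index_entry { std::string key; uint64_t size = 0; };
struct shard_stats { uint64_t num_entries = 0; uint64_t total_size = 0; };

void bulk_put(const std::vector<index_entry>& batch, shard_stats& stats) {
  for (const auto& e : batch) {
    // ... write e into the target shard's omap ...
    stats.num_entries += 1;   // stats maintained inside the same op
    stats.total_size += e.size;
  }
}
```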
to allow aggregate initialization
Signed-off-by: Casey Bodley <cbodley@redhat.com>
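For context, a generic example (not the actual struct) of what dropping user-declared constructors buys:

```cpp
#include <cstdint>
#include <string>

// with no user-declared constructors, the struct is an aggregate...
struct entry {
  std::string name;
  uint64_t size = 0;
};

// ...so callers can brace-initialize it directly, field by field
entry e{.name = "obj1", .size = 4096};
```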
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
to allow aggregate initialization
Signed-off-by: Casey Bodley <cbodley@redhat.com>
rbd-mirror: allow mirroring to a different namespace
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Allows a namespace in a pool to be mirrored to a differently named
namespace in the secondary cluster.
Signed-off-by: N Balachandran <nibalach@redhat.com>
librbd: make "group snap list" async and optionally sorted by snap creation time
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Adds methods for asynchronous cls group_snap_list and
group_snap_list_order, and a helper class which will list group snaps
asynchronously. The helper class also takes try_to_sort and
fail_if_not_sorted arguments. It will attempt to sort the group snap
listing in order of creation if try_to_sort is true. If sorting fails and
fail_if_not_sorted is true, an error is returned. Otherwise the
unsorted list is returned.
librbd::Group::group_snap_list() now uses the async helper function with
try_to_sort set to true and fail_if_not_sorted set to false, so it will
attempt to return a sorted listing but will not fail if it cannot.
Credit: Mykola Golub <mgolub@suse.com>
Fixes: https://tracker.ceph.com/issues/51686
Signed-off-by: N Balachandran <nibalach@redhat.com>
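A sketch of the sorting policy described above, with hypothetical names (order == 0 standing in for "no order key stored"):

```cpp
#include <algorithm>
#include <cerrno>
#include <cstdint>
#include <vector>

struct group_snap {
  uint64_t order = 0; // creation-order key; 0 means none was recorded
};

// returns 0 on success, -EINVAL when sorting was required but impossible
int finish_listing(std::vector<group_snap>& snaps,
                   bool try_to_sort, bool fail_if_not_sorted) {
  if (!try_to_sort)
    return 0; // caller asked for the raw listing
  bool sortable = std::all_of(snaps.begin(), snaps.end(),
                              [](const group_snap& s) { return s.order != 0; });
  if (sortable) {
    std::sort(snaps.begin(), snaps.end(),
              [](const group_snap& a, const group_snap& b) {
                return a.order < b.order; // creation order
              });
    return 0;
  }
  // older snapshots have no order key; fail only if the caller insists
  return fail_if_not_sorted ? -EINVAL : 0;
}
```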
rgw reshard: optimize reshard process to minimize blocking time
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Add some test cases and do some cleanup as well.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
When the bucket's index shards are already overloaded, avoid
adding too many extra keys to the reshard log. Limit the size
of this reshard log to `rgw_reshardlog_threshold`; if an index
write operation during the logrecord stage would exceed that limit,
return the ERR_BUSY_RESHARDING error early.
Use the reshardlog_entries count in `rgw_bucket_dir_header` to do this,
incrementing it when writing to shards. Deletes don't need to increment
it, because the number of index entries decreases at the same time.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
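A sketch of the early-return check, with hypothetical names and an illustrative error value:

```cpp
#include <cstdint>

constexpr int ERR_BUSY_RESHARDING = 2300; // illustrative value only

struct dir_header {
  uint64_t reshardlog_entries = 0; // logs accumulated during logrecord
};

// before an index write in the logrecord stage, reject early when the
// reshard log would grow past the configured threshold
int check_reshardlog(const dir_header& h, uint64_t threshold) {
  if (h.reshardlog_entries >= threshold) // rgw_reshardlog_threshold
    return -ERR_BUSY_RESHARDING;
  return 0;
}
```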
reshard
There will be duplicated index entries remaining after a failed reshard,
which can lead to redundant copies in a new reshard process. What's more,
if the duplicated entry is a delete operation, and the same entry is
written again before a new reshard, the dst index may be deleted
wrongly. So duplicated index entries should be cleared after a reshard
fails and before a new reshard starts automatically.
For convenience, rgw-admin can list and purge reshard logs manually.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
At the end of each stage, finish() is called once to guarantee that
all entries are flushed to the dst shards. This may take a long time
if the count of dst shards is large, especially in the second stage,
so renewing the reshard_lock is needed.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
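A sketch of periodic lock renewal during a long flush; the declared helpers and the interval are hypothetical stand-ins for the real machinery:

```cpp
#include <chrono>

bool renew_reshard_lock();            // hypothetical: re-assert the lease
void flush_entries_to_shard(int shard); // hypothetical: flush one dst shard

// renew the lock as we flush, so a long finish() can't outlive the lease
bool finish_stage(int num_dst_shards) {
  using clock = std::chrono::steady_clock;
  auto last_renew = clock::now();
  const auto renew_interval = std::chrono::seconds(30); // illustrative
  for (int shard = 0; shard < num_dst_shards; ++shard) {
    if (clock::now() - last_renew > renew_interval) {
      if (!renew_reshard_lock())
        return false; // lost the lock: abort rather than race another reshard
      last_renew = clock::now();
    }
    flush_entries_to_shard(shard);
  }
  return true;
}
```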
The previous release only has one reshard phase: the progress phase,
which blocks client writes. Because our release contains this
phase and its process is the same, it is a superset of the
previous release. So when a previous rgw initiates a reshard, it
executes as before.
When an updated rgw initiates a reshard, it first enters the
logrecord phase, which previous releases do not implement. That means
nodes which have not been upgraded would handle client write
operations without recording logs, which may lead to some of those
index entries being missed. So we forbid this scenario by adding
`cls_rgw_set_bucket_resharding2()` and `cls_rgw_bucket_init_index2()`
to control source and target versions; old osds will fail the request
with -EOPNOTSUPP, so radosgw can start by trying that on all
shards. If there are no errors, it can safely proceed with the new
scheme. If any of the osds do return -EOPNOTSUPP, then rgw falls
back to the current resharding scheme where writes are blocked
the whole time.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
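A sketch of the negotiation: probe every shard with the new op and fall back if any osd is too old. The probe callable is a hypothetical stand-in for issuing the *2() op.

```cpp
#include <cerrno>
#include <functional>
#include <vector>

// probe(shard) issues the new *2() op against one shard, returning its rc
bool can_use_logrecord(const std::vector<int>& shards,
                       const std::function<int(int)>& probe) {
  for (int shard : shards) {
    if (probe(shard) == -EOPNOTSUPP)
      return false; // an old osd is involved: block writes the whole time
  }
  return true; // every osd accepted the op: safe to use the new scheme
}
```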
In the progress state, some index entries that were already
copied to the dest shards in the logrecord state will be copied again,
so we should first subtract their stats from the dest shards.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
In the logrecord state, copy inventoried index entries to the dest
shards and record a log for each newly written entry. In the progress
state, block writes, list the logs written in the logrecord state, then
fetch the corresponding index entries and copy them to the dest shards.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
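The two phases, sketched with hypothetical helper declarations standing in for the cls_rgw/rgw machinery:

```cpp
#include <string>
#include <vector>

void copy_existing_entries();                 // bulk copy of the inventory
std::vector<std::string> list_reshard_logs(); // keys logged during phase 1
void copy_entry(const std::string& key);      // re-copy one changed entry
void block_writes();
void unblock_and_switch_target();

void reshard_two_phase() {
  // phase 1: logrecord -- client writes stay enabled and are logged
  copy_existing_entries();

  // phase 2: progress -- writes are blocked only while replaying the log
  block_writes();
  for (const auto& key : list_reshard_logs())
    copy_entry(key); // fetch the current entry and copy it to dst
  unblock_and_switch_target();
}
```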
version bucket writing operations.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
prepare and complete.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
Define a new status for reshard named IN_LOGRECORD, which will be
used later for recording the index ops written when a bucket is
resharding.
Signed-off-by: Mingyuan Liang <liangmingyuan@baidu.com>
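An illustrative sketch of the status enum; the names follow the commit, but the numeric values here are assumptions:

```cpp
#include <cstdint>

// illustrative reshard status values; IN_LOGRECORD is the new state
enum class reshard_status : uint8_t {
  NOT_RESHARDING = 0,
  IN_LOGRECORD   = 1, // writes allowed, and each write is logged
  IN_PROGRESS    = 2, // writes blocked while the logs are replayed
};
```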
cls: avoid reusing moved-from buffers in cls_queue_src.cc
clang-tidy original warning:
/home/suyash/ceph/src/cls/queue/cls_queue_src.cc:330:50: warning: 'bl' used after it was moved [bugprone-use-after-move]
uint64_t entry_start_offset = start_offset - bl.length();
^
/home/suyash/ceph/src/cls/queue/cls_queue_src.cc:333:14: note: move occurred here
bl_chunk = std::move(bl);
^
/home/suyash/ceph/src/cls/queue/cls_queue_src.cc:330:50: note: the use happens in a later loop iteration than the move
uint64_t entry_start_offset = start_offset - bl.length();
^
Fixes: https://tracker.ceph.com/issues/66356
Signed-off-by: Suyash Dongre <suyashd999@gmail.com>
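The general shape of the fix for this class of warning, as a generic sketch (std::vector standing in for ceph::buffer::list): capture what you need from the buffer before moving it, and use only the moved-to object afterwards.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using bufferlist = std::vector<char>; // stand-in for ceph::buffer::list

void example(uint64_t start_offset, bufferlist bl) {
  bufferlist bl_chunk;
  // fix: read the length before the move; after std::move(bl) the
  // moved-from bl must not be read again in this or a later iteration
  uint64_t entry_start_offset = start_offset - bl.size();
  bl_chunk = std::move(bl);
  (void)entry_start_offset; // ... continue working with bl_chunk only ...
}
```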
clang-tidy original warning:
`src/cls/rgw/cls_rgw.cc: warning: an integer is interpreted as a character code when assigning it to a string; if this is intended, cast the integer to the appropriate character type; if you want a string representation, use the appropriate conversion facility [bugprone-string-integer-assignment]
key = BI_PREFIX_CHAR;
^
/home/suyash/ceph/src/cls/rgw/cls_rgw.cc:51:24: note: expanded from macro 'BI_PREFIX_CHAR' #define BI_PREFIX_CHAR 0x80`
**On the following lines we are getting the warning:**
1. 138
2. 319
3. 342
4. 2875
5. 2966
6. 3294
7. 3304
8. 3307
9. 3418
10. 3421
Signed-off-by: Suyash Dongre <suyashd999@gmail.com>
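The usual fix for this warning is to make the character conversion explicit, e.g.:

```cpp
#include <string>

#define BI_PREFIX_CHAR 0x80 // as in cls_rgw.cc

void example(std::string& key) {
  // before: key = BI_PREFIX_CHAR;  // int silently assigned to std::string
  // after: cast makes it clear a single character code is intended
  key = static_cast<char>(BI_PREFIX_CHAR);
}
```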
test/cls_2pc_queue: prevent list+remove race between consumers
Reviewed-By: Casey Bodley <cbodley@ibm.com>
* make sure that the queue full condition is covered
* add cls debugging to test
Fixes: https://tracker.ceph.com/issues/67229
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
cls/rgw: define lc ops in terms of ObjectOperation instead of IoCtx
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
clean up the lc functions that were supposed to be hidden by
CLS_CLIENT_HIDE_IOCTX.
this allows rgw to use them asynchronously with rgw_rados_operate() and
optional_yield, and warn about blocking calls that should be async.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
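A sketch of the pattern with an illustrative signature (the real declarations are in cls_rgw_client.h): the helper only appends the cls call to a caller-provided operation and leaves execution to the caller.

```cpp
#include <rados/librados.hpp>

// before: the helper took an IoCtx and executed synchronously itself;
// after: it builds the call on an ObjectReadOperation instead
void cls_rgw_lc_get_head(librados::ObjectReadOperation& op,
                         librados::bufferlist* out, int* rc) {
  librados::bufferlist in;
  // exec() queues the cls method call; the caller decides whether to run
  // it synchronously or via rgw_rados_operate() with optional_yield
  op.exec("rgw", "lc_get_head", in, out, rc);
}
```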
cls/rbd: add group_snap_list_order method to enable sorting snapshots in creation order
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Added functions to get the group snap order keys.
Signed-off-by: N Balachandran <nibalach@redhat.com>
Save the last used group snap order value to speed up
snapshot creation.
Signed-off-by: N Balachandran <nibalach@redhat.com>
In order to be able to list group snapshots in creation order, a
new snapshot order key is stored on snapshot creation.
[nbalacha: removed the group_snap_unsorted function,
reverted the group_snap_list function changes]
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: N Balachandran <nibalach@redhat.com>
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: N Balachandran <nibalach@redhat.com>
clean up the only gc function that was hidden with CLS_CLIENT_HIDE_IOCTX.
this allows rgw to use it asynchronously with rgw_rados_operate() and
optional_yield, and warn about blocking calls that should be async.
Signed-off-by: Casey Bodley <cbodley@redhat.com>
9302fbb3f5416871c1978af5d45f3bf568c2c190 bumped the version in
ENCODE_START() but missed DECODE_START(). i don't think that would cause
any decode failures, unless we later raise ENCODE_START's compat_v above
2
Signed-off-by: Casey Bodley <cbodley@redhat.com>
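For reference, the shape of Ceph's versioned encoding macros; this is a sketch of a struct member pair, not the type touched here, but it shows why DECODE_START's version should track ENCODE_START's:

```cpp
#include "include/encoding.h" // ceph tree; provides the macros below

struct example_t {
  void encode(ceph::buffer::list& bl) const {
    ENCODE_START(3, 2, bl);  // struct_v = 3, compat_v = 2
    // ... encode members; fields added in v3 go at the end ...
    ENCODE_FINISH(bl);
  }
  void decode(ceph::buffer::list::const_iterator& p) {
    DECODE_START(3, p);      // bump together with ENCODE_START's version
    // ... decode members, consulting struct_v for newer fields ...
    DECODE_FINISH(p);
  }
};
```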
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Before proceeding with group rollback, ensure that the set of images
that took part in the group snapshot matches the set of images that are
currently part of the group. Otherwise, because we preserve affected
snapshots when an image is removed from the group, data loss can ensue
where an image gets rolled back while part of another group or not part
of any group but long repurposed for something else.
Similarly, ensure that the group snapshot is complete.
Fixes: https://tracker.ceph.com/issues/66300
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
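A sketch of the precondition with hypothetical types: the image set recorded in the group snapshot must equal the group's current membership, and the snapshot must be complete.

```cpp
#include <set>
#include <string>

// hypothetical record of a group snapshot, for illustration
struct group_snap_info {
  std::set<std::string> image_ids; // images that took part in the snap
  bool complete = false;
};

// guard executed before rollback proceeds
bool can_rollback(const group_snap_info& snap,
                  const std::set<std::string>& current_images) {
  return snap.complete && snap.image_ids == current_images;
}
```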
The logic for managing the reshard queue (log) can vary depending on
whether the entry was added by an admin or by dynamic resharding. For
example, if it's a reshard reduction, dynamic resharding won't
overwrite the queue entry so as not to disrupt the reduction wait
period. On the other hand, an admin should be able to overwrite the
entry at will.
So we now track the initiator of each entry on the queue. This adds
another field to that at-rest data structure, and it updates the logic
to make use of it.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Adds the ability to prevent overwriting a reshard queue (log) entry
for a given bucket with a newer entry. This adds a flag to the op, so
it will either CREATE or make no changes. If an entry already exists
when this flag is set, -EEXIST will be returned.
This is a preparatory step to adding shard reduction to dynamic
resharding.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
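A sketch of the flag's semantics; the flag name and existence check are illustrative, not the op's actual interface:

```cpp
#include <cerrno>

// illustrative create-only flag for the queue-entry op
constexpr unsigned RESHARD_OP_CREATE_ONLY = 1 << 0;

int add_queue_entry(bool entry_exists, unsigned flags) {
  if ((flags & RESHARD_OP_CREATE_ONLY) && entry_exists)
    return -EEXIST; // keep the existing entry (e.g. a pending reduction)
  // ... create or overwrite the entry ...
  return 0;
}
```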
a fork of boost::asio::spawn() was introduced in 2020 with spawn::spawn() from #31580. this fork enabled rgw to customize how the coroutine stacks are allocated in order to avoid stack overflows in frontend request coroutines. this customization was based on a StackAllocator concept from the boost::context library
in boost 1.80, that same StackAllocator overload was added to boost::asio::spawn(), along with other improvements like per-op cancellation. now that boost has everything we need, switch back and drop the spawn submodule
this required switching a lot of async functions from async_completion<> to async_initiate<>. similar changes were necessary to enable the c++20 coroutine token boost::asio::use_awaitable
Signed-off-by: Casey Bodley <cbodley@redhat.com>
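A minimal, generic example of the async_initiate<> pattern the port moved to (not rgw's actual code): the caller's completion token — a callback here, but equally use_awaitable or a yield_context — picks the completion style.

```cpp
#include <boost/asio/async_result.hpp>
#include <boost/asio/io_context.hpp>
#include <boost/asio/post.hpp>
#include <iostream>
#include <utility>

template <typename CompletionToken>
auto async_answer(boost::asio::io_context& io, CompletionToken&& token) {
  return boost::asio::async_initiate<CompletionToken, void(int)>(
      [&io](auto handler) {
        // complete "later" on the io_context, like a real async op would
        boost::asio::post(io, [h = std::move(handler)]() mutable { h(42); });
      },
      token);
}

int main() {
  boost::asio::io_context io;
  async_answer(io, [](int v) { std::cout << "got " << v << "\n"; });
  io.run();
}
```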
https://tracker.ceph.com/issues/64258
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
In the decode function for chunk_refs_by_hash_t, initialize the variable
'hash' of type ceph_le32 to zero before its first use.
This prevents the variable from containing dirty (uninitialized) values,
which could lead to unexpected behavior later in the code.
Fixes: https://tracker.ceph.com/issues/64854
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
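The generic shape of the fix (illustrative struct, with uint32_t standing in for ceph_le32): give the field a deterministic value at declaration instead of leaving it uninitialized.

```cpp
#include <cstdint>

// a partial or failed decode can no longer leave the field holding
// stack garbage; the ceph_le32 'hash' got the same treatment
struct decoded_ref {
  uint32_t hash = 0;
};
```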