| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
So that callers on the read side can optionally
do their own write_head calls using the same
condition that Journaler applies internally when
deciding whether to write_head during _flush().
Signed-off-by: John Spray <john.spray@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if doing a write/is_readable/write/is_readable sequence,
you'd end up doing a flush after every write, even though there
was already a flush in flight that would advance the readable-ness
of the journal.
Because this flush-during-read path is only active when using
a read/write journal such as in PurgeQueue, tweak the behaviour
to suit this case.
Signed-off-by: John Spray <john.spray@redhat.com>
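
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the Journaler code: JournalSketch, its position fields, and flush_completed() are hypothetical names, and the "flush in flight" test (flush_pos ahead of safe_pos) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a read/write journal; not the real Journaler class.
struct JournalSketch {
  uint64_t write_pos = 0;  // end of data appended by the writer
  uint64_t flush_pos = 0;  // end of data handed to the backing store
  uint64_t safe_pos = 0;   // end of data the backing store has acknowledged
  uint64_t read_pos = 0;   // next byte the reader will consume
  int flushes_issued = 0;  // counts flushes, for demonstration only

  void write(uint64_t len) { write_pos += len; }

  // Hand all unflushed data to the backing store.
  void flush() {
    if (flush_pos < write_pos) {
      flush_pos = write_pos;
      ++flushes_issued;
    }
  }

  // Called when the backing store acknowledges everything flushed so far.
  void flush_completed() { safe_pos = flush_pos; }

  bool is_readable() {
    if (read_pos < safe_pos)
      return true;
    // Nothing readable yet. Start a flush only when none is in flight
    // (assumed here to mean flush_pos == safe_pos); an outstanding flush
    // will already advance readability once it completes.
    if (flush_pos == safe_pos)
      flush();
    return false;
  }
};
```

With this shape, a write/is_readable/write/is_readable sequence issues one flush rather than two, because the second is_readable() sees the first flush still outstanding.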
|
|
|
|
|
|
|
|
|
| |
This was an unused code path. If anyone set a nonzero
value here the MDS would crash because the Timer implementation
has changed since this code was written, and now requires
add_event_after callers to hold the right lock.
Signed-off-by: John Spray <john.spray@redhat.com>
|
|
|
|
|
|
|
|
| |
Otherwise, the callback will deadlock if it in turn
calls into any Journaler functions. Performance is not
a concern because we do this once at startup.
Signed-off-by: John Spray <john.spray@redhat.com>
|
|
|
|
|
|
|
| |
...so that const references can be passed into
purge calls.
Signed-off-by: John Spray <john.spray@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Allows users of wait_for_readable to conveniently
see if there is already a waiter. Yes, they could
do this themselves, but I'd rather peek at an existing
variable than add a new one caller-side.
Signed-off-by: John Spray <john.spray@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
This asserted that flush_pos would be ahead of
safe_pos after calling _flush. However, this
is not guaranteed to be the case because
prezeroing might prevent us from flushing
right now.
Signed-off-by: John Spray <john.spray@redhat.com>
|
|
|
|
|
|
|
| |
Now that we have an MDLog journaler and a PurgeQueue journaler,
this is needed to avoid confusion.
Signed-off-by: John Spray <john.spray@redhat.com>
|
|\
| |
| |
| |
| |
| | |
osdc/Objecter: resend RWORDERED ops on full
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Our condition for respecting the FULL flag is complex, and involves
the WRITE | RWORDERED flags vs the FULL_FORCE | FULL_TRY flags. Previously,
we could block a read because of RWORDERED but not resend it later.
Fix by capturing the complex condition in a respects_full() bool and using
it both for the blocking-on-send and resending-on-possibly-notfull-later
checks.
Fixes: http://tracker.ceph.com/issues/19133
Signed-off-by: Sage Weil <sage@redhat.com>
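
A minimal sketch of the idea of putting the condition in one predicate and reusing it for both checks. The flag values and the two helper functions are hypothetical and chosen only for illustration; the exact combination of flags used by the Objecter is assumed from the description above, not copied from the code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real constants are defined in Ceph's headers.
enum : uint32_t {
  FLAG_WRITE      = 1u << 0,
  FLAG_RWORDERED  = 1u << 1,
  FLAG_FULL_TRY   = 1u << 2,
  FLAG_FULL_FORCE = 1u << 3,
};

// Assumed condition: an op respects FULL if it is a write, or a read ordered
// with writes, and it has not opted out via FULL_TRY or FULL_FORCE.
inline bool respects_full(uint32_t flags) {
  return (flags & (FLAG_WRITE | FLAG_RWORDERED)) != 0 &&
         (flags & (FLAG_FULL_TRY | FLAG_FULL_FORCE)) == 0;
}

// Both decisions use the same predicate, so an op that was blocked on send
// is always a candidate for resend once the cluster is no longer full.
inline bool should_block_on_send(uint32_t flags, bool cluster_full) {
  return cluster_full && respects_full(flags);
}

inline bool should_resend_when_not_full(uint32_t flags, bool was_full,
                                        bool now_full) {
  return was_full && !now_full && respects_full(flags);
}
```

Sharing one predicate is the point: a read carrying RWORDERED that blocks on send also qualifies for resend later.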
|
|\ \
| | |
| | |
| | |
| | | |
librados: use cursor for nobjects listing
Reviewed-by: Sage Weil <sage@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Instead of merging it. The returned list is in the correct order,
whereas merging it breaks this order.
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
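
To illustrate the difference in general terms, here is a small sketch using std::list. The Listing alias and the two helpers are hypothetical; the example only shows that std::list::merge re-sorts by operator< while splice keeps the order in which chunks arrived, which is assumed to be the order the caller wants.

```cpp
#include <cassert>
#include <list>
#include <string>

using Listing = std::list<std::string>;  // hypothetical result type

// Append: moves every element of chunk to the end of result, preserving
// the order in which the elements were returned.
inline void append_chunk(Listing& result, Listing& chunk) {
  result.splice(result.end(), chunk);
}

// Merge: interleaves two sorted lists by operator<, which discards the
// original arrival order whenever that order is not the operator< order.
inline void merge_chunk(Listing& result, Listing& chunk) {
  result.merge(chunk);
}
```

For example, appending {"a", "c"} to {"b", "d"} gives b, d, a, c, whereas merging gives a, b, c, d.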
|
|\ \ \
| |_|/
|/| |
| | |
| | |
| | | |
rbd: When Ceph cluster becomes full, should allow user to remove rbd …
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Pan Liu <liupan1111@gmail.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
|\ \ \
| |/ /
|/| |
| | |
| | |
| | | |
osdc: fix osdc_osd_seesion perf counter.
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
|
| |/
| |
| |
| |
| |
| | |
Should be "set" instead of "inc"
Signed-off-by: Xiaoxi Chen <xiaoxchen@ebay.com>
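
A rough sketch of why the distinction matters, assuming the counter is a gauge that reports a current value. GaugeSketch and report_sessions are hypothetical names, not the perf counter API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical gauge-style counter, for illustration only.
struct GaugeSketch {
  uint64_t value = 0;
  void inc(uint64_t amount = 1) { value += amount; }  // accumulate
  void set(uint64_t v) { value = v; }                 // overwrite
};

// Reporting a current total: set() records the total itself, whereas
// inc(total) on every report would add the totals together.
inline void report_sessions(GaugeSketch& gauge, uint64_t current_sessions) {
  gauge.set(current_sessions);
}
```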
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The objecter actually always needs to get a response in order to
be able to not continually resend ops (even if the caller didn't
provide a callback). Thus, it makes no sense for an MOSDOp to
ever not have FLAG_ONDISK set. Therefore, we'll just remove the
helper and assume it's always there (it's safe to send a response
the client didn't ask for, the error paths already do that). On
the Objecter side, we'll just unconditionally fill in ONDISK for
the benefit of pre-luminous OSDs.
Fixes: http://tracker.ceph.com/issues/18961
Signed-off-by: Samuel Just <sjust@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We need to make sure we update the mapping and get an accurate actual_pgid
value by recalculating the mapping on every map change. Otherwise, we may
not notice a split (and subsequent actual_pgid change) and resend the same
op with a stale spg_t. To fix this,
- _calc_target on need_resend
- update target regardless of current con
Signed-off-by: Sage Weil <sage@redhat.com>
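
The effect of a split on the computed PG can be sketched as below. This is an illustration under assumptions: OpTargetSketch and calc_target are hypothetical, pg_mask is a helper written for the example, and stable_mod is a local function modeled on the stable-mod style of hash-to-PG mapping rather than a call into Ceph.

```cpp
#include <cassert>
#include <cstdint>

// Smallest mask of the form 2^n - 1 that covers pg_num - 1 (pg_num >= 1).
inline uint32_t pg_mask(uint32_t pg_num) {
  uint32_t mask = 0;
  while (mask < pg_num - 1)
    mask = (mask << 1) | 1;
  return mask;
}

// Local stable-mod: maps a hash onto [0, pg_num) so that growing pg_num
// only moves objects from a parent PG to its children.
inline uint32_t stable_mod(uint32_t x, uint32_t b, uint32_t bmask) {
  return ((x & bmask) < b) ? (x & bmask) : (x & (bmask >> 1));
}

// Hypothetical per-op target state.
struct OpTargetSketch {
  uint32_t hash = 0;
  uint32_t actual_pg = 0;
};

// Recalculate on every map change; returns true when the PG changed and
// the op therefore has to be resent with the new target.
inline bool calc_target(OpTargetSketch& t, uint32_t pg_num) {
  uint32_t pg = stable_mod(t.hash, pg_num, pg_mask(pg_num));
  bool changed = (pg != t.actual_pg);
  t.actual_pg = pg;
  return changed;
}
```

With hash 5, pg_num 4 maps to PG 1 and pg_num 8 maps to PG 5, so skipping the recalculation after the split would leave the op addressed to the stale PG.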
|
| |
| |
| |
| | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
|/
|
|
| |
Signed-off-by: Sage Weil <sage@redhat.com>
|
|\
| |
| |
| |
| |
| |
| | |
osd: have clients resend ops on pg split
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
A backoff [range] is defined only within a specific spg_t; it does not
pass anything to children on split, or to another primary.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| | |
and make it an MOSDFastDispatchOp.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Any time we are asked to calculate the target we should apply the
pool tiering parameters. The previous logic of only doing so when the
target hadn't been calculated didn't make a whole lot of sense, and broke
our update of *pi that is needed to get the correct pg_num for the target
pool. This didn't really matter for old clusters that take the raw pg,
but for luminous and beyond we need the exact spg_t which requires a
correct pg_num.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
All callers now pass in an explicit pgid, including pg listing. Since
we resend ops on split, there is no need to do any translation here,
even for the jewel and kraken osds that can handle a full hash value.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
We use pi for pg_num and other values below; we need to update accordingly
if we follow the overlay.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
pg_read is only used for PG listing and hit_set_{list,get}; these
operations can't and shouldn't consider the tiering overlay.
This makes the _calc_target behavior with the explicit pgid make sense;
otherwise, what would it mean to try to read pg x.1 from pool x and get
redirected to pg y.1 in pool y?
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
New clients need the actual pgid as well as the full hash (as part of the
target hobj). Old clients only use the full hash value. We need to pass
both to MOSDOp so it can encode based on the target features.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| | |
We never populate this since we never get an ack.
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
|\ \
| | |
| | |
| | |
| | | |
osd: add override in osd subsystem
Reviewed-by: Sage Weil <sage@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes: http://tracker.ceph.com/issues/18922
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | | |
mon/MonClient: hunt monitors in parallel
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
|
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* add an option "mon_client_hunt_parallel" for the maximum number of parallel
hunting sessions.
Fixes: http://tracker.ceph.com/issues/16091
Signed-off-by: Steven Dieffenbach <sdieffen@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
|
|\ \ \
| |/ /
|/| |
| | |
| | | |
librados: remove legacy object listing API, clean up newer api
Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This means we don't know how to list objects on pre-hammer OSDs.
(The PGNLS op was added around a03f85a8e7fab296ea2df70a929a1c5e4aa0f7fb.)
Signed-off-by: Sage Weil <sage@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Stop using current_pg as a position pointer; use the hobject_t
cursor explicitly.
We keep current_pg *only* for compatibility with !sortbitwise
clusters, and we only use it when we get back MAX from a
!sortbitwise OSD and need to determine where the start of the next
PG is. In !sortbitwise mode we also have the legacy kludges to
behave on PG split.
Signed-off-by: Sage Weil <sage@redhat.com>
|
|\ \ \
| |_|/
|/| |
| | |
| | |
| | | |
osd: kill sortbitwise
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Fix up callers.
Signed-off-by: Sage Weil <sage@redhat.com>
|
|/ /
| |
| |
| |
| |
| |
| | |
This is introduced by the newly added backoff logic.
Not sure if it will really happen, but just in case.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
|
| |
| |
| |
| | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| | |
Signed-off-by: Sage Weil <sage@redhat.com>
|
| |
| |
| |
| |
| |
| | |
This lets us avoid an rbtree lookup.
Signed-off-by: Sage Weil <sage@redhat.com>
|
|/
|
|
|
|
|
|
|
| |
Fix for:
CID 1395468 (#1 of 1): Self assignment (NO_EFFECT)
self_assign: Assigning bh->last_write to itself has no effect.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
|\
| |
| |
| |
| |
| | |
osdc: cache should ignore error bhs during trim
Reviewed-by: John Spray <john.spray@redhat.com>
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
A read error (such as injecting a timeout into an OSD op) might result
in a bh in an error state. These should be trimmable by the cache.
Fixes: http://tracker.ceph.com/issues/18436
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
|