path: root/src/osdc
Commit message (Author, Age, Files, Lines)
* osdc: expose Journaler::write_head_needed (John Spray, 2017-03-08, 2 files, -2/+14)
    So that callers on the read side can optionally do their own write_head calls according to the same condition that Journaler uses internally for its write_head during _flush().
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc: less aggressive prefetch in read/write Journaler (John Spray, 2017-03-08, 1 file, -3/+13)
    Previously, if doing a write/is_readable/write/is_readable sequence, you'd end up doing a flush after every write, even though there was already a flush in flight that would advance the readable-ness of the journal.
    Because this flush-during-read path is only active when using a read/write journal such as in PurgeQueue, tweak the behaviour to suit this case.
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc: remove Journaler "journaler_batch_*" settings (John Spray, 2017-03-08, 2 files, -19/+3)
    This was an unused code path. If anyone set a nonzero value here the MDS would crash because the Timer implementation has changed since this code was written, and now requires add_event_after callers to hold the right lock.
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc/Journaler: wrap recover() completion in finisher (John Spray, 2017-03-08, 1 file, -1/+1)
    Otherwise, the callback will deadlock if it in turn calls into any Journaler functions. Don't care about performance because we do this once at startup.
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc/Filer: const fix for passed layouts (John Spray, 2017-03-08, 2 files, -4/+4)
    ...so that const references can be passed into purge calls.
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc/Journaler: add have_waiter() (John Spray, 2017-03-08, 2 files, -1/+9)
    Allows users of wait_for_readable to conveniently see if there is already a waiter. Yes, they could do this themselves, but I'd rather peek at an existing variable than add a new one caller-side.
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc/Journaler: remove incorrect assertion (John Spray, 2017-03-08, 1 file, -1/+0)
    This asserted that flush_pos would be ahead of safe_pos after calling _flush. However, this is not guaranteed to be the case because prezeroing might prevent us from flushing right now.
    Signed-off-by: John Spray <john.spray@redhat.com>
* osdc/Journaler: assign a name for logging (John Spray, 2017-03-08, 2 files, -4/+6)
    Now that we have an MDLog journaler and a PurgeQueue journaler, this is needed to avoid confusion.
    Signed-off-by: John Spray <john.spray@redhat.com>
* Merge pull request #13759 from liewegas/wip-19133 (Sage Weil, 2017-03-08, 2 files, -5/+8)
    osdc/Objecter: resend RWORDERED ops on full
    Reviewed-by: Josh Durgin <jdurgin@redhat.com>
    Reviewed-by: Greg Farnum <gfarnum@redhat.com>
| * osdc/Objecter: resend RWORDERED ops on full (Sage Weil, 2017-03-07, 2 files, -5/+8)
    Our condition for respecting the FULL flag is complex, and involves the WRITE | RWORDERED flags vs the FULL_FORCE | FULL_TRY flags. Previously, we could block a read because of RWORDERED but not resend it later.
    Fix by capturing the complex condition in a respects_full() bool and using it both for the blocking-on-send and resending-on-possibly-notfull-later checks.
    Fixes: http://tracker.ceph.com/issues/19133
    Signed-off-by: Sage Weil <sage@redhat.com>
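
The condition described in this commit can be sketched as a single predicate shared by both checks. This is a minimal illustration, not the code from the commit: the flag names and values below are hypothetical stand-ins for Ceph's real constants, and the two helper functions are invented here to show the predicate being reused.

```cpp
#include <cstdint>

// Hypothetical flag values, for illustration only; Ceph's real constants
// are defined elsewhere and may have different names and values.
enum : std::uint32_t {
  FLAG_WRITE      = 1u << 0,
  FLAG_RWORDERED  = 1u << 1,
  FLAG_FULL_FORCE = 1u << 2,
  FLAG_FULL_TRY   = 1u << 3,
};

// An op respects FULL if it is a write or an ordered read, and it has not
// opted out via FULL_FORCE or FULL_TRY.
constexpr bool respects_full(std::uint32_t flags) {
  return (flags & (FLAG_WRITE | FLAG_RWORDERED)) != 0 &&
         (flags & (FLAG_FULL_FORCE | FLAG_FULL_TRY)) == 0;
}

// Hypothetical helpers: both decision points consult the same predicate,
// so an op that was blocked on send is also eligible for resend later.
constexpr bool should_block_on_send(std::uint32_t flags, bool cluster_full) {
  return cluster_full && respects_full(flags);
}

constexpr bool should_resend_when_not_full(std::uint32_t flags, bool was_blocked) {
  return was_blocked && respects_full(flags);
}
```

With one predicate, an RWORDERED read that blocks while the cluster is full satisfies the resend check as well, which is the mismatch the commit message describes.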
* | Merge pull request #13323 from yehudasa/wip-18079-2 (Sage Weil, 2017-03-07, 2 files, -1/+30)
    librados: use cursor for nobjects listing
    Reviewed-by: Sage Weil <sage@redhat.com>
| * | objecter: new calls to support cursor with nobjects (Yehuda Sadeh, 2017-02-16, 2 files, -0/+29)
    Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
| * | objecter: append list of returned objects in pool listing (Yehuda Sadeh, 2017-02-16, 1 file, -1/+1)
    Instead of merging it. The returned list is in the correct order, whereas merging it breaks this order.
    Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
* | | Merge pull request #12627 from liupan1111/wip-fix-remove-when-full (Yuri Weinstein, 2017-03-03, 2 files, -1/+8)
    rbd: When Ceph cluster becomes full, should allow user to remove rbd …
    Reviewed-by: Sage Weil <sage@redhat.com>
    Reviewed-by: Jason Dillaman <dillaman@redhat.com>
| * | librados: add interface to set osdmap fulll try flag (Pan Liu, 2017-02-26, 2 files, -1/+8)
    Signed-off-by: Pan Liu <liupan1111@gmail.com>
* | | msg/Dispatcher: pass const Message* to ms_can_fast_dispatch (Sage Weil, 2017-02-25, 1 file, -1/+1)
    Signed-off-by: Sage Weil <sage@redhat.com>
* | | Merge pull request #13478 from xiaoxichen/fix_osdc_perfcounter (Kefu Chai, 2017-02-25, 1 file, -1/+1)
    osdc: fix osdc_osd_seesion perf counter.
    Reviewed-by: Sage Weil <sage@redhat.com>
    Reviewed-by: Kefu Chai <kchai@redhat.com>
| * | osdc: fix osdc_osd_seesion perf counter. (Xiaoxi Chen, 2017-02-17, 1 file, -1/+1)
    Should be "set" instead of "inc"
    Signed-off-by: Xiaoxi Chen <xiaoxchen@ebay.com>
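
The difference between "set" and "inc" matters when a counter reports a current level rather than a running total. The sketch below assumes that is the case for this session counter, which the commit message does not spell out. `Counter`, `report_with_inc`, and `report_with_set` are hypothetical names and are not Ceph's PerfCounters API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a perf counter; not Ceph's PerfCounters class.
struct Counter {
  std::int64_t value = 0;
  void inc(std::int64_t n) { value += n; }  // accumulate into a running total
  void set(std::int64_t v) { value = v; }   // overwrite with the current level
};

// Reporting each sampled session count with inc() sums the samples, which
// over-counts a quantity that is meant to be a level.
inline std::int64_t report_with_inc(const std::vector<std::int64_t>& samples) {
  Counter c;
  for (std::int64_t s : samples) c.inc(s);
  return c.value;
}

// Reporting with set() leaves the counter at the most recent sample.
inline std::int64_t report_with_set(const std::vector<std::int64_t>& samples) {
  Counter c;
  for (std::int64_t s : samples) c.set(s);
  return c.value;
}
```

For samples 3, 3, 4, the inc() variant ends at the sum of all samples, while the set() variant ends at the latest sample.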
* | osd,osdc: eliminate FLAG_ONDISK and helpers (Samuel Just, 2017-02-24, 1 file, -2/+4)
    The objecter actually always needs to get a response in order to be able to not continually resend ops (even if the caller didn't provide a callback). Thus, it makes no sense for an MOSDOp to ever not have FLAG_ONDISK set. Therefore, we'll just remove the helper and assume it's always there (it's safe to send a response the client didn't ask for, the error paths already do that).
    On the Objecter side, we'll just unconditionally fill in ONDISK for the benefit of pre-luminous OSDs.
    Fixes: http://tracker.ceph.com/issues/18961
    Signed-off-by: Samuel Just <sjust@redhat.com>
* | osdc/Objecter: _calc_target on all ops so that we notice splits (Sage Weil, 2017-02-23, 1 file, -14/+37)
    We need to make sure we update the mapping and get an accurate actual_pgid value by recalculating the mapping on every map change. Otherwise, we may not notice a split (and subsequent actual_pgid change) and resend the same op with a stale spg_t.
    To fix this,
    - _calc_target on need_resend
    - update target regardless of current con
    Signed-off-by: Sage Weil <sage@redhat.com>
* | osdc/Objecter: refactor pool dne check to make op->session optional (Sage Weil, 2017-02-23, 2 files, -16/+21)
    Signed-off-by: Sage Weil <sage@redhat.com>
* | osdc/Objecter: track latest epoch in op_target_t (Sage Weil, 2017-02-23, 2 files, -1/+6)
    Signed-off-by: Sage Weil <sage@redhat.com>
* Merge pull request #13235 from liewegas/wip-pg-split-interval (Sage Weil, 2017-02-15, 2 files, -85/+119)
    osd: have clients resend ops on pg split
    Reviewed-by: Greg Farnum <gfarnum@redhat.com>
    Reviewed-by: Josh Durgin <jdurgin@redhat.com>
    Reviewed-by: Samuel Just <sjust@redhat.com>
| * osdc/Objecter: manage backoffs per-spg_t (Sage Weil, 2017-02-14, 2 files, -28/+46)
    A backoff [range] is defined only within a specific spg_t; it does not pass anything to children on split, or to another primary.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * messages/MOSDBackoff: add spg_t to message (Sage Weil, 2017-02-14, 1 file, -4/+6)
    and make it an MOSDFastDispatchOp.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: recalculate target_* on every _calc_target call (Sage Weil, 2017-02-14, 1 file, -14/+19)
    Any time we are asked to calculate the target we should apply the pool tiering parameters. The previous logic of only doing so when the target hadn't been calculated didn't make a whole lot of sense, and broke our update of *pi that is needed to get the correct pg_num for the target pool.
    This didn't really matter for old clusters that take the raw pg, but for luminous and beyond we need the exact spg_t which requires a correct pg_num.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: simplify pgid translation (Sage Weil, 2017-02-14, 1 file, -15/+2)
    All callers now pass in an explicit pgid, including pg listing. Since we resend ops on split, there is no need to do any translation here, even for the jewel and kraken osds that can handle a full hash value.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: use overlay pg_pool_t for subsequent calculations (Sage Weil, 2017-02-14, 1 file, -0/+5)
    We use pi for pg_num and other values below; we need to update accordingly if we follow the overlay.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: force pg_command ops to ignore overlay (Sage Weil, 2017-02-14, 1 file, -0/+3)
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: force pg_read ops to ignore cache overlay (Sage Weil, 2017-02-14, 1 file, -1/+3)
    pg_read is only used for PG listing and hit_set_{list,get}; these operations can't and shouldn't consider the tiering overlay.
    This makes the _calc_target behavior with the explicit pgid make sense; otherwise, what would it mean to try to read pg x.1 from pool x and get redirected to pg y.1 in pool y?
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: populate both actual pgid and full has in MOSDOp (Sage Weil, 2017-02-14, 2 files, -5/+20)
    New clients need the actual pgid as well as the full hash (as part of the target hobj). Old clients only use the full hash value. We need to pass both to MOSDOp so it can encode based on the target features.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: remove reassert_version (Sage Weil, 2017-02-14, 2 files, -6/+1)
    We never populate this since we never get an ack.
    Signed-off-by: Sage Weil <sage@redhat.com>
| * osdc/Objecter: resend ops on pg split if osd has CEPH_FEATURE_RESEND_ON_SPLIT (Sage Weil, 2017-02-14, 2 files, -13/+15)
    Signed-off-by: Sage Weil <sage@redhat.com>
* | Merge pull request #13439 from Liuchang0812/cleanup-osd (Sage Weil, 2017-02-15, 4 files, -27/+27)
    osd: add override in osd subsystem
    Reviewed-by: Sage Weil <sage@redhat.com>
| * | osd: add override in osd subsystem (liuchang0812, 2017-02-15, 4 files, -27/+27)
    Fixes: http://tracker.ceph.com/issues/18922
    Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
* | | Merge pull request #11128 from tchaikov/wip-16091 (Josh Durgin, 2017-02-14, 1 file, -1/+1)
    mon/MonClient: hunt monitors in parallel
    Reviewed-by: Josh Durgin <jdurgin@redhat.com>
| * | | mon/monclient: hunt for multiple monitor in parallel (Kefu Chai, 2017-02-14, 1 file, -1/+1)
    * add an option "mon_client_hunt_parallel" for the maximum number of parallel hunting sessions.
    Fixes: http://tracker.ceph.com/issues/16091
    Signed-off-by: Steven Dieffenbach <sdieffen@redhat.com>
    Signed-off-by: Kefu Chai <kchai@redhat.com>
* | | Merge pull request #13149 from liewegas/wip-list-objects (Sage Weil, 2017-02-14, 2 files, -332/+85)
    librados: remove legacy object listing API, clean up newer api
    Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
| * | osdc/Objecter: remove unused list_objects implementation (Sage Weil, 2017-02-09, 2 files, -228/+0)
    This means we don't know how to list objects on pre-hammer OSDs. (The PGNLS op was added around a03f85a8e7fab296ea2df70a929a1c5e4aa0f7fb.)
    Signed-off-by: Sage Weil <sage@redhat.com>
| * | osdc/Objecter: refactor list_nobjects to use hobject_t as position (Sage Weil, 2017-02-09, 2 files, -107/+88)
    Stop using current_pg as a position pointer; use the hobject_t cursor explicitly.
    We keep current_pg *only* for compatibility with !sortbitwise clusters, and we only use it when we get back MAX from a !sortbitwise OSD and need to determine where the start of the next PG is. In !sortbitwise mode we also have the legacy kludges to behave on PG split.
    Signed-off-by: Sage Weil <sage@redhat.com>
* | | Merge pull request #13321 from liewegas/wip-kill-sortbitwise-harder (Sage Weil, 2017-02-13, 2 files, -9/+9)
    osd: kill sortbitwise
    Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
| * | common/hobject: eliminate wonky compartors (Sage Weil, 2017-02-11, 1 file, -1/+1)
    Signed-off-by: Sage Weil <sage@redhat.com>
| * | common/hobject: remove cmp_* comparators; add normal operators (Sage Weil, 2017-02-11, 2 files, -8/+8)
    Fix up callers.
    Signed-off-by: Sage Weil <sage@redhat.com>
* | | osdc/Objecter: fix possible OSDSession leak on wrong connection (xie xingguo, 2017-02-11, 1 file, -0/+2)
    This is introduced by the newly added backoff logic. Not sure if it will really happen, but just in case.
    Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
* | osdc/Objecter: respect backoffs (Sage Weil, 2017-02-11, 2 files, -0/+138)
    Signed-off-by: Sage Weil <sage@redhat.com>
* | osdc/Objecter: fix typo (Sage Weil, 2017-02-10, 1 file, -1/+1)
    Signed-off-by: Sage Weil <sage@redhat.com>
* | osdc/Objecter: attach OSDSession to Connection (Sage Weil, 2017-02-10, 1 file, -40/+38)
    This lets us avoid an rbtree lookup.
    Signed-off-by: Sage Weil <sage@redhat.com>
* | osdc/ObjectCacher.cc: fix wrong self assignment (Danny Al-Gaaf, 2017-02-09, 1 file, -1/+1)
    Fix for: CID 1395468 (#1 of 1): Self assignment (NO_EFFECT)
    self_assign: Assigning bh->last_write to itself has no effect.
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
* Merge pull request #12966 from dillaman/wip-18436 (Jason Dillaman, 2017-01-31, 1 file, -1/+1)
    osdc: cache should ignore error bhs during trim
    Reviewed-by: John Spray <john.spray@redhat.com>
    Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
| * osdc: cache should ignore error bhs during trim (Jason Dillaman, 2017-01-17, 1 file, -1/+1)
    A read error (such as injecting a timeout into an OSD op) might result in a bh in an error state. These should be trimmable by the cache.
    Fixes: http://tracker.ceph.com/issues/18436
    Signed-off-by: Jason Dillaman <dillaman@redhat.com>