path: root/src
Commit message  Author  Age  Files  Lines
* Merge pull request #44396 from cyx1231st/wip-seastore-fix-seastar-runner  Kefu Chai  2021-12-28  2  -12/+46
    crimson/test: fix SeastarRunner when app is not started

    Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
    Reviewed-by: Kefu Chai <tchaikov@gmail.com>
| * crimson/test: fix SeastarRunner when app is not started  Yingxin Cheng  2021-12-25  2  -12/+46
    Support the case when the SeastarRunner isn't able to start the app, for example, when started with --help.

    Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
* | msg/async/dpdk:add commands to obtain the NIC status and statistics  Chunsong Feng  2021-12-28  2  -0/+121
    Commands are added to obtain the network adapter status and statistics for debugging network adapter packet loss and mbuf insufficiency issues.

    Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
    Reviewed-by: luo rixin <luorixin@huawei.com>
    Reviewed-by: Han Fengzhe <hanfengzhe@hisilicon.com>
* | Merge pull request #44276 from fengchunsong/dpdk-affinity  fengchunsong  2021-12-28  4  -3/+42
    common/numa: Skip the DPDK thread when setting NUMA affinity
| * | common/numa: Skip the DPDK thread when setting NUMA affinity  Chunsong Feng  2021-12-27  1  -0/+34
    The CPU affinity of the DPDK thread has been set during DPDK initialization. Do not modify the DPDK affinity when setting NUMA affinity.

    Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
    Reviewed-by: luo rixin <luorixin@huawei.com>
    Reviewed-by: Han Fengzhe <hanfengzhe@hisilicon.com>
| * | msg/async: refactory rename_thread for DPDKStack  Chunsong Feng  2021-12-27  3  -3/+8
    The thread_name of the DPDK thread has been set during DPDK initialization.

    Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
    Reviewed-by: luo rixin <luorixin@huawei.com>
    Reviewed-by: Han Fengzhe <hanfengzhe@hisilicon.com>
* | | Merge PR #44342 into master  Patrick Donnelly  2021-12-27  1  -0/+3
    * refs/pull/44342/head:
        mds: trigger stray reintegration when loading dentry
        qa: test that scrub causes reintegration

    Reviewed-by: Xiubo Li <xiubli@redhat.com>
| * | mds: trigger stray reintegration when loading dentry  Patrick Donnelly  2021-12-16  1  -0/+3
    During recursive scrub, the MDS will load a remote dentry into cache but not necessarily check if reintegration is necessary. Before this commit, that check would only happen when the dentry is returned from a client request. This means that, to effect global reintegration when there are too many strays, a cluster admin would have to run `find` on the CephFS file system. This is unsavory because of the cache / cap explosion involved.

    Fixes: https://tracker.ceph.com/issues/53641
    Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* | | Merge pull request #44333 from rzarzynski/wip-crimson-fix-recovery-discarding  Kefu Chai  2021-12-26  1  -0/+12
    crimson/osd: implement op discarding for pglog-based recovery.

    Reviewed-by: Samuel Just <sjust@redhat.com>
    Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
| * | | crimson/osd: implement op discarding for pglog-based recovery.  Radoslaw Zarzynski  2021-12-16  1  -0/+12
    Crimson, unlike the classical OSD, doesn't discard `MOSDPGPush` or `MOSDPGPull` messages that were sent in an epoch earlier than `last_peering_reset`. This was the problem behind the following crash observed at Sepia:

    ```
    rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-12-07_08:51:40-rados-master-distro-basic-smithi$ less ./6550163/remote/smithi190/log/ceph-osd.1.log.gz
    ...
DEBUG 2021-12-07 09:23:18,543 [shard 0] osd - handle_push: MOSDPGPush(2.1 32/29 {PushOp(2:8ae28953:::benchmark_data_smithi190_40039_object884:head, version: 19'102, data_included: [0~1], data_size: 1, omap_heade r_size: 0, omap_entries_size: 0, attrset_size: 1, recovery_info: ObjectRecoveryInfo(2:8ae28953:::benchmark_data_smithi190_40039_object884:head@19'102, size: 1, copy_subset: [0~1], clone_subset: {}, snapset: 0={} :{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:1, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))}) v4 DEBUG 2021-12-07 09:23:18,543 [shard 0] osd - _handle_push DEBUG 2021-12-07 09:23:18,544 [shard 0] osd - submit_push_data ... DEBUG 2021-12-07 09:23:18,545 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 m lcod 0'0 remapped NOTIFY last_complete now 19'101 log.complete_to at end ERROR 2021-12-07 09:23:18,545 [shard 0] none - /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-89 04-g6dfda01c/rpm/el8/BUILD/ceph-17.0.0-8904-g6dfda01c/src/osd/PeeringState.cc:4198 : In function 'void PeeringState::recover_got(const hobject_t&, eversion_t, bool, ObjectStore::Transaction&)', ceph_assert(%s) info.last_complete == info.last_update Aborting on shard 0. Backtrace: Reactor stalled for 1270 ms on shard 0. 
Backtrace: 0xb14ab 0x470b5f58 0x46e2303d 0x46e3eeed 0x46e3f2b2 0x46e3f476 0x46e3f726 0x12b1f 0xc8e3b 0x3ffd3682 0x3ffd8b7b 0x3ffda08e 0x3ffda753 0x3ffcfdcb 0x3ffd02e2 0x3f fd0ada 0x12b1f 0x3737e 0x21db4 0x3feb09be 0x3cd7fc6e 0x3bec409f 0x3c13d963 0x3c250005 0x3c250b21 0x3c251436 0x3c251fa7 0x3a2cc920 0x3a2fa631 0x3a2fb2cd 0x46df4da1 0x46e3d04a 0x46fc744b 0x46fc9420 0x46a79302 0x46 a7db6b 0x3a18b7f2 0x23492 0x39d30edd 0# gsignal in /lib64/libc.so.6 1# abort in /lib64/libc.so.6 2# ceph::__ceph_assert_fail(char const*, char const*, int, char const*) in ceph-osd 3# PeeringState::recover_got(hobject_t const&, eversion_t, bool, ceph::os::Transaction&) in ceph-osd 4# PGRecovery::on_local_recover(hobject_t const&, ObjectRecoveryInfo const&, bool, ceph::os::Transaction&) in ceph-osd ``` The sequence of events --------------------------- The merged log had entries to `19'236`: ``` DEBUG 2021-12-07 09:22:06,727 [shard 0] osd - pg_advance_map(id=74, detail=PGAdvanceMap(pg=2.1 from=23 to=24 do_init)): complete TRACE 2021-12-07 09:22:06,729 [shard 0] osd - call_with_interruption_impl: may_interrupt: false, local interrupt_condintion: 0x603000699800, global interrupt_cond: 0x0,N7crimson3osd20IOInterruptConditionE TRACE 2021-12-07 09:22:06,729 [shard 0] osd - set: interrupt_cond: 0x603000699800, ref_count: 1 DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - do_peering_event handling epoch_sent: 24 epoch_requested: 23 MLogRec from 2 log log((0'0,19'236], crt=19'236) pi ([18,22] all_participants=0,1,2,3 intervals=([18,21] acting 2,3)) pg_lease(ru 89.867439270s ub 89.867439270s int 16.000000000s) +create_info for pg: 2.1 DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - pg_epoch 24 pg[2.1( empty local-lis/les=0/0 n=0 ec=12/12 lis/c=18/18 les/c/f=19/19/0 sis=23) [0,1]/[2] r=-1 lpr=23 pi=[18,23)/1 crt=0'0 mlcod 0'0 unknown state<Started/Stray>: got info+log from osd.2 2.1( v 19'236 (0'0,19'236] local-lis/les=23/24 n=0 ec=12/12 lis/c=18/18 les/c/f=19/19/0 sis=23) log((0'0,19'236], crt=19'236) 
DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - merge_log log((0'0,19'236], crt=19'236) from osd.2 into log((0'0,0'0], crt=0'0) DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - merge_log extending head to 19'236 DEBUG 2021-12-07 09:22:06,729 [shard 0] osd - ? 19'236 (0'0) modify 2:8a93bbd4:::foo.7:head by client.4668.0:1 2021-12-07T09:21:57.301709+0000 0 ObjectCleanRegions clean_offsets: [9033~18446744073709542582], clean_omap: 1, new_object: 0 ... DEBUG 2021-12-07 09:22:06,739 [shard 0] osd - ? 19'103 (0'0) modify 2:9cf5b466:::benchmark_data_smithi190_40039_object890:head by client.4423.0:891 2021-12-07T09:21:44.366732+0000 0 ObjectCleanRegions clean_ offsets: [1~18446744073709551614], clean_omap: 1, new_object: 0 DEBUG 2021-12-07 09:22:06,739 [shard 0] osd - ? 19'102 (0'0) modify 2:8ae28953:::benchmark_data_smithi190_40039_object884:head by client.4423.0:885 2021-12-07T09:21:44.292066+0000 0 ObjectCleanRegions clean_offsets: [1~18446744073709551614], clean_omap: 1, new_object: 0 DEBUG 2021-12-07 09:22:06,739 [shard 0] osd - ? 19'101 (0'0) modify 2:80595656:::benchmark_data_smithi190_40039_object883:head by client.4423.0:884 2021-12-07T09:21:44.285634+0000 0 ObjectCleanRegions clean_offsets: [1~18446744073709551614], clean_omap: 1, new_object: 0 ``` The PG log was `complete_to 19'102` when recovering previous object. 
``` DEBUG 2021-12-07 09:23:18,394 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'100 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 c rt=19'236 mlcod 0'0 active+remapped got missing 2:80595656:::benchmark_data_smithi190_40039_object883:head v 19'101 DEBUG 2021-12-07 09:23:18,394 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 c rt=19'236 lcod 19'100 mlcod 0'0 active+remapped last_complete now 19'101 log.complete_to 19'102 ``` ``` DEBUG 2021-12-07 09:23:18,397 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 c rt=19'236 lcod 19'100 mlcod 0'0 active+remapped recovery_committed_to version 19'101 now ondisk ``` Then `PGAdvance` event happened... ``` DEBUG 2021-12-07 09:23:18,468 [shard 0] osd - pg_advance_map(id=149, detail=PGAdvanceMap(pg=2.1 from=32 to=33)): start ``` ... 
and the PG 2.1 went to `Reset`: ``` DEBUG 2021-12-07 09:23:18,477 [shard 0] osd - pg_epoch 32 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped handle_advance_map {1}/{2} -- 1/2 DEBUG 2021-12-07 09:23:18,477 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Started>: Started advmap DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped new interval newup {1} newacting {2} DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Started>: should_restart_peering, transitioning to Reset INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped exit Started/ReplicaActive/RepRecovering 8.880699 5 0.000298 INFO 2021-12-07 09:23:18,478 [shard 0] osd - Exiting state: Started/ReplicaActive/RepRecovering, entered at 1638868989.5980496, 0.000298968 spent on 5 events INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped exit Started/ReplicaActive 27.421174 0 0.000000 INFO 2021-12-07 09:23:18,478 [shard 0] osd - Exiting 
state: Started/ReplicaActive, entered at 1638868971.057631, 0.0 spent on 0 events INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped exit Started 28.590107 0 0.000000 INFO 2021-12-07 09:23:18,478 [shard 0] osd - Exiting state: Started, entered at 1638868969.8887343, 0.0 spent on 0 events INFO 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped enter Reset INFO 2021-12-07 09:23:18,478 [shard 0] osd - Entering state: Reset DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=29 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped set_last_peering_reset 33 DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped Clearing blocked outgoing recovery messages DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped Beginning to block outgoing recovery messages DEBUG 2021-12-07 09:23:18,478 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Reset>: Reset advmap DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( 
v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped new interval newup {1} newacting {2} DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped state<Reset>: should restart peering, calling start_peering_interval again DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [0,1]/[2] r=-1 lpr=33 pi=[18,29)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped set_last_peering_reset 33 DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped start_peering_interval: check_new_interval output: check_new_interval interval(29-32 up {0, 1}(0) acting {2}(2)) up_thru 30 up_from 28 last_epoch_clean 19 interval(29-32 up {0, 1}(0) acting {2}(2) maybe_went_rw) : primary up 28-30 includes interval DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=29) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped noting past ([18,32] all_participants=0,1,2,3 intervals=([26,28] acting 0,1),([29,32] acting 2)) DEBUG 2021-12-07 09:23:18,479 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 
(0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval upacting_features 0x3f01cfbb7ffdffff from {2}+{1} DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval checking missing set deletes flag. missing = missing(135 may_include_deletes = 1) DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped init_hb_stamps now {} DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped on_new_interval prior_readable_until_ub 0.000000000s (mnow 145.976531982s + 0.000000000s) INFO 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 luod=0'0 crt=19'236 mlcod 0'0 active+remapped start_peering_interval up {0, 1} -> {1}, acting {2} -> {2}, acting_primary 2 -> 2, up_primary 0 -> 1, role -1 -> -1, features acting 4540138303579357183 upacting 4540138303579357183 DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped clear_primary_state DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - on_change, pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 
les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped DEBUG 2021-12-07 09:23:18,480 [shard 0] osd - pg_epoch 33 pg[2.1( v 19'236 lc 19'101 (0'0,19'236] local-lis/les=0/0 n=0 ec=12/12 lis/c=29/18 les/c/f=30/19/0 sis=33) [1]/[2] r=-1 lpr=33 pi=[18,33)/2 crt=19'236 mlcod 0'0 remapped NOTIFY check_recovery_sources no source osds () went down ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
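The discard rule described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crimson code: epochs are modeled as plain integers, and `PGState`, `RecoveryMessage`, `should_discard` and `example_discard` are hypothetical names. Only `last_peering_reset` is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: epochs are modeled as plain unsigned integers.
using epoch_t = std::uint32_t;

// Hypothetical stand-in for the per-PG peering state.
struct PGState {
  epoch_t last_peering_reset;
};

// Hypothetical stand-in for a recovery message such as MOSDPGPush or MOSDPGPull.
struct RecoveryMessage {
  epoch_t epoch_sent;  // map epoch at which the sender sent the message
};

// A message sent before the PG last restarted peering belongs to an
// older interval, so it should be dropped instead of applied.
inline bool should_discard(const PGState& pg, const RecoveryMessage& m) {
  return m.epoch_sent < pg.last_peering_reset;
}

// Usage sketch with the values that appear in the log above:
// a push tagged with epoch 32 arrives after last_peering_reset became 33.
inline void example_discard() {
  const PGState pg{33};
  assert(should_discard(pg, RecoveryMessage{32}));   // stale, drop it
  assert(!should_discard(pg, RecoveryMessage{33}));  // current, process it
}
```

With such a check in front of the push handler, the stale push from the log would presumably be dropped before reaching `PeeringState::recover_got`, where the assertion fired.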
* | | | Merge pull request #44275 from fengchunsong/dpdk-nic  Kefu Chai  2021-12-26  2  -0/+17
    msg/async/dpdk: add NIC whitelist configuration

    Reviewed-by: Kefu Chai <tchaikov@gmail.com>
| * | | | msg/async/dpdk: add NIC whitelist configuration  Chunsong Feng  2021-12-25  2  -0/+17
    Each DPDK process uses some exclusive network adapters. The network adapters to be used are specified in the whitelist to prevent data sharing between multiple DPDK processes. The following is an example:

    1) Configure a single NIC:
       -a 0000:7d:010 or --allow=0000:7d:010
    2) Configure the bond network adapter:
       ms_dpdk_port_options=--allow=0000:7d:01.0 --allow=0000:7d:02.6 --vdev=net_bonding0,mode=2,slave=0000:7d:01.0,slave=0000:7d:02.6

    Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
    Reviewed-by: luo rixin <luorixin@huawei.com>
    Reviewed-by: Han Fengzhe <hanfengzhe@hisilicon.com>
* | | | | msg/async/dpdk:add the handling of DPDK initialization failure  Chunsong Feng  2021-12-26  1  -8/+11
    If rte_eal_init returns with failure, the waiting msgr-worker thread is woken up for exception handling.

    Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
    Reviewed-by: luo rixin <luorixin@huawei.com>
    Reviewed-by: Han Fengzhe <hanfengzhe@hisilicon.com>
* | | | | Merge pull request #44378 from cyx1231st/wip-seastore-onode-cleanup  Yingxin  2021-12-24  3  -24/+19
    crimson/os/seastore: fix potential leak for onodes to live across transactions

    Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>
    Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
| * | | | crimson/os/seastore: fix potential leak for onodes to live across transactions  Yingxin Cheng  2021-12-21  3  -24/+19
    Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
* | | | | Merge pull request #44311 from ifed01/wip-ifed-cleanup-onode-pin  Yuri Weinstein  2021-12-24  1  -19/+3
    os/bluestore: get rid of fake onode nref increment for pinned entry

    Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
| * | | | | os/bluestore: get rid of fake onode nref increment for pinned entry  Igor Fedotov  2021-12-20  1  -19/+3
    Looks like this isn't necessary any more after fixing https://tracker.ceph.com/issues/53002

    Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
* | | | | | Merge pull request #44303 from 5cs/fix-new-state-weight  Yuri Weinstein  2021-12-24  1  -4/+0
    mon/OSDMonitor: fix incorrect op between osd state and weight

    Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
| * | | | | | mon/OSDMonitor: fix incorrect op between osd state and weight  Tongliang Deng  2021-12-15  1  -4/+0
    The cleanup in f994925908c720f8e0b3fda2f1fc51ef6e757de3 removes osd_epochs if an osd is marked out, but the out status is stored in new_weight, not in new_state. The osd_epochs of osds marked out are handled properly in 0ecef75210113ffba1b89c061cd470ee321d7a45. Thus we delete it.

    Signed-off-by: Tongliang Deng <dengtongliang@gmail.com>
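The distinction the commit message draws between `new_state` and `new_weight` can be sketched as below. This is a simplified model, not the OSDMonitor code: the `Incremental` struct here is a hypothetical reduction to two maps, and the convention that a weight of 0 means "out" is an assumption of this sketch. Only the two field names come from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Assumption: a reduced model of an incremental OSD map update.
struct Incremental {
  std::map<int, std::uint32_t> new_state;   // per-OSD state bits being changed
  std::map<int, std::uint32_t> new_weight;  // per-OSD in/out weight being set
};

// Assumption of this sketch: a weight of 0 marks an OSD as out.
constexpr std::uint32_t WEIGHT_OUT = 0;

// "Marked out" is answered by looking at new_weight only;
// an entry in new_state says nothing about in/out.
inline bool is_marked_out(const Incremental& inc, int osd) {
  const auto it = inc.new_weight.find(osd);
  return it != inc.new_weight.end() && it->second == WEIGHT_OUT;
}

// Usage sketch: osd.3 is marked out through its weight, while osd.4
// only has a state change pending.
inline void example_marked_out() {
  Incremental inc;
  inc.new_weight[3] = WEIGHT_OUT;
  inc.new_state[4] = 1;
  assert(is_marked_out(inc, 3));
  assert(!is_marked_out(inc, 4));
}
```

A check written against `new_state` would miss osd.3 in this model, which is the kind of mismatch the commit title refers to.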
* | | | | | Merge pull request #44352 from SMIL-Infra/journald  Yuri Weinstein  2021-12-23  1  -2/+3
    common: fix fmt::format_to deprecated warning

    Reviewed-by: Kefu Chai <kchai@redhat.com>
| * | | | | | common: fix fmt::format_to deprecated warning  胡玮文  2021-12-19  1  -2/+3
    Signed-off-by: 胡玮文 <huww98@outlook.com>
* | | | | | | Merge pull request #42896 from ifed01/wip-ifed-bluefs-improve-log  Yuri Weinstein  2021-12-23  1  -0/+2
    os/bluestore: dump alloc unit size on bluefs allocation failure.

    Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
| * | | | | | | os/bluestore: dump alloc unit size on bluefs allocation failure.  Igor Fedotov  2021-08-23  1  -0/+2
    Signed-off-by: Igor Fedotov <ifedotov@suse.com>
* | | | | | | | Merge pull request #44353 from tchaikov/wip-str-map  Yuri Weinstein  2021-12-23  4  -73/+81
    common/str_map: reimplement get_str_list() using for_each_pair

    Reviewed-by: Adam Emerson <aemerson@redhat.com>
    Reviewed-by: Casey Bodley <cbodley@redhat.com>
| * | | | | | | | mgr: move DaemonState methods into .cc  Kefu Chai  2021-12-19  2  -42/+46
    for faster compilation, and for better readability.

    Signed-off-by: Kefu Chai <tchaikov@gmail.com>
| * | | | | | | | mgr: refactor DaemonState::set_metadata()  Kefu Chai  2021-12-19  1  -17/+19
    restructure DaemonState::set_metadata() using for_each_pair() for better readability.

    Signed-off-by: Kefu Chai <tchaikov@gmail.com>
| * | | | | | | | common/str_map: reimplement get_str_list() using for_each_pair  Kefu Chai  2021-12-19  2  -33/+35
    to avoid creating a temporary list<string> and then dropping it on the floor after iterating through it for collecting the kv pairs in it.

    Signed-off-by: Kefu Chai <tchaikov@gmail.com>
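The idea of visiting key/value pairs directly, instead of first collecting tokens into a temporary list, can be sketched as below. This is not Ceph's `for_each_pair`, whose signature is not shown in this log: `visit_pairs`, `parse_kv`, `example_pairs` and the delimiter set are hypothetical choices made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: calls f(key, value) once per "key=value" token in s.
// Tokens are separated by any character in delims; a token without '='
// is reported with an empty value. No intermediate list of tokens is built.
template <typename F>
void visit_pairs(const std::string& s, const std::string& delims, F&& f) {
  std::size_t pos = 0;
  // Skip delimiters, then stop when nothing but delimiters is left.
  while ((pos = s.find_first_not_of(delims, pos)) != std::string::npos) {
    std::size_t end = s.find_first_of(delims, pos);
    if (end == std::string::npos) {
      end = s.size();
    }
    const std::string token = s.substr(pos, end - pos);
    const std::size_t eq = token.find('=');
    if (eq == std::string::npos) {
      f(token, std::string());
    } else {
      f(token.substr(0, eq), token.substr(eq + 1));
    }
    pos = end;
  }
}

// Collects the visited pairs straight into the destination map.
inline std::map<std::string, std::string> parse_kv(const std::string& s) {
  std::map<std::string, std::string> out;
  visit_pairs(s, ",; \t\n",
              [&out](const std::string& k, const std::string& v) { out[k] = v; });
  return out;
}

// Usage sketch.
inline void example_pairs() {
  std::map<std::string, std::string> m = parse_kv("a=1, b=2;c");
  assert(m.size() == 3);
  assert(m["a"] == "1");
  assert(m["b"] == "2");
  assert(m["c"].empty());
}
```

This version still copies each key and value into a `std::string`; a C++17 variant could pass `std::string_view` to the callback to avoid those copies as well.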
* | | | | | | | | Merge pull request #44330 from rzarzynski/wip-osd-drop-includes  Yuri Weinstein  2021-12-23  1  -6/+0
    osd: drop unnecessary recovery / scrub includes from OSD.cc.

    Reviewed-by: Neha Ojha <nojha@redhat.com>
| * | | | | | | | | osd: drop unnecessary recovery / scrub includes from OSD.cc.  Radoslaw Zarzynski  2021-12-16  1  -6/+0
    Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
* | | | | | | | | | Merge pull request #44364 from rhcs-dashboard/e2e-fixups  Ernesto Puerta  2021-12-23  6  -21/+24
    mgr/dashboard: fix timeout error in dashboard cephadm e2e job

    Reviewed-by: Aashish Sharma <aasharma@redhat.com>
    Reviewed-by: Alfonso Martínez <almartin@redhat.com>
    Reviewed-by: Nizamudeen A <nia@redhat.com>
| * | | | | | | | | | mgr/dashboard: fix timeout error in dashboard cephadm e2e job  Nizamudeen A  2021-12-23  6  -21/+24
    1. Fix the timeout error happening in the dashboard e2e job
    2. Take care of the flaky force maintenance check

    Most of the time our test times out while searching for an item in the table. It's because `.clear().type()` sometimes does not clear the content of the search field, so wrong data ends up in the search field and the search runs on that wrong name. To avoid this I am explicitly clearing the search area before typing.

    Fixes: https://tracker.ceph.com/issues/53672
    Signed-off-by: Nizamudeen A <nia@redhat.com>
* | | | | | | | | | | Merge PR #44322 into master  Patrick Donnelly  2021-12-23  1  -2/+2
    * refs/pull/44322/head:
        mds: skip directory size checks for reintegration
        qa: test reintegration with directory limits

    Reviewed-by: Xiubo Li <xiubli@redhat.com>
| * | | | | | | | | | | mds: skip directory size checks for reintegration  Patrick Donnelly  2021-12-15  1  -2/+2
    Directory size will not change.

    Fixes: https://tracker.ceph.com/issues/53619
    Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* | | | | | | | | | | | Merge pull request #44375 from wjwithagen/wjw-fix-missing-utility  Kefu Chai  2021-12-23  1  -0/+1
    common: add missing #include <utility>

    Reviewed-by: Kefu Chai <tchaikov@gmail.com>
| * | | | | | | | | | | common: Fix missing utility include  Willem Jan Withagen  2021-12-21  1  -0/+1
    See: https://en.cppreference.com/w/cpp/utility/move

    Detected on FreeBSD/Clang/libc++:
    /home/jenkins/workspace/ceph-master-compile/src/common/deleter.h:111:43: error: no member named 'move' in namespace 'std'
    impl(deleter next) : refs(1), next(std::move(next)) {}

    Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
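The class of error quoted above arises when code calls `std::move` while relying on some other header to pull in `<utility>`, which not every standard library does. A minimal sketch of the fix pattern follows. `Node`, `make_node` and `example_node` are hypothetical names for this illustration and are not taken from `deleter.h`.

```cpp
#include <cassert>
#include <string>
#include <utility>  // declares std::move; included explicitly, not transitively

// Hypothetical type whose constructor moves its argument into a member,
// similar in shape to the constructor quoted in the error message.
struct Node {
  std::string payload;
  explicit Node(std::string p) : payload(std::move(p)) {}
};

// Hypothetical factory; the assert is a precondition of this sketch only.
inline Node make_node(std::string text) {
  assert(!text.empty());
  return Node(std::move(text));
}

// Usage sketch.
inline void example_node() {
  const Node n = make_node("abc");
  assert(n.payload == "abc");
}
```

Including `<utility>` directly in every file that uses `std::move` keeps the code independent of which headers a particular standard library happens to include internally.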
* | | | | | | | | | | Merge pull request #44365 from tchaikov/wip-seastar  Kefu Chai  2021-12-23  1  -0/+0
    seastar: pick up upstream change which includes cryptopp fix

    Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
| * | | | | | | | | | | seastar: pick up upstream change which includes cryptopp fix  Kefu Chai  2021-12-20  1  -0/+0
    update seastar submodule to upstream master, so we don't need to use our own fork.

    Signed-off-by: Kefu Chai <tchaikov@gmail.com>
* | | | | | | | | | | | Merge PR #44313 into master  Patrick Donnelly  2021-12-22  1  -1/+3
    * refs/pull/44313/head:
        mds: support '~mds{rank number}' for dump tree

    Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
| * | | | | | | | | | | | mds: support '~mds{rank number}' for dump tree  Xiubo Li  2021-12-15  1  -1/+3
    The 'get subtrees' command will show the '~mdsdir' as '~mds{rank}' instead. It's strange that 'dump tree ~mds{rank} depth' doesn't work.

    Signed-off-by: Xiubo Li <xiubli@redhat.com>
* | | | | | | | | | | | | Merge PR #43881 into master  Patrick Donnelly  2021-12-22  2  -0/+8
    * refs/pull/43881/head:
        osdc: add set_error in BufferHead, when split set_error to right

    Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
| * | | | | | | | | | | | | osdc: add set_error in BufferHead, when split set_error to right  jiawd  2021-11-11  2  -0/+8
    Fixes: https://tracker.ceph.com/issues/53227
    Signed-off-by: jiawd <jiawendong@xtaotech.com>
* | | | | | | | | | | | | | Merge pull request #44241 from kamoltat/wip-ksirivad-pool-bulk-flagNeha Ojha2021-12-229-254/+320
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mon: osd pool create <pool-name> with --bulk flag Reviewed-by: Josh Durgin <jdurgin@redhat.com>
| * | | | | | | | | | | | | | pg_autoscaler/test: Modified unit-test for bulk flagKamoltat2021-12-201-131/+195
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modified the unit-test cases to account for bulk flag and remove any `profile` related things. Signed-off-by: Kamoltat <ksirivad@redhat.com>
| * | | | | | | | | | | | | | mon: osd pool create <pool-name> with --bulk flag (Kamoltat, 2021-12-20, files: 8, lines: -123/+125)
    Creating the pool with `--bulk` will allow the pg_autoscaler to use the `scale-down` mode on.
    Creating pool: `ceph osd pool create <pool-name> --bulk`
    Get var: `ceph osd pool get <pool-name> bulk`
    Set var: `ceph osd pool set <pool-name> bulk=true/false/1/0`
    Removed `autoscale_profile` and incorporate bulk flag into calculating `final_pg_target` for each pool.
    bin/ceph osd pool autoscale-status no longer has `PROFILE` column but has `BULK` instead.
    Signed-off-by: Kamoltat <ksirivad@redhat.com>
* | | | | | | | | | | | | | | Merge pull request #44360 from SMIL-Infra/fix-ios-frontend (Ernesto Puerta, 2021-12-22, files: 1, lines: -1/+1)
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
    mgr/dashboard: fix white screen on Safari
    Reviewed-by: Aashish Sharma <aasharma@redhat.com>
    Reviewed-by: Alfonso Martínez <almartin@redhat.com>
    Reviewed-by: huww98 <NOT@FOUND>
    Reviewed-by: Laura Flores <lflores@redhat.com>
    Reviewed-by: Nizamudeen A <nia@redhat.com>
| * | | | | | | | | | | | | | | mgr/dashboard: fix white screen on Safari (胡玮文, 2021-12-20, files: 1, lines: -1/+1)
| | |_|_|/ / / / / / / / / / /
| |/| | | | | | | | | | | | |
    Safari do not support lookbehind in regular expression.
    Fixes: https://tracker.ceph.com/issues/53665
    Signed-off-by: 胡玮文 <huww98@outlook.com>
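    Note on the Safari fix above: the commit message says only that lookbehind is unsupported, not which pattern was changed. A minimal sketch of one common workaround is to match the prefix as part of the pattern and read a capture group instead. The helper name `valueAfterPrefix` and the sample strings are hypothetical and are not taken from the dashboard source.

```typescript
// Hypothetical helper, not the dashboard's actual code.
// A lookbehind pattern such as /(?<=pool=)\w+/ matches only the word after
// "pool=". The same value can be obtained without lookbehind by matching
// the prefix itself and capturing what follows it.
function valueAfterPrefix(input: string, prefix: string): string | null {
  // Escape regex metacharacters so the prefix is matched literally.
  const escaped = prefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Group 1 holds the word characters immediately after the prefix.
  const match = new RegExp(escaped + "(\\w+)").exec(input);
  return match ? match[1] : null;
}
```

    The capture-group form uses only long-standing regular expression syntax, so it does not depend on lookbehind support in the browser.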
* | | | | | | | | | | | | | | Merge pull request #43598 from ideepika/wip-opentelemetry (Josh Durgin, 2021-12-22, files: 8, lines: -37/+17)
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
    migrate from using opentracing-cpp to opentelemetry-cpp static as distributed tracing API
    Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
    Reviewed-by: Josh Durgin <jdurgin@redhat.com>
| * | | | | | | | | | | | | | | src/*/CMakeLists: update jaeger-base > jaeger_base (Deepika Upadhyay, 2021-11-24, files: 6, lines: -10/+12)
    jaeger-base encapsulated dependency for opentelemetry tracing libraries, when linked will provide support for tracing for the ceph target.
    Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
| * | | | | | | | | | | | | | | cmake: cleanup BuildJaeger as external project (Deepika Upadhyay, 2021-11-24, files: 1, lines: -13/+0)
    replaced by Opentelemetry project, remove building and linking of jaeger libraries.
    * remove externalProject building
    * linking of jaegertracing dependencies to ceph project
    Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
| * | | | | | | | | | | | | | | debian: remove libjaeger packaging (Deepika Upadhyay, 2021-11-24, files: 1, lines: -2/+2)
    we no longer need to package libjaeger, as we would be using libopentelemetry static libraries(until we support shared).
    Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>