author | Sage Weil <sage@inktank.com> | 2012-08-20 21:33:08 +0200 |
---|---|---|
committer | Sage Weil <sage@inktank.com> | 2012-08-21 00:04:37 +0200 |
commit | dd4c1dc9f9dae43e4761caca049bfe7361d9ebfb (patch) | |
tree | d3fc8ee398250d4f9a32f3cb06a31035d84e14af | |
parent | osd: make notify debug output less noisy (diff) | |
osd: fix requeue order of dup ops

The waiting_for_ondisk (and ack) maps get dups of ops that are in progress.
If we have a peering change in which the role does not change, we will
requeue the in-progress ops but leave these in the waiting_for_ondisk
maps, which will then trigger an assert the next time we examine that map
and find it didn't match up with what we expected.

Fix this by requeuing these on any peering reset in on_change(). This
keeps the two queues in sync.

Fixes: #2956
Signed-off-by: Sage Weil <sage@inktank.com>

-rw-r--r-- | src/osd/ReplicatedPG.cc | 18 |
1 files changed, 9 insertions, 9 deletions
diff --git a/src/osd/ReplicatedPG.cc b/src/osd/ReplicatedPG.cc
index 50cde51a948..296bffb4387 100644
--- a/src/osd/ReplicatedPG.cc
+++ b/src/osd/ReplicatedPG.cc
@@ -5793,6 +5793,15 @@ void ReplicatedPG::on_change()
   requeue_ops(waiting_for_all_missing);
   waiting_for_all_missing.clear();
 
+  // take commit waiters; these are dups of what
+  // apply_and_flush_repops() will requeue.
+  for (map<eversion_t, list<OpRequestRef> >::iterator p = waiting_for_ondisk.begin();
+       p != waiting_for_ondisk.end();
+       p++)
+    requeue_ops(p->second);
+  waiting_for_ondisk.clear();
+  waiting_for_ack.clear();
+
   // this will requeue ops we were working on but didn't finish
   apply_and_flush_repops(is_primary());
 
@@ -5808,18 +5817,9 @@ void ReplicatedPG::on_change()
 void ReplicatedPG::on_role_change()
 {
   dout(10) << "on_role_change" << dendl;
-
-  // take commit waiters
-  for (map<eversion_t, list<OpRequestRef> >::iterator p = waiting_for_ondisk.begin();
-       p != waiting_for_ondisk.end();
-       p++)
-    requeue_ops(p->second);
-  waiting_for_ondisk.clear();
-  waiting_for_ack.clear();
 }
 
 
-
 // clear state. called on recovery completion AND cancellation.
 void ReplicatedPG::_clear_recovery_state()
 {
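
The effect of moving the loop into on_change() can be illustrated with a small standalone model. This is only a sketch, not Ceph code: `ToyPG`, `Op`, `in_progress`, `on_change_before_patch_same_role()` and `on_change_after_patch()` are hypothetical names, ops are plain strings, and the model assumes that requeuing places ops at the front of the queue, which the commit does not state. Under that assumption, requeuing the dup waiters first and the in-progress ops second leaves each original ahead of its dup.

```cpp
#include <cassert>
#include <deque>
#include <list>
#include <map>
#include <string>

// Toy stand-in for an op reference; not the real OpRequestRef.
using Op = std::string;

// Hypothetical, heavily simplified model of the queues named in the commit.
struct ToyPG {
  std::deque<Op> op_queue;                          // ops waiting to be reprocessed
  std::list<Op> in_progress;                        // ops whose repop has not finished
  std::map<int, std::list<Op>> waiting_for_ondisk;  // dup ops, keyed by a toy version

  // ASSUMPTION: requeuing puts the ops at the front of the queue and keeps
  // their relative order. The source list is emptied.
  void requeue_ops(std::list<Op>& ls) {
    op_queue.insert(op_queue.begin(), ls.begin(), ls.end());
    ls.clear();
  }

  // Before the patch, with an unchanged role: only the in-progress ops are
  // requeued, so the dup waiters stay behind in the map.
  void on_change_before_patch_same_role() {
    requeue_ops(in_progress);  // stands in for apply_and_flush_repops()
  }

  // After the patch: take the dup waiters first, then the in-progress ops.
  void on_change_after_patch() {
    for (auto& entry : waiting_for_ondisk)
      requeue_ops(entry.second);
    waiting_for_ondisk.clear();
    requeue_ops(in_progress);  // stands in for apply_and_flush_repops()
  }

  // The condition the commit message calls keeping the two queues in sync:
  // no dup waiters may remain once nothing is in progress.
  bool in_sync() const {
    return !in_progress.empty() || waiting_for_ondisk.empty();
  }
};
```

With one in-progress op and one dup waiter, the pre-patch path in this model leaves `in_sync()` false, while the post-patch path empties the map and queues the original ahead of its dup.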