| Commit message (Collapse) | Author | Age | Files | Lines |
|
The only service this provided was mon space notifications, which
are now handled explicitly by the new HealthMonitor's
check_member_health(), and communicated by the new MMonHealthChecks.
Signed-off-by: Sage Weil <sage@redhat.com>
|
Signed-off-by: Sage Weil <sage@redhat.com>
|
This is used to dump extra weirdness to the health detail structured
output, but we are about to remove all of that in luminous.
Signed-off-by: Sage Weil <sage@redhat.com>
|
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
|
Same work as PR: https://github.com/ceph/ceph/pull/9161
Signed-off-by: Xiaowei Chen <chen.xiaowei@h3c.com>
|
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
|
Refactor the get_health() methods to always take both a summary and detail.
Eliminate the return value and pull that directly from the summary, as we
already do with the PaxosServices.
Signed-off-by: Sage Weil <sage@inktank.com>
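The refactor described above can be sketched roughly as follows. This is a minimal illustration, not the actual Ceph code: `health_entry`, `health_list`, `ExampleService` and `overall_status()` are hypothetical names, and the enum ordering (lower value is worse) is an assumption made so the overall status can be read off the summary.

```cpp
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-ins for the monitor's health types (assumption:
// a lower enum value means a worse status).
enum health_status_t { HEALTH_ERR = 0, HEALTH_WARN = 1, HEALTH_OK = 2 };
using health_entry = std::pair<health_status_t, std::string>;
using health_list = std::list<health_entry>;

// Illustrative service: get_health() always receives a summary and an
// optional detail list, and returns nothing.
struct ExampleService {
  bool low_space;

  void get_health(health_list& summary, health_list* detail) const {
    if (!low_space)
      return;
    summary.emplace_back(HEALTH_WARN, "mon is low on disk space");
    if (detail)
      detail->emplace_back(HEALTH_WARN, "mon.a: 25% avail");
  }
};

// The overall status is derived from the summary instead of a return
// value: it is the worst status among the summary entries.
health_status_t overall_status(const health_list& summary) {
  health_status_t worst = HEALTH_OK;
  for (const auto& entry : summary) {
    if (entry.first < worst)
      worst = entry.first;
  }
  return worst;
}
```

With this shape, a caller fills one summary (and optionally one detail list) across all services and computes the overall status once at the end.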
|\
mon: Early warning system for monitor stores growing over predefined threshold
Reviewed-by: Sage Weil <sage@inktank.com>
| |
If the store's size grows beyond what we believe to be reasonable, we must
let the user know that something fishy may be going on. This is intended to
act as an early warning system for monitors suffering from leveldb
compaction issues. However, in case the monitor's store is just growing a
lot due to normal cluster behaviour, the warning threshold is adjustable
by tuning 'mon_leveldb_size_warn' (defaulting to 40GB).
Fixes: #5909
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
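A minimal sketch of the threshold check, under stated assumptions: `store_too_big()` and `kDefaultSizeWarn` are hypothetical names, and the 40GB default from the message above is interpreted here as 40 GiB. How the monitor actually measures its store size is not shown.

```cpp
#include <cstdint>

// Assumption: "40GB" is taken as 40 GiB for this illustration.
constexpr std::uint64_t kDefaultSizeWarn = 40ULL * 1024 * 1024 * 1024;

// Returns true when the store has grown beyond the warning threshold.
// size_warn plays the role of the 'mon_leveldb_size_warn' option.
bool store_too_big(std::uint64_t store_bytes,
                   std::uint64_t size_warn = kDefaultSizeWarn) {
  return store_bytes > size_warn;
}
```

A cluster whose store legitimately grows large would pass a higher `size_warn` rather than disable the check.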
|/
Keep consistency in the code to not generate warnings of this type.
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
|
Switch to using regular pointers here. The lifecycle of these services is
very simple such that refcounting is overkill.
Signed-off-by: Sage Weil <sage@inktank.com>
|
This allows us to return the appropriate overall health status on
Monitor::get_health().
Fixes: #4574
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
|
Only warn once per percentage point per epoch.
Signed-off-by: Sage Weil <sage@inktank.com>
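One way to read "once per percentage point per epoch" is sketched below. `SpaceWarnLimiter` is a hypothetical helper, not the monitor's actual bookkeeping: it remembers which percentage values have already produced a warning in the current epoch and forgets them when the epoch changes.

```cpp
#include <cstdint>
#include <set>

// Hypothetical rate limiter for low-space warnings.
class SpaceWarnLimiter {
  std::uint64_t epoch_ = 0;
  std::set<int> warned_;  // percentage points already warned about this epoch

public:
  // Returns true if a warning should be emitted for this (epoch, percent).
  bool should_warn(std::uint64_t epoch, int avail_percent) {
    if (epoch != epoch_) {
      // New epoch: start over so each percentage point may warn once again.
      epoch_ = epoch;
      warned_.clear();
    }
    // insert().second is false when the value was already present.
    return warned_.insert(avail_percent).second;
  }
};
```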
|
Being unable to run a ::statfs() may be a symptom of something bigger.
We want to cleanly shut down the monitor ASAP if such a thing happens.
Fixes: #4509
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
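The policy above can be illustrated with a small sketch, assuming Linux (`<sys/vfs.h>`). `get_avail_percent()`, `Action` and `on_stats_update()` are hypothetical names; the real monitor's shutdown path is not reproduced here, only the decision to treat a failed ::statfs() as fatal.

```cpp
#include <sys/vfs.h>  // ::statfs (Linux)

#include <cerrno>
#include <string>

// Returns 0 and fills *avail_percent on success, or a negative errno
// value if ::statfs() fails.
int get_avail_percent(const std::string& path, int* avail_percent) {
  struct statfs st;
  if (::statfs(path.c_str(), &st) != 0)
    return -errno;
  if (st.f_blocks == 0) {
    *avail_percent = 0;  // avoid dividing by zero on odd filesystems
    return 0;
  }
  unsigned long long avail = st.f_bavail;
  unsigned long long total = st.f_blocks;
  *avail_percent = static_cast<int>(avail * 100 / total);
  return 0;
}

enum class Action { Continue, Shutdown };

// A failed ::statfs() is treated as fatal: request a clean shutdown
// rather than carry on with unknown disk state.
Action on_stats_update(const std::string& data_dir) {
  int avail_percent = 0;
  if (get_avail_percent(data_dir, &avail_percent) < 0)
    return Action::Shutdown;
  return Action::Continue;
}
```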
|
The HealthMonitor builds upon the QuorumService interface, and should be
used to keep track of any and all relevant information about the monitor
cluster (maybe even about the whole cluster if need be).
This patch also introduces the HealthService interface, used to define
a HealthMonitor service, responsible for dispatching 'MMonHealth' messages
(the QuorumService interface dispatches generic 'Message').
Based on the HealthService interface, we introduce the DataHealthService
class, a service that will track disk space consumption by the monitors,
warn when a given threshold is crossed, and gracefully shut down the monitor
if disk space usage hits critical levels that might affect the correct
monitor behavior.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
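The layering described in this message might look roughly like the sketch below. Only the class and message names come from the message itself; every member, the `dispatch_health()` hook, the `State` enum and the threshold values are assumptions made for illustration and do not mirror the real interfaces.

```cpp
// Simplified stand-ins for the message types (assumption: the real
// MMonHealth carries more than a single percentage).
struct Message {
  virtual ~Message() = default;
};
struct MMonHealth : Message {
  int avail_percent = 100;
};

// QuorumService dispatches generic messages.
struct QuorumService {
  virtual ~QuorumService() = default;
  virtual bool dispatch(Message* m) = 0;
};

// HealthService narrows dispatch to MMonHealth messages.
struct HealthService : QuorumService {
  bool dispatch(Message* m) final {
    auto* hm = dynamic_cast<MMonHealth*>(m);
    return hm ? dispatch_health(*hm) : false;
  }

protected:
  virtual bool dispatch_health(MMonHealth& m) = 0;
};

// DataHealthService reacts to disk space reports.
struct DataHealthService : HealthService {
  enum class State { Ok, Warn, Critical };
  State state = State::Ok;
  int warn_percent = 30;     // illustrative threshold
  int critical_percent = 5;  // illustrative threshold

protected:
  bool dispatch_health(MMonHealth& m) override {
    if (m.avail_percent <= critical_percent)
      state = State::Critical;  // the real service would shut down here
    else if (m.avail_percent <= warn_percent)
      state = State::Warn;
    else
      state = State::Ok;
    return true;
  }
};
```

Keeping the type check in `HealthService::dispatch()` means each concrete health service only ever sees `MMonHealth` messages.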
|