| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update these in config.libsonnet:
- NVMeoFMaxGatewaysPerGroup (4->8)
- NVMeoFMaxGatewaysPerCluster (4->32)
- NVMeoFMaxNamespaces (1024->2048)
- NVMeoFHighClientCount (32->128)
Also update prometheus_alerts.yml and test_alerts.yml
accordingly.
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
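As an illustration of the threshold changes listed above, here is a minimal Python sketch. It is not the content of config.libsonnet, whose structure is not shown here. The table name `THRESHOLDS` and the helper `exceeds_limit` are hypothetical, and the "strictly greater than" comparison is an assumption about how the alerts use these limits.

```python
# (old, new) limits as listed in the commit message above.
THRESHOLDS = {
    "NVMeoFMaxGatewaysPerGroup": (4, 8),
    "NVMeoFMaxGatewaysPerCluster": (4, 32),
    "NVMeoFMaxNamespaces": (1024, 2048),
    "NVMeoFHighClientCount": (32, 128),
}


def exceeds_limit(name: str, observed: int, use_new: bool = True) -> bool:
    """Return True if `observed` is above the selected limit.

    Hypothetical helper: assumes an alert fires on values strictly
    greater than the configured limit.
    """
    old, new = THRESHOLDS[name]
    return observed > (new if use_new else old)


if __name__ == "__main__":
    # Six gateways in a group were over the old limit but are within the new one.
    print(exceeds_limit("NVMeoFMaxGatewaysPerGroup", 6, use_new=False))
    print(exceeds_limit("NVMeoFMaxGatewaysPerGroup", 6))
```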
|
|
|
|
|
|
|
|
| |
NVMeoFMultipleNamespacesOfRBDImage alerts the user if an RBD image
is used for multiple namespaces. This is an important alert for cases
where namespaces are created on the same image for different gateway groups.
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
|
|\
| |
| |
| |
| | |
mgr/dashboard: Add ceph_daemon filter to rgw overview grafana panel queries
Reviewed-by: Afreen Misbah <afreen@ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
queries
Currently rgw_servers filtering is not working in the RGW Overview Grafana graphs.
They show data for all RGW services, even when the filter is set to a single service.
This PR intends to solve this issue.
Fixes: https://tracker.ceph.com/issues/69074
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
|
| |
NVMeoFTooManyNamespaces alerts the user if the total
number of namespaces across subsystems is more than
1024.
Change NVMeoFTooManySubsystems limit to 128 from 16.
Fixes: https://github.com/ceph/ceph-nvmeof/issues/948
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
|
|
|
|
|
|
|
| |
Add test for alerts NVMeoFMissingListener and
NVMeoFZeroListenerSubsystem to test_alerts.yml.
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
|
|
|
|
|
|
|
| |
Add NVMeoFMissingListener and NVMeoFZeroListenerSubsystem
alerts to prometheus_alerts.libsonnet.
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
|
|
|
|
|
|
|
|
| |
- `NVMeoFMissingListener`: triggers if listeners have not been
created for every gateway in a subsystem
- `NVMeoFZeroListenerSubsystem`: triggers if a subsystem has no listeners
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
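The two alert conditions described above can be modelled roughly as follows. This is a sketch in Python over hypothetical in-memory data, not the PromQL expressions used by the actual alerts. The function name, the input shapes, and the choice to report a subsystem with zero listeners only under `NVMeoFZeroListenerSubsystem` are all assumptions.

```python
from typing import Dict, List, Set


def listener_alerts(
    gateways: Dict[str, List[str]],
    listeners: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Classify subsystems by listener coverage.

    gateways:  subsystem name -> gateways serving that subsystem
    listeners: subsystem name -> gateways that have a listener defined
    (Both shapes are hypothetical, chosen only for illustration.)
    """
    fired: Dict[str, List[str]] = {
        "NVMeoFMissingListener": [],
        "NVMeoFZeroListenerSubsystem": [],
    }
    for subsystem, gws in gateways.items():
        have = listeners.get(subsystem, set())
        if not have:
            # No listener at all for this subsystem.
            fired["NVMeoFZeroListenerSubsystem"].append(subsystem)
        elif any(gw not in have for gw in gws):
            # Some, but not all, gateways have a listener.
            fired["NVMeoFMissingListener"].append(subsystem)
    return fired


if __name__ == "__main__":
    gateways = {"s1": ["gw1", "gw2"], "s2": ["gw1"], "s3": ["gw1", "gw2"]}
    listeners = {"s1": {"gw1"}, "s3": {"gw1", "gw2"}}
    print(listener_alerts(gateways, listeners))
```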
|
|\
| |
| |
| |
| | |
mgr/dashboard: Add 'Browse Dashboards' button in multi-cluster and ceph-cluster Grafana dashboards
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
ceph-cluster Grafana dashboards
Fixes: https://tracker.ceph.com/issues/68316
Signed-off-by: piyushagarwal1411 <piyushagarwal14.pa@gmail.com>
Signed-off-by: Piyush Agarwal <piyushagarwal14.pa@gmail.com>
|
|\ \
| |/
|/|
| |
| | |
Add multi-cluster support (showMultiCluster=True) to alerts
Reviewed-by: Afreen Misbah <afreen@ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Following PR https://github.com/ceph/ceph/pull/55495, which fixed the
dashboards for multiple clusters storing their metrics
in a single Prometheus instance, this PR addresses the same issues
for alerts.
Fixes: https://tracker.ceph.com/issues/64321
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
|
|/
|
|
|
|
|
|
| |
clusters in Manage-clusters page
Fixes: https://tracker.ceph.com/issues/67192
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|
|
|
|
|
|
|
| |
overview grafana dashboard
Fixes: https://tracker.ceph.com/issues/66994
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|\
| |
| |
| |
| |
| | |
install-deps: Update Pyyaml version
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Move to 6.0.1 to overcome https://github.com/yaml/pyyaml/issues/601
Fixes: https://tracker.ceph.com/issues/63591
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
grafana dashboard
Fixes: https://tracker.ceph.com/issues/65760
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|\ \
| | |
| | |
| | |
| | | |
monitoring/ceph-mixin: Add cluster variable to ceph-cluster.json
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes: https://tracker.ceph.com/issues/65218
Signed-off-by: cloudbehl <cloudbehl@gmail.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
monitoring/ceph-mixin: Cleanup of variables, queries and tests (to fix showMultiCluster=True)
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Rendering the dashboards with showMultiCluster=True allows for
them to work with multiple clusters storing their metrics in a single
Prometheus instance. This works via the cluster label and that functionality
already existed. This just fixes some inconsistencies in applying the label
filters.
Additionally, this contains updates to the tests so that they succeed
with both configurations and avoid the introduction of regressions
with regard to multiCluster in the future.
There also are some consistency cleanups here and there:
* `datasource` was not used consistently
* `cluster` label_values are determined from `ceph_health_status`
* `job` template and filters on this label were removed to align multi cluster
support solely via the `cluster` label
* `ceph_hosts` filter now uses label_values from any ceph_metadata metric
to not show all instance values, but only those of hosts with some Ceph
component / daemon.
* Enable showMultiCluster=True since `cluster` label is now always present,
via https://github.com/ceph/ceph/pull/54964
Improves: https://tracker.ceph.com/issues/64321
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
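To illustrate what applying the `cluster` label filter consistently means, here is a small Python sketch that builds a PromQL selector with or without the cluster matcher. It is not the mixin's jsonnet code. The helper names, the `$cluster` Grafana variable name, and the example metric `ceph_osd_up` are assumptions made for illustration.

```python
def label_matchers(show_multi_cluster: bool, **labels: str) -> str:
    """Build the matcher list for a PromQL selector.

    Hypothetical helper: when show_multi_cluster is True, every query
    gets the same cluster matcher in front of its other matchers.
    """
    parts = []
    if show_multi_cluster:
        parts.append('cluster=~"$cluster"')  # assumed Grafana variable name
    parts.extend(f'{name}=~"{value}"' for name, value in labels.items())
    return ", ".join(parts)


def query(metric: str, show_multi_cluster: bool, **labels: str) -> str:
    """Return `metric{...}`, or the bare metric name if there are no matchers."""
    matchers = label_matchers(show_multi_cluster, **labels)
    return f"{metric}{{{matchers}}}" if matchers else metric


if __name__ == "__main__":
    print(query("ceph_health_status", True))
    print(query("ceph_osd_up", True, ceph_daemon="osd.0"))
    print(query("ceph_osd_up", False))
```

Routing every query through one helper is what keeps the filter consistent: a panel cannot forget the cluster matcher if it never writes matchers by hand.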
|
|/ / /
| | |
| | |
| | |
| | |
| | | |
Recommendation from the nvmeof team
Signed-off-by: Adam King <adking@redhat.com>
|
|/ /
| |
| |
| |
| |
| |
| |
| | |
table panel
Fixes: https://tracker.ceph.com/issues/65174
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|\ \
| |/
|/|
| |
| | |
mgr/dashboard: introduce multi cluster management and monitoring in ceph dashboard
Reviewed-by: Nizamudeen A <nia@redhat.com>
|
| |
| |
| |
| |
| |
| | |
https://tracker.ceph.com/issues/64530
Signed-off-by: Nizamudeen A <nia@redhat.com>
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
list/connect/disconnect/edit clusters in multi-cluster setup
Fixes: https://tracker.ceph.com/issues/64530
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | | |
ceph-mixin: Update mixin to include alerts for the nvmeof gateway(s)
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
|
| |/
|/|
| |
| |
| |
| |
| |
| | |
panel
Fixes: https://tracker.ceph.com/issues/64579
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | | |
mgr/dashboard: Fixing RGW graph panels
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
- Fixing grafana panels for rgw dashboards
- Fixing RGW overview dashboard queries
fixes https://tracker.ceph.com/issues/64177
Signed-off-by: cloudbehl <cloudbehl@gmail.com>
|
| |/
|/|
| |
| |
| |
| | |
Fixes: https://tracker.ceph.com/issues/64359
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|/
|
|
|
|
| |
This adds new hardware monitoring alerts.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
|
|
|
|
|
|
|
|
|
|
| |
'timeseries' panel
The graph panel type is deprecated and disappears after Grafana v9.1 (the current version is 10.0) to prevent more old-type panels from being created. Existing graph panels should be migrated to the timeseries panel type to avoid potential problems with future Grafana versions.
Fixes: https://tracker.ceph.com/issues/61720
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|
|
|
|
|
| |
Fixes the CVEs mentioned here: https://grafana.com/blog/2023/06/06/grafana-security-release-new-grafana-versions-with-security-fixes-for-cve-2023-2183-and-cve-2023-2801/
Signed-off-by: Nizamudeen A <nia@redhat.com>
|
|\
| |
| |
| |
| |
| | |
mgr/dashboard: info on why RBD graphs are empty
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The RBD IO statistics graphs are empty out of the box, and this is on
purpose. Instead of giving the impression that those graphs are broken,
point users to the documentation that explains the optional steps to enable
those statistics.
https://docs.ceph.com/en/latest/mgr/prometheus/#rbd-io-statistics
Signed-off-by: Nobuto Murata <nobuto.murata@canonical.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Javlopez/feature/12087-upgrade-and-generate-grafana-dashboards
monitoring: add new dashboards
Fixes: https://tracker.ceph.com/issues/63592
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
add ceph-cluster.libsonnet file to generate ceph-cluster.json
Fixes: https://tracker.ceph.com/issues/61443
Signed-off-by: Javier <sjavierlopez@gmail.com>
|
|\ \ \
| |_|/
|/| |
| | |
| | | |
mgr/dashboard: Show the OSDs Out and Down panels as red whenever an OSD is in Out or Down state in Ceph Cluster grafana dashboard
Reviewed-by: Nizamudeen A <nia@redhat.com>
|
| |/
| |
| |
| |
| |
| |
| |
| | |
in Out or Down state in Ceph Cluster grafana dashboard
Fixes: https://tracker.ceph.com/issues/62969
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | | |
mgr/dashboard: Consider null values as zero in grafana panels
Reviewed-by: Nizamudeen A <nia@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
After upgrading from RHCS4 to RHCS5, some of the Grafana charts broke.
This is because in RHCS5 we do not generate a metric if its value is
zero. As a result, the null value from that metric breaks the Grafana
charts or graphs. This PR fixes the above-mentioned issue.
Fixes: https://tracker.ceph.com/issues/63088
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
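The effect of treating null values as zero can be sketched as below. The commit message does not say whether the fix lives in the queries or in the panel options, so this Python snippet only models the intended result on hypothetical sample data. The function name and input shape are assumptions.

```python
from typing import Dict, List, Optional


def null_as_zero(
    samples: Dict[int, Optional[float]],
    timestamps: List[int],
) -> List[float]:
    """Return one value per timestamp, using 0.0 where the metric is absent or null.

    Hypothetical helper: `samples` maps a timestamp to a value, and a
    timestamp may be missing entirely when the metric was not exported.
    """
    values = []
    for ts in timestamps:
        value = samples.get(ts)
        values.append(0.0 if value is None else value)
    return values


if __name__ == "__main__":
    # The metric is missing at t=10 and explicitly null at t=20.
    print(null_as_zero({0: 1.5, 20: None}, [0, 10, 20]))
```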
|
| |/
|/|
| |
| |
| |
| |
| |
| | |
Currently the alert generator is broken if you try to run
`tox -ealerts-fix`. I fixed it and ran the command, and it built a new JSON
file as well.
Signed-off-by: Nizamudeen A <nia@redhat.com>
|
|\ \
| |/
|/| |
ceph-mixin: Add RBD Mirror monitoring alerts
|