ceph - ceph

	Commit message	Author	Age	Files	Lines
*	monitoring: Update nvmeof alert limits in config	Vallari Agrawal	9 days	3	-24/+91
\| \| \| \| \| \| \| \| \| \| \| \| \|	Update these in config.libsonnet: - NVMeoFMaxGatewaysPerGroup (4->8) - NVMeoFMaxGatewaysPerCluster (4->32) - NVMeoFMaxNamespaces (1024->2048) - NVMeoFHighClientCount (32->128) Also update prometheus_alerts.yml and test_alerts.yml accordingly. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
*	monitoring: Add prometheus alert NVMeoFMultipleNamespacesOfRBDImage	Vallari Agrawal	2024-12-18	3	-0/+67
\| \| \| \| \| \| \| \|	NVMeoFMultipleNamespacesOfRBDImage alerts the user if an RBD image is used for multiple namespaces. This is an important alert for cases where namespaces are created on the same image for different gateway groups. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
*	Merge pull request #60873 from rhcs-dashboard/fix-69074-main	afreen23	2024-12-16	3	-12/+18
\|\ \| \| \| \| \| \| \| \|	mgr/dashboard: Add ceph_daemon filter to rgw overview grafana panel queries Reviewed-by: Afreen Misbah <afreen@ibm.com>
\| *	mgr/dashboard: Add ceph_daemon filter to rgw overview grafana panel	Aashish Sharma	2024-12-05	3	-12/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	queries Currently rgw_servers filtering is not working in the RGW Overview Grafana graphs. They show data for all the RGW services, even though the filter is set to a single service. This PR intends to solve this issue. Fixes: https://tracker.ceph.com/issues/69074 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	monitoring: Add alert NVMeoFTooManyNamespaces	Vallari Agrawal	2024-11-19	4	-5/+291
\|/ \| \| \| \| \| \| \| \| \| \| \|	NVMeoFTooManyNamespaces alerts the user if the total number of namespaces across subsystems exceeds 1024. Change the NVMeoFTooManySubsystems limit from 16 to 128. Fixes: https://github.com/ceph/ceph-nvmeof/issues/948 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
*	monitoring: add tests for 2 new nvmeof alerts	Vallari Agrawal	2024-11-11	1	-0/+69
\| \| \| \| \| \| \|	Add test for alerts NVMeoFMissingListener and NVMeoFZeroListenerSubsystem to test_alerts.yml. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
*	monitoring: add 2 new nvmeof alerts	Vallari Agrawal	2024-11-11	1	-0/+20
\| \| \| \| \| \| \|	Add NVMeoFMissingListener and NVMeoFZeroListenerSubsystem alerts to prometheus_alerts.libsonnet. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
*	monitoring: add 2 nvmeof alerts to prometheus_alerts.yaml	Vallari Agrawal	2024-11-11	1	-0/+18
\| \| \| \| \| \| \| \|	- `NVMeoFMissingListener`: trigger if all listeners are not created for each gateway in a subsystem - `NVMeoFZeroListenerSubsystem`: trigger if a subsystem has no listeners Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
*	Merge pull request #60100 from piyushagarwal1411/fix-68316-main	Aashish Sharma	2024-11-05	4	-2/+66
\|\ \| \| \| \| \| \| \| \|	mgr/dashboard: Add 'Browse Dashboards' button in multi-cluster and ceph-cluster Grafana dashboards Reviewed-by: Aashish Sharma <aasharma@redhat.com>
\| *	mgr/dashboard: Add 'Browse Dashboards' button in multi-cluster and ↵	Piyush Agarwal	2024-10-16	4	-2/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ceph-cluster Grafana dashboards Fixes: https://tracker.ceph.com/issues/68316 Signed-off-by: piyushagarwal1411 <piyushagarwal14.pa@gmail.com> Signed-off-by: Piyush Agarwal <piyushagarwal14.pa@gmail.com>
* \|	Merge pull request #56849 from frittentheke/issue_64321_alerts	afreen23	2024-10-21	4	-748/+771
\|\ \ \| \|/ \|/\| \| \| \| \|	Add multi-cluster support (showMultiCluster=True) to alerts Reviewed-by: Afreen Misbah <afreen@ibm.com>
\| *	Add multi-cluster support (showMultiCluster=True) to alerts	Christian Rohmann	2024-10-21	4	-748/+771
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Following PR https://github.com/ceph/ceph/pull/55495 fixing the dashboard in regards to multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the issues for alerts. Fixes: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
* \|	mgr/dashboard: Add Performance Details grafana charts for individual ↵	Aashish Sharma	2024-08-22	3	-106/+10
\|/ \| \| \| \| \| \| \|	clusters in Manage-clusters page Fixes: https://tracker.ceph.com/issues/67192 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
*	mgr/dashboard: Add a new chart for replication delta per shard in rgw sync ↵	Aashish Sharma	2024-07-17	2	-0/+133
\| \| \| \| \| \| \| \|	overview grafana dashboard Fixes: https://tracker.ceph.com/issues/66994 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
*	Merge pull request #56014 from badone/wip-tracker-63591-pyyaml-cython_sources	Nizamudeen A	2024-05-21	3	-3/+3
\|\ \| \| \| \| \| \| \| \| \| \|	install-deps: Update Pyyaml version Reviewed-by: Ankush Behl <cloudbehl@gmail.com> Reviewed-by: Nizamudeen A <nia@redhat.com>
\| *	install-deps: Update Pyyaml version	Brad Hubbard	2024-03-07	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move to 6.0.1 to overcome https://github.com/yaml/pyyaml/issues/601 Fixes: https://tracker.ceph.com/issues/63591 Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
* \|	mgr/dashboard: fix cluster filter typo in multi-cluster-overview	Aashish Sharma	2024-05-02	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	grafana dashboard Fixes: https://tracker.ceph.com/issues/65760 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	Merge pull request #56575 from cloudbehl/ceph-cluster-json-update	Aashish Sharma	2024-05-02	1	-29/+51
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	monitoring/ceph-mixin: Add cluster variable to ceph-cluster.json Reviewed-by: Aashish Sharma <aasharma@redhat.com>
\| * \|	monitoring/ceph-mixin: Add cluster variable to ceph-cluster.json	cloudbehl	2024-03-29	1	-29/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: https://tracker.ceph.com/issues/65218 Signed-off-by: cloudbehl <cloudbehl@gmail.com>
* \| \|	Merge pull request #55495 from frittentheke/issue_64321	Nizamudeen A	2024-05-02	38	-1692/+1457
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	monitoring/ceph-mixin: Cleanup of variables, queries and tests (to fix showMultiCluster=True) Reviewed-by: Aashish Sharma <aasharma@redhat.com> Reviewed-by: Ankush Behl <cloudbehl@gmail.com> Reviewed-by: Nizamudeen A <nia@redhat.com>
\| * \| \|	Cleanup of variables, queries and tests to enable showMultiCluster=True	Christian Rohmann	2024-04-22	38	-1692/+1457
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rendering the dashboards with showMultiCluster=True allows for them to work with multiple clusters storing their metrics in a single Prometheus instance. This works via the cluster label and that functionality already existed. This just fixes some inconsistencies in applying the label filters. Additionally this contains updates to the tests to have them succeed with with both configurations and avoid the introduction of regressions in regards to multiCluster in the future. There also are some consistency cleanups here and there: * `datasource` was not used consistently * `cluster` label_values are determined from `ceph_health_status` * `job` template and filters on this label were removed to align multi cluster support solely via the `cluster` label * `ceph_hosts` filter now uses label_values from any ceph_metadata metrici to now show all instance values, but those of hosts with some Ceph component / daemon. * Enable showMultiCluster=True since `cluster` label is now always present, via https://github.com/ceph/ceph/pull/54964 Improves: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
* \| \| \|	monitoring/ceph-mixin: set NVMeoFMaxGatewaysPerGroup to 4	Adam King	2024-04-22	1	-1/+1
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recommendation from the nvmeof team Signed-off-by: Adam King <adking@redhat.com>
* / /	mgr/dashboard: replace deprecated table panel in grafana with a newer	Aashish Sharma	2024-04-02	13	-975/+2229
\|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \|	table panel Fixes: https://tracker.ceph.com/issues/65174 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	Merge pull request #55574 from ceph/feature-multi-cluster-management-monitoring	Nizamudeen A	2024-03-06	4	-2/+3092
\|\ \ \| \|/ \|/\| \| \| \| \|	mgr/dashboard: introduce multi cluster management and monitoring in ceph dashboard Reviewed-by: Nizamudeen A <nia@redhat.com>
\| *	mgr/dashboard: introduce multi-cluster overview page	Nizamudeen A	2024-03-05	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	https://tracker.ceph.com/issues/64530 Signed-off-by: Nizamudeen A <nia@redhat.com> Signed-off-by: Aashish Sharma <aasharma@redhat.com>
\| *	mgr/dashboard: Add a manage clusters page to the multi-cluster nav to	Aashish Sharma	2024-02-22	4	-2/+3092
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	list/connect/disconnect/edit clusters in multi-cluster setup Fixes: https://tracker.ceph.com/issues/64530 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	Merge pull request #55510 from pcuzner/add-nvmeof-alerts	Aashish Sharma	2024-02-29	5	-1/+715
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ceph-mixin: Update mixin to include alerts for the nvmeof gateway(s) Reviewed-by: Aashish Sharma <aasharma@redhat.com>
\| * \|	ceph-mixins: Update MIB to include nvmeof notification	Paul Cuzner	2024-02-26	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
\| * \|	ceph-mixins: Add test cases for nvmeof alerts	Paul Cuzner	2024-02-26	1	-0/+423
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
\| * \|	ceph-mixins: nvmeof alerts added	Paul Cuzner	2024-02-26	1	-0/+129
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
\| * \|	ceph-mixins: Add nvmeof alerts	Paul Cuzner	2024-02-26	1	-0/+145
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
\| * \|	ceph-mixins: Add vars to support nvmeof alerts	Paul Cuzner	2024-02-25	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Paul Cuzner <pcuzner@ibm.com>
* \| \|	mgr/dashboard: replace piechart plugin charts with native pie chart	Aashish Sharma	2024-02-27	4	-50/+220
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \|	panel Fixes: https://tracker.ceph.com/issues/64579 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	Merge pull request #55314 from cloudbehl/rgw-dashboard-json	Aashish Sharma	2024-02-13	5	-47/+47
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mgr/dashboard: Fixing RGW graph panels Reviewed-by: Aashish Sharma <aasharma@redhat.com>
\| * \|	mgr/dashboards: add generated json files	Aashish Sharma	2024-02-07	4	-29/+29
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Aashish Sharma <aasharma@redhat.com>
\| * \|	mgr/dashboard: Fixing RGW graph panels	cloudbehl	2024-01-25	1	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Fixing grafana panels for rgw dashboards - Fixing RGW overview dashboard queries fixes https://tracker.ceph.com/issues/64177 Signed-off-by: cloudbehl <cloudbehl@gmail.com>
* \| \|	mgr/dashboard: Add RGW per user/bucket panels in grafana	Aashish Sharma	2024-02-09	5	-1/+7153
\| \|/ \|/\| \| \| \| \| \| \| \| \|	Fixes: https://tracker.ceph.com/issues/64359 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	monitoring: add new alerts	Guillaume Abrioux	2024-01-25	4	-0/+272
\|/ \| \| \| \| \|	This adds new hardware monitoring alerts. Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
*	mgr/dashboard: upgrade from old 'graph' type panels to the new	Aashish Sharma	2023-12-22	17	-74/+605
\| \| \| \| \| \| \| \| \| \|	'timeseries' panel The graph panel type is deprecated, and disappears after Grafana v9.1 (current version is 10.0) to prevent more old type panels being created. These should be migrated to the timeseries panel type, to avoid potential problems with future Grafana versions. Fixes: https://tracker.ceph.com/issues/61720 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
*	monitoring: upgrade grafana container to 9.4.12	Nizamudeen A	2023-12-06	1	-1/+1
\| \| \| \| \| \|	Fixes the CVEs mentioned here: https://grafana.com/blog/2023/06/06/grafana-security-release-new-grafana-versions-with-security-fixes-for-cve-2023-2183-and-cve-2023-2801/ Signed-off-by: Nizamudeen A <nia@redhat.com>
*	Merge pull request #54355 from nobuto-m/info-rbd-stats-pools	Nizamudeen A	2023-11-30	3	-15/+32
\|\ \| \| \| \| \| \| \| \| \| \|	mgr/dashboard: info on why RBD graphs are empty Reviewed-by: Ankush Behl <cloudbehl@gmail.com> Reviewed-by: Nizamudeen A <nia@redhat.com>
\| *	mgr/dashboard: info on why RBD graphs are empty	Nobuto Murata	2023-11-06	3	-15/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The RBD IO statistics graphs are empty out of the box, and this is on purpose. Instead of giving the impression that those graphs are broken, point users to documentation explaining the optional steps to enable those statistics. https://docs.ceph.com/en/latest/mgr/prometheus/#rbd-io-statistics Signed-off-by: Nobuto Murata <nobuto.murata@canonical.com>
* \|	Merge pull request #51340 from ↵	Aashish Sharma	2023-11-20	7	-58/+6541
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Javlopez/feature/12087-upgrade-and-generate-grafana-dashboards monitoring: add new dashboards Fixes: https://tracker.ceph.com/issues/63592 Reviewed-by: Aashish Sharma <aasharma@redhat.com>
\| * \|	monitoring: update libsonnet files for generate ceph-cluster.json	Javier	2023-10-21	7	-58/+6541
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	add ceph-cluster.libsonnet file to generate ceph-cluster.json Fixes: https://tracker.ceph.com/issues/61443 Signed-off-by: Javier <sjavierlopez@gmail.com>
* \| \|	Merge pull request #53650 from rhcs-dashboard/fix-62969-main	Aashish Sharma	2023-11-17	1	-2/+88
\|\ \ \ \| \|_\|/ \|/\| \| \| \| \| \| \| \|	mgr/dashboard: Show the OSDs Out and Down panels as red whenever an OSD is in Out or Down state in Ceph Cluster grafana dashboard Reviewed-by: Nizamudeen A <nia@redhat.com>
\| * \|	mgr/dashboard: Show the OSD's Out and Down panels as red whenever an OSD is ↵	Aashish Sharma	2023-10-11	1	-2/+88
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in Out or Down state in Ceph Cluster grafana dashboard Fixes: https://tracker.ceph.com/issues/62969 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \|	Merge pull request #53807 from rhcs-dashboard/fix-63088-main	Aashish Sharma	2023-10-25	7	-26/+26
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mgr/dashboard: Consider null values as zero in grafana panels Reviewed-by: Nizamudeen A <nia@redhat.com>
\| * \|	mgr/dashboard: Consider null values as zero in grafana panels	Aashish Sharma	2023-10-04	7	-26/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After upgrading from RHCS4 to RHCS5, some of the Grafana charts broke. This is because RHCS5 does not generate a metric if its value is zero; as a result, the null value from that metric breaks the Grafana charts or graphs. This PR fixes that issue. Fixes: https://tracker.ceph.com/issues/63088 Signed-off-by: Aashish Sharma <aasharma@redhat.com>
* \| \|	mgr/dashboard: fix broken alert generator	Nizamudeen A	2023-10-13	4	-3/+9
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the alert generator is broken if you try to run `tox -ealerts-fix`. I fixed it and ran the command and it built a new json file as well. Signed-off-by: Nizamudeen A <nia@redhat.com>
* \|	Merge pull request #50132 from aruniiird/add-rbd-mirror-mon-alerts	Juan Miguel Olmo	2023-10-10	9	-15/+329
\|\ \ \| \|/ \|/\|	ceph-mixin: Add RBD Mirror monitoring alerts