author: Rahul Rameshbabu <rrameshbabu@nvidia.com> 2023-08-09 06:10:21 +0200
committer: Saeed Mahameed <saeedm@nvidia.com> 2023-08-14 23:40:20 +0200
commit: 53b836a44db4259b94ffcfff321fb3d63f976b76
tree: a4ecb2462bb2fd669c0ceb0ab4af4a0ab2270882 /Documentation/networking/devlink
parent: net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs
net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ
A new check for the tx devlink health reporter is introduced for determining when the PTP port timestamping SQ is considered unhealthy. If there are enough CQEs considered never to be delivered, the space that can be utilized on the SQ decreases significantly, impacting performance and usability of the SQ.

The health reporter is triggered when the number of likely never delivered port timestamping CQEs that utilize the space of the PTP SQ is greater than 93.75% of the total capacity of the SQ. A devlink health reporter recover method is also provided for this specific TX error context that restarts the PTP SQ.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
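
The 93.75% trigger described above is the fraction 15/16, so the check can be expressed in integer arithmetic. The following is a minimal sketch of such a threshold test, not the driver's actual code: the struct `ptp_sq_state`, its fields, and the function `ptp_sq_unhealthy` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a PTP SQ; names are illustrative only. */
struct ptp_sq_state {
	uint32_t capacity;    /* total number of entries the SQ can hold */
	uint32_t undelivered; /* entries tied to CQEs presumed never to arrive */
};

/*
 * Returns true when the undelivered entries occupy more than
 * 93.75% (15/16) of the SQ capacity. Cross-multiplying in 64 bits
 * avoids both overflow and rounding from integer division.
 */
static bool ptp_sq_unhealthy(const struct ptp_sq_state *sq)
{
	return (uint64_t)sq->undelivered * 16 > (uint64_t)sq->capacity * 15;
}
```

Under these assumptions, an SQ with 1024 entries would be reported as unhealthy once more than 960 of them are held by CQEs that are not expected to arrive.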
Diffstat (limited to 'Documentation/networking/devlink')
-rw-r--r-- Documentation/networking/devlink/mlx5.rst | 5
1 files changed, 4 insertions, 1 deletions
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 196a4bb28df1..702f204a3dbd 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -135,7 +135,7 @@ Health reporters
tx reporter
-----------
-The tx reporter is responsible for reporting and recovering of the following two error scenarios:
+The tx reporter is responsible for reporting and recovering of the following three error scenarios:
- tx timeout
Report on kernel tx timeout detection.
@@ -143,6 +143,9 @@ The tx reporter is responsible for reporting and recovering of the following two
- tx error completion
Report on error tx completion.
Recover by flushing the tx queue and reset it.
+- tx PTP port timestamping CQ unhealthy
+ Report too many CQEs never delivered on port ts CQ.
+ Recover by flushing and re-creating all PTP channels.
tx reporter also support on demand diagnose callback, on which it provides
real time information of its send queues status.