 doc/cephadm/troubleshooting.rst | 103
 doc/radosgw/placement.rst       |   6
 2 files changed, 53 insertions(+), 56 deletions(-)
diff --git a/doc/cephadm/troubleshooting.rst b/doc/cephadm/troubleshooting.rst
index 18a0437a9ae..5ec69288166 100644
--- a/doc/cephadm/troubleshooting.rst
+++ b/doc/cephadm/troubleshooting.rst
@@ -1,22 +1,19 @@
Troubleshooting
===============
-You might need to investigate why a cephadm command failed
+You may wish to investigate why a cephadm command failed
or why a certain service no longer runs properly.
-Cephadm deploys daemons as containers. This means that
-troubleshooting those containerized daemons might work
-differently than you expect (and that is certainly true if
-you expect this troubleshooting to work the way that
-troubleshooting does when the daemons involved aren't
-containerized).
+Cephadm deploys daemons within containers. This means that
+troubleshooting those containerized daemons will require
+a different process than the one used for traditional package-installed daemons.
Here are some tools and commands to help you troubleshoot
your Ceph environment.
.. _cephadm-pause:
-Pausing or disabling cephadm
+Pausing or Disabling cephadm
----------------------------
If something goes wrong and cephadm is behaving badly, you can
@@ -45,16 +42,15 @@ See :ref:`cephadm-spec-unmanaged` for information on disabling
individual services.
-Per-service and per-daemon events
+Per-service and Per-daemon Events
---------------------------------
-In order to help with the process of debugging failed daemon
-deployments, cephadm stores events per service and per daemon.
+In order to facilitate debugging failed daemons,
+cephadm stores events per service and per daemon.
These events often contain information relevant to
-troubleshooting
-your Ceph cluster.
+troubleshooting your Ceph cluster.
-Listing service events
+Listing Service Events
~~~~~~~~~~~~~~~~~~~~~~
To see the events associated with a certain service, run a
@@ -82,7 +78,7 @@ This will return something in the following form:
- '2021-02-01T12:09:25.264584 service:alertmanager [ERROR] "Failed to apply: Cannot
place <AlertManagerSpec for service_name=alertmanager> on unknown_host: Unknown hosts"'
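A listing of this kind can be produced with ``ceph orch ls`` in YAML mode; the
service name below is only a placeholder::

   ceph orch ls --service_name=<service-name> --format yaml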
-Listing daemon events
+Listing Daemon Events
~~~~~~~~~~~~~~~~~~~~~
To see the events associated with a certain daemon, run a
@@ -106,16 +102,16 @@ This will return something in the following form:
mds.cephfs.hostname.ppdhsz on host 'hostname'"
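For a single daemon, a similar listing can be obtained with ``ceph orch ps``; the
filter flags shown here are an assumption and may vary between releases::

   ceph orch ps --daemon-type <daemon-type> --daemon-id <daemon-id> --format yaml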
-Checking cephadm logs
+Checking Cephadm Logs
---------------------
-To learn how to monitor the cephadm logs as they are generated, read :ref:`watching_cephadm_logs`.
+To learn how to monitor cephadm logs as they are generated, read :ref:`watching_cephadm_logs`.
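One convenient way to follow these messages as they are generated, assuming a host
with a working admin keyring, is to watch the cephadm log channel::

   ceph -W cephadm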
-If your Ceph cluster has been configured to log events to files, there will exist a
-cephadm log file called ``ceph.cephadm.log`` on all monitor hosts (see
-:ref:`cephadm-logs` for a more complete explanation of this).
+If your Ceph cluster has been configured to log events to files, there will be a
+``ceph.cephadm.log`` file on all monitor hosts (see
+:ref:`cephadm-logs` for a more complete explanation).
-Gathering log files
+Gathering Log Files
-------------------
Use journalctl to gather the log files of all daemons:
@@ -140,7 +136,7 @@ To fetch all log files of all daemons on a given host, run::
cephadm logs --fsid <fsid> --name "$name" > $name;
done
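If the ``<fsid>`` placeholder is not known, it can be retrieved from a running
cluster with::

   cephadm shell -- ceph fsid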
-Collecting systemd status
+Collecting Systemd Status
-------------------------
To print the state of a systemd unit, run::
@@ -156,7 +152,7 @@ To fetch all state of all daemons of a given host, run::
done
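The systemd units created by cephadm follow the ``ceph-<fsid>@<daemon-name>``
naming scheme, so a single daemon can also be checked directly, for example::

   systemctl status ceph-<fsid>@mon.<hostname>.service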
-List all downloaded container images
+List All Downloaded Container Images
------------------------------------
To list all container images that are downloaded on a host:
@@ -170,16 +166,16 @@ To list all container images that are downloaded on a host:
"registry.opensuse.org/opensuse/leap:15.2"
-Manually running containers
+Manually Running Containers
---------------------------
-Cephadm writes small wrappers that run a containers. Refer to
+Cephadm uses small wrappers when running containers. Refer to
``/var/lib/ceph/<cluster-fsid>/<service-name>/unit.run`` for the
container execution command.
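For example, to see exactly how a particular daemon is started, substitute your own
cluster fsid and daemon name and run::

   cat /var/lib/ceph/<cluster-fsid>/<service-name>/unit.run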
.. _cephadm-ssh-errors:
-SSH errors
+SSH Errors
----------
Error message::
@@ -191,7 +187,7 @@ Error message::
Please make sure that the host is reachable and accepts connections using the cephadm SSH key
...
-Things users can do:
+Things Ceph administrators can do:
1. Ensure cephadm has an SSH identity key::
@@ -224,7 +220,7 @@ To verify that the public key is in the authorized_keys file, run the following
[root@mon1 ~]# cephadm shell -- ceph cephadm get-pub-key > ~/ceph.pub
[root@mon1 ~]# grep "`cat ~/ceph.pub`" /root/.ssh/authorized_keys
-Failed to infer CIDR network error
+Failed to Infer CIDR Network Error
----------------------------------
If you see this error::
@@ -241,7 +237,7 @@ This means that you must run a command of this form::
For more detail on operations of this kind, see :ref:`deploy_additional_monitors`
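As an illustration of the command form mentioned above, assuming the monitors should
use the (hypothetical) subnet ``10.1.2.0/24``::

   ceph config set mon public_network 10.1.2.0/24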
-Accessing the admin socket
+Accessing the Admin Socket
--------------------------
Each Ceph daemon provides an admin socket that bypasses the
@@ -252,12 +248,12 @@ To access the admin socket, first enter the daemon container on the host::
[root@mon1 ~]# cephadm enter --name <daemon-name>
[ceph: root@mon1 /]# ceph --admin-daemon /var/run/ceph/ceph-<daemon-name>.asok config show
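The same socket can also be used to list every command the daemon accepts::

   [ceph: root@mon1 /]# ceph --admin-daemon /var/run/ceph/ceph-<daemon-name>.asok help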
-Calling miscellaneous ceph tools
+Running Various Ceph Tools
--------------------------------
-To call miscellaneous like ``ceph-objectstore-tool`` or
-``ceph-monstore-tool``, you can run them by calling
-``cephadm shell --name <daemon-name>`` like so::
+To run Ceph tools like ``ceph-objectstore-tool`` or
+``ceph-monstore-tool``, invoke the cephadm CLI with
+``cephadm shell --name <daemon-name>``. For example::
root@myhostname # cephadm unit --name mon.myhostname stop
root@myhostname # cephadm shell --name mon.myhostname
@@ -272,21 +268,21 @@ To call miscellaneous like ``ceph-objectstore-tool`` or
election_strategy: 1
0: [v2:127.0.0.1:3300/0,v1:127.0.0.1:6789/0] mon.myhostname
-This command sets up the environment in a way that is suitable
-for extended daemon maintenance and running the daemon interactively.
+The cephadm shell sets up the environment in a way that is suitable
+for extended daemon maintenance and running daemons interactively.
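When maintenance is complete, the daemon stopped at the beginning of this example can
be started again in the same way::

   root@myhostname # cephadm unit --name mon.myhostname start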
.. _cephadm-restore-quorum:
-Restoring the MON quorum
-------------------------
+Restoring the Monitor Quorum
+----------------------------
-In case the Ceph MONs cannot form a quorum, cephadm is not able
-to manage the cluster, until the quorum is restored.
+If the Ceph monitor daemons (mons) cannot form a quorum, cephadm will not be
+able to manage the cluster until quorum is restored.
-In order to restore the MON quorum, remove unhealthy MONs
+In order to restore the quorum, remove unhealthy monitors
from the monmap by following these steps:
-1. Stop all MONs. For each MON host::
+1. Stop all mons. For each mon host::
ssh {mon-host}
cephadm unit --name mon.`hostname` stop
@@ -301,18 +297,19 @@ form the monmap by following these steps:
.. _cephadm-manually-deploy-mgr:
-Manually deploying a MGR daemon
--------------------------------
-cephadm requires a MGR daemon in order to manage the cluster. In case the last
-MGR of a cluster was removed, follow these steps in order to deploy a MGR
+Manually Deploying a Manager Daemon
+-----------------------------------
+At least one manager (mgr) daemon is required by cephadm in order to manage the
+cluster. If the last mgr in a cluster has been removed, follow these steps in
+order to deploy a manager called (for example)
``mgr.hostname.smfvfd`` on a random host of your cluster manually.
Disable the cephadm scheduler, in order to prevent cephadm from removing the new
-MGR. See :ref:`cephadm-enable-cli`::
+manager. See :ref:`cephadm-enable-cli`::
ceph config-key set mgr/cephadm/pause true
-Then get or create the auth entry for the new MGR::
+Then get or create the auth entry for the new manager::
ceph auth get-or-create mgr.hostname.smfvfd mon "profile mgr" osd "allow *" mds "allow *"
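The new daemon will also need a minimal ``ceph.conf``; one way to generate it,
assuming a reachable monitor, is::

   ceph config generate-minimal-conf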
@@ -338,26 +335,26 @@ Deploy the daemon::
cephadm --image <container-image> deploy --fsid <fsid> --name mgr.hostname.smfvfd --config-json config-json.json
-Analyzing core dumps
+Analyzing Core Dumps
---------------------
-In case a Ceph daemon crashes, cephadm supports analyzing core dumps. To enable core dumps, run
+When a Ceph daemon crashes, cephadm supports analyzing core dumps. To enable core dumps, run
.. prompt:: bash #
ulimit -c unlimited
-core dumps will now be written to ``/var/lib/systemd/coredump``.
+Core dumps will now be written to ``/var/lib/systemd/coredump``.
.. note::
- core dumps are not namespaced by the kernel, which means
+ Core dumps are not namespaced by the kernel, which means
they will be written to ``/var/lib/systemd/coredump`` on
the container host.
-Now, wait for the crash to happen again. (To simulate the crash of a daemon, run e.g. ``killall -3 ceph-mon``)
+Now, wait for the crash to happen again. To simulate the crash of a daemon, run, for example, ``killall -3 ceph-mon``.
-Install debug packages by entering the cephadm shell and install ``ceph-debuginfo``::
+Install debug packages including ``ceph-debuginfo`` by entering the cephadm shell::
# cephadm shell --mount /var/lib/systemd/coredump
[ceph: root@host1 /]# dnf install ceph-debuginfo gdb zstd
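With the coredump directory mounted into the shell, the dump can then be decompressed
and examined. The path and file names below are hypothetical and must be adjusted to
match the actual core file and the daemon that crashed::

   [ceph: root@host1 /]# unzstd <path-to-coredump>/core.ceph-mon.<...>.zst
   [ceph: root@host1 /]# gdb /usr/bin/ceph-mon <path-to-coredump>/core.ceph-mon.<...>
   (gdb) bt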
diff --git a/doc/radosgw/placement.rst b/doc/radosgw/placement.rst
index 6274b022f4a..28c71783dd1 100644
--- a/doc/radosgw/placement.rst
+++ b/doc/radosgw/placement.rst
@@ -130,9 +130,9 @@ Then provide the zone placement info for that target:
When data is stored inline (default), it may provide an advantage for read/write workloads since the first chunk of
an object's data can be retrieved/stored in a single librados call along with object metadata. On the other hand, a
target that does not store data inline can provide a performance benefit for RGW client delete requests when
- bluestore db is located on faster storage devices (as compared to data devices) since it eliminates the need to access
- slower devices synchronously while processing the client request. In that case, all data associated with the deleted
- objects can be removed asynchronously in the background by garbage collection.
+ the BlueStore DB is located on faster storage than bucket data since it eliminates the need to access
+ slower devices synchronously while processing the client request. In that case, data associated with the deleted
+ objects is removed asynchronously in the background by garbage collection.
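   For illustration only, a placement target that avoids storing data inline might be
   created with a command along these lines; the ``--placement-inline-data`` flag and
   the pool names here are assumptions that should be checked against the release in
   use::

      radosgw-admin zone placement add \
            --rgw-zone default \
            --placement-id temporary \
            --data-pool default.rgw.temporary.data \
            --index-pool default.rgw.temporary.index \
            --data-extra-pool default.rgw.temporary.non-ec \
            --placement-inline-data false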
.. _adding_a_storage_class: