author    | Zac Dover <zac.dover@proton.me>           | 2025-01-22 12:57:08 +0100
----------|--------------------------------------------|--------------------------
committer | GitHub <noreply@github.com>                | 2025-01-22 12:57:08 +0100
commit    | 31101e16547a20627a7001ffa08d7a9feac37dc5   |
tree      | 84bf76a6fed37ae32d7201346f697a3d2ff9a045   |
parent    | Merge pull request #61436 from soumyakoduri/wip-skoduri-dbstore |
parent    | doc/cephfs: edit disaster-recovery-experts (4 of x) |
Merge pull request #61462 from zdover23/wip-doc-2025-01-21-cephfs-disaster-recovery-experts
doc/cephfs: edit disaster-recovery-experts (4 of x)
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
-rw-r--r-- | doc/cephfs/disaster-recovery-experts.rst | 134
1 file changed, 79 insertions, 55 deletions
diff --git a/doc/cephfs/disaster-recovery-experts.rst b/doc/cephfs/disaster-recovery-experts.rst
index 608c0b3d91e..c224c2a4190 100644
--- a/doc/cephfs/disaster-recovery-experts.rst
+++ b/doc/cephfs/disaster-recovery-experts.rst
@@ -213,118 +213,142 @@ Using an alternate metadata pool for recovery
 
 .. warning::
 
-   There has not been extensive testing of this procedure. It should be
-   undertaken with great care.
+   This procedure has not been extensively tested. It should be undertaken only
+   with great care.
 
-If an existing file system is damaged and inoperative, it is possible to create
-a fresh metadata pool and attempt to reconstruct the file system metadata into
-this new pool, leaving the old metadata in place. This could be used to make a
-safer attempt at recovery since the existing metadata pool would not be
-modified.
+If an existing file system is damaged and inoperative, then it is possible to
+create a fresh metadata pool and to attempt the reconstruction of the
+damaged and inoperative file system's metadata into the new pool, while leaving
+the old metadata in place. This could be used to make a safer attempt at
+recovery since the existing metadata pool would not be modified.
 
 .. caution::
 
    During this process, multiple metadata pools will contain data referring to
    the same data pool. Extreme caution must be exercised to avoid changing the
-   data pool contents while this is the case. Once recovery is complete, the
-   damaged metadata pool should be archived or deleted.
+   contents of the data pool while this is the case. After recovery is
+   complete, archive or delete the damaged metadata pool.
 
-To begin, the existing file system should be taken down, if not done already,
-to prevent further modification of the data pool. Unmount all clients and then
-mark the file system failed:
+To begin, the existing file system should be taken down to prevent further
+modification of the data pool. Unmount all clients and then use the following
+command to mark the file system failed:
 
-::
+.. prompt:: bash #
 
-   ceph fs fail <fs_name>
+   ceph fs fail <fs_name>
 
 .. note::
 
-   <fs_name> here and below indicates the original, damaged file system.
+   ``<fs_name>`` here and below refers to the original, damaged file system.
 
 Next, create a recovery file system in which we will populate a new metadata pool
-backed by the original data pool.
+that is backed by the original data pool:
 
-::
+.. prompt:: bash #
 
-   ceph osd pool create cephfs_recovery_meta
-   ceph fs new cephfs_recovery cephfs_recovery_meta <data_pool> --recover --allow-dangerous-metadata-overlay
+   ceph osd pool create cephfs_recovery_meta
+   ceph fs new cephfs_recovery cephfs_recovery_meta <data_pool> --recover --allow-dangerous-metadata-overlay
 
 .. note::
 
    You may rename the recovery metadata pool and file system at a future time.
-   The ``--recover`` flag prevents any MDS from joining the new file system.
+   The ``--recover`` flag prevents any MDS daemon from joining the new file
+   system.
 
 Next, we will create the initial metadata for the file system:
 
-::
+.. prompt:: bash #
+
+   cephfs-table-tool cephfs_recovery:0 reset session
+
+.. prompt:: bash #
 
-   cephfs-table-tool cephfs_recovery:0 reset session
-   cephfs-table-tool cephfs_recovery:0 reset snap
-   cephfs-table-tool cephfs_recovery:0 reset inode
-   cephfs-journal-tool --rank cephfs_recovery:0 journal reset --force --yes-i-really-really-mean-it
+   cephfs-table-tool cephfs_recovery:0 reset snap
+
+.. prompt:: bash #
+
+   cephfs-table-tool cephfs_recovery:0 reset inode
+
+.. prompt:: bash #
+
+   cephfs-journal-tool --rank cephfs_recovery:0 journal reset --force --yes-i-really-really-mean-it
 
 Now perform the recovery of the metadata pool from the data pool:
 
-::
+.. prompt:: bash #
 
-   cephfs-data-scan init --force-init --filesystem cephfs_recovery --alternate-pool cephfs_recovery_meta
-   cephfs-data-scan scan_extents --alternate-pool cephfs_recovery_meta --filesystem <fs_name>
-   cephfs-data-scan scan_inodes --alternate-pool cephfs_recovery_meta --filesystem <fs_name> --force-corrupt
-   cephfs-data-scan scan_links --filesystem cephfs_recovery
+   cephfs-data-scan init --force-init --filesystem cephfs_recovery --alternate-pool cephfs_recovery_meta
+
+.. prompt:: bash #
+
+   cephfs-data-scan scan_extents --alternate-pool cephfs_recovery_meta --filesystem <fs_name>
+
+.. prompt:: bash #
+
+   cephfs-data-scan scan_inodes --alternate-pool cephfs_recovery_meta --filesystem <fs_name> --force-corrupt
+
+.. prompt:: bash #
+
+   cephfs-data-scan scan_links --filesystem cephfs_recovery
 
 .. note::
 
-   Each scan procedure above goes through the entire data pool. This may take a
-   significant amount of time. See the previous section on how to distribute
-   this task among workers.
+   Each of the scan procedures above scans through the entire data pool. This
+   may take a long time. See the previous section on how to distribute this
+   task among workers.
 
 If the damaged file system contains dirty journal data, it may be recovered next
-with:
+with a command of the following form:
 
-::
+.. prompt:: bash #
 
-   cephfs-journal-tool --rank=<fs_name>:0 event recover_dentries list --alternate-pool cephfs_recovery_meta
+   cephfs-journal-tool --rank=<fs_name>:0 event recover_dentries list --alternate-pool cephfs_recovery_meta
 
 After recovery, some recovered directories will have incorrect statistics.
-Ensure the parameters ``mds_verify_scatter`` and ``mds_debug_scatterstat`` are
-set to false (the default) to prevent the MDS from checking the statistics:
+Ensure that the parameters ``mds_verify_scatter`` and ``mds_debug_scatterstat``
+are set to false (the default) to prevent the MDS from checking the statistics:
 
-::
+.. prompt:: bash #
 
-   ceph config rm mds mds_verify_scatter
-   ceph config rm mds mds_debug_scatterstat
+   ceph config rm mds mds_verify_scatter
+
+.. prompt:: bash #
+
+   ceph config rm mds mds_debug_scatterstat
 
 .. note::
 
-   Also verify the config has not been set globally or with a local ceph.conf file.
+   Verify that the config has not been set globally or with a local ``ceph.conf`` file.
 
-Now, allow an MDS to join the recovery file system:
+Now, allow an MDS daemon to join the recovery file system:
 
-::
+.. prompt:: bash #
 
-   ceph fs set cephfs_recovery joinable true
+   ceph fs set cephfs_recovery joinable true
 
 Finally, run a forward :doc:`scrub </cephfs/scrub>` to repair recursive statistics.
-Ensure you have an MDS running and issue:
+Ensure that you have an MDS daemon running and issue the following command:
 
-::
+.. prompt:: bash #
 
-   ceph tell mds.cephfs_recovery:0 scrub start / recursive,repair,force
+   ceph tell mds.cephfs_recovery:0 scrub start / recursive,repair,force
 
 .. note::
 
-   The `Symbolic link recovery <https://tracker.ceph.com/issues/46166>`_ is supported from Quincy.
+   The `Symbolic link recovery <https://tracker.ceph.com/issues/46166>`_ is
+   supported starting in the Quincy release.
+
    Symbolic links were recovered as empty regular files before.
 
-It is recommended to migrate any data from the recovery file system as soon as
-possible. Do not restore the old file system while the recovery file system is
-operational.
+It is recommended that you migrate any data from the recovery file system as
+soon as possible. Do not restore the old file system while the recovery file
+system is operational.
 
 .. note::
 
    If the data pool is also corrupt, some files may not be restored because
-   backtrace information is lost. If any data objects are missing (due to
-   issues like lost Placement Groups on the data pool), the recovered files
-   will contain holes in place of the missing data.
+   the backtrace information associated with them is lost. If any data objects
+   are missing (due to issues like lost Placement Groups on the data pool),
+   the recovered files will contain holes in place of the missing data.
 
 .. _Symbolic link recovery: https://tracker.ceph.com/issues/46166
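The procedure edited above is spread across many small prompt blocks. As a reading aid only, the following sketch strings the same commands together in the order the document gives them. It assumes hypothetical names ``damaged_fs`` and ``damaged_fs_data`` as stand-ins for the document's ``<fs_name>`` and ``<data_pool>`` placeholders; it is not part of the commit and is no substitute for the warnings in the procedure itself:

::

   #!/usr/bin/env bash
   # Illustrative ordering of the recovery commands from the section above.
   # FS_NAME and DATA_POOL are hypothetical stand-ins for <fs_name> and
   # <data_pool>; substitute the real names before running anything.
   set -euo pipefail

   FS_NAME="damaged_fs"
   DATA_POOL="damaged_fs_data"

   # Stop all further modification of the data pool.
   ceph fs fail "${FS_NAME}"

   # Create a fresh metadata pool and a recovery file system backed by the
   # original data pool.
   ceph osd pool create cephfs_recovery_meta
   ceph fs new cephfs_recovery cephfs_recovery_meta "${DATA_POOL}" \
       --recover --allow-dangerous-metadata-overlay

   # Initialise the metadata structures of the recovery file system.
   cephfs-table-tool cephfs_recovery:0 reset session
   cephfs-table-tool cephfs_recovery:0 reset snap
   cephfs-table-tool cephfs_recovery:0 reset inode
   cephfs-journal-tool --rank cephfs_recovery:0 journal reset --force --yes-i-really-really-mean-it

   # Rebuild metadata in the alternate pool by scanning the data pool.
   cephfs-data-scan init --force-init --filesystem cephfs_recovery --alternate-pool cephfs_recovery_meta
   cephfs-data-scan scan_extents --alternate-pool cephfs_recovery_meta --filesystem "${FS_NAME}"
   cephfs-data-scan scan_inodes --alternate-pool cephfs_recovery_meta --filesystem "${FS_NAME}" --force-corrupt
   cephfs-data-scan scan_links --filesystem cephfs_recovery

   # Optionally recover dirty journal data from the damaged file system.
   cephfs-journal-tool --rank="${FS_NAME}":0 event recover_dentries list --alternate-pool cephfs_recovery_meta

   # Keep the MDS from checking the still-incorrect statistics, then allow an
   # MDS daemon to join the recovery file system.
   ceph config rm mds mds_verify_scatter
   ceph config rm mds mds_debug_scatterstat
   ceph fs set cephfs_recovery joinable true

   # Repair recursive statistics with a forward scrub once an MDS is active.
   ceph tell mds.cephfs_recovery:0 scrub start / recursive,repair,force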