Diffstat (limited to 'doc/rados/operations/stretch-mode.rst')
-rw-r--r--  doc/rados/operations/stretch-mode.rst | 79
1 file changed, 73 insertions, 6 deletions
diff --git a/doc/rados/operations/stretch-mode.rst b/doc/rados/operations/stretch-mode.rst
index a5694718a58..7a4fa46117d 100644
--- a/doc/rados/operations/stretch-mode.rst
+++ b/doc/rados/operations/stretch-mode.rst
@@ -94,15 +94,54 @@ configuration across the entire cluster.
 Conversely, opt for a ``stretch pool`` when you need a particular pool to be
 replicated across ``more than two data centers``, providing a more granular
 level of control and a larger cluster size.

+Limitations
+-----------
+
+Individual stretch pools do not support I/O operations during a netsplit
+between two or more zones. While the cluster remains accessible for basic
+Ceph commands, I/O remains unavailable until the netsplit is resolved. This
+differs from ``stretch mode``, in which the tiebreaker monitor can isolate
+one zone of the cluster and continue I/O operations in degraded mode during
+a netsplit. See :ref:`stretch_mode1`.
+
+Ceph is designed to tolerate multiple host failures. However, if more than
+25% of the OSDs in the cluster go down, Ceph may stop marking OSDs ``out``,
+which prevents rebalancing and can leave some PGs ``inactive``. This behavior
+is controlled by the ``mon_osd_min_in_ratio`` option. By default,
+``mon_osd_min_in_ratio`` is set to ``0.75``, meaning that at least 75% of the
+OSDs in the cluster must remain ``in`` before any additional OSDs can be
+marked ``out``. This setting prevents too many OSDs from being marked ``out``
+at once, because doing so could trigger significant data movement. That data
+movement can cause a heavy client I/O impact and long recovery times when the
+OSDs are returned to service. If Ceph stops marking OSDs ``out``, some PGs
+may fail to rebalance to the surviving OSDs, potentially leaving them
+``inactive``. See https://tracker.ceph.com/issues/68338 for more information.
+
+.. _stretch_mode1:
+
 Stretch Mode
 ============

-Stretch mode is designed to handle deployments in which you cannot guarantee the
-replication of data across two data centers. This kind of situation can arise
-when the cluster's CRUSH rule specifies that three copies are to be made, but
-then a copy is placed in each data center with a ``min_size`` of 2. Under such
-conditions, a placement group can become active with two copies in the first
-data center and no copies in the second data center.
+Stretch mode is designed to handle netsplit scenarios between two data zones
+as well as the loss of one data zone. It handles a netsplit by choosing the
+surviving zone that has the better connection to the ``tiebreaker monitor``.
+It handles the loss of one zone by reducing ``size`` to ``2`` and
+``min_size`` to ``1``, allowing the cluster to continue operating with the
+remaining zone. When the lost zone comes back, the cluster will recover the
+lost data and return to normal operation.
+
+Connectivity Monitor Election Strategy
+--------------------------------------
+When using stretch mode, the monitor election strategy must be set to
+``connectivity``. This strategy tracks network connectivity between the
+monitors and is used to determine which zone should be favored when the
+cluster is in a netsplit scenario.
+
+See `Changing Monitor Elections`_.
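+
+For example, to switch an existing cluster to the ``connectivity`` election
+strategy, a command of the following form can be used:
+
+.. prompt:: bash $
+
+   ceph mon set election_strategy connectivity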
+
+Stretch Peering Rule
+--------------------
+One critical behavior of stretch mode is that it prevents a PG from going
+active if the acting set contains only replicas from a single zone. This
+safeguard is crucial for mitigating the risk of data loss during site
+failures: if a PG were allowed to go active with replicas in only a single
+site, writes could be acknowledged despite a lack of redundancy, and in the
+event of a site failure all data in the affected PG would be lost.

 Entering Stretch Mode
 ---------------------
@@ -247,6 +286,34 @@ possible, if needed).

 .. _Changing Monitor elections: ../change-mon-elections

+Exiting Stretch Mode
+--------------------
+To exit stretch mode, run the following command:
+
+.. prompt:: bash $
+
+   ceph mon disable_stretch_mode [{crush_rule}] --yes-i-really-mean-it
+
+.. describe:: {crush_rule}
+
+   The CRUSH rule to which all pools should revert. If this is not
+   specified, the pools will revert to the default CRUSH rule.
+
+   :Type: String
+   :Required: No.
+
+This command moves the cluster back to normal mode, and the cluster will no
+longer be in stretch mode. All pools revert their ``size`` and ``min_size``
+to the default values they started with. At this point the user is
+responsible for scaling the cluster down to the desired number of OSDs if
+they choose to operate with fewer OSDs.
+
+Note that the command will not execute when the cluster is in
+``recovery stretch mode``. It executes only when the cluster is in
+``degraded stretch mode`` or ``healthy stretch mode``.
+
 Limitations of Stretch Mode
 ===========================
 When using stretch mode, OSDs must be located at exactly two sites.