| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Previously, when both a label and a host pattern were
provided, only the label was actually used for the placement.
Fixes: https://tracker.ceph.com/issues/64428
Signed-off-by: Adam King <adking@redhat.com>
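A minimal sketch of the intended behavior, using simplified stand-in types rather than the actual cephadm scheduler classes: a host stays a placement candidate only if it satisfies every filter that was provided.
```
import fnmatch
from typing import List, NamedTuple, Optional


class Host(NamedTuple):
    hostname: str
    labels: List[str]


def candidate_hosts(hosts: List[Host],
                    label: Optional[str] = None,
                    host_pattern: Optional[str] = None) -> List[str]:
    # A host remains a candidate only if it matches *every* provided filter.
    out = []
    for h in hosts:
        if label is not None and label not in h.labels:
            continue
        if host_pattern is not None and not fnmatch.fnmatch(h.hostname, host_pattern):
            continue
        out.append(h.hostname)
    return out


hosts = [Host('node1', ['mon']), Host('node2', ['mon']), Host('web1', ['mon'])]
# Previously only the label was honored; with the fix the pattern applies too.
assert candidate_hosts(hosts, label='mon', host_pattern='node*') == ['node1', 'node2']
```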
|
|
|
|
|
|
|
|
|
| |
Keepalived daemons need the host to have an interface
on which they can set up their VIP. If a host
does not have any interface that can work, we should
filter it out of the candidate list.
Signed-off-by: Adam King <adking@redhat.com>
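A rough sketch of the filtering idea, using a hypothetical per-host network inventory instead of the real cephadm host cache:
```
import ipaddress
from typing import Dict, List

# Hypothetical inventory: hostname -> CIDRs the host has an interface on.
host_networks: Dict[str, List[str]] = {
    'host1': ['10.0.0.0/24', '192.168.1.0/24'],
    'host2': ['172.16.5.0/24'],
}


def hosts_that_can_hold_vip(vip: str, networks: Dict[str, List[str]]) -> List[str]:
    """Keep only hosts with an interface on a subnet containing the VIP."""
    vip_addr = ipaddress.ip_address(vip.split('/')[0])
    return [
        host for host, cidrs in networks.items()
        if any(vip_addr in ipaddress.ip_network(c) for c in cidrs)
    ]


# Only host1 has an interface on the VIP's subnet, so host2 is filtered out.
assert hosts_that_can_hold_vip('10.0.0.100/24', host_networks) == ['host1']
```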
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was recently added in https://github.com/ceph/ceph/commit/088d2c4205c599a7d4f2ce4de8e2af8e129adac8
and seems to work fine, but logging these things at info
level spams the log, as every single service reports on
whether it has related daemons on every serve loop iteration.
This changes it to log only when it finds related daemons or
when it prefers a host because of those related daemons, and
to do both at debug level.
Signed-off-by: Adam King <adking@redhat.com>
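A tiny sketch of the new logging policy (illustrative function and argument names only, not the actual cephadm code):
```
import logging
from typing import List

logger = logging.getLogger(__name__)


def report_related_daemons(service_name: str, related: List[str]) -> None:
    # Stay silent in the common case so the serve loop does not spam the log
    # on every iteration; only report when related daemons were actually
    # found, and only at debug level.
    if related:
        logger.debug('%s: found related daemons %s', service_name, related)
```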
|
|\
| |
| |
| |
| | |
orchestrator: Fix spelling
Reviewed-by: Adam King <adking@redhat.com>
|
| | |
* a new
* accommodated
* adopted
* appended
* because
* bootstrap
* bootstrapping
* brackets
* classes
* cluster
* compatible
* completely
* confusion
* daemon
* daemons
* dashboard
* enclosure
* existing
* explicit
* following
* format
* host
* implementation
* inferred
* keepalived
* kubic
* maintenance
* necessarily
* necessary
* network
* notifier
* octopus
* permanent
* presenting
* related
* see
* snapshot
* stateful
* the
* track
* version
* wasn't
* weird
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
arbitrary hosts
For now, just for linking ingress services and
their backend services. The idea is that if one, or both,
of the ingress service and the backend service uses a
count, we try to get them to deploy their daemons
on the same host(s). If the placements are explicit
(not using count), we still stick to
those placements regardless.
This should enable something like specifying a host
for the backend service and leaving the ingress
placement as just "count: 1", and having the ingress
service land on the same host as the backend service
daemon. This is particularly useful for the keepalive-only
(VIP but no haproxy) over NFS setup, where the keepalived
must share a host with the NFS daemon to function, but it will
also be useful for other VIP-only setups we may do
in the future.
Signed-off-by: Adam King <adking@redhat.com>
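A minimal sketch of the preference logic; the helper name and inputs are illustrative, not the actual cephadm implementation:
```
from typing import List


def order_candidates_by_related(candidates: List[str],
                                related_daemon_hosts: List[str]) -> List[str]:
    """Prefer hosts that already run a related daemon, keeping the original
    order within each group. Only meaningful for count-based placements;
    explicit host placements are honored as-is."""
    preferred = [h for h in candidates if h in related_daemon_hosts]
    rest = [h for h in candidates if h not in related_daemon_hosts]
    return preferred + rest


# With "count: 1" for ingress and the backend (e.g. nfs) pinned to host3,
# the single ingress daemon ends up on host3 as well.
assert order_candidates_by_related(['host1', 'host2', 'host3'], ['host3'])[0] == 'host3'
```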
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* use annotation style without type comments
* add noqa to import statements to prevent flake8 from complaining
  about imports that are only used for typing
to silence warnings like:
```
flake8 run-test: commands[0] | flake8 --config=tox.ini alerts balancer cephadm cli_api crash devicehealth diskprediction_local hello iostat localpool nfs orchestrator prometheus selftest
cephadm/schedule.py:5:1: F401 'typing.Callable' imported but unused
cephadm/schedule.py:8:1: F401 'ceph.deployment.service_spec.ServiceSpec' imported but unused
cephadm/tests/fixtures.py:17:1: F401 'orchestrator.OrchResult' imported but unused
orchestrator/module.py:4:1: F401 'typing.Set' imported but unused
orchestrator/module.py:17:1: F401 'ceph.deployment.inventory.Device' imported but unused
prometheus/module.py:17:1: F401 'typing.DefaultDict' imported but unused
6 F401 'typing.Callable' imported but unused
ERROR: InvocationError for command /home/jenkins-build/build/workspace/ceph-pull-requests/src/pybind/mgr/.tox/flake8/bin/flake8 --config=tox.ini alerts
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
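A small illustration of the two conventions, in an arbitrary example module rather than one of the files listed above:
```
from typing import Dict, List

# This import exists only for type comments elsewhere; mark it so flake8
# does not report F401 ("imported but unused"):
from typing import DefaultDict  # noqa: F401

# Annotation style, without a '# type:' comment:
daemons_by_host: Dict[str, List[str]] = {}

# The older comment style it replaces:
# daemons_by_host = {}  # type: Dict[str, List[str]]
```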
|
|
|
|
|
|
| |
Fixes: https://tracker.ceph.com/issues/57304
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically, if you have a placement that explicitly defines the hosts
to place on, and then add the _no_schedule label to one of those hosts (which
should cause all daemons to be removed from the host), cephadm will simply
fail to apply the spec, saying the host with the _no_schedule label is "Unknown".
This is because we entirely remove hosts with the _no_schedule label from
the pool of hosts the scheduler has to work with. If we also provide
the scheduler with a list of currently draining hosts, it can handle this
better and the daemons can be drained off the host as expected.
Fixes: https://tracker.ceph.com/issues/56972
Signed-off-by: Adam King <adking@redhat.com>
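A simplified sketch of the scheduler-side idea, with hypothetical names and plain hostname lists (the real code works on HostSpec/DaemonDescription objects):
```
from typing import List


def filter_candidates(explicit_hosts: List[str],
                      known_hosts: List[str],
                      draining_hosts: List[str]) -> List[str]:
    """Keep explicitly requested hosts that are either schedulable or merely
    draining (e.g. carrying the _no_schedule label). Only hosts the scheduler
    has never heard of are treated as unknown and rejected."""
    unknown = [h for h in explicit_hosts
               if h not in known_hosts and h not in draining_hosts]
    if unknown:
        raise ValueError(f'Unknown hosts in placement: {unknown}')
    # Draining hosts stay out of the candidate list, so their daemons get
    # removed, but the spec itself still applies cleanly.
    return [h for h in explicit_hosts if h in known_hosts]


assert filter_candidates(['host1', 'host2'], ['host1'], ['host2']) == ['host1']
```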
|
|
|
|
| |
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
|
|
|
|
|
|
|
|
|
| |
To be able to detect more quickly when certain hosts have gone
offline. This could be useful for the NFS
HA feature, as it requires moving nfs daemons off
offline hosts within 90 seconds.
Signed-off-by: Adam King <adking@redhat.com>
|
|
|
|
|
|
|
|
|
| |
In order to improve nfs availability, we should reschedule
the nfs daemons if there are other hosts we can place an nfs
daemon on, or if a host holds a lower-rank nfs daemon while a
higher-rank one sits on an offline host.
Signed-off-by: Adam King <adking@redhat.com>
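A very rough sketch of the rescheduling decision, deliberately ignoring the rank-swapping details; the helper and its inputs are hypothetical, not the real cephadm logic:
```
from typing import Dict, List


def hosts_needing_reschedule(daemon_hosts: Dict[int, str],
                             offline_hosts: List[str],
                             candidate_hosts: List[str]) -> Dict[int, str]:
    """Map each rank whose daemon sits on an offline host to a free online
    candidate host, if one is available."""
    free = [h for h in candidate_hosts
            if h not in offline_hosts and h not in daemon_hosts.values()]
    moves: Dict[int, str] = {}
    for rank, host in sorted(daemon_hosts.items()):
        if host in offline_hosts and free:
            moves[rank] = free.pop(0)
    return moves


# rank 0 is on an offline host; it gets moved to the spare online host3.
assert hosts_needing_reschedule({0: 'host1', 1: 'host2'}, ['host1'],
                                ['host1', 'host2', 'host3']) == {0: 'host3'}
```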
|
|
|
|
| |
Signed-off-by: Daniel Pivonka <dpivonka@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Every call of find_ip_on_host() actually duplicates the list of public IP
addresses in self.networks, while it should NOT change it. As a result,
the value of the key mgr/cephadm/host.<hostname> in the kv store becomes very large
and may cause the ceph-mgr to crash.
Signed-off-by: Andrew Sharapov <andrewshar@gmail.com>
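An illustration of the bug class being fixed here, with a simplified stand-in for the real method (which kept growing self.networks on every call):
```
from typing import List

networks = {'public': ['10.0.0.1', '10.0.0.2']}


def find_ip_on_host_buggy(extra: List[str]) -> List[str]:
    ips = networks['public']          # this is a reference, not a copy
    ips += extra                      # mutates networks['public'] in place!
    return ips


def find_ip_on_host_fixed(extra: List[str]) -> List[str]:
    ips = list(networks['public'])    # work on a copy; shared state is untouched
    ips += extra
    return ips


find_ip_on_host_fixed(['10.0.0.3'])
assert networks['public'] == ['10.0.0.1', '10.0.0.2']  # unchanged
```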
|
|
|
|
|
|
| |
Fixes: https://tracker.ceph.com/issues/51027
Signed-off-by: Adam King <adking@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
If we are passed a rank_map, use it to maintain one daemon per rank, where
the ranks are consecutive non-negative integers starting from 0.
A bit of refactoring in place() so that we only do the rank allocations
on slots we are going to use (no more than count).
Signed-off-by: Sage Weil <sage@newdream.net>
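A loose sketch of the allocation idea with hypothetical types; the real cephadm rank handling is considerably more involved:
```
from typing import Dict, List, Optional


def assign_ranks(slots: List[str],
                 rank_map: Dict[int, Optional[str]],
                 count: int) -> Dict[int, str]:
    """Give each of the first `count` slots a consecutive rank starting at 0,
    keeping any rank that already has a holder recorded in rank_map."""
    assigned: Dict[int, str] = {}
    free_slots = list(slots[:count])          # only rank the slots we will use
    for rank in range(count):
        holder = rank_map.get(rank)
        if holder in free_slots:
            assigned[rank] = holder           # existing holder keeps its rank
            free_slots.remove(holder)
        elif free_slots:
            assigned[rank] = free_slots.pop(0)
    return assigned


# rank 1 keeps host2; rank 0 gets the first unused slot.
assert assign_ranks(['host1', 'host2', 'host3'], {1: 'host2'}, 2) == {0: 'host1', 1: 'host2'}
```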
|
|
|
|
|
|
|
|
|
|
| |
- DaemonDescription
- CephadmDaemonDeploySpec
- DaemonPlacement
- unit.meta
- get_unique_name() (we include it in the daemon_id)
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
hash(str) is non-deterministic across interpreter runs because Python salts
string hashes (hash randomization), so it cannot give a stable result.
In any case, explicitly hash the string content and use that instead.
Also, sort the input pre-shuffle to ensure that variations in the original
host list ordering don't affect the result.
Signed-off-by: Sage Weil <sage@newdream.net>
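A sketch of a deterministic shuffle along these lines (not necessarily the exact seeding the commit uses):
```
import hashlib
import random
from typing import List


def stable_shuffle(hosts: List[str], key: str) -> List[str]:
    """Shuffle deterministically: seed from a hash of the key's *content*
    (hash(str) is salted per interpreter run) and sort the input first so the
    original host list ordering cannot influence the result."""
    seed = int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)
    ordered = sorted(hosts)
    random.Random(seed).shuffle(ordered)
    return ordered


# Same key, different input order -> identical result on every run.
assert stable_shuffle(['c', 'a', 'b'], 'rgw.foo') == stable_shuffle(['b', 'c', 'a'], 'rgw.foo')
```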
|
|\
| |
| |
| |
| | |
mgr/cephadm/schedule: fix message
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This is now only used when scheduling mons. (Units now enable the kernel
features needed instead of checking for them during placement.) Move the
message to the filter itself.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|/
|
|
|
|
|
|
|
| |
Some of the files in mgr/cephadm get changed
every time I run the mgr tox tests to check
changes, and it's inconvenient to have to check the
files out again every time.
Signed-off-by: Adam King <adking@redhat.com>
|
|
|
|
|
|
| |
This only affects ingress, at least for now.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
|
|
| |
This will be used to schedule a per-host keepalived alongside other
services.
Implement this as a final stage for place() that puts one per host and
also takes existing/stray daemons into consideration.
Signed-off-by: Sage Weil <sage@newdream.net>
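A compact sketch of that final stage, using plain hostname lists instead of the real daemon descriptions:
```
from typing import List


def per_host_slots(candidate_hosts: List[str],
                   existing_daemon_hosts: List[str]) -> List[str]:
    """Final place() stage for per-host daemons (e.g. a keepalived alongside
    another service): exactly one slot per candidate host, plus any host that
    already carries an existing/stray daemon so it can be accounted for."""
    seen = dict.fromkeys(candidate_hosts)     # preserves order, de-duplicates
    for h in existing_daemon_hosts:
        seen.setdefault(h, None)
    return list(seen)


assert per_host_slots(['host1', 'host2'], ['host2', 'host3']) == ['host1', 'host2', 'host3']
```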
|
|
|
|
|
|
| |
Initially, this will always match the service_type.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
| |
Also, a minor fix in the IPv6 address reporting: ignore networks that aren't in
CIDR form (i.e., have no '/').
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
| |
'host(ip:port)' or 'host(*:port)' so we can show it to a user.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Choose an IP from the subnet list provided by the ServiceSpec.
A few caveats:
- we ignore hosts that don't have IPs in the given subnet
- the subnet matching is STRICT. That is, the CIDR name has to exactly
  match what is configured on the host. That means you can't just say 10/8
  to match any 10.* address; you need the exact network configured on the host
  (e.g., 10.1.2.0/24).
- If you modify a ServiceSpec and change the networks when there are
  already deployed daemons, we will try to deploy the new instances on
  the same ports but bound to a specific IP instead of *, which will fail.
  You need to remove the service first, or remove the old daemons manually,
  so that creating the new ones will succeed.
Signed-off-by: Sage Weil <sage@newdream.net>
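A sketch of the strict-match selection, using a hypothetical inventory shape (the real code reads the host cache maintained by cephadm):
```
import ipaddress
from typing import Dict, List, Optional

# Hypothetical inventory: host -> {configured CIDR -> IPs on that network}.
host_networks: Dict[str, Dict[str, List[str]]] = {
    'host1': {'10.1.2.0/24': ['10.1.2.5'], '192.168.0.0/24': ['192.168.0.5']},
    'host2': {'172.16.0.0/24': ['172.16.0.9']},
}


def choose_ip(host: str, spec_networks: List[str]) -> Optional[str]:
    """Pick the first IP on `host` whose configured network exactly matches
    one of the subnets listed in the ServiceSpec (strict CIDR match)."""
    for cidr in spec_networks:
        wanted = ipaddress.ip_network(cidr)
        for configured, ips in host_networks.get(host, {}).items():
            if ipaddress.ip_network(configured) == wanted and ips:
                return ips[0]
    return None   # host has no IP in the requested subnets -> skip this host


assert choose_ip('host1', ['10.1.2.0/24']) == '10.1.2.5'
assert choose_ip('host2', ['10.1.2.0/24']) is None
```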
|
|
|
|
|
|
|
|
| |
1. We only have an IP to bind to if we also have a port, and
2. If we do, we want an exact match: if the DaemonPlacement has an ip of
   None, then the DaemonDescription should have None too.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
| |
Dynamically number ports for RGW instances, with the start port being
the one configured on the service (or the default of 80 or 443).
Signed-off-by: Sage Weil <sage@newdream.net>
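A trivial sketch of the numbering scheme:
```
from typing import List


def ports_for_host(base_port: int, instances_on_host: int) -> List[int]:
    """Number ports consecutively per host, starting from the port configured
    on the service (or the 80/443 default for RGW)."""
    return [base_port + i for i in range(instances_on_host)]


# Three RGW daemons on one host, service configured for port 8000.
assert ports_for_host(8000, 3) == [8000, 8001, 8002]
```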
|
|
|
|
|
|
|
|
| |
It would be weird to dynamically number multiple ports (although doable).
But until we have plans to support something like that, no need to handle
it here.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
| |
Create a new type for the result of scheduling/place(). Include new fields
like ip:port. Introduce a matches_daemon() to see whether an existing
daemon matches a scheduling slot.
Signed-off-by: Sage Weil <sage@newdream.net>
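A stripped-down sketch of such a result type and a matches_daemon() check; the field set and semantics are simplified relative to the real classes:
```
from typing import NamedTuple, Optional


class Slot(NamedTuple):
    """Simplified stand-in for the result type returned by place()."""
    hostname: str
    ip: Optional[str] = None
    port: Optional[int] = None


class Daemon(NamedTuple):
    """Simplified stand-in for an existing daemon description."""
    hostname: str
    ip: Optional[str] = None
    port: Optional[int] = None


def matches_daemon(slot: Slot, dd: Daemon) -> bool:
    """Does an existing daemon already satisfy this scheduling slot?"""
    if slot.hostname != dd.hostname:
        return False
    # Only compare the bind address when the slot actually carries a port.
    if slot.port is not None and (slot.ip, slot.port) != (dd.ip, dd.port):
        return False
    return True


assert matches_daemon(Slot('host1', '10.0.0.5', 80), Daemon('host1', '10.0.0.5', 80))
assert not matches_daemon(Slot('host1', '10.0.0.5', 80), Daemon('host1', None, 80))
```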
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
| |
It no longer does anything except slice the array.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
| |
We already have to examine existing daemons to choose the placements.
There is no reason to make the caller call another method to (re)calculate
what the net additions and removals are.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
| |
For certain daemon types, we can deploy more than one per host (mds,
rbd-mirror, rgw).
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
| |
Otherwise we may end up randomly doubling up on some hosts and leaving none on
others (when we have more than one placement per host).
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
| |
- simpler
- many/most callers already have the daemon list, so we save ourselves
duplicated effort
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
| |
In the no-count cases, our job is simple: we have a set of hosts specified
via some other means (label, filter, explicit list) and simply need to
do N instances for each of those hosts.
Signed-off-by: Sage Weil <sage@newdream.net>
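A minimal sketch of the no-count path, assuming the per-host instance count is already known:
```
from typing import List, Tuple


def slots_without_count(candidate_hosts: List[str],
                        per_host: int = 1) -> List[Tuple[str, int]]:
    """No-count case: the host set is already fixed by a label, filter or
    explicit list, so just emit N instances for each of those hosts."""
    return [(host, i) for host in candidate_hosts for i in range(per_host)]


assert slots_without_count(['host1', 'host2'], per_host=2) == [
    ('host1', 0), ('host1', 1), ('host2', 0), ('host2', 1)]
```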
|
|
|
|
|
|
| |
If we are told to deploy multiple instances on a host, do it.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ceph works just fine with an even number of monitors.
Upside: more copies of critical cluster data.
Downside: we can only tolerate the same number of down mons as with N-1
monitors, and we are slightly more likely to have a failing mon because we
have one more that might fail.
On balance it's not clear that having one fewer mon is any better, so avoid
the confusion and complexity of second-guessing ourselves.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
| |
Signed-off-by: Michael Fritch <mfritch@suse.com>
|
|
|
|
| |
Signed-off-by: Michael Fritch <mfritch@suse.com>
|
|
|
|
| |
Signed-off-by: Michael Fritch <mfritch@suse.com>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
|
|
|
|
|
|
|
|
| |
Cephadm deploys keepalived and haproxy to provide high availability for RGW endpoints.
Fixes: https://tracker.ceph.com/issues/45116
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
Signed-off-by: Adam King <adking@redhat.com>
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
|