author | Renato Westphal <renato@opensourcerouting.org> | 2020-07-02 19:43:36 +0200 |
---|---|---|
committer | Renato Westphal <renato@opensourcerouting.org> | 2020-08-03 20:17:03 +0200 |
commit | b855e95fd36f66d6644437afa248ad6afe6f4c44 (patch) | |
tree | abef1d0073a82b82de75d82fdc368f1be9b1970c /pimd/pim_ssmpingd.c | |
parent | *: introduce DEFPY_YANG & friends (diff) | |
download | frr-b855e95fd36f66d6644437afa248ad6afe6f4c44.tar.xz frr-b855e95fd36f66d6644437afa248ad6afe6f4c44.zip |

lib: introduce configuration back-off timer for YANG-modeled commands

When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated with them, in large part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible). This, however, will be a long-term
effort, since these optimizations are unlikely to be trivial to implement
and we're far from having the performance numbers we need.

In the meantime, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
is introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours). It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.
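
The detection logic described above can be sketched roughly as follows. This is a minimal illustration, not the code added by this commit: the names `cmd_backoff`, `cmd_backoff_record()` and `cmd_backoff_flush()` are hypothetical, the one-second window is modeled as a simple fixed window, and the point at which the grouped transaction is flushed (here, whenever the caller decides input has gone quiet) is an assumption. Only the threshold of 100 commands within one second comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the commit message: 100 commands within one second. */
#define BACKOFF_THRESHOLD 100
#define BACKOFF_WINDOW_MS 1000

/* Hypothetical state; zero-initialize before first use. */
struct cmd_backoff {
	int64_t window_start_ms; /* start of the current counting window */
	unsigned int count;      /* commands seen in the current window */
	bool batching;           /* true once the threshold was reached */
	size_t pending;          /* commands held for the grouped transaction */
};

/*
 * Record one YANG-modeled command received at now_ms.
 * Returns true if the command should be added to the pending grouped
 * transaction, false if it should get its own transaction right away.
 */
bool cmd_backoff_record(struct cmd_backoff *b, int64_t now_ms)
{
	assert(b != NULL);

	if (b->batching) {
		b->pending++;
		return true;
	}

	/* Start a new counting window once the old one has expired. */
	if (now_ms - b->window_start_ms >= BACKOFF_WINDOW_MS) {
		b->window_start_ms = now_ms;
		b->count = 0;
	}

	/* Reaching the threshold switches all subsequent commands to batching. */
	if (++b->count >= BACKOFF_THRESHOLD)
		b->batching = true;

	return false;
}

/*
 * Assumed flush step: returns how many commands go into the single large
 * transaction and resets the state to per-command mode.
 */
size_t cmd_backoff_flush(struct cmd_backoff *b, int64_t now_ms)
{
	assert(b != NULL);

	size_t n = b->pending;

	b->pending = 0;
	b->count = 0;
	b->batching = false;
	b->window_start_ms = now_ms;
	return n;
}
```

In this sketch the first 100 commands of a burst are still committed one by one, and only the commands after the threshold are grouped, which matches the "all subsequent commands" wording above.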

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.
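
The all-or-nothing behavior of a grouped transaction, together with the logging of rejected commands, could look roughly like the sketch below. Everything here is hypothetical (`batch_commit()`, `cmd_validate_fn`, `sample_valid()` and the log text are made up for illustration), and it only models transaction-level rejection; parsing errors and CLI-level validations, which are rejected separately, are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical validator: returns true if the command is acceptable. */
typedef bool (*cmd_validate_fn)(const char *cmd);

/*
 * Commit a group of commands as one transaction.
 * Returns n if every command is accepted. If any command fails, nothing
 * is applied, every command of the group is logged, and 0 is returned.
 */
size_t batch_commit(const char *const *cmds, size_t n,
		    cmd_validate_fn valid, FILE *log)
{
	assert(cmds != NULL && valid != NULL && log != NULL);

	for (size_t i = 0; i < n; i++) {
		if (valid(cmds[i]))
			continue;

		/* One failure rejects the whole group: log all of it. */
		fprintf(log, "transaction rejected, %zu command(s) discarded:\n", n);
		for (size_t j = 0; j < n; j++)
			fprintf(log, "  %s\n", cmds[j]);
		return 0;
	}
	return n;
}

/* Toy validator for demonstration: rejects commands starting with "bad". */
bool sample_valid(const char *cmd)
{
	return strncmp(cmd, "bad", 3) != 0;
}
```

Calling `batch_commit()` with a group in which a single command fails `sample_valid()` returns 0 and writes the full list of discarded commands to the log stream, so the user can see which commands need to be re-entered.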

Again, this proposed workaround is intended to be temporary. The
goal is to provide a quick fix to issues like #6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>

Diffstat (limited to 'pimd/pim_ssmpingd.c')
0 files changed, 0 insertions, 0 deletions