frr - frr

	Commit message	Author	Age	Files	Lines
*	lib: Fix to optimize the time taken while batching huge configs	Rajasekar Raja	2024-12-18	1	-0/+3
	Issue: When the incoming config has, say, 30K prefix-list entries, the current implementation schedules the configs to be batched, and processing takes place only after the entire config has been batched. As part of batching, we perform string concatenation to save all the configs in the buffer, which takes progressively longer as the buffer grows.
	Ex: Imagine each line of config is 50 chars. With a delimiter of ‘- ‘ we end up adding 52 chars to the buffer for each command, i.e. 52*30000 = 1.56M chars. strlcat is an expensive operation: every time we strlcat, we have to traverse to the end of the string to append the new chars. Because of this, we end up adding an extra 6-8 secs for accepting the config.
	Fix: The idea here is to bring back something similar to the backoff count implemented as part of 20e9a402 (lib: introduce configuration back-off timer for YANG-modeled commands). Essentially we keep a cap of 5000 per batch. So once 5000 config commands are batched, we process them, clear the buffer, set the count to 0 and then continue processing the rest of the config.
	option1 file has 30K entries of prefix-list
	Without Fix:
	root@mlx-3700-20:mgmt:/var/log/raja/frr# time sudo vtysh -f option1
	<SNIP>..............
	Waiting for children to finish applying config...
	[25191|staticd] done
	[25189|watchfrr] done
	[25178|ospfd] done
	[25190|pbrd] done
	[25181|bgpd] done
	[25175|zebra] done
	real 0m20.123s
	user 0m9.384s
	sys 0m2.403s
	With Fix:
	root@mlx-3700-20:mgmt:/var/log/raja/frr# time sudo vtysh -f option1
	<SNIP>..............
	Waiting for children to finish applying config...
	[19887|staticd] done
	[19885|watchfrr] done
	[19886|pbrd] done
	[19874|ospfd] done
	[19877|bgpd] done
	[19871|zebra] done
	real 0m12.168s
	user 0m7.511s
	sys 0m1.981s
	Issue: 3589101 Ticket# 3589101
	Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
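The capped batching described in this commit can be sketched roughly as follows. This is a minimal illustration, not FRR's code: `struct batch`, `batch_add()`, `batch_flush()` and the buffer size are hypothetical names and values chosen for the example. The sketch also tracks the used length so that an append does not rescan the buffer; that detail is an assumption of the example and is not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_CAP 5000                  /* per-batch cap named in the commit message */
#define BATCH_BUF_SIZE (BATCH_CAP * 64) /* hypothetical: room for 64 bytes per command */

/* Hypothetical batch state; not FRR's actual data structure. */
struct batch {
	char buf[BATCH_BUF_SIZE];
	size_t len;     /* bytes used, so an append never rescans buf */
	size_t count;   /* commands in the current batch */
	size_t flushes; /* completed batches, kept only for illustration */
};

/* Stand-in for "process the batch": clear the buffer and reset the count. */
static void batch_flush(struct batch *b)
{
	assert(b != NULL);
	if (b->count == 0)
		return;
	/* A real implementation would apply the batched config here. */
	b->buf[0] = '\0';
	b->len = 0;
	b->count = 0;
	b->flushes++;
}

/* Append one command; flush once the cap is reached.
 * Returns 0 on success, -1 if the command does not fit. */
static int batch_add(struct batch *b, const char *cmd)
{
	size_t n;

	assert(b != NULL && cmd != NULL);
	n = strlen(cmd);
	if (b->len + n + 2 > sizeof(b->buf)) /* cmd + '\n' + '\0' */
		return -1;
	memcpy(b->buf + b->len, cmd, n);
	b->len += n;
	b->buf[b->len++] = '\n';
	b->buf[b->len] = '\0';

	if (++b->count >= BATCH_CAP)
		batch_flush(b);
	return 0;
}
```

With a cap in place the buffer never holds more than one batch, so the cost of each append stays bounded regardless of how large the incoming config is.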
*	lib: rework northbound RPC callback	Igor Ryzhov	2024-04-22	1	-11/+23
	Change input/output arguments of the RPC callback from lists of (xpath/value) tuples to YANG data trees.
	Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
*	mgmt, lib: differentiate DELETE and REMOVE operations	Igor Ryzhov	2024-01-11	1	-1/+1
	Currently, there's a single operation type which doesn't return an error if the object doesn't exist. To be compatible with NETCONF/RESTCONF, we should differentiate between DELETE (fails when the object doesn't exist) and REMOVE (doesn't fail if the object doesn't exist).
	Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
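The difference between the two operations can be illustrated with a toy key store. Everything here (`enum op`, `struct store`, `store_erase()`) is a hypothetical stand-in for the example and does not reflect the northbound API's actual types.

```c
#include <assert.h>
#include <string.h>

enum op { OP_DELETE, OP_REMOVE };          /* hypothetical operation names */
enum result { RES_OK, RES_ERR_NOT_FOUND }; /* hypothetical result codes */

#define MAX_OBJS 8

/* Toy object store: a fixed array of keys, NULL meaning "slot empty". */
struct store {
	const char *keys[MAX_OBJS];
};

/* Return the slot index holding key, or -1 if it is not present. */
static int store_find(const struct store *s, const char *key)
{
	assert(s != NULL && key != NULL);
	for (int i = 0; i < MAX_OBJS; i++)
		if (s->keys[i] && strcmp(s->keys[i], key) == 0)
			return i;
	return -1;
}

/* DELETE fails on a missing object; REMOVE treats it as a no-op. */
static enum result store_erase(struct store *s, enum op op, const char *key)
{
	int i = store_find(s, key);

	if (i < 0)
		return op == OP_DELETE ? RES_ERR_NOT_FOUND : RES_OK;
	s->keys[i] = NULL;
	return RES_OK;
}
```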
*	*: Convert `struct event_master` to `struct event_loop`	Donald Sharp	2023-03-24	1	-1/+1
	Let's find a better name for it.
	Signed-off-by: Donald Sharp <sharpd@nvidia.com>
*	*: Convert struct thread_master to struct event_master and it's ilk	Donald Sharp	2023-03-24	1	-1/+1
	Convert the `struct thread_master` to `struct event_master` across the code base.
	Signed-off-by: Donald Sharp <sharpd@nvidia.com>
*	*: auto-convert to SPDX License IDs	David Lamparter	2023-02-09	1	-14/+1
	Done with a combination of regex'ing and banging my head against a wall.
	Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
*	*: apply proper format string attributes	David Lamparter	2023-01-27	1	-2/+3
	So that we get warnings about broken format strings.
	Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
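As a generic illustration of what a format string attribute does, the sketch below uses the GCC/Clang `format(printf, ...)` attribute directly. The macro `PRINTF_LIKE` and the function `log_to_buf()` are hypothetical names for this example; the macros FRR itself uses for this purpose are not shown here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* GCC/Clang extension: tells the compiler which argument is the printf-style
 * format and where the variadic arguments begin, enabling -Wformat checks. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
	__attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Hypothetical helper: argument 3 is the format, varargs start at 4. */
static int log_to_buf(char *buf, size_t size, const char *fmt, ...)
	PRINTF_LIKE(3, 4);

static int log_to_buf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && size > 0 && fmt != NULL);
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place, a call such as `log_to_buf(buf, sizeof(buf), "%s", 5)` draws a compiler warning when format checking is enabled, instead of misbehaving at run time.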
*	lib: northbound cli show/cmd functions must not modify data nodes	Igor Ryzhov	2021-10-13	1	-1/+2
	To ensure this, add a const modifier to functions' arguments. It would have been better to do this from the start and avoid this large code change, but better late than never.
	Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
*	northbound: KISS always batch yang config (file read), it's faster	Christian Hopps	2021-06-02	1	-2/+25
	The backoff code assumed that yang operations always completed quickly. It checked for > 100 YANG modeled commands happening in under 1 second to enable batching. If 100 yang modeled commands always take longer than 1 second, batching is never enabled. This is the exact opposite of what we want to happen since batching speeds the operations up.
	Here are the results for libyang2 code without and with batching.
	| action        | 1K rts  | 2K rts  | 1K rts | 2K rts | 20k rts |
	|               | nobatch | nobatch | batch  | batch  | batch   |
	| Add IPv4      | .881    | 1.28    | .703   | 1.04   | 8.16    |
	| Add Same IPv4 | 28.7    | 113     | .590   | .860   | 6.09    |
	| Rem 1/2 IPv4  | .376    | .442    | .379   | .435   | 1.44    |
	| Add Same IPv4 | 28.7    | 113     | .576   | .841   | 6.02    |
	| Rem All IPv4  | 17.4    | 71.8    | .559   | .813   | 5.57    |
	(IPv6 numbers are basically the same as IPv4, a couple percent slower)
	Clearly we need this. Please note the growth (1K to 2K) w/o batching is non-linear and 100 times slower than batched.
	Notes on code: The use of the new `nb_cli_apply_changes_clear_pending` is to commit any pending changes (including the current one). This is done when the code would not correctly handle a single diff that included the current changes with possible following changes. For example, a "no" command followed by a new value to replace it would be merged into a change, and the code would not deal well with that. A good example of this is BGP neighbor peer-group changing. The other use is after entering a router level (e.g., "router bgp") where the follow-on command handlers expect that router object to now exist. The code eventually needs to be cleaned up to not fail in these cases, but that is for future NB cleanup.
	Signed-off-by: Christian Hopps <chopps@labn.net>
*	*: add errmsg to nb rpc	Chirag Shah	2020-10-05	1	-1/+4
	Display human readable error message in northbound rpc transaction failure. In case of vtysh nb client, the error message will be displayed to user.
	Testing:
	bharat# clear evpn dup-addr vni 1002 ip 11.11.11.11
	Error type: generic error
	Error description: Requested IP's associated MAC aa:aa:aa:aa:aa:aa is still in duplicate state
	Signed-off-by: Chirag Shah <chirag@nvidia.com>
*	lib: introduce configuration back-off timer for YANG-modeled commands	Renato Westphal	2020-08-03	1	-0/+8
	When using the default CLI mode, the northbound layer needs to create a separate transaction to process each YANG-modeled command since they are supposed to be applied immediately (there's no candidate configuration nor the "commit" command like in the transactional CLI).
	The problem is that configuration transactions have an overhead associated with them, in big part because of the use of some heavy libyang functions like `lyd_validate()` and `lyd_diff()`. As of now this overhead is substantial and doesn't scale well when large numbers of transactions need to be performed in sequence.
	As an example, loading 50k prefix-lists using a single transaction takes about 2 seconds on a modern CPU. Loading the same 50k prefix-lists using 50k transactions can take more than an hour to complete (which is unacceptable by any standard). To fix this problem, some heavy optimization work needs to be done on libyang and on the FRR northbound itself too (e.g. perform partial configuration diffs whenever possible).
	This, however, should be a long term effort since these optimizations shouldn't be trivial to implement and we're far from having the performance numbers we need.
	In the meanwhile, this commit introduces a simple but efficient workaround to alleviate the issue. In short, a new back-off timer was introduced in the CLI to monitor and detect when too many YANG-modeled commands are being received at the same time. When a certain threshold is reached (100 YANG-modeled commands within one second), the northbound starts to group all subsequent commands into a single large transaction, which allows them to be processed much faster (e.g. seconds and not hours). It's essentially a protection mechanism that creates dynamically-sized transactions when necessary to prevent performance issues from happening. This mechanism is enabled both when parsing configuration files and when reading commands from a terminal.
	The downside of this optimization is that, if several YANG-modeled commands are grouped into the same transaction and at least one of them fails, the whole transaction is rejected. This is undesirable since users don't expect transactional behavior when that's not enabled explicitly. To minimize this issue, the CLI will log all commands that were rejected whenever that happens, to make the user aware of what happened and have enough information to fix the problem. Commands that fail due to parsing errors or CLI-level validations in general are rejected separately.
	Again, this proposed workaround is intended to be temporary. The goal is to provide a quick fix to issues like #6658 while we work on better long-term solutions.
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
*	lib: avoid expensive operations when editing a candidate config	Renato Westphal	2019-10-12	1	-0/+2
	nb_candidate_edit() was calling both the lyd_schema_sort() and lyd_validate() functions whenever a new node was added to the candidate configuration. This was done to ensure the candidate is always ready to be displayed correctly (libyang only creates default child nodes during the validation process, and data nodes aren't guaranteed to be ordered by default).
	The problem is that the two aforementioned functions are too expensive to be called in the northbound hot path. Instead, it makes more sense to call them only before displaying the configuration (in which case a recursive sort needs to be done). Introduce the nb_cli_show_config_prepare() to achieve that purpose.
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
*	lib: add extern "C" {} blocks to all libfrr headers	Renato Westphal	2019-02-12	1	-0/+8
	These are necessary to use functions defined in these headers from C++.
	Signed-off-by: David Lamparter <equinox@diac24.net>
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
*	lib: fix segfault on freebsd when using vsnprintf() incorrectly	Renato Westphal	2019-01-03	1	-1/+1
	FreeBSD's libc segfaults when vsnprintf() is called with a null format string. Add a null check before calling vsnprintf() to resolve this problem.
	Fixes #3537
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
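The kind of guard this fix describes can be shown with a small wrapper. `safe_format()` is a hypothetical helper written for this example, and returning an empty string for a null format is an assumed policy; the commit message only says that a null check was added.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Format into buf. A NULL format is never handed to vsnprintf(); in this
 * sketch it yields an empty string instead (an assumed policy). */
static int safe_format(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && size > 0);
	if (fmt == NULL) {
		buf[0] = '\0';
		return 0;
	}
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return n;
}
```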
*	lib: add support for confirmed commits	Renato Westphal	2018-12-07	1	-1/+3
	Confirmed commits allow the user to request an automatic rollback to the previous configuration if the commit operation is not confirmed within a number of minutes. This is particularly useful when the user is accessing the CLI through the network (e.g. using SSH) and any configuration change might cause an unexpected loss of connectivity between the user and the managed device (e.g. misconfiguration of a routing protocol). By using a confirmed commit, the user can rest assured the connectivity will be restored after the given timeout expires, avoiding the need to access the router physically to fix the problem.
	When "commit confirmed TIMEOUT" is used, a new "commit" command is expected to confirm the previous commit before the given timeout expires. If "commit confirmed TIMEOUT" is used while there's already a confirmed-commit in progress, the confirmed-commit timeout is reset to the new value.
	In the current implementation, if other users perform commits while there's a confirmed-commit in progress, all commits are rolled back when the confirmed-commit timeout expires. It's recommended to use the "configure exclusive" configuration mode to prevent unexpected outcomes when using confirmed commits.
	When a user exits from the configuration mode while there's a confirmed-commit in progress, the commit is automatically rolled back and the user is notified about it. In the future we might want to prompt the user if he or she really wants to exit from the configuration mode when there's a pending confirmed commit.
	Needless to say, confirmed commits only work for configuration commands converted to the new northbound model. vtysh support will be implemented at a later time.
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
*	lib, ripd: rework API for converted CLI commands	Renato Westphal	2018-11-26	1	-24/+70
	When editing the candidate configuration, the northbound must ensure that either all changes made by a command are accepted or none are. This is done to prevent inconsistent states where only parts of a command are applied in the event any error happens.
	The previous API for converted commands, the nb_cli_cfg_change() function, required callers to pass an array containing all changes that needed to be applied in the candidate configuration. The problem with this API is that it was very inconvenient for complex commands, which change different configuration options depending on several factors. This required users to manipulate the array of configuration changes using low-level primitives, making it complicated to implement some commands.
	To solve this problem, introduce a new API based on the two following functions:
	- nb_cli_enqueue_change()
	- nb_cli_apply_changes()
	The first function is used to enqueue configuration changes, one at a time. Then the nb_cli_apply_changes() function is used to apply all the enqueued configuration changes.
	To implement this, a static-sized array was allocated in the "vty" structure, along with a counter of enqueued changes. This eliminates the need to declare an array of configuration changes in every converted CLI command, simplifying things quite considerably.
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
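The enqueue-then-apply pattern with all-or-nothing semantics can be sketched with toy types. The functions below are deliberately named `toy_enqueue_change()` and `toy_apply_changes()` to make clear they are not the real `nb_cli_*` functions, whose signatures are not given in this log; `struct session` stands in for the "vty" structure, and applying to a scratch copy is one assumed way to get atomicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHANGES 16
#define MAX_KEYS 8

struct change {
	const char *key;
	int value;
};

/* Toy "candidate configuration": a fixed set of integer settings. */
struct config {
	const char *keys[MAX_KEYS];
	int values[MAX_KEYS];
};

/* Per-session state: a static-sized queue plus a counter of queued changes. */
struct session {
	struct config candidate;
	struct change changes[MAX_CHANGES];
	size_t num_changes;
};

/* Queue one change; returns false if the queue is full. */
static bool toy_enqueue_change(struct session *s, const char *key, int value)
{
	assert(s != NULL && key != NULL);
	if (s->num_changes >= MAX_CHANGES)
		return false;
	s->changes[s->num_changes++] = (struct change){ key, value };
	return true;
}

/* Set an existing key; an unknown key means the change is rejected. */
static bool config_set(struct config *c, const char *key, int value)
{
	for (int i = 0; i < MAX_KEYS; i++) {
		if (c->keys[i] && strcmp(c->keys[i], key) == 0) {
			c->values[i] = value;
			return true;
		}
	}
	return false;
}

/* Apply every queued change to a scratch copy and keep the result only
 * if all of them succeed, so the candidate is never partially edited. */
static bool toy_apply_changes(struct session *s)
{
	struct config scratch;
	bool ok = true;

	assert(s != NULL);
	scratch = s->candidate;
	for (size_t i = 0; i < s->num_changes && ok; i++)
		ok = config_set(&scratch, s->changes[i].key, s->changes[i].value);
	if (ok)
		s->candidate = scratch;
	s->num_changes = 0; /* the queue is consumed either way */
	return ok;
}
```

A command handler in this model calls `toy_enqueue_change()` once per setting it wants to touch, depending on its arguments, and finishes with a single `toy_apply_changes()` call.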
*	lib: introduce new northbound API	Renato Westphal	2018-10-27	1	-0/+66
	Signed-off-by: Renato Westphal <renato@opensourcerouting.org>