author | Petr Špaček <petr.spacek@nic.cz> | 2020-01-06 18:35:30 +0100 |
---|---|---|
committer | Tomas Krizek <tomas.krizek@nic.cz> | 2020-01-15 10:38:17 +0100 |
commit | 56947b8308edb6379235fdc21a81f0632adb2b9b (patch) | |
tree | 5d038e15a8540510d745740d2db9a6dc25d8ff65 | |
parent | doc: move reorder_RR() into policy/acl/data manipulation section | |
doc: move Cache and Multiple instance chapters into Performance section
It logically belongs here and it will make high-level structure less
crowded.
-rw-r--r-- | daemon/bindings/cache.rst | 85 |
-rw-r--r-- | doc/config.rst | 110 |
-rw-r--r-- | doc/index.rst | 1 |
-rw-r--r-- | lib/cache/README.rst (renamed from doc/performance.rst) | 10 |
-rw-r--r-- | systemd/multiinst.rst | 90 |
-rw-r--r-- | utils/cache_gc/README.rst | 4 |
6 files changed, 190 insertions, 110 deletions
diff --git a/daemon/bindings/cache.rst b/daemon/bindings/cache.rst
index 388e84a3..cf388912 100644
--- a/daemon/bindings/cache.rst
+++ b/daemon/bindings/cache.rst
@@ -1,11 +1,86 @@
 Cache
-=====
+-----
 
-The default cache in Knot Resolver is persistent on disk, which means that the daemon doesn't lose
-the cached data on restart or crash, and thus performace does not suffer from cold-starts.
+Cache in Knot Resolver is stored on disk and also shared between `Multiple instances`_
+so resolver doesn't lose the cached data on restart or crash.
 
-The cache may be reused between cache
-daemons or manipulated from other processes, making for example synchronized load-balanced recursors possible.
+To improve performance even further the resolver implements so-called aggressive caching
+for DNSSEC-validated data (:rfc:`8198`), which improves performance and also protects
+against some types of Random Subdomain Attacks.
+
+
+.. _`cache_sizing`:
+
+Sizing
+^^^^^^
+
+For personal and small office use-cases cache size around 100 MB is more than enough.
+
+For large deployments we recommend to run Knot Resolver on a dedicated machine,
+and to allocate 90% of machine's free memory for resolver's cache.
+
+For example, imagine you have a machine with 16 GB of memory.
+After machine restart you use command ``free -m`` to determine
+amount of free memory (without swap):
+
+.. code-block:: bash
+
+   $ free -m
+                 total        used        free
+   Mem:          15907         979       14928
+
+Now you can configure cache size to be 90% of the free memory 14 928 MB, i.e. 13 453 MB:
+
+.. code-block:: lua
+
+   -- 90 % of free memory after machine restart
+   cache.size = 13453 * MB
+
+.. _`cache_persistence`:
+
+Persistence
+^^^^^^^^^^^
+.. tip:: Using tmpfs for cache improves performance and reduces disk I/O.
+
+By default the cache is saved on a persistent storage device
+so the content of the cache is persisted during system reboot.
+This usually leads to smaller latency after restart etc.,
+however in certain situations a non-persistent cache storage might be preferred, e.g.:
+
+ - Resolver handles high volume of queries and I/O performance to disk is too low.
+ - Threat model includes attacker getting access to disk content in power-off state.
+ - Disk has limited number of writes (e.g. flash memory in routers).
+
+If non-persistent cache is desired configure cache directory to be on
+tmpfs_ filesystem, a temporary in-memory file storage.
+The cache content will be saved in memory, and thus have faster access
+and will be lost on power-off or reboot.
+
+
+.. note:: In most of the Unix-like systems ``/tmp`` and ``/var/run`` are commonly mounted to tmpfs.
+   While it is technically possible to move the cache to an existing
+   tmpfs filesystem, it is *not recommended*: The path to cache is specified in
+   multiple systemd units, and a shared tmpfs space could be used up by other
+   applications, leading to ``SIGBUS`` errors during runtime.
+
+Mounting the cache directory as tmpfs_ is recommended apparoach.
+Make sure to use appropriate ``size=`` option and don't forget to adjust the
+size in the config file as well.
+
+.. code-block:: none
+
+   # /etc/fstab
+   tmpfs	/var/cache/knot-resolver	tmpfs	rw,size=2G,uid=knot-resolver,gid=knot-resolver,nosuid,nodev,noexec,mode=0700 0 0
+
+.. code-block:: lua
+
+   # /etc/knot-resolver/config
+   cache.size = 2 * GB
+
+.. _tmpfs: https://en.wikipedia.org/wiki/Tmpfs
+
+Configuration reference
+^^^^^^^^^^^^^^^^^^^^^^^
 
 .. function:: cache.open(max_size[, config_uri])
 
diff --git a/doc/config.rst b/doc/config.rst
index 399f28c0..1fb22551 100644
--- a/doc/config.rst
+++ b/doc/config.rst
@@ -67,101 +67,9 @@ Now you know what configuration file to modify, how to read examples and what mo
 
 .. include:: ../daemon/README.rst
 .. include:: ../daemon/bindings/net.rst
-.. include:: ../daemon/bindings/cache.rst
 .. include:: ../daemon/lua/trust_anchors.rst
 
 
-Multiple instances
-==================
-
-Knot Resolver can utilize multiple CPUs running in multiple independent instances (processes), where each process utilizes at most single CPU core on your machine. If your machine handles a lot of DNS traffic run multiple instances.
-
-All instances typically share the same configuration and cache, and incomming queries are automatically distributed by operating system among all instances.
-
-Advantage of using multiple instances is that a problem in a single instance will not affect others, so a single instance crash will not bring whole DNS resolver service down.
-
-.. tip:: For maximum performance, there should be as many kresd processes as
-   there are available CPU threads.
-
-To run multiple instances, use a different identifier after `@` sign for each instance, for
-example:
-
-.. code-block:: bash
-
-   $ systemctl start kresd@1.service
-   $ systemctl start kresd@2.service
-   $ systemctl start kresd@3.service
-   $ systemctl start kresd@4.service
-
-With the use of brace expansion in BASH the equivalent command looks like this:
-
-.. code-block:: bash
-
-   $ systemctl start kresd@{1..4}.service
-
-For more details see ``kresd.systemd(7)``.
-
-
-Zero-downtime restarts
-----------------------
-Resolver restart normally takes just miliseconds and cache content is persistent to avoid performance drop
-after restart. If you want real zero-downtime restarts use `multiple instances`_ and do rolling
-restart, i.e. restart only one resolver process at a time.
-
-On a system with 4 instances run these commands sequentially:
-
-.. code-block:: bash
-
-   $ systemctl restart kresd@1.service
-   $ systemctl restart kresd@2.service
-   $ systemctl restart kresd@3.service
-   $ systemctl restart kresd@4.service
-
-At any given time only a single instance is stopped and restarted so remaining three instances continue to service clients.
-
-
-.. _instance-specific-configuration:
-
-Instance-specific configuration
--------------------------------
-
-Instances can use arbitraty identifiers for the instances, for example we can name instances like `dns1`, `tls` and so on.
-
-.. code-block:: bash
-
-   $ systemctl start kresd@dns1
-   $ systemctl start kresd@dns2
-   $ systemctl start kresd@tls
-   $ systemctl start kresd@doh
-
-The instance name is subsequently exposed to kresd via the environment variable
-``SYSTEMD_INSTANCE``. This can be used to tell the instances apart, e.g. when
-using the :ref:`mod-nsid` module with per-instance configuration:
-
-.. code-block:: lua
-
-   local systemd_instance = os.getenv("SYSTEMD_INSTANCE")
-
-   modules.load('nsid')
-   nsid.name(systemd_instance)
-
-More arcane set-ups are also possible. The following example isolates the
-individual services for classic DNS, DoT and DoH from each other.
-
-.. code-block:: lua
-
-   local systemd_instance = os.getenv("SYSTEMD_INSTANCE")
-
-   if string.match(systemd_instance, '^dns') then
-      net.listen('127.0.0.1', 53, { kind = 'dns' })
-   elseif string.match(systemd_instance, '^tls') then
-      net.listen('127.0.0.1', 853, { kind = 'tls' })
-   elseif string.match(systemd_instance, '^doh') then
-      net.listen('127.0.0.1', 443, { kind = 'doh' })
-   else
-      panic("Use kresd@dns*, kresd@tls* or kresd@doh* instance names")
-   end
-
 Logging, monitoring, diagnostics
 ================================
 Knot Resolver logs to standard outputs, which is then captured by supervisor
@@ -232,9 +140,27 @@ order of records in DNS answers sent by resolver:
 
 It is disabled by default.
 
+.. _performance:
 
 Performance and resiliency
 ==========================
+For DNS resolvers, the most important parameter from performance perspective
+is cache hit rate, i.e. percentage of queries answered from resolver's cache.
+Generally the higher cache hit rate the better.
+
+Performance tunning should start with cache :ref:`cache_sizing`
+and :ref:`cache_persistence`.
+
+It is also recommended to run `Multiple instances`_ (even on a single machine!)
+because it allows to utilize multiple CPU threads
+and increases overall resiliency.
+
+Other features described in this section can be used for fine-tunning
+performance and resiliency of the resolver but generally have much smaller
+impact than cache settings and number of instances.
+
+.. include:: ../daemon/bindings/cache.rst
+.. include:: ../systemd/multiinst.rst
 .. include:: ../modules/predict/README.rst
 .. include:: ../modules/priming/README.rst
 .. include:: ../modules/rfc7706.rst
diff --git a/doc/index.rst b/doc/index.rst
index 80cfc845..28bbc5bb 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -27,7 +27,6 @@ and it provides a state-machine like API for extensions.
    :caption: Operation
    :maxdepth: 1
 
-   performance
    monitoring
    upgrading
    NEWS
diff --git a/doc/performance.rst b/lib/cache/README.rst
index 29fed5bd..c0c3db42 100644
--- a/doc/performance.rst
+++ b/lib/cache/README.rst
@@ -1,8 +1,3 @@
-.. _performance:
-
-Performance tunning
-===================
-
 .. _cache_sizing:
 
 Cache sizing
@@ -70,8 +65,3 @@ size in the config file as well.
    cache.size = 2 * GB
 
 .. _tmpfs: https://en.wikipedia.org/wiki/Tmpfs
-
-
-Utilizing all CPUs
-------------------
-
diff --git a/systemd/multiinst.rst b/systemd/multiinst.rst
new file mode 100644
index 00000000..d9be8538
--- /dev/null
+++ b/systemd/multiinst.rst
@@ -0,0 +1,90 @@
+Multiple instances
+------------------
+
+Knot Resolver can utilize multiple CPUs running in multiple independent instances (processes), where each process utilizes at most single CPU core on your machine. If your machine handles a lot of DNS traffic run multiple instances.
+
+All instances typically share the same configuration and cache, and incomming queries are automatically distributed by operating system among all instances.
+
+Advantage of using multiple instances is that a problem in a single instance will not affect others, so a single instance crash will not bring whole DNS resolver service down.
+
+.. tip:: For maximum performance, there should be as many kresd processes as
+   there are available CPU threads.
+
+To run multiple instances, use a different identifier after `@` sign for each instance, for
+example:
+
+.. code-block:: bash
+
+   $ systemctl start kresd@1.service
+   $ systemctl start kresd@2.service
+   $ systemctl start kresd@3.service
+   $ systemctl start kresd@4.service
+
+With the use of brace expansion in BASH the equivalent command looks like this:
+
+.. code-block:: bash
+
+   $ systemctl start kresd@{1..4}.service
+
+For more details see ``kresd.systemd(7)``.
+
+
+Zero-downtime restarts
+^^^^^^^^^^^^^^^^^^^^^^
+Resolver restart normally takes just miliseconds and cache content is persistent to avoid performance drop
+after restart. If you want real zero-downtime restarts use `multiple instances`_ and do rolling
+restart, i.e. restart only one resolver process at a time.
+
+On a system with 4 instances run these commands sequentially:
+
+.. code-block:: bash
+
+   $ systemctl restart kresd@1.service
+   $ systemctl restart kresd@2.service
+   $ systemctl restart kresd@3.service
+   $ systemctl restart kresd@4.service
+
+At any given time only a single instance is stopped and restarted so remaining three instances continue to service clients.
+
+
+.. _instance-specific-configuration:
+
+Instance-specific configuration
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Instances can use arbitraty identifiers for the instances, for example we can name instances like `dns1`, `tls` and so on.
+
+.. code-block:: bash
+
+   $ systemctl start kresd@dns1
+   $ systemctl start kresd@dns2
+   $ systemctl start kresd@tls
+   $ systemctl start kresd@doh
+
+The instance name is subsequently exposed to kresd via the environment variable
+``SYSTEMD_INSTANCE``. This can be used to tell the instances apart, e.g. when
+using the :ref:`mod-nsid` module with per-instance configuration:
+
+.. code-block:: lua
+
+   local systemd_instance = os.getenv("SYSTEMD_INSTANCE")
+
+   modules.load('nsid')
+   nsid.name(systemd_instance)
+
+More arcane set-ups are also possible. The following example isolates the
+individual services for classic DNS, DoT and DoH from each other.
+
+.. code-block:: lua
+
+   local systemd_instance = os.getenv("SYSTEMD_INSTANCE")
+
+   if string.match(systemd_instance, '^dns') then
+      net.listen('127.0.0.1', 53, { kind = 'dns' })
+   elseif string.match(systemd_instance, '^tls') then
+      net.listen('127.0.0.1', 853, { kind = 'tls' })
+   elseif string.match(systemd_instance, '^doh') then
+      net.listen('127.0.0.1', 443, { kind = 'doh' })
+   else
+      panic("Use kresd@dns*, kresd@tls* or kresd@doh* instance names")
+   end
diff --git a/utils/cache_gc/README.rst b/utils/cache_gc/README.rst
index 41991296..6a3f4003 100644
--- a/utils/cache_gc/README.rst
+++ b/utils/cache_gc/README.rst
@@ -1,7 +1,7 @@
 Garbage Collector
------------------
+^^^^^^^^^^^^^^^^^
 
-Knot Resolver employs a separate garbage collector daemon (process ``kres-cache-gc``) which periodically trims the cache to keep its size below a configured size limit.
+Knot Resolver employs a separate garbage collector daemon which periodically trims the cache to keep its size below size limit configured using :envvar:`cache.size`.
 
 Systemd service ``kres-cache-gc.service`` is enabled by default and does not need any manual intervention.
 