summaryrefslogtreecommitdiffstats
path: root/Documentation/networking (follow)
Commit message (Collapse)AuthorAgeFilesLines
* mptcp: add sysctl allow_join_initial_addr_portGeliang Tang2021-06-221-0/+13
| | | | | | | | | | | | This patch added a new sysctl, named allow_join_initial_addr_port, to control whether allow peers to send join requests to the IP address and port number used by the initial subflow. Suggested-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: add probe_interval in sysctl and sock/asoc/transportXin Long2021-06-221-0/+8
| | | | | | | | | | | | PLPMTUD can be enabled by doing 'sysctl -w net.sctp.probe_interval=n'. 'n' is the interval for PLPMTUD probe timer in milliseconds, and it can't be less than 5000 if it's not 0. All asoc/transport's PLPMTUD in a new socket will be enabled by default. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Document behavior when module EEPROM bank attribute is omittedIdo Schimmel2021-06-221-0/+2
| | | | | | | | | | The kernel assumes bank 0 when 'ETHTOOL_MSG_MODULE_EEPROM_GET' is sent without 'ETHTOOL_A_MODULE_EEPROM_BANK'. Document it as part of the interface documentation. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Document correct attribute typeIdo Schimmel2021-06-221-1/+1
| | | | | | | 'ETHTOOL_A_MODULE_EEPROM_DATA' is a binary attribute, not a nested one. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Use correct command name in titleIdo Schimmel2021-06-221-2/+2
| | | | | | | | The command is called 'ETHTOOL_MSG_MODULE_EEPROM_GET', not 'ETHTOOL_MSG_MODULE_EEPROM'. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: add a new sysctl checksum_enabledGeliang Tang2021-06-181-0/+8
| | | | | | | | | | | | This patch added a new sysctl, named checksum_enabled, to control whether DSS checksum can be enabled. Acked-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2021-06-171-0/+25
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-17 The following pull-request contains BPF updates for your *net-next* tree. We've added 50 non-merge commits during the last 25 day(s) which contain a total of 148 files changed, 4779 insertions(+), 1248 deletions(-). The main changes are: 1) BPF infrastructure to migrate TCP child sockets from a listener to another in the same reuseport group/map, from Kuniyuki Iwashima. 2) Add a provably sound, faster and more precise algorithm for tnum_mul() as noted in https://arxiv.org/abs/2105.05398, from Harishankar Vishwanathan. 3) Streamline error reporting changes in libbpf as planned out in the 'libbpf: the road to v1.0' effort, from Andrii Nakryiko. 4) Add broadcast support to xdp_redirect_map(), from Hangbin Liu. 5) Extends bpf_map_lookup_and_delete_elem() functionality to 4 more map types, that is, {LRU_,PERCPU_,LRU_PERCPU_,}HASH, from Denis Salopek. 6) Support new LLVM relocations in libbpf to make them more linker friendly, also add a doc to describe the BPF backend relocations, from Yonghong Song. 7) Silence long standing KUBSAN complaints on register-based shifts in interpreter, from Daniel Borkmann and Eric Biggers. 8) Add dummy PT_REGS macros in libbpf to fail BPF program compilation when target arch cannot be determined, from Lorenz Bauer. 9) Extend AF_XDP to support large umems with 1M+ pages, from Magnus Karlsson. 10) Fix two minor libbpf tc BPF API issues, from Kumar Kartikeya Dwivedi. 11) Move libbpf BPF_SEQ_PRINTF/BPF_SNPRINTF macros that can be used by BPF programs to bpf_helpers.h header, from Florent Revest. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Introduce net.ipv4.tcp_migrate_req.Kuniyuki Iwashima2021-06-151-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a new sysctl option: net.ipv4.tcp_migrate_req. If this option is enabled or eBPF program is attached, we will be able to migrate child sockets from a listener to another in the same reuseport group after close() or shutdown() syscalls. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-2-kuniyu@amazon.co.jp
* | documentation: networking: devlink: fix prestera.rst formatting that causes ↵Oleksandr Mazur2021-06-173-2/+4
| | | | | | | | | | | | | | | | | | build warnings Fixes: 66826c43e63d ("documentation: networking: devlink: add prestera switched driver Documentation") Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: wwan: iosm: Fix htmldocs warningsM Chetan Kumar2021-06-151-4/+4
| | | | | | | | | | | | | | | | | | Fixes .rst file warnings seen on linux-next build. Fixes: f7af616c632e ("net: iosm: infrastructure") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | documentation: networking: devlink: add prestera switched driver DocumentationOleksandr Mazur2021-06-141-0/+141
| | | | | | | | | | | | | | | | | | Add documentation for the devlink feature prestera switchdev driver supports: add description for the support of the driver-specific devlink traps (include both traps with action TRAP and action DROP); Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: iosm: infrastructureM Chetan Kumar2021-06-133-0/+115
| | | | | | | | | | | | | | | | | | 1) Kconfig & Makefile changes for IOSM Driver compilation. 2) Add IOSM Driver documentation. 3) Modified MAINTAINER file for IOSM Driver addition. Signed-off-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: phy: Add 25G BASE-R interface modeSteen Hegelund2021-06-121-0/+6
| | | | | | | | | | | | | | | | | | | | Add 25gbase-r phy interface mode Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: dsa: generalize overhead for taggers that use both headers and trailersVladimir Oltean2021-06-111-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some really really weird switches just couldn't decide whether to use a normal or a tail tagger, so they just did both. This creates problems for DSA, because we only have the concept of an 'overhead' which can be applied to the headroom or to the tailroom of the skb (like for example during the central TX reallocation procedure), depending on the value of bool tail_tag, but not to both. We need to generalize DSA to cater for these odd switches by transforming the 'overhead / tail_tag' pair into 'needed_headroom / needed_tailroom'. The DSA master's MTU is increased to account for both. The flow dissector code is modified such that it only calls the DSA adjustment callback if the tagger has a non-zero header length. Taggers are trivially modified to declare either needed_headroom or needed_tailroom, based on the tail_tag value that they currently declare. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/mlx5: Bridge, add tracepointsVlad Buslov2021-06-101-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move private bridge structures to dedicated headers that is accessible to bridge tracepoint header. Implemented following tracepoints: - Initialize FDB entry. - Refresh FDB entry. - Cleanup FDB entry. - Create VLAN. - Cleanup VLAN. - Attach port to bridge. - Detach port from bridge. Usage example: ># cd /sys/kernel/debug/tracing ># echo mlx5:mlx5_esw_bridge_fdb_entry_init >> set_event ># cat trace ... kworker/u20:1-96 [001] .... 231.892503: mlx5_esw_bridge_fdb_entry_init: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=3 flags=0 lastuse=4294895695 Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5: Bridge, support pvid and untagged vlan configurationsVlad Buslov2021-06-101-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for pushing vlan header into untagged packet on ingress of port that has pvid configured and support for popping vlan on egress of port that has the matching vlan configured as untagged. To support such configurations packet reformat contexts of {INSERT|REMOVE}_HEADER types are created per such vlan and saved to struct mlx5_esw_bridge_vlan which allows all FDB entries on particular vlan to share single packet reformat instance. When initializing FDB entries with pvid or untagged vlan type set its mlx5_flow_act->pkt_reformat action accordingly. Flush all flows when removing vlan from port. This is necessary because even though software bridge removes all FDB entries before removing their vlan, mlx5 bridge implementation deletes their corresponding flow entries from hardware in asynchronous workqueue task, which will cause firmware error if vlan packet reformat context is deleted before all flows that point to it. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5: Bridge, match FDB entry vlan tagVlad Buslov2021-06-101-0/+9
| | | | | | | | | | | | | | | | | | | | | | Add support for FDB vlan-tagged entries. Extend ingress and egress flow tables with flow groups to match packet vlan tag. Modify the flow creation code to include vlan tag, if vlan is configured on port and vlan configuration is supported for offload. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net/mlx5: Bridge, handle FDB eventsVlad Buslov2021-06-101-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hardware supported by mlx5 driver doesn't provide learning and requires the driver to emulate all switch-like behavior in software. As such, all packets by default go through miss path, appear on representor and get to software bridge, if it is the upper device of the representor. This causes bridge to process packet in software, learn the MAC address to FDB and send SWITCHDEV_FDB_ADD_TO_DEVICE event to all subscribers. In order to offload FDB entries in mlx5, register switchdev notifier callback and implement support for both 'added_by_user' and dynamic FDB entry SWITCHDEV_FDB_ADD_TO_DEVICE events asynchronously using new mlx5_esw_bridge_offloads->wq ordered workqueue. In workqueue callback offload the ingress rule (matching FDB entry MAC as packet source MAC) and egress table rule (matching FDB entry MAC as destination MAC). For ingress table rule also match source vport to ensure that only traffic coming from expected bridge port is matched by offloaded rule. Save all the relevant FDB entry data in struct mlx5_esw_bridge_fdb_entry instance and insert the instance in new mlx5_esw_bridge->fdb_list list (for traversing all entries by software ageing implementation in following patch) and in new mlx5_esw_bridge->fdb_ht hash table for fast retrieval. Notify the bridge that FDB entry has been offloaded by sending SWITCHDEV_FDB_OFFLOADED notification. Delete FDB entry on reception of SWITCHDEV_FDB_DEL_TO_DEVICE event. Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
* | net: ena: fix RST format in ENA documentation fileShay Agroskin2021-06-081-86/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The documentation file used to be written in markdown format but was converted to reStructuredText (rst). The converted file doesn't keep up with rst format requirements which results in hard-to-read text. This patch fixes the formatting of the file. The patch also * Highlights and emphasizes some lines to improve readability * Rephrases some hard-to-understand text * Updates outdated function descriptions. * Removes TSO description which falsely claims the driver supports it Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Documentation: devlink rate objectsDmytro Linkin2021-06-022-0/+61
| | | | | | | | | | | | | | | | | | Add devlink rate objects section at devlink port documentation. Add devlink rate support info at netdevsim devlink documentation. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | docs: networking: Add documentation for MAPv5Sharath Chandra Vurukala2021-06-021-12/+114
| | | | | | | | | | | | | | | | Adding documentation explaining the new MAPv4/v5 packet formats and the corresponding checksum offload headers. Signed-off-by: Sharath Chandra Vurukala <sharathv@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | mptcp: restrict values of 'enabled' sysctlMatthieu Baerts2021-05-281-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid confusions, it seems better to parse this sysctl parameter as a boolean. We use it as a boolean, no need to parse an integer and bring confusions if we see a value different from 0 and 1, especially with this parameter name: enabled. It seems fine to do this modification because the default value is 1 (enabled). Then the only other interesting value to set is 0 (disabled). All other values would not have changed the default behaviour. Suggested-by: Florian Westphal <fw@strlen.de> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-05-272-3/+3
|\ \ | |/ |/| | | | | | | cdc-wdm: s/kill_urbs/poison_urbs/ to fix build Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * docs: networking: device_drivers: fix bad usage of UTF-8 charsMauro Carvalho Chehab2021-05-112-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Probably because the original file was pre-processed by some tool, both i40e.rst and iavf.rst files are using this character: - U+2013 ('–'): EN DASH meaning an hyphen when calling a command line application, which is obviously wrong. So, replace them by an hyphen, ensuring that it will be properly displayed as literals when building the documentation. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/95eb2a48d0ca3528780ce0dfce64359977fa8cb3.1620744606.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | ipv6: Add custom multipath hash policyIdo Schimmel2021-05-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new multipath hash policy where the packet fields used for hash calculation are determined by user space via the fib_multipath_hash_fields sysctl that was introduced in the previous patch. The current set of available packet fields includes both outer and inner fields, which requires two invocations of the flow dissector. Avoid unnecessary dissection of the outer or inner flows by skipping dissection if none of the outer or inner fields are required. In accordance with the existing policies, when an skb is not available, packet fields are extracted from the provided flow key. In which case, only outer fields are considered. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv6: Add a sysctl to control multipath hash fieldsIdo Schimmel2021-05-181-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A subsequent patch will add a new multipath hash policy where the packet fields used for multipath hash calculation are determined by user space. This patch adds a sysctl that allows user space to set these fields. The packet fields are represented using a bitmask and are common between IPv4 and IPv6 to allow user space to use the same numbering across both protocols. For example, to hash based on standard 5-tuple: # sysctl -w net.ipv6.fib_multipath_hash_fields=0x0037 net.ipv6.fib_multipath_hash_fields = 0x0037 To avoid introducing holes in 'struct netns_sysctl_ipv6', move the 'bindv6only' field after the multipath hash fields. The kernel rejects unknown fields, for example: # sysctl -w net.ipv6.fib_multipath_hash_fields=0x1000 sysctl: setting key "net.ipv6.fib_multipath_hash_fields": Invalid argument Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: Add custom multipath hash policyIdo Schimmel2021-05-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new multipath hash policy where the packet fields used for hash calculation are determined by user space via the fib_multipath_hash_fields sysctl that was introduced in the previous patch. The current set of available packet fields includes both outer and inner fields, which requires two invocations of the flow dissector. Avoid unnecessary dissection of the outer or inner flows by skipping dissection if none of the outer or inner fields are required. In accordance with the existing policies, when an skb is not available, packet fields are extracted from the provided flow key. In which case, only outer fields are considered. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: Add a sysctl to control multipath hash fieldsIdo Schimmel2021-05-181-0/+27
|/ | | | | | | | | | | | | | | | | | | | | | | | A subsequent patch will add a new multipath hash policy where the packet fields used for multipath hash calculation are determined by user space. This patch adds a sysctl that allows user space to set these fields. The packet fields are represented using a bitmask and are common between IPv4 and IPv6 to allow user space to use the same numbering across both protocols. For example, to hash based on standard 5-tuple: # sysctl -w net.ipv4.fib_multipath_hash_fields=0x0037 net.ipv4.fib_multipath_hash_fields = 0x0037 The kernel rejects unknown fields, for example: # sysctl -w net.ipv4.fib_multipath_hash_fields=0x1000 sysctl: setting key "net.ipv4.fib_multipath_hash_fields": Invalid argument More fields can be added in the future, if needed. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski2021-04-281-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2021-04-28 The main changes are: 1) Add link detach and following re-attach for trampolines, from Jiri Olsa. 2) Use kernel's "binary printf" lib for formatted output BPF helpers (which avoids the needs for variadic argument handling), from Florent Revest. 3) Fix verifier 64 to 32 bit min/max bound propagation, from Daniel Borkmann. 4) Convert cpumap to use netif_receive_skb_list(), from Lorenzo Bianconi. 5) Add generic batched-ops support to percpu array map, from Pedro Tammela. 6) Various CO-RE relocation BPF selftests fixes, from Andrii Nakryiko. 7) Misc doc rst fixes, from Hengqi Chen. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: bpf, selftests: Update array map tests for per-cpu batched ops bpf: Add batched ops support for percpu array bpf: Implement formatted output helpers with bstr_printf seq_file: Add a seq_bprintf function bpf, docs: Fix literal block for example code bpf, cpumap: Bulk skb using netif_receive_skb_list bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds bpf: Lock bpf_trace_printk's tmp buf before it is written to selftests/bpf: Fix core_reloc test runner selftests/bpf: Fix field existence CO-RE reloc tests selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE selftests/bpf: Add remaining ASSERT_xxx() variants selftests/bpf: Use ASSERT macros in lsm test selftests/bpf: Test that module can't be unloaded with attached trampoline selftests/bpf: Add re-attach test to lsm test selftests/bpf: Add re-attach test to fexit_test selftests/bpf: Add re-attach test to fentry_test bpf: Allow trampoline re-attach for tracing and lsm programs ==================== Link: https://lore.kernel.org/r/20210427233740.22238-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * bpf, docs: Fix literal block for example codeHengqi Chen2021-04-271-1/+1
| | | | | | | | | | | | | | | | | | Add a missing colon so that the code block followed can be rendered properly. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210424021208.832116-1-hengqi.chen@gmail.com
* | docs: networking: timestamping: update for DSA switchesYangbo Lu2021-04-271-24/+39
| | | | | | | | | | | | | | | | | | | | Update timestamping doc for DSA switches to describe current implementation accurately. On TX, the skb cloning is no longer in DSA generic code. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | can: add a note that RECV_OWN_MSGS frames are subject to filteringErik Flodin2021-04-241-0/+2
|/ | | | | | | | | | | | | | | | | | | | Some parts of the documentation may lead the reader to think that the socket's own frames are always received when CAN_RAW_RECV_OWN_MSGS is enabled, but all frames are subject to filtering. As explained by Marc Kleine-Budde: On TX complete of a CAN frame it's pushed into the RX path of the networking stack, along with the information of the originating socket. Then the CAN frame is delivered into AF_CAN, where it is passed on to all registered receivers depending on filters. One receiver is the sending socket in CAN_RAW. Then in CAN_RAW the it is checked if the sending socket has RECV_OWN_MSGS enabled. Link: https://lore.kernel.org/r/20210420191212.42753-1-erik@flodin.me Signed-off-by: Erik Flodin <erik@flodin.me> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* ethtool: add missing EEPROM to list of messagesJakub Kicinski2021-04-201-34/+36
| | | | | | | | | | | | ETHTOOL_MSG_MODULE_EEPROM_GET is missing from the list of messages. ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY is sadly a rather long name so we need to adjust column length. v2: use spaces (Andrew) Fixes: c781ff12a2f3 ("ethtool: Allow network drivers to dump arbitrary EEPROM data") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-04-172-15/+13
|\ | | | | | | | | | | | | | | | | drivers/net/ethernet/stmicro/stmmac/stmmac_main.c - keep the ZC code, drop the code related to reinit net/bridge/netfilter/ebtables.c - fix build after move to net_generic Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * doc: move seg6_flowlabel to seg6-sysctl.rstNicolas Dichtel2021-04-142-15/+13
| | | | | | | | | | | | | | | | Let's have all seg6 sysctl at the same place. Fixes: a6dc6670cd7e ("ipv6: sr: Add documentation for seg_flowlabel sysctl") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | docs: ethtool: document standard statisticsJakub Kicinski2021-04-171-0/+82
| | | | | | | | | | | | | | Add documentation for ETHTOOL_MSG_STATS_GET. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | docs: networking: extend the statistics documentationJakub Kicinski2021-04-171-2/+42
| | | | | | | | | | | | | | | | Make the lack of expectations for switching NICs explicit, describe the new stats. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ethtool: add FEC statisticsJakub Kicinski2021-04-162-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similarly to pause statistics add stats for FEC. The IEEE standard mandates two sets of counters: - 30.5.1.1.17 aFECCorrectedBlocks - 30.5.1.1.18 aFECUncorrectableBlocks where block is a block of bits FEC operates on. Each of these counters is defined per lane (PCS instance). Multiple vendors provide number of corrected _bits_ rather than/as well as blocks. This set adds the 2 standard-based block counters and a extra one for corrected bits. Counters are exposed to user space via netlink in new attributes. Each attribute carries an array of u64s, first element is the total count, and the following ones are a per-lane break down. Much like with pause stats the operation will not fail when driver does not implement the get_fec_stats callback (nor can the driver fail the operation by returning an error). If stats can't be reported the relevant attributes will be empty. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/mlx5: E-Switch, let user to enable disable metadataParav Pandit2021-04-141-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently each packet inserted in eswitch is tagged with a internal metadata to indicate source vport. Metadata tagging is not always needed. Metadata insertion is needed for multi-port RoCE, failover between representors and stacked devices. In many other cases, metadata enablement is not needed. Metadata insertion slows down the packet processing rate of the E-switch when it is in switchdev mode. Below table show performance gain with metadata disabled for VXLAN offload rules in both SMFS and DMFS steering mode on ConnectX-5 device. ---------------------------------------------- | steering | metadata | pkt size | rx pps | | mode | | | (million) | ---------------------------------------------- | smfs | disabled | 128Bytes | 42 | ---------------------------------------------- | smfs | enabled | 128Bytes | 36 | ---------------------------------------------- | dmfs | disabled | 128Bytes | 42 | ---------------------------------------------- | dmfs | enabled | 128Bytes | 36 | ---------------------------------------------- Hence, allow user to disable metadata using driver specific devlink parameter. Metadata setting of the eswitch is applicable only for the switchdev mode. Example to show and disable metadata before changing eswitch mode: $ devlink dev param show pci/0000:06:00.0 name esw_port_metadata pci/0000:06:00.0: name esw_port_metadata type driver-specific values: cmode runtime value true $ devlink dev param set pci/0000:06:00.0 \ name esw_port_metadata value false cmode runtime $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> --- changelog: v1->v2: - added performance numbers in commit log - updated commit log and documentation for switchdev mode - added explicit note on when user can disable metadata in documentation
* | ethtool: Allow network drivers to dump arbitrary EEPROM dataVladyslav Tarasiuk2021-04-121-2/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Define get_module_eeprom_by_page() ethtool callback and implement netlink infrastructure. get_module_eeprom_by_page() allows network drivers to dump a part of module's EEPROM specified by page and bank numbers along with offset and length. It is effectively a netlink replacement for get_module_info() and get_module_eeprom() pair, which is needed due to emergence of complex non-linear EEPROM layouts. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2021-04-101-5/+5
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * docs: ethtool: fix some copy-paste errorsJakub Kicinski2021-04-071-5/+5
| | | | | | | | | | | | | | | | Fix incorrect documentation. Mostly referring to other objects, likely because the text was copied and not adjusted. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | docs: ethtool: correct quotesJakub Kicinski2021-04-071-2/+2
| | | | | | | | | | | | | | | | Quotes to backticks. All commands use backticks since the names are constants. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: x25: Queue received packets in the drivers instead of per-CPU queuesXie He2021-04-051-56/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X.25 Layer 3 (the Packet Layer) expects layer 2 to provide a reliable datalink service such that no packets are reordered or dropped. And X.25 Layer 2 (the LAPB layer) is indeed designed to provide such service. However, this reliability is not preserved when a driver calls "netif_rx" to deliver the received packets to layer 3, because "netif_rx" will put the packets into per-CPU queues before they are delivered to layer 3. If there are multiple CPUs, the order of the packets may not be preserved. The per-CPU queues may also drop packets if there are too many. Therefore, we should not call "netif_rx" to let it queue the packets. Instead, we should use our own queue that won't reorder or drop packets. This patch changes all X.25 drivers to use their own queues instead of calling "netif_rx". The patch also documents this requirement in the "x25-iface" documentation. Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: document a side effect of ip_local_reserved_portsOtto Hollmann2021-04-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is overlapp between ip_local_port_range and ip_local_reserved_ports with a huge reserved block, it will affect probability of selecting ephemeral ports, see file net/ipv4/inet_hashtables.c:723 int __inet_hash_connect( ... for (i = 0; i < remaining; i += 2, port += 2) { if (unlikely(port >= high)) port -= remaining; if (inet_is_local_reserved_port(net, port)) continue; E.g. if there is reserved block of 10000 ports, two ports right after this block will be 5000 more likely selected than others. If this was intended, we can/should add note into documentation as proposed in this commit, otherwise we should think about different solution. One option could be mapping table of continuous port ranges. Second option could be letting user to modify step (port+=2) in above loop, e.g. using new sysctl parameter. Signed-off-by: Otto Hollmann <otto.hollmann@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ethtool: support FEC settings over netlinkJakub Kicinski2021-03-311-2/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add FEC API to netlink. This is not a 1-to-1 conversion. FEC settings already depend on link modes to tell user which modes are supported. Take this further an use link modes for manual configuration. Old struct ethtool_fecparam is still used to talk to the drivers, so we need to translate back and forth. We can revisit the internal API if number of FEC encodings starts to grow. Enforce only one active FEC bit (by using a bit position rather than another mask). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add sysctl for enabling RFC 8335 PROBE messagesAndreas Roeseler2021-03-301-0/+6
| | | | | | | | | | | | | | | | | | | | Section 8 of RFC 8335 specifies potential security concerns of responding to PROBE requests, and states that nodes that support PROBE functionality MUST be able to enable/disable responses and that responses MUST be disabled by default Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Documentation: net: Document resilient next-hop groupsPetr Machata2021-03-292-0/+294
| | | | | | | | | | | | | | | | | | | | | | Add a document describing the principles behind resilient next-hop groups, and some notes about how to configure and offload them. Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | docs: nf_flowtable: fix compilation and warningsPablo Neira Ayuso2021-03-261-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | ... cannot be used in block quote, it breaks compilation, remove it. Fix warnings due to missing blank line such as: net-next/Documentation/networking/nf_flowtable.rst:142: WARNING: Block quote ends without a blank line; unexpected unindent. Fixes: 143490cde566 ("docs: nf_flowtable: update documentation with enhancements") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2021-03-254-5/+5
|\| | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>