| Commit message (Collapse) | Author | Files | Lines |
|
When the 'cachestat()' system call was added in commit cf264e1329fb
("cachestat: implement cachestat syscall"), it was meant to be a much
more convenient (and performant) version of mincore() that didn't need
mapping things into the user virtual address space in order to work.
But it ended up missing the "check for writability or ownership" fix for
mincore(), done in commit 134fca9063ad ("mm/mincore.c: make mincore()
more conservative").
This just adds equivalent logic to 'cachestat()', modified for the file
context (rather than vma).
Reported-by: Sudheendra Raghav Neela <sneela@tugraz.at>
Fixes: cf264e1329fb ("cachestat: implement cachestat syscall")
Tested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Old email still works, and will indefinitely, but I'm switching to a new
one.
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
A logic inversion in rseq_reset_rseq_cpu_node_id() causes the rseq
unregistration to fail when rseq_validate_ro_fields() succeeds rather
than the opposite.
This affects both CONFIG_DEBUG_RSEQ=y and CONFIG_DEBUG_RSEQ=n.
Fixes: 7d5265ffcd8b ("rseq: Validate read-only fields under DEBUG_RSEQ config")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250116205956.836074-1-mathieu.desnoyers@efficios.com
|
|
When enabling a pass-through port an interrupt might come before psmouse
driver binds to the pass-through port. However synaptics sub-driver
tries to access psmouse instance presumably associated with the
pass-through port to figure out if only 1 byte of response or entire
protocol packet needs to be forwarded to the pass-through port and may
crash if psmouse instance has not been attached to the port yet.
Fix the crash by introducing open() and close() methods for the port and
check if the port is open before trying to access psmouse instance.
Because psmouse calls serio_open() only after attaching psmouse instance
to serio port instance this prevents the potential crash.
Reported-by: Takashi Iwai <tiwai@suse.de>
Fixes: 100e16959c3c ("Input: libps2 - attach ps2dev instances as serio port's drvdata")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1219522
Cc: stable@vger.kernel.org
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/Z4qSHORvPn7EU2j1@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Microsoft defined Meta+Shift+F23 as the Copilot shortcut instead of a
dedicated keycode, and multiple vendors have their keyboards emit this
sequence in response to users pressing a dedicated "Copilot" key.
Unfortunately the default keymap table in atkbd does not map scancode
0x6e (F23) and so the key combination does not work even if userspace
is ready to handle it.
Because this behavior is common between multiple vendors and the
scancode is currently unused map 0x6e to keycode 193 (KEY_F23) so that
key sequence is generated properly.
MS documentation for the scan code:
https://learn.microsoft.com/en-us/windows/win32/inputdev/about-keyboard-input#scan-codes
Confirmed on Lenovo, HP and Dell machines by Canonical.
Tested on Lenovo T14s G6 AMD.
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20250107034554.25843-1-mpearson-lenovo@squebb.ca
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
In case of possible unpredictably large arguments passed to
rose_setsockopt() and multiplied by extra values on top of that,
integer overflows may occur.
Do the safest minimum and fix these issues by checking the
contents of 'opt' and returning -EINVAL if they are too large. Also,
switch to unsigned int and remove useless check for negative 'opt'
in ROSE_IDLE case.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://patch.msgid.link/20250115164220.19954-1-n.zhandarovich@fintech.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some PHYs don't support clause 45 access, and return -EOPNOTSUPP from
phy_modify_mmd(), which causes phylink_bringup_phy() to fail. Prevent
this failure by allowing -EOPNOTSUPP to also mean success.
Reported-by: Jiawen Wu <jiawenwu@trustnetic.com>
Tested-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/E1tZp1a-001V62-DT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce am65_cpsw_create_txqs() and am65_cpsw_destroy_txqs()
and use them.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250117-am65-cpsw-streamline-v2-3-91a29c97e569@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce am65_cpsw_create_rxqs() and am65_cpsw_destroy_rxqs()
and use them.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250117-am65-cpsw-streamline-v2-2-91a29c97e569@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We are missing netif_napi_del() and am65_cpsw_nuss_free_tx/rx_chns()
in error path when am65_cpsw_nuss_init_tx/rx_chns() is used anywhere
other than at probe(). i.e. am65_cpsw_nuss_update_tx_rx_chns and
am65_cpsw_nuss_resume()
As reported, in am65_cpsw_nuss_update_tx_rx_chns(),
if am65_cpsw_nuss_init_tx_chns() partially fails then
devm_add_action(dev, am65_cpsw_nuss_free_tx_chns,..) is added
but the cleanup via am65_cpsw_nuss_free_tx_chns() will not run.
Same issue exists for am65_cpsw_nuss_init_tx/rx_chns() failures
in am65_cpsw_nuss_resume() as well.
This would otherwise require more instances of devm_add/remove_action
and is clearly more of a distraction than any benefit.
So, drop devm_add/remove_action for am65_cpsw_nuss_free_tx/rx_chns()
and call am65_cpsw_nuss_free_tx/rx_chns() and netif_napi_del()
where required.
Reported-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Closes: https://lore.kernel.org/all/m4rhkzcr7dlylxr54udyt6lal5s2q4krrvmyay6gzgzhcu4q2c@r34snfumzqxy/
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250117-am65-cpsw-streamline-v2-1-91a29c97e569@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let's register inet6_rtm_deladdr() with RTNL_FLAG_DOIT_PERNET and
hold rtnl_net_lock() before inet6_addr_del().
Now that inet6_addr_del() is always called under per-netns RTNL.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-12-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let's register inet6_rtm_newaddr() with RTNL_FLAG_DOIT_PERNET
and hold rtnl_net_lock() before __dev_get_by_index().
Now that inet6_addr_add() and inet6_addr_modify() are always
called under per-netns RTNL.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-11-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet6_addr_add() and inet6_addr_modify() have the same code to validate
IPv6 lifetime that is done under RTNL.
Let's factorise it out to inet6_rtm_newaddr() so that we can validate
the lifetime without RTNL later.
Note that inet6_addr_add() is called from addrconf_add_ifaddr(), but the
lifetime is INFINITY_LIFE_TIME in the path, so expires and flags are 0.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-10-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will convert inet6_rtm_newaddr() to per-netns RTNL.
Except for IFA_F_OPTIMISTIC, cfg.ifa_flags can be set before
__dev_get_by_index().
Let's move ifa_flags setup before __dev_get_by_index() so that
we can set ifa_flags without RTNL.
Also, now it's moved before tb[IFA_CACHEINFO] in preparing for
the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet6_addr_add() is called from inet6_rtm_newaddr() and
addrconf_add_ifaddr().
inet6_addr_add() looks up dev by __dev_get_by_index(), but
it's already done in inet6_rtm_newaddr().
Let's move the 2nd lookup to addrconf_add_ifaddr() and pass
dev to inet6_addr_add().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These functions are called from inet6_ioctl() with a socket's netns
and hold RTNL.
* SIOCSIFADDR : addrconf_add_ifaddr()
* SIOCDIFADDR : addrconf_del_ifaddr()
* SIOCSIFDSTADDR : addrconf_set_dstaddr()
Let's use rtnl_net_lock().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
addrconf_init() holds RTNL for blackhole_netdev, which is the global
device in init_net.
addrconf_cleanup() holds RTNL to clean up devices in init_net too.
Let's use rtnl_net_lock(&init_net) there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
addrconf_dad_work() is per-address work and holds RTNL internally.
We can fetch netns as dev_net(ifp->idev->dev).
Let's use rtnl_net_lock().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
addrconf_verify_work() is per-netns work to call addrconf_verify_rtnl()
under RTNL.
Let's use rtnl_net_lock().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
net.ipv6.conf.${DEV}.XXX sysctl are changed under RTNL:
* forwarding
* ignore_routes_with_linkdown
* disable_ipv6
* proxy_ndp
* addr_gen_mode
* stable_secret
* disable_policy
Let's use rtnl_net_lock() there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will convert rtnl_lock() with rtnl_net_lock(), and we want to
convert __in6_dev_get() too.
__in6_dev_get() uses rcu_dereference_rtnl(), but as written in its
comment, rtnl_dereference() or rcu_dereference() is preferable.
Let's add __in6_dev_get_rtnl_net() that uses rtnl_net_dereference().
We can add the RCU version helper later if needed.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115080608.28127-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit df542f669307 ("net: stmmac: Switch to zero-copy in
non-XDP RX path"), SKBs are always marked for recycle, it is redundant
to mark SKBs more than once when new frags are appended.
Signed-off-by: Furong Xu <0x1207@gmail.com>
Link: https://patch.msgid.link/20250117062805.192393-1-0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Two different models of usb card, the drivers are r8152 and asix. If no
network cable is connected, Speed = 10Mb/s. This problem is repeated in
linux 3.10, 4.19, 5.4, 6.12. This problem also exists on the latest
kernel. Both drivers call mii_ethtool_get_link_ksettings,
but the value of cmd->base.speed in this
function can only be SPEED_1000 or SPEED_100 or SPEED_10.
When the network cable is not connected, set cmd->base.speed
=SPEED_UNKNOWN.
Signed-off-by: Xiangqian Zhang <zhangxiangqian@kylinos.cn>
Link: https://patch.msgid.link/20250117094603.4192594-1-zhangxiangqian@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since dccp and llc makefiles already check sysctl code
compilation with xxx-$(CONFIG_SYSCTL)
we can drop the checks
Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
Link: https://patch.msgid.link/20250119134254.19250-1-kirjanov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
300-400B RPC requests are fairly common. With the current default
of 256B HDS threshold bnxt ends up splitting those, lowering PCIe
bandwidth efficiency and increasing the number of memory allocation.
Increase the HDS threshold to fit 4 buffers in a 4k page.
This works out to 640B as the threshold on a typical kernel confing.
This change increases the performance for a microbenchmark which
receives 400B RPCs and sends empty responses by 4.5%.
Admittedly this is just a single benchmark, but 256B works out to
just 6 (so 2 more) packets per head page, because shinfo size
dominates the headers.
Now that we use page pool for the header pages I was also tempted
to default rx_copybreak to 0, but in synthetic testing the copybreak
size doesn't seem to make much difference.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that we can configure HDS threshold separately from the rx_copybreak
HDS threshold may be higher than rx_copybreak.
We need to make sure that we have enough space for the headers.
Fixes: 6b43673a25c3 ("bnxt_en: add support for hds-thresh ethtool command")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The core has the current HDS config, it can pre-populate the values
for the drivers. While at it, remove the zero-setting in netdevsim.
Zero are the default values since the config is zalloc'ed.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the pending config for hds_thrs. Core will only update the "current"
one after we return success. Without this change 2 reconfigs would be
required for the setting to reach the device.
Fixes: 6b43673a25c3 ("bnxt_en: add support for hds-thresh ethtool command")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Record the pending configuration in net_device struct.
ethtool core duplicates the current config and the specific
handlers (for now just ringparam) can modify it.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For ease of review of the next patch store the dev pointer
on the stack, instead of referring to req_info.dev every time.
No functional changes.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Separate the HDS config from the ethtool state struct.
The HDS config contains just simple parameters, not state.
Having it as a separate struct will make it easier to clone / copy
and also long term potentially make it per-queue.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is based on Donald Hunter's patch.
These functions could fail for various reasons, sometimes
triggering kfree_skb().
* unix_stream_connect() : connect()
* unix_stream_sendmsg() : sendmsg()
* queue_oob() : sendmsg(MSG_OOB)
* unix_dgram_sendmsg() : sendmsg()
Such kfree_skb() is tied to the errno of connect() and
sendmsg(), and we need not define skb drop reasons.
Let's use consume_skb() not to churn kfree_skb() events.
Link: https://lore.kernel.org/netdev/eb30b164-7f86-46bf-a5d3-0f8bda5e9398@redhat.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-10-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a follow-up of commit d460b04bc452 ("af_unix: Clean up
error paths in unix_stream_sendmsg().").
If we initialise skb with NULL in unix_stream_sendmsg(), we can
reuse the existing out_pipe label for the SEND_SHUTDOWN check.
Let's rename it and adjust the existing label as out_pipe_lock.
While at it, size and data_len are moved to the while loop scope.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
unix_dgram_disconnected() is called from two places:
1. when a connect()ed socket dis-connect()s or re-connect()s to
another socket
2. when sendmsg() fails because the peer socket that the client
has connect()ed to has been close()d
Then, the client's recv queue is purged to remove all messages from
the old peer socket.
Let's define a new drop reason for that case.
# echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable
# python3
>>> from socket import *
>>>
>>> # s1 has a message from s2
>>> s1, s2 = socketpair(AF_UNIX, SOCK_DGRAM)
>>> s2.send(b'hello world')
>>>
>>> # re-connect() drops the message from s2
>>> s3 = socket(AF_UNIX, SOCK_DGRAM)
>>> s3.bind('')
>>> s1.connect(s3.getsockname())
# cat /sys/kernel/tracing/trace_pipe
python3-250 ... kfree_skb: ... location=skb_queue_purge_reason+0xdc/0x110 reason: UNIX_DISCONNECT
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
unix_stream_read_skb() is called when BPF SOCKMAP reads some data
from a socket in the map.
SOCKMAP does not support MSG_OOB, and reading OOB results in a drop.
Let's set drop reasons respectively.
* SOCKET_CLOSE : the socket in SOCKMAP was close()d
* UNIX_SKIP_OOB : OOB was read from the socket in SOCKMAP
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
AF_UNIX SOCK_STREAM socket supports MSG_OOB.
When OOB data is sent to a socket, recv() will break at that point.
If the next recv() does not have MSG_OOB, the normal data following
the OOB data is returned.
Then, the OOB skb is dropped.
Let's define a new drop reason for that case in manage_oob().
# echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable
# python3
>>> from socket import *
>>> s1, s2 = socketpair(AF_UNIX)
>>> s1.send(b'a', MSG_OOB)
>>> s1.send(b'b')
>>> s2.recv(2)
b'b'
# cat /sys/kernel/tracing/trace_pipe
...
python3-223 ... kfree_skb: ... location=unix_stream_read_generic+0x59e/0xc20 reason: UNIX_SKIP_OOB
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Inflight file descriptors by SCM_RIGHTS hold references to the
struct file.
AF_UNIX sockets could hold references to each other, forming
reference cycles.
Once such sockets are close()d without the fd recv()ed, they
will be unaccessible from userspace but remain in kernel.
__unix_gc() garbage-collects skb with the dead file descriptors
and frees them by __skb_queue_purge().
Let's set SKB_DROP_REASON_SOCKET_CLOSE there.
# echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable
# python3
>>> from socket import *
>>> from array import array
>>>
>>> # Create a reference cycle
>>> s1 = socket(AF_UNIX, SOCK_DGRAM)
>>> s1.bind('')
>>> s1.sendmsg([b"nop"], [(SOL_SOCKET, SCM_RIGHTS, array("i", [s1.fileno()]))], 0, s1.getsockname())
>>> s1.close()
>>>
>>> # Trigger GC
>>> s2 = socket(AF_UNIX)
>>> s2.close()
# cat /sys/kernel/tracing/trace_pipe
...
kworker/u16:2-42 ... kfree_skb: ... location=__unix_gc+0x4ad/0x580 reason: SOCKET_CLOSE
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
unix_sock_destructor() is called as sk->sk_destruct() just before
the socket is actually freed.
Let's use SKB_DROP_REASON_SOCKET_CLOSE for skb_queue_purge().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
unix_release_sock() is called when the last refcnt of struct file
is released.
Let's define a new drop reason SKB_DROP_REASON_SOCKET_CLOSE and
set it for kfree_skb() in unix_release_sock().
# echo 1 > /sys/kernel/tracing/events/skb/kfree_skb/enable
# python3
>>> from socket import *
>>> s1, s2 = socketpair(AF_UNIX)
>>> s1.send(b'hello world')
>>> s2.close()
# cat /sys/kernel/tracing/trace_pipe
...
python3-280 ... kfree_skb: ... protocol=0 location=unix_release_sock+0x260/0x420 reason: SOCKET_CLOSE
To be precise, unix_release_sock() is also called for a new child
socket in unix_stream_connect() when something fails, but the new
sk does not have skb in the recv queue then and no event is logged.
Note that only tcp_inbound_ao_hash() uses a similar drop reason,
SKB_DROP_REASON_TCP_CLOSE, and this can be generalised later.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following patch adds a new drop reason starting with
the SOCKET_ prefix.
Let's gather the existing SOCKET_ reasons.
Note that the order is not part of uAPI.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250116053441.5758-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Address Null pointer dereference / undefined behavior in rtattr_pack
(note that size is 0 in the bad case).
Flagged by cppcheck as:
tools/testing/selftests/net/ipsec.c:230:25: warning: Possible null pointer
dereference: payload [nullPointer]
memcpy(RTA_DATA(attr), payload, size);
^
tools/testing/selftests/net/ipsec.c:1618:54: note: Calling function 'rtattr_pack',
4th argument 'NULL' value is 0
if (rtattr_pack(&req.nh, sizeof(req), XFRMA_IF_ID, NULL, 0)) {
^
tools/testing/selftests/net/ipsec.c:230:25: note: Null pointer dereference
memcpy(RTA_DATA(attr), payload, size);
^
Signed-off-by: Liu Ye <liuye@kylinos.cn>
Link: https://patch.msgid.link/20250116013037.29470-1-liuye@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This was a suggestion by David Laight, and while I was slightly worried
that some micro-architecture would predict cmov like a conditional
branch, there is little reason to actually believe any core would be
that broken.
Intel documents that their existing cores treat CMOVcc as a data
dependency that will constrain speculation in their "Speculative
Execution Side Channel Mitigations" whitepaper:
"Other instructions such as CMOVcc, AND, ADC, SBB and SETcc can also
be used to prevent bounds check bypass by constraining speculative
execution on current family 6 processors (Intel® Core™, Intel® Atom™,
Intel® Xeon® and Intel® Xeon Phi™ processors)"
and while that leaves the future uarch issues open, that's certainly
true of our traditional SBB usage too.
Any core that predicts CMOV will be unusable for various crypto
algorithms that need data-independent timing stability, so let's just
treat CMOV as the safe choice that simplifies the address masking by
avoiding an extra instruction and doesn't need a temporary register.
Suggested-by: David Laight <David.Laight@aculab.com>
Link: https://www.intel.com/content/dam/develop/external/us/en/documents/336996-speculative-execution-side-channel-mitigations.pdf
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Back when we added SMAP support, all versions of binutils didn't
necessarily understand the 'clac' and 'stac' instructions. So we
implemented those instructions manually as ".byte" sequences.
But we've since upgraded the minimum version of binutils to version
2.25, and that included proper support for the SMAP instructions, and
there's no reason for us to use some line noise to express them any
more.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
HWMON support has been added for the RTL8221/8251 PHYs integrated together
with the MAC inside the RTL8125/8126 chips. This patch extends temperature
reading support for standalone variants of the mentioned PHYs.
I don't know whether the earlier revisions of the RTL8226 also have a
built-in temperature sensor, so they have been skipped for now.
Tested on RTL8221B-VB-CG.
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I noticed that HyStart incorrectly marks the start of rounds,
leading to inaccurate measurements of ACK train lengths and
resetting the `ca->sample_cnt` variable. This inaccuracy can impact
HyStart's functionality in terminating exponential cwnd growth during
Slow-Start, potentially degrading TCP performance.
The issue arises because the changes introduced in commit 4e1fddc98d25
("tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows")
moved the caller of the `bictcp_hystart_reset` function inside the `hystart_update` function.
This modification added an additional condition for triggering the caller,
requiring that (tcp_snd_cwnd(tp) >= hystart_low_window) must also
be satisfied before invoking `bictcp_hystart_reset`.
This fix ensures that `bictcp_hystart_reset` is correctly called
at the start of a new round, regardless of the congestion window size.
This is achieved by moving the condition
(tcp_snd_cwnd(tp) >= hystart_low_window)
from before calling `bictcp_hystart_reset` to after it.
I tested with a client and a server connected through two Linux software routers.
In this setup, the minimum RTT was 150 ms, the bottleneck bandwidth was 50 Mbps,
and the bottleneck buffer size was 1 BDP, calculated as (50M / 1514 / 8) * 0.150 = 619 packets.
I conducted the test twice, transferring data from the server to the client for 1.5 seconds.
Before the patch was applied, HYSTART-DELAY stopped the exponential growth of cwnd when
cwnd = 516, and the bottleneck link was not yet saturated (516 < 619).
After the patch was applied, HYSTART-ACK-TRAIN stopped the exponential growth of cwnd when
cwnd = 632, and the bottleneck link was saturated (632 > 619).
In this test, applying the patch resulted in 300 KB more data delivered.
Fixes: 4e1fddc98d25 ("tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows")
Signed-off-by: Mahdi Arghavani <ma.arghavani@yahoo.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haibo Zhang <haibo.zhang@otago.ac.nz>
Cc: David Eyers <david.eyers@otago.ac.nz>
Cc: Abbas Arghavani <abbas.arghavani@mdu.se>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change resolves warning produced by sparse tool as currently
there is a mismatch between normal generic type in salt and endian
annotated type in macsec driver code. Endian annotated types should
be used here.
Sparse output:
warning: restricted ssci_t degrades to integer
warning: incorrect type in assignment (different base types)
expected restricted ssci_t [usertype] ssci
got unsigned int
warning: restricted __be64 degrades to integer
warning: incorrect type in assignment (different base types)
expected restricted __be64 [usertype] pn
got unsigned long long
Signed-off-by: Ales Nezbeda <anezbeda@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On a 32bit system the "keylen + sizeof(struct tipc_aead_key)" math could
have an integer wrapping issue. It doesn't matter because the "keylen"
is checked on the next line, but just to make life easier for static
analysis tools, let's re-order these conditions and avoid the integer
overflow.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Background: https://lore.kernel.org/r/20250107123615.161095-1-ericwouds@gmail.com
Since adding negotiation of in-band capabilities, it is no longer
sufficient to just look at the MLO_AN_xxx mode and PHY interface to
decide whether to do a major configuration, since the result now
depends on the capabilities of the attaching PHY.
Always trigger a major configuration in this case.
Testing log: https://lore.kernel.org/r/f20c9744-3953-40e7-a9c9-5534b25d2e2a@gmail.com
Reported-by: Eric Woudstra <ericwouds@gmail.com>
Tested-by: Eric Woudstra <ericwouds@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix build warnings reported from linux-next.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20250120192504.4a1965a0@canb.auug.org.au
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Share some infrastructure between sample programs and fix a build
failure that was reported.
Reported-by: Sasha Levin <sashal@kernel.org>
Link: https://lore.kernel.org/r/Z42UkSXx0MS9qZ9w@lappy
Link: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-511-g109a8e0fa9d6/testrun/26809210/suite/build/test/gcc-8-allyesconfig/log
Signed-off-by: Christian Brauner <brauner@kernel.org>
|