| Commit message (Collapse) | Author | Files | Lines |
|
Currently dma_mapping_error returns a boolean as int, with 1 meaning
error. This is rather unusual and many callers have to convert it to
errno value. The callers are highly inconsistent with error codes
ranging from -ENOMEM over -EIO, -EINVAL and -EFAULT ranging to -EAGAIN.
Return -ENOMEM which seems to be what the largest number of callers
convert it to, and which also matches the typical error case where
we are out of resources.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
No users left except for vmd which just forwards it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pass the page + offset to the low-level __iommu_map_single helper
(which gets renamed to fit the new calling conventions) as both
callers have the page at hand.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Note that the existing code used AMD_IOMMU_MAPPING_ERROR to check from
a 0 return from the IOVA allocator, which is replaced with an explicit
0 as in the implementation and other users of that interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of the magic bad_dma_addr on a dma
mapping failure and let the core dma-mapping code handle the rest.
Remove the magic EMERGENCY_PAGES that the bad_dma_addr gets redirected to.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of the magic bad_dma_addr on a dma
mapping failure and let the core dma-mapping code handle the rest.
Remove the magic EMERGENCY_PAGES that the bad_dma_addr gets redirected to.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove the odd sba_{un,}map_single_attrs wrappers, check errors
everywhere.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Just return DMA_MAPPING_ERROR from __dummy_map_page and let the core
dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The SBA iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The CCIO iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Sparc already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
S390 already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The Jazz iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and
let the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The powerpc iommu code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Arm already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The dma-direct code already returns (~(dma_addr_t)0x0) on mapping
failures, so we can switch over to returning DMA_MAPPING_ERROR and let
the core dma-mapping code handle the rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Error handling of the dma_map_single and dma_map_page APIs is a little
problematic at the moment, in that we use different encodings in the
returned dma_addr_t to indicate an error. That means we require an
additional indirect call to figure out if a dma mapping call returned
an error, and a lot of boilerplate code to implement these semantics.
Instead return the maximum addressable value as the error. As long
as we don't allow mapping single-byte ranges with single-byte alignment
this value can never be a valid return. Additionaly if drivers do
not check the return value from the dma_map* routines this values means
they will generally not be pointed to actual memory.
Once the default value is added here we can start removing the
various mapping_error methods and just rely on this generic check.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit bfd56cd60521 ("dma-mapping: support highmem in the generic remap
allocator") replaced dma_direct_alloc_pages() with __dma_direct_alloc_pages(),
which doesn't set dma_handle and zero allocated memory. Fix it by doing this
directly in the caller function.
Fixes: bfd56cd60521 ("dma-mapping: support highmem in the generic remap allocator")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The csky code was largely copied from arm/arm64, so switch to the
generic arm64-based implementation instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Guo Ren <ren_guo@c-sky.com>
|
|
csky does not implement ZONE_DMA, which means passing GFP_DMA is a no-op.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Guo Ren <ren_guo@c-sky.com>
|
|
This option is gone past Linux 4.19.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Guo Ren <ren_guo@c-sky.com>
|
|
Do not waste vmalloc space on allocations that do not require a mapping
into the kernel address space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
By using __dma_direct_alloc_pages we can deal entirely with struct page
instead of having to derive a kernel virtual address.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
The arm64 codebase to implement coherent dma allocation for architectures
with non-coherent DMA is a good start for a generic implementation, given
that is uses the generic remap helpers, provides the atomic pool for
allocations that can't sleep and still is realtively simple and well
tested. Move it to kernel/dma and allow architectures to opt into it
using a config symbol. Architectures just need to provide a new
arch_dma_prep_coherent helper to writeback an invalidate the caches
for any memory that gets remapped for uncached access.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
The dma remap code only makes sense for not cache coherent architectures
(or possibly the corner case of highmem CMA allocations) and currently
is only used by arm, arm64, csky and xtensa. Split it out into a
separate file with a separate Kconfig symbol, which gets the right
copyright notice given that this code was written by Laura Abbott
working for Code Aurora at that point.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
dma_alloc_from_contiguous can return highmem pages depending on the
setup, which a plain non-remapping DMA allocator can't handle. Detect
this case and fail the allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
Some architectures support remapping highmem into DMA coherent
allocations. To use the common code for them we need variants of
dma_direct_{alloc,free}_pages that do not use kernel virtual addresses.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
|
|
The function dma_set_max_seg_size() can return either 0 on success or
-EIO on error. Change its return type from unsigned int to int to
capture this.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The numa_slit variable used by node_distance is available to a
module as long as it is linked at compile-time. However, it is
not available to loadable modules. Leading to errors such as:
ERROR: "numa_slit" [drivers/nvme/host/nvme-core.ko] undefined!
The error above is caused by the nvme multipath code that makes
use of node_distance for its path calculation. When the patch was
added, the lightnvm subsystem would select nvme and always compile
it in, leading to the node_distance call to always succeed.
However, when this requirement was removed, nvme could be compiled
in as a module, which exposed this bug.
This patch extracts node_distance to a function and exports it.
Since ACPI is depending on node_distance being a simple lookup to
numa_slit, the previous behavior is exposed as slit_distance and its
users updated.
Fixes: f333444708f82 "nvme: take node locality into account when selecting a path"
Fixes: 73569e11032f "lightnvm: remove dependencies on BLK_DEV_NVME and PCI"
Signed-off-by: Matias Bjøring <mb@lightnvm.io>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
I'm taking over the maintainance of Sparse so add myself as
maintainer and move Christopher's info to CREDITS.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The TX stats should be started with the tx_stats_syncp,
there seems to be a copy/paste error in the driver.
Signed-off-by: Andreas Fiedler <andreas.fiedler@gmx.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The vsc85xx_default_config function called in the vsc85xx_config_init
function which is used by VSC8530, VSC8531, VSC8540 and VSC8541 PHYs
mistakenly calls phy_read and phy_write in-between phy_select_page and
phy_restore_page.
phy_select_page and phy_restore_page actually take and release the MDIO
bus lock and phy_write and phy_read take and release the lock to write
or read to a PHY register.
Let's fix this deadlock by using phy_modify_paged which handles
correctly a read followed by a write in a non-standard page.
Fixes: 6a0bfbbe20b0 ("net: phy: mscc: migrate to phy_select/restore_page functions")
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The correct form is "can be probed", so fix the typo.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine
since it is used to check if tso dma descriptor queue has been previously
allocated. The issue can be triggered with the following reproducer:
$ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o
$ip link set dev enP2p1s0v0 xdpdrv off
[ 341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0
[ 341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[ 341.521874] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 341.526654] pc : __vunmap+0x98/0xe0
[ 341.530132] lr : __vunmap+0x98/0xe0
[ 341.533609] sp : ffff00001c5db860
[ 341.536913] x29: ffff00001c5db860 x28: 0000000000020000
[ 341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000
[ 341.547515] x25: 0000000000000000 x24: 00000000fbd00000
[ 341.552816] x23: 0000000000000000 x22: ffff810feb5090b0
[ 341.558117] x21: 0000000000000000 x20: 0000000000000000
[ 341.563418] x19: ffff000017e57000 x18: 0000000000000000
[ 341.568719] x17: 0000000000000000 x16: 0000000000000000
[ 341.574020] x15: 0000000000000010 x14: ffffffffffffffff
[ 341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f
[ 341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510
[ 341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8
[ 341.595224] x7 : 3430303030303030 x6 : 00000000000006ef
[ 341.600525] x5 : 00000000003fffff x4 : 0000000000000000
[ 341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff
[ 341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038
[ 341.616428] Call trace:
[ 341.618866] __vunmap+0x98/0xe0
[ 341.621997] vunmap+0x3c/0x50
[ 341.624961] arch_dma_free+0x68/0xa0
[ 341.628534] dma_direct_free+0x50/0x80
[ 341.632285] nicvf_free_resources+0x160/0x2d8 [nicvf]
[ 341.637327] nicvf_config_data_transfer+0x174/0x5e8 [nicvf]
[ 341.642890] nicvf_stop+0x298/0x340 [nicvf]
[ 341.647066] __dev_close_many+0x9c/0x108
[ 341.650977] dev_close_many+0xa4/0x158
[ 341.654720] rollback_registered_many+0x140/0x530
[ 341.659414] rollback_registered+0x54/0x80
[ 341.663499] unregister_netdevice_queue+0x9c/0xe8
[ 341.668192] unregister_netdev+0x28/0x38
[ 341.672106] nicvf_remove+0xa4/0xa8 [nicvf]
[ 341.676280] nicvf_shutdown+0x20/0x30 [nicvf]
[ 341.680630] pci_device_shutdown+0x44/0x88
[ 341.684720] device_shutdown+0x144/0x250
[ 341.688640] kernel_restart_prepare+0x44/0x50
[ 341.692986] kernel_restart+0x20/0x68
[ 341.696638] __se_sys_reboot+0x210/0x238
[ 341.700550] __arm64_sys_reboot+0x24/0x30
[ 341.704555] el0_svc_handler+0x94/0x110
[ 341.708382] el0_svc+0x8/0xc
[ 341.711252] ---[ end trace 3f4019c8439959c9 ]---
[ 341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4
[ 341.723872] flags: 0x1fffe000000000()
[ 341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000
[ 341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
[ 341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
where xdp_dummy.c is a simple bpf program that forwards the incoming
frames to the network stack (available here:
https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c)
Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
of_find_node_by_path() acquires a reference to the node
returned by it and that reference needs to be dropped by its caller.
This place doesn't do that, so fix it.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
team_notify_peers() will send ARP and NA to notify peers. team_mcast_rejoin()
will send multicast join group message to notify peers. We should do this when
enabling/changed to a new port. But it doesn't make sense to do it when a port
is disabled.
On the other hand, when we set mcast_rejoin_count to 2, and do a failover,
team_port_disable() will increase mcast_rejoin.count_pending to 2 and then
team_port_enable() will increase mcast_rejoin.count_pending to 4. We will send
4 mcast rejoin messages at latest, which will make user confused. The same
with notify_peers.count.
Fix it by deleting team_notify_peers() and team_mcast_rejoin() in
team_port_disable().
Reported-by: Liang Li <liali@redhat.com>
Fixes: fc423ff00df3a ("team: add peer notification")
Fixes: 492b200efdd20 ("team: add support for sending multicast rejoins")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't support partial csumed packet since its metadata will be lost
or incorrect during XDP processing. So fail the XDP set if guest_csum
feature is negotiated.
Fixes: f600b6905015 ("virtio_net: Add XDP support")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't disable VIRTIO_NET_F_GUEST_CSUM if XDP was set. This means we
can receive partial csumed packets with metadata kept in the
vnet_hdr. This may have several side effects:
- It could be overridden by header adjustment, thus is might be not
correct after XDP processing.
- There's no way to pass such metadata information through
XDP_REDIRECT to another driver.
- XDP does not support checksum offload right now.
So simply disable guest csum if possible in this the case of XDP.
Fixes: 3f93522ffab2d ("virtio-net: switch off offloads on demand if possible on XDP set")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pavel Popa <pashinho1990@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit f2cbd4852820 ("net/sched: act_police: fix race condition on state
variables") introduces a new spinlock, but forgets its initialization.
Ensure that tcf_police_init() initializes 'tcfp_lock' every time a 'police'
action is newly created, to avoid the following lockdep splat:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
<...>
Call Trace:
dump_stack+0x85/0xcb
register_lock_class+0x581/0x590
__lock_acquire+0xd4/0x1330
? tcf_police_init+0x2fa/0x650 [act_police]
? lock_acquire+0x9e/0x1a0
lock_acquire+0x9e/0x1a0
? tcf_police_init+0x2fa/0x650 [act_police]
? tcf_police_init+0x55a/0x650 [act_police]
_raw_spin_lock_bh+0x34/0x40
? tcf_police_init+0x2fa/0x650 [act_police]
tcf_police_init+0x2fa/0x650 [act_police]
tcf_action_init_1+0x384/0x4c0
tcf_action_init+0xf6/0x160
tcf_action_add+0x73/0x170
tc_ctl_action+0x122/0x160
rtnetlink_rcv_msg+0x2a4/0x490
? netlink_deliver_tap+0x99/0x400
? validate_linkmsg+0x370/0x370
netlink_rcv_skb+0x4d/0x130
netlink_unicast+0x196/0x230
netlink_sendmsg+0x2e5/0x3e0
sock_sendmsg+0x36/0x40
___sys_sendmsg+0x280/0x2f0
? _raw_spin_unlock+0x24/0x30
? handle_pte_fault+0xafe/0xf30
? find_held_lock+0x2d/0x90
? syscall_trace_enter+0x1df/0x360
? __sys_sendmsg+0x5e/0xa0
__sys_sendmsg+0x5e/0xa0
do_syscall_64+0x60/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1841c7cf10
Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
RSP: 002b:00007ffcf9df4d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1841c7cf10
RDX: 0000000000000000 RSI: 00007ffcf9df4dc0 RDI: 0000000000000003
RBP: 000000005bf56105 R08: 0000000000000002 R09: 00007ffcf9df8edc
R10: 00007ffcf9df47e0 R11: 0000000000000246 R12: 0000000000671be0
R13: 00007ffcf9df4e84 R14: 0000000000000008 R15: 0000000000000000
Fixes: f2cbd4852820 ("net/sched: act_police: fix race condition on state variables")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Eric noted that with UDP GRO and NAPI timeout, we could keep a single
UDP packet inside the GRO hash forever, if the related NAPI instance
calls napi_gro_complete() at an higher frequency than the NAPI timeout.
Willem noted that even TCP packets could be trapped there, till the
next retransmission.
This patch tries to address the issue, flushing the old packets -
those with a NAPI_GRO_CB age before the current jiffy - before scheduling
the NAPI timeout. The rationale is that such a timeout should be
well below a jiffy and we are not flushing packets eligible for sane GRO.
v1 -> v2:
- clarified the commit message and comment
RFC -> v1:
- added 'Fixes tags', cleaned-up the wording.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 3b47d30396ba ("net: gro: add a per device gro flush timer")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we add a new IPv6 address, we should also join corresponding solicited-node
multicast address, unless the interface has IFF_NOARP flag, as function
addrconf_join_solict() did. But if we remove IFF_NOARP flag later, we do
not do dad and add the mcast address. So we will drop corresponding neighbour
discovery message that came from other nodes.
A typical example is after creating a ipvlan with mode l3, setting up an ipv6
address and changing the mode to l2. Then we will not be able to ping this
address as the interface doesn't join related solicited-node mcast address.
Fix it by re-doing dad when interface changed IFF_NOARP flag. Then we will add
corresponding mcast group and check if there is a duplicate address on the
network.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tpacket_snd sends packets with user pages linked into skb frags. It
notifies that pages can be reused when the skb is released by setting
skb->destructor to tpacket_destruct_skb.
This can cause data corruption if the skb is orphaned (e.g., on
transmit through veth) or cloned (e.g., on mirror to another psock).
Create a kernel-private copy of data in these cases, same as tun/tap
zerocopy transmission. Reuse that infrastructure: mark the skb as
SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx).
Unlike other zerocopy packets, do not set shinfo destructor_arg to
struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify
when the original skb is released and a timestamp is recorded. Do not
change this timestamp behavior. The ubuf_info->callback is not needed
anyway, as no zerocopy notification is expected.
Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The
resulting value is not a valid ubuf_info pointer, nor a valid
tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.
The fix relies on features introduced in commit 52267790ef52 ("sock:
add MSG_ZEROCOPY"), so can be backported as is only to 4.14.
Tested with from `./in_netns.sh ./txring_overwrite` from
http://github.com/wdebruij/kerneltools/tests
Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When merging support for SSBD and the CRC32 instructions, the conflict
resolution for the new capability entries in arm64_features[]
inadvertedly predicated the availability of the CRC32 instructions on
CONFIG_ARM64_SSBD, despite the functionality being entirely unrelated.
Move the #ifdef CONFIG_ARM64_SSBD down so that it only covers the SSBD
capability.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|