| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
same time
'struct evsel' uses a union for the two lists. This turned out to be
error prone.
For example:
[root@quaco ~]# perf stat --bpf-prog 5246
Error: cpu-clock event does not have sample flags 66c660
failed to set filter "(null)" on event cpu-clock with 2 (No such file or directory)
[root@quaco ~]# perf stat --bpf-prog 5246
Fix this issue by separating the two lists.
Fixes: 56ec9457a4a20c5e ("perf bpf filter: Implement event sample filtering")
Signed-off-by: Song Liu <song@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kernel-team@meta.com
Link: https://lore.kernel.org/r/20230519235757.3613719-1-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Picking the changes from:
3632679d9e4f879f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
That includes a define (IP_PROTOCOL) that isn't being used in generating
any id -> string table used by 'perf trace'.
Addresses this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/lkml/ZHD/Ms0DMq7viaq+@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Copy the kernel version of the header to fix the header diff build
warning. Some new definitions were only added to the tools side header,
but these are only used in Perf so move them to a different header.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20230522102604.1081416-1-james.clark@arm.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: acme@kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: siyanteng@loongson.cn
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-strip is not really required. Remove this dependency to make it
easier to build perf with BUILD_BPF_SKEL=1.
Committer notes:
This removes the need for the 'llvm' package just to get llvm-strip.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
demangle-cxx.cpp requires a C++ compiler, but feature checks may fail
because of the absence of this. Add a CONFIG_CXX_DEMANGLE so that the
source isn't built if not supported. Copy libbfd and cplus demangle
variants to a weak symbol-elf.c version so they aren't dependent on
C++. These variants are only built with the build option
BUILD_NONDISTRO=1.
Committer note:
This also handles this build break when a C++ compiler isn't available:
CXX /tmp/build/perf/util/demangle-cxx.o
/bin/sh: g++: command not found
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230417192546.99923-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change "../cs-etm.h" to just "../../../util/cs-etm.h" as ../cs-etm.h
doesn't exist.
Suggested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230515165039.544045-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BPF CO-RE requires 3 underscores for the ignored suffix rule but it
mistakenly used only 2. Let's fix it.
Fixes: 3a8b8fc3174891c4 ("perf bpf filter: Support pre-5.16 kernels where 'mem_hops' isn't in 'union perf_mem_data_src'")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230525000307.3202449-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The option name should not have the dashes. Current version shows four
dashes for the option.
$ perf ftrace latency -h
Usage: perf ftrace [<options>] [<command>]
or: perf ftrace [<options>] -- [<command>] [<options>]
or: perf ftrace {trace|latency} [<options>] [<command>]
or: perf ftrace {trace|latency} [<options>] -- [<command>] [<options>]
-b, --use-bpf Use BPF to measure function latency
-n, ----use-nsec Use nano-second histogram
-T, --trace-funcs <func>
Show latency of given function
Fixes: 84005bb6148618cc ("perf ftrace latency: Add -n/--use-nsec option")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230525212038.3535851-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull smb directory moves and client fixes from Steve French:
"Four smb3 client fixes (three of which marked for stable) and three
patches to move of fs/cifs and fs/ksmbd to a new common "fs/smb"
parent directory
- Move the client and server source directories to a common parent
directory:
fs/cifs -> fs/smb/client
fs/ksmbd -> fs/smb/server
fs/smbfs_common -> fs/smb/common
- important readahead fix
- important fix for SMB1 regression
- fix for missing mount option ("mapchars") in mount API conversion
- minor debugging improvement"
* tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: move Documentation/filesystems/cifs to Documentation/filesystems/smb
cifs: correct references in Documentation to old fs/cifs path
smb: move client and server files to common directory fs/smb
cifs: mapchars mount option ignored
smb3: display debug information better for encryption
cifs: fix smb1 mount regression
cifs: Fix cifs_limit_bvec_subset() to correctly check the maxmimum size
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Documentation/filesystems/cifs contains both server and client information
so its pathname is misleading. In addition, the directory fs/smb
now contains both server and client, so move Documentation/filesystems/cifs
to Documentation/filesystems/smb
Suggested-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The fs/cifs directory has moved to fs/smb/client, correct mentions
of this in Documentation and comments.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko
and helper modules) to new fs/smb subdirectory:
fs/cifs --> fs/smb/client
fs/ksmbd --> fs/smb/server
fs/smbfs_common --> fs/smb/common
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are two ways that special characters (not allowed in some
other operating systems like Windows, but allowed in POSIX) have
been mapped in the past ("SFU" and "SFM" mappings) to allow them
to be stored in a range reserved for special chars. The default
for Linux has been to use "mapposix" (ie the SFM mapping) but
the conversion to the new mount API in the 5.11 kernel broke
the ability to override the default mapping of the reserved
characters (like '?' and '*' and '\') via "mapchars" mount option.
This patch fixes that - so can now mount with "mapchars"
mount option to override the default ("mapposix" ie SFM) mapping.
Reported-by: Tyler Spivey <tspivey8@gmail.com>
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api")
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix /proc/fs/cifs/DebugData to use the same case for "encryption"
(ie "Encryption" with init capital letter was used in one place).
In addition, if gcm256 encryption (intead of gcm128) is used on
a connection to a server, note that in the DebugData as well.
It now displays (when gcm256 negotiated):
Security type: RawNTLMSSP SessionId: 0x86125800bc000b0d encrypted(gcm256)
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
cifs.ko maps NT_STATUS_NOT_FOUND to -EIO when SMB1 servers couldn't
resolve referral paths. Proceed to tree connect when we get -EIO from
dfs_get_referral() as well.
Reported-by: Kris Karas (Bug Reporting) <bugs-a21@moonlit-rail.com>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Fixes: 8e3554150d6c ("cifs: fix sharing of DFS connections")
Cc: stable@vger.kernel.org # v6.2+
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix cifs_limit_bvec_subset() so that it limits the span to the maximum
specified and won't return with a size greater than max_size.
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Cc: stable@vger.kernel.org # 6.3
Reported-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <smfrench@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
"Quite a bunch of real bugfixes in here and most of them are tagged for
backporting: A fix for cache flushing from irq context, a kprobes &
kgdb breakpoint handling fix, and a fix in the alternative code
patching function to take care of CPU hotplugging.
parisc now provides LOCKDEP support and comes with a lightweight
spinlock check. Both features helped me to find the cache flush bug.
Additionally writing the AGP gatt has been fixed, the machine allows
the user to reboot after a system halt and arch_sync_dma_for_cpu() has
been optimized for PCXL PCUs.
Summary:
- Fix flush_dcache_page() for usage from irq context
- Handle kprobes breakpoints only in kernel context
- Handle kgdb breakpoints only in kernel context
- Use num_present_cpus() in alternative patching code
- Enable LOCKDEP support
- Add lightweight spinlock checks
- Flush AGP gatt writes and adjust gatt mask in parisc_agp_mask_memory()
- Allow to reboot machine after system halt
- Improve cache flushing for PCXL in arch_sync_dma_for_cpu()"
* tag 'parisc-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix flush_dcache_page() for usage from irq context
parisc: Handle kgdb breakpoints only in kernel context
parisc: Handle kprobes breakpoints only in kernel context
parisc: Allow to reboot machine after system halt
parisc: Enable LOCKDEP support
parisc: Add lightweight spinlock checks
parisc: Use num_present_cpus() in alternative patching code
parisc: Flush gatt writes and adjust gatt mask in parisc_agp_mask_memory()
parisc: Improve cache flushing for PCXL in arch_sync_dma_for_cpu()
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Since at least kernel 6.1, flush_dcache_page() is called with IRQs
disabled, e.g. from aio_complete().
But the current implementation for flush_dcache_page() on parisc
unintentionally re-enables IRQs, which may lead to deadlocks.
Fix it by using xa_lock_irqsave() and xa_unlock_irqrestore()
for the flush_dcache_mmap_*lock() macros instead.
Cc: linux-parisc@vger.kernel.org
Cc: stable@kernel.org # 5.18+
Signed-off-by: Helge Deller <deller@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The kernel kgdb break instructions should only be handled when running
in kernel context.
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: Helge Deller <deller@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The kernel kprobes break instructions should only be handled when running
in kernel context.
Cc: <stable@vger.kernel.org> # v5.18+
Signed-off-by: Helge Deller <deller@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In case a machine can't power-off itself on system shutdown,
allow the user to reboot it by pressing the RETURN key.
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Helge Deller <deller@gmx.de>
|
| | |
| | |
| | |
| | |
| | | |
Cc: <stable@vger.kernel.org> # v6.0+
Signed-off-by: Helge Deller <deller@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a lightweight spinlock check which uses only two instructions
per spinlock call. It detects if a spinlock has been trashed by
some memory corruption and then halts the kernel. It will not detect
uninitialized spinlocks, for which CONFIG_DEBUG_SPINLOCK needs to
be enabled.
This lightweight spinlock check shouldn't influence runtime, so it's
safe to enable it by default.
The __ARCH_SPIN_LOCK_UNLOCKED_VAL constant has been choosen small enough
to be able to be loaded by one LDI assembler statement.
Signed-off-by: Helge Deller <deller@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When patching the kernel code some alternatives depend on SMP vs. !SMP.
Use the value of num_present_cpus() instead of num_online_cpus() to
decide, otherwise we may run into issues if and additional CPU is
enabled after having loaded a module while only one CPU was enabled.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.1+
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Flush caches after changing gatt entries and calculate entry according
to SBA requirements.
Signed-off-by: Helge Deller <deller@gmx.de>
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add comment in arch_sync_dma_for_device() and handle the direction flag in
arch_sync_dma_for_cpu().
When receiving data from the device (DMA_FROM_DEVICE) unconditionally
purge the data cache in arch_sync_dma_for_cpu().
Signed-off-by: Helge Deller <deller@gmx.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It turns out that udev under certain circumstances will concurrently try
to load the same modules over-and-over excessively. This isn't a kernel
bug, but it ends up affecting the kernel, to the point that under
certain circumstances we can fail to boot, because the kernel uses a lot
of memory to read all the module data all at once.
Note that it isn't a memory leak, it's just basically a thundering herd
problem happening at bootup with a lot of CPUs, with the worst cases
then being pretty bad.
Admittedly the worst situations are somewhat contrived: lots and lots of
CPUs, not a lot of memory, and KASAN enabled to make it all slower and
as such (unintentionally) exacerbate the problem.
Luis explains: [1]
"My best assessment of the situation is that each CPU in udev ends up
triggering a load of duplicate set of modules, not just one, but *a
lot*. Not sure what heuristics udev uses to load a set of modules per
CPU."
Petr Pavlu chimes in: [2]
"My understanding is that udev workers are forked. An initial kmod
context is created by the main udevd process but no sharing happens
after the fork. It means that the mentioned memory pool logic doesn't
really kick in.
Multiple parallel load requests come from multiple udev workers, for
instance, each handling an udev event for one CPU device and making
the exactly same requests as all others are doing at the same time.
The optimization idea would be to recognize these duplicate requests
at the udevd/kmod level and converge them"
Note that module loading has tried to mitigate this issue before, see
for example commit 064f4536d139 ("module: avoid allocation if module is
already present and ready"), which has a few ASCII graphs on memory use
due to this same issue.
However, while that noticed that the module was already loaded, and
exited with an error early before spending any more time on setting up
the module, it didn't handle the case of multiple concurrent module
loads all being active - but not complete - at the same time.
Yes, one of them will eventually win the race and finalize its copy, and
the others will then notice that the module already exists and error
out, but while this all happens, we have tons of unnecessary concurrent
work being done.
Again, the real fix is for udev to not do that (maybe it should use
threads instead of fork, and have actual shared data structures and not
cause duplicate work). That real fix is apparently not trivial.
But it turns out that the kernel already has a pretty good model for
dealing with concurrent access to the same file: the i_writecount of the
inode.
In fact, the module loading already indirectly uses 'i_writecount' ,
because 'kernel_file_read()' will in fact do
ret = deny_write_access(file);
if (ret)
return ret;
...
allow_write_access(file);
around the read of the file data. We do not allow concurrent writes to
the file, and return -ETXTBUSY if the file was open for writing at the
same time as the module data is loaded from it.
And the solution to the reader concurrency problem is to simply extend
this "no concurrent writers" logic to simply be "exclusive access".
Note that "exclusive" in this context isn't really some absolute thing:
it's only exclusion from writers and from other "special readers" that
do this writer denial. So we simply introduce a variation of that
"deny_write_access()" logic that not only denies write access, but also
requires that this is the _only_ such access that denies write access.
Which means that you can't start loading a module that is already being
loaded as a module by somebody else, or you will get the same -ETXTBSY
error that you would get if there were writers around.
[ It also means that you can't try to load a currently executing
executable as a module, for the same reason: executables do that same
"deny_write_access()" thing, and that's obviously where the whole
ETXTBSY logic traditionally came from.
This is not a problem for kernel modules, since the set of normal
executable files and kernel module files is entirely disjoint. ]
This new function is called "exclusive_deny_write_access()", and the
implementation is trivial, in that it's just an atomic decrement of
i_writecount if it was 0 before.
To use that new exclusivity check, all we then do is wrap the module
loading with that exclusive_deny_write_access()() / allow_write_access()
pair. The actual patch is a bit bigger than that, because we want to
surround not just the "load file data" part, but the whole module setup,
to get maximum exclusion.
So this ends up splitting up "finit_module()" into a few helper
functions to make it all very clear and legible.
In Luis' test-case (bringing up 255 vcpu's in a virtual machine [3]),
the "wasted vmalloc" space (ie module data read into a vmalloc'ed area
in order to be loaded as a module, but then discarded because somebody
else loaded the same module instead) dropped from 1.8GiB to 474kB. Yes,
that's gigabytes to kilobytes.
It doesn't drop completely to zero, because even with this change, you
can still end up having completely serial pointless module loads, where
one udev process has loaded a module fully (and thus the kernel has
released that exclusive lock on the module file), and then another udev
process tries to load the same module again.
So while we cannot fully get rid of the fundamental bug in user space,
we _can_ get rid of the excessive concurrent thundering herd effect.
A couple of final side notes on this all:
- This tweak only affects the "finit_module()" system call, which gives
the kernel a file descriptor with the module data.
You can also just feed the module data as raw data from user space
with "init_module()" (note the lack of 'f' at the beginning), and
obviously for that case we do _not_ have any "exclusive read" logic.
So if you absolutely want to do things wrong in user space, and try
to load the same module multiple times, and error out only later when
the kernel ends up saying "you can't load the same module name
twice", you can still do that.
And in fact, some distros will do exactly that, because they will
uncompress the kernel module data in user space before feeding it to
the kernel (mainly because they haven't started using the new kernel
side decompression yet).
So this is not some absolute "you can't do concurrent loads of the
same module". It's literally just a very simple heuristic that will
catch it early in case you try to load the exact same module file at
the same time, and in that case avoid a potentially nasty situation.
- There is another user of "deny_write_access()": the verity code that
enables fs-verity on a file (the FS_IOC_ENABLE_VERITY ioctl).
If you use fs-verity and you care about verifying the kernel modules
(which does make sense), you should do it *before* loading said
kernel module. That may sound obvious, but now the implementation
basically requires it. Because if you try to do it concurrently, the
kernel may refuse to load the module file that is being set up by the
fs-verity code.
- This all will obviously mean that if you insist on loading the same
module in parallel, only one module load will succeed, and the others
will return with an error.
That was true before too, but what is different is that the -ETXTBSY
error can be returned *before* the success case of another process
fully loading and instantiating the module.
Again, that might sound obvious, and it is indeed the whole point of
the whole change: we are much quicker to notice the whole "you're
already in the process of loading this module".
So it's very much intentional, but it does mean that if you just
spray the kernel with "finit_module()", and expect that the module is
immediately loaded afterwards without checking the return value, you
are doing something horribly horribly wrong.
I'd like to say that that would never happen, but the whole _reason_
for this commit is that udev is currently doing something horribly
horribly wrong, so ...
Link: https://lore.kernel.org/all/ZEGopJ8VAYnE7LQ2@bombadil.infradead.org/ [1]
Link: https://lore.kernel.org/all/23bd0ce6-ef78-1cd8-1f21-0e706a00424a@suse.com/ [2]
Link: https://lore.kernel.org/lkml/ZG%2Fa+nrt4%2FAAUi5z@bombadil.infradead.org/ [3]
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- During the acl rework we merged this cycle the generic_listxattr()
helper had to be modified in a way that in principle it would allow
for POSIX ACLs to be reported. At least that was the impression we
had initially. Because before the acl rework POSIX ACLs would be
reported if the filesystem did have POSIX ACL xattr handlers in
sb->s_xattr. That logic changed and now we can simply check whether
the superblock has SB_POSIXACL set and if the inode has
inode->i_{default_}acl set report the appropriate POSIX ACL name.
However, we didn't realize that generic_listxattr() was only ever
used by two filesystems. Both of them don't support POSIX ACLs via
sb->s_xattr handlers and so never reported POSIX ACLs via
generic_listxattr() even if they raised SB_POSIXACL and did contain
inodes which had acls set. The example here is nfs4.
As a result, generic_listxattr() suddenly started reporting POSIX
ACLs when it wouldn't have before. Since SB_POSIXACL implies that the
umask isn't stripped in the VFS nfs4 can't just drop SB_POSIXACL from
the superblock as it would also alter umask handling for them.
So just have generic_listxattr() not report POSIX ACLs as it never
did anyway. It's documented as such.
- Our SB_* flags currently use a signed integer and we shift the last
bit causing UBSAN to complain about undefined behavior. Switch to
using unsigned. While the original patch used an explicit unsigned
bitshift it's now pretty common to rely on the BIT() macro in a lot
of headers nowadays. So the patch has been adjusted to use that.
- Add Namjae as ntfs reviewer. They're already active this cycle so
let's make it explicit right now.
* tag 'vfs/v6.4-rc3/misc.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ntfs: Add myself as a reviewer
fs: don't call posix_acl_listxattr in generic_listxattr
fs: fix undefined behavior in bit shift for SB_NOUSER
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
I'm volunteering to help review patches for current unmaintained
ntfs filesytem.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Bagas Sanjaya <bagasdotme@gmail.com>
Message-Id: <20230517070739.6505-1-linkinjeon@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit f2620f166e2a caused the kernel to start emitting POSIX ACL xattrs
for NFSv4 inodes, which it doesn't support. The only other user of
generic_listxattr is HFS (classic) and it doesn't support POSIX ACLs
either.
Fixes: f2620f166e2a xattr: simplify listxattr helpers
Reported-by: Ondrej Valousek <ondrej.valousek.xm@renesas.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Message-Id: <20230516124655.82283-1-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Shifting signed 32-bit value by 31 bits is undefined, so changing
significant bit to unsigned. It was spotted by UBSAN.
So let's just fix this by using the BIT() helper for all SB_* flags.
Fixes: e462ec50cb5f ("VFS: Differentiate mount flags (MS_*) from internal superblock flags")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Message-Id: <20230424051835.374204-1-gehao@kylinos.cn>
[brauner@kernel.org: use BIT() for all SB_* flags]
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and bpf.
Current release - regressions:
- net: fix skb leak in __skb_tstamp_tx()
- eth: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
Current release - new code bugs:
- handshake:
- fix sock->file allocation
- fix handshake_dup() ref counting
- bluetooth:
- fix potential double free caused by hci_conn_unlink
- fix UAF in hci_conn_hash_flush
Previous releases - regressions:
- core: fix stack overflow when LRO is disabled for virtual
interfaces
- tls: fix strparser rx issues
- bpf:
- fix many sockmap/TCP related issues
- fix a memory leak in the LRU and LRU_PERCPU hash maps
- init the offload table earlier
- eth: mlx5e:
- do as little as possible in napi poll when budget is 0
- fix using eswitch mapping in nic mode
- fix deadlock in tc route query code
Previous releases - always broken:
- udplite: fix NULL pointer dereference in __sk_mem_raise_allocated()
- raw: fix output xfrm lookup wrt protocol
- smc: reset connection when trying to use SMCRv2 fails
- phy: mscc: enable VSC8501/2 RGMII RX clock
- eth: octeontx2-pf: fix TSOv6 offload
- eth: cdc_ncm: deal with too low values of dwNtbOutMaxSize"
* tag 'net-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
net: phy: mscc: enable VSC8501/2 RGMII RX clock
net: phy: mscc: remove unnecessary phydev locking
net: phy: mscc: add support for VSC8501
net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE
net/handshake: Enable the SNI extension to work properly
net/handshake: Unpin sock->file if a handshake is cancelled
net/handshake: handshake_genl_notify() shouldn't ignore @flags
net/handshake: Fix uninitialized local variable
net/handshake: Fix handshake_dup() ref counting
net/handshake: Remove unneeded check from handshake_dup()
ipv6: Fix out-of-bounds access in ipv6_find_tlv()
net: ethernet: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
docs: netdev: document the existence of the mail bot
net: fix skb leak in __skb_tstamp_tx()
r8169: Use a raw_spinlock_t for the register locks.
page_pool: fix inconsistency for page_pool_ring_[un]lock()
bpf, sockmap: Test progs verifier error with latest clang
bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops
bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
syzbot reported [0] a null-ptr-deref in sk_get_rmem0() while using
IPPROTO_UDPLITE (0x88):
14:25:52 executing program 1:
r0 = socket$inet6(0xa, 0x80002, 0x88)
We had a similar report [1] for probably sk_memory_allocated_add()
in __sk_mem_raise_allocated(), and commit c915fe13cbaa ("udplite: fix
NULL pointer dereference") fixed it by setting .memory_allocated for
udplite_prot and udplitev6_prot.
To fix the variant, we need to set either .sysctl_wmem_offset or
.sysctl_rmem.
Now UDP and UDPLITE share the same value for .memory_allocated, so we
use the same .sysctl_wmem_offset for UDP and UDPLITE.
[0]:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 6829 Comm: syz-executor.1 Not tainted 6.4.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/28/2023
RIP: 0010:sk_get_rmem0 include/net/sock.h:2907 [inline]
RIP: 0010:__sk_mem_raise_allocated+0x806/0x17a0 net/core/sock.c:3006
Code: c1 ea 03 80 3c 02 00 0f 85 23 0f 00 00 48 8b 44 24 08 48 8b 98 38 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <0f> b6 14 02 48 89 d8 83 e0 07 83 c0 03 38 d0 0f 8d 6f 0a 00 00 8b
RSP: 0018:ffffc90005d7f450 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90004d92000
RDX: 0000000000000000 RSI: ffffffff88066482 RDI: ffffffff8e2ccbb8
RBP: ffff8880173f7000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000030000
R13: 0000000000000001 R14: 0000000000000340 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880b9800000(0063) knlGS:00000000f7f1cb40
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 000000002e82f000 CR3: 0000000034ff0000 CR4: 00000000003506f0
Call Trace:
<TASK>
__sk_mem_schedule+0x6c/0xe0 net/core/sock.c:3077
udp_rmem_schedule net/ipv4/udp.c:1539 [inline]
__udp_enqueue_schedule_skb+0x776/0xb30 net/ipv4/udp.c:1581
__udpv6_queue_rcv_skb net/ipv6/udp.c:666 [inline]
udpv6_queue_rcv_one_skb+0xc39/0x16c0 net/ipv6/udp.c:775
udpv6_queue_rcv_skb+0x194/0xa10 net/ipv6/udp.c:793
__udp6_lib_mcast_deliver net/ipv6/udp.c:906 [inline]
__udp6_lib_rcv+0x1bda/0x2bd0 net/ipv6/udp.c:1013
ip6_protocol_deliver_rcu+0x2e7/0x1250 net/ipv6/ip6_input.c:437
ip6_input_finish+0x150/0x2f0 net/ipv6/ip6_input.c:482
NF_HOOK include/linux/netfilter.h:303 [inline]
NF_HOOK include/linux/netfilter.h:297 [inline]
ip6_input+0xa0/0xd0 net/ipv6/ip6_input.c:491
ip6_mc_input+0x40b/0xf50 net/ipv6/ip6_input.c:585
dst_input include/net/dst.h:468 [inline]
ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
NF_HOOK include/linux/netfilter.h:297 [inline]
ipv6_rcv+0x250/0x380 net/ipv6/ip6_input.c:309
__netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5491
__netif_receive_skb+0x1f/0x1c0 net/core/dev.c:5605
netif_receive_skb_internal net/core/dev.c:5691 [inline]
netif_receive_skb+0x133/0x7a0 net/core/dev.c:5750
tun_rx_batched+0x4b3/0x7a0 drivers/net/tun.c:1553
tun_get_user+0x2452/0x39c0 drivers/net/tun.c:1989
tun_chr_write_iter+0xdf/0x200 drivers/net/tun.c:2035
call_write_iter include/linux/fs.h:1868 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x945/0xd50 fs/read_write.c:584
ksys_write+0x12b/0x250 fs/read_write.c:637
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203
entry_SYSENTER_compat_after_hwframe+0x70/0x82
RIP: 0023:0xf7f21579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f7f1c590 EFLAGS: 00000282 ORIG_RAX: 0000000000000004
RAX: ffffffffffffffda RBX: 00000000000000c8 RCX: 0000000020000040
RDX: 0000000000000083 RSI: 00000000f734e000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in:
Link: https://lore.kernel.org/netdev/CANaxB-yCk8hhP68L4Q2nFOJht8sqgXGGQO2AftpHs0u1xyGG5A@mail.gmail.com/ [1]
Fixes: 850cbaddb52d ("udp: use it's own memory accounting schema")
Reported-by: syzbot+444ca0907e96f7c5e48b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=444ca0907e96f7c5e48b
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230523163305.66466-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
David Epping says:
====================
net: phy: mscc: support VSC8501
this updated series of patches adds support for the VSC8501 Ethernet
PHY and fixes support for the VSC8502 PHY in cases where no other
software (like U-Boot) has initialized the PHY after power up.
The first patch simply adds the VSC8502 to the MODULE_DEVICE_TABLE,
where I guess it was unintentionally missing. I have no hardware to
test my change.
The second patch adds the VSC8501 PHY with exactly the same driver
implementation as the existing VSC8502.
The (new) third patch removes phydev locking from
vsc85xx_rgmii_set_skews(), as discussed for v2 of the patch set.
The (now) fourth patch fixes the initialization for VSC8501 and VSC8502.
I have tested this patch with VSC8501 on hardware in RGMII mode only.
https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8501-03_Datasheet_60001741A.PDF
https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8502-03_Datasheet_60001742B.pdf
Table 4-42 "RGMII CONTROL, ADDRESS 20E2 (0X14)" Bit 11 for each of
them.
By default the RX_CLK is disabled for these PHYs. In cases where no
other software, like U-Boot, enabled the clock, this results in no
received packets being handed to the MAC.
The patch enables this clock output.
According to Microchip support (case number 01268776) this applies
to all modes (RGMII, GMII, and MII).
Other PHYs sharing the same register map and code, like
VSC8530/31/40/41 have the clock enabled and the relevant bit 11 is
reserved and read-only for them. As per previous discussion the
patch still clears the bit on these PHYs, too, possibly more easily
supporting other future PHYs implementing this functionality.
For the VSC8572 family of PHYs, having a different register map,
no such changes are applied.
====================
Link: https://lore.kernel.org/r/20230523153108.18548-1-david.epping@missinglinkelectronics.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
By default the VSC8501 and VSC8502 RGMII/GMII/MII RX_CLK output is
disabled. To allow packet forwarding towards the MAC it needs to be
enabled.
For other PHYs supported by this driver the clock output is enabled
by default.
Fixes: d3169863310d ("net: phy: mscc: add support for VSC8502")
Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Holding the struct phy_device (phydev) lock is unnecessary when
accessing phydev->interface in the PHY driver .config_init method,
which is the only place that vsc85xx_rgmii_set_skews() is called from.
The phy_modify_paged() function implements required MDIO bus level
locking, which can not be achieved by a phydev lock.
Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The VSC8501 PHY can use the same driver implementation as the VSC8502.
Adding the PHY ID and copying the handler functions of VSC8502 is
sufficient to operate it.
Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The mscc driver implements support for VSC8502, so its ID should be in
the MODULE_DEVICE_TABLE for automatic loading.
Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Fixes: d3169863310d ("net: phy: mscc: add support for VSC8502")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Chuck Lever says:
====================
Bug fixes for net/handshake
Paolo observed that there is a possible leak of sock->file. I
haven't looked into that yet, but it seems to be separate from
the fixes in this series, so no need to hold these up.
====================
The submissions mentions net-next but it means netdev (perhaps
merge window left over when trees are converged). In any case,
it should have gone into net, but was instead applied to net-next
as commit deb2e484baf9 ("Merge branch 'net-handshake-fixes'").
These are fixes tho, and Chuck needs them to make progress with
the client so double-merging them into net... it is what it is :(
Link: https://lore.kernel.org/r/168381978252.84244.1933636428135211300.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Enable the upper layer protocol to specify the SNI peername. This
avoids the need for tlshd to use a DNS lookup, which can return a
hostname that doesn't match the incoming certificate's SubjectName.
Fixes: 2fd5532044a8 ("net/handshake: Add a kernel API for requesting a TLSv1.3 handshake")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
If user space never calls DONE, sock->file's reference count remains
elevated. Enable sock->file to be freed eventually in this case.
Reported-by: Jakub Kacinski <kuba@kernel.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
trace_handshake_cmd_done_err() simply records the pointer in @req,
so initializing it to NULL is sufficient and safe.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
twice.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
handshake_req_submit() now verifies that the socket has a file.
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-05-24
We've added 19 non-merge commits during the last 10 day(s) which contain
a total of 20 files changed, 738 insertions(+), 448 deletions(-).
The main changes are:
1) Batch of BPF sockmap fixes found when running against NGINX TCP tests,
from John Fastabend.
2) Fix a memleak in the LRU{,_PERCPU} hash map when bucket locking fails,
from Anton Protopopov.
3) Init the BPF offload table earlier than just late_initcall,
from Jakub Kicinski.
4) Fix ctx access mask generation for 32-bit narrow loads of 64-bit fields,
from Will Deacon.
5) Remove a now unsupported __fallthrough in BPF samples,
from Andrii Nakryiko.
6) Fix a typo in pkg-config call for building sign-file,
from Jeremy Sowden.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: Test progs verifier error with latest clang
bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops
bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer
bpf, sockmap: Test shutdown() correctly exits epoll and recv()=0
bpf, sockmap: Build helper to create connected socket pair
bpf, sockmap: Pull socket helpers out of listen test for general use
bpf, sockmap: Incorrectly handling copied_seq
bpf, sockmap: Wake up polling after data copy
bpf, sockmap: TCP data stall on recv before accept
bpf, sockmap: Handle fin correctly
bpf, sockmap: Improved check for empty queue
bpf, sockmap: Reschedule is now done through backlog
bpf, sockmap: Convert schedule_work into delayed_work
bpf, sockmap: Pass skb ownership through read_skb
bpf: fix a memory leak in the LRU and LRU_PERCPU hash maps
bpf: Fix mask generation for 32-bit narrow loads of 64-bit fields
samples/bpf: Drop unnecessary fallthrough
bpf: netdev: init the offload table earlier
selftests/bpf: Fix pkg-config call building sign-file
====================
Link: https://lore.kernel.org/r/20230524170839.13905-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
With a relatively recent clang (7090c10273119) and with this commit
to fix warnings in selftests (c8ed668593972) that uses __sink(err)
to resolve unused variables. We get the following verifier error.
root@6e731a24b33a:/host/tools/testing/selftests/bpf# ./test_sockmap
libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied
libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; op = (int) skops->op;
0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; switch (op) {
1: (16) if w2 == 0x4 goto pc+5 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (56) if w2 != 0x5 goto pc+15 ; R2_w=5
; lport = skops->local_port;
3: (61) r2 = *(u32 *)(r1 +68) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; if (lport == 10000) {
4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
; __sink(err);
18: (bc) w1 = w0
R0 !read_ok
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
-- END PROG LOAD LOG --
libbpf: prog 'bpf_sockmap': failed to load: -13
libbpf: failed to load object 'test_sockmap_kern.bpf.o'
load_bpf_file: (-1) No such file or directory
ERROR: (-1) load bpf failed
libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied
libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; op = (int) skops->op;
0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; switch (op) {
1: (16) if w2 == 0x4 goto pc+5 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (56) if w2 != 0x5 goto pc+15 ; R2_w=5
; lport = skops->local_port;
3: (61) r2 = *(u32 *)(r1 +68) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; if (lport == 10000) {
4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
; __sink(err);
18: (bc) w1 = w0
R0 !read_ok
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
-- END PROG LOAD LOG --
libbpf: prog 'bpf_sockmap': failed to load: -13
libbpf: failed to load object 'test_sockhash_kern.bpf.o'
load_bpf_file: (-1) No such file or directory
ERROR: (-1) load bpf failed
libbpf: prog 'bpf_sockmap': BPF program load failed: Permission denied
libbpf: prog 'bpf_sockmap': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; op = (int) skops->op;
0: (61) r2 = *(u32 *)(r1 +0) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; switch (op) {
1: (16) if w2 == 0x4 goto pc+5 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (56) if w2 != 0x5 goto pc+15 ; R2_w=5
; lport = skops->local_port;
3: (61) r2 = *(u32 *)(r1 +68) ; R1=ctx(off=0,imm=0) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; if (lport == 10000) {
4: (56) if w2 != 0x2710 goto pc+13 18: R1=ctx(off=0,imm=0) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
; __sink(err);
18: (bc) w1 = w0
R0 !read_ok
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
-- END PROG LOAD LOG --
To fix simply remove the err value because its not actually used anywhere
in the testing. We can investigate the root cause later. Future patch should
probably actually test the err value as well. Although if the map updates
fail they will get caught eventually by userspace.
Fixes: c8ed668593972 ("selftests/bpf: fix lots of silly mistakes pointed out by compiler")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-15-john.fastabend@gmail.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When BPF program drops pkts the sockmap logic 'eats' the packet and
updates copied_seq. In the PASS case where the sk_buff is accepted
we update copied_seq from recvmsg path so we need a new test to
handle the drop case.
Original patch series broke this resulting in
test_sockmap_skb_verdict_fionread:PASS:ioctl(FIONREAD) error 0 nsec
test_sockmap_skb_verdict_fionread:FAIL:ioctl(FIONREAD) unexpected ioctl(FIONREAD): actual 1503041772 != expected 256
After updated patch with fix.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-14-john.fastabend@gmail.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
A bug was reported where ioctl(FIONREAD) returned zero even though the
socket with a SK_SKB verdict program attached had bytes in the msg
queue. The result is programs may hang or more likely try to recover,
but use suboptimal buffer sizes.
Add a test to check that ioctl(FIONREAD) returns the correct number of
bytes.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-13-john.fastabend@gmail.com
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When session gracefully shutdowns epoll needs to wake up and any recv()
readers should return 0 not the -EAGAIN they previously returned.
Note we use epoll instead of select to test the epoll wake on shutdown
event as well.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230523025618.113937-12-john.fastabend@gmail.com
|