diff options
author | Paolo Abeni <pabeni@redhat.com> | 2024-10-25 09:08:22 +0200 |
---|---|---|
committer | Paolo Abeni <pabeni@redhat.com> | 2024-10-25 09:08:22 +0200 |
commit | 03fc07a24735e0be8646563913abf5f5cb71ad19 (patch) | |
tree | 5fe666c4ed732b163324aa08e210a69b593e4e8c | |
parent | selftests: tls: add a selftest for wrapping rec_seq (diff) | |
parent | Merge tag 'net-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/net... (diff) | |
download | linux-03fc07a24735e0be8646563913abf5f5cb71ad19.tar.xz linux-03fc07a24735e0be8646563913abf5f5cb71ad19.zip |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts and no adjacent changes.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
498 files changed, 4773 insertions, 2561 deletions
@@ -73,6 +73,8 @@ Andrey Ryabinin <ryabinin.a.a@gmail.com> <aryabinin@virtuozzo.com> Andrzej Hajda <andrzej.hajda@intel.com> <a.hajda@samsung.com> André Almeida <andrealmeid@igalia.com> <andrealmeid@collabora.com> Andy Adamson <andros@citi.umich.edu> +Andy Chiu <andybnac@gmail.com> <andy.chiu@sifive.com> +Andy Chiu <andybnac@gmail.com> <taochiu@synology.com> Andy Shevchenko <andy@kernel.org> <andy@smile.org.ua> Andy Shevchenko <andy@kernel.org> <ext-andriy.shevchenko@nokia.com> Anilkumar Kolli <quic_akolli@quicinc.com> <akolli@codeaurora.org> @@ -304,6 +306,11 @@ Jens Axboe <axboe@kernel.dk> <axboe@fb.com> Jens Axboe <axboe@kernel.dk> <axboe@meta.com> Jens Osterkamp <Jens.Osterkamp@de.ibm.com> Jernej Skrabec <jernej.skrabec@gmail.com> <jernej.skrabec@siol.net> +Jesper Dangaard Brouer <hawk@kernel.org> <brouer@redhat.com> +Jesper Dangaard Brouer <hawk@kernel.org> <hawk@comx.dk> +Jesper Dangaard Brouer <hawk@kernel.org> <jbrouer@redhat.com> +Jesper Dangaard Brouer <hawk@kernel.org> <jdb@comx.dk> +Jesper Dangaard Brouer <hawk@kernel.org> <netoptimizer@brouer.com> Jessica Zhang <quic_jesszhan@quicinc.com> <jesszhan@codeaurora.org> Jilai Wang <quic_jilaiw@quicinc.com> <jilaiw@codeaurora.org> Jiri Kosina <jikos@kernel.org> <jikos@jikos.cz> diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst index f38e641df0e9..f93a467db628 100644 --- a/Documentation/admin-guide/LSM/ipe.rst +++ b/Documentation/admin-guide/LSM/ipe.rst @@ -223,7 +223,10 @@ are signed through the PKCS#7 message format to enforce some level of authorization of the policies (prohibiting an attacker from gaining unconstrained root, and deploying an "allow all" policy). These policies must be signed by a certificate that chains to the -``SYSTEM_TRUSTED_KEYRING``. With openssl, the policy can be signed by:: +``SYSTEM_TRUSTED_KEYRING``, or to the secondary and/or platform keyrings if +``CONFIG_IPE_POLICY_SIG_SECONDARY_KEYRING`` and/or +``CONFIG_IPE_POLICY_SIG_PLATFORM_KEYRING`` are enabled, respectively. +With openssl, the policy can be signed by:: openssl smime -sign \ -in "$MY_POLICY" \ @@ -266,7 +269,7 @@ in the kernel. This file is write-only and accepts a PKCS#7 signed policy. Two checks will always be performed on this policy: First, the ``policy_names`` must match with the updated version and the existing version. Second the updated policy must have a policy version greater than -or equal to the currently-running version. This is to prevent rollback attacks. +the currently-running version. This is to prevent rollback attacks. The ``delete`` file is used to remove a policy that is no longer needed. This file is write-only and accepts a value of ``1`` to delete the policy. diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst index bf28ac0401f3..7eb7c6023e09 100644 --- a/Documentation/core-api/protection-keys.rst +++ b/Documentation/core-api/protection-keys.rst @@ -12,7 +12,10 @@ Pkeys Userspace (PKU) is a feature which can be found on: * Intel server CPUs, Skylake and later * Intel client CPUs, Tiger Lake (11th Gen Core) and later * Future AMD CPUs + * arm64 CPUs implementing the Permission Overlay Extension (FEAT_S1POE) +x86_64 +====== Pkeys work by dedicating 4 previously Reserved bits in each page table entry to a "protection key", giving 16 possible keys. @@ -28,6 +31,22 @@ register. The feature is only available in 64-bit mode, even though there is theoretically space in the PAE PTEs. These permissions are enforced on data access only and have no effect on instruction fetches. +arm64 +===== + +Pkeys use 3 bits in each page table entry, to encode a "protection key index", +giving 8 possible keys. + +Protections for each key are defined with a per-CPU user-writable system +register (POR_EL0). This is a 64-bit register encoding read, write and execute +overlay permissions for each protection key index. + +Being a CPU register, POR_EL0 is inherently thread-local, potentially giving +each thread a different set of protections from every other thread. + +Unlike x86_64, the protection key permissions also apply to instruction +fetches. + Syscalls ======== @@ -38,11 +57,10 @@ There are 3 system calls which directly interact with pkeys:: int pkey_mprotect(unsigned long start, size_t len, unsigned long prot, int pkey); -Before a pkey can be used, it must first be allocated with -pkey_alloc(). An application calls the WRPKRU instruction -directly in order to change access permissions to memory covered -with a key. In this example WRPKRU is wrapped by a C function -called pkey_set(). +Before a pkey can be used, it must first be allocated with pkey_alloc(). An +application writes to the architecture specific CPU register directly in order +to change access permissions to memory covered with a key. In this example +this is wrapped by a C function called pkey_set(). :: int real_prot = PROT_READ|PROT_WRITE; @@ -64,9 +82,9 @@ is no longer in use:: munmap(ptr, PAGE_SIZE); pkey_free(pkey); -.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions. - An example implementation can be found in - tools/testing/selftests/x86/protection_keys.c. +.. note:: pkey_set() is a wrapper around writing to the CPU register. + Example implementations can be found in + tools/testing/selftests/mm/pkey-{arm64,powerpc,x86}.h Behavior ======== @@ -96,3 +114,7 @@ with a read():: The kernel will send a SIGSEGV in both cases, but si_code will be set to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when the plain mprotect() permissions are violated. + +Note that kernel accesses from a kthread (such as io_uring) will use a default +value for the protection key register and so will not be consistent with +userspace's value of the register or mprotect(). diff --git a/Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml b/Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml index b4400c52bec3..713f535bb33a 100644 --- a/Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml +++ b/Documentation/devicetree/bindings/iio/dac/adi,ad5686.yaml @@ -4,7 +4,7 @@ $id: http://devicetree.org/schemas/iio/dac/adi,ad5686.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: Analog Devices AD5360 and similar DACs +title: Analog Devices AD5360 and similar SPI DACs maintainers: - Michael Hennerich <michael.hennerich@analog.com> @@ -12,41 +12,22 @@ maintainers: properties: compatible: - oneOf: - - description: SPI devices - enum: - - adi,ad5310r - - adi,ad5672r - - adi,ad5674r - - adi,ad5676 - - adi,ad5676r - - adi,ad5679r - - adi,ad5681r - - adi,ad5682r - - adi,ad5683 - - adi,ad5683r - - adi,ad5684 - - adi,ad5684r - - adi,ad5685r - - adi,ad5686 - - adi,ad5686r - - description: I2C devices - enum: - - adi,ad5311r - - adi,ad5337r - - adi,ad5338r - - adi,ad5671r - - adi,ad5675r - - adi,ad5691r - - adi,ad5692r - - adi,ad5693 - - adi,ad5693r - - adi,ad5694 - - adi,ad5694r - - adi,ad5695r - - adi,ad5696 - - adi,ad5696r - + enum: + - adi,ad5310r + - adi,ad5672r + - adi,ad5674r + - adi,ad5676 + - adi,ad5676r + - adi,ad5679r + - adi,ad5681r + - adi,ad5682r + - adi,ad5683 + - adi,ad5683r + - adi,ad5684 + - adi,ad5684r + - adi,ad5685r + - adi,ad5686 + - adi,ad5686r reg: maxItems: 1 diff --git a/Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml b/Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml index 56b0cda0f30a..b5a88b03dc2f 100644 --- a/Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml +++ b/Documentation/devicetree/bindings/iio/dac/adi,ad5696.yaml @@ -4,7 +4,7 @@ $id: http://devicetree.org/schemas/iio/dac/adi,ad5696.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: Analog Devices AD5696 and similar multi-channel DACs +title: Analog Devices AD5696 and similar I2C multi-channel DACs maintainers: - Michael Auchter <michael.auchter@ni.com> @@ -16,6 +16,7 @@ properties: compatible: enum: - adi,ad5311r + - adi,ad5337r - adi,ad5338r - adi,ad5671r - adi,ad5675r diff --git a/Documentation/filesystems/iomap/operations.rst b/Documentation/filesystems/iomap/operations.rst index 8e6c721d2330..b93115ab8748 100644 --- a/Documentation/filesystems/iomap/operations.rst +++ b/Documentation/filesystems/iomap/operations.rst @@ -208,7 +208,7 @@ The filesystem must arrange to `cancel such `reservations <https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_ because writeback will not consume the reservation. -The ``iomap_file_buffered_write_punch_delalloc`` can be called from a +The ``iomap_write_delalloc_release`` can be called from a ``->iomap_end`` function to find all the clean areas of the folios caching a fresh (``IOMAP_F_NEW``) delalloc mapping. It takes the ``invalidate_lock``. diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst index f0d2cb257bb8..73f0bfd7e903 100644 --- a/Documentation/filesystems/netfs_library.rst +++ b/Documentation/filesystems/netfs_library.rst @@ -592,4 +592,3 @@ API Function Reference .. kernel-doc:: include/linux/netfs.h .. kernel-doc:: fs/netfs/buffered_read.c -.. kernel-doc:: fs/netfs/io.c diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst index 2365c9a3c1f0..ce3e98458339 100644 --- a/Documentation/mm/damon/maintainer-profile.rst +++ b/Documentation/mm/damon/maintainer-profile.rst @@ -7,26 +7,26 @@ The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR' section of 'MAINTAINERS' file. The mailing lists for the subsystem are damon@lists.linux.dev and -linux-mm@kvack.org. Patches should be made against the mm-unstable `tree -<https://git.kernel.org/akpm/mm/h/mm-unstable>` whenever possible and posted to -the mailing lists. +linux-mm@kvack.org. Patches should be made against the `mm-unstable tree +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted +to the mailing lists. SCM Trees --------- There are multiple Linux trees for DAMON development. Patches under development or testing are queued in `damon/next -<https://git.kernel.org/sj/h/damon/next>` by the DAMON maintainer. +<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer. Sufficiently reviewed patches will be queued in `mm-unstable -<https://git.kernel.org/akpm/mm/h/mm-unstable>` by the memory management +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management subsystem maintainer. After more sufficient tests, the patches will be queued -in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>` , and finally +in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally pull-requested to the mainline by the memory management subsystem maintainer. -Note again the patches for mm-unstable `tree -<https://git.kernel.org/akpm/mm/h/mm-unstable>` are queued by the memory +Note again the patches for `mm-unstable tree +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory management subsystem maintainer. If the patches requires some patches in -damon/next `tree <https://git.kernel.org/sj/h/damon/next>` which not yet merged +`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which not yet merged in mm-unstable, please make sure the requirement is clearly specified. Submit checklist addendum @@ -37,25 +37,25 @@ When making DAMON changes, you should do below. - Build changes related outputs including kernel and documents. - Ensure the builds introduce no new errors or warnings. - Run and ensure no new failures for DAMON `selftests - <https://github.com/awslabs/damon-tests/blob/master/corr/run.sh#L49>` and + <https://github.com/damonitor/damon-tests/blob/master/corr/run.sh#L49>`_ and `kunittests - <https://github.com/awslabs/damon-tests/blob/master/corr/tests/kunit.sh>`. + <https://github.com/damonitor/damon-tests/blob/master/corr/tests/kunit.sh>`_. Further doing below and putting the results will be helpful. - Run `damon-tests/corr - <https://github.com/awslabs/damon-tests/tree/master/corr>` for normal + <https://github.com/damonitor/damon-tests/tree/master/corr>`_ for normal changes. - Run `damon-tests/perf - <https://github.com/awslabs/damon-tests/tree/master/perf>` for performance + <https://github.com/damonitor/damon-tests/tree/master/perf>`_ for performance changes. Key cycle dates --------------- Patches can be sent anytime. Key cycle dates of the `mm-unstable -<https://git.kernel.org/akpm/mm/h/mm-unstable>` and `mm-stable -<https://git.kernel.org/akpm/mm/h/mm-stable>` trees depend on the memory +<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable +<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory management subsystem maintainer. Review cadence @@ -72,13 +72,13 @@ Mailing tool Like many other Linux kernel subsystems, DAMON uses the mailing lists (damon@lists.linux.dev and linux-mm@kvack.org) as the major communication channel. There is a simple tool called `HacKerMaiL -<https://github.com/damonitor/hackermail>` (``hkml``), which is for people who +<https://github.com/damonitor/hackermail>`_ (``hkml``), which is for people who are not very familiar with the mailing lists based communication. The tool could be particularly helpful for DAMON community members since it is developed and maintained by DAMON maintainer. The tool is also officially announced to support DAMON and general Linux kernel development workflow. -In other words, `hkml <https://github.com/damonitor/hackermail>` is a mailing +In other words, `hkml <https://github.com/damonitor/hackermail>`_ is a mailing tool for DAMON community, which DAMON maintainer is committed to support. Please feel free to try and report issues or feature requests for the tool to the maintainer. @@ -98,8 +98,8 @@ slots, and attendees should reserve one of those at least 24 hours before the time slot, by reaching out to the maintainer. Schedules and available reservation time slots are available at the Google `doc -<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`. +<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`_. There is also a public Google `calendar -<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>` +<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_ that has the events. Anyone can subscribe it. DAMON maintainer will also provide periodic reminder to the mailing list (damon@lists.linux.dev). diff --git a/Documentation/process/maintainer-soc.rst b/Documentation/process/maintainer-soc.rst index 12637530d68f..fe9d8bcfbd2b 100644 --- a/Documentation/process/maintainer-soc.rst +++ b/Documentation/process/maintainer-soc.rst @@ -30,10 +30,13 @@ tree as a dedicated branch covering multiple subsystems. The main SoC tree is housed on git.kernel.org: https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/ +Maintainers +----------- + Clearly this is quite a wide range of topics, which no one person, or even small group of people are capable of maintaining. Instead, the SoC subsystem -is comprised of many submaintainers, each taking care of individual platforms -and driver subdirectories. +is comprised of many submaintainers (platform maintainers), each taking care of +individual platforms and driver subdirectories. In this regard, "platform" usually refers to a series of SoCs from a given vendor, for example, Nvidia's series of Tegra SoCs. Many submaintainers operate on a vendor level, responsible for multiple product lines. For several reasons, @@ -43,14 +46,43 @@ MAINTAINERS file. Most of these submaintainers have their own trees where they stage patches, sending pull requests to the main SoC tree. These trees are usually, but not -always, listed in MAINTAINERS. The main SoC maintainers can be reached via the -alias soc@kernel.org if there is no platform-specific maintainer, or if they -are unresponsive. +always, listed in MAINTAINERS. What the SoC tree is not, however, is a location for architecture-specific code changes. Each architecture has its own maintainers that are responsible for architectural details, CPU errata and the like. +Submitting Patches for Given SoC +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +All typical platform related patches should be sent via SoC submaintainers +(platform-specific maintainers). This includes also changes to per-platform or +shared defconfigs (scripts/get_maintainer.pl might not provide correct +addresses in such case). + +Submitting Patches to the Main SoC Maintainers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The main SoC maintainers can be reached via the alias soc@kernel.org only in +following cases: + +1. There are no platform-specific maintainers. + +2. Platform-specific maintainers are unresponsive. + +3. Introducing a completely new SoC platform. Such new SoC work should be sent + first to common mailing lists, pointed out by scripts/get_maintainer.pl, for + community review. After positive community review, work should be sent to + soc@kernel.org in one patchset containing new arch/foo/Kconfig entry, DTS + files, MAINTAINERS file entry and optionally initial drivers with their + Devicetree bindings. The MAINTAINERS file entry should list new + platform-specific maintainers, who are going to be responsible for handling + patches for the platform from now on. + +Note that the soc@kernel.org is usually not the place to discuss the patches, +thus work sent to this address should be already considered as acceptable by +the community. + Information for (new) Submaintainers ------------------------------------ diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index e32471977d0a..edc070c6e19b 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -8098,13 +8098,15 @@ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS By default, KVM emulates MONITOR/MWAIT (if KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is disabled. -KVM_X86_QUIRK_SLOT_ZAP_ALL By default, KVM invalidates all SPTEs in - fast way for memslot deletion when VM type - is KVM_X86_DEFAULT_VM. - When this quirk is disabled or when VM type - is other than KVM_X86_DEFAULT_VM, KVM zaps - only leaf SPTEs that are within the range of - the memslot being deleted. +KVM_X86_QUIRK_SLOT_ZAP_ALL By default, for KVM_X86_DEFAULT_VM VMs, KVM + invalidates all SPTEs in all memslots and + address spaces when a memslot is deleted or + moved. When this quirk is disabled (or the + VM type isn't KVM_X86_DEFAULT_VM), KVM only + ensures the backing memory of the deleted + or moved memslot isn't reachable, i.e KVM + _may_ invalidate only SPTEs related to the + memslot. =================================== ============================================ 7.32 KVM_CAP_MAX_VCPU_ID diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst index 20a9a37d1cdd..1bedd56e2fe3 100644 --- a/Documentation/virt/kvm/locking.rst +++ b/Documentation/virt/kvm/locking.rst @@ -136,7 +136,7 @@ For direct sp, we can easily avoid it since the spte of direct sp is fixed to gfn. For indirect sp, we disabled fast page fault for simplicity. A solution for indirect sp could be to pin the gfn, for example via -kvm_vcpu_gfn_to_pfn_atomic, before the cmpxchg. After the pinning: +gfn_to_pfn_memslot_atomic, before the cmpxchg. After the pinning: - We have held the refcount of pfn; that means the pfn can not be freed and be reused for another gfn. diff --git a/MAINTAINERS b/MAINTAINERS index aed1fa42cfd2..f39ab140710f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -258,12 +258,6 @@ L: linux-acenic@sunsite.dk S: Maintained F: drivers/net/ethernet/alteon/acenic* -ACER ASPIRE 1 EMBEDDED CONTROLLER DRIVER -M: Nikita Travkin <nikita@trvn.ru> -S: Maintained -F: Documentation/devicetree/bindings/platform/acer,aspire1-ec.yaml -F: drivers/platform/arm64/acer-aspire1-ec.c - ACER ASPIRE ONE TEMPERATURE AND FAN DRIVER M: Peter Kaestle <peter@piie.net> L: platform-driver-x86@vger.kernel.org @@ -888,7 +882,6 @@ F: drivers/staging/media/sunxi/cedrus/ ALPHA PORT M: Richard Henderson <richard.henderson@linaro.org> -M: Ivan Kokshaysky <ink@jurassic.park.msu.ru> M: Matt Turner <mattst88@gmail.com> L: linux-alpha@vger.kernel.org S: Odd Fixes @@ -1761,8 +1754,8 @@ F: include/uapi/linux/if_arcnet.h ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS) M: Arnd Bergmann <arnd@arndb.de> M: Olof Johansson <olof@lixom.net> -M: soc@kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) +L: soc@lists.linux.dev S: Maintained P: Documentation/process/maintainer-soc.rst C: irc://irc.libera.chat/armlinux @@ -2263,12 +2256,6 @@ L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained F: arch/arm/mach-ep93xx/ts72xx.c -ARM/CIRRUS LOGIC CLPS711X ARM ARCHITECTURE -M: Alexander Shiyan <shc_work@mail.ru> -L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) -S: Odd Fixes -N: clps711x - ARM/CIRRUS LOGIC EP93XX ARM ARCHITECTURE M: Hartley Sweeten <hsweeten@visionengravers.com> M: Alexander Sverdlin <alexander.sverdlin@gmail.com> @@ -3815,14 +3802,6 @@ F: drivers/video/backlight/ F: include/linux/backlight.h F: include/linux/pwm_backlight.h -BAIKAL-T1 PVT HARDWARE MONITOR DRIVER -M: Serge Semin <fancer.lancer@gmail.com> -L: linux-hwmon@vger.kernel.org -S: Supported -F: Documentation/devicetree/bindings/hwmon/baikal,bt1-pvt.yaml -F: Documentation/hwmon/bt1-pvt.rst -F: drivers/hwmon/bt1-pvt.[ch] - BARCO P50 GPIO DRIVER M: Santosh Kumar Yadav <santoshkumar.yadav@barco.com> M: Peter Korsgaard <peter.korsgaard@barco.com> @@ -6476,7 +6455,6 @@ F: drivers/mtd/nand/raw/denali* DESIGNWARE EDMA CORE IP DRIVER M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> -R: Serge Semin <fancer.lancer@gmail.com> L: dmaengine@vger.kernel.org S: Maintained F: drivers/dma/dw-edma/ @@ -9759,14 +9737,6 @@ F: drivers/gpio/gpiolib-cdev.c F: include/uapi/linux/gpio.h F: tools/gpio/ -GRE DEMULTIPLEXER DRIVER -M: Dmitry Kozlov <xeb@mail.ru> -L: netdev@vger.kernel.org -S: Maintained -F: include/net/gre.h -F: net/ipv4/gre_demux.c -F: net/ipv4/gre_offload.c - GRETH 10/100/1G Ethernet MAC device driver M: Andreas Larsson <andreas@gaisler.com> L: netdev@vger.kernel.org @@ -11289,10 +11259,10 @@ F: security/integrity/ F: security/integrity/ima/ INTEGRITY POLICY ENFORCEMENT (IPE) -M: Fan Wu <wufan@linux.microsoft.com> +M: Fan Wu <wufan@kernel.org> L: linux-security-module@vger.kernel.org S: Supported -T: git https://github.com/microsoft/ipe.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe.git F: Documentation/admin-guide/LSM/ipe.rst F: Documentation/security/ipe.rst F: scripts/ipe/ @@ -11610,6 +11580,16 @@ F: drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c F: drivers/crypto/intel/keembay/ocs-hcu.c F: drivers/crypto/intel/keembay/ocs-hcu.h +INTEL LA JOLLA COVE ADAPTER (LJCA) USB I/O EXPANDER DRIVERS +M: Wentong Wu <wentong.wu@intel.com> +M: Sakari Ailus <sakari.ailus@linux.intel.com> +S: Maintained +F: drivers/gpio/gpio-ljca.c +F: drivers/i2c/busses/i2c-ljca.c +F: drivers/spi/spi-ljca.c +F: drivers/usb/misc/usb-ljca.c +F: include/linux/usb/ljca.h + INTEL MANAGEMENT ENGINE (mei) M: Tomas Winkler <tomas.winkler@intel.com> L: linux-kernel@vger.kernel.org @@ -12248,6 +12228,7 @@ R: Dmitry Vyukov <dvyukov@google.com> R: Vincenzo Frascino <vincenzo.frascino@arm.com> L: kasan-dev@googlegroups.com S: Maintained +B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management F: Documentation/dev-tools/kasan.rst F: arch/*/include/asm/*kasan.h F: arch/*/mm/kasan_init* @@ -12271,6 +12252,7 @@ R: Dmitry Vyukov <dvyukov@google.com> R: Andrey Konovalov <andreyknvl@gmail.com> L: kasan-dev@googlegroups.com S: Maintained +B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management F: Documentation/dev-tools/kcov.rst F: include/linux/kcov.h F: include/uapi/linux/kcov.h @@ -12953,12 +12935,6 @@ S: Maintained F: drivers/ata/pata_arasan_cf.c F: include/linux/pata_arasan_cf_data.h -LIBATA PATA DRIVERS -R: Sergey Shtylyov <s.shtylyov@omp.ru> -L: linux-ide@vger.kernel.org -F: drivers/ata/ata_*.c -F: drivers/ata/pata_*.c - LIBATA PATA FARADAY FTIDE010 AND GEMINI SATA BRIDGE DRIVERS M: Linus Walleij <linus.walleij@linaro.org> L: linux-ide@vger.kernel.org @@ -12975,14 +12951,6 @@ F: drivers/ata/ahci_platform.c F: drivers/ata/libahci_platform.c F: include/linux/ahci_platform.h -LIBATA SATA AHCI SYNOPSYS DWC CONTROLLER DRIVER -M: Serge Semin <fancer.lancer@gmail.com> -L: linux-ide@vger.kernel.org -S: Maintained -F: Documentation/devicetree/bindings/ata/baikal,bt1-ahci.yaml -F: Documentation/devicetree/bindings/ata/snps,dwc-ahci.yaml -F: drivers/ata/ahci_dwc.c - LIBATA SATA PROMISE TX2/TX4 CONTROLLER DRIVER M: Mikael Pettersson <mikpelinux@gmail.com> L: linux-ide@vger.kernel.org @@ -14178,16 +14146,6 @@ S: Maintained T: git git://linuxtv.org/media_tree.git F: drivers/media/platform/nxp/imx-pxp.[ch] -MEDIA DRIVERS FOR ASCOT2E -M: Sergey Kozlov <serjk@netup.ru> -M: Abylay Ospan <aospan@netup.ru> -L: linux-media@vger.kernel.org -S: Supported -W: https://linuxtv.org -W: http://netup.tv/ -T: git git://linuxtv.org/media_tree.git -F: drivers/media/dvb-frontends/ascot2e* - MEDIA DRIVERS FOR CXD2099AR CI CONTROLLERS M: Jasmin Jessich <jasmin@anw.at> L: linux-media@vger.kernel.org @@ -14196,16 +14154,6 @@ W: https://linuxtv.org T: git git://linuxtv.org/media_tree.git F: drivers/media/dvb-frontends/cxd2099* -MEDIA DRIVERS FOR CXD2841ER -M: Sergey Kozlov <serjk@netup.ru> -M: Abylay Ospan <aospan@netup.ru> -L: linux-media@vger.kernel.org -S: Supported -W: https://linuxtv.org -W: http://netup.tv/ -T: git git://linuxtv.org/media_tree.git -F: drivers/media/dvb-frontends/cxd2841er* - MEDIA DRIVERS FOR CXD2880 M: Yasunari Takiguchi <Yasunari.Takiguchi@sony.com> L: linux-media@vger.kernel.org @@ -14250,35 +14198,6 @@ F: drivers/media/platform/nxp/imx-mipi-csis.c F: drivers/media/platform/nxp/imx7-media-csi.c F: drivers/media/platform/nxp/imx8mq-mipi-csi2.c -MEDIA DRIVERS FOR HELENE -M: Abylay Ospan <aospan@netup.ru> -L: linux-media@vger.kernel.org -S: Supported -W: https://linuxtv.org -W: http://netup.tv/ -T: git git://linuxtv.org/media_tree.git -F: drivers/media/dvb-frontends/helene* - -MEDIA DRIVERS FOR HORUS3A -M: Sergey Kozlov <serjk@netup.ru> -M: Abylay Ospan <aospan@netup.ru> -L: linux-media@vger.kernel.org -S: Supported -W: https://linuxtv.org -W: http://netup.tv/ -T: git git://linuxtv.org/media_tree.git -F: drivers/media/dvb-frontends/horus3a* - -MEDIA DRIVERS FOR LNBH25 -M: Sergey Kozlov <serjk@netup.ru> -M: Abylay Ospan <aospan@netup.ru> -L: linux-media@vger.kernel.org -S: Supported -W: https://linuxtv.org -W: http://netup.tv/ -T: git git://linuxtv.org/media_tree.git -F: drivers/media/dvb-frontends/lnbh25* - MEDIA DRIVERS FOR MXL5XX TUNER DEMODULATORS L: linux-media@vger.kernel.org S: Orphan @@ -14286,16 +14205,6 @@ W: https://linuxtv.org T: git git://linuxtv.org/media_tree.git F: drivers/media/dvb-frontends/mxl5xx* -MEDIA DRIVERS FOR NETUP PCI UNIVERSAL DVB devices -M: Sergey Kozlov <serjk@netup.ru> -M: Abylay Ospan <aospan@netup.ru> -L: linux-media@vger.kernel.org -S: Supported -W: https://linuxtv.org -W: http://netup.tv/ -T: git git://linuxtv.org/media_tree.git -F: drivers/media/pci/netup_unidvb/* - MEDIA DRIVERS FOR NVIDIA TEGRA - VDE M: Dmitry Osipenko <digetx@gmail.com> L: linux-media@vger.kernel.org @@ -14913,9 +14822,10 @@ N: include/linux/page[-_]* MEMORY MAPPING M: Andrew Morton <akpm@linux-foundation.org> -R: Liam R. Howlett <Liam.Howlett@oracle.com> +M: Liam R. Howlett <Liam.Howlett@oracle.com> +M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> R: Vlastimil Babka <vbabka@suse.cz> -R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> +R: Jann Horn <jannh@google.com> L: linux-mm@kvack.org S: Maintained W: http://www.linux-mm.org @@ -14938,13 +14848,6 @@ F: drivers/mtd/ F: include/linux/mtd/ F: include/uapi/mtd/ -MEMSENSING MICROSYSTEMS MSA311 DRIVER -M: Dmitry Rokosov <ddrokosov@sberdevices.ru> -L: linux-iio@vger.kernel.org -S: Maintained -F: Documentation/devicetree/bindings/iio/accel/memsensing,msa311.yaml -F: drivers/iio/accel/msa311.c - MEN A21 WATCHDOG DRIVER M: Johannes Thumshirn <morbidrsa@gmail.com> L: linux-watchdog@vger.kernel.org @@ -15278,7 +15181,6 @@ F: drivers/tty/serial/8250/8250_pci1xxxx.c MICROCHIP POLARFIRE FPGA DRIVERS M: Conor Dooley <conor.dooley@microchip.com> -R: Vladimir Georgiev <v.georgiev@metrotek.ru> L: linux-fpga@vger.kernel.org S: Supported F: Documentation/devicetree/bindings/fpga/microchip,mpf-spi-fpga-mgr.yaml @@ -15533,17 +15435,6 @@ F: arch/mips/ F: drivers/platform/mips/ F: include/dt-bindings/mips/ -MIPS BAIKAL-T1 PLATFORM -M: Serge Semin <fancer.lancer@gmail.com> -L: linux-mips@vger.kernel.org -S: Supported -F: Documentation/devicetree/bindings/bus/baikal,bt1-*.yaml -F: Documentation/devicetree/bindings/clock/baikal,bt1-*.yaml -F: drivers/bus/bt1-*.c -F: drivers/clk/baikal-t1/ -F: drivers/memory/bt1-l2-ctl.c -F: drivers/mtd/maps/physmap-bt1-rom.[ch] - MIPS BOSTON DEVELOPMENT BOARD M: Paul Burton <paulburton@kernel.org> L: linux-mips@vger.kernel.org @@ -15556,7 +15447,6 @@ F: include/dt-bindings/clock/boston-clock.h MIPS CORE DRIVERS M: Thomas Bogendoerfer <tsbogend@alpha.franken.de> -M: Serge Semin <fancer.lancer@gmail.com> L: linux-mips@vger.kernel.org S: Supported F: drivers/bus/mips_cdmm.c @@ -16159,6 +16049,7 @@ M: "David S. Miller" <davem@davemloft.net> M: Eric Dumazet <edumazet@google.com> M: Jakub Kicinski <kuba@kernel.org> M: Paolo Abeni <pabeni@redhat.com> +R: Simon Horman <horms@kernel.org> L: netdev@vger.kernel.org S: Maintained P: Documentation/process/maintainer-netdev.rst @@ -16201,6 +16092,7 @@ F: include/uapi/linux/rtnetlink.h F: lib/net_utils.c F: lib/random32.c F: net/ +F: samples/pktgen/ F: tools/net/ F: tools/testing/selftests/net/ X: Documentation/networking/mac80211-injection.rst @@ -16525,12 +16417,6 @@ F: include/linux/ntb.h F: include/linux/ntb_transport.h F: tools/testing/selftests/ntb/ -NTB IDT DRIVER -M: Serge Semin <fancer.lancer@gmail.com> -L: ntb@lists.linux.dev -S: Supported -F: drivers/ntb/hw/idt/ - NTB INTEL DRIVER M: Dave Jiang <dave.jiang@intel.com> L: ntb@lists.linux.dev @@ -18552,13 +18438,6 @@ F: drivers/pps/ F: include/linux/pps*.h F: include/uapi/linux/pps.h -PPTP DRIVER -M: Dmitry Kozlov <xeb@mail.ru> -L: netdev@vger.kernel.org -S: Maintained -W: http://sourceforge.net/projects/accel-pptp -F: drivers/net/ppp/pptp.c - PRESSURE STALL INFORMATION (PSI) M: Johannes Weiner <hannes@cmpxchg.org> M: Suren Baghdasaryan <surenb@google.com> @@ -19539,6 +19418,14 @@ S: Maintained F: Documentation/tools/rtla/ F: tools/tracing/rtla/ +Real-time Linux (PREEMPT_RT) +M: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +M: Clark Williams <clrkwllms@kernel.org> +M: Steven Rostedt <rostedt@goodmis.org> +L: linux-rt-devel@lists.linux.dev +S: Supported +K: PREEMPT_RT + REALTEK AUDIO CODECS M: Oder Chiou <oder_chiou@realtek.com> S: Maintained @@ -19648,15 +19535,6 @@ S: Supported F: Documentation/devicetree/bindings/i2c/renesas,iic-emev2.yaml F: drivers/i2c/busses/i2c-emev2.c -RENESAS ETHERNET AVB DRIVER -R: Sergey Shtylyov <s.shtylyov@omp.ru> -L: netdev@vger.kernel.org -L: linux-renesas-soc@vger.kernel.org -F: Documentation/devicetree/bindings/net/renesas,etheravb.yaml -F: drivers/net/ethernet/renesas/Kconfig -F: drivers/net/ethernet/renesas/Makefile -F: drivers/net/ethernet/renesas/ravb* - RENESAS ETHERNET SWITCH DRIVER R: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> L: netdev@vger.kernel.org @@ -19706,14 +19584,6 @@ F: Documentation/devicetree/bindings/i2c/renesas,rmobile-iic.yaml F: drivers/i2c/busses/i2c-rcar.c F: drivers/i2c/busses/i2c-sh_mobile.c -RENESAS R-CAR SATA DRIVER -R: Sergey Shtylyov <s.shtylyov@omp.ru> -L: linux-ide@vger.kernel.org -L: linux-renesas-soc@vger.kernel.org -S: Supported -F: Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml -F: drivers/ata/sata_rcar.c - RENESAS R-CAR THERMAL DRIVERS M: Niklas Söderlund <niklas.soderlund@ragnatech.se> L: linux-renesas-soc@vger.kernel.org @@ -19789,16 +19659,6 @@ S: Supported F: Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml F: drivers/i2c/busses/i2c-rzv2m.c -RENESAS SUPERH ETHERNET DRIVER -R: Sergey Shtylyov <s.shtylyov@omp.ru> -L: netdev@vger.kernel.org -L: linux-renesas-soc@vger.kernel.org -F: Documentation/devicetree/bindings/net/renesas,ether.yaml -F: drivers/net/ethernet/renesas/Kconfig -F: drivers/net/ethernet/renesas/Makefile -F: drivers/net/ethernet/renesas/sh_eth* -F: include/linux/sh_eth.h - RENESAS USB PHY DRIVER M: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> L: linux-renesas-soc@vger.kernel.org @@ -21793,8 +21653,8 @@ F: drivers/accessibility/speakup/ SPEAR PLATFORM/CLOCK/PINCTRL SUPPORT M: Viresh Kumar <vireshk@kernel.org> M: Shiraz Hashim <shiraz.linux.kernel@gmail.com> -M: soc@kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) +L: soc@lists.linux.dev S: Maintained W: http://www.st.com/spear F: arch/arm/boot/dts/st/spear* @@ -22446,19 +22306,11 @@ F: drivers/tty/serial/8250/8250_lpss.c SYNOPSYS DESIGNWARE APB GPIO DRIVER M: Hoan Tran <hoan@os.amperecomputing.com> -M: Serge Semin <fancer.lancer@gmail.com> L: linux-gpio@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml F: drivers/gpio/gpio-dwapb.c -SYNOPSYS DESIGNWARE APB SSI DRIVER -M: Serge Semin <fancer.lancer@gmail.com> -L: linux-spi@vger.kernel.org -S: Supported -F: Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml -F: drivers/spi/spi-dw* - SYNOPSYS DESIGNWARE AXI DMAC DRIVER M: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> S: Maintained @@ -23768,12 +23620,6 @@ L: linux-input@vger.kernel.org S: Maintained F: drivers/hid/hid-udraw-ps3.c -UFS FILESYSTEM -M: Evgeniy Dushistov <dushistov@mail.ru> -S: Maintained -F: Documentation/admin-guide/ufs.rst -F: fs/ufs/ - UHID USERSPACE HID IO DRIVER M: David Rheinsberg <david@readahead.eu> L: linux-input@vger.kernel.org @@ -24076,6 +23922,7 @@ USB RAW GADGET DRIVER R: Andrey Konovalov <andreyknvl@gmail.com> L: linux-usb@vger.kernel.org S: Maintained +B: https://github.com/xairy/raw-gadget/issues F: Documentation/usb/raw-gadget.rst F: drivers/usb/gadget/legacy/raw_gadget.c F: include/uapi/linux/usb/raw_gadget.h @@ -24747,9 +24594,10 @@ F: tools/testing/vsock/ VMA M: Andrew Morton <akpm@linux-foundation.org> -R: Liam R. Howlett <Liam.Howlett@oracle.com> +M: Liam R. Howlett <Liam.Howlett@oracle.com> +M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> R: Vlastimil Babka <vbabka@suse.cz> -R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> +R: Jann Horn <jannh@google.com> L: linux-mm@kvack.org S: Maintained W: https://www.linux-mm.org @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 12 SUBLEVEL = 0 -EXTRAVERSION = -rc3 +EXTRAVERSION = -rc4 NAME = Baby Opossum Posse # *DOCUMENTATION* diff --git a/arch/Kconfig b/arch/Kconfig index 8af374ea1adc..00163e4a237c 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -838,7 +838,7 @@ config CFI_CLANG config CFI_ICALL_NORMALIZE_INTEGERS bool "Normalize CFI tags for integers" depends on CFI_CLANG - depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS + depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG help This option normalizes the CFI tags for integer types so that all integer types of the same size and signedness receive the same CFI @@ -851,21 +851,19 @@ config CFI_ICALL_NORMALIZE_INTEGERS This option is necessary for using CFI with Rust. If unsure, say N. -config HAVE_CFI_ICALL_NORMALIZE_INTEGERS - def_bool !GCOV_KERNEL && !KASAN - depends on CFI_CLANG +config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG + def_bool y depends on $(cc-option,-fsanitize=kcfi -fsanitize-cfi-icall-experimental-normalize-integers) - help - Is CFI_ICALL_NORMALIZE_INTEGERS supported with the set of compilers - currently in use? + # With GCOV/KASAN we need this fix: https://github.com/llvm/llvm-project/pull/104826 + depends on CLANG_VERSION >= 190000 || (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS) - This option defaults to false if GCOV or KASAN is enabled, as there is - an LLVM bug that makes normalized integers tags incompatible with - KASAN and GCOV. Kconfig currently does not have the infrastructure to - detect whether your rustc compiler contains the fix for this bug, so - it is assumed that it doesn't. If your compiler has the fix, you can - explicitly enable this option in your config file. The Kconfig logic - needed to detect this will be added in a future kernel release. +config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC + def_bool y + depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG + depends on RUSTC_VERSION >= 107900 + # With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373 + depends on (RUSTC_LLVM_VERSION >= 190000 && RUSTC_VERSION >= 108200) || \ + (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS) config CFI_PERMISSIVE bool "Use CFI in permissive mode" diff --git a/arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts b/arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts index 72d26d130efa..85f54fa595aa 100644 --- a/arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts +++ b/arch/arm/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts @@ -77,7 +77,7 @@ }; &hdmi { - hpd-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>; + hpd-gpios = <&expgpio 0 GPIO_ACTIVE_LOW>; power-domains = <&power RPI_POWER_DOMAIN_HDMI>; status = "okay"; }; diff --git a/arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi b/arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi index 4676e3488f54..cb8d54895a77 100644 --- a/arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi +++ b/arch/arm64/boot/dts/marvell/cn9130-sr-som.dtsi @@ -136,7 +136,7 @@ }; cp0_mdio_pins: cp0-mdio-pins { - marvell,pins = "mpp40", "mpp41"; + marvell,pins = "mpp0", "mpp1"; marvell,function = "ge"; }; diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index b36a3b6cc011..67afac659231 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -178,6 +178,7 @@ struct kvm_nvhe_init_params { unsigned long hcr_el2; unsigned long vttbr; unsigned long vtcr; + unsigned long tmp; }; /* diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 94cff508874b..bf64fed9820e 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -51,6 +51,7 @@ #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) #define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7) +#define KVM_REQ_NESTED_S2_UNMAP KVM_ARCH_REQ(8) #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ KVM_DIRTY_LOG_INITIALLY_SET) @@ -212,6 +213,12 @@ struct kvm_s2_mmu { bool nested_stage2_enabled; /* + * true when this MMU needs to be unmapped before being used for a new + * purpose. + */ + bool pending_unmap; + + /* * 0: Nobody is currently using this, check vttbr for validity * >0: Somebody is actively using this. */ diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index cd4087fbda9a..66d93e320ec8 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -166,7 +166,8 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size, int create_hyp_stack(phys_addr_t phys_addr, unsigned long *haddr); void __init free_hyp_pgds(void); -void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size); +void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, + u64 size, bool may_block); void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end); void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end); diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index e8bc6d67aba2..233e65522716 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -78,6 +78,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid, extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); +extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu); + struct kvm_s2_trans { phys_addr_t output; unsigned long block_size; @@ -124,7 +126,7 @@ extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans); extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2); extern void kvm_nested_s2_wp(struct kvm *kvm); -extern void kvm_nested_s2_unmap(struct kvm *kvm); +extern void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block); extern void kvm_nested_s2_flush(struct kvm *kvm); unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val); diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h index 2b09495499c6..014b02897f8e 100644 --- a/arch/arm64/include/asm/uprobes.h +++ b/arch/arm64/include/asm/uprobes.h @@ -10,11 +10,9 @@ #include <asm/insn.h> #include <asm/probes.h> -#define MAX_UINSN_BYTES AARCH64_INSN_SIZE - #define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES) #define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE -#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES +#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE typedef __le32 uprobe_opcode_t; @@ -23,8 +21,8 @@ struct arch_uprobe_task { struct arch_uprobe { union { - u8 insn[MAX_UINSN_BYTES]; - u8 ixol[MAX_UINSN_BYTES]; + __le32 insn; + __le32 ixol; }; struct arch_probe_insn api; bool simulate; diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index 27de1dddb0ab..b21dd24b8efc 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -146,6 +146,7 @@ int main(void) DEFINE(NVHE_INIT_HCR_EL2, offsetof(struct kvm_nvhe_init_params, hcr_el2)); DEFINE(NVHE_INIT_VTTBR, offsetof(struct kvm_nvhe_init_params, vttbr)); DEFINE(NVHE_INIT_VTCR, offsetof(struct kvm_nvhe_init_params, vtcr)); + DEFINE(NVHE_INIT_TMP, offsetof(struct kvm_nvhe_init_params, tmp)); #endif #ifdef CONFIG_CPU_PM DEFINE(CPU_CTX_SP, offsetof(struct cpu_suspend_ctx, sp)); diff --git a/arch/arm64/kernel/probes/decode-insn.c b/arch/arm64/kernel/probes/decode-insn.c index 968d5fffe233..3496d6169e59 100644 --- a/arch/arm64/kernel/probes/decode-insn.c +++ b/arch/arm64/kernel/probes/decode-insn.c @@ -99,10 +99,6 @@ arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *api) aarch64_insn_is_blr(insn) || aarch64_insn_is_ret(insn)) { api->handler = simulate_br_blr_ret; - } else if (aarch64_insn_is_ldr_lit(insn)) { - api->handler = simulate_ldr_literal; - } else if (aarch64_insn_is_ldrsw_lit(insn)) { - api->handler = simulate_ldrsw_literal; } else { /* * Instruction cannot be stepped out-of-line and we don't @@ -140,6 +136,17 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi) probe_opcode_t insn = le32_to_cpu(*addr); probe_opcode_t *scan_end = NULL; unsigned long size = 0, offset = 0; + struct arch_probe_insn *api = &asi->api; + + if (aarch64_insn_is_ldr_lit(insn)) { + api->handler = simulate_ldr_literal; + decoded = INSN_GOOD_NO_SLOT; + } else if (aarch64_insn_is_ldrsw_lit(insn)) { + api->handler = simulate_ldrsw_literal; + decoded = INSN_GOOD_NO_SLOT; + } else { + decoded = arm_probe_decode_insn(insn, &asi->api); + } /* * If there's a symbol defined in front of and near enough to @@ -157,7 +164,6 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi) else scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE; } - decoded = arm_probe_decode_insn(insn, &asi->api); if (decoded != INSN_REJECTED && scan_end) if (is_probed_address_atomic(addr - 1, scan_end)) diff --git a/arch/arm64/kernel/probes/simulate-insn.c b/arch/arm64/kernel/probes/simulate-insn.c index 22d0b3252476..b65334ab79d2 100644 --- a/arch/arm64/kernel/probes/simulate-insn.c +++ b/arch/arm64/kernel/probes/simulate-insn.c @@ -171,17 +171,15 @@ simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs) void __kprobes simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs) { - u64 *load_addr; + unsigned long load_addr; int xn = opcode & 0x1f; - int disp; - disp = ldr_displacement(opcode); - load_addr = (u64 *) (addr + disp); + load_addr = addr + ldr_displacement(opcode); if (opcode & (1 << 30)) /* x0-x30 */ - set_x_reg(regs, xn, *load_addr); + set_x_reg(regs, xn, READ_ONCE(*(u64 *)load_addr)); else /* w0-w30 */ - set_w_reg(regs, xn, *load_addr); + set_w_reg(regs, xn, READ_ONCE(*(u32 *)load_addr)); instruction_pointer_set(regs, instruction_pointer(regs) + 4); } @@ -189,14 +187,12 @@ simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs) void __kprobes simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs) { - s32 *load_addr; + unsigned long load_addr; int xn = opcode & 0x1f; - int disp; - disp = ldr_displacement(opcode); - load_addr = (s32 *) (addr + disp); + load_addr = addr + ldr_displacement(opcode); - set_x_reg(regs, xn, *load_addr); + set_x_reg(regs, xn, READ_ONCE(*(s32 *)load_addr)); instruction_pointer_set(regs, instruction_pointer(regs) + 4); } diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c index d49aef2657cd..a2f137a595fc 100644 --- a/arch/arm64/kernel/probes/uprobes.c +++ b/arch/arm64/kernel/probes/uprobes.c @@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE)) return -EINVAL; - insn = *(probe_opcode_t *)(&auprobe->insn[0]); + insn = le32_to_cpu(auprobe->insn); switch (arm_probe_decode_insn(insn, &auprobe->api)) { case INSN_REJECTED: @@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) if (!auprobe->simulate) return false; - insn = *(probe_opcode_t *)(&auprobe->insn[0]); + insn = le32_to_cpu(auprobe->insn); addr = instruction_pointer(regs); if (auprobe->api.handler) diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 0540653fbf38..3e7c8c8195c3 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -412,6 +412,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) p->thread.cpu_context.x19 = (unsigned long)args->fn; p->thread.cpu_context.x20 = (unsigned long)args->fn_arg; + + if (system_supports_poe()) + p->thread.por_el0 = POR_EL0_INIT; } p->thread.cpu_context.pc = (unsigned long)ret_from_fork; p->thread.cpu_context.sp = (unsigned long)childregs; diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index a0d01c46e408..48cafb65d6ac 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -997,6 +997,9 @@ static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu) static int check_vcpu_requests(struct kvm_vcpu *vcpu) { if (kvm_request_pending(vcpu)) { + if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu)) + return -EIO; + if (kvm_check_request(KVM_REQ_SLEEP, vcpu)) kvm_vcpu_sleep(vcpu); @@ -1031,6 +1034,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) if (kvm_dirty_ring_check_request(vcpu)) return 0; + + check_nested_vcpu_requests(vcpu); } return 1; diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S index 401af1835be6..fc1866226067 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S @@ -24,28 +24,25 @@ .align 11 SYM_CODE_START(__kvm_hyp_init) - ventry __invalid // Synchronous EL2t - ventry __invalid // IRQ EL2t - ventry __invalid // FIQ EL2t - ventry __invalid // Error EL2t + ventry . // Synchronous EL2t + ventry . // IRQ EL2t + ventry . // FIQ EL2t + ventry . // Error EL2t - ventry __invalid // Synchronous EL2h - ventry __invalid // IRQ EL2h - ventry __invalid // FIQ EL2h - ventry __invalid // Error EL2h + ventry . // Synchronous EL2h + ventry . // IRQ EL2h + ventry . // FIQ EL2h + ventry . // Error EL2h ventry __do_hyp_init // Synchronous 64-bit EL1 - ventry __invalid // IRQ 64-bit EL1 - ventry __invalid // FIQ 64-bit EL1 - ventry __invalid // Error 64-bit EL1 + ventry . // IRQ 64-bit EL1 + ventry . // FIQ 64-bit EL1 + ventry . // Error 64-bit EL1 - ventry __invalid // Synchronous 32-bit EL1 - ventry __invalid // IRQ 32-bit EL1 - ventry __invalid // FIQ 32-bit EL1 - ventry __invalid // Error 32-bit EL1 - -__invalid: - b . + ventry . // Synchronous 32-bit EL1 + ventry . // IRQ 32-bit EL1 + ventry . // FIQ 32-bit EL1 + ventry . // Error 32-bit EL1 /* * Only uses x0..x3 so as to not clobber callee-saved SMCCC registers. @@ -76,6 +73,13 @@ __do_hyp_init: eret SYM_CODE_END(__kvm_hyp_init) +SYM_CODE_START_LOCAL(__kvm_init_el2_state) + /* Initialize EL2 CPU state to sane values. */ + init_el2_state // Clobbers x0..x2 + finalise_el2_state + ret +SYM_CODE_END(__kvm_init_el2_state) + /* * Initialize the hypervisor in EL2. * @@ -102,9 +106,12 @@ SYM_CODE_START_LOCAL(___kvm_hyp_init) // TPIDR_EL2 is used to preserve x0 across the macro maze... isb msr tpidr_el2, x0 - init_el2_state - finalise_el2_state + str lr, [x0, #NVHE_INIT_TMP] + + bl __kvm_init_el2_state + mrs x0, tpidr_el2 + ldr lr, [x0, #NVHE_INIT_TMP] 1: ldr x1, [x0, #NVHE_INIT_TPIDR_EL2] @@ -199,9 +206,8 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu) 2: msr SPsel, #1 // We want to use SP_EL{1,2} - /* Initialize EL2 CPU state to sane values. */ - init_el2_state // Clobbers x0..x2 - finalise_el2_state + bl __kvm_init_el2_state + __init_el2_nvhe_prepare_eret /* Enable MMU, set vectors and stack. */ diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c index 5763d979d8ca..ee6573befb81 100644 --- a/arch/arm64/kvm/hypercalls.c +++ b/arch/arm64/kvm/hypercalls.c @@ -317,7 +317,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu) * to the guest, and hide SSBS so that the * guest stays protected. */ - if (cpus_have_final_cap(ARM64_SSBS)) + if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP)) break; fallthrough; case SPECTRE_UNAFFECTED: @@ -428,7 +428,7 @@ int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices) * Convert the workaround level into an easy-to-compare number, where higher * values mean better protection. */ -static int get_kernel_wa_level(u64 regid) +static int get_kernel_wa_level(struct kvm_vcpu *vcpu, u64 regid) { switch (regid) { case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1: @@ -449,7 +449,7 @@ static int get_kernel_wa_level(u64 regid) * don't have any FW mitigation if SSBS is there at * all times. */ - if (cpus_have_final_cap(ARM64_SSBS)) + if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP)) return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL; fallthrough; case SPECTRE_UNAFFECTED: @@ -486,7 +486,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1: case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2: case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3: - val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK; + val = get_kernel_wa_level(vcpu, reg->id) & KVM_REG_FEATURE_LEVEL_MASK; break; case KVM_REG_ARM_STD_BMAP: val = READ_ONCE(smccc_feat->std_bmap); @@ -588,7 +588,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (val & ~KVM_REG_FEATURE_LEVEL_MASK) return -EINVAL; - if (get_kernel_wa_level(reg->id) < val) + if (get_kernel_wa_level(vcpu, reg->id) < val) return -EINVAL; return 0; @@ -624,7 +624,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) * We can deal with NOT_AVAIL on NOT_REQUIRED, but not the * other way around. */ - if (get_kernel_wa_level(reg->id) < wa_level) + if (get_kernel_wa_level(vcpu, reg->id) < wa_level) return -EINVAL; return 0; diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index a509b63bd4dd..0f7658aefa1a 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -328,9 +328,10 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 may_block)); } -void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size) +void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, + u64 size, bool may_block) { - __unmap_stage2_range(mmu, start, size, true); + __unmap_stage2_range(mmu, start, size, may_block); } void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end) @@ -1015,7 +1016,7 @@ static void stage2_unmap_memslot(struct kvm *kvm, if (!(vma->vm_flags & VM_PFNMAP)) { gpa_t gpa = addr + (vm_start - memslot->userspace_addr); - kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start); + kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start, true); } hva = vm_end; } while (hva < reg_end); @@ -1042,7 +1043,7 @@ void stage2_unmap_vm(struct kvm *kvm) kvm_for_each_memslot(memslot, bkt, slots) stage2_unmap_memslot(kvm, memslot); - kvm_nested_s2_unmap(kvm); + kvm_nested_s2_unmap(kvm, true); write_unlock(&kvm->mmu_lock); mmap_read_unlock(current->mm); @@ -1912,7 +1913,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) (range->end - range->start) << PAGE_SHIFT, range->may_block); - kvm_nested_s2_unmap(kvm); + kvm_nested_s2_unmap(kvm, range->may_block); return false; } @@ -2179,8 +2180,8 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm, phys_addr_t size = slot->npages << PAGE_SHIFT; write_lock(&kvm->mmu_lock); - kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size); - kvm_nested_s2_unmap(kvm); + kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size, true); + kvm_nested_s2_unmap(kvm, true); write_unlock(&kvm->mmu_lock); } diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index f9e30dd34c7a..c4b17d90fc49 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -632,9 +632,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu) /* Set the scene for the next search */ kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size; - /* Clear the old state */ + /* Make sure we don't forget to do the laundry */ if (kvm_s2_mmu_valid(s2_mmu)) - kvm_stage2_unmap_range(s2_mmu, 0, kvm_phys_size(s2_mmu)); + s2_mmu->pending_unmap = true; /* * The virtual VMID (modulo CnP) will be used as a key when matching @@ -650,6 +650,16 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu) out: atomic_inc(&s2_mmu->refcnt); + + /* + * Set the vCPU request to perform an unmap, even if the pending unmap + * originates from another vCPU. This guarantees that the MMU has been + * completely unmapped before any vCPU actually uses it, and allows + * multiple vCPUs to lend a hand with completing the unmap. + */ + if (s2_mmu->pending_unmap) + kvm_make_request(KVM_REQ_NESTED_S2_UNMAP, vcpu); + return s2_mmu; } @@ -663,6 +673,13 @@ void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu) void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu) { + /* + * The vCPU kept its reference on the MMU after the last put, keep + * rolling with it. + */ + if (vcpu->arch.hw_mmu) + return; + if (is_hyp_ctxt(vcpu)) { vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu; } else { @@ -674,10 +691,18 @@ void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu) void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu) { - if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu)) { + /* + * Keep a reference on the associated stage-2 MMU if the vCPU is + * scheduling out and not in WFI emulation, suggesting it is likely to + * reuse the MMU sometime soon. + */ + if (vcpu->scheduled_out && !vcpu_get_flag(vcpu, IN_WFI)) + return; + + if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu)) atomic_dec(&vcpu->arch.hw_mmu->refcnt); - vcpu->arch.hw_mmu = NULL; - } + + vcpu->arch.hw_mmu = NULL; } /* @@ -730,7 +755,7 @@ void kvm_nested_s2_wp(struct kvm *kvm) } } -void kvm_nested_s2_unmap(struct kvm *kvm) +void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block) { int i; @@ -740,7 +765,7 @@ void kvm_nested_s2_unmap(struct kvm *kvm) struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; if (kvm_s2_mmu_valid(mmu)) - kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu)); + kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block); } } @@ -1184,3 +1209,17 @@ int kvm_init_nv_sysregs(struct kvm *kvm) return 0; } + +void check_nested_vcpu_requests(struct kvm_vcpu *vcpu) +{ + if (kvm_check_request(KVM_REQ_NESTED_S2_UNMAP, vcpu)) { + struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu; + + write_lock(&vcpu->kvm->mmu_lock); + if (mmu->pending_unmap) { + kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true); + mmu->pending_unmap = false; + } + write_unlock(&vcpu->kvm->mmu_lock); + } +} diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index dad88e31f953..ff8c4e1b847e 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1527,6 +1527,14 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu, val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE); val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_RNDR_trap); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_NMI); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_GCS); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_THE); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTEX); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_DF2); + val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_PFAR); break; case SYS_ID_AA64PFR2_EL1: /* We only expose FPMR */ @@ -1550,7 +1558,8 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu, val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK; break; case SYS_ID_AA64MMFR3_EL1: - val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE; + val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE | + ID_AA64MMFR3_EL1_S1PIE; break; case SYS_ID_MMFR4_EL1: val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX); @@ -1985,7 +1994,7 @@ static u64 reset_clidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) * one cache line. */ if (kvm_has_mte(vcpu->kvm)) - clidr |= 2 << CLIDR_TTYPE_SHIFT(loc); + clidr |= 2ULL << CLIDR_TTYPE_SHIFT(loc); __vcpu_sys_reg(vcpu, r->reg) = clidr; @@ -2376,7 +2385,19 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_AA64PFR0_EL1_RAS | ID_AA64PFR0_EL1_AdvSIMD | ID_AA64PFR0_EL1_FP), }, - ID_SANITISED(ID_AA64PFR1_EL1), + ID_WRITABLE(ID_AA64PFR1_EL1, ~(ID_AA64PFR1_EL1_PFAR | + ID_AA64PFR1_EL1_DF2 | + ID_AA64PFR1_EL1_MTEX | + ID_AA64PFR1_EL1_THE | + ID_AA64PFR1_EL1_GCS | + ID_AA64PFR1_EL1_MTE_frac | + ID_AA64PFR1_EL1_NMI | + ID_AA64PFR1_EL1_RNDR_trap | + ID_AA64PFR1_EL1_SME | + ID_AA64PFR1_EL1_RES0 | + ID_AA64PFR1_EL1_MPAM_frac | + ID_AA64PFR1_EL1_RAS_frac | + ID_AA64PFR1_EL1_MTE)), ID_WRITABLE(ID_AA64PFR2_EL1, ID_AA64PFR2_EL1_FPMR), ID_UNALLOCATED(4,3), ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0), @@ -2390,7 +2411,21 @@ static const struct sys_reg_desc sys_reg_descs[] = { .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, .reset = read_sanitised_id_aa64dfr0_el1, - .val = ID_AA64DFR0_EL1_PMUVer_MASK | + /* + * Prior to FEAT_Debugv8.9, the architecture defines context-aware + * breakpoints (CTX_CMPs) as the highest numbered breakpoints (BRPs). + * KVM does not trap + emulate the breakpoint registers, and as such + * cannot support a layout that misaligns with the underlying hardware. + * While it may be possible to describe a subset that aligns with + * hardware, just prevent changes to BRPs and CTX_CMPs altogether for + * simplicity. + * + * See DDI0487K.a, section D2.8.3 Breakpoint types and linking + * of breakpoints for more details. + */ + .val = ID_AA64DFR0_EL1_DoubleLock_MASK | + ID_AA64DFR0_EL1_WRPs_MASK | + ID_AA64DFR0_EL1_PMUVer_MASK | ID_AA64DFR0_EL1_DebugVer_MASK, }, ID_SANITISED(ID_AA64DFR1_EL1), ID_UNALLOCATED(5,2), @@ -2433,6 +2468,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_AA64MMFR2_EL1_NV | ID_AA64MMFR2_EL1_CCIDX)), ID_WRITABLE(ID_AA64MMFR3_EL1, (ID_AA64MMFR3_EL1_TCRX | + ID_AA64MMFR3_EL1_S1PIE | ID_AA64MMFR3_EL1_S1POE)), ID_SANITISED(ID_AA64MMFR4_EL1), ID_UNALLOCATED(7,5), @@ -2903,7 +2939,7 @@ static bool handle_alle1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, * Drop all shadow S2s, resulting in S1/S2 TLBIs for each of the * corresponding VMIDs. */ - kvm_nested_s2_unmap(vcpu->kvm); + kvm_nested_s2_unmap(vcpu->kvm, true); write_unlock(&vcpu->kvm->mmu_lock); @@ -2955,7 +2991,30 @@ union tlbi_info { static void s2_mmu_unmap_range(struct kvm_s2_mmu *mmu, const union tlbi_info *info) { - kvm_stage2_unmap_range(mmu, info->range.start, info->range.size); + /* + * The unmap operation is allowed to drop the MMU lock and block, which + * means that @mmu could be used for a different context than the one + * currently being invalidated. + * + * This behavior is still safe, as: + * + * 1) The vCPU(s) that recycled the MMU are responsible for invalidating + * the entire MMU before reusing it, which still honors the intent + * of a TLBI. + * + * 2) Until the guest TLBI instruction is 'retired' (i.e. increment PC + * and ERET to the guest), other vCPUs are allowed to use stale + * translations. + * + * 3) Accidentally unmapping an unrelated MMU context is nonfatal, and + * at worst may cause more aborts for shadow stage-2 fills. + * + * Dropping the MMU lock also implies that shadow stage-2 fills could + * happen behind the back of the TLBI. This is still safe, though, as + * the L1 needs to put its stage-2 in a consistent state before doing + * the TLBI. + */ + kvm_stage2_unmap_range(mmu, info->range.start, info->range.size, true); } static bool handle_vmalls12e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, @@ -3050,7 +3109,11 @@ static void s2_mmu_unmap_ipa(struct kvm_s2_mmu *mmu, max_size = compute_tlb_inval_range(mmu, info->ipa.addr); base_addr &= ~(max_size - 1); - kvm_stage2_unmap_range(mmu, base_addr, max_size); + /* + * See comment in s2_mmu_unmap_range() for why this is allowed to + * reschedule. + */ + kvm_stage2_unmap_range(mmu, base_addr, max_size, true); } static bool handle_ipas2e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c index e7c53e8af3d1..48c952563e85 100644 --- a/arch/arm64/kvm/vgic/vgic-init.c +++ b/arch/arm64/kvm/vgic/vgic-init.c @@ -417,8 +417,28 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu) kfree(vgic_cpu->private_irqs); vgic_cpu->private_irqs = NULL; - if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) + if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) { + /* + * If this vCPU is being destroyed because of a failed creation + * then unregister the redistributor to avoid leaving behind a + * dangling pointer to the vCPU struct. + * + * vCPUs that have been successfully created (i.e. added to + * kvm->vcpu_array) get unregistered in kvm_vgic_destroy(), as + * this function gets called while holding kvm->arch.config_lock + * in the VM teardown path and would otherwise introduce a lock + * inversion w.r.t. kvm->srcu. + * + * vCPUs that failed creation are torn down outside of the + * kvm->arch.config_lock and do not get unregistered in + * kvm_vgic_destroy(), meaning it is both safe and necessary to + * do so here. + */ + if (kvm_get_vcpu_by_id(vcpu->kvm, vcpu->vcpu_id) != vcpu) + vgic_unregister_redist_iodev(vcpu); + vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF; + } } void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu) @@ -524,22 +544,31 @@ int kvm_vgic_map_resources(struct kvm *kvm) if (ret) goto out; - dist->ready = true; dist_base = dist->vgic_dist_base; mutex_unlock(&kvm->arch.config_lock); ret = vgic_register_dist_iodev(kvm, dist_base, type); - if (ret) + if (ret) { kvm_err("Unable to register VGIC dist MMIO regions\n"); + goto out_slots; + } + /* + * kvm_io_bus_register_dev() guarantees all readers see the new MMIO + * registration before returning through synchronize_srcu(), which also + * implies a full memory barrier. As such, marking the distributor as + * 'ready' here is guaranteed to be ordered after all vCPUs having seen + * a completely configured distributor. + */ + dist->ready = true; goto out_slots; out: mutex_unlock(&kvm->arch.config_lock); out_slots: - mutex_unlock(&kvm->slots_lock); - if (ret) - kvm_vgic_destroy(kvm); + kvm_vm_dead(kvm); + + mutex_unlock(&kvm->slots_lock); return ret; } diff --git a/arch/arm64/kvm/vgic/vgic-kvm-device.c b/arch/arm64/kvm/vgic/vgic-kvm-device.c index 1d26bb5b02f4..5f4f57aaa23e 100644 --- a/arch/arm64/kvm/vgic/vgic-kvm-device.c +++ b/arch/arm64/kvm/vgic/vgic-kvm-device.c @@ -236,7 +236,12 @@ static int vgic_set_common_attr(struct kvm_device *dev, mutex_lock(&dev->kvm->arch.config_lock); - if (vgic_ready(dev->kvm) || dev->kvm->arch.vgic.nr_spis) + /* + * Either userspace has already configured NR_IRQS or + * the vgic has already been initialized and vgic_init() + * supplied a default amount of SPIs. + */ + if (dev->kvm->arch.vgic.nr_spis) ret = -EBUSY; else dev->kvm->arch.vgic.nr_spis = diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h index 6d5846dd075c..7657e016233f 100644 --- a/arch/loongarch/include/asm/bootinfo.h +++ b/arch/loongarch/include/asm/bootinfo.h @@ -26,6 +26,10 @@ struct loongson_board_info { #define NR_WORDS DIV_ROUND_UP(NR_CPUS, BITS_PER_LONG) +/* + * The "core" of cores_per_node and cores_per_package stands for a + * logical core, which means in a SMT system it stands for a thread. + */ struct loongson_system_configuration { int nr_cpus; int nr_nodes; diff --git a/arch/loongarch/include/asm/kasan.h b/arch/loongarch/include/asm/kasan.h index cd6084f4e153..c6bce5fbff57 100644 --- a/arch/loongarch/include/asm/kasan.h +++ b/arch/loongarch/include/asm/kasan.h @@ -16,7 +16,7 @@ #define XRANGE_SHIFT (48) /* Valid address length */ -#define XRANGE_SHADOW_SHIFT (PGDIR_SHIFT + PAGE_SHIFT - 3) +#define XRANGE_SHADOW_SHIFT min(cpu_vabits, VA_BITS) /* Used for taking out the valid address */ #define XRANGE_SHADOW_MASK GENMASK_ULL(XRANGE_SHADOW_SHIFT - 1, 0) /* One segment whole address space size */ diff --git a/arch/loongarch/include/asm/loongarch.h b/arch/loongarch/include/asm/loongarch.h index 26542413a5b0..64ad277e096e 100644 --- a/arch/loongarch/include/asm/loongarch.h +++ b/arch/loongarch/include/asm/loongarch.h @@ -250,7 +250,7 @@ #define CSR_ESTAT_IS_WIDTH 15 #define CSR_ESTAT_IS (_ULCAST_(0x7fff) << CSR_ESTAT_IS_SHIFT) -#define LOONGARCH_CSR_ERA 0x6 /* ERA */ +#define LOONGARCH_CSR_ERA 0x6 /* Exception return address */ #define LOONGARCH_CSR_BADV 0x7 /* Bad virtual address */ diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h index 4e2d6b7ca2ee..a7b9c9e73593 100644 --- a/arch/loongarch/include/asm/pgalloc.h +++ b/arch/loongarch/include/asm/pgalloc.h @@ -10,6 +10,7 @@ #define __HAVE_ARCH_PMD_ALLOC_ONE #define __HAVE_ARCH_PUD_ALLOC_ONE +#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL #include <asm-generic/pgalloc.h> static inline void pmd_populate_kernel(struct mm_struct *mm, @@ -44,6 +45,16 @@ extern void pagetable_init(void); extern pgd_t *pgd_alloc(struct mm_struct *mm); +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm) +{ + pte_t *pte = __pte_alloc_one_kernel(mm); + + if (pte) + kernel_pte_init(pte); + + return pte; +} + #define __pte_free_tlb(tlb, pte, address) \ do { \ pagetable_pte_dtor(page_ptdesc(pte)); \ diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h index 9965f52ef65b..20714b73f14c 100644 --- a/arch/loongarch/include/asm/pgtable.h +++ b/arch/loongarch/include/asm/pgtable.h @@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm extern void pgd_init(void *addr); extern void pud_init(void *addr); extern void pmd_init(void *addr); +extern void kernel_pte_init(void *addr); /* * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that @@ -325,39 +326,17 @@ static inline void set_pte(pte_t *ptep, pte_t pteval) { WRITE_ONCE(*ptep, pteval); - if (pte_val(pteval) & _PAGE_GLOBAL) { - pte_t *buddy = ptep_buddy(ptep); - /* - * Make sure the buddy is global too (if it's !none, - * it better already be global) - */ - if (pte_none(ptep_get(buddy))) { #ifdef CONFIG_SMP - /* - * For SMP, multiple CPUs can race, so we need - * to do this atomically. - */ - __asm__ __volatile__( - __AMOR "$zero, %[global], %[buddy] \n" - : [buddy] "+ZB" (buddy->pte) - : [global] "r" (_PAGE_GLOBAL) - : "memory"); - - DBAR(0b11000); /* o_wrw = 0b11000 */ -#else /* !CONFIG_SMP */ - WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL)); -#endif /* CONFIG_SMP */ - } - } + if (pte_val(pteval) & _PAGE_GLOBAL) + DBAR(0b11000); /* o_wrw = 0b11000 */ +#endif } static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - /* Preserve global status for the pair */ - if (pte_val(ptep_get(ptep_buddy(ptep))) & _PAGE_GLOBAL) - set_pte(ptep, __pte(_PAGE_GLOBAL)); - else - set_pte(ptep, __pte(0)); + pte_t pte = ptep_get(ptep); + pte_val(pte) &= _PAGE_GLOBAL; + set_pte(ptep, pte); } #define PGD_T_LOG2 (__builtin_ffs(sizeof(pgd_t)) - 1) diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c index f2ff8b5d591e..6e58f65455c7 100644 --- a/arch/loongarch/kernel/process.c +++ b/arch/loongarch/kernel/process.c @@ -293,13 +293,15 @@ unsigned long stack_top(void) { unsigned long top = TASK_SIZE & PAGE_MASK; - /* Space for the VDSO & data page */ - top -= PAGE_ALIGN(current->thread.vdso->size); - top -= VVAR_SIZE; - - /* Space to randomize the VDSO base */ - if (current->flags & PF_RANDOMIZE) - top -= VDSO_RANDOMIZE_SIZE; + if (current->thread.vdso) { + /* Space for the VDSO & data page */ + top -= PAGE_ALIGN(current->thread.vdso->size); + top -= VVAR_SIZE; + + /* Space to randomize the VDSO base */ + if (current->flags & PF_RANDOMIZE) + top -= VDSO_RANDOMIZE_SIZE; + } return top; } diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c index 00e307203ddb..cbd3c09a93c1 100644 --- a/arch/loongarch/kernel/setup.c +++ b/arch/loongarch/kernel/setup.c @@ -55,6 +55,7 @@ #define SMBIOS_FREQHIGH_OFFSET 0x17 #define SMBIOS_FREQLOW_MASK 0xFF #define SMBIOS_CORE_PACKAGE_OFFSET 0x23 +#define SMBIOS_THREAD_PACKAGE_OFFSET 0x25 #define LOONGSON_EFI_ENABLE (1 << 3) unsigned long fw_arg0, fw_arg1, fw_arg2; @@ -125,7 +126,7 @@ static void __init parse_cpu_table(const struct dmi_header *dm) cpu_clock_freq = freq_temp * 1000000; loongson_sysconf.cpuname = (void *)dmi_string_parse(dm, dmi_data[16]); - loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_CORE_PACKAGE_OFFSET); + loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_THREAD_PACKAGE_OFFSET); pr_info("CpuClock = %llu\n", cpu_clock_freq); } diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c index f9f4eb00c92e..c57b4134f3e8 100644 --- a/arch/loongarch/kernel/traps.c +++ b/arch/loongarch/kernel/traps.c @@ -555,6 +555,9 @@ asmlinkage void noinstr do_ale(struct pt_regs *regs) #else unsigned int *pc; + if (regs->csr_prmd & CSR_PRMD_PIE) + local_irq_enable(); + perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, regs->csr_badvaddr); /* @@ -579,6 +582,8 @@ sigbus: die_if_kernel("Kernel ale access", regs); force_sig_fault(SIGBUS, BUS_ADRALN, (void __user *)regs->csr_badvaddr); out: + if (regs->csr_prmd & CSR_PRMD_PIE) + local_irq_disable(); #endif irqentry_exit(regs, state); } diff --git a/arch/loongarch/kernel/vdso.c b/arch/loongarch/kernel/vdso.c index f6fcc52aefae..2c0d852ca536 100644 --- a/arch/loongarch/kernel/vdso.c +++ b/arch/loongarch/kernel/vdso.c @@ -34,7 +34,6 @@ static union { struct loongarch_vdso_data vdata; } loongarch_vdso_data __page_aligned_data; -static struct page *vdso_pages[] = { NULL }; struct vdso_data *vdso_data = generic_vdso_data.data; struct vdso_pcpu_data *vdso_pdata = loongarch_vdso_data.vdata.pdata; struct vdso_rng_data *vdso_rng_data = &loongarch_vdso_data.vdata.rng_data; @@ -85,10 +84,8 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm, struct loongarch_vdso_info vdso_info = { .vdso = vdso_start, - .size = PAGE_SIZE, .code_mapping = { .name = "[vdso]", - .pages = vdso_pages, .mremap = vdso_mremap, }, .data_mapping = { @@ -103,11 +100,14 @@ static int __init init_vdso(void) unsigned long i, cpu, pfn; BUG_ON(!PAGE_ALIGNED(vdso_info.vdso)); - BUG_ON(!PAGE_ALIGNED(vdso_info.size)); for_each_possible_cpu(cpu) vdso_pdata[cpu].node = cpu_to_node(cpu); + vdso_info.size = PAGE_ALIGN(vdso_end - vdso_start); + vdso_info.code_mapping.pages = + kcalloc(vdso_info.size / PAGE_SIZE, sizeof(struct page *), GFP_KERNEL); + pfn = __phys_to_pfn(__pa_symbol(vdso_info.vdso)); for (i = 0; i < vdso_info.size / PAGE_SIZE; i++) vdso_info.code_mapping.pages[i] = pfn_to_page(pfn + i); diff --git a/arch/loongarch/kvm/timer.c b/arch/loongarch/kvm/timer.c index 74a4b5c272d6..32dc213374be 100644 --- a/arch/loongarch/kvm/timer.c +++ b/arch/loongarch/kvm/timer.c @@ -161,10 +161,11 @@ static void _kvm_save_timer(struct kvm_vcpu *vcpu) if (kvm_vcpu_is_blocking(vcpu)) { /* - * HRTIMER_MODE_PINNED is suggested since vcpu may run in - * the same physical cpu in next time + * HRTIMER_MODE_PINNED_HARD is suggested since vcpu may run in + * the same physical cpu in next time, and the timer should run + * in hardirq context even in the PREEMPT_RT case. */ - hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED); + hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED_HARD); } } diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c index 0697b1064251..174734a23d0a 100644 --- a/arch/loongarch/kvm/vcpu.c +++ b/arch/loongarch/kvm/vcpu.c @@ -1457,7 +1457,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.vpid = 0; vcpu->arch.flush_gpa = INVALID_GPA; - hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED); + hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD); vcpu->arch.swtimer.function = kvm_swtimer_wakeup; vcpu->arch.handle_exit = kvm_handle_exit; diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c index 8a87a482c8f4..188b52bbb254 100644 --- a/arch/loongarch/mm/init.c +++ b/arch/loongarch/mm/init.c @@ -201,7 +201,9 @@ pte_t * __init populate_kernel_pte(unsigned long addr) pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE); if (!pte) panic("%s: Failed to allocate memory\n", __func__); + pmd_populate_kernel(&init_mm, pmd, pte); + kernel_pte_init(pte); } return pte_offset_kernel(pmd, addr); diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c index eb6a29b491a7..3fa69b23ff84 100644 --- a/arch/loongarch/mm/pgtable.c +++ b/arch/loongarch/mm/pgtable.c @@ -116,6 +116,26 @@ void pud_init(void *addr) EXPORT_SYMBOL_GPL(pud_init); #endif +void kernel_pte_init(void *addr) +{ + unsigned long *p, *end; + + p = (unsigned long *)addr; + end = p + PTRS_PER_PTE; + + do { + p[0] = _PAGE_GLOBAL; + p[1] = _PAGE_GLOBAL; + p[2] = _PAGE_GLOBAL; + p[3] = _PAGE_GLOBAL; + p[4] = _PAGE_GLOBAL; + p += 8; + p[-3] = _PAGE_GLOBAL; + p[-2] = _PAGE_GLOBAL; + p[-1] = _PAGE_GLOBAL; + } while (p != end); +} + pmd_t mk_pmd(struct page *page, pgprot_t prot) { pmd_t pmd; diff --git a/arch/powerpc/platforms/powernv/opal-irqchip.c b/arch/powerpc/platforms/powernv/opal-irqchip.c index 56a1f7ce78d2..d92759c21fae 100644 --- a/arch/powerpc/platforms/powernv/opal-irqchip.c +++ b/arch/powerpc/platforms/powernv/opal-irqchip.c @@ -282,6 +282,7 @@ int __init opal_event_init(void) name, NULL); if (rc) { pr_warn("Error %d requesting OPAL irq %d\n", rc, (int)r->start); + kfree(name); continue; } } diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c index 0a1e859323b4..a8085cd8215e 100644 --- a/arch/riscv/kvm/aia_imsic.c +++ b/arch/riscv/kvm/aia_imsic.c @@ -55,7 +55,7 @@ struct imsic { /* IMSIC SW-file */ struct imsic_mrif *swfile; phys_addr_t swfile_pa; - spinlock_t swfile_extirq_lock; + raw_spinlock_t swfile_extirq_lock; }; #define imsic_vs_csr_read(__c) \ @@ -622,7 +622,7 @@ static void imsic_swfile_extirq_update(struct kvm_vcpu *vcpu) * interruptions between reading topei and updating pending status. */ - spin_lock_irqsave(&imsic->swfile_extirq_lock, flags); + raw_spin_lock_irqsave(&imsic->swfile_extirq_lock, flags); if (imsic_mrif_atomic_read(mrif, &mrif->eidelivery) && imsic_mrif_topei(mrif, imsic->nr_eix, imsic->nr_msis)) @@ -630,7 +630,7 @@ static void imsic_swfile_extirq_update(struct kvm_vcpu *vcpu) else kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_VS_EXT); - spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags); + raw_spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags); } static void imsic_swfile_read(struct kvm_vcpu *vcpu, bool clear, @@ -1051,7 +1051,7 @@ int kvm_riscv_vcpu_aia_imsic_init(struct kvm_vcpu *vcpu) } imsic->swfile = page_to_virt(swfile_page); imsic->swfile_pa = page_to_phys(swfile_page); - spin_lock_init(&imsic->swfile_extirq_lock); + raw_spin_lock_init(&imsic->swfile_extirq_lock); /* Setup IO device */ kvm_iodevice_init(&imsic->iodev, &imsic_iodoev_ops); diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index 99f34409fb60..4cc631fa7039 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -18,6 +18,7 @@ #define RV_MAX_REG_ARGS 8 #define RV_FENTRY_NINSNS 2 #define RV_FENTRY_NBYTES (RV_FENTRY_NINSNS * 4) +#define RV_KCFI_NINSNS (IS_ENABLED(CONFIG_CFI_CLANG) ? 1 : 0) /* imm that allows emit_imm to emit max count insns */ #define RV_MAX_COUNT_IMM 0x7FFF7FF7FF7FF7FF @@ -271,7 +272,8 @@ static void __build_epilogue(bool is_tail_call, struct rv_jit_context *ctx) if (!is_tail_call) emit_addiw(RV_REG_A0, RV_REG_A5, 0, ctx); emit_jalr(RV_REG_ZERO, is_tail_call ? RV_REG_T3 : RV_REG_RA, - is_tail_call ? (RV_FENTRY_NINSNS + 1) * 4 : 0, /* skip reserved nops and TCC init */ + /* kcfi, fentry and TCC init insns will be skipped on tailcall */ + is_tail_call ? (RV_KCFI_NINSNS + RV_FENTRY_NINSNS + 1) * 4 : 0, ctx); } @@ -548,8 +550,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64, rv_lr_w(r0, 0, rd, 0, 0), ctx); jmp_offset = ninsns_rvoff(8); emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx); - emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) : - rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx); + emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) : + rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx); jmp_offset = ninsns_rvoff(-6); emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx); emit(rv_fence(0x3, 0x3), ctx); diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig index 9b57add02cd5..fb0e9a1d9be2 100644 --- a/arch/s390/configs/debug_defconfig +++ b/arch/s390/configs/debug_defconfig @@ -50,7 +50,6 @@ CONFIG_NUMA=y CONFIG_HZ_100=y CONFIG_CERT_STORE=y CONFIG_EXPOLINE=y -# CONFIG_EXPOLINE_EXTERN is not set CONFIG_EXPOLINE_AUTO=y CONFIG_CHSC_SCH=y CONFIG_VFIO_CCW=m @@ -95,6 +94,7 @@ CONFIG_BINFMT_MISC=m CONFIG_ZSWAP=y CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y CONFIG_ZSMALLOC_STAT=y +CONFIG_SLAB_BUCKETS=y CONFIG_SLUB_STATS=y # CONFIG_COMPAT_BRK is not set CONFIG_MEMORY_HOTPLUG=y @@ -426,6 +426,13 @@ CONFIG_DEVTMPFS_SAFE=y # CONFIG_FW_LOADER is not set CONFIG_CONNECTOR=y CONFIG_ZRAM=y +CONFIG_ZRAM_BACKEND_LZ4=y +CONFIG_ZRAM_BACKEND_LZ4HC=y +CONFIG_ZRAM_BACKEND_ZSTD=y +CONFIG_ZRAM_BACKEND_DEFLATE=y +CONFIG_ZRAM_BACKEND_842=y +CONFIG_ZRAM_BACKEND_LZO=y +CONFIG_ZRAM_DEF_COMP_DEFLATE=y CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_DRBD=m CONFIG_BLK_DEV_NBD=m @@ -486,6 +493,7 @@ CONFIG_DM_UEVENT=y CONFIG_DM_FLAKEY=m CONFIG_DM_VERITY=m CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y +CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING=y CONFIG_DM_SWITCH=m CONFIG_DM_INTEGRITY=m CONFIG_DM_VDO=m @@ -535,6 +543,7 @@ CONFIG_NLMON=m CONFIG_MLX4_EN=m CONFIG_MLX5_CORE=m CONFIG_MLX5_CORE_EN=y +# CONFIG_NET_VENDOR_META is not set # CONFIG_NET_VENDOR_MICREL is not set # CONFIG_NET_VENDOR_MICROCHIP is not set # CONFIG_NET_VENDOR_MICROSEMI is not set @@ -695,6 +704,7 @@ CONFIG_NFSD=m CONFIG_NFSD_V3_ACL=y CONFIG_NFSD_V4=y CONFIG_NFSD_V4_SECURITY_LABEL=y +# CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set CONFIG_CIFS=m CONFIG_CIFS_UPCALL=y CONFIG_CIFS_XATTR=y @@ -740,7 +750,6 @@ CONFIG_CRYPTO_DH=m CONFIG_CRYPTO_ECDH=m CONFIG_CRYPTO_ECDSA=m CONFIG_CRYPTO_ECRDSA=m -CONFIG_CRYPTO_SM2=m CONFIG_CRYPTO_CURVE25519=m CONFIG_CRYPTO_AES_TI=m CONFIG_CRYPTO_ANUBIS=m diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig index df4addd1834a..88be0a734b60 100644 --- a/arch/s390/configs/defconfig +++ b/arch/s390/configs/defconfig @@ -48,7 +48,6 @@ CONFIG_NUMA=y CONFIG_HZ_100=y CONFIG_CERT_STORE=y CONFIG_EXPOLINE=y -# CONFIG_EXPOLINE_EXTERN is not set CONFIG_EXPOLINE_AUTO=y CONFIG_CHSC_SCH=y CONFIG_VFIO_CCW=m @@ -89,6 +88,7 @@ CONFIG_BINFMT_MISC=m CONFIG_ZSWAP=y CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y CONFIG_ZSMALLOC_STAT=y +CONFIG_SLAB_BUCKETS=y # CONFIG_COMPAT_BRK is not set CONFIG_MEMORY_HOTPLUG=y CONFIG_MEMORY_HOTREMOVE=y @@ -416,6 +416,13 @@ CONFIG_DEVTMPFS_SAFE=y # CONFIG_FW_LOADER is not set CONFIG_CONNECTOR=y CONFIG_ZRAM=y +CONFIG_ZRAM_BACKEND_LZ4=y +CONFIG_ZRAM_BACKEND_LZ4HC=y +CONFIG_ZRAM_BACKEND_ZSTD=y +CONFIG_ZRAM_BACKEND_DEFLATE=y +CONFIG_ZRAM_BACKEND_842=y +CONFIG_ZRAM_BACKEND_LZO=y +CONFIG_ZRAM_DEF_COMP_DEFLATE=y CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_DRBD=m CONFIG_BLK_DEV_NBD=m @@ -476,6 +483,7 @@ CONFIG_DM_UEVENT=y CONFIG_DM_FLAKEY=m CONFIG_DM_VERITY=m CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y +CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING=y CONFIG_DM_SWITCH=m CONFIG_DM_INTEGRITY=m CONFIG_DM_VDO=m @@ -525,6 +533,7 @@ CONFIG_NLMON=m CONFIG_MLX4_EN=m CONFIG_MLX5_CORE=m CONFIG_MLX5_CORE_EN=y +# CONFIG_NET_VENDOR_META is not set # CONFIG_NET_VENDOR_MICREL is not set # CONFIG_NET_VENDOR_MICROCHIP is not set # CONFIG_NET_VENDOR_MICROSEMI is not set @@ -682,6 +691,7 @@ CONFIG_NFSD=m CONFIG_NFSD_V3_ACL=y CONFIG_NFSD_V4=y CONFIG_NFSD_V4_SECURITY_LABEL=y +# CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set CONFIG_CIFS=m CONFIG_CIFS_UPCALL=y CONFIG_CIFS_XATTR=y @@ -726,7 +736,6 @@ CONFIG_CRYPTO_DH=m CONFIG_CRYPTO_ECDH=m CONFIG_CRYPTO_ECDSA=m CONFIG_CRYPTO_ECRDSA=m -CONFIG_CRYPTO_SM2=m CONFIG_CRYPTO_CURVE25519=m CONFIG_CRYPTO_AES_TI=m CONFIG_CRYPTO_ANUBIS=m @@ -767,6 +776,7 @@ CONFIG_CRYPTO_LZ4=m CONFIG_CRYPTO_LZ4HC=m CONFIG_CRYPTO_ZSTD=m CONFIG_CRYPTO_ANSI_CPRNG=m +CONFIG_CRYPTO_JITTERENTROPY_OSR=1 CONFIG_CRYPTO_USER_API_HASH=m CONFIG_CRYPTO_USER_API_SKCIPHER=m CONFIG_CRYPTO_USER_API_RNG=m diff --git a/arch/s390/configs/zfcpdump_defconfig b/arch/s390/configs/zfcpdump_defconfig index 8c2b61363bab..bcbaa069de96 100644 --- a/arch/s390/configs/zfcpdump_defconfig +++ b/arch/s390/configs/zfcpdump_defconfig @@ -49,6 +49,7 @@ CONFIG_ZFCP=y # CONFIG_HVC_IUCV is not set # CONFIG_HW_RANDOM_S390 is not set # CONFIG_HMC_DRV is not set +# CONFIG_S390_UV_UAPI is not set # CONFIG_S390_TAPE is not set # CONFIG_VMCP is not set # CONFIG_MONWRITER is not set diff --git a/arch/s390/include/asm/perf_event.h b/arch/s390/include/asm/perf_event.h index 66200d4a2134..29ee289108c5 100644 --- a/arch/s390/include/asm/perf_event.h +++ b/arch/s390/include/asm/perf_event.h @@ -49,6 +49,7 @@ struct perf_sf_sde_regs { }; #define perf_arch_fetch_caller_regs(regs, __ip) do { \ + (regs)->psw.mask = 0; \ (regs)->psw.addr = (__ip); \ (regs)->gprs[15] = (unsigned long)__builtin_frame_address(0) - \ offsetof(struct stack_frame, back_chain); \ diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c index 2a32438e09ce..74f73141f9b9 100644 --- a/arch/s390/kvm/diag.c +++ b/arch/s390/kvm/diag.c @@ -77,7 +77,7 @@ static int __diag_page_ref_service(struct kvm_vcpu *vcpu) vcpu->stat.instruction_diagnose_258++; if (vcpu->run->s.regs.gprs[rx] & 7) return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); - rc = read_guest(vcpu, vcpu->run->s.regs.gprs[rx], rx, &parm, sizeof(parm)); + rc = read_guest_real(vcpu, vcpu->run->s.regs.gprs[rx], &parm, sizeof(parm)); if (rc) return kvm_s390_inject_prog_cond(vcpu, rc); if (parm.parm_version != 2 || parm.parm_len < 5 || parm.code != 0x258) diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c index e65f597e3044..a688351f4ab5 100644 --- a/arch/s390/kvm/gaccess.c +++ b/arch/s390/kvm/gaccess.c @@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa, const gfn_t gfn = gpa_to_gfn(gpa); int rc; + if (!gfn_to_memslot(kvm, gfn)) + return PGM_ADDRESSING; if (mode == GACC_STORE) rc = kvm_write_guest_page(kvm, gfn, data, offset, len); else @@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, gra += fragment_len; data += fragment_len; } + if (rc > 0) + vcpu->arch.pgm.code = rc; return rc; } diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h index b320d12aa049..3fde45a151f2 100644 --- a/arch/s390/kvm/gaccess.h +++ b/arch/s390/kvm/gaccess.h @@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data, * @len: number of bytes to copy * * Copy @len bytes from @data (kernel space) to @gra (guest real address). - * It is up to the caller to ensure that the entire guest memory range is - * valid memory before calling this function. * Guest low address and key protection are not checked. * - * Returns zero on success or -EFAULT on error. + * Returns zero on success, -EFAULT when copying from @data failed, or + * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info + * is also stored to allow injecting into the guest (if applicable) using + * kvm_s390_inject_prog_cond(). * * If an error occurs data may have been copied partially to guest memory. */ @@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data, * @len: number of bytes to copy * * Copy @len bytes from @gra (guest real address) to @data (kernel space). - * It is up to the caller to ensure that the entire guest memory range is - * valid memory before calling this function. * Guest key protection is not checked. * - * Returns zero on success or -EFAULT on error. + * Returns zero on success, -EFAULT when copying to @data failed, or + * PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info + * is also stored to allow injecting into the guest (if applicable) using + * kvm_s390_inject_prog_cond(). * * If an error occurs data may have been copied partially to kernel space. */ diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index dbe95ec5917e..d4f19d33914c 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -280,18 +280,19 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf) goto no_pdev; switch (ccdf->pec) { - case 0x003a: /* Service Action or Error Recovery Successful */ + case 0x002a: /* Error event concerns FMB */ + case 0x002b: + case 0x002c: + break; + case 0x0040: /* Service Action or Error Recovery Failed */ + case 0x003b: + zpci_event_io_failure(pdev, pci_channel_io_perm_failure); + break; + default: /* PCI function left in the error state attempt to recover */ ers_res = zpci_event_attempt_error_recovery(pdev); if (ers_res != PCI_ERS_RESULT_RECOVERED) zpci_event_io_failure(pdev, pci_channel_io_perm_failure); break; - default: - /* - * Mark as frozen not permanently failed because the device - * could be subsequently recovered by the platform. - */ - zpci_event_io_failure(pdev, pci_channel_io_frozen); - break; } pci_dev_put(pdev); no_pdev: diff --git a/arch/x86/entry/entry.S b/arch/x86/entry/entry.S index d9feadffa972..324686bca368 100644 --- a/arch/x86/entry/entry.S +++ b/arch/x86/entry/entry.S @@ -9,6 +9,8 @@ #include <asm/unwind_hints.h> #include <asm/segment.h> #include <asm/cache.h> +#include <asm/cpufeatures.h> +#include <asm/nospec-branch.h> #include "calling.h" @@ -19,6 +21,9 @@ SYM_FUNC_START(entry_ibpb) movl $PRED_CMD_IBPB, %eax xorl %edx, %edx wrmsr + + /* Make sure IBPB clears return stack preductions too. */ + FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_BUG_IBPB_NO_RET RET SYM_FUNC_END(entry_ibpb) /* For KVM */ diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index d3a814efbff6..20be5758c2d2 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -871,6 +871,8 @@ SYM_FUNC_START(entry_SYSENTER_32) /* Now ready to switch the cr3 */ SWITCH_TO_USER_CR3 scratch_reg=%eax + /* Clobbers ZF */ + CLEAR_CPU_BUFFERS /* * Restore all flags except IF. (We restore IF separately because @@ -881,7 +883,6 @@ SYM_FUNC_START(entry_SYSENTER_32) BUG_IF_WRONG_CR3 no_user_check=1 popfl popl %eax - CLEAR_CPU_BUFFERS /* * Return back to the vDSO, which will pop ecx and edx. @@ -1144,7 +1145,6 @@ SYM_CODE_START(asm_exc_nmi) /* Not on SYSENTER stack. */ call exc_nmi - CLEAR_CPU_BUFFERS jmp .Lnmi_return .Lnmi_from_sysenter_stack: @@ -1165,6 +1165,7 @@ SYM_CODE_START(asm_exc_nmi) CHECK_AND_APPLY_ESPFIX RESTORE_ALL_NMI cr3_reg=%edi pop=4 + CLEAR_CPU_BUFFERS jmp .Lirq_return #ifdef CONFIG_X86_ESPFIX32 @@ -1206,6 +1207,7 @@ SYM_CODE_START(asm_exc_nmi) * 1 - orig_ax */ lss (1+5+6)*4(%esp), %esp # back to espfix stack + CLEAR_CPU_BUFFERS jmp .Lirq_return #endif SYM_CODE_END(asm_exc_nmi) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index dd4682857c12..913fd3a7bac6 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -215,7 +215,7 @@ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* Disable Speculative Store Bypass. */ #define X86_FEATURE_LS_CFG_SSBD ( 7*32+24) /* AMD SSBD implementation via LS_CFG MSR */ #define X86_FEATURE_IBRS ( 7*32+25) /* "ibrs" Indirect Branch Restricted Speculation */ -#define X86_FEATURE_IBPB ( 7*32+26) /* "ibpb" Indirect Branch Prediction Barrier */ +#define X86_FEATURE_IBPB ( 7*32+26) /* "ibpb" Indirect Branch Prediction Barrier without a guaranteed RSB flush */ #define X86_FEATURE_STIBP ( 7*32+27) /* "stibp" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_ZEN ( 7*32+28) /* Generic flag for all Zen and newer */ #define X86_FEATURE_L1TF_PTEINV ( 7*32+29) /* L1TF workaround PTE inversion */ @@ -348,6 +348,7 @@ #define X86_FEATURE_CPPC (13*32+27) /* "cppc" Collaborative Processor Performance Control */ #define X86_FEATURE_AMD_PSFD (13*32+28) /* Predictive Store Forwarding Disable */ #define X86_FEATURE_BTC_NO (13*32+29) /* Not vulnerable to Branch Type Confusion */ +#define X86_FEATURE_AMD_IBPB_RET (13*32+30) /* IBPB clears return address predictor */ #define X86_FEATURE_BRS (13*32+31) /* "brs" Branch Sampling available */ /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */ @@ -523,4 +524,5 @@ #define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* "div0" AMD DIV0 speculation bug */ #define X86_BUG_RFDS X86_BUG(1*32 + 2) /* "rfds" CPU is vulnerable to Register File Data Sampling */ #define X86_BUG_BHI X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */ +#define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index ff5f1ecc7d1e..96b410b1d4e8 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -323,7 +323,16 @@ * Note: Only the memory operand variant of VERW clears the CPU buffers. */ .macro CLEAR_CPU_BUFFERS - ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF +#ifdef CONFIG_X86_64 + ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF +#else + /* + * In 32bit mode, the memory operand must be a %cs reference. The data + * segments may not be usable (vm86 mode), and the stack segment may not + * be flat (ESPFIX32). + */ + ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF +#endif .endm #ifdef CONFIG_X86_64 diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c index dc5d3216af24..9fe9972d2071 100644 --- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -44,6 +44,7 @@ #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4 #define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc #define PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4 0x12c4 +#define PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4 0x16fc #define PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4 0x124c #define PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4 0x12bc #define PCI_DEVICE_ID_AMD_MI200_DF_F4 0x14d4 @@ -127,6 +128,7 @@ static const struct pci_device_id amd_nb_link_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F4) }, diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 6513c53c9459..c5fb28e6451a 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -440,7 +440,19 @@ static int lapic_timer_shutdown(struct clock_event_device *evt) v = apic_read(APIC_LVTT); v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR); apic_write(APIC_LVTT, v); - apic_write(APIC_TMICT, 0); + + /* + * Setting APIC_LVT_MASKED (above) should be enough to tell + * the hardware that this timer will never fire. But AMD + * erratum 411 and some Intel CPU behavior circa 2024 say + * otherwise. Time for belt and suspenders programming: mask + * the timer _and_ zero the counter registers: + */ + if (v & APIC_LVT_TIMER_TSCDEADLINE) + wrmsrl(MSR_IA32_TSC_DEADLINE, 0); + else + apic_write(APIC_TMICT, 0); + return 0; } diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 015971adadfc..fab5caec0b72 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1202,5 +1202,6 @@ void amd_check_microcode(void) if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) return; - on_each_cpu(zenbleed_check_cpu, NULL, 1); + if (cpu_feature_enabled(X86_FEATURE_ZEN2)) + on_each_cpu(zenbleed_check_cpu, NULL, 1); } diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index d1915427b4ff..47a01d4028f6 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1115,8 +1115,25 @@ do_cmd_auto: case RETBLEED_MITIGATION_IBPB: setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB); + + /* + * IBPB on entry already obviates the need for + * software-based untraining so clear those in case some + * other mitigation like SRSO has selected them. + */ + setup_clear_cpu_cap(X86_FEATURE_UNRET); + setup_clear_cpu_cap(X86_FEATURE_RETHUNK); + setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT); mitigate_smt = true; + + /* + * There is no need for RSB filling: entry_ibpb() ensures + * all predictions, including the RSB, are invalidated, + * regardless of IBPB implementation. + */ + setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT); + break; case RETBLEED_MITIGATION_STUFF: @@ -2627,6 +2644,14 @@ static void __init srso_select_mitigation(void) if (has_microcode) { setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB); srso_mitigation = SRSO_MITIGATION_IBPB; + + /* + * IBPB on entry already obviates the need for + * software-based untraining so clear those in case some + * other mitigation like Retbleed has selected them. + */ + setup_clear_cpu_cap(X86_FEATURE_UNRET); + setup_clear_cpu_cap(X86_FEATURE_RETHUNK); } } else { pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n"); @@ -2638,6 +2663,13 @@ static void __init srso_select_mitigation(void) if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) { setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT); srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT; + + /* + * There is no need for RSB filling: entry_ibpb() ensures + * all predictions, including the RSB, are invalidated, + * regardless of IBPB implementation. + */ + setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT); } } else { pr_err("WARNING: kernel not compiled with MITIGATION_SRSO.\n"); diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 07a34d723505..f1040cb64841 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1443,6 +1443,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) boot_cpu_has(X86_FEATURE_HYPERVISOR))) setup_force_cpu_bug(X86_BUG_BHI); + if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) + setup_force_cpu_bug(X86_BUG_IBPB_NO_RET); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return; diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 8591d53c144b..b681c2e07dbf 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -207,7 +207,7 @@ static inline bool rdt_get_mb_table(struct rdt_resource *r) return false; } -static bool __get_mem_config_intel(struct rdt_resource *r) +static __init bool __get_mem_config_intel(struct rdt_resource *r) { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); union cpuid_0x10_3_eax eax; @@ -241,7 +241,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r) return true; } -static bool __rdt_get_mem_config_amd(struct rdt_resource *r) +static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r) { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); u32 eax, ebx, ecx, edx, subleaf; diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index 50fa1fe9a073..200d89a64027 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -29,10 +29,10 @@ * hardware. The allocated bandwidth percentage is rounded to the next * control step available on the hardware. */ -static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r) +static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r) { - unsigned long bw; int ret; + u32 bw; /* * Only linear delay values is supported for current Intel SKUs. @@ -42,16 +42,21 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r) return false; } - ret = kstrtoul(buf, 10, &bw); + ret = kstrtou32(buf, 10, &bw); if (ret) { - rdt_last_cmd_printf("Non-decimal digit in MB value %s\n", buf); + rdt_last_cmd_printf("Invalid MB value %s\n", buf); return false; } - if ((bw < r->membw.min_bw || bw > r->default_ctrl) && - !is_mba_sc(r)) { - rdt_last_cmd_printf("MB value %ld out of range [%d,%d]\n", bw, - r->membw.min_bw, r->default_ctrl); + /* Nothing else to do if software controller is enabled. */ + if (is_mba_sc(r)) { + *data = bw; + return true; + } + + if (bw < r->membw.min_bw || bw > r->default_ctrl) { + rdt_last_cmd_printf("MB value %u out of range [%d,%d]\n", + bw, r->membw.min_bw, r->default_ctrl); return false; } @@ -65,7 +70,7 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s, struct resctrl_staged_config *cfg; u32 closid = data->rdtgrp->closid; struct rdt_resource *r = s->res; - unsigned long bw_val; + u32 bw_val; cfg = &d->staged_config[s->conf_type]; if (cfg->have_new_ctrl) { diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 263f8aed4e2c..21e9e4845354 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -37,6 +37,7 @@ #include <asm/apic.h> #include <asm/apicdef.h> #include <asm/hypervisor.h> +#include <asm/mtrr.h> #include <asm/tlb.h> #include <asm/cpuidle_haltpoll.h> #include <asm/ptrace.h> @@ -980,6 +981,9 @@ static void __init kvm_init_platform(void) } kvmclock_init(); x86_platform.apic_post_init = kvm_apic_init; + + /* Set WB as the default cache mode for SEV-SNP and TDX */ + mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK); } #if defined(CONFIG_AMD_MEM_ENCRYPT) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index a9a23e058555..8e853a5fc867 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1556,6 +1556,17 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { bool flush = false; + /* + * To prevent races with vCPUs faulting in a gfn using stale data, + * zapping a gfn range must be protected by mmu_invalidate_in_progress + * (and mmu_invalidate_seq). The only exception is memslot deletion; + * in that case, SRCU synchronization ensures that SPTEs are zapped + * after all vCPUs have unlocked SRCU, guaranteeing that vCPUs see the + * invalid slot. + */ + lockdep_assert_once(kvm->mmu_invalidate_in_progress || + lockdep_is_held(&kvm->slots_lock)); + if (kvm_memslots_have_rmaps(kvm)) flush = __kvm_rmap_zap_gfn_range(kvm, range->slot, range->start, range->end, @@ -1884,14 +1895,10 @@ static bool sp_has_gptes(struct kvm_mmu_page *sp) if (is_obsolete_sp((_kvm), (_sp))) { \ } else -#define for_each_gfn_valid_sp(_kvm, _sp, _gfn) \ +#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn) \ for_each_valid_sp(_kvm, _sp, \ &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)]) \ - if ((_sp)->gfn != (_gfn)) {} else - -#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn) \ - for_each_gfn_valid_sp(_kvm, _sp, _gfn) \ - if (!sp_has_gptes(_sp)) {} else + if ((_sp)->gfn != (_gfn) || !sp_has_gptes(_sp)) {} else static bool kvm_sync_page_check(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp) { @@ -7063,15 +7070,15 @@ static void kvm_mmu_zap_memslot_pages_and_flush(struct kvm *kvm, /* * Since accounting information is stored in struct kvm_arch_memory_slot, - * shadow pages deletion (e.g. unaccount_shadowed()) requires that all - * gfns with a shadow page have a corresponding memslot. Do so before - * the memslot goes away. + * all MMU pages that are shadowing guest PTEs must be zapped before the + * memslot is deleted, as freeing such pages after the memslot is freed + * will result in use-after-free, e.g. in unaccount_shadowed(). */ for (i = 0; i < slot->npages; i++) { struct kvm_mmu_page *sp; gfn_t gfn = slot->base_gfn + i; - for_each_gfn_valid_sp(kvm, sp, gfn) + for_each_gfn_valid_sp_with_gptes(kvm, sp, gfn) kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) { diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index d5314cb7dff4..cf84103ce38b 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -63,8 +63,12 @@ static u64 nested_svm_get_tdp_pdptr(struct kvm_vcpu *vcpu, int index) u64 pdpte; int ret; + /* + * Note, nCR3 is "assumed" to be 32-byte aligned, i.e. the CPU ignores + * nCR3[4:0] when loading PDPTEs from memory. + */ ret = kvm_vcpu_read_guest_page(vcpu, gpa_to_gfn(cr3), &pdpte, - offset_in_page(cr3) + index * 8, 8); + (cr3 & GENMASK(11, 5)) + index * 8, 8); if (ret) return 0; return pdpte; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 1a4438358c5e..81ed596e4454 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -4888,9 +4888,6 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx->hv_deadline_tsc = -1; kvm_set_cr8(vcpu, 0); - vmx_segment_cache_clear(vmx); - kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS); - seg_setup(VCPU_SREG_CS); vmcs_write16(GUEST_CS_SELECTOR, 0xf000); vmcs_writel(GUEST_CS_BASE, 0xffff0000ul); @@ -4917,6 +4914,9 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmcs_writel(GUEST_IDTR_BASE, 0); vmcs_write32(GUEST_IDTR_LIMIT, 0xffff); + vmx_segment_cache_clear(vmx); + kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS); + vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE); vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0); vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, 0); diff --git a/block/blk-mq.c b/block/blk-mq.c index 4b2c8e940f59..cf626e061dd7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4310,6 +4310,12 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, /* mark the queue as mq asap */ q->mq_ops = set->ops; + /* + * ->tag_set has to be setup before initialize hctx, which cpuphp + * handler needs it for checking queue mapping + */ + q->tag_set = set; + if (blk_mq_alloc_ctxs(q)) goto err_exit; @@ -4328,8 +4334,6 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, INIT_WORK(&q->timeout_work, blk_mq_timeout_work); blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ); - q->tag_set = set; - q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT; INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work); diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c index 2cfb297d9a62..058f92c4f9d5 100644 --- a/block/blk-rq-qos.c +++ b/block/blk-rq-qos.c @@ -219,8 +219,8 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr, data->got_token = true; smp_wmb(); - list_del_init(&curr->entry); wake_up_process(data->task); + list_del_init_careful(&curr->entry); return 1; } diff --git a/block/elevator.c b/block/elevator.c index 4122026b11f1..9430cde13d1a 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -106,8 +106,7 @@ static struct elevator_type *__elevator_find(const char *name) return NULL; } -static struct elevator_type *elevator_find_get(struct request_queue *q, - const char *name) +static struct elevator_type *elevator_find_get(const char *name) { struct elevator_type *e; @@ -551,7 +550,7 @@ EXPORT_SYMBOL_GPL(elv_unregister); static inline bool elv_support_iosched(struct request_queue *q) { if (!queue_is_mq(q) || - (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))) + (q->tag_set->flags & BLK_MQ_F_NO_SCHED)) return false; return true; } @@ -562,14 +561,14 @@ static inline bool elv_support_iosched(struct request_queue *q) */ static struct elevator_type *elevator_get_default(struct request_queue *q) { - if (q->tag_set && q->tag_set->flags & BLK_MQ_F_NO_SCHED_BY_DEFAULT) + if (q->tag_set->flags & BLK_MQ_F_NO_SCHED_BY_DEFAULT) return NULL; if (q->nr_hw_queues != 1 && !blk_mq_is_shared_tags(q->tag_set->flags)) return NULL; - return elevator_find_get(q, "mq-deadline"); + return elevator_find_get("mq-deadline"); } /* @@ -697,7 +696,7 @@ static int elevator_change(struct request_queue *q, const char *elevator_name) if (q->elevator && elevator_match(q->elevator->type, elevator_name)) return 0; - e = elevator_find_get(q, elevator_name); + e = elevator_find_get(elevator_name); if (!e) return -EINVAL; ret = elevator_switch(q, e); @@ -709,13 +708,21 @@ int elv_iosched_load_module(struct gendisk *disk, const char *buf, size_t count) { char elevator_name[ELV_NAME_MAX]; + struct elevator_type *found; + const char *name; if (!elv_support_iosched(disk->queue)) return -EOPNOTSUPP; strscpy(elevator_name, buf, sizeof(elevator_name)); + name = strstrip(elevator_name); - request_module("%s-iosched", strstrip(elevator_name)); + spin_lock(&elv_list_lock); + found = __elevator_find(name); + spin_unlock(&elv_list_lock); + + if (!found) + request_module("%s-iosched", name); return 0; } diff --git a/drivers/accel/qaic/qaic_control.c b/drivers/accel/qaic/qaic_control.c index 9e8a8cbadf6b..d8bdab69f800 100644 --- a/drivers/accel/qaic/qaic_control.c +++ b/drivers/accel/qaic/qaic_control.c @@ -496,7 +496,7 @@ static int encode_addr_size_pairs(struct dma_xfer *xfer, struct wrapper_list *wr nents = sgt->nents; nents_dma = nents; *size = QAIC_MANAGE_EXT_MSG_LENGTH - msg_hdr_len - sizeof(**out_trans); - for_each_sgtable_sg(sgt, sg, i) { + for_each_sgtable_dma_sg(sgt, sg, i) { *size -= sizeof(*asp); /* Save 1K for possible follow-up transactions. */ if (*size < SZ_1K) { diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c index e86e71c1cdd8..c20eb63750f5 100644 --- a/drivers/accel/qaic/qaic_data.c +++ b/drivers/accel/qaic/qaic_data.c @@ -184,7 +184,7 @@ static int clone_range_of_sgt_for_slice(struct qaic_device *qdev, struct sg_tabl nents = 0; size = size ? size : PAGE_SIZE; - for (sg = sgt_in->sgl; sg; sg = sg_next(sg)) { + for_each_sgtable_dma_sg(sgt_in, sg, j) { len = sg_dma_len(sg); if (!len) @@ -221,7 +221,7 @@ static int clone_range_of_sgt_for_slice(struct qaic_device *qdev, struct sg_tabl /* copy relevant sg node and fix page and length */ sgn = sgf; - for_each_sgtable_sg(sgt, sg, j) { + for_each_sgtable_dma_sg(sgt, sg, j) { memcpy(sg, sgn, sizeof(*sg)); if (sgn == sgf) { sg_dma_address(sg) += offf; @@ -301,7 +301,7 @@ static int encode_reqs(struct qaic_device *qdev, struct bo_slice *slice, * fence. */ dev_addr = req->dev_addr; - for_each_sgtable_sg(slice->sgt, sg, i) { + for_each_sgtable_dma_sg(slice->sgt, sg, i) { slice->reqs[i].cmd = cmd; slice->reqs[i].src_addr = cpu_to_le64(slice->dir == DMA_TO_DEVICE ? sg_dma_address(sg) : dev_addr); diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index 2a05d955e30b..e21492981f7d 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -1364,7 +1364,6 @@ extern struct bio_set drbd_io_bio_set; extern struct mutex resources_mutex; -extern int conn_lowest_minor(struct drbd_connection *connection); extern enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsigned int minor); extern void drbd_destroy_device(struct kref *kref); extern void drbd_delete_device(struct drbd_device *device); diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 0d74d75260ef..5bbd312c3e14 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -471,20 +471,6 @@ void _drbd_thread_stop(struct drbd_thread *thi, int restart, int wait) wait_for_completion(&thi->stop); } -int conn_lowest_minor(struct drbd_connection *connection) -{ - struct drbd_peer_device *peer_device; - int vnr = 0, minor = -1; - - rcu_read_lock(); - peer_device = idr_get_next(&connection->peer_devices, &vnr); - if (peer_device) - minor = device_to_minor(peer_device->device); - rcu_read_unlock(); - - return minor; -} - #ifdef CONFIG_SMP /* * drbd_calc_cpu_mask() - Generate CPU masks, spread over all CPUs diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index a6c8e5cc6051..6ba2c1dd1d87 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -2380,10 +2380,19 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd) * TODO: provide forward progress for RECOVERY handler, so that * unprivileged device can benefit from it */ - if (info.flags & UBLK_F_UNPRIVILEGED_DEV) + if (info.flags & UBLK_F_UNPRIVILEGED_DEV) { info.flags &= ~(UBLK_F_USER_RECOVERY_REISSUE | UBLK_F_USER_RECOVERY); + /* + * For USER_COPY, we depends on userspace to fill request + * buffer by pwrite() to ublk char device, which can't be + * used for unprivileged device + */ + if (info.flags & UBLK_F_USER_COPY) + return -EINVAL; + } + /* the created device is always owned by current user */ ublk_store_owner_uid_gid(&info.owner_uid, &info.owner_gid); diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index a3e45b3060d1..e9534fbc92e3 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -1345,10 +1345,15 @@ static int btusb_submit_intr_urb(struct hci_dev *hdev, gfp_t mem_flags) if (!urb) return -ENOMEM; - /* Use maximum HCI Event size so the USB stack handles - * ZPL/short-transfer automatically. - */ - size = HCI_MAX_EVENT_SIZE; + if (le16_to_cpu(data->udev->descriptor.idVendor) == 0x0a12 && + le16_to_cpu(data->udev->descriptor.idProduct) == 0x0001) + /* Fake CSR devices don't seem to support sort-transter */ + size = le16_to_cpu(data->intr_ep->wMaxPacketSize); + else + /* Use maximum HCI Event size so the USB stack handles + * ZPL/short-transfer automatically. + */ + size = HCI_MAX_EVENT_SIZE; buf = kmalloc(size, mem_flags); if (!buf) { @@ -4038,7 +4043,6 @@ static void btusb_disconnect(struct usb_interface *intf) static int btusb_suspend(struct usb_interface *intf, pm_message_t message) { struct btusb_data *data = usb_get_intfdata(intf); - int err; BT_DBG("intf %p", intf); @@ -4051,16 +4055,6 @@ static int btusb_suspend(struct usb_interface *intf, pm_message_t message) if (data->suspend_count++) return 0; - /* Notify Host stack to suspend; this has to be done before stopping - * the traffic since the hci_suspend_dev itself may generate some - * traffic. - */ - err = hci_suspend_dev(data->hdev); - if (err) { - data->suspend_count--; - return err; - } - spin_lock_irq(&data->txlock); if (!(PMSG_IS_AUTO(message) && data->tx_in_flight)) { set_bit(BTUSB_SUSPENDING, &data->flags); @@ -4068,7 +4062,6 @@ static int btusb_suspend(struct usb_interface *intf, pm_message_t message) } else { spin_unlock_irq(&data->txlock); data->suspend_count--; - hci_resume_dev(data->hdev); return -EBUSY; } @@ -4189,8 +4182,6 @@ static int btusb_resume(struct usb_interface *intf) spin_unlock_irq(&data->txlock); schedule_work(&data->work); - hci_resume_dev(data->hdev); - return 0; failed: diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c index 9b0f37d4b9d4..6a99a459b80b 100644 --- a/drivers/cdrom/cdrom.c +++ b/drivers/cdrom/cdrom.c @@ -2313,7 +2313,7 @@ static int cdrom_ioctl_media_changed(struct cdrom_device_info *cdi, return -EINVAL; /* Prevent arg from speculatively bypassing the length check */ - barrier_nospec(); + arg = array_index_nospec(arg, cdi->capacity); info = kmalloc(sizeof(*info), GFP_KERNEL); if (!info) diff --git a/drivers/clk/clk_test.c b/drivers/clk/clk_test.c index 41fc8eba3418..aa3ddcfc00eb 100644 --- a/drivers/clk/clk_test.c +++ b/drivers/clk/clk_test.c @@ -473,7 +473,7 @@ clk_multiple_parents_mux_test_init(struct kunit *test) &clk_dummy_rate_ops, 0); ctx->parents_ctx[0].rate = DUMMY_CLOCK_RATE_1; - ret = clk_hw_register(NULL, &ctx->parents_ctx[0].hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->parents_ctx[0].hw); if (ret) return ret; @@ -481,7 +481,7 @@ clk_multiple_parents_mux_test_init(struct kunit *test) &clk_dummy_rate_ops, 0); ctx->parents_ctx[1].rate = DUMMY_CLOCK_RATE_2; - ret = clk_hw_register(NULL, &ctx->parents_ctx[1].hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->parents_ctx[1].hw); if (ret) return ret; @@ -489,23 +489,13 @@ clk_multiple_parents_mux_test_init(struct kunit *test) ctx->hw.init = CLK_HW_INIT_PARENTS("test-mux", parents, &clk_multiple_parents_mux_ops, CLK_SET_RATE_PARENT); - ret = clk_hw_register(NULL, &ctx->hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->hw); if (ret) return ret; return 0; } -static void -clk_multiple_parents_mux_test_exit(struct kunit *test) -{ - struct clk_multiple_parent_ctx *ctx = test->priv; - - clk_hw_unregister(&ctx->hw); - clk_hw_unregister(&ctx->parents_ctx[0].hw); - clk_hw_unregister(&ctx->parents_ctx[1].hw); -} - /* * Test that for a clock with multiple parents, clk_get_parent() * actually returns the current one. @@ -561,18 +551,18 @@ clk_test_multiple_parents_mux_set_range_set_parent_get_rate(struct kunit *test) { struct clk_multiple_parent_ctx *ctx = test->priv; struct clk_hw *hw = &ctx->hw; - struct clk *clk = clk_hw_get_clk(hw, NULL); + struct clk *clk = clk_hw_get_clk_kunit(test, hw, NULL); struct clk *parent1, *parent2; unsigned long rate; int ret; kunit_skip(test, "This needs to be fixed in the core."); - parent1 = clk_hw_get_clk(&ctx->parents_ctx[0].hw, NULL); + parent1 = clk_hw_get_clk_kunit(test, &ctx->parents_ctx[0].hw, NULL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, parent1); KUNIT_ASSERT_TRUE(test, clk_is_match(clk_get_parent(clk), parent1)); - parent2 = clk_hw_get_clk(&ctx->parents_ctx[1].hw, NULL); + parent2 = clk_hw_get_clk_kunit(test, &ctx->parents_ctx[1].hw, NULL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, parent2); ret = clk_set_rate(parent1, DUMMY_CLOCK_RATE_1); @@ -593,10 +583,6 @@ clk_test_multiple_parents_mux_set_range_set_parent_get_rate(struct kunit *test) KUNIT_ASSERT_GT(test, rate, 0); KUNIT_EXPECT_GE(test, rate, DUMMY_CLOCK_RATE_1 - 1000); KUNIT_EXPECT_LE(test, rate, DUMMY_CLOCK_RATE_1 + 1000); - - clk_put(parent2); - clk_put(parent1); - clk_put(clk); } static struct kunit_case clk_multiple_parents_mux_test_cases[] = { @@ -617,7 +603,6 @@ static struct kunit_suite clk_multiple_parents_mux_test_suite = { .name = "clk-multiple-parents-mux-test", .init = clk_multiple_parents_mux_test_init, - .exit = clk_multiple_parents_mux_test_exit, .test_cases = clk_multiple_parents_mux_test_cases, }; @@ -637,29 +622,20 @@ clk_orphan_transparent_multiple_parent_mux_test_init(struct kunit *test) &clk_dummy_rate_ops, 0); ctx->parents_ctx[1].rate = DUMMY_CLOCK_INIT_RATE; - ret = clk_hw_register(NULL, &ctx->parents_ctx[1].hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->parents_ctx[1].hw); if (ret) return ret; ctx->hw.init = CLK_HW_INIT_PARENTS("test-orphan-mux", parents, &clk_multiple_parents_mux_ops, CLK_SET_RATE_PARENT); - ret = clk_hw_register(NULL, &ctx->hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->hw); if (ret) return ret; return 0; } -static void -clk_orphan_transparent_multiple_parent_mux_test_exit(struct kunit *test) -{ - struct clk_multiple_parent_ctx *ctx = test->priv; - - clk_hw_unregister(&ctx->hw); - clk_hw_unregister(&ctx->parents_ctx[1].hw); -} - /* * Test that, for a mux whose current parent hasn't been registered yet and is * thus orphan, clk_get_parent() will return NULL. @@ -912,7 +888,7 @@ clk_test_orphan_transparent_multiple_parent_mux_set_range_set_parent_get_rate(st { struct clk_multiple_parent_ctx *ctx = test->priv; struct clk_hw *hw = &ctx->hw; - struct clk *clk = clk_hw_get_clk(hw, NULL); + struct clk *clk = clk_hw_get_clk_kunit(test, hw, NULL); struct clk *parent; unsigned long rate; int ret; @@ -921,7 +897,7 @@ clk_test_orphan_transparent_multiple_parent_mux_set_range_set_parent_get_rate(st clk_hw_set_rate_range(hw, DUMMY_CLOCK_RATE_1, DUMMY_CLOCK_RATE_2); - parent = clk_hw_get_clk(&ctx->parents_ctx[1].hw, NULL); + parent = clk_hw_get_clk_kunit(test, &ctx->parents_ctx[1].hw, NULL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, parent); ret = clk_set_parent(clk, parent); @@ -931,9 +907,6 @@ clk_test_orphan_transparent_multiple_parent_mux_set_range_set_parent_get_rate(st KUNIT_ASSERT_GT(test, rate, 0); KUNIT_EXPECT_GE(test, rate, DUMMY_CLOCK_RATE_1); KUNIT_EXPECT_LE(test, rate, DUMMY_CLOCK_RATE_2); - - clk_put(parent); - clk_put(clk); } static struct kunit_case clk_orphan_transparent_multiple_parent_mux_test_cases[] = { @@ -961,7 +934,6 @@ static struct kunit_case clk_orphan_transparent_multiple_parent_mux_test_cases[] static struct kunit_suite clk_orphan_transparent_multiple_parent_mux_test_suite = { .name = "clk-orphan-transparent-multiple-parent-mux-test", .init = clk_orphan_transparent_multiple_parent_mux_test_init, - .exit = clk_orphan_transparent_multiple_parent_mux_test_exit, .test_cases = clk_orphan_transparent_multiple_parent_mux_test_cases, }; @@ -986,7 +958,7 @@ static int clk_single_parent_mux_test_init(struct kunit *test) &clk_dummy_rate_ops, 0); - ret = clk_hw_register(NULL, &ctx->parent_ctx.hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->parent_ctx.hw); if (ret) return ret; @@ -994,7 +966,7 @@ static int clk_single_parent_mux_test_init(struct kunit *test) &clk_dummy_single_parent_ops, CLK_SET_RATE_PARENT); - ret = clk_hw_register(NULL, &ctx->hw); + ret = clk_hw_register_kunit(test, NULL, &ctx->hw); if (ret) return ret; @@ -1060,7 +1032,7 @@ clk_test_single_parent_mux_set_range_disjoint_child_last(struct kunit *test) { struct clk_single_parent_ctx *ctx = test->priv; struct clk_hw *hw = &ctx->hw; - struct clk *clk = clk_hw_get_clk(hw, NULL); + struct clk *clk = clk_hw_get_clk_kunit(test, hw, NULL); struct clk *parent; int ret; @@ -1074,8 +1046,6 @@ clk_test_single_parent_mux_set_range_disjoint_child_last(struct kunit *test) ret = clk_set_rate_range(clk, 3000, 4000); KUNIT_EXPECT_LT(test, ret, 0); - - clk_put(clk); } /* @@ -1092,7 +1062,7 @@ clk_test_single_parent_mux_set_range_disjoint_parent_last(struct kunit *test) { struct clk_single_parent_ctx *ctx = test->priv; struct clk_hw *hw = &ctx->hw; - struct clk *clk = clk_hw_get_clk(hw, NULL); + struct clk *clk = clk_hw_get_clk_kunit(test, hw, NULL); struct clk *parent; int ret; @@ -1106,8 +1076,6 @@ clk_test_single_parent_mux_set_range_disjoint_parent_last(struct kunit *test) ret = clk_set_rate_range(parent, 3000, 4000); KUNIT_EXPECT_LT(test, ret, 0); - - clk_put(clk); } /* @@ -1238,7 +1206,6 @@ static struct kunit_suite clk_single_parent_mux_test_suite = { .name = "clk-single-parent-mux-test", .init = clk_single_parent_mux_test_init, - .exit = clk_single_parent_mux_test_exit, .test_cases = clk_single_parent_mux_test_cases, }; diff --git a/drivers/clk/rockchip/clk.c b/drivers/clk/rockchip/clk.c index 2fa7253c73b2..88629a9abc9c 100644 --- a/drivers/clk/rockchip/clk.c +++ b/drivers/clk/rockchip/clk.c @@ -439,7 +439,7 @@ unsigned long rockchip_clk_find_max_clk_id(struct rockchip_clk_branch *list, if (list->id > max) max = list->id; if (list->child && list->child->id > max) - max = list->id; + max = list->child->id; } return max; diff --git a/drivers/clk/samsung/clk-exynosautov920.c b/drivers/clk/samsung/clk-exynosautov920.c index 7ba9748c0526..f60f0a0c598d 100644 --- a/drivers/clk/samsung/clk-exynosautov920.c +++ b/drivers/clk/samsung/clk-exynosautov920.c @@ -1155,6 +1155,7 @@ static const struct of_device_id exynosautov920_cmu_of_match[] = { .compatible = "samsung,exynosautov920-cmu-peric0", .data = &peric0_cmu_info, }, + { } }; static struct platform_driver exynosautov920_cmu_driver __refdata = { diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index 15e201d5e911..b63863f77c67 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -536,11 +536,16 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy) static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy) { - u32 max_limit_perf, min_limit_perf, lowest_perf; + u32 max_limit_perf, min_limit_perf, lowest_perf, max_perf; struct amd_cpudata *cpudata = policy->driver_data; - max_limit_perf = div_u64(policy->max * cpudata->highest_perf, cpudata->max_freq); - min_limit_perf = div_u64(policy->min * cpudata->highest_perf, cpudata->max_freq); + if (cpudata->boost_supported && !policy->boost_enabled) + max_perf = READ_ONCE(cpudata->nominal_perf); + else + max_perf = READ_ONCE(cpudata->highest_perf); + + max_limit_perf = div_u64(policy->max * max_perf, policy->cpuinfo.max_freq); + min_limit_perf = div_u64(policy->min * max_perf, policy->cpuinfo.max_freq); lowest_perf = READ_ONCE(cpudata->lowest_perf); if (min_limit_perf < lowest_perf) @@ -1201,11 +1206,21 @@ static int amd_pstate_register_driver(int mode) return -EINVAL; cppc_state = mode; + + ret = amd_pstate_enable(true); + if (ret) { + pr_err("failed to enable cppc during amd-pstate driver registration, return %d\n", + ret); + amd_pstate_driver_cleanup(); + return ret; + } + ret = cpufreq_register_driver(current_pstate_driver); if (ret) { amd_pstate_driver_cleanup(); return ret; } + return 0; } @@ -1496,10 +1511,13 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy) u64 value; s16 epp; - max_perf = READ_ONCE(cpudata->highest_perf); + if (cpudata->boost_supported && !policy->boost_enabled) + max_perf = READ_ONCE(cpudata->nominal_perf); + else + max_perf = READ_ONCE(cpudata->highest_perf); min_perf = READ_ONCE(cpudata->lowest_perf); - max_limit_perf = div_u64(policy->max * cpudata->highest_perf, cpudata->max_freq); - min_limit_perf = div_u64(policy->min * cpudata->highest_perf, cpudata->max_freq); + max_limit_perf = div_u64(policy->max * max_perf, policy->cpuinfo.max_freq); + min_limit_perf = div_u64(policy->min * max_perf, policy->cpuinfo.max_freq); if (min_limit_perf < min_perf) min_limit_perf = min_perf; diff --git a/drivers/dma/ep93xx_dma.c b/drivers/dma/ep93xx_dma.c index 995427afe077..6b98a23e3332 100644 --- a/drivers/dma/ep93xx_dma.c +++ b/drivers/dma/ep93xx_dma.c @@ -1391,11 +1391,12 @@ static struct ep93xx_dma_engine *ep93xx_dma_of_probe(struct platform_device *pde INIT_LIST_HEAD(&dma_dev->channels); for (i = 0; i < edma->num_channels; i++) { struct ep93xx_dma_chan *edmac = &edma->channels[i]; + int len; edmac->chan.device = dma_dev; edmac->regs = devm_platform_ioremap_resource(pdev, i); if (IS_ERR(edmac->regs)) - return edmac->regs; + return ERR_CAST(edmac->regs); edmac->irq = fwnode_irq_get(dev_fwnode(dev), i); if (edmac->irq < 0) @@ -1404,9 +1405,11 @@ static struct ep93xx_dma_engine *ep93xx_dma_of_probe(struct platform_device *pde edmac->edma = edma; if (edma->m2m) - snprintf(dma_clk_name, sizeof(dma_clk_name), "m2m%u", i); + len = snprintf(dma_clk_name, sizeof(dma_clk_name), "m2m%u", i); else - snprintf(dma_clk_name, sizeof(dma_clk_name), "m2p%u", i); + len = snprintf(dma_clk_name, sizeof(dma_clk_name), "m2p%u", i); + if (len >= sizeof(dma_clk_name)) + return ERR_PTR(-ENOBUFS); edmac->clk = devm_clk_get(dev, dma_clk_name); if (IS_ERR(edmac->clk)) { diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c index 4d231bc375e0..b14cbdae94e8 100644 --- a/drivers/firmware/arm_ffa/driver.c +++ b/drivers/firmware/arm_ffa/driver.c @@ -481,11 +481,16 @@ static int ffa_msg_send_direct_req2(u16 src_id, u16 dst_id, const uuid_t *uuid, struct ffa_send_direct_data2 *data) { u32 src_dst_ids = PACK_TARGET_INFO(src_id, dst_id); + union { + uuid_t uuid; + __le64 regs[2]; + } uuid_regs = { .uuid = *uuid }; ffa_value_t ret, args = { - .a0 = FFA_MSG_SEND_DIRECT_REQ2, .a1 = src_dst_ids, + .a0 = FFA_MSG_SEND_DIRECT_REQ2, + .a1 = src_dst_ids, + .a2 = le64_to_cpu(uuid_regs.regs[0]), + .a3 = le64_to_cpu(uuid_regs.regs[1]), }; - - export_uuid((u8 *)&args.a2, uuid); memcpy((void *)&args + offsetof(ffa_value_t, a4), data, sizeof(*data)); invoke_ffa_fn(args, &ret); @@ -496,7 +501,7 @@ static int ffa_msg_send_direct_req2(u16 src_id, u16 dst_id, const uuid_t *uuid, return ffa_to_linux_errno((int)ret.a2); if (ret.a0 == FFA_MSG_SEND_DIRECT_RESP2) { - memcpy(data, &ret.a4, sizeof(*data)); + memcpy(data, (void *)&ret + offsetof(ffa_value_t, a4), sizeof(*data)); return 0; } diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c index 88c5c4ff4bb6..a477b5ade38d 100644 --- a/drivers/firmware/arm_scmi/driver.c +++ b/drivers/firmware/arm_scmi/driver.c @@ -2976,10 +2976,8 @@ static struct scmi_debug_info *scmi_debugfs_common_setup(struct scmi_info *info) dbg->top_dentry = top_dentry; if (devm_add_action_or_reset(info->dev, - scmi_debugfs_common_cleanup, dbg)) { - scmi_debugfs_common_cleanup(dbg); + scmi_debugfs_common_cleanup, dbg)) return NULL; - } return dbg; } diff --git a/drivers/firmware/arm_scmi/transports/Makefile b/drivers/firmware/arm_scmi/transports/Makefile index 362a406f08e6..3ba3d3bee151 100644 --- a/drivers/firmware/arm_scmi/transports/Makefile +++ b/drivers/firmware/arm_scmi/transports/Makefile @@ -1,8 +1,10 @@ # SPDX-License-Identifier: GPL-2.0-only -scmi_transport_mailbox-objs := mailbox.o -obj-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += scmi_transport_mailbox.o +# Keep before scmi_transport_mailbox.o to allow precedence +# while matching the compatible. scmi_transport_smc-objs := smc.o obj-$(CONFIG_ARM_SCMI_TRANSPORT_SMC) += scmi_transport_smc.o +scmi_transport_mailbox-objs := mailbox.o +obj-$(CONFIG_ARM_SCMI_TRANSPORT_MAILBOX) += scmi_transport_mailbox.o scmi_transport_optee-objs := optee.o obj-$(CONFIG_ARM_SCMI_TRANSPORT_OPTEE) += scmi_transport_optee.o scmi_transport_virtio-objs := virtio.o diff --git a/drivers/firmware/arm_scmi/transports/mailbox.c b/drivers/firmware/arm_scmi/transports/mailbox.c index 1a754dee24f7..e3d5f7560990 100644 --- a/drivers/firmware/arm_scmi/transports/mailbox.c +++ b/drivers/firmware/arm_scmi/transports/mailbox.c @@ -25,6 +25,7 @@ * @chan_platform_receiver: Optional Platform Receiver mailbox unidirectional channel * @cinfo: SCMI channel info * @shmem: Transmit/Receive shared memory area + * @chan_lock: Lock that prevents multiple xfers from being queued */ struct scmi_mailbox { struct mbox_client cl; @@ -33,6 +34,7 @@ struct scmi_mailbox { struct mbox_chan *chan_platform_receiver; struct scmi_chan_info *cinfo; struct scmi_shared_mem __iomem *shmem; + struct mutex chan_lock; }; #define client_to_scmi_mailbox(c) container_of(c, struct scmi_mailbox, cl) @@ -238,6 +240,7 @@ static int mailbox_chan_setup(struct scmi_chan_info *cinfo, struct device *dev, cinfo->transport_info = smbox; smbox->cinfo = cinfo; + mutex_init(&smbox->chan_lock); return 0; } @@ -267,13 +270,23 @@ static int mailbox_send_message(struct scmi_chan_info *cinfo, struct scmi_mailbox *smbox = cinfo->transport_info; int ret; - ret = mbox_send_message(smbox->chan, xfer); + /* + * The mailbox layer has its own queue. However the mailbox queue + * confuses the per message SCMI timeouts since the clock starts when + * the message is submitted into the mailbox queue. So when multiple + * messages are queued up the clock starts on all messages instead of + * only the one inflight. + */ + mutex_lock(&smbox->chan_lock); - /* mbox_send_message returns non-negative value on success, so reset */ - if (ret > 0) - ret = 0; + ret = mbox_send_message(smbox->chan, xfer); + /* mbox_send_message returns non-negative value on success */ + if (ret < 0) { + mutex_unlock(&smbox->chan_lock); + return ret; + } - return ret; + return 0; } static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret, @@ -281,13 +294,10 @@ static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret, { struct scmi_mailbox *smbox = cinfo->transport_info; - /* - * NOTE: we might prefer not to need the mailbox ticker to manage the - * transfer queueing since the protocol layer queues things by itself. - * Unfortunately, we have to kick the mailbox framework after we have - * received our message. - */ mbox_client_txdone(smbox->chan, ret); + + /* Release channel */ + mutex_unlock(&smbox->chan_lock); } static void mailbox_fetch_response(struct scmi_chan_info *cinfo, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 1e475eb01417..d891ab779ca7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -265,7 +265,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p, /* Only a single BO list is allowed to simplify handling. */ if (p->bo_list) - ret = -EINVAL; + goto free_partial_kdata; ret = amdgpu_cs_p1_bo_handles(p, p->chunks[i].kdata); if (ret) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 83e54697f0ee..f1ffab5a1eae 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -1635,11 +1635,9 @@ int amdgpu_gfx_sysfs_isolation_shader_init(struct amdgpu_device *adev) { int r; - if (!amdgpu_sriov_vf(adev)) { - r = device_create_file(adev->dev, &dev_attr_enforce_isolation); - if (r) - return r; - } + r = device_create_file(adev->dev, &dev_attr_enforce_isolation); + if (r) + return r; r = device_create_file(adev->dev, &dev_attr_run_cleaner_shader); if (r) @@ -1650,8 +1648,7 @@ int amdgpu_gfx_sysfs_isolation_shader_init(struct amdgpu_device *adev) void amdgpu_gfx_sysfs_isolation_shader_fini(struct amdgpu_device *adev) { - if (!amdgpu_sriov_vf(adev)) - device_remove_file(adev->dev, &dev_attr_enforce_isolation); + device_remove_file(adev->dev, &dev_attr_enforce_isolation); device_remove_file(adev->dev, &dev_attr_run_cleaner_shader); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index 10b61ff63802..7d4b540340e0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -1203,8 +1203,10 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id, r = amdgpu_ring_init(adev, ring, 1024, NULL, 0, AMDGPU_RING_PRIO_DEFAULT, NULL); - if (r) + if (r) { + amdgpu_mes_unlock(&adev->mes); goto clean_up_memory; + } amdgpu_mes_ring_to_queue_props(adev, ring, &qprops); @@ -1237,7 +1239,6 @@ clean_up_ring: amdgpu_ring_fini(ring); clean_up_memory: kfree(ring); - amdgpu_mes_unlock(&adev->mes); return r; } diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c index 8d27421689c9..a37a6801c9ea 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c @@ -621,7 +621,7 @@ static int mes_v12_0_set_hw_resources(struct amdgpu_mes *mes, int pipe) if (amdgpu_mes_log_enable) { mes_set_hw_res_pkt.enable_mes_event_int_logging = 1; - mes_set_hw_res_pkt.event_intr_history_gpu_mc_ptr = mes->event_log_gpu_addr; + mes_set_hw_res_pkt.event_intr_history_gpu_mc_ptr = mes->event_log_gpu_addr + pipe * AMDGPU_MES_LOG_BUFFER_SIZE; } return mes_v12_0_submit_pkt_and_poll_completion(mes, pipe, @@ -1336,7 +1336,7 @@ static int mes_v12_0_sw_init(void *handle) adev->mes.kiq_hw_fini = &mes_v12_0_kiq_hw_fini; adev->mes.enable_legacy_queue_map = true; - adev->mes.event_log_size = AMDGPU_MES_LOG_BUFFER_SIZE; + adev->mes.event_log_size = adev->enable_uni_mes ? (AMDGPU_MAX_MES_PIPES * AMDGPU_MES_LOG_BUFFER_SIZE) : AMDGPU_MES_LOG_BUFFER_SIZE; r = amdgpu_mes_init(adev); if (r) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 9044bdb38cf4..3e6b4736a7fe 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1148,7 +1148,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, if (flags & KFD_IOC_ALLOC_MEM_FLAGS_AQL_QUEUE_MEM) size >>= 1; - WRITE_ONCE(pdd->vram_usage, pdd->vram_usage + PAGE_ALIGN(size)); + atomic64_add(PAGE_ALIGN(size), &pdd->vram_usage); } mutex_unlock(&p->mutex); @@ -1219,7 +1219,7 @@ static int kfd_ioctl_free_memory_of_gpu(struct file *filep, kfd_process_device_remove_obj_handle( pdd, GET_IDR_HANDLE(args->handle)); - WRITE_ONCE(pdd->vram_usage, pdd->vram_usage - size); + atomic64_sub(size, &pdd->vram_usage); err_unlock: err_pdd: @@ -2347,7 +2347,7 @@ static int criu_restore_memory_of_gpu(struct kfd_process_device *pdd, } else if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) { bo_bucket->restored_offset = offset; /* Update the VRAM usage count */ - WRITE_ONCE(pdd->vram_usage, pdd->vram_usage + bo_bucket->size); + atomic64_add(bo_bucket->size, &pdd->vram_usage); } return 0; } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index d6530febabad..26e48fdc8728 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -775,7 +775,7 @@ struct kfd_process_device { enum kfd_pdd_bound bound; /* VRAM usage */ - uint64_t vram_usage; + atomic64_t vram_usage; struct attribute attr_vram; char vram_filename[MAX_SYSFS_FILENAME_LEN]; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index d665ecdcd12f..d4aa843aacfd 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -332,7 +332,7 @@ static ssize_t kfd_procfs_show(struct kobject *kobj, struct attribute *attr, } else if (strncmp(attr->name, "vram_", 5) == 0) { struct kfd_process_device *pdd = container_of(attr, struct kfd_process_device, attr_vram); - return snprintf(buffer, PAGE_SIZE, "%llu\n", READ_ONCE(pdd->vram_usage)); + return snprintf(buffer, PAGE_SIZE, "%llu\n", atomic64_read(&pdd->vram_usage)); } else if (strncmp(attr->name, "sdma_", 5) == 0) { struct kfd_process_device *pdd = container_of(attr, struct kfd_process_device, attr_sdma); @@ -1625,7 +1625,7 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_node *dev, pdd->bound = PDD_UNBOUND; pdd->already_dequeued = false; pdd->runtime_inuse = false; - pdd->vram_usage = 0; + atomic64_set(&pdd->vram_usage, 0); pdd->sdma_past_activity_counter = 0; pdd->user_gpu_id = dev->id; atomic64_set(&pdd->evict_duration_counter, 0); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 04e746923697..1893c27746a5 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -405,6 +405,27 @@ static void svm_range_bo_release(struct kref *kref) spin_lock(&svm_bo->list_lock); } spin_unlock(&svm_bo->list_lock); + + if (mmget_not_zero(svm_bo->eviction_fence->mm)) { + struct kfd_process_device *pdd; + struct kfd_process *p; + struct mm_struct *mm; + + mm = svm_bo->eviction_fence->mm; + /* + * The forked child process takes svm_bo device pages ref, svm_bo could be + * released after parent process is gone. + */ + p = kfd_lookup_process_by_mm(mm); + if (p) { + pdd = kfd_get_process_device_data(svm_bo->node, p); + if (pdd) + atomic64_sub(amdgpu_bo_size(svm_bo->bo), &pdd->vram_usage); + kfd_unref_process(p); + } + mmput(mm); + } + if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) /* We're not in the eviction worker. Signal the fence. */ dma_fence_signal(&svm_bo->eviction_fence->base); @@ -532,6 +553,7 @@ int svm_range_vram_node_new(struct kfd_node *node, struct svm_range *prange, bool clear) { + struct kfd_process_device *pdd; struct amdgpu_bo_param bp; struct svm_range_bo *svm_bo; struct amdgpu_bo_user *ubo; @@ -623,6 +645,10 @@ svm_range_vram_node_new(struct kfd_node *node, struct svm_range *prange, list_add(&prange->svm_bo_list, &svm_bo->range_list); spin_unlock(&svm_bo->list_lock); + pdd = svm_range_get_pdd_by_node(prange, node); + if (pdd) + atomic64_add(amdgpu_bo_size(bo), &pdd->vram_usage); + return 0; reserve_bo_failed: diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index bb3bc68dfc39..9ad9cf7a9c98 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1264,7 +1264,11 @@ static int smu_sw_init(void *handle) smu->workload_prority[PP_SMC_POWER_PROFILE_VR] = 4; smu->workload_prority[PP_SMC_POWER_PROFILE_COMPUTE] = 5; smu->workload_prority[PP_SMC_POWER_PROFILE_CUSTOM] = 6; - smu->workload_mask = 1 << smu->workload_prority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT]; + + if (smu->is_apu) + smu->workload_mask = 1 << smu->workload_prority[PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT]; + else + smu->workload_mask = 1 << smu->workload_prority[PP_SMC_POWER_PROFILE_FULLSCREEN3D]; smu->workload_setting[0] = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; smu->workload_setting[1] = PP_SMC_POWER_PROFILE_FULLSCREEN3D; @@ -2226,7 +2230,7 @@ static int smu_bump_power_profile_mode(struct smu_context *smu, static int smu_adjust_power_state_dynamic(struct smu_context *smu, enum amd_dpm_forced_level level, bool skip_display_settings, - bool force_update) + bool init) { int ret = 0; int index = 0; @@ -2255,7 +2259,7 @@ static int smu_adjust_power_state_dynamic(struct smu_context *smu, } } - if (force_update || smu_dpm_ctx->dpm_level != level) { + if (smu_dpm_ctx->dpm_level != level) { ret = smu_asic_set_performance_level(smu, level); if (ret) { dev_err(smu->adev->dev, "Failed to set performance level!"); @@ -2272,7 +2276,7 @@ static int smu_adjust_power_state_dynamic(struct smu_context *smu, index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; workload[0] = smu->workload_setting[index]; - if (force_update || smu->power_profile_mode != workload[0]) + if (init || smu->power_profile_mode != workload[0]) smu_bump_power_profile_mode(smu, workload, 0); } diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c index 1d024b122b0c..cb923e33fd6f 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -2555,18 +2555,16 @@ static int smu_v13_0_0_set_power_profile_mode(struct smu_context *smu, workload_mask = 1 << workload_type; /* Add optimizations for SMU13.0.0/10. Reuse the power saving profile */ - if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE) { - if ((amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0) && - ((smu->adev->pm.fw_version == 0x004e6601) || - (smu->adev->pm.fw_version >= 0x004e7300))) || - (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 10) && - smu->adev->pm.fw_version >= 0x00504500)) { - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - PP_SMC_POWER_PROFILE_POWERSAVING); - if (workload_type >= 0) - workload_mask |= 1 << workload_type; - } + if ((amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 0) && + ((smu->adev->pm.fw_version == 0x004e6601) || + (smu->adev->pm.fw_version >= 0x004e7300))) || + (amdgpu_ip_version(smu->adev, MP1_HWIP, 0) == IP_VERSION(13, 0, 10) && + smu->adev->pm.fw_version >= 0x00504500)) { + workload_type = smu_cmn_to_asic_specific_index(smu, + CMN2ASIC_MAPPING_WORKLOAD, + PP_SMC_POWER_PROFILE_POWERSAVING); + if (workload_type >= 0) + workload_mask |= 1 << workload_type; } ret = smu_cmn_send_smc_msg_with_param(smu, diff --git a/drivers/gpu/drm/ast/ast_sil164.c b/drivers/gpu/drm/ast/ast_sil164.c index 496c7120e515..c231389936bd 100644 --- a/drivers/gpu/drm/ast/ast_sil164.c +++ b/drivers/gpu/drm/ast/ast_sil164.c @@ -29,6 +29,8 @@ static int ast_sil164_connector_helper_get_modes(struct drm_connector *connector if (ast_connector->physical_status == connector_status_connected) { count = drm_connector_helper_get_modes(connector); } else { + drm_edid_connector_update(connector, NULL); + /* * There's no EDID data without a connected monitor. Set BMC- * compatible modes in this case. The XGA default resolution diff --git a/drivers/gpu/drm/ast/ast_vga.c b/drivers/gpu/drm/ast/ast_vga.c index 3e815da43fbd..dd389a0a8f4a 100644 --- a/drivers/gpu/drm/ast/ast_vga.c +++ b/drivers/gpu/drm/ast/ast_vga.c @@ -29,6 +29,8 @@ static int ast_vga_connector_helper_get_modes(struct drm_connector *connector) if (ast_connector->physical_status == connector_status_connected) { count = drm_connector_helper_get_modes(connector); } else { + drm_edid_connector_update(connector, NULL); + /* * There's no EDID data without a connected monitor. Set BMC- * compatible modes in this case. The XGA default resolution diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c b/drivers/gpu/drm/i915/display/intel_dp_mst.c index 15541932b809..eeaedd979354 100644 --- a/drivers/gpu/drm/i915/display/intel_dp_mst.c +++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c @@ -89,25 +89,19 @@ static int intel_dp_mst_max_dpt_bpp(const struct intel_crtc_state *crtc_state, static int intel_dp_mst_bw_overhead(const struct intel_crtc_state *crtc_state, const struct intel_connector *connector, - bool ssc, bool dsc, int bpp_x16) + bool ssc, int dsc_slice_count, int bpp_x16) { const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode; unsigned long flags = DRM_DP_BW_OVERHEAD_MST; - int dsc_slice_count = 0; int overhead; flags |= intel_dp_is_uhbr(crtc_state) ? DRM_DP_BW_OVERHEAD_UHBR : 0; flags |= ssc ? DRM_DP_BW_OVERHEAD_SSC_REF_CLK : 0; flags |= crtc_state->fec_enable ? DRM_DP_BW_OVERHEAD_FEC : 0; - if (dsc) { + if (dsc_slice_count) flags |= DRM_DP_BW_OVERHEAD_DSC; - dsc_slice_count = intel_dp_dsc_get_slice_count(connector, - adjusted_mode->clock, - adjusted_mode->hdisplay, - crtc_state->joiner_pipes); - } overhead = drm_dp_bw_overhead(crtc_state->lane_count, adjusted_mode->hdisplay, @@ -153,6 +147,19 @@ static int intel_dp_mst_calc_pbn(int pixel_clock, int bpp_x16, int bw_overhead) return DIV_ROUND_UP(effective_data_rate * 64, 54 * 1000); } +static int intel_dp_mst_dsc_get_slice_count(const struct intel_connector *connector, + const struct intel_crtc_state *crtc_state) +{ + const struct drm_display_mode *adjusted_mode = + &crtc_state->hw.adjusted_mode; + int num_joined_pipes = crtc_state->joiner_pipes; + + return intel_dp_dsc_get_slice_count(connector, + adjusted_mode->clock, + adjusted_mode->hdisplay, + num_joined_pipes); +} + static int intel_dp_mst_find_vcpi_slots_for_bpp(struct intel_encoder *encoder, struct intel_crtc_state *crtc_state, int max_bpp, @@ -172,6 +179,7 @@ static int intel_dp_mst_find_vcpi_slots_for_bpp(struct intel_encoder *encoder, const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode; int bpp, slots = -EINVAL; + int dsc_slice_count = 0; int max_dpt_bpp; int ret = 0; @@ -203,6 +211,15 @@ static int intel_dp_mst_find_vcpi_slots_for_bpp(struct intel_encoder *encoder, drm_dbg_kms(&i915->drm, "Looking for slots in range min bpp %d max bpp %d\n", min_bpp, max_bpp); + if (dsc) { + dsc_slice_count = intel_dp_mst_dsc_get_slice_count(connector, crtc_state); + if (!dsc_slice_count) { + drm_dbg_kms(&i915->drm, "Can't get valid DSC slice count\n"); + + return -ENOSPC; + } + } + for (bpp = max_bpp; bpp >= min_bpp; bpp -= step) { int local_bw_overhead; int remote_bw_overhead; @@ -216,9 +233,9 @@ static int intel_dp_mst_find_vcpi_slots_for_bpp(struct intel_encoder *encoder, intel_dp_output_bpp(crtc_state->output_format, bpp)); local_bw_overhead = intel_dp_mst_bw_overhead(crtc_state, connector, - false, dsc, link_bpp_x16); + false, dsc_slice_count, link_bpp_x16); remote_bw_overhead = intel_dp_mst_bw_overhead(crtc_state, connector, - true, dsc, link_bpp_x16); + true, dsc_slice_count, link_bpp_x16); intel_dp_mst_compute_m_n(crtc_state, connector, local_bw_overhead, @@ -449,6 +466,9 @@ hblank_expansion_quirk_needs_dsc(const struct intel_connector *connector, if (mode_hblank_period_ns(adjusted_mode) > hblank_limit) return false; + if (!intel_dp_mst_dsc_get_slice_count(connector, crtc_state)) + return false; + return true; } diff --git a/drivers/gpu/drm/i915/display/intel_fb.c b/drivers/gpu/drm/i915/display/intel_fb.c index 5be7bb43e2e0..35557d98d7a7 100644 --- a/drivers/gpu/drm/i915/display/intel_fb.c +++ b/drivers/gpu/drm/i915/display/intel_fb.c @@ -438,6 +438,19 @@ bool intel_fb_needs_64k_phys(u64 modifier) INTEL_PLANE_CAP_NEED64K_PHYS); } +/** + * intel_fb_is_tile4_modifier: Check if a modifier is a tile4 modifier type + * @modifier: Modifier to check + * + * Returns: + * Returns %true if @modifier is a tile4 modifier. + */ +bool intel_fb_is_tile4_modifier(u64 modifier) +{ + return plane_caps_contain_any(lookup_modifier(modifier)->plane_caps, + INTEL_PLANE_CAP_TILING_4); +} + static bool check_modifier_display_ver_range(const struct intel_modifier_desc *md, u8 display_ver_from, u8 display_ver_until) { diff --git a/drivers/gpu/drm/i915/display/intel_fb.h b/drivers/gpu/drm/i915/display/intel_fb.h index 10de437e8ef8..827be3f7934c 100644 --- a/drivers/gpu/drm/i915/display/intel_fb.h +++ b/drivers/gpu/drm/i915/display/intel_fb.h @@ -35,6 +35,7 @@ bool intel_fb_is_ccs_modifier(u64 modifier); bool intel_fb_is_rc_ccs_cc_modifier(u64 modifier); bool intel_fb_is_mc_ccs_modifier(u64 modifier); bool intel_fb_needs_64k_phys(u64 modifier); +bool intel_fb_is_tile4_modifier(u64 modifier); bool intel_fb_is_ccs_aux_plane(const struct drm_framebuffer *fb, int color_plane); int intel_fb_rc_ccs_cc_plane(const struct drm_framebuffer *fb); diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c index 17d4c880ecc4..c8720d31d101 100644 --- a/drivers/gpu/drm/i915/display/skl_universal_plane.c +++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c @@ -1591,6 +1591,17 @@ static int skl_plane_check_fb(const struct intel_crtc_state *crtc_state, return -EINVAL; } + /* + * Display20 onward tile4 hflip is not supported + */ + if (rotation & DRM_MODE_REFLECT_X && + intel_fb_is_tile4_modifier(fb->modifier) && + DISPLAY_VER(dev_priv) >= 20) { + drm_dbg_kms(&dev_priv->drm, + "horizontal flip is not supported with tile4 surface formats\n"); + return -EINVAL; + } + if (drm_rotation_90_or_270(rotation)) { if (!intel_fb_supports_90_270_rotation(to_intel_framebuffer(fb))) { drm_dbg_kms(&dev_priv->drm, diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c b/drivers/gpu/drm/mgag200/mgag200_drv.c index 6623ee4e3277..9f5925693686 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.c +++ b/drivers/gpu/drm/mgag200/mgag200_drv.c @@ -18,7 +18,6 @@ #include <drm/drm_managed.h> #include <drm/drm_module.h> #include <drm/drm_pciids.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -85,34 +84,6 @@ resource_size_t mgag200_probe_vram(void __iomem *mem, resource_size_t size) return offset - 65536; } -static irqreturn_t mgag200_irq_handler(int irq, void *arg) -{ - struct drm_device *dev = arg; - struct mga_device *mdev = to_mga_device(dev); - struct drm_crtc *crtc; - u32 status, ien; - - status = RREG32(MGAREG_STATUS); - - if (status & MGAREG_STATUS_VLINEPEN) { - ien = RREG32(MGAREG_IEN); - if (!(ien & MGAREG_IEN_VLINEIEN)) - goto out; - - crtc = drm_crtc_from_index(dev, 0); - if (WARN_ON_ONCE(!crtc)) - goto out; - drm_crtc_handle_vblank(crtc); - - WREG32(MGAREG_ICLEAR, MGAREG_ICLEAR_VLINEICLR); - - return IRQ_HANDLED; - } - -out: - return IRQ_NONE; -} - /* * DRM driver */ @@ -196,7 +167,6 @@ int mgag200_device_init(struct mga_device *mdev, const struct mgag200_device_funcs *funcs) { struct drm_device *dev = &mdev->base; - struct pci_dev *pdev = to_pci_dev(dev->dev); u8 crtcext3, misc; int ret; @@ -223,14 +193,6 @@ int mgag200_device_init(struct mga_device *mdev, mutex_unlock(&mdev->rmmio_lock); WREG32(MGAREG_IEN, 0); - WREG32(MGAREG_ICLEAR, MGAREG_ICLEAR_VLINEICLR); - - ret = devm_request_irq(&pdev->dev, pdev->irq, mgag200_irq_handler, IRQF_SHARED, - dev->driver->name, dev); - if (ret) { - drm_err(dev, "Failed to acquire interrupt, error %d\n", ret); - return ret; - } return 0; } diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h index 4760ba92871b..988967eafbf2 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.h +++ b/drivers/gpu/drm/mgag200/mgag200_drv.h @@ -391,24 +391,17 @@ int mgag200_crtc_helper_atomic_check(struct drm_crtc *crtc, struct drm_atomic_st void mgag200_crtc_helper_atomic_flush(struct drm_crtc *crtc, struct drm_atomic_state *old_state); void mgag200_crtc_helper_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_state *old_state); void mgag200_crtc_helper_atomic_disable(struct drm_crtc *crtc, struct drm_atomic_state *old_state); -bool mgag200_crtc_helper_get_scanout_position(struct drm_crtc *crtc, bool in_vblank_irq, - int *vpos, int *hpos, - ktime_t *stime, ktime_t *etime, - const struct drm_display_mode *mode); #define MGAG200_CRTC_HELPER_FUNCS \ .mode_valid = mgag200_crtc_helper_mode_valid, \ .atomic_check = mgag200_crtc_helper_atomic_check, \ .atomic_flush = mgag200_crtc_helper_atomic_flush, \ .atomic_enable = mgag200_crtc_helper_atomic_enable, \ - .atomic_disable = mgag200_crtc_helper_atomic_disable, \ - .get_scanout_position = mgag200_crtc_helper_get_scanout_position + .atomic_disable = mgag200_crtc_helper_atomic_disable void mgag200_crtc_reset(struct drm_crtc *crtc); struct drm_crtc_state *mgag200_crtc_atomic_duplicate_state(struct drm_crtc *crtc); void mgag200_crtc_atomic_destroy_state(struct drm_crtc *crtc, struct drm_crtc_state *crtc_state); -int mgag200_crtc_enable_vblank(struct drm_crtc *crtc); -void mgag200_crtc_disable_vblank(struct drm_crtc *crtc); #define MGAG200_CRTC_FUNCS \ .reset = mgag200_crtc_reset, \ @@ -416,10 +409,7 @@ void mgag200_crtc_disable_vblank(struct drm_crtc *crtc); .set_config = drm_atomic_helper_set_config, \ .page_flip = drm_atomic_helper_page_flip, \ .atomic_duplicate_state = mgag200_crtc_atomic_duplicate_state, \ - .atomic_destroy_state = mgag200_crtc_atomic_destroy_state, \ - .enable_vblank = mgag200_crtc_enable_vblank, \ - .disable_vblank = mgag200_crtc_disable_vblank, \ - .get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp + .atomic_destroy_state = mgag200_crtc_atomic_destroy_state void mgag200_set_mode_regs(struct mga_device *mdev, const struct drm_display_mode *mode, bool set_vidrst); diff --git a/drivers/gpu/drm/mgag200/mgag200_g200.c b/drivers/gpu/drm/mgag200/mgag200_g200.c index 77ce8d36cef0..f874e2949840 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200.c @@ -8,7 +8,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -404,9 +403,5 @@ struct mga_device *mgag200_g200_device_create(struct pci_dev *pdev, const struct drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200eh.c b/drivers/gpu/drm/mgag200/mgag200_g200eh.c index 09ced65c1d2f..e2305f8e00f8 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200eh.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200eh.c @@ -8,7 +8,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -276,9 +275,5 @@ struct mga_device *mgag200_g200eh_device_create(struct pci_dev *pdev, const stru drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200eh3.c b/drivers/gpu/drm/mgag200/mgag200_g200eh3.c index 5daa469137bd..11ae76eb081d 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200eh3.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200eh3.c @@ -7,7 +7,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -181,9 +180,5 @@ struct mga_device *mgag200_g200eh3_device_create(struct pci_dev *pdev, drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200er.c b/drivers/gpu/drm/mgag200/mgag200_g200er.c index 09cfffafe130..c20ed0ab50ec 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200er.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200er.c @@ -8,7 +8,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -206,8 +205,6 @@ static void mgag200_g200er_crtc_helper_atomic_enable(struct drm_crtc *crtc, mgag200_crtc_set_gamma_linear(mdev, format); mgag200_enable_display(mdev); - - drm_crtc_vblank_on(crtc); } static const struct drm_crtc_helper_funcs mgag200_g200er_crtc_helper_funcs = { @@ -215,8 +212,7 @@ static const struct drm_crtc_helper_funcs mgag200_g200er_crtc_helper_funcs = { .atomic_check = mgag200_crtc_helper_atomic_check, .atomic_flush = mgag200_crtc_helper_atomic_flush, .atomic_enable = mgag200_g200er_crtc_helper_atomic_enable, - .atomic_disable = mgag200_crtc_helper_atomic_disable, - .get_scanout_position = mgag200_crtc_helper_get_scanout_position, + .atomic_disable = mgag200_crtc_helper_atomic_disable }; static const struct drm_crtc_funcs mgag200_g200er_crtc_funcs = { @@ -312,9 +308,5 @@ struct mga_device *mgag200_g200er_device_create(struct pci_dev *pdev, const stru drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200ev.c b/drivers/gpu/drm/mgag200/mgag200_g200ev.c index 3d48baa91d8b..78be964eb97c 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200ev.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200ev.c @@ -8,7 +8,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -207,8 +206,6 @@ static void mgag200_g200ev_crtc_helper_atomic_enable(struct drm_crtc *crtc, mgag200_crtc_set_gamma_linear(mdev, format); mgag200_enable_display(mdev); - - drm_crtc_vblank_on(crtc); } static const struct drm_crtc_helper_funcs mgag200_g200ev_crtc_helper_funcs = { @@ -216,8 +213,7 @@ static const struct drm_crtc_helper_funcs mgag200_g200ev_crtc_helper_funcs = { .atomic_check = mgag200_crtc_helper_atomic_check, .atomic_flush = mgag200_crtc_helper_atomic_flush, .atomic_enable = mgag200_g200ev_crtc_helper_atomic_enable, - .atomic_disable = mgag200_crtc_helper_atomic_disable, - .get_scanout_position = mgag200_crtc_helper_get_scanout_position, + .atomic_disable = mgag200_crtc_helper_atomic_disable }; static const struct drm_crtc_funcs mgag200_g200ev_crtc_funcs = { @@ -317,9 +313,5 @@ struct mga_device *mgag200_g200ev_device_create(struct pci_dev *pdev, const stru drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200ew3.c b/drivers/gpu/drm/mgag200/mgag200_g200ew3.c index dabc778e64e8..31624c9ab7b7 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200ew3.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200ew3.c @@ -7,7 +7,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -199,9 +198,5 @@ struct mga_device *mgag200_g200ew3_device_create(struct pci_dev *pdev, drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200se.c b/drivers/gpu/drm/mgag200/mgag200_g200se.c index 9dcbe8304271..7a32d3b1d226 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200se.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200se.c @@ -8,7 +8,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -338,8 +337,6 @@ static void mgag200_g200se_crtc_helper_atomic_enable(struct drm_crtc *crtc, mgag200_crtc_set_gamma_linear(mdev, format); mgag200_enable_display(mdev); - - drm_crtc_vblank_on(crtc); } static const struct drm_crtc_helper_funcs mgag200_g200se_crtc_helper_funcs = { @@ -347,8 +344,7 @@ static const struct drm_crtc_helper_funcs mgag200_g200se_crtc_helper_funcs = { .atomic_check = mgag200_crtc_helper_atomic_check, .atomic_flush = mgag200_crtc_helper_atomic_flush, .atomic_enable = mgag200_g200se_crtc_helper_atomic_enable, - .atomic_disable = mgag200_crtc_helper_atomic_disable, - .get_scanout_position = mgag200_crtc_helper_get_scanout_position, + .atomic_disable = mgag200_crtc_helper_atomic_disable }; static const struct drm_crtc_funcs mgag200_g200se_crtc_funcs = { @@ -517,9 +513,5 @@ struct mga_device *mgag200_g200se_device_create(struct pci_dev *pdev, const stru drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_g200wb.c b/drivers/gpu/drm/mgag200/mgag200_g200wb.c index 83a24aedbf2f..a0e7b9ad46cd 100644 --- a/drivers/gpu/drm/mgag200/mgag200_g200wb.c +++ b/drivers/gpu/drm/mgag200/mgag200_g200wb.c @@ -8,7 +8,6 @@ #include <drm/drm_drv.h> #include <drm/drm_gem_atomic_helper.h> #include <drm/drm_probe_helper.h> -#include <drm/drm_vblank.h> #include "mgag200_drv.h" @@ -323,9 +322,5 @@ struct mga_device *mgag200_g200wb_device_create(struct pci_dev *pdev, const stru drm_mode_config_reset(dev); drm_kms_helper_poll_init(dev); - ret = drm_vblank_init(dev, 1); - if (ret) - return ERR_PTR(ret); - return mdev; } diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c index 7159909aca1e..fb71658c3117 100644 --- a/drivers/gpu/drm/mgag200/mgag200_mode.c +++ b/drivers/gpu/drm/mgag200/mgag200_mode.c @@ -22,7 +22,6 @@ #include <drm/drm_gem_framebuffer_helper.h> #include <drm/drm_panic.h> #include <drm/drm_print.h> -#include <drm/drm_vblank.h> #include "mgag200_ddc.h" #include "mgag200_drv.h" @@ -227,14 +226,7 @@ void mgag200_set_mode_regs(struct mga_device *mdev, const struct drm_display_mod vblkstr = mode->crtc_vblank_start; vblkend = vtotal + 1; - /* - * There's no VBLANK interrupt on Matrox chipsets, so we use - * the VLINE interrupt instead. It triggers when the current - * <linecomp> has been reached. For VBLANK, this is the first - * non-visible line at the bottom of the screen. Therefore, - * keep <linecomp> in sync with <vblkstr>. - */ - linecomp = vblkstr; + linecomp = vdispend; misc = RREG8(MGA_MISC_IN); @@ -645,8 +637,6 @@ void mgag200_crtc_helper_atomic_flush(struct drm_crtc *crtc, struct drm_atomic_s struct mgag200_crtc_state *mgag200_crtc_state = to_mgag200_crtc_state(crtc_state); struct drm_device *dev = crtc->dev; struct mga_device *mdev = to_mga_device(dev); - struct drm_pending_vblank_event *event; - unsigned long flags; if (crtc_state->enable && crtc_state->color_mgmt_changed) { const struct drm_format_info *format = mgag200_crtc_state->format; @@ -656,18 +646,6 @@ void mgag200_crtc_helper_atomic_flush(struct drm_crtc *crtc, struct drm_atomic_s else mgag200_crtc_set_gamma_linear(mdev, format); } - - event = crtc->state->event; - if (event) { - crtc->state->event = NULL; - - spin_lock_irqsave(&dev->event_lock, flags); - if (drm_crtc_vblank_get(crtc) != 0) - drm_crtc_send_vblank_event(crtc, event); - else - drm_crtc_arm_vblank_event(crtc, event); - spin_unlock_irqrestore(&dev->event_lock, flags); - } } void mgag200_crtc_helper_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_state *old_state) @@ -692,44 +670,15 @@ void mgag200_crtc_helper_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_ mgag200_crtc_set_gamma_linear(mdev, format); mgag200_enable_display(mdev); - - drm_crtc_vblank_on(crtc); } void mgag200_crtc_helper_atomic_disable(struct drm_crtc *crtc, struct drm_atomic_state *old_state) { struct mga_device *mdev = to_mga_device(crtc->dev); - drm_crtc_vblank_off(crtc); - mgag200_disable_display(mdev); } -bool mgag200_crtc_helper_get_scanout_position(struct drm_crtc *crtc, bool in_vblank_irq, - int *vpos, int *hpos, - ktime_t *stime, ktime_t *etime, - const struct drm_display_mode *mode) -{ - struct mga_device *mdev = to_mga_device(crtc->dev); - u32 vcount; - - if (stime) - *stime = ktime_get(); - - if (vpos) { - vcount = RREG32(MGAREG_VCOUNT); - *vpos = vcount & GENMASK(11, 0); - } - - if (hpos) - *hpos = mode->htotal >> 1; // near middle of scanline on average - - if (etime) - *etime = ktime_get(); - - return true; -} - void mgag200_crtc_reset(struct drm_crtc *crtc) { struct mgag200_crtc_state *mgag200_crtc_state; @@ -774,30 +723,6 @@ void mgag200_crtc_atomic_destroy_state(struct drm_crtc *crtc, struct drm_crtc_st kfree(mgag200_crtc_state); } -int mgag200_crtc_enable_vblank(struct drm_crtc *crtc) -{ - struct mga_device *mdev = to_mga_device(crtc->dev); - u32 ien; - - WREG32(MGAREG_ICLEAR, MGAREG_ICLEAR_VLINEICLR); - - ien = RREG32(MGAREG_IEN); - ien |= MGAREG_IEN_VLINEIEN; - WREG32(MGAREG_IEN, ien); - - return 0; -} - -void mgag200_crtc_disable_vblank(struct drm_crtc *crtc) -{ - struct mga_device *mdev = to_mga_device(crtc->dev); - u32 ien; - - ien = RREG32(MGAREG_IEN); - ien &= ~(MGAREG_IEN_VLINEIEN); - WREG32(MGAREG_IEN, ien); -} - /* * Mode config */ diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 06cab2c6fd66..702b8d4b3497 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -101,9 +101,10 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter, } static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, - struct msm_ringbuffer *ring, struct msm_file_private *ctx) + struct msm_ringbuffer *ring, struct msm_gem_submit *submit) { bool sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1; + struct msm_file_private *ctx = submit->queue->ctx; struct adreno_gpu *adreno_gpu = &a6xx_gpu->base; phys_addr_t ttbr; u32 asid; @@ -115,6 +116,15 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid)) return; + if (adreno_gpu->info->family >= ADRENO_7XX_GEN1) { + /* Wait for previous submit to complete before continuing: */ + OUT_PKT7(ring, CP_WAIT_TIMESTAMP, 4); + OUT_RING(ring, 0); + OUT_RING(ring, lower_32_bits(rbmemptr(ring, fence))); + OUT_RING(ring, upper_32_bits(rbmemptr(ring, fence))); + OUT_RING(ring, submit->seqno - 1); + } + if (!sysprof) { if (!adreno_is_a7xx(adreno_gpu)) { /* Turn off protected mode to write to special registers */ @@ -193,7 +203,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) struct msm_ringbuffer *ring = submit->ring; unsigned int i, ibs = 0; - a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx); + a6xx_set_pagetable(a6xx_gpu, ring, submit); get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP(0), rbmemptr_stats(ring, index, cpcycles_start)); @@ -283,7 +293,7 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) OUT_PKT7(ring, CP_THREAD_CONTROL, 1); OUT_RING(ring, CP_THREAD_CONTROL_0_SYNC_THREADS | CP_SET_THREAD_BR); - a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx); + a6xx_set_pagetable(a6xx_gpu, ring, submit); get_stats_counter(ring, REG_A7XX_RBBM_PERFCTR_CP(0), rbmemptr_stats(ring, index, cpcycles_start)); diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c index 4c1be2f0555f..db6c57900781 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c @@ -711,12 +711,13 @@ void dpu_crtc_complete_commit(struct drm_crtc *crtc) _dpu_crtc_complete_flip(crtc); } -static void _dpu_crtc_setup_lm_bounds(struct drm_crtc *crtc, +static int _dpu_crtc_check_and_setup_lm_bounds(struct drm_crtc *crtc, struct drm_crtc_state *state) { struct dpu_crtc_state *cstate = to_dpu_crtc_state(state); struct drm_display_mode *adj_mode = &state->adjusted_mode; u32 crtc_split_width = adj_mode->hdisplay / cstate->num_mixers; + struct dpu_kms *dpu_kms = _dpu_crtc_get_kms(crtc); int i; for (i = 0; i < cstate->num_mixers; i++) { @@ -727,7 +728,12 @@ static void _dpu_crtc_setup_lm_bounds(struct drm_crtc *crtc, r->y2 = adj_mode->vdisplay; trace_dpu_crtc_setup_lm_bounds(DRMID(crtc), i, r); + + if (drm_rect_width(r) > dpu_kms->catalog->caps->max_mixer_width) + return -E2BIG; } + + return 0; } static void _dpu_crtc_get_pcc_coeff(struct drm_crtc_state *state, @@ -803,7 +809,7 @@ static void dpu_crtc_atomic_begin(struct drm_crtc *crtc, DRM_DEBUG_ATOMIC("crtc%d\n", crtc->base.id); - _dpu_crtc_setup_lm_bounds(crtc, crtc->state); + _dpu_crtc_check_and_setup_lm_bounds(crtc, crtc->state); /* encoder will trigger pending mask now */ drm_for_each_encoder_mask(encoder, crtc->dev, crtc->state->encoder_mask) @@ -1091,9 +1097,6 @@ static void dpu_crtc_disable(struct drm_crtc *crtc, dpu_core_perf_crtc_update(crtc, 0); - memset(cstate->mixers, 0, sizeof(cstate->mixers)); - cstate->num_mixers = 0; - /* disable clk & bw control until clk & bw properties are set */ cstate->bw_control = false; cstate->bw_split_vote = false; @@ -1192,8 +1195,11 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc, if (crtc_state->active_changed) crtc_state->mode_changed = true; - if (cstate->num_mixers) - _dpu_crtc_setup_lm_bounds(crtc, crtc_state); + if (cstate->num_mixers) { + rc = _dpu_crtc_check_and_setup_lm_bounds(crtc, crtc_state); + if (rc) + return rc; + } /* FIXME: move this to dpu_plane_atomic_check? */ drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) { diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c index 3b171bf227d1..bd3698bf0cf7 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c @@ -624,6 +624,40 @@ static struct msm_display_topology dpu_encoder_get_topology( return topology; } +static void dpu_encoder_assign_crtc_resources(struct dpu_kms *dpu_kms, + struct drm_encoder *drm_enc, + struct dpu_global_state *global_state, + struct drm_crtc_state *crtc_state) +{ + struct dpu_crtc_state *cstate; + struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC]; + struct dpu_hw_blk *hw_lm[MAX_CHANNELS_PER_ENC]; + struct dpu_hw_blk *hw_dspp[MAX_CHANNELS_PER_ENC]; + int num_lm, num_ctl, num_dspp, i; + + cstate = to_dpu_crtc_state(crtc_state); + + memset(cstate->mixers, 0, sizeof(cstate->mixers)); + + num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, + drm_enc->base.id, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl)); + num_lm = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, + drm_enc->base.id, DPU_HW_BLK_LM, hw_lm, ARRAY_SIZE(hw_lm)); + num_dspp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, + drm_enc->base.id, DPU_HW_BLK_DSPP, hw_dspp, + ARRAY_SIZE(hw_dspp)); + + for (i = 0; i < num_lm; i++) { + int ctl_idx = (i < num_ctl) ? i : (num_ctl-1); + + cstate->mixers[i].hw_lm = to_dpu_hw_mixer(hw_lm[i]); + cstate->mixers[i].lm_ctl = to_dpu_hw_ctl(hw_ctl[ctl_idx]); + cstate->mixers[i].hw_dspp = i < num_dspp ? to_dpu_hw_dspp(hw_dspp[i]) : NULL; + } + + cstate->num_mixers = num_lm; +} + static int dpu_encoder_virt_atomic_check( struct drm_encoder *drm_enc, struct drm_crtc_state *crtc_state, @@ -692,6 +726,9 @@ static int dpu_encoder_virt_atomic_check( if (!crtc_state->active_changed || crtc_state->enable) ret = dpu_rm_reserve(&dpu_kms->rm, global_state, drm_enc, crtc_state, topology); + if (!ret) + dpu_encoder_assign_crtc_resources(dpu_kms, drm_enc, + global_state, crtc_state); } trace_dpu_enc_atomic_check_flags(DRMID(drm_enc), adj_mode->flags); @@ -1093,14 +1130,11 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, struct dpu_encoder_virt *dpu_enc; struct msm_drm_private *priv; struct dpu_kms *dpu_kms; - struct dpu_crtc_state *cstate; struct dpu_global_state *global_state; struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC]; struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC]; - struct dpu_hw_blk *hw_lm[MAX_CHANNELS_PER_ENC]; - struct dpu_hw_blk *hw_dspp[MAX_CHANNELS_PER_ENC] = { NULL }; struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC]; - int num_lm, num_ctl, num_pp, num_dsc; + int num_ctl, num_pp, num_dsc; unsigned int dsc_mask = 0; int i; @@ -1129,11 +1163,6 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, ARRAY_SIZE(hw_pp)); num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, drm_enc->base.id, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl)); - num_lm = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, - drm_enc->base.id, DPU_HW_BLK_LM, hw_lm, ARRAY_SIZE(hw_lm)); - dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, - drm_enc->base.id, DPU_HW_BLK_DSPP, hw_dspp, - ARRAY_SIZE(hw_dspp)); for (i = 0; i < MAX_CHANNELS_PER_ENC; i++) dpu_enc->hw_pp[i] = i < num_pp ? to_dpu_hw_pingpong(hw_pp[i]) @@ -1159,36 +1188,23 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, dpu_enc->cur_master->hw_cdm = hw_cdm ? to_dpu_hw_cdm(hw_cdm) : NULL; } - cstate = to_dpu_crtc_state(crtc_state); - - for (i = 0; i < num_lm; i++) { - int ctl_idx = (i < num_ctl) ? i : (num_ctl-1); - - cstate->mixers[i].hw_lm = to_dpu_hw_mixer(hw_lm[i]); - cstate->mixers[i].lm_ctl = to_dpu_hw_ctl(hw_ctl[ctl_idx]); - cstate->mixers[i].hw_dspp = to_dpu_hw_dspp(hw_dspp[i]); - } - - cstate->num_mixers = num_lm; - for (i = 0; i < dpu_enc->num_phys_encs; i++) { struct dpu_encoder_phys *phys = dpu_enc->phys_encs[i]; - if (!dpu_enc->hw_pp[i]) { + phys->hw_pp = dpu_enc->hw_pp[i]; + if (!phys->hw_pp) { DPU_ERROR_ENC(dpu_enc, "no pp block assigned at idx: %d\n", i); return; } - if (!hw_ctl[i]) { + phys->hw_ctl = i < num_ctl ? to_dpu_hw_ctl(hw_ctl[i]) : NULL; + if (!phys->hw_ctl) { DPU_ERROR_ENC(dpu_enc, "no ctl block assigned at idx: %d\n", i); return; } - phys->hw_pp = dpu_enc->hw_pp[i]; - phys->hw_ctl = to_dpu_hw_ctl(hw_ctl[i]); - phys->cached_mode = crtc_state->adjusted_mode; if (phys->ops.atomic_mode_set) phys->ops.atomic_mode_set(phys, crtc_state, conn_state); diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c index ba8878d21cf0..d8a2edebfe8c 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c @@ -302,7 +302,7 @@ static void dpu_encoder_phys_vid_setup_timing_engine( intf_cfg.stream_sel = 0; /* Don't care value for video mode */ intf_cfg.mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc); intf_cfg.dsc = dpu_encoder_helper_get_dsc(phys_enc); - if (phys_enc->hw_pp->merge_3d) + if (intf_cfg.mode_3d && phys_enc->hw_pp->merge_3d) intf_cfg.merge_3d = phys_enc->hw_pp->merge_3d->idx; spin_lock_irqsave(phys_enc->enc_spinlock, lock_flags); @@ -440,10 +440,12 @@ static void dpu_encoder_phys_vid_enable(struct dpu_encoder_phys *phys_enc) struct dpu_hw_ctl *ctl; const struct msm_format *fmt; u32 fmt_fourcc; + u32 mode_3d; ctl = phys_enc->hw_ctl; fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc); fmt = mdp_get_format(&phys_enc->dpu_kms->base, fmt_fourcc, 0); + mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc); DPU_DEBUG_VIDENC(phys_enc, "\n"); @@ -466,7 +468,8 @@ static void dpu_encoder_phys_vid_enable(struct dpu_encoder_phys *phys_enc) goto skip_flush; ctl->ops.update_pending_flush_intf(ctl, phys_enc->hw_intf->idx); - if (ctl->ops.update_pending_flush_merge_3d && phys_enc->hw_pp->merge_3d) + if (mode_3d && ctl->ops.update_pending_flush_merge_3d && + phys_enc->hw_pp->merge_3d) ctl->ops.update_pending_flush_merge_3d(ctl, phys_enc->hw_pp->merge_3d->idx); if (ctl->ops.update_pending_flush_cdm && phys_enc->hw_cdm) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c index 882c717859ce..07035ab77b79 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c @@ -275,6 +275,7 @@ static void _dpu_encoder_phys_wb_update_flush(struct dpu_encoder_phys *phys_enc) struct dpu_hw_pingpong *hw_pp; struct dpu_hw_cdm *hw_cdm; u32 pending_flush = 0; + u32 mode_3d; if (!phys_enc) return; @@ -283,6 +284,7 @@ static void _dpu_encoder_phys_wb_update_flush(struct dpu_encoder_phys *phys_enc) hw_pp = phys_enc->hw_pp; hw_ctl = phys_enc->hw_ctl; hw_cdm = phys_enc->hw_cdm; + mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc); DPU_DEBUG("[wb:%d]\n", hw_wb->idx - WB_0); @@ -294,7 +296,8 @@ static void _dpu_encoder_phys_wb_update_flush(struct dpu_encoder_phys *phys_enc) if (hw_ctl->ops.update_pending_flush_wb) hw_ctl->ops.update_pending_flush_wb(hw_ctl, hw_wb->idx); - if (hw_ctl->ops.update_pending_flush_merge_3d && hw_pp && hw_pp->merge_3d) + if (mode_3d && hw_ctl->ops.update_pending_flush_merge_3d && + hw_pp && hw_pp->merge_3d) hw_ctl->ops.update_pending_flush_merge_3d(hw_ctl, hw_pp->merge_3d->idx); diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c b/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c index add72bbc28b1..4d55e3cf570f 100644 --- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c +++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c @@ -26,7 +26,7 @@ static void msm_disp_state_dump_regs(u32 **reg, u32 aligned_len, void __iomem *b end_addr = base_addr + aligned_len; if (!(*reg)) - *reg = kzalloc(len_padded, GFP_KERNEL); + *reg = kvzalloc(len_padded, GFP_KERNEL); if (*reg) dump_addr = *reg; @@ -48,20 +48,21 @@ static void msm_disp_state_dump_regs(u32 **reg, u32 aligned_len, void __iomem *b } } -static void msm_disp_state_print_regs(u32 **reg, u32 len, void __iomem *base_addr, - struct drm_printer *p) +static void msm_disp_state_print_regs(const u32 *dump_addr, u32 len, + void __iomem *base_addr, struct drm_printer *p) { int i; - u32 *dump_addr = NULL; void __iomem *addr; u32 num_rows; + if (!dump_addr) { + drm_printf(p, "Registers not stored\n"); + return; + } + addr = base_addr; num_rows = len / REG_DUMP_ALIGN; - if (*reg) - dump_addr = *reg; - for (i = 0; i < num_rows; i++) { drm_printf(p, "0x%lx : %08x %08x %08x %08x\n", (unsigned long)(addr - base_addr), @@ -89,7 +90,7 @@ void msm_disp_state_print(struct msm_disp_state *state, struct drm_printer *p) list_for_each_entry_safe(block, tmp, &state->blocks, node) { drm_printf(p, "====================%s================\n", block->name); - msm_disp_state_print_regs(&block->state, block->size, block->base_addr, p); + msm_disp_state_print_regs(block->state, block->size, block->base_addr, p); } drm_printf(p, "===================dpu drm state================\n"); @@ -161,7 +162,7 @@ void msm_disp_state_free(void *data) list_for_each_entry_safe(block, tmp, &disp_state->blocks, node) { list_del(&block->node); - kfree(block->state); + kvfree(block->state); kfree(block); } diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c index 185d7de0bf37..a98d24b7cb00 100644 --- a/drivers/gpu/drm/msm/dsi/dsi_host.c +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c @@ -542,7 +542,7 @@ static unsigned long dsi_adjust_pclk_for_compression(const struct drm_display_mo int new_htotal = mode->htotal - mode->hdisplay + new_hdisplay; - return new_htotal * mode->vtotal * drm_mode_vrefresh(mode); + return mult_frac(mode->clock * 1000u, new_htotal, mode->htotal); } static unsigned long dsi_get_pclk_rate(const struct drm_display_mode *mode, @@ -550,7 +550,7 @@ static unsigned long dsi_get_pclk_rate(const struct drm_display_mode *mode, { unsigned long pclk_rate; - pclk_rate = mode->clock * 1000; + pclk_rate = mode->clock * 1000u; if (dsc) pclk_rate = dsi_adjust_pclk_for_compression(mode, dsc); diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_phy_8998.c b/drivers/gpu/drm/msm/hdmi/hdmi_phy_8998.c index 0e3a2b16a2ce..e6ffaf92d26d 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_phy_8998.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_phy_8998.c @@ -153,15 +153,6 @@ static inline u32 pll_get_pll_cmp(u64 fdata, unsigned long ref_clk) return dividend - 1; } -static inline u64 pll_cmp_to_fdata(u32 pll_cmp, unsigned long ref_clk) -{ - u64 fdata = ((u64)pll_cmp) * ref_clk * 10; - - do_div(fdata, HDMI_PLL_CMP_CNT); - - return fdata; -} - #define HDMI_REF_CLOCK_HZ ((u64)19200000) #define HDMI_MHZ_TO_HZ ((u64)1000000) static int pll_get_post_div(struct hdmi_8998_post_divider *pd, u64 bclk) diff --git a/drivers/gpu/drm/panel/panel-himax-hx83102.c b/drivers/gpu/drm/panel/panel-himax-hx83102.c index 6e4b7e4644ce..8b48bba18131 100644 --- a/drivers/gpu/drm/panel/panel-himax-hx83102.c +++ b/drivers/gpu/drm/panel/panel-himax-hx83102.c @@ -298,7 +298,7 @@ static int ivo_t109nw41_init(struct hx83102 *ctx) msleep(60); hx83102_enable_extended_cmds(&dsi_ctx, true); - mipi_dsi_dcs_write_seq_multi(&dsi_ctx, HX83102_SETPOWER, 0x2c, 0xed, 0xed, 0x0f, 0xcf, 0x42, + mipi_dsi_dcs_write_seq_multi(&dsi_ctx, HX83102_SETPOWER, 0x2c, 0xed, 0xed, 0x27, 0xe7, 0x52, 0xf5, 0x39, 0x36, 0x36, 0x36, 0x36, 0x32, 0x8b, 0x11, 0x65, 0x00, 0x88, 0xfa, 0xff, 0xff, 0x8f, 0xff, 0x08, 0xd6, 0x33); mipi_dsi_dcs_write_seq_multi(&dsi_ctx, HX83102_SETDISP, 0x00, 0x47, 0xb0, 0x80, 0x00, 0x12, @@ -343,11 +343,11 @@ static int ivo_t109nw41_init(struct hx83102 *ctx) 0xaa, 0xaa, 0xaa, 0xaa, 0xaa, 0xa0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00); - mipi_dsi_dcs_write_seq_multi(&dsi_ctx, HX83102_SETGMA, 0x04, 0x04, 0x06, 0x0a, 0x0a, 0x05, - 0x12, 0x14, 0x17, 0x13, 0x2c, 0x33, 0x39, 0x4b, 0x4c, 0x56, 0x61, 0x78, - 0x7a, 0x41, 0x50, 0x68, 0x73, 0x04, 0x04, 0x06, 0x0a, 0x0a, 0x05, 0x12, - 0x14, 0x17, 0x13, 0x2c, 0x33, 0x39, 0x4b, 0x4c, 0x56, 0x61, 0x78, 0x7a, - 0x41, 0x50, 0x68, 0x73); + mipi_dsi_dcs_write_seq_multi(&dsi_ctx, HX83102_SETGMA, 0x00, 0x07, 0x10, 0x17, 0x1c, 0x33, + 0x48, 0x50, 0x57, 0x50, 0x68, 0x6e, 0x71, 0x7f, 0x81, 0x8a, 0x8e, 0x9b, + 0x9c, 0x4d, 0x56, 0x5d, 0x73, 0x00, 0x07, 0x10, 0x17, 0x1c, 0x33, 0x48, + 0x50, 0x57, 0x50, 0x68, 0x6e, 0x71, 0x7f, 0x81, 0x8a, 0x8e, 0x9b, 0x9c, + 0x4d, 0x56, 0x5d, 0x73); mipi_dsi_dcs_write_seq_multi(&dsi_ctx, HX83102_SETTP1, 0x07, 0x10, 0x10, 0x1a, 0x26, 0x9e, 0x00, 0x4f, 0xa0, 0x14, 0x14, 0x00, 0x00, 0x00, 0x00, 0x12, 0x0a, 0x02, 0x02, 0x00, 0x33, 0x02, 0x04, 0x18, 0x01); diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c index 0f723292409e..fafed331e0a0 100644 --- a/drivers/gpu/drm/radeon/radeon_encoders.c +++ b/drivers/gpu/drm/radeon/radeon_encoders.c @@ -43,7 +43,7 @@ static uint32_t radeon_encoder_clones(struct drm_encoder *encoder) struct radeon_device *rdev = dev->dev_private; struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder); struct drm_encoder *clone_encoder; - uint32_t index_mask = 0; + uint32_t index_mask = drm_encoder_mask(encoder); int count; /* DIG routing gets problematic */ diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c index 890a66a2361f..64bd7d74854e 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c @@ -635,10 +635,8 @@ out: kunmap_atomic(d.src_addr); if (d.dst_addr) kunmap_atomic(d.dst_addr); - if (src_pages) - kvfree(src_pages); - if (dst_pages) - kvfree(dst_pages); + kvfree(src_pages); + kvfree(dst_pages); return ret; } diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h index 3f4719b3c268..4e2807f5f94c 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h @@ -62,7 +62,7 @@ #define VMWGFX_DRIVER_MINOR 20 #define VMWGFX_DRIVER_PATCHLEVEL 0 #define VMWGFX_FIFO_STATIC_SIZE (1024*1024) -#define VMWGFX_MAX_DISPLAYS 16 +#define VMWGFX_NUM_DISPLAY_UNITS 8 #define VMWGFX_CMD_BOUNCE_INIT_SIZE 32768 #define VMWGFX_MIN_INITIAL_WIDTH 1280 @@ -82,7 +82,7 @@ #define VMWGFX_NUM_GB_CONTEXT 256 #define VMWGFX_NUM_GB_SHADER 20000 #define VMWGFX_NUM_GB_SURFACE 32768 -#define VMWGFX_NUM_GB_SCREEN_TARGET VMWGFX_MAX_DISPLAYS +#define VMWGFX_NUM_GB_SCREEN_TARGET VMWGFX_NUM_DISPLAY_UNITS #define VMWGFX_NUM_DXCONTEXT 256 #define VMWGFX_NUM_DXQUERY 512 #define VMWGFX_NUM_MOB (VMWGFX_NUM_GB_CONTEXT +\ diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index 288ed0bb75cb..63b8d7591253 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -1283,7 +1283,6 @@ static int vmw_kms_new_framebuffer_surface(struct vmw_private *dev_priv, { struct drm_device *dev = &dev_priv->drm; struct vmw_framebuffer_surface *vfbs; - enum SVGA3dSurfaceFormat format; struct vmw_surface *surface; int ret; @@ -1320,34 +1319,6 @@ static int vmw_kms_new_framebuffer_surface(struct vmw_private *dev_priv, return -EINVAL; } - switch (mode_cmd->pixel_format) { - case DRM_FORMAT_ARGB8888: - format = SVGA3D_A8R8G8B8; - break; - case DRM_FORMAT_XRGB8888: - format = SVGA3D_X8R8G8B8; - break; - case DRM_FORMAT_RGB565: - format = SVGA3D_R5G6B5; - break; - case DRM_FORMAT_XRGB1555: - format = SVGA3D_A1R5G5B5; - break; - default: - DRM_ERROR("Invalid pixel format: %p4cc\n", - &mode_cmd->pixel_format); - return -EINVAL; - } - - /* - * For DX, surface format validation is done when surface->scanout - * is set. - */ - if (!has_sm4_context(dev_priv) && format != surface->metadata.format) { - DRM_ERROR("Invalid surface format for requested mode.\n"); - return -EINVAL; - } - vfbs = kzalloc(sizeof(*vfbs), GFP_KERNEL); if (!vfbs) { ret = -ENOMEM; @@ -1539,6 +1510,7 @@ static struct drm_framebuffer *vmw_kms_fb_create(struct drm_device *dev, DRM_ERROR("Surface size cannot exceed %dx%d\n", dev_priv->texture_max_width, dev_priv->texture_max_height); + ret = -EINVAL; goto err_out; } @@ -2225,7 +2197,7 @@ int vmw_kms_update_layout_ioctl(struct drm_device *dev, void *data, struct drm_mode_config *mode_config = &dev->mode_config; struct drm_vmw_update_layout_arg *arg = (struct drm_vmw_update_layout_arg *)data; - void __user *user_rects; + const void __user *user_rects; struct drm_vmw_rect *rects; struct drm_rect *drm_rects; unsigned rects_size; @@ -2237,6 +2209,8 @@ int vmw_kms_update_layout_ioctl(struct drm_device *dev, void *data, VMWGFX_MIN_INITIAL_HEIGHT}; vmw_du_update_layout(dev_priv, 1, &def_rect); return 0; + } else if (arg->num_outputs > VMWGFX_NUM_DISPLAY_UNITS) { + return -E2BIG; } rects_size = arg->num_outputs * sizeof(struct drm_vmw_rect); diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h index 6141fadf81ef..2a6c6d6581e0 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h @@ -199,9 +199,6 @@ struct vmw_kms_dirty { s32 unit_y2; }; -#define VMWGFX_NUM_DISPLAY_UNITS 8 - - #define vmw_framebuffer_to_vfb(x) \ container_of(x, struct vmw_framebuffer, base) #define vmw_framebuffer_to_vfbs(x) \ diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c index fab155a68054..82d18b88f4a7 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c @@ -886,6 +886,10 @@ static int vmw_stdu_connector_atomic_check(struct drm_connector *conn, struct drm_crtc_state *new_crtc_state; conn_state = drm_atomic_get_connector_state(state, conn); + + if (IS_ERR(conn_state)) + return PTR_ERR(conn_state); + du = vmw_connector_to_stdu(conn); if (!conn_state->crtc) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c index 1625b30d9970..5721c74da3e0 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -2276,9 +2276,12 @@ int vmw_dumb_create(struct drm_file *file_priv, const struct SVGA3dSurfaceDesc *desc = vmw_surface_get_desc(format); SVGA3dSurfaceAllFlags flags = SVGA3D_SURFACE_HINT_TEXTURE | SVGA3D_SURFACE_HINT_RENDERTARGET | - SVGA3D_SURFACE_SCREENTARGET | - SVGA3D_SURFACE_BIND_SHADER_RESOURCE | - SVGA3D_SURFACE_BIND_RENDER_TARGET; + SVGA3D_SURFACE_SCREENTARGET; + + if (vmw_surface_is_dx_screen_target_format(format)) { + flags |= SVGA3D_SURFACE_BIND_SHADER_RESOURCE | + SVGA3D_SURFACE_BIND_RENDER_TARGET; + } /* * Without mob support we're just going to use raw memory buffer diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index ac9c437e103d..00ad34ed73a5 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -393,9 +393,6 @@ #define XE2_GLOBAL_INVAL XE_REG(0xb404) -#define SCRATCH1LPFC XE_REG(0xb474) -#define EN_L3_RW_CCS_CACHE_FLUSH REG_BIT(0) - #define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604) #define XE2LPM_L3SQCREG3 XE_REG_MCR(0xb608) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index 5a63d135ba96..0a9ffc19e92f 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -980,13 +980,13 @@ void xe_device_declare_wedged(struct xe_device *xe) return; } + xe_pm_runtime_get_noresume(xe); + if (drmm_add_action_or_reset(&xe->drm, xe_device_wedged_fini, xe)) { drm_err(&xe->drm, "Failed to register xe_device_wedged_fini clean-up. Although device is wedged.\n"); return; } - xe_pm_runtime_get_noresume(xe); - if (!atomic_xchg(&xe->wedged.flag, 1)) { xe->needs_flr_on_fini = true; drm_err(&xe->drm, diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c index 7b38485817dc..f23ac1e2ed88 100644 --- a/drivers/gpu/drm/xe/xe_exec.c +++ b/drivers/gpu/drm/xe/xe_exec.c @@ -41,11 +41,6 @@ * user knows an exec writes to a BO and reads from the BO in the next exec, it * is the user's responsibility to pass in / out fence between the two execs). * - * Implicit dependencies for external BOs are handled by using the dma-buf - * implicit dependency uAPI (TODO: add link). To make this works each exec must - * install the job's fence into the DMA_RESV_USAGE_WRITE slot of every external - * BO mapped in the VM. - * * We do not allow a user to trigger a bind at exec time rather we have a VM * bind IOCTL which uses the same in / out fence interface as exec. In that * sense, a VM bind is basically the same operation as an exec from the user @@ -59,8 +54,8 @@ * behind any pending kernel operations on any external BOs in VM or any BOs * private to the VM. This is accomplished by the rebinds waiting on BOs * DMA_RESV_USAGE_KERNEL slot (kernel ops) and kernel ops waiting on all BOs - * slots (inflight execs are in the DMA_RESV_USAGE_BOOKING for private BOs and - * in DMA_RESV_USAGE_WRITE for external BOs). + * slots (inflight execs are in the DMA_RESV_USAGE_BOOKKEEP for private BOs and + * for external BOs). * * Rebinds / dma-resv usage applies to non-compute mode VMs only as for compute * mode VMs we use preempt fences and a rebind worker (TODO: add link). @@ -304,7 +299,8 @@ retry: xe_sched_job_arm(job); if (!xe_vm_in_lr_mode(vm)) drm_gpuvm_resv_add_fence(&vm->gpuvm, exec, &job->drm.s_fence->finished, - DMA_RESV_USAGE_BOOKKEEP, DMA_RESV_USAGE_WRITE); + DMA_RESV_USAGE_BOOKKEEP, + DMA_RESV_USAGE_BOOKKEEP); for (i = 0; i < num_syncs; i++) { xe_sync_entry_signal(&syncs[i], &job->drm.s_fence->finished); diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h index 5ad5629a6c60..64b2ae6839db 100644 --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h @@ -63,7 +63,9 @@ xe_sched_invalidate_job(struct xe_sched_job *job, int threshold) static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched, struct xe_sched_job *job) { + spin_lock(&sched->base.job_list_lock); list_add(&job->drm.list, &sched->base.pending_list); + spin_unlock(&sched->base.job_list_lock); } static inline diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index ea65cf59372c..d5fd6a089b7c 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -108,7 +108,6 @@ static void xe_gt_enable_host_l2_vram(struct xe_gt *gt) return; if (!xe_gt_is_media_type(gt)) { - xe_mmio_write32(gt, SCRATCH1LPFC, EN_L3_RW_CCS_CACHE_FLUSH); reg = xe_gt_mcr_unicast_read_any(gt, XE2_GAMREQSTRM_CTRL); reg |= CG_DIS_CNTLBUS; xe_gt_mcr_multicast_write(gt, XE2_GAMREQSTRM_CTRL, reg); diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c index cca9cf536f76..bbb9e411d21f 100644 --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c @@ -37,6 +37,15 @@ static long tlb_timeout_jiffies(struct xe_gt *gt) return hw_tlb_timeout + 2 * delay; } +static void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence) +{ + if (WARN_ON_ONCE(!fence->gt)) + return; + + xe_pm_runtime_put(gt_to_xe(fence->gt)); + fence->gt = NULL; /* fini() should be called once */ +} + static void __invalidation_fence_signal(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence) { @@ -204,7 +213,7 @@ static int send_tlb_invalidation(struct xe_guc *guc, tlb_timeout_jiffies(gt)); } spin_unlock_irq(>->tlb_invalidation.pending_lock); - } else if (ret < 0) { + } else { __invalidation_fence_signal(xe, fence); } if (!ret) { @@ -267,10 +276,8 @@ int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt) xe_gt_tlb_invalidation_fence_init(gt, &fence, true); ret = xe_gt_tlb_invalidation_guc(gt, &fence); - if (ret < 0) { - xe_gt_tlb_invalidation_fence_fini(&fence); + if (ret) return ret; - } xe_gt_tlb_invalidation_fence_wait(&fence); } else if (xe_device_uc_enabled(xe) && !xe_device_wedged(xe)) { @@ -496,7 +503,8 @@ static const struct dma_fence_ops invalidation_fence_ops = { * @stack: fence is stack variable * * Initialize TLB invalidation fence for use. xe_gt_tlb_invalidation_fence_fini - * must be called if fence is not signaled. + * will be automatically called when fence is signalled (all fences must signal), + * even on error. */ void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, struct xe_gt_tlb_invalidation_fence *fence, @@ -516,14 +524,3 @@ void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, dma_fence_get(&fence->base); fence->gt = gt; } - -/** - * xe_gt_tlb_invalidation_fence_fini - Finalize TLB invalidation fence - * @fence: TLB invalidation fence to finalize - * - * Drop PM ref which fence took durinig init. - */ -void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence) -{ - xe_pm_runtime_put(gt_to_xe(fence->gt)); -} diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h index a84065fa324c..f430d5797af7 100644 --- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h +++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h @@ -28,7 +28,6 @@ int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len); void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, struct xe_gt_tlb_invalidation_fence *fence, bool stack); -void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence); static inline void xe_gt_tlb_invalidation_fence_wait(struct xe_gt_tlb_invalidation_fence *fence) diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 63495007f336..8a9254e5af6e 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -1030,10 +1030,13 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job) /* * TDR has fired before free job worker. Common if exec queue - * immediately closed after last fence signaled. + * immediately closed after last fence signaled. Add back to pending + * list so job can be freed and kick scheduler ensuring free job is not + * lost. */ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) { - guc_exec_queue_free_job(drm_job); + xe_sched_add_pending_job(sched, job); + xe_sched_submission_start(sched); return DRM_GPU_SCHED_STAT_NOMINAL; } diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c index 28d9bb3b825d..848da8e68c7a 100644 --- a/drivers/gpu/drm/xe/xe_query.c +++ b/drivers/gpu/drm/xe/xe_query.c @@ -161,7 +161,11 @@ query_engine_cycles(struct xe_device *xe, cpu_clock); xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL); - resp.width = 36; + + if (GRAPHICS_VER(xe) >= 20) + resp.width = 64; + else + resp.width = 36; /* Only write to the output fields of user query */ if (put_user(resp.cpu_timestamp, &query_ptr->cpu_timestamp)) diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c index bb3c2a830362..c6cf227ead40 100644 --- a/drivers/gpu/drm/xe/xe_sync.c +++ b/drivers/gpu/drm/xe/xe_sync.c @@ -58,7 +58,7 @@ static struct xe_user_fence *user_fence_create(struct xe_device *xe, u64 addr, if (!access_ok(ptr, sizeof(*ptr))) return ERR_PTR(-EFAULT); - ufence = kmalloc(sizeof(*ufence), GFP_KERNEL); + ufence = kzalloc(sizeof(*ufence), GFP_KERNEL); if (!ufence) return ERR_PTR(-ENOMEM); diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index ce9dca4d4e87..c99380271de6 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3199,10 +3199,8 @@ int xe_vm_invalidate_vma(struct xe_vma *vma) ret = xe_gt_tlb_invalidation_vma(tile->primary_gt, &fence[fence_id], vma); - if (ret < 0) { - xe_gt_tlb_invalidation_fence_fini(&fence[fence_id]); + if (ret) goto wait; - } ++fence_id; if (!tile->media_gt) @@ -3214,10 +3212,8 @@ int xe_vm_invalidate_vma(struct xe_vma *vma) ret = xe_gt_tlb_invalidation_vma(tile->media_gt, &fence[fence_id], vma); - if (ret < 0) { - xe_gt_tlb_invalidation_fence_fini(&fence[fence_id]); + if (ret) goto wait; - } ++fence_id; } } diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index d424992514a4..353936a0f877 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -710,6 +710,10 @@ static const struct xe_rtp_entry_sr lrc_was[] = { DIS_PARTIAL_AUTOSTRIP | DIS_AUTOSTRIP)) }, + { XE_RTP_NAME("15016589081"), + XE_RTP_RULES(GRAPHICS_VERSION(2004), ENGINE_CLASS(RENDER)), + XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX)) + }, /* Xe2_HPG */ { XE_RTP_NAME("15010599737"), diff --git a/drivers/gpu/drm/xe/xe_wait_user_fence.c b/drivers/gpu/drm/xe/xe_wait_user_fence.c index d46fa8374980..f5deb81eba01 100644 --- a/drivers/gpu/drm/xe/xe_wait_user_fence.c +++ b/drivers/gpu/drm/xe/xe_wait_user_fence.c @@ -169,9 +169,6 @@ int xe_wait_user_fence_ioctl(struct drm_device *dev, void *data, args->timeout = 0; } - if (!timeout && !(err < 0)) - err = -ETIME; - if (q) xe_exec_queue_put(q); diff --git a/drivers/gpu/host1x/context.c b/drivers/gpu/host1x/context.c index 955c971c528d..a6f6779662a3 100644 --- a/drivers/gpu/host1x/context.c +++ b/drivers/gpu/host1x/context.c @@ -58,6 +58,7 @@ int host1x_memory_context_list_init(struct host1x *host1x) ctx->dev.parent = host1x->dev; ctx->dev.release = host1x_memory_context_release; + ctx->dev.dma_parms = &ctx->dma_parms; dma_set_max_seg_size(&ctx->dev, UINT_MAX); err = device_add(&ctx->dev); diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index b62e4f0e8130..e98528777faa 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -625,12 +625,6 @@ static int host1x_probe(struct platform_device *pdev) goto free_contexts; } - err = host1x_intr_init(host); - if (err) { - dev_err(&pdev->dev, "failed to initialize interrupts\n"); - goto deinit_syncpt; - } - pm_runtime_enable(&pdev->dev); err = devm_tegra_core_dev_init_opp_table_common(&pdev->dev); @@ -642,6 +636,12 @@ static int host1x_probe(struct platform_device *pdev) if (err) goto pm_disable; + err = host1x_intr_init(host); + if (err) { + dev_err(&pdev->dev, "failed to initialize interrupts\n"); + goto pm_put; + } + host1x_debug_init(host); err = host1x_register(host); @@ -658,13 +658,11 @@ unregister: host1x_unregister(host); deinit_debugfs: host1x_debug_deinit(host); - + host1x_intr_deinit(host); +pm_put: pm_runtime_put_sync_suspend(&pdev->dev); pm_disable: pm_runtime_disable(&pdev->dev); - - host1x_intr_deinit(host); -deinit_syncpt: host1x_syncpt_deinit(host); free_contexts: host1x_memory_context_list_free(&host->context_list); diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 8a991b30e3c6..92cff3f2658c 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -509,6 +509,7 @@ #define I2C_DEVICE_ID_GOODIX_01E8 0x01e8 #define I2C_DEVICE_ID_GOODIX_01E9 0x01e9 #define I2C_DEVICE_ID_GOODIX_01F0 0x01f0 +#define I2C_DEVICE_ID_GOODIX_0D42 0x0d42 #define USB_VENDOR_ID_GOODTOUCH 0x1aad #define USB_DEVICE_ID_GOODTOUCH_000f 0x000f @@ -868,6 +869,7 @@ #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1 0xc539 #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_1 0xc53f #define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_POWERPLAY 0xc53a +#define USB_DEVICE_ID_LOGITECH_BOLT_RECEIVER 0xc548 #define USB_DEVICE_ID_SPACETRAVELLER 0xc623 #define USB_DEVICE_ID_SPACENAVIGATOR 0xc626 #define USB_DEVICE_ID_DINOVO_DESKTOP 0xc704 diff --git a/drivers/hid/hid-lenovo.c b/drivers/hid/hid-lenovo.c index 3b0c779ce8f7..f66194fde891 100644 --- a/drivers/hid/hid-lenovo.c +++ b/drivers/hid/hid-lenovo.c @@ -473,6 +473,7 @@ static int lenovo_input_mapping(struct hid_device *hdev, return lenovo_input_mapping_tp10_ultrabook_kbd(hdev, hi, field, usage, bit, max); case USB_DEVICE_ID_LENOVO_X1_TAB: + case USB_DEVICE_ID_LENOVO_X1_TAB3: return lenovo_input_mapping_x1_tab_kbd(hdev, hi, field, usage, bit, max); default: return 0; @@ -583,6 +584,7 @@ static ssize_t attr_fn_lock_store(struct device *dev, break; case USB_DEVICE_ID_LENOVO_TP10UBKBD: case USB_DEVICE_ID_LENOVO_X1_TAB: + case USB_DEVICE_ID_LENOVO_X1_TAB3: ret = lenovo_led_set_tp10ubkbd(hdev, TP10UBKBD_FN_LOCK_LED, value); if (ret) return ret; @@ -776,6 +778,7 @@ static int lenovo_event(struct hid_device *hdev, struct hid_field *field, return lenovo_event_cptkbd(hdev, field, usage, value); case USB_DEVICE_ID_LENOVO_TP10UBKBD: case USB_DEVICE_ID_LENOVO_X1_TAB: + case USB_DEVICE_ID_LENOVO_X1_TAB3: return lenovo_event_tp10ubkbd(hdev, field, usage, value); default: return 0; @@ -1056,6 +1059,7 @@ static int lenovo_led_brightness_set(struct led_classdev *led_cdev, break; case USB_DEVICE_ID_LENOVO_TP10UBKBD: case USB_DEVICE_ID_LENOVO_X1_TAB: + case USB_DEVICE_ID_LENOVO_X1_TAB3: ret = lenovo_led_set_tp10ubkbd(hdev, tp10ubkbd_led[led_nr], value); break; } @@ -1286,6 +1290,7 @@ static int lenovo_probe(struct hid_device *hdev, break; case USB_DEVICE_ID_LENOVO_TP10UBKBD: case USB_DEVICE_ID_LENOVO_X1_TAB: + case USB_DEVICE_ID_LENOVO_X1_TAB3: ret = lenovo_probe_tp10ubkbd(hdev); break; default: @@ -1372,6 +1377,7 @@ static void lenovo_remove(struct hid_device *hdev) break; case USB_DEVICE_ID_LENOVO_TP10UBKBD: case USB_DEVICE_ID_LENOVO_X1_TAB: + case USB_DEVICE_ID_LENOVO_X1_TAB3: lenovo_remove_tp10ubkbd(hdev); break; } @@ -1421,6 +1427,8 @@ static const struct hid_device_id lenovo_devices[] = { */ { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_X1_TAB) }, + { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, + USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_X1_TAB3) }, { } }; diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 52004ae76de9..e936019d21fe 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -2146,6 +2146,10 @@ static const struct hid_device_id mt_devices[] = { HID_DEVICE(BUS_BLUETOOTH, HID_GROUP_MULTITOUCH_WIN_8, USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_CASA_TOUCHPAD) }, + { .driver_data = MT_CLS_WIN_8_FORCE_MULTI_INPUT_NSMU, + HID_DEVICE(BUS_USB, HID_GROUP_MULTITOUCH_WIN_8, + USB_VENDOR_ID_LOGITECH, + USB_DEVICE_ID_LOGITECH_BOLT_RECEIVER) }, /* MosArt panels */ { .driver_data = MT_CLS_CONFIDENCE_MINUS_ONE, diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index be5d342d5d13..43664a24176f 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -50,6 +50,7 @@ #define I2C_HID_QUIRK_BAD_INPUT_SIZE BIT(3) #define I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET BIT(4) #define I2C_HID_QUIRK_NO_SLEEP_ON_SUSPEND BIT(5) +#define I2C_HID_QUIRK_DELAY_WAKEUP_AFTER_RESUME BIT(6) /* Command opcodes */ #define I2C_HID_OPCODE_RESET 0x01 @@ -140,6 +141,8 @@ static const struct i2c_hid_quirks { { USB_VENDOR_ID_ELAN, HID_ANY_ID, I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET | I2C_HID_QUIRK_BOGUS_IRQ }, + { I2C_VENDOR_ID_GOODIX, I2C_DEVICE_ID_GOODIX_0D42, + I2C_HID_QUIRK_DELAY_WAKEUP_AFTER_RESUME }, { 0, 0 } }; @@ -981,6 +984,13 @@ static int i2c_hid_core_resume(struct i2c_hid *ihid) return -ENXIO; } + /* On Goodix 27c6:0d42 wait extra time before device wakeup. + * It's not clear why but if we send wakeup too early, the device will + * never trigger input interrupts. + */ + if (ihid->quirks & I2C_HID_QUIRK_DELAY_WAKEUP_AFTER_RESUME) + msleep(1500); + /* Instead of resetting device, simply powers the device on. This * solves "incomplete reports" on Raydium devices 2386:3118 and * 2386:4B33 and fixes various SIS touchscreens no longer sending diff --git a/drivers/hwmon/jc42.c b/drivers/hwmon/jc42.c index a260cff750a5..c459dce496a6 100644 --- a/drivers/hwmon/jc42.c +++ b/drivers/hwmon/jc42.c @@ -417,7 +417,7 @@ static int jc42_detect(struct i2c_client *client, struct i2c_board_info *info) return -ENODEV; if ((devid & TSE2004_DEVID_MASK) == TSE2004_DEVID && - (cap & 0x00e7) != 0x00e7) + (cap & 0x0062) != 0x0062) return -ENODEV; for (i = 0; i < ARRAY_SIZE(jc42_chips); i++) { diff --git a/drivers/iio/accel/Kconfig b/drivers/iio/accel/Kconfig index 516c1a8e4d56..8c3f7cf55d5f 100644 --- a/drivers/iio/accel/Kconfig +++ b/drivers/iio/accel/Kconfig @@ -447,6 +447,8 @@ config IIO_ST_ACCEL_SPI_3AXIS config IIO_KX022A tristate + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER config IIO_KX022A_SPI tristate "Kionix KX022A tri-axis digital accelerometer SPI interface" diff --git a/drivers/iio/accel/bma400_core.c b/drivers/iio/accel/bma400_core.c index f4e43c3bbf1a..e4fe36768216 100644 --- a/drivers/iio/accel/bma400_core.c +++ b/drivers/iio/accel/bma400_core.c @@ -1218,7 +1218,8 @@ static int bma400_activity_event_en(struct bma400_data *data, static int bma400_tap_event_en(struct bma400_data *data, enum iio_event_direction dir, int state) { - unsigned int mask, field_value; + unsigned int mask; + unsigned int field_value = 0; int ret; /* diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig index 97ece1a4b7e3..6c4e74420fd2 100644 --- a/drivers/iio/adc/Kconfig +++ b/drivers/iio/adc/Kconfig @@ -52,6 +52,8 @@ config AD4695 tristate "Analog Device AD4695 ADC Driver" depends on SPI select REGMAP_SPI + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for Analog Devices AD4695 and similar analog to digital converters (ADC). @@ -328,6 +330,8 @@ config AD7923 config AD7944 tristate "Analog Devices AD7944 and similar ADCs driver" depends on SPI + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for Analog Devices AD7944, AD7985, AD7986 ADCs. @@ -1481,6 +1485,8 @@ config TI_ADS8344 config TI_ADS8688 tristate "Texas Instruments ADS8688" depends on SPI + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help If you say yes here you get support for Texas Instruments ADS8684 and and ADS8688 ADC chips @@ -1491,6 +1497,8 @@ config TI_ADS8688 config TI_ADS124S08 tristate "Texas Instruments ADS124S08" depends on SPI + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help If you say yes here you get support for Texas Instruments ADS124S08 and ADS124S06 ADC chips @@ -1525,6 +1533,9 @@ config TI_AM335X_ADC config TI_LMP92064 tristate "Texas Instruments LMP92064 ADC driver" depends on SPI + select REGMAP_SPI + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for the LMP92064 Precision Current and Voltage sensor. diff --git a/drivers/iio/amplifiers/Kconfig b/drivers/iio/amplifiers/Kconfig index b54fe01734b0..55eb16b32f6c 100644 --- a/drivers/iio/amplifiers/Kconfig +++ b/drivers/iio/amplifiers/Kconfig @@ -27,6 +27,7 @@ config AD8366 config ADA4250 tristate "Analog Devices ADA4250 Instrumentation Amplifier" depends on SPI + select REGMAP_SPI help Say yes here to build support for Analog Devices ADA4250 SPI Amplifier's support. The driver provides direct access via diff --git a/drivers/iio/chemical/Kconfig b/drivers/iio/chemical/Kconfig index 678a6adb9a75..6c87223f58d9 100644 --- a/drivers/iio/chemical/Kconfig +++ b/drivers/iio/chemical/Kconfig @@ -80,6 +80,8 @@ config ENS160 tristate "ScioSense ENS160 sensor driver" depends on (I2C || SPI) select REGMAP + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER select ENS160_I2C if I2C select ENS160_SPI if SPI help diff --git a/drivers/iio/common/hid-sensors/hid-sensor-trigger.c b/drivers/iio/common/hid-sensors/hid-sensor-trigger.c index ad8910e6ad59..abb09fefc792 100644 --- a/drivers/iio/common/hid-sensors/hid-sensor-trigger.c +++ b/drivers/iio/common/hid-sensors/hid-sensor-trigger.c @@ -32,7 +32,7 @@ static ssize_t _hid_sensor_set_report_latency(struct device *dev, latency = integer * 1000 + fract / 1000; ret = hid_sensor_set_report_latency(attrb, latency); if (ret < 0) - return len; + return ret; attrb->latency_ms = hid_sensor_get_report_latency(attrb); diff --git a/drivers/iio/dac/Kconfig b/drivers/iio/dac/Kconfig index 1cfd7e2a622f..45e337c6d256 100644 --- a/drivers/iio/dac/Kconfig +++ b/drivers/iio/dac/Kconfig @@ -9,6 +9,8 @@ menu "Digital to analog converters" config AD3552R tristate "Analog Devices AD3552R DAC driver" depends on SPI_MASTER + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for Analog Devices AD3552R Digital to Analog Converter. @@ -252,6 +254,8 @@ config AD5764 config AD5766 tristate "Analog Devices AD5766/AD5767 DAC driver" depends on SPI_MASTER + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for Analog Devices AD5766, AD5767 Digital to Analog Converter. @@ -262,6 +266,7 @@ config AD5766 config AD5770R tristate "Analog Devices AD5770R IDAC driver" depends on SPI_MASTER + select REGMAP_SPI help Say yes here to build support for Analog Devices AD5770R Digital to Analog Converter. @@ -353,6 +358,7 @@ config LPC18XX_DAC config LTC1660 tristate "Linear Technology LTC1660/LTC1665 DAC SPI driver" depends on SPI + select REGMAP_SPI help Say yes here to build support for Linear Technology LTC1660 and LTC1665 Digital to Analog Converters. @@ -483,6 +489,7 @@ config STM32_DAC config STM32_DAC_CORE tristate + select REGMAP_MMIO config TI_DAC082S085 tristate "Texas Instruments 8/10/12-bit 2/4-channel DAC driver" diff --git a/drivers/iio/dac/ltc2664.c b/drivers/iio/dac/ltc2664.c index 5be5345ac5c8..67f14046cf77 100644 --- a/drivers/iio/dac/ltc2664.c +++ b/drivers/iio/dac/ltc2664.c @@ -516,7 +516,7 @@ static int ltc2664_channel_config(struct ltc2664_state *st) const struct ltc2664_chip_info *chip_info = st->chip_info; struct device *dev = &st->spi->dev; u32 reg, tmp[2], mspan; - int ret, span = 0; + int ret; mspan = LTC2664_MSPAN_SOFTSPAN; ret = device_property_read_u32(dev, "adi,manual-span-operation-config", @@ -579,20 +579,21 @@ static int ltc2664_channel_config(struct ltc2664_state *st) ret = fwnode_property_read_u32_array(child, "output-range-microvolt", tmp, ARRAY_SIZE(tmp)); if (!ret && mspan == LTC2664_MSPAN_SOFTSPAN) { - chan->span = ltc2664_set_span(st, tmp[0] / 1000, - tmp[1] / 1000, reg); - if (span < 0) - return dev_err_probe(dev, span, + ret = ltc2664_set_span(st, tmp[0] / 1000, tmp[1] / 1000, reg); + if (ret < 0) + return dev_err_probe(dev, ret, "Failed to set span\n"); + chan->span = ret; } ret = fwnode_property_read_u32_array(child, "output-range-microamp", tmp, ARRAY_SIZE(tmp)); if (!ret) { - chan->span = ltc2664_set_span(st, 0, tmp[1] / 1000, reg); - if (span < 0) - return dev_err_probe(dev, span, + ret = ltc2664_set_span(st, 0, tmp[1] / 1000, reg); + if (ret < 0) + return dev_err_probe(dev, ret, "Failed to set span\n"); + chan->span = ret; } } diff --git a/drivers/iio/frequency/Kconfig b/drivers/iio/frequency/Kconfig index c455be7d4a1c..583cbdf4e8cd 100644 --- a/drivers/iio/frequency/Kconfig +++ b/drivers/iio/frequency/Kconfig @@ -53,6 +53,7 @@ config ADF4371 config ADF4377 tristate "Analog Devices ADF4377 Microwave Wideband Synthesizer" depends on SPI && COMMON_CLK + select REGMAP_SPI help Say yes here to build support for Analog Devices ADF4377 Microwave Wideband Synthesizer. @@ -91,25 +92,26 @@ config ADMV1014 module will be called admv1014. config ADMV4420 - tristate "Analog Devices ADMV4420 K Band Downconverter" - depends on SPI - help - Say yes here to build support for Analog Devices K Band - Downconverter with integrated Fractional-N PLL and VCO. + tristate "Analog Devices ADMV4420 K Band Downconverter" + depends on SPI + select REGMAP_SPI + help + Say yes here to build support for Analog Devices K Band + Downconverter with integrated Fractional-N PLL and VCO. - To compile this driver as a module, choose M here: the - module will be called admv4420. + To compile this driver as a module, choose M here: the + module will be called admv4420. config ADRF6780 - tristate "Analog Devices ADRF6780 Microwave Upconverter" - depends on SPI - depends on COMMON_CLK - help - Say yes here to build support for Analog Devices ADRF6780 - 5.9 GHz to 23.6 GHz, Wideband, Microwave Upconverter. - - To compile this driver as a module, choose M here: the - module will be called adrf6780. + tristate "Analog Devices ADRF6780 Microwave Upconverter" + depends on SPI + depends on COMMON_CLK + help + Say yes here to build support for Analog Devices ADRF6780 + 5.9 GHz to 23.6 GHz, Wideband, Microwave Upconverter. + + To compile this driver as a module, choose M here: the + module will be called adrf6780. endmenu endmenu diff --git a/drivers/iio/imu/bmi323/bmi323_core.c b/drivers/iio/imu/bmi323/bmi323_core.c index beda8d2de53f..e1f3b0d778be 100644 --- a/drivers/iio/imu/bmi323/bmi323_core.c +++ b/drivers/iio/imu/bmi323/bmi323_core.c @@ -2172,7 +2172,6 @@ int bmi323_core_probe(struct device *dev) } EXPORT_SYMBOL_NS_GPL(bmi323_core_probe, IIO_BMI323); -#if defined(CONFIG_PM) static int bmi323_core_runtime_suspend(struct device *dev) { struct iio_dev *indio_dev = dev_get_drvdata(dev); @@ -2199,12 +2198,12 @@ static int bmi323_core_runtime_suspend(struct device *dev) } for (unsigned int i = 0; i < ARRAY_SIZE(bmi323_ext_reg_savestate); i++) { - ret = bmi323_read_ext_reg(data, bmi323_reg_savestate[i], - &savestate->reg_settings[i]); + ret = bmi323_read_ext_reg(data, bmi323_ext_reg_savestate[i], + &savestate->ext_reg_settings[i]); if (ret) { dev_err(data->dev, "Error reading bmi323 external reg 0x%x: %d\n", - bmi323_reg_savestate[i], ret); + bmi323_ext_reg_savestate[i], ret); return ret; } } @@ -2232,8 +2231,10 @@ static int bmi323_core_runtime_resume(struct device *dev) * after being reset in the lower power state by runtime-pm. */ ret = bmi323_init(data); - if (!ret) + if (ret) { + dev_err(data->dev, "Device power-on and init failed: %d", ret); return ret; + } /* Register must be cleared before changing an active config */ ret = regmap_write(data->regmap, BMI323_FEAT_IO0_REG, 0); @@ -2243,12 +2244,12 @@ static int bmi323_core_runtime_resume(struct device *dev) } for (unsigned int i = 0; i < ARRAY_SIZE(bmi323_ext_reg_savestate); i++) { - ret = bmi323_write_ext_reg(data, bmi323_reg_savestate[i], - savestate->reg_settings[i]); + ret = bmi323_write_ext_reg(data, bmi323_ext_reg_savestate[i], + savestate->ext_reg_settings[i]); if (ret) { dev_err(data->dev, "Error writing bmi323 external reg 0x%x: %d\n", - bmi323_reg_savestate[i], ret); + bmi323_ext_reg_savestate[i], ret); return ret; } } @@ -2293,11 +2294,9 @@ static int bmi323_core_runtime_resume(struct device *dev) return iio_device_resume_triggering(indio_dev); } -#endif - const struct dev_pm_ops bmi323_core_pm_ops = { - SET_RUNTIME_PM_OPS(bmi323_core_runtime_suspend, - bmi323_core_runtime_resume, NULL) + RUNTIME_PM_OPS(bmi323_core_runtime_suspend, + bmi323_core_runtime_resume, NULL) }; EXPORT_SYMBOL_NS_GPL(bmi323_core_pm_ops, IIO_BMI323); diff --git a/drivers/iio/light/Kconfig b/drivers/iio/light/Kconfig index 515ff46b5b82..f2f3e414849a 100644 --- a/drivers/iio/light/Kconfig +++ b/drivers/iio/light/Kconfig @@ -335,6 +335,8 @@ config ROHM_BU27008 depends on I2C select REGMAP_I2C select IIO_GTS_HELPER + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Enable support for the ROHM BU27008 color sensor. The ROHM BU27008 is a sensor with 5 photodiodes (red, green, diff --git a/drivers/iio/light/opt3001.c b/drivers/iio/light/opt3001.c index 887c4b776a86..176e54bb48c3 100644 --- a/drivers/iio/light/opt3001.c +++ b/drivers/iio/light/opt3001.c @@ -139,6 +139,10 @@ static const struct opt3001_scale opt3001_scales[] = { .val2 = 400000, }, { + .val = 41932, + .val2 = 800000, + }, + { .val = 83865, .val2 = 600000, }, diff --git a/drivers/iio/light/veml6030.c b/drivers/iio/light/veml6030.c index 2e86d310952e..9630de1c578e 100644 --- a/drivers/iio/light/veml6030.c +++ b/drivers/iio/light/veml6030.c @@ -99,9 +99,8 @@ static const char * const period_values[] = { static ssize_t in_illuminance_period_available_show(struct device *dev, struct device_attribute *attr, char *buf) { + struct veml6030_data *data = iio_priv(dev_to_iio_dev(dev)); int ret, reg, x; - struct iio_dev *indio_dev = i2c_get_clientdata(to_i2c_client(dev)); - struct veml6030_data *data = iio_priv(indio_dev); ret = regmap_read(data->regmap, VEML6030_REG_ALS_CONF, ®); if (ret) { @@ -780,7 +779,7 @@ static int veml6030_hw_init(struct iio_dev *indio_dev) /* Cache currently active measurement parameters */ data->cur_gain = 3; - data->cur_resolution = 4608; + data->cur_resolution = 5376; data->cur_integration_time = 3; return ret; diff --git a/drivers/iio/magnetometer/Kconfig b/drivers/iio/magnetometer/Kconfig index 8eb718f5e50f..f69ac75500f9 100644 --- a/drivers/iio/magnetometer/Kconfig +++ b/drivers/iio/magnetometer/Kconfig @@ -11,6 +11,8 @@ config AF8133J depends on I2C depends on OF select REGMAP_I2C + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for Voltafield AF8133J I2C-based 3-axis magnetometer chip. diff --git a/drivers/iio/pressure/Kconfig b/drivers/iio/pressure/Kconfig index ce369dbb17fc..d2cb8c871f6a 100644 --- a/drivers/iio/pressure/Kconfig +++ b/drivers/iio/pressure/Kconfig @@ -19,6 +19,9 @@ config ABP060MG config ROHM_BM1390 tristate "ROHM BM1390GLV-Z pressure sensor driver" depends on I2C + select REGMAP_I2C + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Support for the ROHM BM1390 pressure sensor. The BM1390GLV-Z can measure pressures ranging from 300 hPa to 1300 hPa with @@ -253,6 +256,7 @@ config MS5637 config SDP500 tristate "Sensirion SDP500 differential pressure sensor I2C driver" depends on I2C + select CRC8 help Say Y here to build support for Sensirion SDP500 differential pressure sensor I2C driver. diff --git a/drivers/iio/proximity/Kconfig b/drivers/iio/proximity/Kconfig index 31c679074b25..a562a78b7d0d 100644 --- a/drivers/iio/proximity/Kconfig +++ b/drivers/iio/proximity/Kconfig @@ -86,6 +86,8 @@ config LIDAR_LITE_V2 config MB1232 tristate "MaxSonar I2CXL family ultrasonic sensors" depends on I2C + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say Y to build a driver for the ultrasonic sensors I2CXL of MaxBotix which have an i2c interface. It can be used to measure diff --git a/drivers/iio/resolver/Kconfig b/drivers/iio/resolver/Kconfig index 424529d36080..de2dee3832a1 100644 --- a/drivers/iio/resolver/Kconfig +++ b/drivers/iio/resolver/Kconfig @@ -31,6 +31,9 @@ config AD2S1210 depends on SPI depends on COMMON_CLK depends on GPIOLIB || COMPILE_TEST + select REGMAP + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say yes here to build support for Analog Devices spi resolver to digital converters, ad2s1210, provides direct access via sysfs. diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c index 4eda18f4f46e..22ea58bf76cb 100644 --- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -218,6 +218,7 @@ static const struct xpad_device { { 0x0c12, 0x8810, "Zeroplus Xbox Controller", 0, XTYPE_XBOX }, { 0x0c12, 0x9902, "HAMA VibraX - *FAULTY HARDWARE*", 0, XTYPE_XBOX }, { 0x0d2f, 0x0002, "Andamiro Pump It Up pad", MAP_DPAD_TO_BUTTONS, XTYPE_XBOX }, + { 0x0db0, 0x1901, "Micro Star International Xbox360 Controller for Windows", 0, XTYPE_XBOX360 }, { 0x0e4c, 0x1097, "Radica Gamester Controller", 0, XTYPE_XBOX }, { 0x0e4c, 0x1103, "Radica Gamester Reflex", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOX }, { 0x0e4c, 0x2390, "Radica Games Jtech Controller", 0, XTYPE_XBOX }, @@ -373,6 +374,7 @@ static const struct xpad_device { { 0x294b, 0x3404, "Snakebyte GAMEPAD RGB X", 0, XTYPE_XBOXONE }, { 0x2dc8, 0x2000, "8BitDo Pro 2 Wired Controller fox Xbox", 0, XTYPE_XBOXONE }, { 0x2dc8, 0x3106, "8BitDo Pro 2 Wired Controller", 0, XTYPE_XBOX360 }, + { 0x2dc8, 0x310a, "8BitDo Ultimate 2C Wireless Controller", 0, XTYPE_XBOX360 }, { 0x2e24, 0x0652, "Hyperkin Duke X-Box One pad", 0, XTYPE_XBOXONE }, { 0x31e3, 0x1100, "Wooting One", 0, XTYPE_XBOX360 }, { 0x31e3, 0x1200, "Wooting Two", 0, XTYPE_XBOX360 }, @@ -492,6 +494,7 @@ static const struct usb_device_id xpad_table[] = { XPAD_XBOX360_VENDOR(0x07ff), /* Mad Catz Gamepad */ XPAD_XBOXONE_VENDOR(0x0b05), /* ASUS controllers */ XPAD_XBOX360_VENDOR(0x0c12), /* Zeroplus X-Box 360 controllers */ + XPAD_XBOX360_VENDOR(0x0db0), /* Micro Star International X-Box 360 controllers */ XPAD_XBOX360_VENDOR(0x0e6f), /* 0x0e6f Xbox 360 controllers */ XPAD_XBOXONE_VENDOR(0x0e6f), /* 0x0e6f Xbox One controllers */ XPAD_XBOX360_VENDOR(0x0f0d), /* Hori controllers */ diff --git a/drivers/input/touchscreen/zinitix.c b/drivers/input/touchscreen/zinitix.c index 52b3950460e2..716d6fa60f86 100644 --- a/drivers/input/touchscreen/zinitix.c +++ b/drivers/input/touchscreen/zinitix.c @@ -645,19 +645,29 @@ static int zinitix_ts_probe(struct i2c_client *client) return error; } - bt541->num_keycodes = device_property_count_u32(&client->dev, "linux,keycodes"); - if (bt541->num_keycodes > ARRAY_SIZE(bt541->keycodes)) { - dev_err(&client->dev, "too many keys defined (%d)\n", bt541->num_keycodes); - return -EINVAL; - } + if (device_property_present(&client->dev, "linux,keycodes")) { + bt541->num_keycodes = device_property_count_u32(&client->dev, + "linux,keycodes"); + if (bt541->num_keycodes < 0) { + dev_err(&client->dev, "Failed to count keys (%d)\n", + bt541->num_keycodes); + return bt541->num_keycodes; + } else if (bt541->num_keycodes > ARRAY_SIZE(bt541->keycodes)) { + dev_err(&client->dev, "Too many keys defined (%d)\n", + bt541->num_keycodes); + return -EINVAL; + } - error = device_property_read_u32_array(&client->dev, "linux,keycodes", - bt541->keycodes, - bt541->num_keycodes); - if (error) { - dev_err(&client->dev, - "Unable to parse \"linux,keycodes\" property: %d\n", error); - return error; + error = device_property_read_u32_array(&client->dev, + "linux,keycodes", + bt541->keycodes, + bt541->num_keycodes); + if (error) { + dev_err(&client->dev, + "Unable to parse \"linux,keycodes\" property: %d\n", + error); + return error; + } } error = zinitix_init_input_dev(bt541); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 737c5b882355..353fea58cd31 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1420,7 +1420,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master) cd_table->s1fmt = STRTAB_STE_0_S1FMT_LINEAR; cd_table->linear.num_ents = max_contexts; - l1size = max_contexts * sizeof(struct arm_smmu_cd), + l1size = max_contexts * sizeof(struct arm_smmu_cd); cd_table->linear.table = dma_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma, GFP_KERNEL); @@ -3625,7 +3625,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu) u32 l1size; struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; unsigned int last_sid_idx = - arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); + arm_smmu_strtab_l1_idx((1ULL << smmu->sid_bits) - 1); /* Calculate the L1 size, capped to the SIDSIZE. */ cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES); diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c index 9dc772f2cbb2..99030e6b16e7 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c @@ -130,7 +130,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu) /* * Disable MMU-500's not-particularly-beneficial next-page - * prefetcher for the sake of errata #841119 and #826419. + * prefetcher for the sake of at least 5 known errata. */ for (i = 0; i < smmu->num_context_banks; ++i) { reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR); @@ -138,7 +138,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu) arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_ACTLR, reg); reg = arm_smmu_cb_read(smmu, i, ARM_SMMU_CB_ACTLR); if (reg & ARM_MMU500_ACTLR_CPRE) - dev_warn_once(smmu->dev, "Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK\n"); + dev_warn_once(smmu->dev, "Failed to disable prefetcher for errata workarounds, check SACR.CACHE_LOCK\n"); } return 0; diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 9f6b0780f2ef..e860bc9439a2 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3340,8 +3340,10 @@ static int domain_context_clear_one_cb(struct pci_dev *pdev, u16 alias, void *op */ static void domain_context_clear(struct device_domain_info *info) { - if (!dev_is_pci(info->dev)) + if (!dev_is_pci(info->dev)) { domain_context_clear_one(info, info->bus, info->devfn); + return; + } pci_for_each_dma_alias(to_pci_dev(info->dev), &domain_context_clear_one_cb, info); diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig index 341cd9ca5a05..d82bcab233a1 100644 --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -45,13 +45,6 @@ config ARM_GIC_V3_ITS select IRQ_MSI_LIB default ARM_GIC_V3 -config ARM_GIC_V3_ITS_PCI - bool - depends on ARM_GIC_V3_ITS - depends on PCI - depends on PCI_MSI - default ARM_GIC_V3_ITS - config ARM_GIC_V3_ITS_FSL_MC bool depends on ARM_GIC_V3_ITS diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index fdec478ba5e7..ab597e74ba08 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -797,8 +797,8 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its, its_encode_valid(cmd, desc->its_vmapp_cmd.valid); if (!desc->its_vmapp_cmd.valid) { + alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count); if (is_v4_1(its)) { - alloc = !atomic_dec_return(&desc->its_vmapp_cmd.vpe->vmapp_count); its_encode_alloc(cmd, alloc); /* * Unmapping a VPE is self-synchronizing on GICv4.1, @@ -817,13 +817,13 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its, its_encode_vpt_addr(cmd, vpt_addr); its_encode_vpt_size(cmd, LPI_NRBITS - 1); + alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count); + if (!is_v4_1(its)) goto out; vconf_addr = virt_to_phys(page_address(desc->its_vmapp_cmd.vpe->its_vm->vprop_page)); - alloc = !atomic_fetch_inc(&desc->its_vmapp_cmd.vpe->vmapp_count); - its_encode_alloc(cmd, alloc); /* @@ -3807,6 +3807,13 @@ static int its_vpe_set_affinity(struct irq_data *d, unsigned long flags; /* + * Check if we're racing against a VPE being destroyed, for + * which we don't want to allow a VMOVP. + */ + if (!atomic_read(&vpe->vmapp_count)) + return -EINVAL; + + /* * Changing affinity is mega expensive, so let's be as lazy as * we can and only do it if we really have to. Also, if mapped * into the proxy device, we need to move the doorbell @@ -4463,9 +4470,8 @@ static int its_vpe_init(struct its_vpe *vpe) raw_spin_lock_init(&vpe->vpe_lock); vpe->vpe_id = vpe_id; vpe->vpt_page = vpt_page; - if (gic_rdists->has_rvpeid) - atomic_set(&vpe->vmapp_count, 0); - else + atomic_set(&vpe->vmapp_count, 0); + if (!gic_rdists->has_rvpeid) vpe->vpe_proxy_event = -1; return 0; diff --git a/drivers/irqchip/irq-mscc-ocelot.c b/drivers/irqchip/irq-mscc-ocelot.c index 4d0c3532dbe7..3dc745b14caf 100644 --- a/drivers/irqchip/irq-mscc-ocelot.c +++ b/drivers/irqchip/irq-mscc-ocelot.c @@ -37,7 +37,7 @@ static struct chip_props ocelot_props = { .reg_off_ena_clr = 0x1c, .reg_off_ena_set = 0x20, .reg_off_ident = 0x38, - .reg_off_trigger = 0x5c, + .reg_off_trigger = 0x4, .n_irq = 24, }; @@ -70,7 +70,7 @@ static struct chip_props jaguar2_props = { .reg_off_ena_clr = 0x1c, .reg_off_ena_set = 0x20, .reg_off_ident = 0x38, - .reg_off_trigger = 0x5c, + .reg_off_trigger = 0x4, .n_irq = 29, }; @@ -84,6 +84,12 @@ static void ocelot_irq_unmask(struct irq_data *data) u32 val; irq_gc_lock(gc); + /* + * Clear sticky bits for edge mode interrupts. + * Serval has only one trigger register replication, but the adjacent + * register is always read as zero, so there's no need to handle this + * case separately. + */ val = irq_reg_readl(gc, ICPU_CFG_INTR_INTR_TRIGGER(p, 0)) | irq_reg_readl(gc, ICPU_CFG_INTR_INTR_TRIGGER(p, 1)); if (!(val & mask)) diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c index 693ff285ca2c..99e27e01b0b1 100644 --- a/drivers/irqchip/irq-renesas-rzg2l.c +++ b/drivers/irqchip/irq-renesas-rzg2l.c @@ -8,6 +8,7 @@ */ #include <linux/bitfield.h> +#include <linux/cleanup.h> #include <linux/clk.h> #include <linux/err.h> #include <linux/io.h> @@ -530,12 +531,12 @@ static int rzg2l_irqc_parse_interrupts(struct rzg2l_irqc_priv *priv, static int rzg2l_irqc_common_init(struct device_node *node, struct device_node *parent, const struct irq_chip *irq_chip) { + struct platform_device *pdev = of_find_device_by_node(node); + struct device *dev __free(put_device) = pdev ? &pdev->dev : NULL; struct irq_domain *irq_domain, *parent_domain; - struct platform_device *pdev; struct reset_control *resetn; int ret; - pdev = of_find_device_by_node(node); if (!pdev) return -ENODEV; @@ -591,6 +592,17 @@ static int rzg2l_irqc_common_init(struct device_node *node, struct device_node * register_syscore_ops(&rzg2l_irqc_syscore_ops); + /* + * Prevent the cleanup function from invoking put_device by assigning + * NULL to dev. + * + * make coccicheck will complain about missing put_device calls, but + * those are false positives, as dev will be automatically "put" via + * __free_put_device on the failing path. + * On the successful path we don't actually want to "put" dev. + */ + dev = NULL; + return 0; pm_put: diff --git a/drivers/irqchip/irq-riscv-imsic-platform.c b/drivers/irqchip/irq-riscv-imsic-platform.c index 64905e6f52d7..c708780e8760 100644 --- a/drivers/irqchip/irq-riscv-imsic-platform.c +++ b/drivers/irqchip/irq-riscv-imsic-platform.c @@ -341,7 +341,7 @@ int imsic_irqdomain_init(void) imsic->fwnode, global->hart_index_bits, global->guest_index_bits); pr_info("%pfwP: group-index-bits: %d, group-index-shift: %d\n", imsic->fwnode, global->group_index_bits, global->group_index_shift); - pr_info("%pfwP: per-CPU IDs %d at base PPN %pa\n", + pr_info("%pfwP: per-CPU IDs %d at base address %pa\n", imsic->fwnode, global->nr_ids, &global->base_addr); pr_info("%pfwP: total %d interrupts available\n", imsic->fwnode, num_possible_cpus() * (global->nr_ids - 1)); diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c index 8c5411386220..f653c13de62b 100644 --- a/drivers/irqchip/irq-riscv-intc.c +++ b/drivers/irqchip/irq-riscv-intc.c @@ -265,7 +265,7 @@ struct rintc_data { }; static u32 nr_rintc; -static struct rintc_data *rintc_acpi_data[NR_CPUS]; +static struct rintc_data **rintc_acpi_data; #define for_each_matching_plic(_plic_id) \ unsigned int _plic; \ @@ -329,13 +329,30 @@ int acpi_rintc_get_imsic_mmio_info(u32 index, struct resource *res) return 0; } +static int __init riscv_intc_acpi_match(union acpi_subtable_headers *header, + const unsigned long end) +{ + return 0; +} + static int __init riscv_intc_acpi_init(union acpi_subtable_headers *header, const unsigned long end) { struct acpi_madt_rintc *rintc; struct fwnode_handle *fn; + int count; int rc; + if (!rintc_acpi_data) { + count = acpi_table_parse_madt(ACPI_MADT_TYPE_RINTC, riscv_intc_acpi_match, 0); + if (count <= 0) + return -EINVAL; + + rintc_acpi_data = kcalloc(count, sizeof(*rintc_acpi_data), GFP_KERNEL); + if (!rintc_acpi_data) + return -ENOMEM; + } + rintc = (struct acpi_madt_rintc *)header; rintc_acpi_data[nr_rintc] = kzalloc(sizeof(*rintc_acpi_data[0]), GFP_KERNEL); if (!rintc_acpi_data[nr_rintc]) diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index 2f6ef5c495bd..36dbcf2d728a 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -126,16 +126,6 @@ static inline void plic_irq_toggle(const struct cpumask *mask, } } -static void plic_irq_enable(struct irq_data *d) -{ - plic_irq_toggle(irq_data_get_effective_affinity_mask(d), d, 1); -} - -static void plic_irq_disable(struct irq_data *d) -{ - plic_irq_toggle(irq_data_get_effective_affinity_mask(d), d, 0); -} - static void plic_irq_unmask(struct irq_data *d) { struct plic_priv *priv = irq_data_get_irq_chip_data(d); @@ -150,6 +140,17 @@ static void plic_irq_mask(struct irq_data *d) writel(0, priv->regs + PRIORITY_BASE + d->hwirq * PRIORITY_PER_ID); } +static void plic_irq_enable(struct irq_data *d) +{ + plic_irq_toggle(irq_data_get_effective_affinity_mask(d), d, 1); + plic_irq_unmask(d); +} + +static void plic_irq_disable(struct irq_data *d) +{ + plic_irq_toggle(irq_data_get_effective_affinity_mask(d), d, 0); +} + static void plic_irq_eoi(struct irq_data *d) { struct plic_handler *handler = this_cpu_ptr(&plic_handlers); @@ -626,8 +627,10 @@ static int plic_probe(struct fwnode_handle *fwnode) handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32), sizeof(*handler->enable_save), GFP_KERNEL); - if (!handler->enable_save) + if (!handler->enable_save) { + error = -ENOMEM; goto fail_cleanup_contexts; + } done: for (hwirq = 1; hwirq <= nr_irqs; hwirq++) { plic_toggle(handler, hwirq, 0); @@ -639,8 +642,10 @@ done: priv->irqdomain = irq_domain_create_linear(fwnode, nr_irqs + 1, &plic_irqdomain_ops, priv); - if (WARN_ON(!priv->irqdomain)) + if (WARN_ON(!priv->irqdomain)) { + error = -ENOMEM; goto fail_cleanup_contexts; + } /* * We can have multiple PLIC instances so setup global state diff --git a/drivers/misc/cardreader/Kconfig b/drivers/misc/cardreader/Kconfig index 022322dfb36e..a70700f0e592 100644 --- a/drivers/misc/cardreader/Kconfig +++ b/drivers/misc/cardreader/Kconfig @@ -16,7 +16,8 @@ config MISC_RTSX_PCI select MFD_CORE help This supports for Realtek PCI-Express card reader including rts5209, - rts5227, rts522A, rts5229, rts5249, rts524A, rts525A, rtl8411, rts5260. + rts5227, rts5228, rts522A, rts5229, rts5249, rts524A, rts525A, rtl8411, + rts5260, rts5261, rts5264. Realtek card readers support access to many types of memory cards, such as Memory Stick, Memory Stick Pro, Secure Digital and MultiMediaCard. diff --git a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_otpe2p.c b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_otpe2p.c index 7c3d8bedf90b..a2ed477e0370 100644 --- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_otpe2p.c +++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_otpe2p.c @@ -364,6 +364,7 @@ static int pci1xxxx_otp_eeprom_probe(struct auxiliary_device *aux_dev, if (is_eeprom_responsive(priv)) { priv->nvmem_config_eeprom.type = NVMEM_TYPE_EEPROM; priv->nvmem_config_eeprom.name = EEPROM_NAME; + priv->nvmem_config_eeprom.id = NVMEM_DEVID_AUTO; priv->nvmem_config_eeprom.dev = &aux_dev->dev; priv->nvmem_config_eeprom.owner = THIS_MODULE; priv->nvmem_config_eeprom.reg_read = pci1xxxx_eeprom_read; @@ -383,6 +384,7 @@ static int pci1xxxx_otp_eeprom_probe(struct auxiliary_device *aux_dev, priv->nvmem_config_otp.type = NVMEM_TYPE_OTP; priv->nvmem_config_otp.name = OTP_NAME; + priv->nvmem_config_otp.id = NVMEM_DEVID_AUTO; priv->nvmem_config_otp.dev = &aux_dev->dev; priv->nvmem_config_otp.owner = THIS_MODULE; priv->nvmem_config_otp.reg_read = pci1xxxx_otp_read; diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 4e8710c7cb7b..5290f5ad98f3 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -2733,26 +2733,27 @@ static u32 ksz_get_phy_flags(struct dsa_switch *ds, int port) return MICREL_KSZ8_P1_ERRATA; break; case KSZ8567_CHIP_ID: + /* KSZ8567R Errata DS80000752C Module 4 */ + case KSZ8765_CHIP_ID: + case KSZ8794_CHIP_ID: + case KSZ8795_CHIP_ID: + /* KSZ879x/KSZ877x/KSZ876x Errata DS80000687C Module 2 */ case KSZ9477_CHIP_ID: + /* KSZ9477S Errata DS80000754A Module 4 */ case KSZ9567_CHIP_ID: + /* KSZ9567S Errata DS80000756A Module 4 */ case KSZ9896_CHIP_ID: + /* KSZ9896C Errata DS80000757A Module 3 */ case KSZ9897_CHIP_ID: - /* KSZ9477 Errata DS80000754C - * - * Module 4: Energy Efficient Ethernet (EEE) feature select must - * be manually disabled + /* KSZ9897R Errata DS80000758C Module 4 */ + /* Energy Efficient Ethernet (EEE) feature select must be manually disabled * The EEE feature is enabled by default, but it is not fully * operational. It must be manually disabled through register * controls. If not disabled, the PHY ports can auto-negotiate * to enable EEE, and this feature can cause link drops when * linked to another device supporting EEE. * - * The same item appears in the errata for the KSZ9567, KSZ9896, - * and KSZ9897. - * - * A similar item appears in the errata for the KSZ8567, but - * provides an alternative workaround. For now, use the simple - * workaround of disabling the EEE feature for this device too. + * The same item appears in the errata for all switches above. */ return MICREL_NO_EEE; } diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index 00aa59857b64..48399ab5355a 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -208,6 +208,7 @@ struct mv88e6xxx_gpio_ops; struct mv88e6xxx_avb_ops; struct mv88e6xxx_ptp_ops; struct mv88e6xxx_pcs_ops; +struct mv88e6xxx_cc_coeffs; struct mv88e6xxx_irq { u16 masked; @@ -416,6 +417,7 @@ struct mv88e6xxx_chip { struct cyclecounter tstamp_cc; struct timecounter tstamp_tc; struct delayed_work overflow_work; + const struct mv88e6xxx_cc_coeffs *cc_coeffs; struct ptp_clock *ptp_clock; struct ptp_clock_info ptp_clock_info; @@ -745,10 +747,6 @@ struct mv88e6xxx_ptp_ops { int arr1_sts_reg; int dep_sts_reg; u32 rx_filters; - u32 cc_shift; - u32 cc_mult; - u32 cc_mult_num; - u32 cc_mult_dem; }; struct mv88e6xxx_pcs_ops { diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index d72bba1969f7..dc777ddce1f3 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -1714,6 +1714,7 @@ int mv88e6393x_port_set_policy(struct mv88e6xxx_chip *chip, int port, ptr = shift / 8; shift %= 8; mask >>= ptr * 8; + ptr <<= 8; err = mv88e6393x_port_policy_read(chip, port, ptr, ®); if (err) diff --git a/drivers/net/dsa/mv88e6xxx/ptp.c b/drivers/net/dsa/mv88e6xxx/ptp.c index 56391e09b325..aed4a4b07f34 100644 --- a/drivers/net/dsa/mv88e6xxx/ptp.c +++ b/drivers/net/dsa/mv88e6xxx/ptp.c @@ -18,6 +18,13 @@ #define MV88E6XXX_MAX_ADJ_PPB 1000000 +struct mv88e6xxx_cc_coeffs { + u32 cc_shift; + u32 cc_mult; + u32 cc_mult_num; + u32 cc_mult_dem; +}; + /* Family MV88E6250: * Raw timestamps are in units of 10-ns clock periods. * @@ -25,22 +32,43 @@ * simplifies to * clkadj = scaled_ppm * 2^7 / 5^5 */ -#define MV88E6250_CC_SHIFT 28 -#define MV88E6250_CC_MULT (10 << MV88E6250_CC_SHIFT) -#define MV88E6250_CC_MULT_NUM (1 << 7) -#define MV88E6250_CC_MULT_DEM 3125ULL +#define MV88E6XXX_CC_10NS_SHIFT 28 +static const struct mv88e6xxx_cc_coeffs mv88e6xxx_cc_10ns_coeffs = { + .cc_shift = MV88E6XXX_CC_10NS_SHIFT, + .cc_mult = 10 << MV88E6XXX_CC_10NS_SHIFT, + .cc_mult_num = 1 << 7, + .cc_mult_dem = 3125ULL, +}; -/* Other families: +/* Other families except MV88E6393X in internal clock mode: * Raw timestamps are in units of 8-ns clock periods. * * clkadj = scaled_ppm * 8*2^28 / (10^6 * 2^16) * simplifies to * clkadj = scaled_ppm * 2^9 / 5^6 */ -#define MV88E6XXX_CC_SHIFT 28 -#define MV88E6XXX_CC_MULT (8 << MV88E6XXX_CC_SHIFT) -#define MV88E6XXX_CC_MULT_NUM (1 << 9) -#define MV88E6XXX_CC_MULT_DEM 15625ULL +#define MV88E6XXX_CC_8NS_SHIFT 28 +static const struct mv88e6xxx_cc_coeffs mv88e6xxx_cc_8ns_coeffs = { + .cc_shift = MV88E6XXX_CC_8NS_SHIFT, + .cc_mult = 8 << MV88E6XXX_CC_8NS_SHIFT, + .cc_mult_num = 1 << 9, + .cc_mult_dem = 15625ULL +}; + +/* Family MV88E6393X using internal clock: + * Raw timestamps are in units of 4-ns clock periods. + * + * clkadj = scaled_ppm * 4*2^28 / (10^6 * 2^16) + * simplifies to + * clkadj = scaled_ppm * 2^8 / 5^6 + */ +#define MV88E6XXX_CC_4NS_SHIFT 28 +static const struct mv88e6xxx_cc_coeffs mv88e6xxx_cc_4ns_coeffs = { + .cc_shift = MV88E6XXX_CC_4NS_SHIFT, + .cc_mult = 4 << MV88E6XXX_CC_4NS_SHIFT, + .cc_mult_num = 1 << 8, + .cc_mult_dem = 15625ULL +}; #define TAI_EVENT_WORK_INTERVAL msecs_to_jiffies(100) @@ -83,6 +111,33 @@ static int mv88e6352_set_gpio_func(struct mv88e6xxx_chip *chip, int pin, return chip->info->ops->gpio_ops->set_pctl(chip, pin, func); } +static const struct mv88e6xxx_cc_coeffs * +mv88e6xxx_cc_coeff_get(struct mv88e6xxx_chip *chip) +{ + u16 period_ps; + int err; + + err = mv88e6xxx_tai_read(chip, MV88E6XXX_TAI_CLOCK_PERIOD, &period_ps, 1); + if (err) { + dev_err(chip->dev, "failed to read cycle counter period: %d\n", + err); + return ERR_PTR(err); + } + + switch (period_ps) { + case 4000: + return &mv88e6xxx_cc_4ns_coeffs; + case 8000: + return &mv88e6xxx_cc_8ns_coeffs; + case 10000: + return &mv88e6xxx_cc_10ns_coeffs; + default: + dev_err(chip->dev, "unexpected cycle counter period of %u ps\n", + period_ps); + return ERR_PTR(-ENODEV); + } +} + static u64 mv88e6352_ptp_clock_read(const struct cyclecounter *cc) { struct mv88e6xxx_chip *chip = cc_to_chip(cc); @@ -204,7 +259,6 @@ out: static int mv88e6xxx_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm) { struct mv88e6xxx_chip *chip = ptp_to_chip(ptp); - const struct mv88e6xxx_ptp_ops *ptp_ops = chip->info->ops->ptp_ops; int neg_adj = 0; u32 diff, mult; u64 adj; @@ -214,10 +268,10 @@ static int mv88e6xxx_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm) scaled_ppm = -scaled_ppm; } - mult = ptp_ops->cc_mult; - adj = ptp_ops->cc_mult_num; + mult = chip->cc_coeffs->cc_mult; + adj = chip->cc_coeffs->cc_mult_num; adj *= scaled_ppm; - diff = div_u64(adj, ptp_ops->cc_mult_dem); + diff = div_u64(adj, chip->cc_coeffs->cc_mult_dem); mv88e6xxx_reg_lock(chip); @@ -364,10 +418,6 @@ const struct mv88e6xxx_ptp_ops mv88e6165_ptp_ops = { (1 << HWTSTAMP_FILTER_PTP_V2_EVENT) | (1 << HWTSTAMP_FILTER_PTP_V2_SYNC) | (1 << HWTSTAMP_FILTER_PTP_V2_DELAY_REQ), - .cc_shift = MV88E6XXX_CC_SHIFT, - .cc_mult = MV88E6XXX_CC_MULT, - .cc_mult_num = MV88E6XXX_CC_MULT_NUM, - .cc_mult_dem = MV88E6XXX_CC_MULT_DEM, }; const struct mv88e6xxx_ptp_ops mv88e6250_ptp_ops = { @@ -391,10 +441,6 @@ const struct mv88e6xxx_ptp_ops mv88e6250_ptp_ops = { (1 << HWTSTAMP_FILTER_PTP_V2_EVENT) | (1 << HWTSTAMP_FILTER_PTP_V2_SYNC) | (1 << HWTSTAMP_FILTER_PTP_V2_DELAY_REQ), - .cc_shift = MV88E6250_CC_SHIFT, - .cc_mult = MV88E6250_CC_MULT, - .cc_mult_num = MV88E6250_CC_MULT_NUM, - .cc_mult_dem = MV88E6250_CC_MULT_DEM, }; const struct mv88e6xxx_ptp_ops mv88e6352_ptp_ops = { @@ -418,10 +464,6 @@ const struct mv88e6xxx_ptp_ops mv88e6352_ptp_ops = { (1 << HWTSTAMP_FILTER_PTP_V2_EVENT) | (1 << HWTSTAMP_FILTER_PTP_V2_SYNC) | (1 << HWTSTAMP_FILTER_PTP_V2_DELAY_REQ), - .cc_shift = MV88E6XXX_CC_SHIFT, - .cc_mult = MV88E6XXX_CC_MULT, - .cc_mult_num = MV88E6XXX_CC_MULT_NUM, - .cc_mult_dem = MV88E6XXX_CC_MULT_DEM, }; const struct mv88e6xxx_ptp_ops mv88e6390_ptp_ops = { @@ -446,10 +488,6 @@ const struct mv88e6xxx_ptp_ops mv88e6390_ptp_ops = { (1 << HWTSTAMP_FILTER_PTP_V2_EVENT) | (1 << HWTSTAMP_FILTER_PTP_V2_SYNC) | (1 << HWTSTAMP_FILTER_PTP_V2_DELAY_REQ), - .cc_shift = MV88E6XXX_CC_SHIFT, - .cc_mult = MV88E6XXX_CC_MULT, - .cc_mult_num = MV88E6XXX_CC_MULT_NUM, - .cc_mult_dem = MV88E6XXX_CC_MULT_DEM, }; static u64 mv88e6xxx_ptp_clock_read(const struct cyclecounter *cc) @@ -462,10 +500,10 @@ static u64 mv88e6xxx_ptp_clock_read(const struct cyclecounter *cc) return 0; } -/* With a 125MHz input clock, the 32-bit timestamp counter overflows in ~34.3 +/* With a 250MHz input clock, the 32-bit timestamp counter overflows in ~17.2 * seconds; this task forces periodic reads so that we don't miss any. */ -#define MV88E6XXX_TAI_OVERFLOW_PERIOD (HZ * 16) +#define MV88E6XXX_TAI_OVERFLOW_PERIOD (HZ * 8) static void mv88e6xxx_ptp_overflow_check(struct work_struct *work) { struct delayed_work *dw = to_delayed_work(work); @@ -484,11 +522,15 @@ int mv88e6xxx_ptp_setup(struct mv88e6xxx_chip *chip) int i; /* Set up the cycle counter */ + chip->cc_coeffs = mv88e6xxx_cc_coeff_get(chip); + if (IS_ERR(chip->cc_coeffs)) + return PTR_ERR(chip->cc_coeffs); + memset(&chip->tstamp_cc, 0, sizeof(chip->tstamp_cc)); chip->tstamp_cc.read = mv88e6xxx_ptp_clock_read; chip->tstamp_cc.mask = CYCLECOUNTER_MASK(32); - chip->tstamp_cc.mult = ptp_ops->cc_mult; - chip->tstamp_cc.shift = ptp_ops->cc_shift; + chip->tstamp_cc.mult = chip->cc_coeffs->cc_mult; + chip->tstamp_cc.shift = chip->cc_coeffs->cc_shift; timecounter_init(&chip->tstamp_tc, &chip->tstamp_cc, ktime_to_ns(ktime_get_real())); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index bda3742d4e32..6dd6541d8619 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2254,10 +2254,11 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, if (!bnxt_get_rx_ts_p5(bp, &ts, cmpl_ts)) { struct bnxt_ptp_cfg *ptp = bp->ptp_cfg; + unsigned long flags; - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); ns = timecounter_cyc2time(&ptp->tc, ts); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); memset(skb_hwtstamps(skb), 0, sizeof(*skb_hwtstamps(skb))); skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ns); @@ -2757,17 +2758,18 @@ static int bnxt_async_event_process(struct bnxt *bp, case ASYNC_EVENT_CMPL_PHC_UPDATE_EVENT_DATA1_FLAGS_PHC_RTC_UPDATE: if (BNXT_PTP_USE_RTC(bp)) { struct bnxt_ptp_cfg *ptp = bp->ptp_cfg; + unsigned long flags; u64 ns; if (!ptp) goto async_event_process_exit; - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); bnxt_ptp_update_current_time(bp); ns = (((u64)BNXT_EVENT_PHC_RTC_UPDATE(data1) << BNXT_PHC_BITS) | ptp->current_time); bnxt_ptp_rtc_timecounter_init(ptp, ns); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); } break; } @@ -13495,9 +13497,11 @@ static void bnxt_force_fw_reset(struct bnxt *bp) return; if (ptp) { - spin_lock_bh(&ptp->ptp_lock); + unsigned long flags; + + spin_lock_irqsave(&ptp->ptp_lock, flags); set_bit(BNXT_STATE_IN_FW_RESET, &bp->state); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); } else { set_bit(BNXT_STATE_IN_FW_RESET, &bp->state); } @@ -13562,9 +13566,11 @@ void bnxt_fw_reset(struct bnxt *bp) int n = 0, tmo; if (ptp) { - spin_lock_bh(&ptp->ptp_lock); + unsigned long flags; + + spin_lock_irqsave(&ptp->ptp_lock, flags); set_bit(BNXT_STATE_IN_FW_RESET, &bp->state); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); } else { set_bit(BNXT_STATE_IN_FW_RESET, &bp->state); } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c index 37d42423459c..fa514be87650 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c @@ -62,13 +62,14 @@ static int bnxt_ptp_settime(struct ptp_clock_info *ptp_info, struct bnxt_ptp_cfg *ptp = container_of(ptp_info, struct bnxt_ptp_cfg, ptp_info); u64 ns = timespec64_to_ns(ts); + unsigned long flags; if (BNXT_PTP_USE_RTC(ptp->bp)) return bnxt_ptp_cfg_settime(ptp->bp, ns); - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); timecounter_init(&ptp->tc, &ptp->cc, ns); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); return 0; } @@ -100,13 +101,14 @@ static int bnxt_refclk_read(struct bnxt *bp, struct ptp_system_timestamp *sts, static void bnxt_ptp_get_current_time(struct bnxt *bp) { struct bnxt_ptp_cfg *ptp = bp->ptp_cfg; + unsigned long flags; if (!ptp) return; - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); WRITE_ONCE(ptp->old_time, ptp->current_time); bnxt_refclk_read(bp, NULL, &ptp->current_time); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); } static int bnxt_hwrm_port_ts_query(struct bnxt *bp, u32 flags, u64 *ts, @@ -149,17 +151,18 @@ static int bnxt_ptp_gettimex(struct ptp_clock_info *ptp_info, { struct bnxt_ptp_cfg *ptp = container_of(ptp_info, struct bnxt_ptp_cfg, ptp_info); + unsigned long flags; u64 ns, cycles; int rc; - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); rc = bnxt_refclk_read(ptp->bp, sts, &cycles); if (rc) { - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); return rc; } ns = timecounter_cyc2time(&ptp->tc, cycles); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); *ts = ns_to_timespec64(ns); return 0; @@ -177,6 +180,7 @@ void bnxt_ptp_update_current_time(struct bnxt *bp) static int bnxt_ptp_adjphc(struct bnxt_ptp_cfg *ptp, s64 delta) { struct hwrm_port_mac_cfg_input *req; + unsigned long flags; int rc; rc = hwrm_req_init(ptp->bp, req, HWRM_PORT_MAC_CFG); @@ -190,9 +194,9 @@ static int bnxt_ptp_adjphc(struct bnxt_ptp_cfg *ptp, s64 delta) if (rc) { netdev_err(ptp->bp->dev, "ptp adjphc failed. rc = %x\n", rc); } else { - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); bnxt_ptp_update_current_time(ptp->bp); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); } return rc; @@ -202,13 +206,14 @@ static int bnxt_ptp_adjtime(struct ptp_clock_info *ptp_info, s64 delta) { struct bnxt_ptp_cfg *ptp = container_of(ptp_info, struct bnxt_ptp_cfg, ptp_info); + unsigned long flags; if (BNXT_PTP_USE_RTC(ptp->bp)) return bnxt_ptp_adjphc(ptp, delta); - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); timecounter_adjtime(&ptp->tc, delta); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); return 0; } @@ -236,14 +241,15 @@ static int bnxt_ptp_adjfine(struct ptp_clock_info *ptp_info, long scaled_ppm) struct bnxt_ptp_cfg *ptp = container_of(ptp_info, struct bnxt_ptp_cfg, ptp_info); struct bnxt *bp = ptp->bp; + unsigned long flags; if (!BNXT_MH(bp)) return bnxt_ptp_adjfine_rtc(bp, scaled_ppm); - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); timecounter_read(&ptp->tc); ptp->cc.mult = adjust_by_scaled_ppm(ptp->cmult, scaled_ppm); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); return 0; } @@ -251,12 +257,13 @@ void bnxt_ptp_pps_event(struct bnxt *bp, u32 data1, u32 data2) { struct bnxt_ptp_cfg *ptp = bp->ptp_cfg; struct ptp_clock_event event; + unsigned long flags; u64 ns, pps_ts; pps_ts = EVENT_PPS_TS(data2, data1); - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); ns = timecounter_cyc2time(&ptp->tc, pps_ts); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); switch (EVENT_DATA2_PPS_EVENT_TYPE(data2)) { case ASYNC_EVENT_CMPL_PPS_TIMESTAMP_EVENT_DATA2_EVENT_TYPE_INTERNAL: @@ -393,16 +400,17 @@ static int bnxt_get_target_cycles(struct bnxt_ptp_cfg *ptp, u64 target_ns, { u64 cycles_now; u64 nsec_now, nsec_delta; + unsigned long flags; int rc; - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); rc = bnxt_refclk_read(ptp->bp, NULL, &cycles_now); if (rc) { - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); return rc; } nsec_now = timecounter_cyc2time(&ptp->tc, cycles_now); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); nsec_delta = target_ns - nsec_now; *cycles_delta = div64_u64(nsec_delta << ptp->cc.shift, ptp->cc.mult); @@ -689,6 +697,7 @@ static int bnxt_stamp_tx_skb(struct bnxt *bp, int slot) struct skb_shared_hwtstamps timestamp; struct bnxt_ptp_tx_req *txts_req; unsigned long now = jiffies; + unsigned long flags; u64 ts = 0, ns = 0; u32 tmo = 0; int rc; @@ -702,9 +711,9 @@ static int bnxt_stamp_tx_skb(struct bnxt *bp, int slot) tmo, slot); if (!rc) { memset(×tamp, 0, sizeof(timestamp)); - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); ns = timecounter_cyc2time(&ptp->tc, ts); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); timestamp.hwtstamp = ns_to_ktime(ns); skb_tstamp_tx(txts_req->tx_skb, ×tamp); ptp->stats.ts_pkts++; @@ -730,6 +739,7 @@ static long bnxt_ptp_ts_aux_work(struct ptp_clock_info *ptp_info) unsigned long now = jiffies; struct bnxt *bp = ptp->bp; u16 cons = ptp->txts_cons; + unsigned long flags; u32 num_requests; int rc = 0; @@ -757,9 +767,9 @@ next_slot: bnxt_ptp_get_current_time(bp); ptp->next_period = now + HZ; if (time_after_eq(now, ptp->next_overflow_check)) { - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); timecounter_read(&ptp->tc); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); ptp->next_overflow_check = now + BNXT_PHC_OVERFLOW_PERIOD; } if (rc == -EAGAIN) @@ -819,6 +829,7 @@ void bnxt_tx_ts_cmp(struct bnxt *bp, struct bnxt_napi *bnapi, u32 opaque = tscmp->tx_ts_cmp_opaque; struct bnxt_tx_ring_info *txr; struct bnxt_sw_tx_bd *tx_buf; + unsigned long flags; u64 ts, ns; u16 cons; @@ -833,9 +844,9 @@ void bnxt_tx_ts_cmp(struct bnxt *bp, struct bnxt_napi *bnapi, le32_to_cpu(tscmp->tx_ts_cmp_flags_type), le32_to_cpu(tscmp->tx_ts_cmp_errors_v)); } else { - spin_lock_bh(&ptp->ptp_lock); + spin_lock_irqsave(&ptp->ptp_lock, flags); ns = timecounter_cyc2time(&ptp->tc, ts); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); timestamp.hwtstamp = ns_to_ktime(ns); skb_tstamp_tx(tx_buf->skb, ×tamp); } @@ -975,6 +986,7 @@ void bnxt_ptp_rtc_timecounter_init(struct bnxt_ptp_cfg *ptp, u64 ns) int bnxt_ptp_init_rtc(struct bnxt *bp, bool phc_cfg) { struct timespec64 tsp; + unsigned long flags; u64 ns; int rc; @@ -993,9 +1005,9 @@ int bnxt_ptp_init_rtc(struct bnxt *bp, bool phc_cfg) if (rc) return rc; } - spin_lock_bh(&bp->ptp_cfg->ptp_lock); + spin_lock_irqsave(&bp->ptp_cfg->ptp_lock, flags); bnxt_ptp_rtc_timecounter_init(bp->ptp_cfg, ns); - spin_unlock_bh(&bp->ptp_cfg->ptp_lock); + spin_unlock_irqrestore(&bp->ptp_cfg->ptp_lock, flags); return 0; } @@ -1063,10 +1075,12 @@ int bnxt_ptp_init(struct bnxt *bp, bool phc_cfg) atomic64_set(&ptp->stats.ts_err, 0); if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) { - spin_lock_bh(&ptp->ptp_lock); + unsigned long flags; + + spin_lock_irqsave(&ptp->ptp_lock, flags); bnxt_refclk_read(bp, NULL, &ptp->current_time); WRITE_ONCE(ptp->old_time, ptp->current_time); - spin_unlock_bh(&ptp->ptp_lock); + spin_unlock_irqrestore(&ptp->ptp_lock, flags); ptp_schedule_worker(ptp->ptp_clock, 0); } ptp->txts_tmo = BNXT_PTP_DFLT_TX_TMO; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h index a9a2f9a18c9c..f322466ecad3 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h @@ -146,11 +146,13 @@ struct bnxt_ptp_cfg { }; #if BITS_PER_LONG == 32 -#define BNXT_READ_TIME64(ptp, dst, src) \ -do { \ - spin_lock_bh(&(ptp)->ptp_lock); \ - (dst) = (src); \ - spin_unlock_bh(&(ptp)->ptp_lock); \ +#define BNXT_READ_TIME64(ptp, dst, src) \ +do { \ + unsigned long flags; \ + \ + spin_lock_irqsave(&(ptp)->ptp_lock, flags); \ + (dst) = (src); \ + spin_unlock_irqrestore(&(ptp)->ptp_lock, flags); \ } while (0) #else #define BNXT_READ_TIME64(ptp, dst, src) \ diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index a8596ebcdfd6..875fe379eea2 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -1381,10 +1381,8 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev) be_get_wrb_params_from_skb(adapter, skb, &wrb_params); wrb_cnt = be_xmit_enqueue(adapter, txo, skb, &wrb_params); - if (unlikely(!wrb_cnt)) { - dev_kfree_skb_any(skb); - goto drop; - } + if (unlikely(!wrb_cnt)) + goto drop_skb; /* if os2bmc is enabled and if the pkt is destined to bmc, * enqueue the pkt a 2nd time with mgmt bit set. @@ -1393,7 +1391,7 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev) BE_WRB_F_SET(wrb_params.features, OS2BMC, 1); wrb_cnt = be_xmit_enqueue(adapter, txo, skb, &wrb_params); if (unlikely(!wrb_cnt)) - goto drop; + goto drop_skb; else skb_get(skb); } @@ -1407,6 +1405,8 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev) be_xmit_flush(adapter, txo); return NETDEV_TX_OK; +drop_skb: + dev_kfree_skb_any(skb); drop: tx_stats(txo)->tx_drv_drops++; /* Flush the already enqueued tx requests */ diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c index 974d2e7e131c..c7bff9490ce0 100644 --- a/drivers/net/ethernet/freescale/fman/mac.c +++ b/drivers/net/ethernet/freescale/fman/mac.c @@ -155,55 +155,67 @@ static int mac_probe(struct platform_device *_of_dev) err = -EINVAL; goto _return_of_node_put; } + mac_dev->fman_dev = &of_dev->dev; /* Get the FMan cell-index */ err = of_property_read_u32(dev_node, "cell-index", &val); if (err) { dev_err(dev, "failed to read cell-index for %pOF\n", dev_node); err = -EINVAL; - goto _return_of_node_put; + goto _return_dev_put; } /* cell-index 0 => FMan id 1 */ fman_id = (u8)(val + 1); - priv->fman = fman_bind(&of_dev->dev); + priv->fman = fman_bind(mac_dev->fman_dev); if (!priv->fman) { dev_err(dev, "fman_bind(%pOF) failed\n", dev_node); err = -ENODEV; - goto _return_of_node_put; + goto _return_dev_put; } + /* Two references have been taken in of_find_device_by_node() + * and fman_bind(). Release one of them here. The second one + * will be released in mac_remove(). + */ + put_device(mac_dev->fman_dev); of_node_put(dev_node); + dev_node = NULL; /* Get the address of the memory mapped registers */ mac_dev->res = platform_get_mem_or_io(_of_dev, 0); if (!mac_dev->res) { dev_err(dev, "could not get registers\n"); - return -EINVAL; + err = -EINVAL; + goto _return_dev_put; } err = devm_request_resource(dev, fman_get_mem_region(priv->fman), mac_dev->res); if (err) { dev_err_probe(dev, err, "could not request resource\n"); - return err; + goto _return_dev_put; } mac_dev->vaddr = devm_ioremap(dev, mac_dev->res->start, resource_size(mac_dev->res)); if (!mac_dev->vaddr) { dev_err(dev, "devm_ioremap() failed\n"); - return -EIO; + err = -EIO; + goto _return_dev_put; } - if (!of_device_is_available(mac_node)) - return -ENODEV; + if (!of_device_is_available(mac_node)) { + err = -ENODEV; + goto _return_dev_put; + } /* Get the cell-index */ err = of_property_read_u32(mac_node, "cell-index", &val); if (err) { dev_err(dev, "failed to read cell-index for %pOF\n", mac_node); - return -EINVAL; + err = -EINVAL; + goto _return_dev_put; } priv->cell_index = (u8)val; @@ -217,22 +229,26 @@ static int mac_probe(struct platform_device *_of_dev) if (unlikely(nph < 0)) { dev_err(dev, "of_count_phandle_with_args(%pOF, fsl,fman-ports) failed\n", mac_node); - return nph; + err = nph; + goto _return_dev_put; } if (nph != ARRAY_SIZE(mac_dev->port)) { dev_err(dev, "Not supported number of fman-ports handles of mac node %pOF from device tree\n", mac_node); - return -EINVAL; + err = -EINVAL; + goto _return_dev_put; } - for (i = 0; i < ARRAY_SIZE(mac_dev->port); i++) { + /* PORT_NUM determines the size of the port array */ + for (i = 0; i < PORT_NUM; i++) { /* Find the port node */ dev_node = of_parse_phandle(mac_node, "fsl,fman-ports", i); if (!dev_node) { dev_err(dev, "of_parse_phandle(%pOF, fsl,fman-ports) failed\n", mac_node); - return -EINVAL; + err = -EINVAL; + goto _return_dev_arr_put; } of_dev = of_find_device_by_node(dev_node); @@ -240,17 +256,24 @@ static int mac_probe(struct platform_device *_of_dev) dev_err(dev, "of_find_device_by_node(%pOF) failed\n", dev_node); err = -EINVAL; - goto _return_of_node_put; + goto _return_dev_arr_put; } + mac_dev->fman_port_devs[i] = &of_dev->dev; - mac_dev->port[i] = fman_port_bind(&of_dev->dev); + mac_dev->port[i] = fman_port_bind(mac_dev->fman_port_devs[i]); if (!mac_dev->port[i]) { dev_err(dev, "dev_get_drvdata(%pOF) failed\n", dev_node); err = -EINVAL; - goto _return_of_node_put; + goto _return_dev_arr_put; } + /* Two references have been taken in of_find_device_by_node() + * and fman_port_bind(). Release one of them here. The second + * one will be released in mac_remove(). + */ + put_device(mac_dev->fman_port_devs[i]); of_node_put(dev_node); + dev_node = NULL; } /* Get the PHY connection type */ @@ -270,7 +293,7 @@ static int mac_probe(struct platform_device *_of_dev) err = init(mac_dev, mac_node, ¶ms); if (err < 0) - return err; + goto _return_dev_arr_put; if (!is_zero_ether_addr(mac_dev->addr)) dev_info(dev, "FMan MAC address: %pM\n", mac_dev->addr); @@ -285,6 +308,12 @@ static int mac_probe(struct platform_device *_of_dev) return err; +_return_dev_arr_put: + /* mac_dev is kzalloc'ed */ + for (i = 0; i < PORT_NUM; i++) + put_device(mac_dev->fman_port_devs[i]); +_return_dev_put: + put_device(mac_dev->fman_dev); _return_of_node_put: of_node_put(dev_node); return err; @@ -293,6 +322,11 @@ _return_of_node_put: static void mac_remove(struct platform_device *pdev) { struct mac_device *mac_dev = platform_get_drvdata(pdev); + int i; + + for (i = 0; i < PORT_NUM; i++) + put_device(mac_dev->fman_port_devs[i]); + put_device(mac_dev->fman_dev); platform_device_unregister(mac_dev->priv->eth_dev); } diff --git a/drivers/net/ethernet/freescale/fman/mac.h b/drivers/net/ethernet/freescale/fman/mac.h index be9d48aad5ef..955ace338965 100644 --- a/drivers/net/ethernet/freescale/fman/mac.h +++ b/drivers/net/ethernet/freescale/fman/mac.h @@ -19,12 +19,13 @@ struct fman_mac; struct mac_priv_s; +#define PORT_NUM 2 struct mac_device { void __iomem *vaddr; struct device *dev; struct resource *res; u8 addr[ETH_ALEN]; - struct fman_port *port[2]; + struct fman_port *port[PORT_NUM]; struct phylink *phylink; struct phylink_config phylink_config; phy_interface_t phy_if; @@ -50,6 +51,9 @@ struct mac_device { struct fman_mac *fman_mac; struct mac_priv_s *priv; + + struct device *fman_dev; + struct device *fman_port_devs[PORT_NUM]; }; static inline struct mac_device diff --git a/drivers/net/ethernet/i825xx/sun3_82586.c b/drivers/net/ethernet/i825xx/sun3_82586.c index f2d4669c81cf..58a3d28d938c 100644 --- a/drivers/net/ethernet/i825xx/sun3_82586.c +++ b/drivers/net/ethernet/i825xx/sun3_82586.c @@ -1012,6 +1012,7 @@ sun3_82586_send_packet(struct sk_buff *skb, struct net_device *dev) if(skb->len > XMIT_BUFF_SIZE) { printk("%s: Sorry, max. framelength is %d bytes. The length of your frame is %d bytes.\n",dev->name,XMIT_BUFF_SIZE,skb->len); + dev_kfree_skb(skb); return NETDEV_TX_OK; } diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_rx.c b/drivers/net/ethernet/marvell/octeon_ep/octep_rx.c index 4746a6b258f0..8af75cb37c3e 100644 --- a/drivers/net/ethernet/marvell/octeon_ep/octep_rx.c +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_rx.c @@ -337,6 +337,51 @@ static int octep_oq_check_hw_for_pkts(struct octep_device *oct, } /** + * octep_oq_next_pkt() - Move to the next packet in Rx queue. + * + * @oq: Octeon Rx queue data structure. + * @buff_info: Current packet buffer info. + * @read_idx: Current packet index in the ring. + * @desc_used: Current packet descriptor number. + * + * Free the resources associated with a packet. + * Increment packet index in the ring and packet descriptor number. + */ +static void octep_oq_next_pkt(struct octep_oq *oq, + struct octep_rx_buffer *buff_info, + u32 *read_idx, u32 *desc_used) +{ + dma_unmap_page(oq->dev, oq->desc_ring[*read_idx].buffer_ptr, + PAGE_SIZE, DMA_FROM_DEVICE); + buff_info->page = NULL; + (*read_idx)++; + (*desc_used)++; + if (*read_idx == oq->max_count) + *read_idx = 0; +} + +/** + * octep_oq_drop_rx() - Free the resources associated with a packet. + * + * @oq: Octeon Rx queue data structure. + * @buff_info: Current packet buffer info. + * @read_idx: Current packet index in the ring. + * @desc_used: Current packet descriptor number. + * + */ +static void octep_oq_drop_rx(struct octep_oq *oq, + struct octep_rx_buffer *buff_info, + u32 *read_idx, u32 *desc_used) +{ + int data_len = buff_info->len - oq->max_single_buffer_size; + + while (data_len > 0) { + octep_oq_next_pkt(oq, buff_info, read_idx, desc_used); + data_len -= oq->buffer_size; + }; +} + +/** * __octep_oq_process_rx() - Process hardware Rx queue and push to stack. * * @oct: Octeon device private data structure. @@ -367,10 +412,7 @@ static int __octep_oq_process_rx(struct octep_device *oct, desc_used = 0; for (pkt = 0; pkt < pkts_to_process; pkt++) { buff_info = (struct octep_rx_buffer *)&oq->buff_info[read_idx]; - dma_unmap_page(oq->dev, oq->desc_ring[read_idx].buffer_ptr, - PAGE_SIZE, DMA_FROM_DEVICE); resp_hw = page_address(buff_info->page); - buff_info->page = NULL; /* Swap the length field that is in Big-Endian to CPU */ buff_info->len = be64_to_cpu(resp_hw->length); @@ -394,36 +436,33 @@ static int __octep_oq_process_rx(struct octep_device *oct, data_offset = OCTEP_OQ_RESP_HW_SIZE; rx_ol_flags = 0; } + + octep_oq_next_pkt(oq, buff_info, &read_idx, &desc_used); + + skb = build_skb((void *)resp_hw, PAGE_SIZE); + if (!skb) { + octep_oq_drop_rx(oq, buff_info, + &read_idx, &desc_used); + oq->stats.alloc_failures++; + continue; + } + skb_reserve(skb, data_offset); + rx_bytes += buff_info->len; if (buff_info->len <= oq->max_single_buffer_size) { - skb = build_skb((void *)resp_hw, PAGE_SIZE); - skb_reserve(skb, data_offset); skb_put(skb, buff_info->len); - read_idx++; - desc_used++; - if (read_idx == oq->max_count) - read_idx = 0; } else { struct skb_shared_info *shinfo; u16 data_len; - skb = build_skb((void *)resp_hw, PAGE_SIZE); - skb_reserve(skb, data_offset); /* Head fragment includes response header(s); * subsequent fragments contains only data. */ skb_put(skb, oq->max_single_buffer_size); - read_idx++; - desc_used++; - if (read_idx == oq->max_count) - read_idx = 0; - shinfo = skb_shinfo(skb); data_len = buff_info->len - oq->max_single_buffer_size; while (data_len) { - dma_unmap_page(oq->dev, oq->desc_ring[read_idx].buffer_ptr, - PAGE_SIZE, DMA_FROM_DEVICE); buff_info = (struct octep_rx_buffer *) &oq->buff_info[read_idx]; if (data_len < oq->buffer_size) { @@ -438,11 +477,8 @@ static int __octep_oq_process_rx(struct octep_device *oct, buff_info->page, 0, buff_info->len, buff_info->len); - buff_info->page = NULL; - read_idx++; - desc_used++; - if (read_idx == oq->max_count) - read_idx = 0; + + octep_oq_next_pkt(oq, buff_info, &read_idx, &desc_used); } } diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c index 800dfb64ec83..7d6d859cef3f 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c @@ -3197,7 +3197,6 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp, { struct mlxsw_sp_nexthop_group *nh_grp = nh->nhgi->nh_grp; struct mlxsw_sp_nexthop_counter *nhct; - void *ptr; int err; nhct = xa_load(&nh_grp->nhgi->nexthop_counters, nh->id); @@ -3210,12 +3209,10 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp, if (IS_ERR(nhct)) return nhct; - ptr = xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct, - GFP_KERNEL); - if (IS_ERR(ptr)) { - err = PTR_ERR(ptr); + err = xa_err(xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct, + GFP_KERNEL)); + if (err) goto err_store; - } return nhct; diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 4166f1ab854b..79e7b223bd5b 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4749,7 +4749,9 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask)) return IRQ_NONE; - if (unlikely(status & SYSErr)) { + /* At least RTL8168fp may unexpectedly set the SYSErr bit */ + if (unlikely(status & SYSErr && + tp->mac_version <= RTL_GIGA_MAC_VER_06)) { rtl8169_pcierr_interrupt(tp->dev); goto out; } diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index f8e2dd6d271d..d6c4abfc3a28 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -2798,6 +2798,31 @@ static struct hv_driver netvsc_drv = { }, }; +/* Set VF's namespace same as the synthetic NIC */ +static void netvsc_event_set_vf_ns(struct net_device *ndev) +{ + struct net_device_context *ndev_ctx = netdev_priv(ndev); + struct net_device *vf_netdev; + int ret; + + vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev); + if (!vf_netdev) + return; + + if (!net_eq(dev_net(ndev), dev_net(vf_netdev))) { + ret = dev_change_net_namespace(vf_netdev, dev_net(ndev), + "eth%d"); + if (ret) + netdev_err(vf_netdev, + "Cannot move to same namespace as %s: %d\n", + ndev->name, ret); + else + netdev_info(vf_netdev, + "Moved VF to namespace with: %s\n", + ndev->name); + } +} + /* * On Hyper-V, every VF interface is matched with a corresponding * synthetic interface. The synthetic interface is presented first @@ -2810,6 +2835,11 @@ static int netvsc_netdev_event(struct notifier_block *this, struct net_device *event_dev = netdev_notifier_info_to_dev(ptr); int ret = 0; + if (event_dev->netdev_ops == &device_ops && event == NETDEV_REGISTER) { + netvsc_event_set_vf_ns(event_dev); + return NOTIFY_DONE; + } + ret = check_dev_is_matching_vf(event_dev); if (ret != 0) return NOTIFY_DONE; diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c index fc247f479257..3ab64e04a01c 100644 --- a/drivers/net/phy/dp83822.c +++ b/drivers/net/phy/dp83822.c @@ -45,8 +45,8 @@ /* Control Register 2 bits */ #define DP83822_FX_ENABLE BIT(14) -#define DP83822_HW_RESET BIT(15) -#define DP83822_SW_RESET BIT(14) +#define DP83822_SW_RESET BIT(15) +#define DP83822_DIG_RESTART BIT(14) /* PHY STS bits */ #define DP83822_PHYSTS_DUPLEX BIT(2) diff --git a/drivers/net/plip/plip.c b/drivers/net/plip/plip.c index e39bfaefe8c5..d81163bc910a 100644 --- a/drivers/net/plip/plip.c +++ b/drivers/net/plip/plip.c @@ -815,7 +815,7 @@ plip_send_packet(struct net_device *dev, struct net_local *nl, return HS_TIMEOUT; } } - break; + fallthrough; case PLIP_PK_LENGTH_LSB: if (plip_send(nibble_timeout, dev, diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c index f8e6854781e6..2906ce173f66 100644 --- a/drivers/net/pse-pd/pse_core.c +++ b/drivers/net/pse-pd/pse_core.c @@ -113,7 +113,7 @@ static void pse_release_pis(struct pse_controller_dev *pcdev) { int i; - for (i = 0; i <= pcdev->nr_lines; i++) { + for (i = 0; i < pcdev->nr_lines; i++) { of_node_put(pcdev->pi[i].pairset[0].np); of_node_put(pcdev->pi[i].pairset[1].np); of_node_put(pcdev->pi[i].np); @@ -647,7 +647,7 @@ static int of_pse_match_pi(struct pse_controller_dev *pcdev, { int i; - for (i = 0; i <= pcdev->nr_lines; i++) { + for (i = 0; i < pcdev->nr_lines; i++) { if (pcdev->pi[i].np == np) return i; } diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 4823dbdf5465..f137c82f1c0f 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1426,6 +1426,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x2c7c, 0x0296, 4)}, /* Quectel BG96 */ {QMI_QUIRK_SET_DTR(0x2c7c, 0x030e, 4)}, /* Quectel EM05GV2 */ {QMI_QUIRK_SET_DTR(0x2cb7, 0x0104, 4)}, /* Fibocom NL678 series */ + {QMI_QUIRK_SET_DTR(0x2cb7, 0x0112, 0)}, /* Fibocom FG132 */ {QMI_FIXED_INTF(0x0489, 0xe0b4, 0)}, /* Foxconn T77W968 LTE */ {QMI_FIXED_INTF(0x0489, 0xe0b5, 0)}, /* Foxconn T77W968 LTE with eSIM support*/ {QMI_FIXED_INTF(0x2692, 0x9025, 4)}, /* Cellient MPL200 (rebranded Qualcomm 05c6:9025) */ diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index ee1b5fd7b491..44179f4e807f 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -1767,7 +1767,8 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod) // can rename the link if it knows better. if ((dev->driver_info->flags & FLAG_ETHER) != 0 && ((dev->driver_info->flags & FLAG_POINTTOPOINT) == 0 || - (net->dev_addr [0] & 0x02) == 0)) + /* somebody touched it*/ + !is_zero_ether_addr(net->dev_addr))) strscpy(net->name, "eth%d", sizeof(net->name)); /* WLAN devices should always be named "wlan%d" */ if ((dev->driver_info->flags & FLAG_WLAN) != 0) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index f8131f92a392..792e9eadbfc3 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -4155,7 +4155,7 @@ struct virtnet_stats_ctx { u32 desc_num[3]; /* The actual supported stat types. */ - u32 bitmap[3]; + u64 bitmap[3]; /* Used to calculate the reply buffer size. */ u32 size[3]; diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c index 17431f1b1a0c..65a7ed4d6766 100644 --- a/drivers/net/wwan/wwan_core.c +++ b/drivers/net/wwan/wwan_core.c @@ -1038,7 +1038,7 @@ static const struct nla_policy wwan_rtnl_policy[IFLA_WWAN_MAX + 1] = { static struct rtnl_link_ops wwan_rtnl_link_ops __read_mostly = { .kind = "wwan", - .maxtype = __IFLA_WWAN_MAX, + .maxtype = IFLA_WWAN_MAX, .alloc = wwan_rtnl_alloc, .validate = wwan_rtnl_validate, .newlink = wwan_rtnl_newlink, diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 43d73d31c66f..84cb859a911d 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1292,14 +1292,12 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl) queue_delayed_work(nvme_wq, &ctrl->ka_work, delay); } -static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, - blk_status_t status) +static void nvme_keep_alive_finish(struct request *rq, + blk_status_t status, struct nvme_ctrl *ctrl) { - struct nvme_ctrl *ctrl = rq->end_io_data; - unsigned long flags; - bool startka = false; unsigned long rtt = jiffies - (rq->deadline - rq->timeout); unsigned long delay = nvme_keep_alive_work_period(ctrl); + enum nvme_ctrl_state state = nvme_ctrl_state(ctrl); /* * Subtract off the keepalive RTT so nvme_keep_alive_work runs @@ -1313,25 +1311,17 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, delay = 0; } - blk_mq_free_request(rq); - if (status) { dev_err(ctrl->device, "failed nvme_keep_alive_end_io error=%d\n", status); - return RQ_END_IO_NONE; + return; } ctrl->ka_last_check_time = jiffies; ctrl->comp_seen = false; - spin_lock_irqsave(&ctrl->lock, flags); - if (ctrl->state == NVME_CTRL_LIVE || - ctrl->state == NVME_CTRL_CONNECTING) - startka = true; - spin_unlock_irqrestore(&ctrl->lock, flags); - if (startka) + if (state == NVME_CTRL_LIVE || state == NVME_CTRL_CONNECTING) queue_delayed_work(nvme_wq, &ctrl->ka_work, delay); - return RQ_END_IO_NONE; } static void nvme_keep_alive_work(struct work_struct *work) @@ -1340,6 +1330,7 @@ static void nvme_keep_alive_work(struct work_struct *work) struct nvme_ctrl, ka_work); bool comp_seen = ctrl->comp_seen; struct request *rq; + blk_status_t status; ctrl->ka_last_check_time = jiffies; @@ -1362,9 +1353,9 @@ static void nvme_keep_alive_work(struct work_struct *work) nvme_init_request(rq, &ctrl->ka_cmd); rq->timeout = ctrl->kato * HZ; - rq->end_io = nvme_keep_alive_end_io; - rq->end_io_data = ctrl; - blk_execute_rq_nowait(rq, false); + status = blk_execute_rq(rq, false); + nvme_keep_alive_finish(rq, status, ctrl); + blk_mq_free_request(rq); } static void nvme_start_keep_alive(struct nvme_ctrl *ctrl) @@ -2458,8 +2449,13 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl) else ctrl->ctrl_config = NVME_CC_CSS_NVM; - if (ctrl->cap & NVME_CAP_CRMS_CRWMS && ctrl->cap & NVME_CAP_CRMS_CRIMS) - ctrl->ctrl_config |= NVME_CC_CRIME; + /* + * Setting CRIME results in CSTS.RDY before the media is ready. This + * makes it possible for media related commands to return the error + * NVME_SC_ADMIN_COMMAND_MEDIA_NOT_READY. Until the driver is + * restructured to handle retries, disable CC.CRIME. + */ + ctrl->ctrl_config &= ~NVME_CC_CRIME; ctrl->ctrl_config |= (NVME_CTRL_PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT; ctrl->ctrl_config |= NVME_CC_AMS_RR | NVME_CC_SHN_NONE; @@ -2489,10 +2485,7 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl) * devices are known to get this wrong. Use the larger of the * two values. */ - if (ctrl->ctrl_config & NVME_CC_CRIME) - ready_timeout = NVME_CRTO_CRIMT(crto); - else - ready_timeout = NVME_CRTO_CRWMT(crto); + ready_timeout = NVME_CRTO_CRWMT(crto); if (ready_timeout < timeout) dev_warn_once(ctrl->device, "bad crto:%x cap:%llx\n", diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 48e7a8906d01..6a15873055b9 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -431,7 +431,6 @@ static bool nvme_available_path(struct nvme_ns_head *head) case NVME_CTRL_LIVE: case NVME_CTRL_RESETTING: case NVME_CTRL_CONNECTING: - /* fallthru */ return true; default: break; @@ -580,6 +579,20 @@ static int nvme_add_ns_head_cdev(struct nvme_ns_head *head) return ret; } +static void nvme_partition_scan_work(struct work_struct *work) +{ + struct nvme_ns_head *head = + container_of(work, struct nvme_ns_head, partition_scan_work); + + if (WARN_ON_ONCE(!test_and_clear_bit(GD_SUPPRESS_PART_SCAN, + &head->disk->state))) + return; + + mutex_lock(&head->disk->open_mutex); + bdev_disk_changed(head->disk, false); + mutex_unlock(&head->disk->open_mutex); +} + static void nvme_requeue_work(struct work_struct *work) { struct nvme_ns_head *head = @@ -606,6 +619,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) bio_list_init(&head->requeue_list); spin_lock_init(&head->requeue_lock); INIT_WORK(&head->requeue_work, nvme_requeue_work); + INIT_WORK(&head->partition_scan_work, nvme_partition_scan_work); /* * Add a multipath node if the subsystems supports multiple controllers. @@ -629,6 +643,16 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) return PTR_ERR(head->disk); head->disk->fops = &nvme_ns_head_ops; head->disk->private_data = head; + + /* + * We need to suppress the partition scan from occuring within the + * controller's scan_work context. If a path error occurs here, the IO + * will wait until a path becomes available or all paths are torn down, + * but that action also occurs within scan_work, so it would deadlock. + * Defer the partion scan to a different context that does not block + * scan_work. + */ + set_bit(GD_SUPPRESS_PART_SCAN, &head->disk->state); sprintf(head->disk->disk_name, "nvme%dn%d", ctrl->subsys->instance, head->instance); return 0; @@ -655,6 +679,7 @@ static void nvme_mpath_set_live(struct nvme_ns *ns) return; } nvme_add_ns_head_cdev(head); + kblockd_schedule_work(&head->partition_scan_work); } mutex_lock(&head->lock); @@ -974,14 +999,14 @@ void nvme_mpath_shutdown_disk(struct nvme_ns_head *head) return; if (test_and_clear_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) { nvme_cdev_del(&head->cdev, &head->cdev_device); + /* + * requeue I/O after NVME_NSHEAD_DISK_LIVE has been cleared + * to allow multipath to fail all I/O. + */ + synchronize_srcu(&head->srcu); + kblockd_schedule_work(&head->requeue_work); del_gendisk(head->disk); } - /* - * requeue I/O after NVME_NSHEAD_DISK_LIVE has been cleared - * to allow multipath to fail all I/O. - */ - synchronize_srcu(&head->srcu); - kblockd_schedule_work(&head->requeue_work); } void nvme_mpath_remove_disk(struct nvme_ns_head *head) @@ -991,6 +1016,7 @@ void nvme_mpath_remove_disk(struct nvme_ns_head *head) /* make sure all pending bios are cleaned up */ kblockd_schedule_work(&head->requeue_work); flush_work(&head->requeue_work); + flush_work(&head->partition_scan_work); put_disk(head->disk); } diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 313a4f978a2c..093cb423f536 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -494,6 +494,7 @@ struct nvme_ns_head { struct bio_list requeue_list; spinlock_t requeue_lock; struct work_struct requeue_work; + struct work_struct partition_scan_work; struct mutex lock; unsigned long flags; #define NVME_NSHEAD_DISK_LIVE 0 diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 7990c3f22ecf..4b9fda0b1d9a 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2506,17 +2506,29 @@ static unsigned int nvme_pci_nr_maps(struct nvme_dev *dev) return 1; } -static void nvme_pci_update_nr_queues(struct nvme_dev *dev) +static bool nvme_pci_update_nr_queues(struct nvme_dev *dev) { if (!dev->ctrl.tagset) { nvme_alloc_io_tag_set(&dev->ctrl, &dev->tagset, &nvme_mq_ops, nvme_pci_nr_maps(dev), sizeof(struct nvme_iod)); - return; + return true; + } + + /* Give up if we are racing with nvme_dev_disable() */ + if (!mutex_trylock(&dev->shutdown_lock)) + return false; + + /* Check if nvme_dev_disable() has been executed already */ + if (!dev->online_queues) { + mutex_unlock(&dev->shutdown_lock); + return false; } blk_mq_update_nr_hw_queues(&dev->tagset, dev->online_queues - 1); /* free previously allocated queues that are no longer usable */ nvme_free_queues(dev, dev->online_queues); + mutex_unlock(&dev->shutdown_lock); + return true; } static int nvme_pci_enable(struct nvme_dev *dev) @@ -2797,7 +2809,8 @@ static void nvme_reset_work(struct work_struct *work) nvme_dbbuf_set(dev); nvme_unquiesce_io_queues(&dev->ctrl); nvme_wait_freeze(&dev->ctrl); - nvme_pci_update_nr_queues(dev); + if (!nvme_pci_update_nr_queues(dev)) + goto out; nvme_unfreeze(&dev->ctrl); } else { dev_warn(dev->ctrl.device, "IO queues lost\n"); diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 89c44413c593..3e416af2659f 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2644,10 +2644,11 @@ static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size) len = nvmf_get_address(ctrl, buf, size); + if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags)) + return len; + mutex_lock(&queue->queue_lock); - if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags)) - goto done; ret = kernel_getsockname(queue->sock, (struct sockaddr *)&src_addr); if (ret > 0) { if (len > 0) @@ -2655,7 +2656,7 @@ static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size) len += scnprintf(buf + len, size - len, "%ssrc_addr=%pISc\n", (len) ? "," : "", &src_addr); } -done: + mutex_unlock(&queue->queue_lock); return len; diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c index e32790d8fc26..a9d112d34d4f 100644 --- a/drivers/nvme/target/loop.c +++ b/drivers/nvme/target/loop.c @@ -265,6 +265,13 @@ static void nvme_loop_destroy_admin_queue(struct nvme_loop_ctrl *ctrl) { if (!test_and_clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags)) return; + /* + * It's possible that some requests might have been added + * after admin queue is stopped/quiesced. So now start the + * queue to flush these requests to the completion. + */ + nvme_unquiesce_admin_queue(&ctrl->ctrl); + nvmet_sq_destroy(&ctrl->queues[0].nvme_sq); nvme_remove_admin_tag_set(&ctrl->ctrl); } @@ -297,6 +304,12 @@ static void nvme_loop_destroy_io_queues(struct nvme_loop_ctrl *ctrl) nvmet_sq_destroy(&ctrl->queues[i].nvme_sq); } ctrl->ctrl.queue_count = 1; + /* + * It's possible that some requests might have been added + * after io queue is stopped/quiesced. So now start the + * queue to flush these requests to the completion. + */ + nvme_unquiesce_io_queues(&ctrl->ctrl); } static int nvme_loop_init_io_queues(struct nvme_loop_ctrl *ctrl) diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index 24d0e2418d2e..0f9b280c438d 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -535,10 +535,6 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req) break; case nvme_admin_identify: switch (req->cmd->identify.cns) { - case NVME_ID_CNS_CTRL: - req->execute = nvmet_passthru_execute_cmd; - req->p.use_workqueue = true; - return NVME_SC_SUCCESS; case NVME_ID_CNS_CS_CTRL: switch (req->cmd->identify.csi) { case NVME_CSI_ZNS: @@ -547,7 +543,9 @@ u16 nvmet_parse_passthru_admin_cmd(struct nvmet_req *req) return NVME_SC_SUCCESS; } return NVME_SC_INVALID_OPCODE | NVME_STATUS_DNR; + case NVME_ID_CNS_CTRL: case NVME_ID_CNS_NS: + case NVME_ID_CNS_NS_DESC_LIST: req->execute = nvmet_passthru_execute_cmd; req->p.use_workqueue = true; return NVME_SC_SUCCESS; diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c index ade285308450..1afd93026f9b 100644 --- a/drivers/nvme/target/rdma.c +++ b/drivers/nvme/target/rdma.c @@ -39,6 +39,8 @@ #define NVMET_RDMA_BACKLOG 128 +#define NVMET_RDMA_DISCRETE_RSP_TAG -1 + struct nvmet_rdma_srq; struct nvmet_rdma_cmd { @@ -75,7 +77,7 @@ struct nvmet_rdma_rsp { u32 invalidate_rkey; struct list_head wait_list; - struct list_head free_list; + int tag; }; enum nvmet_rdma_queue_state { @@ -98,8 +100,7 @@ struct nvmet_rdma_queue { struct nvmet_sq nvme_sq; struct nvmet_rdma_rsp *rsps; - struct list_head free_rsps; - spinlock_t rsps_lock; + struct sbitmap rsp_tags; struct nvmet_rdma_cmd *cmds; struct work_struct release_work; @@ -172,7 +173,8 @@ static void nvmet_rdma_queue_disconnect(struct nvmet_rdma_queue *queue); static void nvmet_rdma_free_rsp(struct nvmet_rdma_device *ndev, struct nvmet_rdma_rsp *r); static int nvmet_rdma_alloc_rsp(struct nvmet_rdma_device *ndev, - struct nvmet_rdma_rsp *r); + struct nvmet_rdma_rsp *r, + int tag); static const struct nvmet_fabrics_ops nvmet_rdma_ops; @@ -210,15 +212,12 @@ static inline bool nvmet_rdma_need_data_out(struct nvmet_rdma_rsp *rsp) static inline struct nvmet_rdma_rsp * nvmet_rdma_get_rsp(struct nvmet_rdma_queue *queue) { - struct nvmet_rdma_rsp *rsp; - unsigned long flags; + struct nvmet_rdma_rsp *rsp = NULL; + int tag; - spin_lock_irqsave(&queue->rsps_lock, flags); - rsp = list_first_entry_or_null(&queue->free_rsps, - struct nvmet_rdma_rsp, free_list); - if (likely(rsp)) - list_del(&rsp->free_list); - spin_unlock_irqrestore(&queue->rsps_lock, flags); + tag = sbitmap_get(&queue->rsp_tags); + if (tag >= 0) + rsp = &queue->rsps[tag]; if (unlikely(!rsp)) { int ret; @@ -226,13 +225,12 @@ nvmet_rdma_get_rsp(struct nvmet_rdma_queue *queue) rsp = kzalloc(sizeof(*rsp), GFP_KERNEL); if (unlikely(!rsp)) return NULL; - ret = nvmet_rdma_alloc_rsp(queue->dev, rsp); + ret = nvmet_rdma_alloc_rsp(queue->dev, rsp, + NVMET_RDMA_DISCRETE_RSP_TAG); if (unlikely(ret)) { kfree(rsp); return NULL; } - - rsp->allocated = true; } return rsp; @@ -241,17 +239,13 @@ nvmet_rdma_get_rsp(struct nvmet_rdma_queue *queue) static inline void nvmet_rdma_put_rsp(struct nvmet_rdma_rsp *rsp) { - unsigned long flags; - - if (unlikely(rsp->allocated)) { + if (unlikely(rsp->tag == NVMET_RDMA_DISCRETE_RSP_TAG)) { nvmet_rdma_free_rsp(rsp->queue->dev, rsp); kfree(rsp); return; } - spin_lock_irqsave(&rsp->queue->rsps_lock, flags); - list_add_tail(&rsp->free_list, &rsp->queue->free_rsps); - spin_unlock_irqrestore(&rsp->queue->rsps_lock, flags); + sbitmap_clear_bit(&rsp->queue->rsp_tags, rsp->tag); } static void nvmet_rdma_free_inline_pages(struct nvmet_rdma_device *ndev, @@ -404,7 +398,7 @@ static void nvmet_rdma_free_cmds(struct nvmet_rdma_device *ndev, } static int nvmet_rdma_alloc_rsp(struct nvmet_rdma_device *ndev, - struct nvmet_rdma_rsp *r) + struct nvmet_rdma_rsp *r, int tag) { /* NVMe CQE / RDMA SEND */ r->req.cqe = kmalloc(sizeof(*r->req.cqe), GFP_KERNEL); @@ -432,6 +426,7 @@ static int nvmet_rdma_alloc_rsp(struct nvmet_rdma_device *ndev, r->read_cqe.done = nvmet_rdma_read_data_done; /* Data Out / RDMA WRITE */ r->write_cqe.done = nvmet_rdma_write_data_done; + r->tag = tag; return 0; @@ -454,21 +449,23 @@ nvmet_rdma_alloc_rsps(struct nvmet_rdma_queue *queue) { struct nvmet_rdma_device *ndev = queue->dev; int nr_rsps = queue->recv_queue_size * 2; - int ret = -EINVAL, i; + int ret = -ENOMEM, i; + + if (sbitmap_init_node(&queue->rsp_tags, nr_rsps, -1, GFP_KERNEL, + NUMA_NO_NODE, false, true)) + goto out; queue->rsps = kcalloc(nr_rsps, sizeof(struct nvmet_rdma_rsp), GFP_KERNEL); if (!queue->rsps) - goto out; + goto out_free_sbitmap; for (i = 0; i < nr_rsps; i++) { struct nvmet_rdma_rsp *rsp = &queue->rsps[i]; - ret = nvmet_rdma_alloc_rsp(ndev, rsp); + ret = nvmet_rdma_alloc_rsp(ndev, rsp, i); if (ret) goto out_free; - - list_add_tail(&rsp->free_list, &queue->free_rsps); } return 0; @@ -477,6 +474,8 @@ out_free: while (--i >= 0) nvmet_rdma_free_rsp(ndev, &queue->rsps[i]); kfree(queue->rsps); +out_free_sbitmap: + sbitmap_free(&queue->rsp_tags); out: return ret; } @@ -489,6 +488,7 @@ static void nvmet_rdma_free_rsps(struct nvmet_rdma_queue *queue) for (i = 0; i < nr_rsps; i++) nvmet_rdma_free_rsp(ndev, &queue->rsps[i]); kfree(queue->rsps); + sbitmap_free(&queue->rsp_tags); } static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev, @@ -1447,8 +1447,6 @@ nvmet_rdma_alloc_queue(struct nvmet_rdma_device *ndev, INIT_LIST_HEAD(&queue->rsp_wait_list); INIT_LIST_HEAD(&queue->rsp_wr_wait_list); spin_lock_init(&queue->rsp_wr_wait_lock); - INIT_LIST_HEAD(&queue->free_rsps); - spin_lock_init(&queue->rsps_lock); INIT_LIST_HEAD(&queue->queue_list); queue->idx = ida_alloc(&nvmet_rdma_queue_ida, GFP_KERNEL); diff --git a/drivers/parport/procfs.c b/drivers/parport/procfs.c index 3ef486cd3d6d..3880460e67f2 100644 --- a/drivers/parport/procfs.c +++ b/drivers/parport/procfs.c @@ -51,12 +51,12 @@ static int do_active_device(const struct ctl_table *table, int write, for (dev = port->devices; dev ; dev = dev->next) { if(dev == port->cad) { - len += snprintf(buffer, sizeof(buffer), "%s\n", dev->name); + len += scnprintf(buffer, sizeof(buffer), "%s\n", dev->name); } } if(!len) { - len += snprintf(buffer, sizeof(buffer), "%s\n", "none"); + len += scnprintf(buffer, sizeof(buffer), "%s\n", "none"); } if (len > *lenp) @@ -87,19 +87,19 @@ static int do_autoprobe(const struct ctl_table *table, int write, } if ((str = info->class_name) != NULL) - len += snprintf (buffer + len, sizeof(buffer) - len, "CLASS:%s;\n", str); + len += scnprintf (buffer + len, sizeof(buffer) - len, "CLASS:%s;\n", str); if ((str = info->model) != NULL) - len += snprintf (buffer + len, sizeof(buffer) - len, "MODEL:%s;\n", str); + len += scnprintf (buffer + len, sizeof(buffer) - len, "MODEL:%s;\n", str); if ((str = info->mfr) != NULL) - len += snprintf (buffer + len, sizeof(buffer) - len, "MANUFACTURER:%s;\n", str); + len += scnprintf (buffer + len, sizeof(buffer) - len, "MANUFACTURER:%s;\n", str); if ((str = info->description) != NULL) - len += snprintf (buffer + len, sizeof(buffer) - len, "DESCRIPTION:%s;\n", str); + len += scnprintf (buffer + len, sizeof(buffer) - len, "DESCRIPTION:%s;\n", str); if ((str = info->cmdset) != NULL) - len += snprintf (buffer + len, sizeof(buffer) - len, "COMMAND SET:%s;\n", str); + len += scnprintf (buffer + len, sizeof(buffer) - len, "COMMAND SET:%s;\n", str); if (len > *lenp) len = *lenp; @@ -128,7 +128,7 @@ static int do_hardware_base_addr(const struct ctl_table *table, int write, if (write) /* permissions prevent this anyway */ return -EACCES; - len += snprintf (buffer, sizeof(buffer), "%lu\t%lu\n", port->base, port->base_hi); + len += scnprintf (buffer, sizeof(buffer), "%lu\t%lu\n", port->base, port->base_hi); if (len > *lenp) len = *lenp; @@ -155,7 +155,7 @@ static int do_hardware_irq(const struct ctl_table *table, int write, if (write) /* permissions prevent this anyway */ return -EACCES; - len += snprintf (buffer, sizeof(buffer), "%d\n", port->irq); + len += scnprintf (buffer, sizeof(buffer), "%d\n", port->irq); if (len > *lenp) len = *lenp; @@ -182,7 +182,7 @@ static int do_hardware_dma(const struct ctl_table *table, int write, if (write) /* permissions prevent this anyway */ return -EACCES; - len += snprintf (buffer, sizeof(buffer), "%d\n", port->dma); + len += scnprintf (buffer, sizeof(buffer), "%d\n", port->dma); if (len > *lenp) len = *lenp; @@ -213,7 +213,7 @@ static int do_hardware_modes(const struct ctl_table *table, int write, #define printmode(x) \ do { \ if (port->modes & PARPORT_MODE_##x) \ - len += snprintf(buffer + len, sizeof(buffer) - len, "%s%s", f++ ? "," : "", #x); \ + len += scnprintf(buffer + len, sizeof(buffer) - len, "%s%s", f++ ? "," : "", #x); \ } while (0) int f = 0; printmode(PCSPP); diff --git a/drivers/pinctrl/intel/Kconfig b/drivers/pinctrl/intel/Kconfig index 2101d30bd66c..14c26c023590 100644 --- a/drivers/pinctrl/intel/Kconfig +++ b/drivers/pinctrl/intel/Kconfig @@ -46,6 +46,7 @@ config PINCTRL_INTEL_PLATFORM of Intel PCH pins and using them as GPIOs. Currently the following Intel SoCs / platforms require this to be functional: - Lunar Lake + - Panther Lake config PINCTRL_ALDERLAKE tristate "Intel Alder Lake pinctrl and GPIO driver" diff --git a/drivers/pinctrl/intel/pinctrl-intel-platform.c b/drivers/pinctrl/intel/pinctrl-intel-platform.c index 4a19ab3b4ba7..016a9f62eecc 100644 --- a/drivers/pinctrl/intel/pinctrl-intel-platform.c +++ b/drivers/pinctrl/intel/pinctrl-intel-platform.c @@ -90,7 +90,6 @@ static int intel_platform_pinctrl_prepare_community(struct device *dev, struct intel_community *community, struct intel_platform_pins *pins) { - struct fwnode_handle *child; struct intel_padgroup *gpps; unsigned int group; size_t ngpps; @@ -131,7 +130,7 @@ static int intel_platform_pinctrl_prepare_community(struct device *dev, return -ENOMEM; group = 0; - device_for_each_child_node(dev, child) { + device_for_each_child_node_scoped(dev, child) { struct intel_padgroup *gpp = &gpps[group]; gpp->reg_num = group; @@ -159,7 +158,7 @@ static int intel_platform_pinctrl_prepare_soc_data(struct device *dev, int ret; /* Version 1.0 of the specification assumes only a single community per device node */ - ncommunities = 1, + ncommunities = 1; communities = devm_kcalloc(dev, ncommunities, sizeof(*communities), GFP_KERNEL); if (!communities) return -ENOMEM; diff --git a/drivers/pinctrl/nuvoton/pinctrl-ma35.c b/drivers/pinctrl/nuvoton/pinctrl-ma35.c index 1fa00a23534a..59c4e7c6cdde 100644 --- a/drivers/pinctrl/nuvoton/pinctrl-ma35.c +++ b/drivers/pinctrl/nuvoton/pinctrl-ma35.c @@ -218,7 +218,7 @@ static int ma35_pinctrl_dt_node_to_map_func(struct pinctrl_dev *pctldev, } map_num += grp->npins; - new_map = devm_kcalloc(pctldev->dev, map_num, sizeof(*new_map), GFP_KERNEL); + new_map = kcalloc(map_num, sizeof(*new_map), GFP_KERNEL); if (!new_map) return -ENOMEM; diff --git a/drivers/pinctrl/pinctrl-apple-gpio.c b/drivers/pinctrl/pinctrl-apple-gpio.c index 3751c7de37aa..f861e63f4115 100644 --- a/drivers/pinctrl/pinctrl-apple-gpio.c +++ b/drivers/pinctrl/pinctrl-apple-gpio.c @@ -474,6 +474,9 @@ static int apple_gpio_pinctrl_probe(struct platform_device *pdev) for (i = 0; i < npins; i++) { pins[i].number = i; pins[i].name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "PIN%u", i); + if (!pins[i].name) + return -ENOMEM; + pins[i].drv_data = pctl; pin_names[i] = pins[i].name; pin_nums[i] = i; diff --git a/drivers/pinctrl/pinctrl-aw9523.c b/drivers/pinctrl/pinctrl-aw9523.c index b5e1c467625b..1374f30166bc 100644 --- a/drivers/pinctrl/pinctrl-aw9523.c +++ b/drivers/pinctrl/pinctrl-aw9523.c @@ -987,8 +987,10 @@ static int aw9523_probe(struct i2c_client *client) lockdep_set_subclass(&awi->i2c_lock, i2c_adapter_depth(client->adapter)); pdesc = devm_kzalloc(dev, sizeof(*pdesc), GFP_KERNEL); - if (!pdesc) - return -ENOMEM; + if (!pdesc) { + ret = -ENOMEM; + goto err_disable_vregs; + } ret = aw9523_hw_init(awi); if (ret) diff --git a/drivers/pinctrl/pinctrl-ocelot.c b/drivers/pinctrl/pinctrl-ocelot.c index be9b8c010167..d1ab8450ea93 100644 --- a/drivers/pinctrl/pinctrl-ocelot.c +++ b/drivers/pinctrl/pinctrl-ocelot.c @@ -1955,21 +1955,21 @@ static void ocelot_irq_handler(struct irq_desc *desc) unsigned int reg = 0, irq, i; unsigned long irqs; + chained_irq_enter(parent_chip, desc); + for (i = 0; i < info->stride; i++) { regmap_read(info->map, id_reg + 4 * i, ®); if (!reg) continue; - chained_irq_enter(parent_chip, desc); - irqs = reg; for_each_set_bit(irq, &irqs, min(32U, info->desc->npins - 32 * i)) generic_handle_domain_irq(chip->irq.domain, irq + 32 * i); - - chained_irq_exit(parent_chip, desc); } + + chained_irq_exit(parent_chip, desc); } static int ocelot_gpiochip_register(struct platform_device *pdev, diff --git a/drivers/pinctrl/sophgo/pinctrl-cv18xx.c b/drivers/pinctrl/sophgo/pinctrl-cv18xx.c index d18fc5aa84f7..57f2674e75d6 100644 --- a/drivers/pinctrl/sophgo/pinctrl-cv18xx.c +++ b/drivers/pinctrl/sophgo/pinctrl-cv18xx.c @@ -221,7 +221,7 @@ static int cv1800_pctrl_dt_node_to_map(struct pinctrl_dev *pctldev, if (!grpnames) return -ENOMEM; - map = devm_kcalloc(dev, ngroups * 2, sizeof(*map), GFP_KERNEL); + map = kcalloc(ngroups * 2, sizeof(*map), GFP_KERNEL); if (!map) return -ENOMEM; diff --git a/drivers/pinctrl/stm32/pinctrl-stm32.c b/drivers/pinctrl/stm32/pinctrl-stm32.c index a8673739871d..5b7fa77c1184 100644 --- a/drivers/pinctrl/stm32/pinctrl-stm32.c +++ b/drivers/pinctrl/stm32/pinctrl-stm32.c @@ -1374,10 +1374,15 @@ static int stm32_gpiolib_register_bank(struct stm32_pinctrl *pctl, struct fwnode for (i = 0; i < npins; i++) { stm32_pin = stm32_pctrl_get_desc_pin_from_gpio(pctl, bank, i); - if (stm32_pin && stm32_pin->pin.name) + if (stm32_pin && stm32_pin->pin.name) { names[i] = devm_kasprintf(dev, GFP_KERNEL, "%s", stm32_pin->pin.name); - else + if (!names[i]) { + err = -ENOMEM; + goto err_clk; + } + } else { names[i] = NULL; + } } bank->gpio_chip.names = (const char * const *)names; diff --git a/drivers/powercap/intel_rapl_msr.c b/drivers/powercap/intel_rapl_msr.c index 1f4c5389676a..cbe07450de93 100644 --- a/drivers/powercap/intel_rapl_msr.c +++ b/drivers/powercap/intel_rapl_msr.c @@ -148,6 +148,7 @@ static const struct x86_cpu_id pl4_support_ids[] = { X86_MATCH_VFM(INTEL_METEORLAKE, NULL), X86_MATCH_VFM(INTEL_METEORLAKE_L, NULL), X86_MATCH_VFM(INTEL_ARROWLAKE_U, NULL), + X86_MATCH_VFM(INTEL_ARROWLAKE_H, NULL), {} }; diff --git a/drivers/reset/reset-npcm.c b/drivers/reset/reset-npcm.c index 8935ef95a2d1..a200cc8c7955 100644 --- a/drivers/reset/reset-npcm.c +++ b/drivers/reset/reset-npcm.c @@ -405,8 +405,8 @@ static int npcm_rc_probe(struct platform_device *pdev) if (!of_property_read_u32(pdev->dev.of_node, "nuvoton,sw-reset-number", &rc->sw_reset_number)) { if (rc->sw_reset_number && rc->sw_reset_number < 5) { - rc->restart_nb.priority = 192, - rc->restart_nb.notifier_call = npcm_rc_restart, + rc->restart_nb.priority = 192; + rc->restart_nb.notifier_call = npcm_rc_restart; ret = register_restart_handler(&rc->restart_nb); if (ret) dev_warn(&pdev->dev, "failed to register restart handler\n"); diff --git a/drivers/reset/starfive/reset-starfive-jh71x0.c b/drivers/reset/starfive/reset-starfive-jh71x0.c index 55bbbd2de52c..29ce3486752f 100644 --- a/drivers/reset/starfive/reset-starfive-jh71x0.c +++ b/drivers/reset/starfive/reset-starfive-jh71x0.c @@ -94,6 +94,9 @@ static int jh71x0_reset_status(struct reset_controller_dev *rcdev, void __iomem *reg_status = data->status + offset * sizeof(u32); u32 value = readl(reg_status); + if (!data->asserted) + return !(value & mask); + return !((value ^ data->asserted[offset]) & mask); } diff --git a/drivers/s390/char/sclp.c b/drivers/s390/char/sclp.c index f3621adbd5de..fbffd451031f 100644 --- a/drivers/s390/char/sclp.c +++ b/drivers/s390/char/sclp.c @@ -1195,7 +1195,8 @@ sclp_reboot_event(struct notifier_block *this, unsigned long event, void *ptr) } static struct notifier_block sclp_reboot_notifier = { - .notifier_call = sclp_reboot_event + .notifier_call = sclp_reboot_event, + .priority = INT_MIN, }; static ssize_t con_pages_show(struct device_driver *dev, char *buf) diff --git a/drivers/s390/char/sclp_vt220.c b/drivers/s390/char/sclp_vt220.c index 218ae604f737..33b9c968dbcb 100644 --- a/drivers/s390/char/sclp_vt220.c +++ b/drivers/s390/char/sclp_vt220.c @@ -319,7 +319,7 @@ sclp_vt220_add_msg(struct sclp_vt220_request *request, buffer = (void *) ((addr_t) sccb + sccb->header.length); if (convertlf) { - /* Perform Linefeed conversion (0x0a -> 0x0a 0x0d)*/ + /* Perform Linefeed conversion (0x0a -> 0x0d 0x0a)*/ for (from=0, to=0; (from < count) && (to < sclp_vt220_space_left(request)); from++) { @@ -328,8 +328,8 @@ sclp_vt220_add_msg(struct sclp_vt220_request *request, /* Perform conversion */ if (c == 0x0a) { if (to + 1 < sclp_vt220_space_left(request)) { - ((unsigned char *) buffer)[to++] = c; ((unsigned char *) buffer)[to++] = 0x0d; + ((unsigned char *) buffer)[to++] = c; } else break; diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c index 60cea6c24349..e14638936de6 100644 --- a/drivers/s390/crypto/ap_bus.c +++ b/drivers/s390/crypto/ap_bus.c @@ -1864,13 +1864,12 @@ static inline void ap_scan_domains(struct ap_card *ac) } /* if no queue device exists, create a new one */ if (!aq) { - aq = ap_queue_create(qid, ac->ap_dev.device_type); + aq = ap_queue_create(qid, ac); if (!aq) { AP_DBF_WARN("%s(%d,%d) ap_queue_create() failed\n", __func__, ac->id, dom); continue; } - aq->card = ac; aq->config = !decfg; aq->chkstop = chkstop; aq->se_bstate = hwinfo.bs; diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h index 0b275c719319..f4622ee4d894 100644 --- a/drivers/s390/crypto/ap_bus.h +++ b/drivers/s390/crypto/ap_bus.h @@ -272,7 +272,7 @@ int ap_test_config_usage_domain(unsigned int domain); int ap_test_config_ctrl_domain(unsigned int domain); void ap_queue_init_reply(struct ap_queue *aq, struct ap_message *ap_msg); -struct ap_queue *ap_queue_create(ap_qid_t qid, int device_type); +struct ap_queue *ap_queue_create(ap_qid_t qid, struct ap_card *ac); void ap_queue_prepare_remove(struct ap_queue *aq); void ap_queue_remove(struct ap_queue *aq); void ap_queue_init_state(struct ap_queue *aq); diff --git a/drivers/s390/crypto/ap_queue.c b/drivers/s390/crypto/ap_queue.c index 8c878c5aa31f..9a0e6e4d8a5e 100644 --- a/drivers/s390/crypto/ap_queue.c +++ b/drivers/s390/crypto/ap_queue.c @@ -22,6 +22,11 @@ static void __ap_flush_queue(struct ap_queue *aq); * some AP queue helper functions */ +static inline bool ap_q_supported_in_se(struct ap_queue *aq) +{ + return aq->card->hwinfo.ep11 || aq->card->hwinfo.accel; +} + static inline bool ap_q_supports_bind(struct ap_queue *aq) { return aq->card->hwinfo.ep11 || aq->card->hwinfo.accel; @@ -1104,18 +1109,19 @@ static void ap_queue_device_release(struct device *dev) kfree(aq); } -struct ap_queue *ap_queue_create(ap_qid_t qid, int device_type) +struct ap_queue *ap_queue_create(ap_qid_t qid, struct ap_card *ac) { struct ap_queue *aq; aq = kzalloc(sizeof(*aq), GFP_KERNEL); if (!aq) return NULL; + aq->card = ac; aq->ap_dev.device.release = ap_queue_device_release; aq->ap_dev.device.type = &ap_queue_type; - aq->ap_dev.device_type = device_type; - // add optional SE secure binding attributes group - if (ap_sb_available() && is_prot_virt_guest()) + aq->ap_dev.device_type = ac->ap_dev.device_type; + /* in SE environment add bind/associate attributes group */ + if (ap_is_se_guest() && ap_q_supported_in_se(aq)) aq->ap_dev.device.groups = ap_queue_dev_sb_attr_groups; aq->qid = qid; spin_lock_init(&aq->lock); @@ -1196,10 +1202,16 @@ bool ap_queue_usable(struct ap_queue *aq) } /* SE guest's queues additionally need to be bound */ - if (ap_q_needs_bind(aq) && - !(aq->se_bstate == AP_BS_Q_USABLE || - aq->se_bstate == AP_BS_Q_USABLE_NO_SECURE_KEY)) - rc = false; + if (ap_is_se_guest()) { + if (!ap_q_supported_in_se(aq)) { + rc = false; + goto unlock_and_out; + } + if (ap_q_needs_bind(aq) && + !(aq->se_bstate == AP_BS_Q_USABLE || + aq->se_bstate == AP_BS_Q_USABLE_NO_SECURE_KEY)) + rc = false; + } unlock_and_out: spin_unlock_bh(&aq->lock); diff --git a/drivers/s390/crypto/pkey_pckmo.c b/drivers/s390/crypto/pkey_pckmo.c index 98079b1ed6db..beeca8827c46 100644 --- a/drivers/s390/crypto/pkey_pckmo.c +++ b/drivers/s390/crypto/pkey_pckmo.c @@ -324,6 +324,7 @@ static int pckmo_key2protkey(const u8 *key, u32 keylen, memcpy(protkey, t->protkey, t->len); *protkeylen = t->len; *protkeytype = t->keytype; + rc = 0; break; } case TOKVER_CLEAR_KEY: { diff --git a/drivers/scsi/mpi3mr/mpi3mr.h b/drivers/scsi/mpi3mr/mpi3mr.h index 14eeac08660f..81bb408ce56d 100644 --- a/drivers/scsi/mpi3mr/mpi3mr.h +++ b/drivers/scsi/mpi3mr/mpi3mr.h @@ -542,8 +542,8 @@ struct mpi3mr_hba_port { * @port_list: List of ports belonging to a SAS node * @num_phys: Number of phys associated with port * @marked_responding: used while refresing the sas ports - * @lowest_phy: lowest phy ID of current sas port - * @phy_mask: phy_mask of current sas port + * @lowest_phy: lowest phy ID of current sas port, valid for controller port + * @phy_mask: phy_mask of current sas port, valid for controller port * @hba_port: HBA port entry * @remote_identify: Attached device identification * @rphy: SAS transport layer rphy object diff --git a/drivers/scsi/mpi3mr/mpi3mr_transport.c b/drivers/scsi/mpi3mr/mpi3mr_transport.c index ccd23def2e0c..0ba9e6a6a13c 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_transport.c +++ b/drivers/scsi/mpi3mr/mpi3mr_transport.c @@ -590,12 +590,13 @@ static enum sas_linkrate mpi3mr_convert_phy_link_rate(u8 link_rate) * @mrioc: Adapter instance reference * @mr_sas_port: Internal Port object * @mr_sas_phy: Internal Phy object + * @host_node: Flag to indicate this is a host_node * * Return: None. */ static void mpi3mr_delete_sas_phy(struct mpi3mr_ioc *mrioc, struct mpi3mr_sas_port *mr_sas_port, - struct mpi3mr_sas_phy *mr_sas_phy) + struct mpi3mr_sas_phy *mr_sas_phy, u8 host_node) { u64 sas_address = mr_sas_port->remote_identify.sas_address; @@ -605,9 +606,13 @@ static void mpi3mr_delete_sas_phy(struct mpi3mr_ioc *mrioc, list_del(&mr_sas_phy->port_siblings); mr_sas_port->num_phys--; - mr_sas_port->phy_mask &= ~(1 << mr_sas_phy->phy_id); - if (mr_sas_port->lowest_phy == mr_sas_phy->phy_id) - mr_sas_port->lowest_phy = ffs(mr_sas_port->phy_mask) - 1; + + if (host_node) { + mr_sas_port->phy_mask &= ~(1 << mr_sas_phy->phy_id); + + if (mr_sas_port->lowest_phy == mr_sas_phy->phy_id) + mr_sas_port->lowest_phy = ffs(mr_sas_port->phy_mask) - 1; + } sas_port_delete_phy(mr_sas_port->port, mr_sas_phy->phy); mr_sas_phy->phy_belongs_to_port = 0; } @@ -617,12 +622,13 @@ static void mpi3mr_delete_sas_phy(struct mpi3mr_ioc *mrioc, * @mrioc: Adapter instance reference * @mr_sas_port: Internal Port object * @mr_sas_phy: Internal Phy object + * @host_node: Flag to indicate this is a host_node * * Return: None. */ static void mpi3mr_add_sas_phy(struct mpi3mr_ioc *mrioc, struct mpi3mr_sas_port *mr_sas_port, - struct mpi3mr_sas_phy *mr_sas_phy) + struct mpi3mr_sas_phy *mr_sas_phy, u8 host_node) { u64 sas_address = mr_sas_port->remote_identify.sas_address; @@ -632,9 +638,12 @@ static void mpi3mr_add_sas_phy(struct mpi3mr_ioc *mrioc, list_add_tail(&mr_sas_phy->port_siblings, &mr_sas_port->phy_list); mr_sas_port->num_phys++; - mr_sas_port->phy_mask |= (1 << mr_sas_phy->phy_id); - if (mr_sas_phy->phy_id < mr_sas_port->lowest_phy) - mr_sas_port->lowest_phy = ffs(mr_sas_port->phy_mask) - 1; + if (host_node) { + mr_sas_port->phy_mask |= (1 << mr_sas_phy->phy_id); + + if (mr_sas_phy->phy_id < mr_sas_port->lowest_phy) + mr_sas_port->lowest_phy = ffs(mr_sas_port->phy_mask) - 1; + } sas_port_add_phy(mr_sas_port->port, mr_sas_phy->phy); mr_sas_phy->phy_belongs_to_port = 1; } @@ -675,7 +684,7 @@ static void mpi3mr_add_phy_to_an_existing_port(struct mpi3mr_ioc *mrioc, if (srch_phy == mr_sas_phy) return; } - mpi3mr_add_sas_phy(mrioc, mr_sas_port, mr_sas_phy); + mpi3mr_add_sas_phy(mrioc, mr_sas_port, mr_sas_phy, mr_sas_node->host_node); return; } } @@ -736,7 +745,7 @@ static void mpi3mr_del_phy_from_an_existing_port(struct mpi3mr_ioc *mrioc, mpi3mr_delete_sas_port(mrioc, mr_sas_port); else mpi3mr_delete_sas_phy(mrioc, mr_sas_port, - mr_sas_phy); + mr_sas_phy, mr_sas_node->host_node); return; } } @@ -1028,7 +1037,7 @@ mpi3mr_alloc_hba_port(struct mpi3mr_ioc *mrioc, u16 port_id) /** * mpi3mr_get_hba_port_by_id - find hba port by id * @mrioc: Adapter instance reference - * @port_id - Port ID to search + * @port_id: Port ID to search * * Return: mpi3mr_hba_port reference for the matched port */ @@ -1367,7 +1376,8 @@ static struct mpi3mr_sas_port *mpi3mr_sas_port_add(struct mpi3mr_ioc *mrioc, mpi3mr_sas_port_sanity_check(mrioc, mr_sas_node, mr_sas_port->remote_identify.sas_address, hba_port); - if (mr_sas_node->num_phys >= sizeof(mr_sas_port->phy_mask) * 8) + if (mr_sas_node->host_node && mr_sas_node->num_phys >= + sizeof(mr_sas_port->phy_mask) * 8) ioc_info(mrioc, "max port count %u could be too high\n", mr_sas_node->num_phys); @@ -1377,7 +1387,7 @@ static struct mpi3mr_sas_port *mpi3mr_sas_port_add(struct mpi3mr_ioc *mrioc, (mr_sas_node->phy[i].hba_port != hba_port)) continue; - if (i >= sizeof(mr_sas_port->phy_mask) * 8) { + if (mr_sas_node->host_node && (i >= sizeof(mr_sas_port->phy_mask) * 8)) { ioc_warn(mrioc, "skipping port %u, max allowed value is %zu\n", i, sizeof(mr_sas_port->phy_mask) * 8); goto out_fail; @@ -1385,7 +1395,8 @@ static struct mpi3mr_sas_port *mpi3mr_sas_port_add(struct mpi3mr_ioc *mrioc, list_add_tail(&mr_sas_node->phy[i].port_siblings, &mr_sas_port->phy_list); mr_sas_port->num_phys++; - mr_sas_port->phy_mask |= (1 << i); + if (mr_sas_node->host_node) + mr_sas_port->phy_mask |= (1 << i); } if (!mr_sas_port->num_phys) { @@ -1394,7 +1405,8 @@ static struct mpi3mr_sas_port *mpi3mr_sas_port_add(struct mpi3mr_ioc *mrioc, goto out_fail; } - mr_sas_port->lowest_phy = ffs(mr_sas_port->phy_mask) - 1; + if (mr_sas_node->host_node) + mr_sas_port->lowest_phy = ffs(mr_sas_port->phy_mask) - 1; if (mr_sas_port->remote_identify.device_type == SAS_END_DEVICE) { tgtdev = mpi3mr_get_tgtdev_by_addr(mrioc, diff --git a/drivers/soc/fsl/qe/qmc.c b/drivers/soc/fsl/qe/qmc.c index 3dffebb48b0d..19cc581b06d0 100644 --- a/drivers/soc/fsl/qe/qmc.c +++ b/drivers/soc/fsl/qe/qmc.c @@ -1761,10 +1761,9 @@ static int qmc_qe_init_resources(struct qmc *qmc, struct platform_device *pdev) */ info = devm_qe_muram_alloc(qmc->dev, UCC_SLOW_PRAM_SIZE + 2 * 64, ALIGNMENT_OF_UCC_SLOW_PRAM); - if (IS_ERR_VALUE(info)) { - dev_err(qmc->dev, "cannot allocate MURAM for PRAM"); - return -ENOMEM; - } + if (info < 0) + return info; + if (!qe_issue_cmd(QE_ASSIGN_PAGE_TO_DEVICE, qmc->qe_subblock, QE_CR_PROTOCOL_UNSPECIFIED, info)) { dev_err(qmc->dev, "QE_ASSIGN_PAGE_TO_DEVICE cmd failed"); @@ -2056,7 +2055,7 @@ static void qmc_remove(struct platform_device *pdev) qmc_exit_xcc(qmc); } -static const struct qmc_data qmc_data_cpm1 = { +static const struct qmc_data qmc_data_cpm1 __maybe_unused = { .version = QMC_CPM1, .tstate = 0x30000000, .rstate = 0x31000000, @@ -2066,7 +2065,7 @@ static const struct qmc_data qmc_data_cpm1 = { .rpack = 0x00000000, }; -static const struct qmc_data qmc_data_qe = { +static const struct qmc_data qmc_data_qe __maybe_unused = { .version = QMC_QE, .tstate = 0x30000000, .rstate = 0x30000000, diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c index 7d43d92c44d4..d1ae3df069a4 100644 --- a/drivers/target/target_core_device.c +++ b/drivers/target/target_core_device.c @@ -691,7 +691,7 @@ struct se_device *target_alloc_device(struct se_hba *hba, const char *name) dev->queues = kcalloc(nr_cpu_ids, sizeof(*dev->queues), GFP_KERNEL); if (!dev->queues) { - dev->transport->free_device(dev); + hba->backend->ops->free_device(dev); return NULL; } diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index 5d37a0984916..252849910588 100644 --- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -3157,6 +3157,8 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc) mutex_unlock(&gsm->mutex); /* Now wipe the queues */ tty_ldisc_flush(gsm->tty); + + guard(spinlock_irqsave)(&gsm->tx_lock); list_for_each_entry_safe(txq, ntxq, &gsm->tx_ctrl_list, list) kfree(txq); INIT_LIST_HEAD(&gsm->tx_ctrl_list); diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c index 67d4a72eda77..90974d338f3c 100644 --- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -762,6 +762,21 @@ static irqreturn_t __imx_uart_rtsint(int irq, void *dev_id) imx_uart_writel(sport, USR1_RTSD, USR1); usr1 = imx_uart_readl(sport, USR1) & USR1_RTSS; + /* + * Update sport->old_status here, so any follow-up calls to + * imx_uart_mctrl_check() will be able to recognize that RTS + * state changed since last imx_uart_mctrl_check() call. + * + * In case RTS has been detected as asserted here and later on + * deasserted by the time imx_uart_mctrl_check() was called, + * imx_uart_mctrl_check() can detect the RTS state change and + * trigger uart_handle_cts_change() to unblock the port for + * further TX transfers. + */ + if (usr1 & USR1_RTSS) + sport->old_status |= TIOCM_CTS; + else + sport->old_status &= ~TIOCM_CTS; uart_handle_cts_change(&sport->port, usr1); wake_up_interruptible(&sport->port.state->port.delta_msr_wait); diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c index 6f0db310cf69..5dfe4e599ad6 100644 --- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -147,6 +147,7 @@ static struct uart_driver qcom_geni_uart_driver; static void __qcom_geni_serial_cancel_tx_cmd(struct uart_port *uport); static void qcom_geni_serial_cancel_tx_cmd(struct uart_port *uport); +static int qcom_geni_serial_port_setup(struct uart_port *uport); static inline struct qcom_geni_serial_port *to_dev_port(struct uart_port *uport) { @@ -395,6 +396,23 @@ static void qcom_geni_serial_poll_put_char(struct uart_port *uport, writel(c, uport->membase + SE_GENI_TX_FIFOn); qcom_geni_serial_poll_tx_done(uport); } + +static int qcom_geni_serial_poll_init(struct uart_port *uport) +{ + struct qcom_geni_serial_port *port = to_dev_port(uport); + int ret; + + if (!port->setup) { + ret = qcom_geni_serial_port_setup(uport); + if (ret) + return ret; + } + + if (!qcom_geni_serial_secondary_active(uport)) + geni_se_setup_s_cmd(&port->se, UART_START_READ, 0); + + return 0; +} #endif #ifdef CONFIG_SERIAL_QCOM_GENI_CONSOLE @@ -562,7 +580,7 @@ static void handle_rx_console(struct uart_port *uport, u32 bytes, bool drop) } #endif /* CONFIG_SERIAL_QCOM_GENI_CONSOLE */ -static void handle_rx_uart(struct uart_port *uport, u32 bytes, bool drop) +static void handle_rx_uart(struct uart_port *uport, u32 bytes) { struct qcom_geni_serial_port *port = to_dev_port(uport); struct tty_port *tport = &uport->state->port; @@ -570,9 +588,8 @@ static void handle_rx_uart(struct uart_port *uport, u32 bytes, bool drop) ret = tty_insert_flip_string(tport, port->rx_buf, bytes); if (ret != bytes) { - dev_err(uport->dev, "%s:Unable to push data ret %d_bytes %d\n", - __func__, ret, bytes); - WARN_ON_ONCE(1); + dev_err_ratelimited(uport->dev, "failed to push data (%d < %u)\n", + ret, bytes); } uport->icount.rx += ret; tty_flip_buffer_push(tport); @@ -787,17 +804,27 @@ static void qcom_geni_serial_start_rx_fifo(struct uart_port *uport) static void qcom_geni_serial_stop_rx_dma(struct uart_port *uport) { struct qcom_geni_serial_port *port = to_dev_port(uport); + bool done; if (!qcom_geni_serial_secondary_active(uport)) return; geni_se_cancel_s_cmd(&port->se); - qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS, - S_CMD_CANCEL_EN, true); - - if (qcom_geni_serial_secondary_active(uport)) + done = qcom_geni_serial_poll_bit(uport, SE_DMA_RX_IRQ_STAT, + RX_EOT, true); + if (done) { + writel(RX_EOT | RX_DMA_DONE, + uport->membase + SE_DMA_RX_IRQ_CLR); + } else { qcom_geni_serial_abort_rx(uport); + writel(1, uport->membase + SE_DMA_RX_FSM_RST); + qcom_geni_serial_poll_bit(uport, SE_DMA_RX_IRQ_STAT, + RX_RESET_DONE, true); + writel(RX_RESET_DONE | RX_DMA_DONE, + uport->membase + SE_DMA_RX_IRQ_CLR); + } + if (port->rx_dma_addr) { geni_se_rx_dma_unprep(&port->se, port->rx_dma_addr, DMA_RX_BUF_SIZE); @@ -846,7 +873,7 @@ static void qcom_geni_serial_handle_rx_dma(struct uart_port *uport, bool drop) } if (!drop) - handle_rx_uart(uport, rx_in, drop); + handle_rx_uart(uport, rx_in); ret = geni_se_rx_dma_prep(&port->se, port->rx_buf, DMA_RX_BUF_SIZE, @@ -1096,10 +1123,12 @@ static void qcom_geni_serial_shutdown(struct uart_port *uport) { disable_irq(uport->irq); + uart_port_lock_irq(uport); qcom_geni_serial_stop_tx(uport); qcom_geni_serial_stop_rx(uport); qcom_geni_serial_cancel_tx_cmd(uport); + uart_port_unlock_irq(uport); } static void qcom_geni_serial_flush_buffer(struct uart_port *uport) @@ -1152,7 +1181,6 @@ static int qcom_geni_serial_port_setup(struct uart_port *uport) false, true, true); geni_se_init(&port->se, UART_RX_WM, port->rx_fifo_depth - 2); geni_se_select_mode(&port->se, port->dev_data->mode); - qcom_geni_serial_start_rx(uport); port->setup = true; return 0; @@ -1168,6 +1196,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) if (ret) return ret; } + + uart_port_lock_irq(uport); + qcom_geni_serial_start_rx(uport); + uart_port_unlock_irq(uport); + enable_irq(uport->irq); return 0; @@ -1253,7 +1286,6 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport, unsigned int avg_bw_core; unsigned long timeout; - qcom_geni_serial_stop_rx(uport); /* baud rate */ baud = uart_get_baud_rate(uport, termios, old, 300, 4000000); @@ -1269,7 +1301,7 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport, dev_err(port->se.dev, "Couldn't find suitable clock rate for %u\n", baud * sampling_rate); - goto out_restart_rx; + return; } dev_dbg(port->se.dev, "desired_rate = %u, clk_rate = %lu, clk_div = %u\n", @@ -1360,8 +1392,6 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport, writel(stop_bit_len, uport->membase + SE_UART_TX_STOP_BIT_LEN); writel(ser_clk_cfg, uport->membase + GENI_SER_M_CLK_CFG); writel(ser_clk_cfg, uport->membase + GENI_SER_S_CLK_CFG); -out_restart_rx: - qcom_geni_serial_start_rx(uport); } #ifdef CONFIG_SERIAL_QCOM_GENI_CONSOLE @@ -1582,7 +1612,7 @@ static const struct uart_ops qcom_geni_console_pops = { #ifdef CONFIG_CONSOLE_POLL .poll_get_char = qcom_geni_serial_get_char, .poll_put_char = qcom_geni_serial_poll_put_char, - .poll_init = qcom_geni_serial_port_setup, + .poll_init = qcom_geni_serial_poll_init, #endif .pm = qcom_geni_serial_pm, }; @@ -1749,7 +1779,7 @@ static void qcom_geni_serial_remove(struct platform_device *pdev) uart_remove_one_port(drv, &port->uport); } -static int qcom_geni_serial_sys_suspend(struct device *dev) +static int qcom_geni_serial_suspend(struct device *dev) { struct qcom_geni_serial_port *port = dev_get_drvdata(dev); struct uart_port *uport = &port->uport; @@ -1766,7 +1796,7 @@ static int qcom_geni_serial_sys_suspend(struct device *dev) return uart_suspend_port(private_data->drv, uport); } -static int qcom_geni_serial_sys_resume(struct device *dev) +static int qcom_geni_serial_resume(struct device *dev) { int ret; struct qcom_geni_serial_port *port = dev_get_drvdata(dev); @@ -1781,38 +1811,6 @@ static int qcom_geni_serial_sys_resume(struct device *dev) return ret; } -static int qcom_geni_serial_sys_hib_resume(struct device *dev) -{ - int ret = 0; - struct uart_port *uport; - struct qcom_geni_private_data *private_data; - struct qcom_geni_serial_port *port = dev_get_drvdata(dev); - - uport = &port->uport; - private_data = uport->private_data; - - if (uart_console(uport)) { - geni_icc_set_tag(&port->se, QCOM_ICC_TAG_ALWAYS); - geni_icc_set_bw(&port->se); - ret = uart_resume_port(private_data->drv, uport); - /* - * For hibernation usecase clients for - * console UART won't call port setup during restore, - * hence call port setup for console uart. - */ - qcom_geni_serial_port_setup(uport); - } else { - /* - * Peripheral register settings are lost during hibernation. - * Update setup flag such that port setup happens again - * during next session. Clients of HS-UART will close and - * open the port during hibernation. - */ - port->setup = false; - } - return ret; -} - static const struct qcom_geni_device_data qcom_geni_console_data = { .console = true, .mode = GENI_SE_FIFO, @@ -1824,12 +1822,7 @@ static const struct qcom_geni_device_data qcom_geni_uart_data = { }; static const struct dev_pm_ops qcom_geni_serial_pm_ops = { - .suspend = pm_sleep_ptr(qcom_geni_serial_sys_suspend), - .resume = pm_sleep_ptr(qcom_geni_serial_sys_resume), - .freeze = pm_sleep_ptr(qcom_geni_serial_sys_suspend), - .poweroff = pm_sleep_ptr(qcom_geni_serial_sys_suspend), - .restore = pm_sleep_ptr(qcom_geni_serial_sys_hib_resume), - .thaw = pm_sleep_ptr(qcom_geni_serial_sys_hib_resume), + SYSTEM_SLEEP_PM_OPS(qcom_geni_serial_suspend, qcom_geni_serial_resume) }; static const struct of_device_id qcom_geni_serial_match_table[] = { diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index cd87e3d1291e..96842ce817af 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -4726,7 +4726,7 @@ static int con_font_get(struct vc_data *vc, struct console_font_op *op) return -EINVAL; if (op->data) { - font.data = kvmalloc(max_font_size, GFP_KERNEL); + font.data = kvzalloc(max_font_size, GFP_KERNEL); if (!font.data) return -ENOMEM; } else diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c index 8402151330fe..dba935c712d6 100644 --- a/drivers/ufs/core/ufs-mcq.c +++ b/drivers/ufs/core/ufs-mcq.c @@ -539,7 +539,7 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag) struct scsi_cmnd *cmd = lrbp->cmd; struct ufs_hw_queue *hwq; void __iomem *reg, *opr_sqd_base; - u32 nexus, id, val; + u32 nexus, id, val, rtc; int err; if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC) @@ -569,17 +569,18 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag) opr_sqd_base = mcq_opr_base(hba, OPR_SQD, id); writel(nexus, opr_sqd_base + REG_SQCTI); - /* SQRTCy.ICU = 1 */ - writel(SQ_ICU, opr_sqd_base + REG_SQRTC); + /* Initiate Cleanup */ + writel(readl(opr_sqd_base + REG_SQRTC) | SQ_ICU, + opr_sqd_base + REG_SQRTC); /* Poll SQRTSy.CUS = 1. Return result from SQRTSy.RTC */ reg = opr_sqd_base + REG_SQRTS; err = read_poll_timeout(readl, val, val & SQ_CUS, 20, MCQ_POLL_US, false, reg); - if (err) - dev_err(hba->dev, "%s: failed. hwq=%d, tag=%d err=%ld\n", - __func__, id, task_tag, - FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg))); + rtc = FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg)); + if (err || rtc) + dev_err(hba->dev, "%s: failed. hwq=%d, tag=%d err=%d RTC=%d\n", + __func__, id, task_tag, err, rtc); if (ufshcd_mcq_sq_start(hba, hwq)) err = -ETIMEDOUT; diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index f16e3e638372..a63dcf48e59d 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -5416,10 +5416,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp, } break; case OCS_ABORTED: - result |= DID_ABORT << 16; - break; case OCS_INVALID_COMMAND_STATUS: result |= DID_REQUEUE << 16; + dev_warn(hba->dev, + "OCS %s from controller for tag %d\n", + (ocs == OCS_ABORTED ? "aborted" : "invalid"), + lrbp->task_tag); break; case OCS_INVALID_CMD_TABLE_ATTR: case OCS_INVALID_PRDT_ATTR: @@ -6465,26 +6467,12 @@ static bool ufshcd_abort_one(struct request *rq, void *priv) struct scsi_device *sdev = cmd->device; struct Scsi_Host *shost = sdev->host; struct ufs_hba *hba = shost_priv(shost); - struct ufshcd_lrb *lrbp = &hba->lrb[tag]; - struct ufs_hw_queue *hwq; - unsigned long flags; *ret = ufshcd_try_to_abort_task(hba, tag); dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag, hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1, *ret ? "failed" : "succeeded"); - /* Release cmd in MCQ mode if abort succeeds */ - if (hba->mcq_enabled && (*ret == 0)) { - hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp->cmd)); - if (!hwq) - return 0; - spin_lock_irqsave(&hwq->cq_lock, flags); - if (ufshcd_cmd_inflight(lrbp->cmd)) - ufshcd_release_scsi_cmd(hba, lrbp); - spin_unlock_irqrestore(&hwq->cq_lock, flags); - } - return *ret == 0; } @@ -10209,7 +10197,9 @@ static void ufshcd_wl_shutdown(struct device *dev) shost_for_each_device(sdev, hba->host) { if (sdev == hba->ufs_device_wlun) continue; - scsi_device_quiesce(sdev); + mutex_lock(&sdev->state_mutex); + scsi_device_set_state(sdev, SDEV_OFFLINE); + mutex_unlock(&sdev->state_mutex); } __ufshcd_wl_suspend(hba, UFS_SHUTDOWN_PM); diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index 21740e2b8f07..427e5660f87c 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -2342,6 +2342,11 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg) u32 reg; int i; + dwc->susphy_state = (dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)) & + DWC3_GUSB2PHYCFG_SUSPHY) || + (dwc3_readl(dwc->regs, DWC3_GUSB3PIPECTL(0)) & + DWC3_GUSB3PIPECTL_SUSPHY); + switch (dwc->current_dr_role) { case DWC3_GCTL_PRTCAP_DEVICE: if (pm_runtime_suspended(dwc->dev)) @@ -2393,6 +2398,15 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg) break; } + if (!PMSG_IS_AUTO(msg)) { + /* + * TI AM62 platform requires SUSPHY to be + * enabled for system suspend to work. + */ + if (!dwc->susphy_state) + dwc3_enable_susphy(dwc, true); + } + return 0; } @@ -2460,6 +2474,11 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg) break; } + if (!PMSG_IS_AUTO(msg)) { + /* restore SUSPHY state to that before system suspend. */ + dwc3_enable_susphy(dwc, dwc->susphy_state); + } + return 0; } diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h index 9c508e0c5cdf..eab81dfdcc35 100644 --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -1150,6 +1150,8 @@ struct dwc3_scratchpad_array { * @sys_wakeup: set if the device may do system wakeup. * @wakeup_configured: set if the device is configured for remote wakeup. * @suspended: set to track suspend event due to U3/L2. + * @susphy_state: state of DWC3_GUSB2PHYCFG_SUSPHY + DWC3_GUSB3PIPECTL_SUSPHY + * before PM suspend. * @imod_interval: set the interrupt moderation interval in 250ns * increments or 0 to disable. * @max_cfg_eps: current max number of IN eps used across all USB configs. @@ -1382,6 +1384,7 @@ struct dwc3 { unsigned sys_wakeup:1; unsigned wakeup_configured:1; unsigned suspended:1; + unsigned susphy_state:1; u16 imod_interval; diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 10178e5eda5a..4959c26d3b71 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -438,6 +438,10 @@ skip_status: dwc3_gadget_ep_get_transfer_index(dep); } + if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_ENDTRANSFER && + !(cmd & DWC3_DEPCMD_CMDIOC)) + mdelay(1); + if (saved_config) { reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0)); reg |= saved_config; @@ -1715,12 +1719,10 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int WARN_ON_ONCE(ret); dep->resource_index = 0; - if (!interrupt) { - mdelay(1); + if (!interrupt) dep->flags &= ~DWC3_EP_TRANSFER_STARTED; - } else if (!ret) { + else if (!ret) dep->flags |= DWC3_EP_END_TRANSFER_PENDING; - } dep->flags &= ~DWC3_EP_DELAY_STOP; return ret; diff --git a/drivers/usb/gadget/function/f_uac2.c b/drivers/usb/gadget/function/f_uac2.c index 1cdda44455b3..ce5b77f89190 100644 --- a/drivers/usb/gadget/function/f_uac2.c +++ b/drivers/usb/gadget/function/f_uac2.c @@ -2061,7 +2061,7 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \ const char *page, size_t len) \ { \ struct f_uac2_opts *opts = to_f_uac2_opts(item); \ - int ret = 0; \ + int ret = len; \ \ mutex_lock(&opts->lock); \ if (opts->refcnt) { \ @@ -2072,8 +2072,8 @@ static ssize_t f_uac2_opts_##name##_store(struct config_item *item, \ if (len && page[len - 1] == '\n') \ len--; \ \ - ret = scnprintf(opts->name, min(sizeof(opts->name), len + 1), \ - "%s", page); \ + scnprintf(opts->name, min(sizeof(opts->name), len + 1), \ + "%s", page); \ \ end: \ mutex_unlock(&opts->lock); \ diff --git a/drivers/usb/gadget/udc/dummy_hcd.c b/drivers/usb/gadget/udc/dummy_hcd.c index 8820d9924448..081ac7683c0b 100644 --- a/drivers/usb/gadget/udc/dummy_hcd.c +++ b/drivers/usb/gadget/udc/dummy_hcd.c @@ -254,6 +254,7 @@ struct dummy_hcd { u32 stream_en_ep; u8 num_stream[30 / 2]; + unsigned timer_pending:1; unsigned active:1; unsigned old_active:1; unsigned resuming:1; @@ -1303,9 +1304,11 @@ static int dummy_urb_enqueue( urb->error_count = 1; /* mark as a new urb */ /* kick the scheduler, it'll do the rest */ - if (!hrtimer_active(&dum_hcd->timer)) + if (!dum_hcd->timer_pending) { + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL_SOFT); + } done: spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); @@ -1324,9 +1327,10 @@ static int dummy_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) spin_lock_irqsave(&dum_hcd->dum->lock, flags); rc = usb_hcd_check_unlink_urb(hcd, urb, status); - if (!rc && dum_hcd->rh_state != DUMMY_RH_RUNNING && - !list_empty(&dum_hcd->urbp_list)) + if (rc == 0 && !dum_hcd->timer_pending) { + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL_SOFT); + } spin_unlock_irqrestore(&dum_hcd->dum->lock, flags); return rc; @@ -1813,6 +1817,7 @@ static enum hrtimer_restart dummy_timer(struct hrtimer *t) /* look at each urb queued by the host side driver */ spin_lock_irqsave(&dum->lock, flags); + dum_hcd->timer_pending = 0; if (!dum_hcd->udev) { dev_err(dummy_dev(dum_hcd), @@ -1994,8 +1999,10 @@ return_urb: if (list_empty(&dum_hcd->urbp_list)) { usb_put_dev(dum_hcd->udev); dum_hcd->udev = NULL; - } else if (dum_hcd->rh_state == DUMMY_RH_RUNNING) { + } else if (!dum_hcd->timer_pending && + dum_hcd->rh_state == DUMMY_RH_RUNNING) { /* want a 1 msec delay here */ + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(DUMMY_TIMER_INT_NSECS), HRTIMER_MODE_REL_SOFT); } @@ -2390,8 +2397,10 @@ static int dummy_bus_resume(struct usb_hcd *hcd) } else { dum_hcd->rh_state = DUMMY_RH_RUNNING; set_link_state(dum_hcd); - if (!list_empty(&dum_hcd->urbp_list)) + if (!list_empty(&dum_hcd->urbp_list)) { + dum_hcd->timer_pending = 1; hrtimer_start(&dum_hcd->timer, ns_to_ktime(0), HRTIMER_MODE_REL_SOFT); + } hcd->state = HC_STATE_RUNNING; } spin_unlock_irq(&dum_hcd->dum->lock); @@ -2522,6 +2531,7 @@ static void dummy_stop(struct usb_hcd *hcd) struct dummy_hcd *dum_hcd = hcd_to_dummy_hcd(hcd); hrtimer_cancel(&dum_hcd->timer); + dum_hcd->timer_pending = 0; device_remove_file(dummy_dev(dum_hcd), &dev_attr_urbs); dev_info(dummy_dev(dum_hcd), "stopped\n"); } diff --git a/drivers/usb/host/xhci-dbgcap.h b/drivers/usb/host/xhci-dbgcap.h index 8ec813b6e9fd..9dc8f4d8077c 100644 --- a/drivers/usb/host/xhci-dbgcap.h +++ b/drivers/usb/host/xhci-dbgcap.h @@ -110,6 +110,7 @@ struct dbc_port { struct tasklet_struct push; struct list_head write_pool; + unsigned int tx_boundary; bool registered; }; diff --git a/drivers/usb/host/xhci-dbgtty.c b/drivers/usb/host/xhci-dbgtty.c index b8e78867e25a..d719c16ea30b 100644 --- a/drivers/usb/host/xhci-dbgtty.c +++ b/drivers/usb/host/xhci-dbgtty.c @@ -24,6 +24,29 @@ static inline struct dbc_port *dbc_to_port(struct xhci_dbc *dbc) return dbc->priv; } +static unsigned int +dbc_kfifo_to_req(struct dbc_port *port, char *packet) +{ + unsigned int len; + + len = kfifo_len(&port->port.xmit_fifo); + + if (len == 0) + return 0; + + len = min(len, DBC_MAX_PACKET); + + if (port->tx_boundary) + len = min(port->tx_boundary, len); + + len = kfifo_out(&port->port.xmit_fifo, packet, len); + + if (port->tx_boundary) + port->tx_boundary -= len; + + return len; +} + static int dbc_start_tx(struct dbc_port *port) __releases(&port->port_lock) __acquires(&port->port_lock) @@ -36,7 +59,7 @@ static int dbc_start_tx(struct dbc_port *port) while (!list_empty(pool)) { req = list_entry(pool->next, struct dbc_request, list_pool); - len = kfifo_out(&port->port.xmit_fifo, req->buf, DBC_MAX_PACKET); + len = dbc_kfifo_to_req(port, req->buf); if (len == 0) break; do_tty_wake = true; @@ -200,14 +223,32 @@ static ssize_t dbc_tty_write(struct tty_struct *tty, const u8 *buf, { struct dbc_port *port = tty->driver_data; unsigned long flags; + unsigned int written = 0; spin_lock_irqsave(&port->port_lock, flags); - if (count) - count = kfifo_in(&port->port.xmit_fifo, buf, count); - dbc_start_tx(port); + + /* + * Treat tty write as one usb transfer. Make sure the writes are turned + * into TRB request having the same size boundaries as the tty writes. + * Don't add data to kfifo before previous write is turned into TRBs + */ + if (port->tx_boundary) { + spin_unlock_irqrestore(&port->port_lock, flags); + return 0; + } + + if (count) { + written = kfifo_in(&port->port.xmit_fifo, buf, count); + + if (written == count) + port->tx_boundary = kfifo_len(&port->port.xmit_fifo); + + dbc_start_tx(port); + } + spin_unlock_irqrestore(&port->port_lock, flags); - return count; + return written; } static int dbc_tty_put_char(struct tty_struct *tty, u8 ch) @@ -241,6 +282,10 @@ static unsigned int dbc_tty_write_room(struct tty_struct *tty) spin_lock_irqsave(&port->port_lock, flags); room = kfifo_avail(&port->port.xmit_fifo); + + if (port->tx_boundary) + room = 0; + spin_unlock_irqrestore(&port->port_lock, flags); return room; diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 4d664ba53fe9..b6eb928e260f 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1023,7 +1023,7 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep) td_to_noop(xhci, ring, cached_td, false); cached_td->cancel_status = TD_CLEARED; } - + td_to_noop(xhci, ring, td, false); td->cancel_status = TD_CLEARING_CACHE; cached_td = td; break; @@ -2775,6 +2775,29 @@ static int handle_tx_event(struct xhci_hcd *xhci, return 0; } + /* + * xhci 4.10.2 states isoc endpoints should continue + * processing the next TD if there was an error mid TD. + * So host like NEC don't generate an event for the last + * isoc TRB even if the IOC flag is set. + * xhci 4.9.1 states that if there are errors in mult-TRB + * TDs xHC should generate an error for that TRB, and if xHC + * proceeds to the next TD it should genete an event for + * any TRB with IOC flag on the way. Other host follow this. + * + * We wait for the final IOC event, but if we get an event + * anywhere outside this TD, just give it back already. + */ + td = list_first_entry_or_null(&ep_ring->td_list, struct xhci_td, td_list); + + if (td && td->error_mid_td && !trb_in_td(xhci, td, ep_trb_dma, false)) { + xhci_dbg(xhci, "Missing TD completion event after mid TD error\n"); + ep_ring->dequeue = td->last_trb; + ep_ring->deq_seg = td->last_trb_seg; + inc_deq(xhci, ep_ring); + xhci_td_cleanup(xhci, td, ep_ring, td->status); + } + if (list_empty(&ep_ring->td_list)) { /* * Don't print wanings if ring is empty due to a stopped endpoint generating an @@ -2836,44 +2859,13 @@ static int handle_tx_event(struct xhci_hcd *xhci, return 0; } - /* - * xhci 4.10.2 states isoc endpoints should continue - * processing the next TD if there was an error mid TD. - * So host like NEC don't generate an event for the last - * isoc TRB even if the IOC flag is set. - * xhci 4.9.1 states that if there are errors in mult-TRB - * TDs xHC should generate an error for that TRB, and if xHC - * proceeds to the next TD it should genete an event for - * any TRB with IOC flag on the way. Other host follow this. - * So this event might be for the next TD. - */ - if (td->error_mid_td && - !list_is_last(&td->td_list, &ep_ring->td_list)) { - struct xhci_td *td_next = list_next_entry(td, td_list); - - ep_seg = trb_in_td(xhci, td_next, ep_trb_dma, false); - if (ep_seg) { - /* give back previous TD, start handling new */ - xhci_dbg(xhci, "Missing TD completion event after mid TD error\n"); - ep_ring->dequeue = td->last_trb; - ep_ring->deq_seg = td->last_trb_seg; - inc_deq(xhci, ep_ring); - xhci_td_cleanup(xhci, td, ep_ring, td->status); - td = td_next; - } - } - - if (!ep_seg) { - /* HC is busted, give up! */ - xhci_err(xhci, - "ERROR Transfer event TRB DMA ptr not " - "part of current TD ep_index %d " - "comp_code %u\n", ep_index, - trb_comp_code); - trb_in_td(xhci, td, ep_trb_dma, true); + /* HC is busted, give up! */ + xhci_err(xhci, + "ERROR Transfer event TRB DMA ptr not part of current TD ep_index %d comp_code %u\n", + ep_index, trb_comp_code); + trb_in_td(xhci, td, ep_trb_dma, true); - return -ESHUTDOWN; - } + return -ESHUTDOWN; } if (ep->skip) { diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c index 6246d5ad1468..76f228e7443c 100644 --- a/drivers/usb/host/xhci-tegra.c +++ b/drivers/usb/host/xhci-tegra.c @@ -2183,7 +2183,7 @@ static int tegra_xusb_enter_elpg(struct tegra_xusb *tegra, bool runtime) goto out; } - for (i = 0; i < tegra->num_usb_phys; i++) { + for (i = 0; i < xhci->usb2_rhub.num_ports; i++) { if (!xhci->usb2_rhub.ports[i]) continue; portsc = readl(xhci->usb2_rhub.ports[i]->addr); diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 620502de971a..f0fb696d5619 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1001,7 +1001,7 @@ enum xhci_setup_dev { /* Set TR Dequeue Pointer command TRB fields, 6.4.3.9 */ #define TRB_TO_STREAM_ID(p) ((((p) & (0xffff << 16)) >> 16)) #define STREAM_ID_FOR_TRB(p) ((((p)) & 0xffff) << 16) -#define SCT_FOR_TRB(p) (((p) << 1) & 0x7) +#define SCT_FOR_TRB(p) (((p) & 0x7) << 1) /* Link TRB specific fields */ #define TRB_TC (1<<1) diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index eb0731992ca9..4f18f189f309 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -279,6 +279,7 @@ static void option_instat_callback(struct urb *urb); #define QUECTEL_PRODUCT_EG912Y 0x6001 #define QUECTEL_PRODUCT_EC200S_CN 0x6002 #define QUECTEL_PRODUCT_EC200A 0x6005 +#define QUECTEL_PRODUCT_EG916Q 0x6007 #define QUECTEL_PRODUCT_EM061K_LWW 0x6008 #define QUECTEL_PRODUCT_EM061K_LCN 0x6009 #define QUECTEL_PRODUCT_EC200T 0x6026 @@ -1270,6 +1271,7 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200S_CN, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EG912Y, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EG916Q, 0xff, 0x00, 0x00) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500K, 0xff, 0x00, 0x00) }, { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) }, @@ -1380,10 +1382,16 @@ static const struct usb_device_id option_ids[] = { .driver_info = NCTRL(0) | RSVD(1) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a0, 0xff), /* Telit FN20C04 (rmnet) */ .driver_info = RSVD(0) | NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a2, 0xff), /* Telit FN920C04 (MBIM) */ + .driver_info = NCTRL(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a4, 0xff), /* Telit FN20C04 (rmnet) */ .driver_info = RSVD(0) | NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a7, 0xff), /* Telit FN920C04 (MBIM) */ + .driver_info = NCTRL(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a9, 0xff), /* Telit FN20C04 (rmnet) */ .driver_info = RSVD(0) | NCTRL(2) | RSVD(3) | RSVD(4) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10aa, 0xff), /* Telit FN920C04 (MBIM) */ + .driver_info = NCTRL(3) | RSVD(4) | RSVD(5) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910), .driver_info = NCTRL(0) | RSVD(1) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM), diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c index 9262fcd4144f..d61b4c74648d 100644 --- a/drivers/usb/typec/class.c +++ b/drivers/usb/typec/class.c @@ -519,6 +519,7 @@ static void typec_altmode_release(struct device *dev) typec_altmode_put_partner(alt); altmode_id_remove(alt->adev.dev.parent, alt->id); + put_device(alt->adev.dev.parent); kfree(alt); } @@ -568,6 +569,8 @@ typec_register_altmode(struct device *parent, alt->adev.dev.type = &typec_altmode_dev_type; dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id); + get_device(alt->adev.dev.parent); + /* Link partners and plugs with the ports */ if (!is_port) typec_altmode_set_partner(alt); diff --git a/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec_port.c b/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec_port.c index a747baa29784..c37dede62e12 100644 --- a/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec_port.c +++ b/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec_port.c @@ -432,7 +432,6 @@ static int qcom_pmic_typec_port_get_cc(struct tcpc_dev *tcpc, val = TYPEC_CC_RP_DEF; break; } - val = TYPEC_CC_RP_DEF; } if (misc & CC_ORIENTATION) diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig index 72ddee4c1544..f7d6f47971fd 100644 --- a/drivers/xen/Kconfig +++ b/drivers/xen/Kconfig @@ -261,7 +261,6 @@ config XEN_SCSI_BACKEND config XEN_PRIVCMD tristate "Xen hypercall passthrough driver" depends on XEN - imply XEN_PCIDEV_BACKEND default m help The hypercall passthrough driver allows privileged user programs to diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c index 9e2096524fbc..d2ee605c5ca1 100644 --- a/drivers/xen/acpi.c +++ b/drivers/xen/acpi.c @@ -125,3 +125,27 @@ int xen_acpi_get_gsi_info(struct pci_dev *dev, return 0; } EXPORT_SYMBOL_GPL(xen_acpi_get_gsi_info); + +static get_gsi_from_sbdf_t get_gsi_from_sbdf; +static DEFINE_RWLOCK(get_gsi_from_sbdf_lock); + +void xen_acpi_register_get_gsi_func(get_gsi_from_sbdf_t func) +{ + write_lock(&get_gsi_from_sbdf_lock); + get_gsi_from_sbdf = func; + write_unlock(&get_gsi_from_sbdf_lock); +} +EXPORT_SYMBOL_GPL(xen_acpi_register_get_gsi_func); + +int xen_acpi_get_gsi_from_sbdf(u32 sbdf) +{ + int ret = -EOPNOTSUPP; + + read_lock(&get_gsi_from_sbdf_lock); + if (get_gsi_from_sbdf) + ret = get_gsi_from_sbdf(sbdf); + read_unlock(&get_gsi_from_sbdf_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(xen_acpi_get_gsi_from_sbdf); diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c index 3273cb8c2a66..4f75bc876454 100644 --- a/drivers/xen/privcmd.c +++ b/drivers/xen/privcmd.c @@ -850,15 +850,13 @@ out: static long privcmd_ioctl_pcidev_get_gsi(struct file *file, void __user *udata) { #if defined(CONFIG_XEN_ACPI) - int rc = -EINVAL; + int rc; struct privcmd_pcidev_get_gsi kdata; if (copy_from_user(&kdata, udata, sizeof(kdata))) return -EFAULT; - if (IS_REACHABLE(CONFIG_XEN_PCIDEV_BACKEND)) - rc = pcistub_get_gsi_from_sbdf(kdata.sbdf); - + rc = xen_acpi_get_gsi_from_sbdf(kdata.sbdf); if (rc < 0) return rc; diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c index 2f3da5ac62cd..b616b7768c3b 100644 --- a/drivers/xen/xen-pciback/pci_stub.c +++ b/drivers/xen/xen-pciback/pci_stub.c @@ -227,7 +227,7 @@ static struct pci_dev *pcistub_device_get_pci_dev(struct xen_pcibk_device *pdev, } #ifdef CONFIG_XEN_ACPI -int pcistub_get_gsi_from_sbdf(unsigned int sbdf) +static int pcistub_get_gsi_from_sbdf(unsigned int sbdf) { struct pcistub_device *psdev; int domain = (sbdf >> 16) & 0xffff; @@ -242,7 +242,6 @@ int pcistub_get_gsi_from_sbdf(unsigned int sbdf) return psdev->gsi; } -EXPORT_SYMBOL_GPL(pcistub_get_gsi_from_sbdf); #endif struct pci_dev *pcistub_get_pci_dev_by_slot(struct xen_pcibk_device *pdev, @@ -1757,11 +1756,19 @@ static int __init xen_pcibk_init(void) bus_register_notifier(&pci_bus_type, &pci_stub_nb); #endif +#ifdef CONFIG_XEN_ACPI + xen_acpi_register_get_gsi_func(pcistub_get_gsi_from_sbdf); +#endif + return err; } static void __exit xen_pcibk_cleanup(void) { +#ifdef CONFIG_XEN_ACPI + xen_acpi_register_get_gsi_func(NULL); +#endif + #ifdef CONFIG_PCI_IOV bus_unregister_notifier(&pci_bus_type, &pci_stub_nb); #endif diff --git a/fs/9p/fid.c b/fs/9p/fid.c index de009a33e0e2..f84412290a30 100644 --- a/fs/9p/fid.c +++ b/fs/9p/fid.c @@ -131,10 +131,9 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any) } } spin_unlock(&dentry->d_lock); - } else { - if (dentry->d_inode) - ret = v9fs_fid_find_inode(dentry->d_inode, false, uid, any); } + if (!ret && dentry->d_inode) + ret = v9fs_fid_find_inode(dentry->d_inode, false, uid, any); return ret; } diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 6e1d3c4daf72..52aab09a32a9 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -130,6 +130,7 @@ struct afs_call { wait_queue_head_t waitq; /* processes awaiting completion */ struct work_struct async_work; /* async I/O processor */ struct work_struct work; /* actual work processor */ + struct work_struct free_work; /* Deferred free processor */ struct rxrpc_call *rxcall; /* RxRPC call handle */ struct rxrpc_peer *peer; /* Remote endpoint */ struct key *key; /* security for this call */ @@ -1331,6 +1332,7 @@ extern int __net_init afs_open_socket(struct afs_net *); extern void __net_exit afs_close_socket(struct afs_net *); extern void afs_charge_preallocation(struct work_struct *); extern void afs_put_call(struct afs_call *); +void afs_deferred_put_call(struct afs_call *call); void afs_make_call(struct afs_call *call, gfp_t gfp); void afs_wait_for_call_to_complete(struct afs_call *call); extern struct afs_call *afs_alloc_flat_call(struct afs_net *, diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index c453428f3c8b..9f2a3bb56ec6 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -18,6 +18,7 @@ struct workqueue_struct *afs_async_calls; +static void afs_deferred_free_worker(struct work_struct *work); static void afs_wake_up_call_waiter(struct sock *, struct rxrpc_call *, unsigned long); static void afs_wake_up_async_call(struct sock *, struct rxrpc_call *, unsigned long); static void afs_process_async_call(struct work_struct *); @@ -149,6 +150,7 @@ static struct afs_call *afs_alloc_call(struct afs_net *net, call->debug_id = atomic_inc_return(&rxrpc_debug_id); refcount_set(&call->ref, 1); INIT_WORK(&call->async_work, afs_process_async_call); + INIT_WORK(&call->free_work, afs_deferred_free_worker); init_waitqueue_head(&call->waitq); spin_lock_init(&call->state_lock); call->iter = &call->def_iter; @@ -159,6 +161,36 @@ static struct afs_call *afs_alloc_call(struct afs_net *net, return call; } +static void afs_free_call(struct afs_call *call) +{ + struct afs_net *net = call->net; + int o; + + ASSERT(!work_pending(&call->async_work)); + + rxrpc_kernel_put_peer(call->peer); + + if (call->rxcall) { + rxrpc_kernel_shutdown_call(net->socket, call->rxcall); + rxrpc_kernel_put_call(net->socket, call->rxcall); + call->rxcall = NULL; + } + if (call->type->destructor) + call->type->destructor(call); + + afs_unuse_server_notime(call->net, call->server, afs_server_trace_put_call); + kfree(call->request); + + o = atomic_read(&net->nr_outstanding_calls); + trace_afs_call(call->debug_id, afs_call_trace_free, 0, o, + __builtin_return_address(0)); + kfree(call); + + o = atomic_dec_return(&net->nr_outstanding_calls); + if (o == 0) + wake_up_var(&net->nr_outstanding_calls); +} + /* * Dispose of a reference on a call. */ @@ -173,32 +205,34 @@ void afs_put_call(struct afs_call *call) o = atomic_read(&net->nr_outstanding_calls); trace_afs_call(debug_id, afs_call_trace_put, r - 1, o, __builtin_return_address(0)); + if (zero) + afs_free_call(call); +} - if (zero) { - ASSERT(!work_pending(&call->async_work)); - ASSERT(call->type->name != NULL); - - rxrpc_kernel_put_peer(call->peer); - - if (call->rxcall) { - rxrpc_kernel_shutdown_call(net->socket, call->rxcall); - rxrpc_kernel_put_call(net->socket, call->rxcall); - call->rxcall = NULL; - } - if (call->type->destructor) - call->type->destructor(call); +static void afs_deferred_free_worker(struct work_struct *work) +{ + struct afs_call *call = container_of(work, struct afs_call, free_work); - afs_unuse_server_notime(call->net, call->server, afs_server_trace_put_call); - kfree(call->request); + afs_free_call(call); +} - trace_afs_call(call->debug_id, afs_call_trace_free, 0, o, - __builtin_return_address(0)); - kfree(call); +/* + * Dispose of a reference on a call, deferring the cleanup to a workqueue + * to avoid lock recursion. + */ +void afs_deferred_put_call(struct afs_call *call) +{ + struct afs_net *net = call->net; + unsigned int debug_id = call->debug_id; + bool zero; + int r, o; - o = atomic_dec_return(&net->nr_outstanding_calls); - if (o == 0) - wake_up_var(&net->nr_outstanding_calls); - } + zero = __refcount_dec_and_test(&call->ref, &r); + o = atomic_read(&net->nr_outstanding_calls); + trace_afs_call(debug_id, afs_call_trace_put, r - 1, o, + __builtin_return_address(0)); + if (zero) + schedule_work(&call->free_work); } static struct afs_call *afs_get_call(struct afs_call *call, @@ -640,7 +674,8 @@ static void afs_wake_up_call_waiter(struct sock *sk, struct rxrpc_call *rxcall, } /* - * wake up an asynchronous call + * Wake up an asynchronous call. The caller is holding the call notify + * spinlock around this, so we can't call afs_put_call(). */ static void afs_wake_up_async_call(struct sock *sk, struct rxrpc_call *rxcall, unsigned long call_user_ID) @@ -657,7 +692,7 @@ static void afs_wake_up_async_call(struct sock *sk, struct rxrpc_call *rxcall, __builtin_return_address(0)); if (!queue_work(afs_async_calls, &call->async_work)) - afs_put_call(call); + afs_deferred_put_call(call); } } diff --git a/fs/bcachefs/alloc_background.c b/fs/bcachefs/alloc_background.c index 6e161f8ffe8d..c84a91572a1d 100644 --- a/fs/bcachefs/alloc_background.c +++ b/fs/bcachefs/alloc_background.c @@ -1977,7 +1977,7 @@ static void bch2_do_discards_fast_work(struct work_struct *work) ca->mi.bucket_size, GFP_KERNEL); - int ret = bch2_trans_do(c, NULL, NULL, + int ret = bch2_trans_commit_do(c, NULL, NULL, BCH_WATERMARK_btree| BCH_TRANS_COMMIT_no_enospc, bch2_clear_bucket_needs_discard(trans, POS(ca->dev_idx, bucket))); @@ -2137,14 +2137,15 @@ static void bch2_do_invalidates_work(struct work_struct *work) struct bkey_s_c k = next_lru_key(trans, &iter, ca, &wrapped); ret = bkey_err(k); - if (bch2_err_matches(ret, BCH_ERR_transaction_restart)) - continue; if (ret) - break; + goto restart_err; if (!k.k) break; ret = invalidate_one_bucket(trans, &iter, k, &nr_to_invalidate); +restart_err: + if (bch2_err_matches(ret, BCH_ERR_transaction_restart)) + continue; if (ret) break; @@ -2350,24 +2351,19 @@ int bch2_dev_remove_alloc(struct bch_fs *c, struct bch_dev *ca) /* Bucket IO clocks: */ -int bch2_bucket_io_time_reset(struct btree_trans *trans, unsigned dev, - size_t bucket_nr, int rw) +static int __bch2_bucket_io_time_reset(struct btree_trans *trans, unsigned dev, + size_t bucket_nr, int rw) { struct bch_fs *c = trans->c; - struct btree_iter iter; - struct bkey_i_alloc_v4 *a; - u64 now; - int ret = 0; - - if (bch2_trans_relock(trans)) - bch2_trans_begin(trans); - a = bch2_trans_start_alloc_update_noupdate(trans, &iter, POS(dev, bucket_nr)); - ret = PTR_ERR_OR_ZERO(a); + struct btree_iter iter; + struct bkey_i_alloc_v4 *a = + bch2_trans_start_alloc_update_noupdate(trans, &iter, POS(dev, bucket_nr)); + int ret = PTR_ERR_OR_ZERO(a); if (ret) return ret; - now = bch2_current_io_time(c, rw); + u64 now = bch2_current_io_time(c, rw); if (a->v.io_time[rw] == now) goto out; @@ -2380,6 +2376,15 @@ out: return ret; } +int bch2_bucket_io_time_reset(struct btree_trans *trans, unsigned dev, + size_t bucket_nr, int rw) +{ + if (bch2_trans_relock(trans)) + bch2_trans_begin(trans); + + return nested_lockrestart_do(trans, __bch2_bucket_io_time_reset(trans, dev, bucket_nr, rw)); +} + /* Startup/shutdown (ro/rw): */ void bch2_recalc_capacity(struct bch_fs *c) diff --git a/fs/bcachefs/alloc_foreground.c b/fs/bcachefs/alloc_foreground.c index d0e0b56892e3..5836870ab882 100644 --- a/fs/bcachefs/alloc_foreground.c +++ b/fs/bcachefs/alloc_foreground.c @@ -684,7 +684,7 @@ struct open_bucket *bch2_bucket_alloc(struct bch_fs *c, struct bch_dev *ca, struct bch_dev_usage usage; struct open_bucket *ob; - bch2_trans_do(c, NULL, NULL, 0, + bch2_trans_do(c, PTR_ERR_OR_ZERO(ob = bch2_bucket_alloc_trans(trans, ca, watermark, data_type, cl, false, &usage))); return ob; diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c index 94bbd8505582..0ca3feeb42c8 100644 --- a/fs/bcachefs/btree_gc.c +++ b/fs/bcachefs/btree_gc.c @@ -820,12 +820,22 @@ static int bch2_alloc_write_key(struct btree_trans *trans, * fix that here: */ alloc_data_type_set(&gc, gc.data_type); - if (gc.data_type != old_gc.data_type || gc.dirty_sectors != old_gc.dirty_sectors) { ret = bch2_alloc_key_to_dev_counters(trans, ca, &old_gc, &gc, BTREE_TRIGGER_gc); if (ret) return ret; + + /* + * Ugly: alloc_key_to_dev_counters(..., BTREE_TRIGGER_gc) is not + * safe w.r.t. transaction restarts, so fixup the gc_bucket so + * we don't run it twice: + */ + percpu_down_read(&c->mark_lock); + struct bucket *gc_m = gc_bucket(ca, iter->pos.offset); + gc_m->data_type = gc.data_type; + gc_m->dirty_sectors = gc.dirty_sectors; + percpu_up_read(&c->mark_lock); } if (fsck_err_on(new.data_type != gc.data_type, diff --git a/fs/bcachefs/btree_io.c b/fs/bcachefs/btree_io.c index cf933409d385..6296a11ccb09 100644 --- a/fs/bcachefs/btree_io.c +++ b/fs/bcachefs/btree_io.c @@ -1871,7 +1871,7 @@ static void btree_node_write_work(struct work_struct *work) } } else { - ret = bch2_trans_do(c, NULL, NULL, 0, + ret = bch2_trans_do(c, bch2_btree_node_update_key_get_iter(trans, b, &wbio->key, BCH_WATERMARK_interior_updates| BCH_TRANS_COMMIT_journal_reclaim| diff --git a/fs/bcachefs/btree_iter.h b/fs/bcachefs/btree_iter.h index 31a58bf46fdb..0bda054f80d7 100644 --- a/fs/bcachefs/btree_iter.h +++ b/fs/bcachefs/btree_iter.h @@ -912,6 +912,8 @@ struct bkey_s_c bch2_btree_iter_peek_and_restart_outlined(struct btree_iter *); _ret; \ }) +#define bch2_trans_do(_c, _do) bch2_trans_run(_c, lockrestart_do(trans, _do)) + struct btree_trans *__bch2_trans_get(struct bch_fs *, unsigned); void bch2_trans_put(struct btree_trans *); diff --git a/fs/bcachefs/btree_update.c b/fs/bcachefs/btree_update.c index 514df618548e..5d809e8bd170 100644 --- a/fs/bcachefs/btree_update.c +++ b/fs/bcachefs/btree_update.c @@ -668,7 +668,7 @@ int bch2_btree_insert(struct bch_fs *c, enum btree_id id, struct bkey_i *k, struct disk_reservation *disk_res, int flags, enum btree_iter_update_trigger_flags iter_flags) { - return bch2_trans_do(c, disk_res, NULL, flags, + return bch2_trans_commit_do(c, disk_res, NULL, flags, bch2_btree_insert_trans(trans, id, k, iter_flags)); } @@ -865,7 +865,7 @@ __bch2_fs_log_msg(struct bch_fs *c, unsigned commit_flags, const char *fmt, memcpy(l->d, buf.buf, buf.pos); c->journal.early_journal_entries.nr += jset_u64s(u64s); } else { - ret = bch2_trans_do(c, NULL, NULL, + ret = bch2_trans_commit_do(c, NULL, NULL, BCH_TRANS_COMMIT_lazy_rw|commit_flags, __bch2_trans_log_msg(trans, &buf, u64s)); } diff --git a/fs/bcachefs/btree_update.h b/fs/bcachefs/btree_update.h index 6a454f2fa005..70b3c989fac2 100644 --- a/fs/bcachefs/btree_update.h +++ b/fs/bcachefs/btree_update.h @@ -192,7 +192,7 @@ static inline int bch2_trans_commit(struct btree_trans *trans, nested_lockrestart_do(_trans, _do ?: bch2_trans_commit(_trans, (_disk_res),\ (_journal_seq), (_flags))) -#define bch2_trans_do(_c, _disk_res, _journal_seq, _flags, _do) \ +#define bch2_trans_commit_do(_c, _disk_res, _journal_seq, _flags, _do) \ bch2_trans_run(_c, commit_do(trans, _disk_res, _journal_seq, _flags, _do)) #define trans_for_each_update(_trans, _i) \ diff --git a/fs/bcachefs/btree_update_interior.c b/fs/bcachefs/btree_update_interior.c index 190bc1e81756..64f0928e1137 100644 --- a/fs/bcachefs/btree_update_interior.c +++ b/fs/bcachefs/btree_update_interior.c @@ -2239,10 +2239,8 @@ static void async_btree_node_rewrite_work(struct work_struct *work) struct async_btree_rewrite *a = container_of(work, struct async_btree_rewrite, work); struct bch_fs *c = a->c; - int ret; - ret = bch2_trans_do(c, NULL, NULL, 0, - async_btree_node_rewrite_trans(trans, a)); + int ret = bch2_trans_do(c, async_btree_node_rewrite_trans(trans, a)); bch_err_fn_ratelimited(c, ret); bch2_write_ref_put(c, BCH_WRITE_REF_node_rewrite); kfree(a); diff --git a/fs/bcachefs/buckets.c b/fs/bcachefs/buckets.c index 546cd01a72e3..ec7d9a59bea9 100644 --- a/fs/bcachefs/buckets.c +++ b/fs/bcachefs/buckets.c @@ -1160,11 +1160,11 @@ int bch2_trans_mark_dev_sbs(struct bch_fs *c) #define SECTORS_CACHE 1024 int __bch2_disk_reservation_add(struct bch_fs *c, struct disk_reservation *res, - u64 sectors, int flags) + u64 sectors, enum bch_reservation_flags flags) { struct bch_fs_pcpu *pcpu; u64 old, get; - s64 sectors_available; + u64 sectors_available; int ret; percpu_down_read(&c->mark_lock); @@ -1202,6 +1202,9 @@ recalculate: percpu_u64_set(&c->pcpu->sectors_available, 0); sectors_available = avail_factor(__bch2_fs_usage_read_short(c).free); + if (sectors_available && (flags & BCH_DISK_RESERVATION_PARTIAL)) + sectors = min(sectors, sectors_available); + if (sectors <= sectors_available || (flags & BCH_DISK_RESERVATION_NOFAIL)) { atomic64_set(&c->sectors_available, diff --git a/fs/bcachefs/buckets.h b/fs/bcachefs/buckets.h index e2cb7b24b220..fd5e6ccad45e 100644 --- a/fs/bcachefs/buckets.h +++ b/fs/bcachefs/buckets.h @@ -344,14 +344,16 @@ static inline void bch2_disk_reservation_put(struct bch_fs *c, } } -#define BCH_DISK_RESERVATION_NOFAIL (1 << 0) +enum bch_reservation_flags { + BCH_DISK_RESERVATION_NOFAIL = 1 << 0, + BCH_DISK_RESERVATION_PARTIAL = 1 << 1, +}; -int __bch2_disk_reservation_add(struct bch_fs *, - struct disk_reservation *, - u64, int); +int __bch2_disk_reservation_add(struct bch_fs *, struct disk_reservation *, + u64, enum bch_reservation_flags); static inline int bch2_disk_reservation_add(struct bch_fs *c, struct disk_reservation *res, - u64 sectors, int flags) + u64 sectors, enum bch_reservation_flags flags) { #ifdef __KERNEL__ u64 old, new; diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c index cbfd88f98472..2182b555c112 100644 --- a/fs/bcachefs/chardev.c +++ b/fs/bcachefs/chardev.c @@ -225,6 +225,7 @@ static long bch2_ioctl_fsck_offline(struct bch_ioctl_fsck_offline __user *user_a opt_set(thr->opts, stdio, (u64)(unsigned long)&thr->thr.stdio); opt_set(thr->opts, read_only, 1); + opt_set(thr->opts, ratelimit_errors, 0); /* We need request_key() to be called before we punt to kthread: */ opt_set(thr->opts, nostart, true); diff --git a/fs/bcachefs/darray.c b/fs/bcachefs/darray.c index 4f06cd8bbbe1..e86d36d23e9e 100644 --- a/fs/bcachefs/darray.c +++ b/fs/bcachefs/darray.c @@ -2,6 +2,7 @@ #include <linux/log2.h> #include <linux/slab.h> +#include <linux/vmalloc.h> #include "darray.h" int __bch2_darray_resize_noprof(darray_char *d, size_t element_size, size_t new_size, gfp_t gfp) @@ -9,7 +10,19 @@ int __bch2_darray_resize_noprof(darray_char *d, size_t element_size, size_t new_ if (new_size > d->size) { new_size = roundup_pow_of_two(new_size); - void *data = kvmalloc_array_noprof(new_size, element_size, gfp); + /* + * This is a workaround: kvmalloc() doesn't support > INT_MAX + * allocations, but vmalloc() does. + * The limit needs to be lifted from kvmalloc, and when it does + * we'll go back to just using that. + */ + size_t bytes; + if (unlikely(check_mul_overflow(new_size, element_size, &bytes))) + return -ENOMEM; + + void *data = likely(bytes < INT_MAX) + ? kvmalloc_noprof(bytes, gfp) + : vmalloc_noprof(bytes); if (!data) return -ENOMEM; diff --git a/fs/bcachefs/dirent.c b/fs/bcachefs/dirent.c index 84dd4a879d98..faffc98d5605 100644 --- a/fs/bcachefs/dirent.c +++ b/fs/bcachefs/dirent.c @@ -250,13 +250,6 @@ int bch2_dirent_create(struct btree_trans *trans, subvol_inum dir, return ret; } -static void dirent_copy_target(struct bkey_i_dirent *dst, - struct bkey_s_c_dirent src) -{ - dst->v.d_inum = src.v->d_inum; - dst->v.d_type = src.v->d_type; -} - int bch2_dirent_read_target(struct btree_trans *trans, subvol_inum dir, struct bkey_s_c_dirent d, subvol_inum *target) { diff --git a/fs/bcachefs/dirent.h b/fs/bcachefs/dirent.h index 8945145865c5..53ad99666022 100644 --- a/fs/bcachefs/dirent.h +++ b/fs/bcachefs/dirent.h @@ -34,6 +34,13 @@ static inline unsigned dirent_val_u64s(unsigned len) int bch2_dirent_read_target(struct btree_trans *, subvol_inum, struct bkey_s_c_dirent, subvol_inum *); +static inline void dirent_copy_target(struct bkey_i_dirent *dst, + struct bkey_s_c_dirent src) +{ + dst->v.d_inum = src.v->d_inum; + dst->v.d_type = src.v->d_type; +} + int bch2_dirent_create_snapshot(struct btree_trans *, u32, u64, u32, const struct bch_hash_info *, u8, const struct qstr *, u64, u64 *, diff --git a/fs/bcachefs/disk_accounting.c b/fs/bcachefs/disk_accounting.c index e309fb78529b..07eb8fa1b026 100644 --- a/fs/bcachefs/disk_accounting.c +++ b/fs/bcachefs/disk_accounting.c @@ -856,8 +856,10 @@ int bch2_dev_usage_init(struct bch_dev *ca, bool gc) }; u64 v[3] = { ca->mi.nbuckets - ca->mi.first_bucket, 0, 0 }; - int ret = bch2_trans_do(c, NULL, NULL, 0, - bch2_disk_accounting_mod(trans, &acc, v, ARRAY_SIZE(v), gc)); + int ret = bch2_trans_do(c, ({ + bch2_disk_accounting_mod(trans, &acc, v, ARRAY_SIZE(v), gc) ?: + (!gc ? bch2_trans_commit(trans, NULL, NULL, 0) : 0); + })); bch_err_fn(c, ret); return ret; } diff --git a/fs/bcachefs/ec.c b/fs/bcachefs/ec.c index e410cfe37b1a..a0aa5bb467d9 100644 --- a/fs/bcachefs/ec.c +++ b/fs/bcachefs/ec.c @@ -266,12 +266,12 @@ static int __mark_stripe_bucket(struct btree_trans *trans, if (!deleting) { a->stripe = s.k->p.offset; a->stripe_redundancy = s.v->nr_redundant; + alloc_data_type_set(a, data_type); } else { a->stripe = 0; a->stripe_redundancy = 0; + alloc_data_type_set(a, BCH_DATA_user); } - - alloc_data_type_set(a, data_type); err: printbuf_exit(&buf); return ret; @@ -1186,7 +1186,7 @@ static void ec_stripe_delete_work(struct work_struct *work) if (!idx) break; - int ret = bch2_trans_do(c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, + int ret = bch2_trans_commit_do(c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, ec_stripe_delete(trans, idx)); bch_err_fn(c, ret); if (ret) @@ -1519,14 +1519,14 @@ static void ec_stripe_create(struct ec_stripe_new *s) goto err; } - ret = bch2_trans_do(c, &s->res, NULL, - BCH_TRANS_COMMIT_no_check_rw| - BCH_TRANS_COMMIT_no_enospc, - ec_stripe_key_update(trans, - s->have_existing_stripe - ? bkey_i_to_stripe(&s->existing_stripe.key) - : NULL, - bkey_i_to_stripe(&s->new_stripe.key))); + ret = bch2_trans_commit_do(c, &s->res, NULL, + BCH_TRANS_COMMIT_no_check_rw| + BCH_TRANS_COMMIT_no_enospc, + ec_stripe_key_update(trans, + s->have_existing_stripe + ? bkey_i_to_stripe(&s->existing_stripe.key) + : NULL, + bkey_i_to_stripe(&s->new_stripe.key))); bch_err_msg(c, ret, "creating stripe key"); if (ret) { goto err; diff --git a/fs/bcachefs/error.c b/fs/bcachefs/error.c index 7a79f695ba2e..b679def8fb98 100644 --- a/fs/bcachefs/error.c +++ b/fs/bcachefs/error.c @@ -251,7 +251,10 @@ int __bch2_fsck_err(struct bch_fs *c, * delete the key) * - and we don't need to warn if we're not prompting */ - WARN_ON(!(flags & FSCK_AUTOFIX) && !trans && bch2_current_has_btree_trans(c)); + WARN_ON((flags & FSCK_CAN_FIX) && + !(flags & FSCK_AUTOFIX) && + !trans && + bch2_current_has_btree_trans(c)); if ((flags & FSCK_CAN_FIX) && test_bit(err, c->sb.errors_silent)) diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c index 48a1ab9a649b..95972809e76d 100644 --- a/fs/bcachefs/fs-io-buffered.c +++ b/fs/bcachefs/fs-io-buffered.c @@ -856,6 +856,12 @@ static int __bch2_buffered_write(struct bch_inode_info *inode, folios_trunc(&fs, fi); end = min(end, folio_end_pos(darray_last(fs))); } else { + if (!folio_test_uptodate(f)) { + ret = bch2_read_single_folio(f, mapping); + if (ret) + goto out; + } + folios_trunc(&fs, fi + 1); end = f_pos + f_reserved; } diff --git a/fs/bcachefs/fs-io-pagecache.c b/fs/bcachefs/fs-io-pagecache.c index af3a24546aa3..1d4910ea0f1d 100644 --- a/fs/bcachefs/fs-io-pagecache.c +++ b/fs/bcachefs/fs-io-pagecache.c @@ -399,14 +399,17 @@ void bch2_folio_reservation_put(struct bch_fs *c, bch2_quota_reservation_put(c, inode, &res->quota); } -int bch2_folio_reservation_get(struct bch_fs *c, +static int __bch2_folio_reservation_get(struct bch_fs *c, struct bch_inode_info *inode, struct folio *folio, struct bch2_folio_reservation *res, - size_t offset, size_t len) + size_t offset, size_t len, + bool partial) { struct bch_folio *s = bch2_folio_create(folio, 0); unsigned i, disk_sectors = 0, quota_sectors = 0; + struct disk_reservation disk_res = {}; + size_t reserved = len; int ret; if (!s) @@ -422,48 +425,65 @@ int bch2_folio_reservation_get(struct bch_fs *c, } if (disk_sectors) { - ret = bch2_disk_reservation_add(c, &res->disk, disk_sectors, 0); + ret = bch2_disk_reservation_add(c, &disk_res, disk_sectors, + partial ? BCH_DISK_RESERVATION_PARTIAL : 0); if (unlikely(ret)) return ret; + + if (unlikely(disk_res.sectors != disk_sectors)) { + disk_sectors = quota_sectors = 0; + + for (i = round_down(offset, block_bytes(c)) >> 9; + i < round_up(offset + len, block_bytes(c)) >> 9; + i++) { + disk_sectors += sectors_to_reserve(&s->s[i], res->disk.nr_replicas); + if (disk_sectors > disk_res.sectors) { + /* + * Make sure to get a reservation that's + * aligned to the filesystem blocksize: + */ + unsigned reserved_offset = round_down(i << 9, block_bytes(c)); + reserved = clamp(reserved_offset, offset, offset + len) - offset; + + if (!reserved) { + bch2_disk_reservation_put(c, &disk_res); + return -BCH_ERR_ENOSPC_disk_reservation; + } + break; + } + quota_sectors += s->s[i].state == SECTOR_unallocated; + } + } } if (quota_sectors) { ret = bch2_quota_reservation_add(c, inode, &res->quota, quota_sectors, true); if (unlikely(ret)) { - struct disk_reservation tmp = { .sectors = disk_sectors }; - - bch2_disk_reservation_put(c, &tmp); - res->disk.sectors -= disk_sectors; + bch2_disk_reservation_put(c, &disk_res); return ret; } } - return 0; + res->disk.sectors += disk_res.sectors; + return partial ? reserved : 0; } -ssize_t bch2_folio_reservation_get_partial(struct bch_fs *c, +int bch2_folio_reservation_get(struct bch_fs *c, struct bch_inode_info *inode, struct folio *folio, struct bch2_folio_reservation *res, size_t offset, size_t len) { - size_t l, reserved = 0; - int ret; - - while ((l = len - reserved)) { - while ((ret = bch2_folio_reservation_get(c, inode, folio, res, offset, l))) { - if ((offset & (block_bytes(c) - 1)) + l <= block_bytes(c)) - return reserved ?: ret; - - len = reserved + l; - l /= 2; - } - - offset += l; - reserved += l; - } + return __bch2_folio_reservation_get(c, inode, folio, res, offset, len, false); +} - return reserved; +ssize_t bch2_folio_reservation_get_partial(struct bch_fs *c, + struct bch_inode_info *inode, + struct folio *folio, + struct bch2_folio_reservation *res, + size_t offset, size_t len) +{ + return __bch2_folio_reservation_get(c, inode, folio, res, offset, len, true); } static void bch2_clear_folio_bits(struct folio *folio) diff --git a/fs/bcachefs/fs-io.c b/fs/bcachefs/fs-io.c index 71d0fa387509..15d3f073b824 100644 --- a/fs/bcachefs/fs-io.c +++ b/fs/bcachefs/fs-io.c @@ -182,7 +182,7 @@ static int bch2_flush_inode(struct bch_fs *c, struct bch_inode_unpacked u; int ret = bch2_inode_find_by_inum(c, inode_inum(inode), &u) ?: - bch2_journal_flush_seq(&c->journal, u.bi_journal_seq) ?: + bch2_journal_flush_seq(&c->journal, u.bi_journal_seq, TASK_INTERRUPTIBLE) ?: bch2_inode_flush_nocow_writes(c, inode); bch2_write_ref_put(c, BCH_WRITE_REF_fsync); return ret; diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c index d7bd28bdaffc..a41d0d8a2f7b 100644 --- a/fs/bcachefs/fs.c +++ b/fs/bcachefs/fs.c @@ -656,7 +656,7 @@ static struct dentry *bch2_lookup(struct inode *vdir, struct dentry *dentry, struct bch_hash_info hash = bch2_hash_info_init(c, &dir->ei_inode); struct bch_inode_info *inode; - bch2_trans_do(c, NULL, NULL, 0, + bch2_trans_do(c, PTR_ERR_OR_ZERO(inode = bch2_lookup_trans(trans, inode_inum(dir), &hash, &dentry->d_name))); if (IS_ERR(inode)) @@ -869,7 +869,7 @@ static int bch2_rename2(struct mnt_idmap *idmap, ret = bch2_subvol_is_ro_trans(trans, src_dir->ei_inum.subvol) ?: bch2_subvol_is_ro_trans(trans, dst_dir->ei_inum.subvol); if (ret) - goto err; + goto err_tx_restart; if (inode_attr_changing(dst_dir, src_inode, Inode_opt_project)) { ret = bch2_fs_quota_transfer(c, src_inode, @@ -1266,7 +1266,7 @@ static int bch2_fiemap(struct inode *vinode, struct fiemap_extent_info *info, bch2_trans_iter_init(trans, &iter, BTREE_ID_extents, POS(ei->v.i_ino, start), 0); - while (true) { + while (!ret || bch2_err_matches(ret, BCH_ERR_transaction_restart)) { enum btree_id data_btree = BTREE_ID_extents; bch2_trans_begin(trans); @@ -1274,14 +1274,14 @@ static int bch2_fiemap(struct inode *vinode, struct fiemap_extent_info *info, u32 snapshot; ret = bch2_subvolume_get_snapshot(trans, ei->ei_inum.subvol, &snapshot); if (ret) - goto err; + continue; bch2_btree_iter_set_snapshot(&iter, snapshot); k = bch2_btree_iter_peek_upto(&iter, end); ret = bkey_err(k); if (ret) - goto err; + continue; if (!k.k) break; @@ -1301,7 +1301,7 @@ static int bch2_fiemap(struct inode *vinode, struct fiemap_extent_info *info, ret = bch2_read_indirect_extent(trans, &data_btree, &offset_into_extent, &cur); if (ret) - break; + continue; k = bkey_i_to_s_c(cur.k); bch2_bkey_buf_realloc(&prev, c, k.k->u64s); @@ -1329,10 +1329,6 @@ static int bch2_fiemap(struct inode *vinode, struct fiemap_extent_info *info, bch2_btree_iter_set_pos(&iter, POS(iter.pos.inode, iter.pos.offset + sectors)); -err: - if (ret && - !bch2_err_matches(ret, BCH_ERR_transaction_restart)) - break; } bch2_trans_iter_exit(trans, &iter); @@ -2040,7 +2036,7 @@ static int bch2_show_options(struct seq_file *seq, struct dentry *root) bch2_opts_to_text(&buf, c->opts, c, c->disk_sb.sb, OPT_MOUNT, OPT_HIDDEN, OPT_SHOW_MOUNT_STYLE); printbuf_nul_terminate(&buf); - seq_puts(seq, buf.buf); + seq_printf(seq, ",%s", buf.buf); int ret = buf.allocation_failure ? -ENOMEM : 0; printbuf_exit(&buf); diff --git a/fs/bcachefs/fsck.c b/fs/bcachefs/fsck.c index a1087fd292e4..75c8a97a6954 100644 --- a/fs/bcachefs/fsck.c +++ b/fs/bcachefs/fsck.c @@ -929,35 +929,138 @@ static int get_visible_inodes(struct btree_trans *trans, return ret; } -static int hash_redo_key(struct btree_trans *trans, - const struct bch_hash_desc desc, - struct bch_hash_info *hash_info, - struct btree_iter *k_iter, struct bkey_s_c k) -{ - struct bkey_i *delete; - struct bkey_i *tmp; - - delete = bch2_trans_kmalloc(trans, sizeof(*delete)); - if (IS_ERR(delete)) - return PTR_ERR(delete); - - tmp = bch2_bkey_make_mut_noupdate(trans, k); - if (IS_ERR(tmp)) - return PTR_ERR(tmp); - - bkey_init(&delete->k); - delete->k.p = k_iter->pos; - return bch2_btree_iter_traverse(k_iter) ?: - bch2_trans_update(trans, k_iter, delete, 0) ?: - bch2_hash_set_in_snapshot(trans, desc, hash_info, - (subvol_inum) { 0, k.k->p.inode }, - k.k->p.snapshot, tmp, - STR_HASH_must_create| - BTREE_UPDATE_internal_snapshot_node) ?: - bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc); +static int dirent_has_target(struct btree_trans *trans, struct bkey_s_c_dirent d) +{ + if (d.v->d_type == DT_SUBVOL) { + u32 snap; + u64 inum; + int ret = subvol_lookup(trans, le32_to_cpu(d.v->d_child_subvol), &snap, &inum); + if (ret && !bch2_err_matches(ret, ENOENT)) + return ret; + return !ret; + } else { + struct btree_iter iter; + struct bkey_s_c k = bch2_bkey_get_iter(trans, &iter, BTREE_ID_inodes, + SPOS(0, le64_to_cpu(d.v->d_inum), d.k->p.snapshot), 0); + int ret = bkey_err(k); + if (ret) + return ret; + + ret = bkey_is_inode(k.k); + bch2_trans_iter_exit(trans, &iter); + return ret; + } +} + +/* + * Prefer to delete the first one, since that will be the one at the wrong + * offset: + * return value: 0 -> delete k1, 1 -> delete k2 + */ +static int hash_pick_winner(struct btree_trans *trans, + const struct bch_hash_desc desc, + struct bch_hash_info *hash_info, + struct bkey_s_c k1, + struct bkey_s_c k2) +{ + if (bkey_val_bytes(k1.k) == bkey_val_bytes(k2.k) && + !memcmp(k1.v, k2.v, bkey_val_bytes(k1.k))) + return 0; + + switch (desc.btree_id) { + case BTREE_ID_dirents: { + int ret = dirent_has_target(trans, bkey_s_c_to_dirent(k1)); + if (ret < 0) + return ret; + if (!ret) + return 0; + + ret = dirent_has_target(trans, bkey_s_c_to_dirent(k2)); + if (ret < 0) + return ret; + if (!ret) + return 1; + return 2; + } + default: + return 0; + } +} + +static int fsck_update_backpointers(struct btree_trans *trans, + struct snapshots_seen *s, + const struct bch_hash_desc desc, + struct bch_hash_info *hash_info, + struct bkey_i *new) +{ + if (new->k.type != KEY_TYPE_dirent) + return 0; + + struct bkey_i_dirent *d = bkey_i_to_dirent(new); + struct inode_walker target = inode_walker_init(); + int ret = 0; + + if (d->v.d_type == DT_SUBVOL) { + BUG(); + } else { + ret = get_visible_inodes(trans, &target, s, le64_to_cpu(d->v.d_inum)); + if (ret) + goto err; + + darray_for_each(target.inodes, i) { + i->inode.bi_dir_offset = d->k.p.offset; + ret = __bch2_fsck_write_inode(trans, &i->inode); + if (ret) + goto err; + } + } +err: + inode_walker_exit(&target); + return ret; +} + +static int fsck_rename_dirent(struct btree_trans *trans, + struct snapshots_seen *s, + const struct bch_hash_desc desc, + struct bch_hash_info *hash_info, + struct bkey_s_c_dirent old) +{ + struct qstr old_name = bch2_dirent_get_name(old); + struct bkey_i_dirent *new = bch2_trans_kmalloc(trans, bkey_bytes(old.k) + 32); + int ret = PTR_ERR_OR_ZERO(new); + if (ret) + return ret; + + bkey_dirent_init(&new->k_i); + dirent_copy_target(new, old); + new->k.p = old.k->p; + + for (unsigned i = 0; i < 1000; i++) { + unsigned len = sprintf(new->v.d_name, "%.*s.fsck_renamed-%u", + old_name.len, old_name.name, i); + unsigned u64s = BKEY_U64s + dirent_val_u64s(len); + + if (u64s > U8_MAX) + return -EINVAL; + + new->k.u64s = u64s; + + ret = bch2_hash_set_in_snapshot(trans, bch2_dirent_hash_desc, hash_info, + (subvol_inum) { 0, old.k->p.inode }, + old.k->p.snapshot, &new->k_i, + BTREE_UPDATE_internal_snapshot_node); + if (!bch2_err_matches(ret, EEXIST)) + break; + } + + if (ret) + return ret; + + return fsck_update_backpointers(trans, s, desc, hash_info, &new->k_i); } static int hash_check_key(struct btree_trans *trans, + struct snapshots_seen *s, const struct bch_hash_desc desc, struct bch_hash_info *hash_info, struct btree_iter *k_iter, struct bkey_s_c hash_k) @@ -986,16 +1089,9 @@ static int hash_check_key(struct btree_trans *trans, if (bkey_eq(k.k->p, hash_k.k->p)) break; - if (fsck_err_on(k.k->type == desc.key_type && - !desc.cmp_bkey(k, hash_k), - trans, hash_table_key_duplicate, - "duplicate hash table keys:\n%s", - (printbuf_reset(&buf), - bch2_bkey_val_to_text(&buf, c, hash_k), - buf.buf))) { - ret = bch2_hash_delete_at(trans, desc, hash_info, k_iter, 0) ?: 1; - break; - } + if (k.k->type == desc.key_type && + !desc.cmp_bkey(k, hash_k)) + goto duplicate_entries; if (bkey_deleted(k.k)) { bch2_trans_iter_exit(trans, &iter); @@ -1008,18 +1104,66 @@ out: return ret; bad_hash: if (fsck_err(trans, hash_table_key_wrong_offset, - "hash table key at wrong offset: btree %s inode %llu offset %llu, hashed to %llu\n%s", + "hash table key at wrong offset: btree %s inode %llu offset %llu, hashed to %llu\n %s", bch2_btree_id_str(desc.btree_id), hash_k.k->p.inode, hash_k.k->p.offset, hash, (printbuf_reset(&buf), bch2_bkey_val_to_text(&buf, c, hash_k), buf.buf))) { - ret = hash_redo_key(trans, desc, hash_info, k_iter, hash_k); - bch_err_fn(c, ret); + struct bkey_i *new = bch2_bkey_make_mut_noupdate(trans, hash_k); + if (IS_ERR(new)) + return PTR_ERR(new); + + k = bch2_hash_set_or_get_in_snapshot(trans, &iter, desc, hash_info, + (subvol_inum) { 0, hash_k.k->p.inode }, + hash_k.k->p.snapshot, new, + STR_HASH_must_create| + BTREE_ITER_with_updates| + BTREE_UPDATE_internal_snapshot_node); + ret = bkey_err(k); if (ret) - return ret; - ret = -BCH_ERR_transaction_restart_nested; + goto out; + if (k.k) + goto duplicate_entries; + + ret = bch2_hash_delete_at(trans, desc, hash_info, k_iter, + BTREE_UPDATE_internal_snapshot_node) ?: + fsck_update_backpointers(trans, s, desc, hash_info, new) ?: + bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc) ?: + -BCH_ERR_transaction_restart_nested; + goto out; } fsck_err: goto out; +duplicate_entries: + ret = hash_pick_winner(trans, desc, hash_info, hash_k, k); + if (ret < 0) + goto out; + + if (!fsck_err(trans, hash_table_key_duplicate, + "duplicate hash table keys%s:\n%s", + ret != 2 ? "" : ", both point to valid inodes", + (printbuf_reset(&buf), + bch2_bkey_val_to_text(&buf, c, hash_k), + prt_newline(&buf), + bch2_bkey_val_to_text(&buf, c, k), + buf.buf))) + goto out; + + switch (ret) { + case 0: + ret = bch2_hash_delete_at(trans, desc, hash_info, k_iter, 0); + break; + case 1: + ret = bch2_hash_delete_at(trans, desc, hash_info, &iter, 0); + break; + case 2: + ret = fsck_rename_dirent(trans, s, desc, hash_info, bkey_s_c_to_dirent(hash_k)) ?: + bch2_hash_delete_at(trans, desc, hash_info, k_iter, 0); + goto out; + } + + ret = bch2_trans_commit(trans, NULL, NULL, 0) ?: + -BCH_ERR_transaction_restart_nested; + goto out; } static struct bkey_s_c_dirent dirent_get_by_pos(struct btree_trans *trans, @@ -1096,10 +1240,36 @@ fsck_err: return ret; } +static int get_snapshot_root_inode(struct btree_trans *trans, + struct bch_inode_unpacked *root, + u64 inum) +{ + struct btree_iter iter; + struct bkey_s_c k; + int ret = 0; + + for_each_btree_key_reverse_norestart(trans, iter, BTREE_ID_inodes, + SPOS(0, inum, U32_MAX), + BTREE_ITER_all_snapshots, k, ret) { + if (k.k->p.offset != inum) + break; + if (bkey_is_inode(k.k)) + goto found_root; + } + if (ret) + goto err; + BUG(); +found_root: + BUG_ON(bch2_inode_unpack(k, root)); +err: + bch2_trans_iter_exit(trans, &iter); + return ret; +} + static int check_inode(struct btree_trans *trans, struct btree_iter *iter, struct bkey_s_c k, - struct bch_inode_unpacked *prev, + struct bch_inode_unpacked *snapshot_root, struct snapshots_seen *s) { struct bch_fs *c = trans->c; @@ -1123,16 +1293,19 @@ static int check_inode(struct btree_trans *trans, BUG_ON(bch2_inode_unpack(k, &u)); - if (prev->bi_inum != u.bi_inum) - *prev = u; + if (snapshot_root->bi_inum != u.bi_inum) { + ret = get_snapshot_root_inode(trans, snapshot_root, u.bi_inum); + if (ret) + goto err; + } - if (fsck_err_on(prev->bi_hash_seed != u.bi_hash_seed || - inode_d_type(prev) != inode_d_type(&u), + if (fsck_err_on(u.bi_hash_seed != snapshot_root->bi_hash_seed || + INODE_STR_HASH(&u) != INODE_STR_HASH(snapshot_root), trans, inode_snapshot_mismatch, "inodes in different snapshots don't match")) { - bch_err(c, "repair not implemented yet"); - ret = -BCH_ERR_fsck_repair_unimplemented; - goto err_noprint; + u.bi_hash_seed = snapshot_root->bi_hash_seed; + SET_INODE_STR_HASH(&u, INODE_STR_HASH(snapshot_root)); + do_update = true; } if (u.bi_dir || u.bi_dir_offset) { @@ -1285,7 +1458,7 @@ err_noprint: int bch2_check_inodes(struct bch_fs *c) { - struct bch_inode_unpacked prev = { 0 }; + struct bch_inode_unpacked snapshot_root = {}; struct snapshots_seen s; snapshots_seen_init(&s); @@ -1295,7 +1468,7 @@ int bch2_check_inodes(struct bch_fs *c) POS_MIN, BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, - check_inode(trans, &iter, k, &prev, &s))); + check_inode(trans, &iter, k, &snapshot_root, &s))); snapshots_seen_exit(&s); bch_err_fn(c, ret); @@ -2307,7 +2480,7 @@ static int check_dirent(struct btree_trans *trans, struct btree_iter *iter, *hash_info = bch2_hash_info_init(c, &i->inode); dir->first_this_inode = false; - ret = hash_check_key(trans, bch2_dirent_hash_desc, hash_info, iter, k); + ret = hash_check_key(trans, s, bch2_dirent_hash_desc, hash_info, iter, k); if (ret < 0) goto err; if (ret) { @@ -2421,7 +2594,7 @@ static int check_xattr(struct btree_trans *trans, struct btree_iter *iter, *hash_info = bch2_hash_info_init(c, &i->inode); inode->first_this_inode = false; - ret = hash_check_key(trans, bch2_xattr_hash_desc, hash_info, iter, k); + ret = hash_check_key(trans, NULL, bch2_xattr_hash_desc, hash_info, iter, k); bch_err_fn(c, ret); return ret; } @@ -2509,7 +2682,7 @@ fsck_err: /* Get root directory, create if it doesn't exist: */ int bch2_check_root(struct bch_fs *c) { - int ret = bch2_trans_do(c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, + int ret = bch2_trans_commit_do(c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, check_root_trans(trans)); bch_err_fn(c, ret); return ret; diff --git a/fs/bcachefs/inode.c b/fs/bcachefs/inode.c index 344ccb7a824c..039cb7a22244 100644 --- a/fs/bcachefs/inode.c +++ b/fs/bcachefs/inode.c @@ -163,8 +163,8 @@ static noinline int bch2_inode_unpack_v1(struct bkey_s_c_inode inode, unsigned fieldnr = 0, field_bits; int ret; -#define x(_name, _bits) \ - if (fieldnr++ == INODE_NR_FIELDS(inode.v)) { \ +#define x(_name, _bits) \ + if (fieldnr++ == INODEv1_NR_FIELDS(inode.v)) { \ unsigned offset = offsetof(struct bch_inode_unpacked, _name);\ memset((void *) unpacked + offset, 0, \ sizeof(*unpacked) - offset); \ @@ -283,6 +283,8 @@ static noinline int bch2_inode_unpack_slowpath(struct bkey_s_c k, { memset(unpacked, 0, sizeof(*unpacked)); + unpacked->bi_snapshot = k.k->p.snapshot; + switch (k.k->type) { case KEY_TYPE_inode: { struct bkey_s_c_inode inode = bkey_s_c_to_inode(k); @@ -293,10 +295,10 @@ static noinline int bch2_inode_unpack_slowpath(struct bkey_s_c k, unpacked->bi_flags = le32_to_cpu(inode.v->bi_flags); unpacked->bi_mode = le16_to_cpu(inode.v->bi_mode); - if (INODE_NEW_VARINT(inode.v)) { + if (INODEv1_NEW_VARINT(inode.v)) { return bch2_inode_unpack_v2(unpacked, inode.v->fields, bkey_val_end(inode), - INODE_NR_FIELDS(inode.v)); + INODEv1_NR_FIELDS(inode.v)); } else { return bch2_inode_unpack_v1(inode, unpacked); } @@ -471,10 +473,10 @@ int bch2_inode_validate(struct bch_fs *c, struct bkey_s_c k, struct bkey_s_c_inode inode = bkey_s_c_to_inode(k); int ret = 0; - bkey_fsck_err_on(INODE_STR_HASH(inode.v) >= BCH_STR_HASH_NR, + bkey_fsck_err_on(INODEv1_STR_HASH(inode.v) >= BCH_STR_HASH_NR, c, inode_str_hash_invalid, "invalid str hash type (%llu >= %u)", - INODE_STR_HASH(inode.v), BCH_STR_HASH_NR); + INODEv1_STR_HASH(inode.v), BCH_STR_HASH_NR); ret = __bch2_inode_validate(c, k, flags); fsck_err: @@ -533,6 +535,10 @@ static void __bch2_inode_unpacked_to_text(struct printbuf *out, prt_printf(out, "(%x)\n", inode->bi_flags); prt_printf(out, "journal_seq=%llu\n", inode->bi_journal_seq); + prt_printf(out, "hash_seed=%llx\n", inode->bi_hash_seed); + prt_printf(out, "hash_type="); + bch2_prt_str_hash_type(out, INODE_STR_HASH(inode)); + prt_newline(out); prt_printf(out, "bi_size=%llu\n", inode->bi_size); prt_printf(out, "bi_sectors=%llu\n", inode->bi_sectors); prt_printf(out, "bi_version=%llu\n", inode->bi_version); @@ -800,10 +806,8 @@ void bch2_inode_init_early(struct bch_fs *c, memset(inode_u, 0, sizeof(*inode_u)); - /* ick */ - inode_u->bi_flags |= str_hash << INODE_STR_HASH_OFFSET; - get_random_bytes(&inode_u->bi_hash_seed, - sizeof(inode_u->bi_hash_seed)); + SET_INODE_STR_HASH(inode_u, str_hash); + get_random_bytes(&inode_u->bi_hash_seed, sizeof(inode_u->bi_hash_seed)); } void bch2_inode_init_late(struct bch_inode_unpacked *inode_u, u64 now, @@ -1087,8 +1091,7 @@ int bch2_inode_find_by_inum_trans(struct btree_trans *trans, int bch2_inode_find_by_inum(struct bch_fs *c, subvol_inum inum, struct bch_inode_unpacked *inode) { - return bch2_trans_do(c, NULL, NULL, 0, - bch2_inode_find_by_inum_trans(trans, inum, inode)); + return bch2_trans_do(c, bch2_inode_find_by_inum_trans(trans, inum, inode)); } int bch2_inode_nlink_inc(struct bch_inode_unpacked *bi) diff --git a/fs/bcachefs/inode.h b/fs/bcachefs/inode.h index c8e98443e2d4..eab82b5eb897 100644 --- a/fs/bcachefs/inode.h +++ b/fs/bcachefs/inode.h @@ -92,6 +92,7 @@ struct bch_inode_unpacked { BCH_INODE_FIELDS_v3() #undef x }; +BITMASK(INODE_STR_HASH, struct bch_inode_unpacked, bi_flags, 20, 24); struct bkey_inode_buf { struct bkey_i_inode_v3 inode; diff --git a/fs/bcachefs/inode_format.h b/fs/bcachefs/inode_format.h index a204e46b6b47..7928d0c6954f 100644 --- a/fs/bcachefs/inode_format.h +++ b/fs/bcachefs/inode_format.h @@ -150,9 +150,9 @@ enum __bch_inode_flags { #undef x }; -LE32_BITMASK(INODE_STR_HASH, struct bch_inode, bi_flags, 20, 24); -LE32_BITMASK(INODE_NR_FIELDS, struct bch_inode, bi_flags, 24, 31); -LE32_BITMASK(INODE_NEW_VARINT, struct bch_inode, bi_flags, 31, 32); +LE32_BITMASK(INODEv1_STR_HASH, struct bch_inode, bi_flags, 20, 24); +LE32_BITMASK(INODEv1_NR_FIELDS, struct bch_inode, bi_flags, 24, 31); +LE32_BITMASK(INODEv1_NEW_VARINT,struct bch_inode, bi_flags, 31, 32); LE64_BITMASK(INODEv2_STR_HASH, struct bch_inode_v2, bi_flags, 20, 24); LE64_BITMASK(INODEv2_NR_FIELDS, struct bch_inode_v2, bi_flags, 24, 31); diff --git a/fs/bcachefs/io_misc.c b/fs/bcachefs/io_misc.c index 307ed0a45184..f283051758d6 100644 --- a/fs/bcachefs/io_misc.c +++ b/fs/bcachefs/io_misc.c @@ -377,7 +377,7 @@ static int __bch2_resume_logged_op_finsert(struct btree_trans *trans, * check for missing subvolume before fpunch, as in resume we don't want * it to be a fatal error */ - ret = __bch2_subvolume_get_snapshot(trans, inum.subvol, &snapshot, warn_errors); + ret = lockrestart_do(trans, __bch2_subvolume_get_snapshot(trans, inum.subvol, &snapshot, warn_errors)); if (ret) return ret; diff --git a/fs/bcachefs/io_read.c b/fs/bcachefs/io_read.c index e4fc17c548fd..fc246f342820 100644 --- a/fs/bcachefs/io_read.c +++ b/fs/bcachefs/io_read.c @@ -409,8 +409,8 @@ retry: bch2_trans_begin(trans); rbio->bio.bi_status = 0; - k = bch2_btree_iter_peek_slot(&iter); - if (bkey_err(k)) + ret = lockrestart_do(trans, bkey_err(k = bch2_btree_iter_peek_slot(&iter))); + if (ret) goto err; bch2_bkey_buf_reassemble(&sk, c, k); @@ -557,8 +557,8 @@ out: static noinline void bch2_rbio_narrow_crcs(struct bch_read_bio *rbio) { - bch2_trans_do(rbio->c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, - __bch2_rbio_narrow_crcs(trans, rbio)); + bch2_trans_commit_do(rbio->c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, + __bch2_rbio_narrow_crcs(trans, rbio)); } /* Inner part that may run in process context */ diff --git a/fs/bcachefs/io_write.c b/fs/bcachefs/io_write.c index b5fe9e0dc155..8609e25e450f 100644 --- a/fs/bcachefs/io_write.c +++ b/fs/bcachefs/io_write.c @@ -1437,7 +1437,7 @@ again: * freeing up space on specific disks, which means that * allocations for specific disks may hang arbitrarily long: */ - ret = bch2_trans_do(c, NULL, NULL, 0, + ret = bch2_trans_run(c, lockrestart_do(trans, bch2_alloc_sectors_start_trans(trans, op->target, op->opts.erasure_code && !(op->flags & BCH_WRITE_CACHED), @@ -1447,7 +1447,7 @@ again: op->nr_replicas_required, op->watermark, op->flags, - &op->cl, &wp)); + &op->cl, &wp))); if (unlikely(ret)) { if (bch2_err_matches(ret, BCH_ERR_operation_blocked)) break; diff --git a/fs/bcachefs/journal.c b/fs/bcachefs/journal.c index dc099f06341f..2dc0d60c1745 100644 --- a/fs/bcachefs/journal.c +++ b/fs/bcachefs/journal.c @@ -758,7 +758,7 @@ out: return ret; } -int bch2_journal_flush_seq(struct journal *j, u64 seq) +int bch2_journal_flush_seq(struct journal *j, u64 seq, unsigned task_state) { u64 start_time = local_clock(); int ret, ret2; @@ -769,7 +769,9 @@ int bch2_journal_flush_seq(struct journal *j, u64 seq) if (seq <= j->flushed_seq_ondisk) return 0; - ret = wait_event_interruptible(j->wait, (ret2 = bch2_journal_flush_seq_async(j, seq, NULL))); + ret = wait_event_state(j->wait, + (ret2 = bch2_journal_flush_seq_async(j, seq, NULL)), + task_state); if (!ret) bch2_time_stats_update(j->flush_seq_time, start_time); @@ -788,7 +790,7 @@ void bch2_journal_flush_async(struct journal *j, struct closure *parent) int bch2_journal_flush(struct journal *j) { - return bch2_journal_flush_seq(j, atomic64_read(&j->seq)); + return bch2_journal_flush_seq(j, atomic64_read(&j->seq), TASK_UNINTERRUPTIBLE); } /* @@ -851,7 +853,7 @@ int bch2_journal_meta(struct journal *j) bch2_journal_res_put(j, &res); - return bch2_journal_flush_seq(j, res.seq); + return bch2_journal_flush_seq(j, res.seq, TASK_UNINTERRUPTIBLE); } /* block/unlock the journal: */ diff --git a/fs/bcachefs/journal.h b/fs/bcachefs/journal.h index 377a3750406e..2762be6f9814 100644 --- a/fs/bcachefs/journal.h +++ b/fs/bcachefs/journal.h @@ -401,7 +401,7 @@ void bch2_journal_entry_res_resize(struct journal *, int bch2_journal_flush_seq_async(struct journal *, u64, struct closure *); void bch2_journal_flush_async(struct journal *, struct closure *); -int bch2_journal_flush_seq(struct journal *, u64); +int bch2_journal_flush_seq(struct journal *, u64, unsigned); int bch2_journal_flush(struct journal *); bool bch2_journal_noflush_seq(struct journal *, u64); int bch2_journal_meta(struct journal *); diff --git a/fs/bcachefs/opts.c b/fs/bcachefs/opts.c index 84097235eea9..6673cbd8bdb9 100644 --- a/fs/bcachefs/opts.c +++ b/fs/bcachefs/opts.c @@ -63,7 +63,7 @@ const char * const bch2_compression_opts[] = { NULL }; -const char * const bch2_str_hash_types[] = { +const char * const __bch2_str_hash_types[] = { BCH_STR_HASH_TYPES() NULL }; @@ -115,6 +115,7 @@ PRT_STR_OPT_BOUNDSCHECKED(fs_usage_type, enum bch_fs_usage_type); PRT_STR_OPT_BOUNDSCHECKED(data_type, enum bch_data_type); PRT_STR_OPT_BOUNDSCHECKED(csum_type, enum bch_csum_type); PRT_STR_OPT_BOUNDSCHECKED(compression_type, enum bch_compression_type); +PRT_STR_OPT_BOUNDSCHECKED(str_hash_type, enum bch_str_hash_type); static int bch2_opt_fix_errors_parse(struct bch_fs *c, const char *val, u64 *res, struct printbuf *err) @@ -596,6 +597,9 @@ int bch2_parse_mount_opts(struct bch_fs *c, struct bch_opts *opts, copied_opts_start = copied_opts; while ((opt = strsep(&copied_opts, ",")) != NULL) { + if (!*opt) + continue; + name = strsep(&opt, "="); val = opt; diff --git a/fs/bcachefs/opts.h b/fs/bcachefs/opts.h index cb2e244a2429..23dda014e331 100644 --- a/fs/bcachefs/opts.h +++ b/fs/bcachefs/opts.h @@ -18,7 +18,7 @@ extern const char * const bch2_sb_compat[]; extern const char * const __bch2_btree_ids[]; extern const char * const bch2_csum_opts[]; extern const char * const bch2_compression_opts[]; -extern const char * const bch2_str_hash_types[]; +extern const char * const __bch2_str_hash_types[]; extern const char * const bch2_str_hash_opts[]; extern const char * const __bch2_data_types[]; extern const char * const bch2_member_states[]; @@ -29,6 +29,7 @@ void bch2_prt_fs_usage_type(struct printbuf *, enum bch_fs_usage_type); void bch2_prt_data_type(struct printbuf *, enum bch_data_type); void bch2_prt_csum_type(struct printbuf *, enum bch_csum_type); void bch2_prt_compression_type(struct printbuf *, enum bch_compression_type); +void bch2_prt_str_hash_type(struct printbuf *, enum bch_str_hash_type); static inline const char *bch2_d_type_str(unsigned d_type) { diff --git a/fs/bcachefs/quota.c b/fs/bcachefs/quota.c index c32a05e252e2..74f45a8162ad 100644 --- a/fs/bcachefs/quota.c +++ b/fs/bcachefs/quota.c @@ -869,7 +869,7 @@ static int bch2_set_quota(struct super_block *sb, struct kqid qid, bkey_quota_init(&new_quota.k_i); new_quota.k.p = POS(qid.type, from_kqid(&init_user_ns, qid)); - ret = bch2_trans_do(c, NULL, NULL, 0, + ret = bch2_trans_commit_do(c, NULL, NULL, 0, bch2_set_quota_trans(trans, &new_quota, qdq)) ?: __bch2_quota_set(c, bkey_i_to_s_c(&new_quota.k_i), qdq); diff --git a/fs/bcachefs/rebalance.c b/fs/bcachefs/rebalance.c index 2d299a37cf07..cd6647374353 100644 --- a/fs/bcachefs/rebalance.c +++ b/fs/bcachefs/rebalance.c @@ -70,7 +70,9 @@ err: int bch2_set_rebalance_needs_scan(struct bch_fs *c, u64 inum) { - int ret = bch2_trans_do(c, NULL, NULL, BCH_TRANS_COMMIT_no_enospc|BCH_TRANS_COMMIT_lazy_rw, + int ret = bch2_trans_commit_do(c, NULL, NULL, + BCH_TRANS_COMMIT_no_enospc| + BCH_TRANS_COMMIT_lazy_rw, __bch2_set_rebalance_needs_scan(trans, inum)); rebalance_wakeup(c); return ret; diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c index 55e1504a8130..454b5a32dd7f 100644 --- a/fs/bcachefs/recovery.c +++ b/fs/bcachefs/recovery.c @@ -1091,7 +1091,7 @@ int bch2_fs_initialize(struct bch_fs *c) bch2_inode_init_early(c, &lostfound_inode); - ret = bch2_trans_do(c, NULL, NULL, 0, + ret = bch2_trans_commit_do(c, NULL, NULL, 0, bch2_create_trans(trans, BCACHEFS_ROOT_SUBVOL_INUM, &root_inode, &lostfound_inode, diff --git a/fs/bcachefs/sb-errors_format.h b/fs/bcachefs/sb-errors_format.h index aab328ac6dfa..937275d061fe 100644 --- a/fs/bcachefs/sb-errors_format.h +++ b/fs/bcachefs/sb-errors_format.h @@ -267,8 +267,8 @@ enum bch_fsck_flags { x(journal_entry_dup_same_device, 246, 0) \ x(inode_bi_subvol_missing, 247, 0) \ x(inode_bi_subvol_wrong, 248, 0) \ - x(inode_points_to_missing_dirent, 249, 0) \ - x(inode_points_to_wrong_dirent, 250, 0) \ + x(inode_points_to_missing_dirent, 249, FSCK_AUTOFIX) \ + x(inode_points_to_wrong_dirent, 250, FSCK_AUTOFIX) \ x(inode_bi_parent_nonzero, 251, 0) \ x(dirent_to_missing_parent_subvol, 252, 0) \ x(dirent_not_visible_in_parent_subvol, 253, 0) \ diff --git a/fs/bcachefs/str_hash.h b/fs/bcachefs/str_hash.h index 215eed4cce6d..ec2b1feea520 100644 --- a/fs/bcachefs/str_hash.h +++ b/fs/bcachefs/str_hash.h @@ -46,8 +46,7 @@ bch2_hash_info_init(struct bch_fs *c, const struct bch_inode_unpacked *bi) { /* XXX ick */ struct bch_hash_info info = { - .type = (bi->bi_flags >> INODE_STR_HASH_OFFSET) & - ~(~0U << INODE_STR_HASH_BITS), + .type = INODE_STR_HASH(bi), .siphash_key = { .k0 = bi->bi_hash_seed } }; @@ -253,19 +252,20 @@ int bch2_hash_needs_whiteout(struct btree_trans *trans, } static __always_inline -int bch2_hash_set_in_snapshot(struct btree_trans *trans, +struct bkey_s_c bch2_hash_set_or_get_in_snapshot(struct btree_trans *trans, + struct btree_iter *iter, const struct bch_hash_desc desc, const struct bch_hash_info *info, subvol_inum inum, u32 snapshot, struct bkey_i *insert, enum btree_iter_update_trigger_flags flags) { - struct btree_iter iter, slot = { NULL }; + struct btree_iter slot = {}; struct bkey_s_c k; bool found = false; int ret; - for_each_btree_key_upto_norestart(trans, iter, desc.btree_id, + for_each_btree_key_upto_norestart(trans, *iter, desc.btree_id, SPOS(insert->k.p.inode, desc.hash_bkey(info, bkey_i_to_s_c(insert)), snapshot), @@ -280,7 +280,7 @@ int bch2_hash_set_in_snapshot(struct btree_trans *trans, } if (!slot.path && !(flags & STR_HASH_must_replace)) - bch2_trans_copy_iter(&slot, &iter); + bch2_trans_copy_iter(&slot, iter); if (k.k->type != KEY_TYPE_hash_whiteout) goto not_found; @@ -290,29 +290,50 @@ int bch2_hash_set_in_snapshot(struct btree_trans *trans, ret = -BCH_ERR_ENOSPC_str_hash_create; out: bch2_trans_iter_exit(trans, &slot); - bch2_trans_iter_exit(trans, &iter); - - return ret; + bch2_trans_iter_exit(trans, iter); + return ret ? bkey_s_c_err(ret) : bkey_s_c_null; found: found = true; not_found: - - if (!found && (flags & STR_HASH_must_replace)) { + if (found && (flags & STR_HASH_must_create)) { + bch2_trans_iter_exit(trans, &slot); + return k; + } else if (!found && (flags & STR_HASH_must_replace)) { ret = -BCH_ERR_ENOENT_str_hash_set_must_replace; - } else if (found && (flags & STR_HASH_must_create)) { - ret = -BCH_ERR_EEXIST_str_hash_set; } else { if (!found && slot.path) - swap(iter, slot); + swap(*iter, slot); - insert->k.p = iter.pos; - ret = bch2_trans_update(trans, &iter, insert, flags); + insert->k.p = iter->pos; + ret = bch2_trans_update(trans, iter, insert, flags); } goto out; } static __always_inline +int bch2_hash_set_in_snapshot(struct btree_trans *trans, + const struct bch_hash_desc desc, + const struct bch_hash_info *info, + subvol_inum inum, u32 snapshot, + struct bkey_i *insert, + enum btree_iter_update_trigger_flags flags) +{ + struct btree_iter iter; + struct bkey_s_c k = bch2_hash_set_or_get_in_snapshot(trans, &iter, desc, info, inum, + snapshot, insert, flags); + int ret = bkey_err(k); + if (ret) + return ret; + if (k.k) { + bch2_trans_iter_exit(trans, &iter); + return -BCH_ERR_EEXIST_str_hash_set; + } + + return 0; +} + +static __always_inline int bch2_hash_set(struct btree_trans *trans, const struct bch_hash_desc desc, const struct bch_hash_info *info, @@ -363,8 +384,11 @@ int bch2_hash_delete(struct btree_trans *trans, struct btree_iter iter; struct bkey_s_c k = bch2_hash_lookup(trans, &iter, desc, info, inum, key, BTREE_ITER_intent); - int ret = bkey_err(k) ?: - bch2_hash_delete_at(trans, desc, info, &iter, 0); + int ret = bkey_err(k); + if (ret) + return ret; + + ret = bch2_hash_delete_at(trans, desc, info, &iter, 0); bch2_trans_iter_exit(trans, &iter); return ret; } diff --git a/fs/bcachefs/subvolume.c b/fs/bcachefs/subvolume.c index 91d8187ee168..80e5efaff524 100644 --- a/fs/bcachefs/subvolume.c +++ b/fs/bcachefs/subvolume.c @@ -319,8 +319,7 @@ int bch2_subvol_is_ro_trans(struct btree_trans *trans, u32 subvol) int bch2_subvol_is_ro(struct bch_fs *c, u32 subvol) { - return bch2_trans_do(c, NULL, NULL, 0, - bch2_subvol_is_ro_trans(trans, subvol)); + return bch2_trans_do(c, bch2_subvol_is_ro_trans(trans, subvol)); } int bch2_snapshot_get_subvol(struct btree_trans *trans, u32 snapshot, @@ -676,8 +675,8 @@ err: /* set bi_subvol on root inode */ int bch2_fs_upgrade_for_subvolumes(struct bch_fs *c) { - int ret = bch2_trans_do(c, NULL, NULL, BCH_TRANS_COMMIT_lazy_rw, - __bch2_fs_upgrade_for_subvolumes(trans)); + int ret = bch2_trans_commit_do(c, NULL, NULL, BCH_TRANS_COMMIT_lazy_rw, + __bch2_fs_upgrade_for_subvolumes(trans)); bch_err_fn(c, ret); return ret; } diff --git a/fs/bcachefs/super.c b/fs/bcachefs/super.c index 77d811a539af..657fd3759e7b 100644 --- a/fs/bcachefs/super.c +++ b/fs/bcachefs/super.c @@ -1972,7 +1972,7 @@ int bch2_dev_resize(struct bch_fs *c, struct bch_dev *ca, u64 nbuckets) }; u64 v[3] = { nbuckets - old_nbuckets, 0, 0 }; - ret = bch2_trans_do(ca->fs, NULL, NULL, 0, + ret = bch2_trans_commit_do(ca->fs, NULL, NULL, 0, bch2_disk_accounting_mod(trans, &acc, v, ARRAY_SIZE(v), false)) ?: bch2_dev_freespace_init(c, ca, old_nbuckets, nbuckets); if (ret) diff --git a/fs/bcachefs/tests.c b/fs/bcachefs/tests.c index b2f209743afe..315038a0a92d 100644 --- a/fs/bcachefs/tests.c +++ b/fs/bcachefs/tests.c @@ -450,7 +450,7 @@ static int insert_test_overlapping_extent(struct bch_fs *c, u64 inum, u64 start, k.k_i.k.p.snapshot = snapid; k.k_i.k.size = len; - ret = bch2_trans_do(c, NULL, NULL, 0, + ret = bch2_trans_commit_do(c, NULL, NULL, 0, bch2_btree_insert_nonextent(trans, BTREE_ID_extents, &k.k_i, BTREE_UPDATE_internal_snapshot_node)); bch_err_fn(c, ret); @@ -510,7 +510,7 @@ static int test_snapshots(struct bch_fs *c, u64 nr) if (ret) return ret; - ret = bch2_trans_do(c, NULL, NULL, 0, + ret = bch2_trans_commit_do(c, NULL, NULL, 0, bch2_snapshot_node_create(trans, U32_MAX, snapids, snapid_subvols, diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c index 56c8d3fe55a4..952aca400faf 100644 --- a/fs/bcachefs/xattr.c +++ b/fs/bcachefs/xattr.c @@ -330,7 +330,7 @@ static int bch2_xattr_get_handler(const struct xattr_handler *handler, { struct bch_inode_info *inode = to_bch_ei(vinode); struct bch_fs *c = inode->v.i_sb->s_fs_info; - int ret = bch2_trans_do(c, NULL, NULL, 0, + int ret = bch2_trans_do(c, bch2_xattr_get_trans(trans, inode, name, buffer, size, handler->flags)); if (ret < 0 && bch2_err_matches(ret, ENOENT)) diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 7980b2e33a92..4423d8b716a5 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -3819,6 +3819,8 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache, spin_lock(&cache->lock); if (cache->ro) space_info->bytes_readonly += num_bytes; + else if (btrfs_is_zoned(cache->fs_info)) + space_info->bytes_zone_unusable += num_bytes; cache->reserved -= num_bytes; space_info->bytes_reserved -= num_bytes; space_info->max_extent_size = 0; diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c index 001c0c2f872c..1e8cd7c9472e 100644 --- a/fs/btrfs/dir-item.c +++ b/fs/btrfs/dir-item.c @@ -347,8 +347,8 @@ btrfs_search_dir_index_item(struct btrfs_root *root, struct btrfs_path *path, return di; } /* Adjust return code if the key was not found in the next leaf. */ - if (ret > 0) - ret = 0; + if (ret >= 0) + ret = -ENOENT; return ERR_PTR(ret); } diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 4ad5db619b00..b11bfe68dd65 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1959,7 +1959,7 @@ static void btrfs_init_qgroup(struct btrfs_fs_info *fs_info) fs_info->qgroup_seq = 1; fs_info->qgroup_ulist = NULL; fs_info->qgroup_rescan_running = false; - fs_info->qgroup_drop_subtree_thres = BTRFS_MAX_LEVEL; + fs_info->qgroup_drop_subtree_thres = BTRFS_QGROUP_DROP_SUBTREE_THRES_DEFAULT; mutex_init(&fs_info->qgroup_rescan_lock); } diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 309a8ae48434..872cca54cc6c 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -262,22 +262,23 @@ static noinline int lock_delalloc_folios(struct inode *inode, for (i = 0; i < found_folios; i++) { struct folio *folio = fbatch.folios[i]; - u32 len = end + 1 - start; + u64 range_start; + u32 range_len; if (folio == locked_folio) continue; - if (btrfs_folio_start_writer_lock(fs_info, folio, start, - len)) - goto out; - + folio_lock(folio); if (!folio_test_dirty(folio) || folio->mapping != mapping) { - btrfs_folio_end_writer_lock(fs_info, folio, start, - len); + folio_unlock(folio); goto out; } + range_start = max_t(u64, folio_pos(folio), start); + range_len = min_t(u64, folio_pos(folio) + folio_size(folio), + end + 1) - range_start; + btrfs_folio_set_writer_lock(fs_info, folio, range_start, range_len); - processed_end = folio_pos(folio) + folio_size(folio) - 1; + processed_end = range_start + range_len - 1; } folio_batch_release(&fbatch); cond_resched(); diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 25d191f1ac10..668c617444a5 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -243,13 +243,19 @@ static bool mergeable_maps(const struct extent_map *prev, const struct extent_ma /* * Handle the on-disk data extents merge for @prev and @next. * + * @prev: left extent to merge + * @next: right extent to merge + * @merged: the extent we will not discard after the merge; updated with new values + * + * After this, one of the two extents is the new merged extent and the other is + * removed from the tree and likely freed. Note that @merged is one of @prev/@next + * so there is const/non-const aliasing occurring here. + * * Only touches disk_bytenr/disk_num_bytes/offset/ram_bytes. * For now only uncompressed regular extent can be merged. - * - * @prev and @next will be both updated to point to the new merged range. - * Thus one of them should be removed by the caller. */ -static void merge_ondisk_extents(struct extent_map *prev, struct extent_map *next) +static void merge_ondisk_extents(const struct extent_map *prev, const struct extent_map *next, + struct extent_map *merged) { u64 new_disk_bytenr; u64 new_disk_num_bytes; @@ -284,15 +290,10 @@ static void merge_ondisk_extents(struct extent_map *prev, struct extent_map *nex new_disk_bytenr; new_offset = prev->disk_bytenr + prev->offset - new_disk_bytenr; - prev->disk_bytenr = new_disk_bytenr; - prev->disk_num_bytes = new_disk_num_bytes; - prev->ram_bytes = new_disk_num_bytes; - prev->offset = new_offset; - - next->disk_bytenr = new_disk_bytenr; - next->disk_num_bytes = new_disk_num_bytes; - next->ram_bytes = new_disk_num_bytes; - next->offset = new_offset; + merged->disk_bytenr = new_disk_bytenr; + merged->disk_num_bytes = new_disk_num_bytes; + merged->ram_bytes = new_disk_num_bytes; + merged->offset = new_offset; } static void dump_extent_map(struct btrfs_fs_info *fs_info, const char *prefix, @@ -361,7 +362,7 @@ static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) em->generation = max(em->generation, merge->generation); if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE) - merge_ondisk_extents(merge, em); + merge_ondisk_extents(merge, em, em); em->flags |= EXTENT_FLAG_MERGED; validate_extent_map(fs_info, em); @@ -378,7 +379,7 @@ static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) if (rb && can_merge_extent_map(merge) && mergeable_maps(em, merge)) { em->len += merge->len; if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE) - merge_ondisk_extents(em, merge); + merge_ondisk_extents(em, merge, em); validate_extent_map(fs_info, em); rb_erase(&merge->rb_node, &tree->root); RB_CLEAR_NODE(&merge->rb_node); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 5618ca02934a..da51edbad6a0 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4368,11 +4368,8 @@ static int btrfs_unlink_subvol(struct btrfs_trans_handle *trans, */ if (btrfs_ino(inode) == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID) { di = btrfs_search_dir_index_item(root, path, dir_ino, &fname.disk_name); - if (IS_ERR_OR_NULL(di)) { - if (!di) - ret = -ENOENT; - else - ret = PTR_ERR(di); + if (IS_ERR(di)) { + ret = PTR_ERR(di); btrfs_abort_transaction(trans, ret); goto out; } diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 1332ec59c539..a0e8deca87a7 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1407,7 +1407,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info) fs_info->quota_root = NULL; fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_ON; fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE; - fs_info->qgroup_drop_subtree_thres = BTRFS_MAX_LEVEL; + fs_info->qgroup_drop_subtree_thres = BTRFS_QGROUP_DROP_SUBTREE_THRES_DEFAULT; spin_unlock(&fs_info->qgroup_lock); btrfs_free_qgroup_config(fs_info); diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h index 98adf4ec7b01..c229256d6fd5 100644 --- a/fs/btrfs/qgroup.h +++ b/fs/btrfs/qgroup.h @@ -121,6 +121,8 @@ struct btrfs_inode; #define BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN (1ULL << 63) #define BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING (1ULL << 62) +#define BTRFS_QGROUP_DROP_SUBTREE_THRES_DEFAULT (3) + /* * Record a dirty extent, and info qgroup to update quota on it */ diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 98fa0f382480..926d7a9ed99d 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -340,6 +340,15 @@ static int btrfs_parse_param(struct fs_context *fc, struct fs_parameter *param) fallthrough; case Opt_compress: case Opt_compress_type: + /* + * Provide the same semantics as older kernels that don't use fs + * context, specifying the "compress" option clears + * "force-compress" without the need to pass + * "compress-force=[no|none]" before specifying "compress". + */ + if (opt != Opt_compress_force && opt != Opt_compress_force_type) + btrfs_clear_opt(ctx->mount_opt, FORCE_COMPRESS); + if (opt == Opt_compress || opt == Opt_compress_force) { ctx->compress_type = BTRFS_COMPRESS_ZLIB; ctx->compress_level = BTRFS_ZLIB_DEFAULT_LEVEL; @@ -1498,8 +1507,7 @@ static int btrfs_reconfigure(struct fs_context *fc) sync_filesystem(sb); set_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state); - if (!mount_reconfigure && - !btrfs_check_options(fs_info, &ctx->mount_opt, fc->sb_flags)) + if (!btrfs_check_options(fs_info, &ctx->mount_opt, fc->sb_flags)) return -EINVAL; ret = btrfs_check_features(fs_info, !(fc->sb_flags & SB_RDONLY)); diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index 6423e1dedf14..15bf32c21ac0 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -1037,7 +1037,7 @@ error_inode: if (corrupt < 0) { fat_fs_error(new_dir->i_sb, "%s: Filesystem corrupted (i_pos %lld)", - __func__, sinfo.i_pos); + __func__, new_i_pos); } goto out; } diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 78ebd265f425..aa587b2142e2 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -1145,10 +1145,36 @@ static void iomap_write_delalloc_scan(struct inode *inode, } /* + * When a short write occurs, the filesystem might need to use ->iomap_end + * to remove space reservations created in ->iomap_begin. + * + * For filesystems that use delayed allocation, there can be dirty pages over + * the delalloc extent outside the range of a short write but still within the + * delalloc extent allocated for this iomap if the write raced with page + * faults. + * * Punch out all the delalloc blocks in the range given except for those that * have dirty data still pending in the page cache - those are going to be * written and so must still retain the delalloc backing for writeback. * + * The punch() callback *must* only punch delalloc extents in the range passed + * to it. It must skip over all other types of extents in the range and leave + * them completely unchanged. It must do this punch atomically with respect to + * other extent modifications. + * + * The punch() callback may be called with a folio locked to prevent writeback + * extent allocation racing at the edge of the range we are currently punching. + * The locked folio may or may not cover the range being punched, so it is not + * safe for the punch() callback to lock folios itself. + * + * Lock order is: + * + * inode->i_rwsem (shared or exclusive) + * inode->i_mapping->invalidate_lock (exclusive) + * folio_lock() + * ->punch + * internal filesystem allocation lock + * * As we are scanning the page cache for data, we don't need to reimplement the * wheel - mapping_seek_hole_data() does exactly what we need to identify the * start and end of data ranges correctly even for sub-folio block sizes. This @@ -1177,7 +1203,7 @@ static void iomap_write_delalloc_scan(struct inode *inode, * require sprinkling this code with magic "+ 1" and "- 1" arithmetic and expose * the code to subtle off-by-one bugs.... */ -static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, +void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, loff_t end_byte, unsigned flags, struct iomap *iomap, iomap_punch_t punch) { @@ -1185,12 +1211,13 @@ static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, loff_t scan_end_byte = min(i_size_read(inode), end_byte); /* - * Lock the mapping to avoid races with page faults re-instantiating - * folios and dirtying them via ->page_mkwrite whilst we walk the - * cache and perform delalloc extent removal. Failing to do this can - * leave dirty pages with no space reservation in the cache. + * The caller must hold invalidate_lock to avoid races with page faults + * re-instantiating folios and dirtying them via ->page_mkwrite whilst + * we walk the cache and perform delalloc extent removal. Failing to do + * this can leave dirty pages with no space reservation in the cache. */ - filemap_invalidate_lock(inode->i_mapping); + lockdep_assert_held_write(&inode->i_mapping->invalidate_lock); + while (start_byte < scan_end_byte) { loff_t data_end; @@ -1207,7 +1234,7 @@ static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, if (start_byte == -ENXIO || start_byte == scan_end_byte) break; if (WARN_ON_ONCE(start_byte < 0)) - goto out_unlock; + return; WARN_ON_ONCE(start_byte < punch_start_byte); WARN_ON_ONCE(start_byte > scan_end_byte); @@ -1218,7 +1245,7 @@ static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, data_end = mapping_seek_hole_data(inode->i_mapping, start_byte, scan_end_byte, SEEK_HOLE); if (WARN_ON_ONCE(data_end < 0)) - goto out_unlock; + return; /* * If we race with post-direct I/O invalidation of the page cache, @@ -1240,74 +1267,8 @@ static void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, if (punch_start_byte < end_byte) punch(inode, punch_start_byte, end_byte - punch_start_byte, iomap); -out_unlock: - filemap_invalidate_unlock(inode->i_mapping); -} - -/* - * When a short write occurs, the filesystem may need to remove reserved space - * that was allocated in ->iomap_begin from it's ->iomap_end method. For - * filesystems that use delayed allocation, we need to punch out delalloc - * extents from the range that are not dirty in the page cache. As the write can - * race with page faults, there can be dirty pages over the delalloc extent - * outside the range of a short write but still within the delalloc extent - * allocated for this iomap. - * - * This function uses [start_byte, end_byte) intervals (i.e. open ended) to - * simplify range iterations. - * - * The punch() callback *must* only punch delalloc extents in the range passed - * to it. It must skip over all other types of extents in the range and leave - * them completely unchanged. It must do this punch atomically with respect to - * other extent modifications. - * - * The punch() callback may be called with a folio locked to prevent writeback - * extent allocation racing at the edge of the range we are currently punching. - * The locked folio may or may not cover the range being punched, so it is not - * safe for the punch() callback to lock folios itself. - * - * Lock order is: - * - * inode->i_rwsem (shared or exclusive) - * inode->i_mapping->invalidate_lock (exclusive) - * folio_lock() - * ->punch - * internal filesystem allocation lock - */ -void iomap_file_buffered_write_punch_delalloc(struct inode *inode, - loff_t pos, loff_t length, ssize_t written, unsigned flags, - struct iomap *iomap, iomap_punch_t punch) -{ - loff_t start_byte; - loff_t end_byte; - unsigned int blocksize = i_blocksize(inode); - - if (iomap->type != IOMAP_DELALLOC) - return; - - /* If we didn't reserve the blocks, we're not allowed to punch them. */ - if (!(iomap->flags & IOMAP_F_NEW)) - return; - - /* - * start_byte refers to the first unused block after a short write. If - * nothing was written, round offset down to point at the first block in - * the range. - */ - if (unlikely(!written)) - start_byte = round_down(pos, blocksize); - else - start_byte = round_up(pos + written, blocksize); - end_byte = round_up(pos + length, blocksize); - - /* Nothing to do if we've written the entire delalloc extent */ - if (start_byte >= end_byte) - return; - - iomap_write_delalloc_release(inode, start_byte, end_byte, flags, iomap, - punch); } -EXPORT_SYMBOL_GPL(iomap_file_buffered_write_punch_delalloc); +EXPORT_SYMBOL_GPL(iomap_write_delalloc_release); static loff_t iomap_unshare_iter(struct iomap_iter *iter) { diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index 974ecf5e0d95..3ab410059dc2 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -187,7 +187,7 @@ int dbMount(struct inode *ipbmap) } bmp->db_numag = le32_to_cpu(dbmp_le->dn_numag); - if (!bmp->db_numag || bmp->db_numag >= MAXAG) { + if (!bmp->db_numag || bmp->db_numag > MAXAG) { err = -EINVAL; goto err_release_metapage; } diff --git a/fs/namespace.c b/fs/namespace.c index 93c377816d75..d26f5e6d2ca3 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3944,7 +3944,9 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns, new = copy_tree(old, old->mnt.mnt_root, copy_flags); if (IS_ERR(new)) { namespace_unlock(); - free_mnt_ns(new_ns); + ns_free_inum(&new_ns->ns); + dec_mnt_namespaces(new_ns->ucounts); + mnt_ns_release(new_ns); return ERR_CAST(new); } if (user_ns != ns->user_ns) { diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c index c40e226053cc..af46a598f4d7 100644 --- a/fs/netfs/buffered_read.c +++ b/fs/netfs/buffered_read.c @@ -67,7 +67,8 @@ static int netfs_begin_cache_read(struct netfs_io_request *rreq, struct netfs_in * Decant the list of folios to read into a rolling buffer. */ static size_t netfs_load_buffer_from_ra(struct netfs_io_request *rreq, - struct folio_queue *folioq) + struct folio_queue *folioq, + struct folio_batch *put_batch) { unsigned int order, nr; size_t size = 0; @@ -82,6 +83,9 @@ static size_t netfs_load_buffer_from_ra(struct netfs_io_request *rreq, order = folio_order(folio); folioq->orders[i] = order; size += PAGE_SIZE << order; + + if (!folio_batch_add(put_batch, folio)) + folio_batch_release(put_batch); } for (int i = nr; i < folioq_nr_slots(folioq); i++) @@ -120,6 +124,9 @@ static ssize_t netfs_prepare_read_iterator(struct netfs_io_subrequest *subreq) * that we will need to release later - but we don't want to do * that until after we've started the I/O. */ + struct folio_batch put_batch; + + folio_batch_init(&put_batch); while (rreq->submitted < subreq->start + rsize) { struct folio_queue *tail = rreq->buffer_tail, *new; size_t added; @@ -132,10 +139,11 @@ static ssize_t netfs_prepare_read_iterator(struct netfs_io_subrequest *subreq) new->prev = tail; tail->next = new; rreq->buffer_tail = new; - added = netfs_load_buffer_from_ra(rreq, new); + added = netfs_load_buffer_from_ra(rreq, new, &put_batch); rreq->iter.count += added; rreq->submitted += added; } + folio_batch_release(&put_batch); } subreq->len = rsize; @@ -348,6 +356,7 @@ static int netfs_wait_for_read(struct netfs_io_request *rreq) static int netfs_prime_buffer(struct netfs_io_request *rreq) { struct folio_queue *folioq; + struct folio_batch put_batch; size_t added; folioq = kmalloc(sizeof(*folioq), GFP_KERNEL); @@ -360,39 +369,14 @@ static int netfs_prime_buffer(struct netfs_io_request *rreq) rreq->submitted = rreq->start; iov_iter_folio_queue(&rreq->iter, ITER_DEST, folioq, 0, 0, 0); - added = netfs_load_buffer_from_ra(rreq, folioq); + folio_batch_init(&put_batch); + added = netfs_load_buffer_from_ra(rreq, folioq, &put_batch); + folio_batch_release(&put_batch); rreq->iter.count += added; rreq->submitted += added; return 0; } -/* - * Drop the ref on each folio that we inherited from the VM readahead code. We - * still have the folio locks to pin the page until we complete the I/O. - * - * Note that we can't just release the batch in each queue struct as we use the - * occupancy count in other places. - */ -static void netfs_put_ra_refs(struct folio_queue *folioq) -{ - struct folio_batch fbatch; - - folio_batch_init(&fbatch); - while (folioq) { - for (unsigned int slot = 0; slot < folioq_count(folioq); slot++) { - struct folio *folio = folioq_folio(folioq, slot); - if (!folio) - continue; - trace_netfs_folio(folio, netfs_folio_trace_read_put); - if (!folio_batch_add(&fbatch, folio)) - folio_batch_release(&fbatch); - } - folioq = folioq->next; - } - - folio_batch_release(&fbatch); -} - /** * netfs_readahead - Helper to manage a read request * @ractl: The description of the readahead request @@ -436,9 +420,6 @@ void netfs_readahead(struct readahead_control *ractl) goto cleanup_free; netfs_read_to_pagecache(rreq); - /* Release the folio refs whilst we're waiting for the I/O. */ - netfs_put_ra_refs(rreq->buffer); - netfs_put_request(rreq, true, netfs_rreq_trace_put_return); return; diff --git a/fs/netfs/locking.c b/fs/netfs/locking.c index 21eab56ee2f9..2249ecd09d0a 100644 --- a/fs/netfs/locking.c +++ b/fs/netfs/locking.c @@ -109,6 +109,7 @@ int netfs_start_io_write(struct inode *inode) up_write(&inode->i_rwsem); return -ERESTARTSYS; } + downgrade_write(&inode->i_rwsem); return 0; } EXPORT_SYMBOL(netfs_start_io_write); @@ -123,7 +124,7 @@ EXPORT_SYMBOL(netfs_start_io_write); void netfs_end_io_write(struct inode *inode) __releases(inode->i_rwsem) { - up_write(&inode->i_rwsem); + up_read(&inode->i_rwsem); } EXPORT_SYMBOL(netfs_end_io_write); diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index b18c65ba5580..3cbb289535a8 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -77,6 +77,8 @@ static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq, folio_unlock(folio); } } + + folioq_clear(folioq, slot); } /* diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c index fe5b1a30c509..a8602729586a 100644 --- a/fs/nilfs2/dir.c +++ b/fs/nilfs2/dir.c @@ -289,7 +289,7 @@ static int nilfs_readdir(struct file *file, struct dir_context *ctx) * The folio is mapped and unlocked. When the caller is finished with * the entry, it should call folio_release_kmap(). * - * On failure, returns NULL and the caller should ignore foliop. + * On failure, returns an error pointer and the caller should ignore foliop. */ struct nilfs_dir_entry *nilfs_find_entry(struct inode *dir, const struct qstr *qstr, struct folio **foliop) @@ -312,22 +312,24 @@ struct nilfs_dir_entry *nilfs_find_entry(struct inode *dir, do { char *kaddr = nilfs_get_folio(dir, n, foliop); - if (!IS_ERR(kaddr)) { - de = (struct nilfs_dir_entry *)kaddr; - kaddr += nilfs_last_byte(dir, n) - reclen; - while ((char *) de <= kaddr) { - if (de->rec_len == 0) { - nilfs_error(dir->i_sb, - "zero-length directory entry"); - folio_release_kmap(*foliop, kaddr); - goto out; - } - if (nilfs_match(namelen, name, de)) - goto found; - de = nilfs_next_entry(de); + if (IS_ERR(kaddr)) + return ERR_CAST(kaddr); + + de = (struct nilfs_dir_entry *)kaddr; + kaddr += nilfs_last_byte(dir, n) - reclen; + while ((char *)de <= kaddr) { + if (de->rec_len == 0) { + nilfs_error(dir->i_sb, + "zero-length directory entry"); + folio_release_kmap(*foliop, kaddr); + goto out; } - folio_release_kmap(*foliop, kaddr); + if (nilfs_match(namelen, name, de)) + goto found; + de = nilfs_next_entry(de); } + folio_release_kmap(*foliop, kaddr); + if (++n >= npages) n = 0; /* next folio is past the blocks we've got */ @@ -340,7 +342,7 @@ struct nilfs_dir_entry *nilfs_find_entry(struct inode *dir, } } while (n != start); out: - return NULL; + return ERR_PTR(-ENOENT); found: ei->i_dir_start_lookup = n; @@ -384,18 +386,18 @@ fail: return NULL; } -ino_t nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr) +int nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr, ino_t *ino) { - ino_t res = 0; struct nilfs_dir_entry *de; struct folio *folio; de = nilfs_find_entry(dir, qstr, &folio); - if (de) { - res = le64_to_cpu(de->inode); - folio_release_kmap(folio, de); - } - return res; + if (IS_ERR(de)) + return PTR_ERR(de); + + *ino = le64_to_cpu(de->inode); + folio_release_kmap(folio, de); + return 0; } void nilfs_set_link(struct inode *dir, struct nilfs_dir_entry *de, diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c index c950139db6ef..4905063790c5 100644 --- a/fs/nilfs2/namei.c +++ b/fs/nilfs2/namei.c @@ -55,12 +55,20 @@ nilfs_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { struct inode *inode; ino_t ino; + int res; if (dentry->d_name.len > NILFS_NAME_LEN) return ERR_PTR(-ENAMETOOLONG); - ino = nilfs_inode_by_name(dir, &dentry->d_name); - inode = ino ? nilfs_iget(dir->i_sb, NILFS_I(dir)->i_root, ino) : NULL; + res = nilfs_inode_by_name(dir, &dentry->d_name, &ino); + if (res) { + if (res != -ENOENT) + return ERR_PTR(res); + inode = NULL; + } else { + inode = nilfs_iget(dir->i_sb, NILFS_I(dir)->i_root, ino); + } + return d_splice_alias(inode, dentry); } @@ -263,10 +271,11 @@ static int nilfs_do_unlink(struct inode *dir, struct dentry *dentry) struct folio *folio; int err; - err = -ENOENT; de = nilfs_find_entry(dir, &dentry->d_name, &folio); - if (!de) + if (IS_ERR(de)) { + err = PTR_ERR(de); goto out; + } inode = d_inode(dentry); err = -EIO; @@ -362,10 +371,11 @@ static int nilfs_rename(struct mnt_idmap *idmap, if (unlikely(err)) return err; - err = -ENOENT; old_de = nilfs_find_entry(old_dir, &old_dentry->d_name, &old_folio); - if (!old_de) + if (IS_ERR(old_de)) { + err = PTR_ERR(old_de); goto out; + } if (S_ISDIR(old_inode->i_mode)) { err = -EIO; @@ -382,10 +392,12 @@ static int nilfs_rename(struct mnt_idmap *idmap, if (dir_de && !nilfs_empty_dir(new_inode)) goto out_dir; - err = -ENOENT; - new_de = nilfs_find_entry(new_dir, &new_dentry->d_name, &new_folio); - if (!new_de) + new_de = nilfs_find_entry(new_dir, &new_dentry->d_name, + &new_folio); + if (IS_ERR(new_de)) { + err = PTR_ERR(new_de); goto out_dir; + } nilfs_set_link(new_dir, new_de, new_folio, old_inode); folio_release_kmap(new_folio, new_de); nilfs_mark_inode_dirty(new_dir); @@ -440,12 +452,13 @@ out: */ static struct dentry *nilfs_get_parent(struct dentry *child) { - unsigned long ino; + ino_t ino; + int res; struct nilfs_root *root; - ino = nilfs_inode_by_name(d_inode(child), &dotdot_name); - if (!ino) - return ERR_PTR(-ENOENT); + res = nilfs_inode_by_name(d_inode(child), &dotdot_name, &ino); + if (res) + return ERR_PTR(res); root = NILFS_I(d_inode(child))->i_root; diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h index fb1c4c5bae7c..45d03826eaf1 100644 --- a/fs/nilfs2/nilfs.h +++ b/fs/nilfs2/nilfs.h @@ -254,7 +254,7 @@ static inline __u32 nilfs_mask_flags(umode_t mode, __u32 flags) /* dir.c */ int nilfs_add_link(struct dentry *, struct inode *); -ino_t nilfs_inode_by_name(struct inode *, const struct qstr *); +int nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr, ino_t *ino); int nilfs_make_empty(struct inode *, struct inode *); struct nilfs_dir_entry *nilfs_find_entry(struct inode *, const struct qstr *, struct folio **); diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c index 9c0b7cddeaae..5436eb0424bd 100644 --- a/fs/nilfs2/page.c +++ b/fs/nilfs2/page.c @@ -77,7 +77,8 @@ void nilfs_forget_buffer(struct buffer_head *bh) const unsigned long clear_bits = (BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) | BIT(BH_Async_Write) | BIT(BH_NILFS_Volatile) | - BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected)); + BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected) | + BIT(BH_Delay)); lock_buffer(bh); set_mask_bits(&bh->b_state, clear_bits, 0); @@ -406,7 +407,8 @@ void nilfs_clear_folio_dirty(struct folio *folio) const unsigned long clear_bits = (BIT(BH_Uptodate) | BIT(BH_Dirty) | BIT(BH_Mapped) | BIT(BH_Async_Write) | BIT(BH_NILFS_Volatile) | - BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected)); + BIT(BH_NILFS_Checked) | BIT(BH_NILFS_Redirected) | + BIT(BH_Delay)); bh = head; do { diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index ad131a2fc58e..58887456e3c5 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1129,9 +1129,12 @@ int ocfs2_setattr(struct mnt_idmap *idmap, struct dentry *dentry, trace_ocfs2_setattr(inode, dentry, (unsigned long long)OCFS2_I(inode)->ip_blkno, dentry->d_name.len, dentry->d_name.name, - attr->ia_valid, attr->ia_mode, - from_kuid(&init_user_ns, attr->ia_uid), - from_kgid(&init_user_ns, attr->ia_gid)); + attr->ia_valid, + attr->ia_valid & ATTR_MODE ? attr->ia_mode : 0, + attr->ia_valid & ATTR_UID ? + from_kuid(&init_user_ns, attr->ia_uid) : 0, + attr->ia_valid & ATTR_GID ? + from_kgid(&init_user_ns, attr->ia_gid) : 0); /* ensuring we don't even attempt to truncate a symlink */ if (S_ISLNK(inode->i_mode)) diff --git a/fs/open.c b/fs/open.c index acaeb3e25c88..5da4df2f9b18 100644 --- a/fs/open.c +++ b/fs/open.c @@ -1457,6 +1457,8 @@ SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename, if (unlikely(usize < OPEN_HOW_SIZE_VER0)) return -EINVAL; + if (unlikely(usize > PAGE_SIZE)) + return -E2BIG; err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize); if (err) diff --git a/fs/proc/fd.c b/fs/proc/fd.c index 1f54a54bfb91..5e391cbca7a3 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -77,7 +77,7 @@ static int seq_fdinfo_open(struct inode *inode, struct file *file) return single_open(file, seq_show, inode); } -/** +/* * Shared /proc/pid/fdinfo and /proc/pid/fdinfo/fd permission helper to ensure * that the current task has PTRACE_MODE_READ in addition to the normal * POSIX-like checks. diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 72f14fd59c2d..e52bd96137a6 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -909,8 +909,15 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) { /* * Don't forget to update Documentation/ on changes. + * + * The length of the second argument of mnemonics[] + * needs to be 3 instead of previously set 2 + * (i.e. from [BITS_PER_LONG][2] to [BITS_PER_LONG][3]) + * to avoid spurious + * -Werror=unterminated-string-initialization warning + * with GCC 15 */ - static const char mnemonics[BITS_PER_LONG][2] = { + static const char mnemonics[BITS_PER_LONG][3] = { /* * In case if we meet a flag we don't know about. */ @@ -987,11 +994,8 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) for (i = 0; i < BITS_PER_LONG; i++) { if (!mnemonics[i][0]) continue; - if (vma->vm_flags & (1UL << i)) { - seq_putc(m, mnemonics[i][0]); - seq_putc(m, mnemonics[i][1]); - seq_putc(m, ' '); - } + if (vma->vm_flags & (1UL << i)) + seq_printf(m, "%s ", mnemonics[i]); } seq_putc(m, '\n'); } diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index 68c716e6261b..1d3470bca45e 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -252,10 +252,6 @@ extern int cifs_read_from_socket(struct TCP_Server_Info *server, char *buf, unsigned int to_read); extern ssize_t cifs_discard_from_socket(struct TCP_Server_Info *server, size_t to_read); -extern int cifs_read_page_from_socket(struct TCP_Server_Info *server, - struct page *page, - unsigned int page_offset, - unsigned int to_read); int cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter, unsigned int to_read); @@ -623,8 +619,6 @@ enum securityEnum cifs_select_sectype(struct TCP_Server_Info *, int cifs_alloc_hash(const char *name, struct shash_desc **sdesc); void cifs_free_hash(struct shash_desc **sdesc); -struct cifs_chan * -cifs_ses_find_chan(struct cifs_ses *ses, struct TCP_Server_Info *server); int cifs_try_adding_channels(struct cifs_ses *ses); bool is_server_using_iface(struct TCP_Server_Info *server, struct cifs_server_iface *iface); @@ -640,9 +634,6 @@ cifs_chan_set_in_reconnect(struct cifs_ses *ses, void cifs_chan_clear_in_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server); -bool -cifs_chan_in_reconnect(struct cifs_ses *ses, - struct TCP_Server_Info *server); void cifs_chan_set_need_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server); diff --git a/fs/smb/client/compress.c b/fs/smb/client/compress.c index 63b5a55b7a57..766b4de13da7 100644 --- a/fs/smb/client/compress.c +++ b/fs/smb/client/compress.c @@ -166,7 +166,6 @@ static int collect_sample(const struct iov_iter *iter, ssize_t max, u8 *sample) loff_t start = iter->xarray_start + iter->iov_offset; pgoff_t last, index = start / PAGE_SIZE; size_t len, off, foff; - ssize_t ret = 0; void *p; int s = 0; @@ -193,9 +192,6 @@ static int collect_sample(const struct iov_iter *iter, ssize_t max, u8 *sample) memcpy(&sample[s], p, len2); kunmap_local(p); - if (ret < 0) - return ret; - s += len2; if (len2 < SZ_2K || s >= max - SZ_2K) diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c index adf8758847f6..15d94ac4095e 100644 --- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -795,18 +795,6 @@ cifs_discard_from_socket(struct TCP_Server_Info *server, size_t to_read) } int -cifs_read_page_from_socket(struct TCP_Server_Info *server, struct page *page, - unsigned int page_offset, unsigned int to_read) -{ - struct msghdr smb_msg = {}; - struct bio_vec bv; - - bvec_set_page(&bv, page, to_read, page_offset); - iov_iter_bvec(&smb_msg.msg_iter, ITER_DEST, &bv, 1, to_read); - return cifs_readv_from_socket(server, &smb_msg); -} - -int cifs_read_iter_from_socket(struct TCP_Server_Info *server, struct iov_iter *iter, unsigned int to_read) { diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c index 3216f786908f..c88e9657f47a 100644 --- a/fs/smb/client/sess.c +++ b/fs/smb/client/sess.c @@ -115,18 +115,6 @@ cifs_chan_clear_in_reconnect(struct cifs_ses *ses, ses->chans[chan_index].in_reconnect = false; } -bool -cifs_chan_in_reconnect(struct cifs_ses *ses, - struct TCP_Server_Info *server) -{ - unsigned int chan_index = cifs_ses_get_chan_index(ses, server); - - if (chan_index == CIFS_INVAL_CHAN_INDEX) - return true; /* err on the safer side */ - - return CIFS_CHAN_IN_RECONNECT(ses, chan_index); -} - void cifs_chan_set_need_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) @@ -487,26 +475,6 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) spin_unlock(&ses->chan_lock); } -/* - * If server is a channel of ses, return the corresponding enclosing - * cifs_chan otherwise return NULL. - */ -struct cifs_chan * -cifs_ses_find_chan(struct cifs_ses *ses, struct TCP_Server_Info *server) -{ - int i; - - spin_lock(&ses->chan_lock); - for (i = 0; i < ses->chan_count; i++) { - if (ses->chans[i].server == server) { - spin_unlock(&ses->chan_lock); - return &ses->chans[i]; - } - } - spin_unlock(&ses->chan_lock); - return NULL; -} - static int cifs_ses_add_channel(struct cifs_ses *ses, struct cifs_server_iface *iface) diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index 6b385fce3f2a..24a2aa04a108 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -1158,7 +1158,7 @@ smb2_set_ea(const unsigned int xid, struct cifs_tcon *tcon, struct cifs_fid fid; unsigned int size[1]; void *data[1]; - struct smb2_file_full_ea_info *ea = NULL; + struct smb2_file_full_ea_info *ea; struct smb2_query_info_rsp *rsp; int rc, used_len = 0; int retries = 0, cur_sleep = 1; @@ -1179,6 +1179,7 @@ replay_again: if (!utf16_path) return -ENOMEM; + ea = NULL; resp_buftype[0] = resp_buftype[1] = resp_buftype[2] = CIFS_NO_BUFFER; vars = kzalloc(sizeof(*vars), GFP_KERNEL); if (!vars) { diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index b2f16a7b696d..6584b5cddc28 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -3313,6 +3313,15 @@ SMB2_ioctl_init(struct cifs_tcon *tcon, struct TCP_Server_Info *server, return rc; if (indatalen) { + unsigned int len; + + if (WARN_ON_ONCE(smb3_encryption_required(tcon) && + (check_add_overflow(total_len - 1, + ALIGN(indatalen, 8), &len) || + len > MAX_CIFS_SMALL_BUFFER_SIZE))) { + cifs_small_buf_release(req); + return -EIO; + } /* * indatalen is usually small at a couple of bytes max, so * just allocate through generic pool diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c index 49dc38acc66b..4505f4829d53 100644 --- a/fs/xfs/scrub/bmap_repair.c +++ b/fs/xfs/scrub/bmap_repair.c @@ -801,7 +801,7 @@ xrep_bmap( { struct xrep_bmap *rb; char *descr; - unsigned int max_bmbt_recs; + xfs_extnum_t max_bmbt_recs; bool large_extcount; int error = 0; diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 6dead20338e2..559a3a577097 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -116,7 +116,7 @@ xfs_end_ioend( if (unlikely(error)) { if (ioend->io_flags & IOMAP_F_SHARED) { xfs_reflink_cancel_cow_range(ip, offset, size, true); - xfs_bmap_punch_delalloc_range(ip, offset, + xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, offset, offset + size); } goto done; @@ -456,7 +456,7 @@ xfs_discard_folio( * byte of the next folio. Hence the end offset is only dependent on the * folio itself and not the start offset that is passed in. */ - xfs_bmap_punch_delalloc_range(ip, pos, + xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, pos, folio_pos(folio) + folio_size(folio)); } diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 053d567c9108..4719ec90029c 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -442,11 +442,12 @@ out_unlock_iolock: void xfs_bmap_punch_delalloc_range( struct xfs_inode *ip, + int whichfork, xfs_off_t start_byte, xfs_off_t end_byte) { struct xfs_mount *mp = ip->i_mount; - struct xfs_ifork *ifp = &ip->i_df; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork); xfs_fileoff_t start_fsb = XFS_B_TO_FSBT(mp, start_byte); xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, end_byte); struct xfs_bmbt_irec got, del; @@ -474,11 +475,14 @@ xfs_bmap_punch_delalloc_range( continue; } - xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur, &got, &del); + xfs_bmap_del_extent_delay(ip, whichfork, &icur, &got, &del); if (!xfs_iext_get_extent(ifp, &icur, &got)) break; } + if (whichfork == XFS_COW_FORK && !ifp->if_bytes) + xfs_inode_clear_cowblocks_tag(ip); + out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); } @@ -580,7 +584,7 @@ xfs_free_eofblocks( */ if (ip->i_diflags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND)) { if (ip->i_delayed_blks) { - xfs_bmap_punch_delalloc_range(ip, + xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, round_up(XFS_ISIZE(ip), mp->m_sb.sb_blocksize), LLONG_MAX); } diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h index eb0895bfb9da..b29760d36e1a 100644 --- a/fs/xfs/xfs_bmap_util.h +++ b/fs/xfs/xfs_bmap_util.h @@ -30,7 +30,7 @@ xfs_bmap_rtalloc(struct xfs_bmalloca *ap) } #endif /* CONFIG_XFS_RT */ -void xfs_bmap_punch_delalloc_range(struct xfs_inode *ip, +void xfs_bmap_punch_delalloc_range(struct xfs_inode *ip, int whichfork, xfs_off_t start_byte, xfs_off_t end_byte); struct kgetbmap { diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 412b1d71b52b..b19916b11fd5 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -348,9 +348,82 @@ xfs_file_splice_read( } /* + * Take care of zeroing post-EOF blocks when they might exist. + * + * Returns 0 if successfully, a negative error for a failure, or 1 if this + * function dropped the iolock and reacquired it exclusively and the caller + * needs to restart the write sanity checks. + */ +static ssize_t +xfs_file_write_zero_eof( + struct kiocb *iocb, + struct iov_iter *from, + unsigned int *iolock, + size_t count, + bool *drained_dio) +{ + struct xfs_inode *ip = XFS_I(iocb->ki_filp->f_mapping->host); + loff_t isize; + int error; + + /* + * We need to serialise against EOF updates that occur in IO completions + * here. We want to make sure that nobody is changing the size while + * we do this check until we have placed an IO barrier (i.e. hold + * XFS_IOLOCK_EXCL) that prevents new IO from being dispatched. The + * spinlock effectively forms a memory barrier once we have + * XFS_IOLOCK_EXCL so we are guaranteed to see the latest EOF value and + * hence be able to correctly determine if we need to run zeroing. + */ + spin_lock(&ip->i_flags_lock); + isize = i_size_read(VFS_I(ip)); + if (iocb->ki_pos <= isize) { + spin_unlock(&ip->i_flags_lock); + return 0; + } + spin_unlock(&ip->i_flags_lock); + + if (iocb->ki_flags & IOCB_NOWAIT) + return -EAGAIN; + + if (!*drained_dio) { + /* + * If zeroing is needed and we are currently holding the iolock + * shared, we need to update it to exclusive which implies + * having to redo all checks before. + */ + if (*iolock == XFS_IOLOCK_SHARED) { + xfs_iunlock(ip, *iolock); + *iolock = XFS_IOLOCK_EXCL; + xfs_ilock(ip, *iolock); + iov_iter_reexpand(from, count); + } + + /* + * We now have an IO submission barrier in place, but AIO can do + * EOF updates during IO completion and hence we now need to + * wait for all of them to drain. Non-AIO DIO will have drained + * before we are given the XFS_IOLOCK_EXCL, and so for most + * cases this wait is a no-op. + */ + inode_dio_wait(VFS_I(ip)); + *drained_dio = true; + return 1; + } + + trace_xfs_zero_eof(ip, isize, iocb->ki_pos - isize); + + xfs_ilock(ip, XFS_MMAPLOCK_EXCL); + error = xfs_zero_range(ip, isize, iocb->ki_pos - isize, NULL); + xfs_iunlock(ip, XFS_MMAPLOCK_EXCL); + + return error; +} + +/* * Common pre-write limit and setup checks. * - * Called with the iolocked held either shared and exclusive according to + * Called with the iolock held either shared and exclusive according to * @iolock, and returns with it held. Might upgrade the iolock to exclusive * if called for a direct write beyond i_size. */ @@ -360,13 +433,10 @@ xfs_file_write_checks( struct iov_iter *from, unsigned int *iolock) { - struct file *file = iocb->ki_filp; - struct inode *inode = file->f_mapping->host; - struct xfs_inode *ip = XFS_I(inode); - ssize_t error = 0; + struct inode *inode = iocb->ki_filp->f_mapping->host; size_t count = iov_iter_count(from); bool drained_dio = false; - loff_t isize; + ssize_t error; restart: error = generic_write_checks(iocb, from); @@ -389,7 +459,7 @@ restart: * exclusively. */ if (*iolock == XFS_IOLOCK_SHARED && !IS_NOSEC(inode)) { - xfs_iunlock(ip, *iolock); + xfs_iunlock(XFS_I(inode), *iolock); *iolock = XFS_IOLOCK_EXCL; error = xfs_ilock_iocb(iocb, *iolock); if (error) { @@ -400,64 +470,24 @@ restart: } /* - * If the offset is beyond the size of the file, we need to zero any + * If the offset is beyond the size of the file, we need to zero all * blocks that fall between the existing EOF and the start of this - * write. If zeroing is needed and we are currently holding the iolock - * shared, we need to update it to exclusive which implies having to - * redo all checks before. + * write. * - * We need to serialise against EOF updates that occur in IO completions - * here. We want to make sure that nobody is changing the size while we - * do this check until we have placed an IO barrier (i.e. hold the - * XFS_IOLOCK_EXCL) that prevents new IO from being dispatched. The - * spinlock effectively forms a memory barrier once we have the - * XFS_IOLOCK_EXCL so we are guaranteed to see the latest EOF value and - * hence be able to correctly determine if we need to run zeroing. - * - * We can do an unlocked check here safely as IO completion can only - * extend EOF. Truncate is locked out at this point, so the EOF can - * not move backwards, only forwards. Hence we only need to take the - * slow path and spin locks when we are at or beyond the current EOF. + * We can do an unlocked check for i_size here safely as I/O completion + * can only extend EOF. Truncate is locked out at this point, so the + * EOF can not move backwards, only forwards. Hence we only need to take + * the slow path when we are at or beyond the current EOF. */ - if (iocb->ki_pos <= i_size_read(inode)) - goto out; - - spin_lock(&ip->i_flags_lock); - isize = i_size_read(inode); - if (iocb->ki_pos > isize) { - spin_unlock(&ip->i_flags_lock); - - if (iocb->ki_flags & IOCB_NOWAIT) - return -EAGAIN; - - if (!drained_dio) { - if (*iolock == XFS_IOLOCK_SHARED) { - xfs_iunlock(ip, *iolock); - *iolock = XFS_IOLOCK_EXCL; - xfs_ilock(ip, *iolock); - iov_iter_reexpand(from, count); - } - /* - * We now have an IO submission barrier in place, but - * AIO can do EOF updates during IO completion and hence - * we now need to wait for all of them to drain. Non-AIO - * DIO will have drained before we are given the - * XFS_IOLOCK_EXCL, and so for most cases this wait is a - * no-op. - */ - inode_dio_wait(inode); - drained_dio = true; + if (iocb->ki_pos > i_size_read(inode)) { + error = xfs_file_write_zero_eof(iocb, from, iolock, count, + &drained_dio); + if (error == 1) goto restart; - } - - trace_xfs_zero_eof(ip, isize, iocb->ki_pos - isize); - error = xfs_zero_range(ip, isize, iocb->ki_pos - isize, NULL); if (error) return error; - } else - spin_unlock(&ip->i_flags_lock); + } -out: return kiocb_modified(iocb); } diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 1e11f48814c0..916531d9f83c 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -975,6 +975,7 @@ xfs_buffered_write_iomap_begin( int allocfork = XFS_DATA_FORK; int error = 0; unsigned int lockmode = XFS_ILOCK_EXCL; + unsigned int iomap_flags = 0; u64 seq; if (xfs_is_shutdown(mp)) @@ -1145,6 +1146,11 @@ xfs_buffered_write_iomap_begin( } } + /* + * Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch + * them out if the write happens to fail. + */ + iomap_flags |= IOMAP_F_NEW; if (allocfork == XFS_COW_FORK) { error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb, end_fsb - offset_fsb, prealloc_blocks, &cmap, @@ -1162,19 +1168,11 @@ xfs_buffered_write_iomap_begin( if (error) goto out_unlock; - /* - * Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch - * them out if the write happens to fail. - */ - seq = xfs_iomap_inode_sequence(ip, IOMAP_F_NEW); - xfs_iunlock(ip, lockmode); trace_xfs_iomap_alloc(ip, offset, count, allocfork, &imap); - return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, IOMAP_F_NEW, seq); - found_imap: - seq = xfs_iomap_inode_sequence(ip, 0); + seq = xfs_iomap_inode_sequence(ip, iomap_flags); xfs_iunlock(ip, lockmode); - return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq); + return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, iomap_flags, seq); convert_delay: xfs_iunlock(ip, lockmode); @@ -1188,20 +1186,20 @@ convert_delay: return 0; found_cow: - seq = xfs_iomap_inode_sequence(ip, 0); if (imap.br_startoff <= offset_fsb) { - error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags, 0, seq); + error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags, 0, + xfs_iomap_inode_sequence(ip, 0)); if (error) goto out_unlock; - seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED); - xfs_iunlock(ip, lockmode); - return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, - IOMAP_F_SHARED, seq); + } else { + xfs_trim_extent(&cmap, offset_fsb, + imap.br_startoff - offset_fsb); } - xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb); + iomap_flags |= IOMAP_F_SHARED; + seq = xfs_iomap_inode_sequence(ip, iomap_flags); xfs_iunlock(ip, lockmode); - return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, 0, seq); + return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, iomap_flags, seq); out_unlock: xfs_iunlock(ip, lockmode); @@ -1215,7 +1213,10 @@ xfs_buffered_write_delalloc_punch( loff_t length, struct iomap *iomap) { - xfs_bmap_punch_delalloc_range(XFS_I(inode), offset, offset + length); + xfs_bmap_punch_delalloc_range(XFS_I(inode), + (iomap->flags & IOMAP_F_SHARED) ? + XFS_COW_FORK : XFS_DATA_FORK, + offset, offset + length); } static int @@ -1227,8 +1228,30 @@ xfs_buffered_write_iomap_end( unsigned flags, struct iomap *iomap) { - iomap_file_buffered_write_punch_delalloc(inode, offset, length, written, - flags, iomap, &xfs_buffered_write_delalloc_punch); + loff_t start_byte, end_byte; + + /* If we didn't reserve the blocks, we're not allowed to punch them. */ + if (iomap->type != IOMAP_DELALLOC || !(iomap->flags & IOMAP_F_NEW)) + return 0; + + /* Nothing to do if we've written the entire delalloc extent */ + start_byte = iomap_last_written_block(inode, offset, written); + end_byte = round_up(offset + length, i_blocksize(inode)); + if (start_byte >= end_byte) + return 0; + + /* For zeroing operations the callers already hold invalidate_lock. */ + if (flags & (IOMAP_UNSHARE | IOMAP_ZERO)) { + rwsem_assert_held_write(&inode->i_mapping->invalidate_lock); + iomap_write_delalloc_release(inode, start_byte, end_byte, flags, + iomap, xfs_buffered_write_delalloc_punch); + } else { + filemap_invalidate_lock(inode->i_mapping); + iomap_write_delalloc_release(inode, start_byte, end_byte, flags, + iomap, xfs_buffered_write_delalloc_punch); + filemap_invalidate_unlock(inode->i_mapping); + } + return 0; } @@ -1435,6 +1458,8 @@ xfs_zero_range( { struct inode *inode = VFS_I(ip); + xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL); + if (IS_DAX(inode)) return dax_zero_range(inode, pos, len, did_zero, &xfs_dax_write_iomap_ops); diff --git a/include/linux/host1x.h b/include/linux/host1x.h index 9c8119ed13a4..c4dde3aafcac 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -466,6 +466,7 @@ struct host1x_memory_context { refcount_t ref; struct pid *owner; + struct device_dma_parameters dma_parms; struct device dev; u64 dma_mask; u32 stream_id; diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 67d0ab3c3bba..ef5b80e48599 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -322,6 +322,24 @@ struct thpsize { (transparent_hugepage_flags & \ (1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG)) +static inline bool vma_thp_disabled(struct vm_area_struct *vma, + unsigned long vm_flags) +{ + /* + * Explicitly disabled through madvise or prctl, or some + * architectures may disable THP for some mappings, for + * example, s390 kvm. + */ + return (vm_flags & VM_NOHUGEPAGE) || + test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags); +} + +static inline bool thp_disabled_by_hw(void) +{ + /* If the hardware/firmware marked hugepage support disabled. */ + return transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED); +} + unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 4ad12a3c8bae..d0420e962ffd 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -256,6 +256,20 @@ static inline const struct iomap *iomap_iter_srcmap(const struct iomap_iter *i) return &i->iomap; } +/* + * Return the file offset for the first unchanged block after a short write. + * + * If nothing was written, round @pos down to point at the first block in + * the range, else round up to include the partially written block. + */ +static inline loff_t iomap_last_written_block(struct inode *inode, loff_t pos, + ssize_t written) +{ + if (unlikely(!written)) + return round_down(pos, i_blocksize(inode)); + return round_up(pos + written, i_blocksize(inode)); +} + ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from, const struct iomap_ops *ops, void *private); int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops); @@ -276,9 +290,9 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, typedef void (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length, struct iomap *iomap); -void iomap_file_buffered_write_punch_delalloc(struct inode *inode, loff_t pos, - loff_t length, ssize_t written, unsigned flag, - struct iomap *iomap, iomap_punch_t punch); +void iomap_write_delalloc_release(struct inode *inode, loff_t start_byte, + loff_t end_byte, unsigned flags, struct iomap *iomap, + iomap_punch_t punch); int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len, const struct iomap_ops *ops); diff --git a/include/linux/irqchip/arm-gic-v4.h b/include/linux/irqchip/arm-gic-v4.h index ecabed6d3307..7f1f11a5e4e4 100644 --- a/include/linux/irqchip/arm-gic-v4.h +++ b/include/linux/irqchip/arm-gic-v4.h @@ -66,10 +66,12 @@ struct its_vpe { bool enabled; bool group; } sgi_config[16]; - atomic_t vmapp_count; }; }; + /* Track the VPE being mapped */ + atomic_t vmapp_count; + /* * Ensures mutual exclusion between affinity setting of the * vPE and vLPI operations using vpe->col_idx. diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index db567d26f7b9..45be36e5285f 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1313,8 +1313,6 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn); struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu); struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn); -kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn); -kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn); int kvm_vcpu_map(struct kvm_vcpu *vcpu, gpa_t gpa, struct kvm_host_map *map); void kvm_vcpu_unmap(struct kvm_vcpu *vcpu, struct kvm_host_map *map, bool dirty); unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn); diff --git a/include/linux/mm.h b/include/linux/mm.h index ecf63d2b0582..61fff5d34ed5 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3818,8 +3818,9 @@ void *sparse_buffer_alloc(unsigned long size); struct page * __populate_section_memmap(unsigned long pfn, unsigned long nr_pages, int nid, struct vmem_altmap *altmap, struct dev_pagemap *pgmap); -void pmd_init(void *addr); void pud_init(void *addr); +void pmd_init(void *addr); +void kernel_pte_init(void *addr); pgd_t *vmemmap_pgd_populate(unsigned long addr, int node); p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node); pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node); diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index bbd30f3c5d29..389f92e4d980 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3384,6 +3384,12 @@ static inline void netif_tx_wake_all_queues(struct net_device *dev) static __always_inline void netif_tx_stop_queue(struct netdev_queue *dev_queue) { + /* Paired with READ_ONCE() from dev_watchdog() */ + WRITE_ONCE(dev_queue->trans_start, jiffies); + + /* This barrier is paired with smp_mb() from dev_watchdog() */ + smp_mb__before_atomic(); + /* Must be an atomic op see netif_txq_try_stop() */ set_bit(__QUEUE_STATE_DRV_XOFF, &dev_queue->state); } @@ -3510,6 +3516,12 @@ static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue, if (likely(dql_avail(&dev_queue->dql) >= 0)) return; + /* Paired with READ_ONCE() from dev_watchdog() */ + WRITE_ONCE(dev_queue->trans_start, jiffies); + + /* This barrier is paired with smp_mb() from dev_watchdog() */ + smp_mb__before_atomic(); + set_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state); /* diff --git a/include/linux/percpu.h b/include/linux/percpu.h index b6321fc49159..52b5ea663b9f 100644 --- a/include/linux/percpu.h +++ b/include/linux/percpu.h @@ -41,7 +41,11 @@ PCPU_MIN_ALLOC_SHIFT) #ifdef CONFIG_RANDOM_KMALLOC_CACHES -#define PERCPU_DYNAMIC_SIZE_SHIFT 12 +# if defined(CONFIG_LOCKDEP) && !defined(CONFIG_PAGE_SIZE_4KB) +# define PERCPU_DYNAMIC_SIZE_SHIFT 13 +# else +# define PERCPU_DYNAMIC_SIZE_SHIFT 12 +#endif /* LOCKDEP and PAGE_SIZE > 4KiB */ #else #define PERCPU_DYNAMIC_SIZE_SHIFT 10 #endif diff --git a/include/linux/sched.h b/include/linux/sched.h index 449dd64ed9ac..bb343136ddd0 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2133,6 +2133,11 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu) #endif /* CONFIG_SMP */ +static inline bool task_is_runnable(struct task_struct *p) +{ + return p->on_rq && !p->se.sched_delayed; +} + extern bool sched_task_on_rq(struct task_struct *p); extern unsigned long get_wchan(struct task_struct *p); extern struct task_struct *cpu_curr_snapshot(int cpu); diff --git a/include/linux/soc/qcom/geni-se.h b/include/linux/soc/qcom/geni-se.h index c3bca9c0bf2c..2996a3c28ef3 100644 --- a/include/linux/soc/qcom/geni-se.h +++ b/include/linux/soc/qcom/geni-se.h @@ -258,8 +258,8 @@ struct geni_se { #define RX_DMA_PARITY_ERR BIT(5) #define RX_DMA_BREAK GENMASK(8, 7) #define RX_GENI_GP_IRQ GENMASK(10, 5) -#define RX_GENI_CANCEL_IRQ BIT(11) #define RX_GENI_GP_IRQ_EXT GENMASK(13, 12) +#define RX_GENI_CANCEL_IRQ BIT(14) /* SE_HW_PARAM_0 fields */ #define TX_FIFO_WIDTH_MSK GENMASK(29, 24) diff --git a/include/linux/soundwire/sdw_intel.h b/include/linux/soundwire/sdw_intel.h index 37ae69365fe2..734dc1fa3b5b 100644 --- a/include/linux/soundwire/sdw_intel.h +++ b/include/linux/soundwire/sdw_intel.h @@ -227,7 +227,7 @@ struct sdw_intel_ops { /** * struct sdw_intel_acpi_info - Soundwire Intel information found in ACPI tables * @handle: ACPI controller handle - * @count: link count found with "sdw-master-count" property + * @count: link count found with "sdw-master-count" or "sdw-manager-list" property * @link_mask: bit-wise mask listing links enabled by BIOS menu * * this structure could be expanded to e.g. provide all the _ADR diff --git a/include/linux/task_work.h b/include/linux/task_work.h index cf5e7e891a77..2964171856e0 100644 --- a/include/linux/task_work.h +++ b/include/linux/task_work.h @@ -14,11 +14,14 @@ init_task_work(struct callback_head *twork, task_work_func_t func) } enum task_work_notify_mode { - TWA_NONE, + TWA_NONE = 0, TWA_RESUME, TWA_SIGNAL, TWA_SIGNAL_NO_IPI, TWA_NMI_CURRENT, + + TWA_FLAGS = 0xff00, + TWAF_NO_ALLOC = 0x0100, }; static inline bool task_work_pending(struct task_struct *task) diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index 5d655e109b2c..f66bc85c6411 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -403,6 +403,7 @@ int bt_sock_register(int proto, const struct net_proto_family *ops); void bt_sock_unregister(int proto); void bt_sock_link(struct bt_sock_list *l, struct sock *s); void bt_sock_unlink(struct bt_sock_list *l, struct sock *s); +bool bt_sock_linked(struct bt_sock_list *l, struct sock *s); struct sock *bt_sock_alloc(struct net *net, struct socket *sock, struct proto *prot, int proto, gfp_t prio, int kern); int bt_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h index d489d9250bff..ae60d6664095 100644 --- a/include/net/netns/xfrm.h +++ b/include/net/netns/xfrm.h @@ -51,7 +51,6 @@ struct netns_xfrm { struct hlist_head *policy_byidx; unsigned int policy_idx_hmask; unsigned int idx_generator; - struct hlist_head policy_inexact[XFRM_POLICY_MAX]; struct xfrm_policy_hash policy_bydst[XFRM_POLICY_MAX]; unsigned int policy_count[XFRM_POLICY_MAX * 2]; struct work_struct policy_hash_work; diff --git a/include/net/sock.h b/include/net/sock.h index bf7fa3db10ae..7464e9f9f47c 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2742,6 +2742,11 @@ static inline bool sk_is_stream_unix(const struct sock *sk) return sk->sk_family == AF_UNIX && sk->sk_type == SOCK_STREAM; } +static inline bool sk_is_vsock(const struct sock *sk) +{ + return sk->sk_family == AF_VSOCK; +} + /** * sk_eat_skb - Release a skb if it is no longer needed * @sk: socket to eat this skb from diff --git a/include/net/xfrm.h b/include/net/xfrm.h index b6bfdc6416c7..a0bdd58f401c 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -349,20 +349,25 @@ struct xfrm_if_cb { void xfrm_if_register_cb(const struct xfrm_if_cb *ifcb); void xfrm_if_unregister_cb(void); +struct xfrm_dst_lookup_params { + struct net *net; + int tos; + int oif; + xfrm_address_t *saddr; + xfrm_address_t *daddr; + u32 mark; + __u8 ipproto; + union flowi_uli uli; +}; + struct net_device; struct xfrm_type; struct xfrm_dst; struct xfrm_policy_afinfo { struct dst_ops *dst_ops; - struct dst_entry *(*dst_lookup)(struct net *net, - int tos, int oif, - const xfrm_address_t *saddr, - const xfrm_address_t *daddr, - u32 mark); - int (*get_saddr)(struct net *net, int oif, - xfrm_address_t *saddr, - xfrm_address_t *daddr, - u32 mark); + struct dst_entry *(*dst_lookup)(const struct xfrm_dst_lookup_params *params); + int (*get_saddr)(xfrm_address_t *saddr, + const struct xfrm_dst_lookup_params *params); int (*fill_dst)(struct xfrm_dst *xdst, struct net_device *dev, const struct flowi *fl); @@ -1764,10 +1769,7 @@ static inline int xfrm_user_policy(struct sock *sk, int optname, } #endif -struct dst_entry *__xfrm_dst_lookup(struct net *net, int tos, int oif, - const xfrm_address_t *saddr, - const xfrm_address_t *daddr, - int family, u32 mark); +struct dst_entry *__xfrm_dst_lookup(int family, const struct xfrm_dst_lookup_params *params); struct xfrm_policy *xfrm_policy_alloc(struct net *net, gfp_t gfp); diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h index 569f86a44aaa..b0f41265191c 100644 --- a/include/trace/events/dma.h +++ b/include/trace/events/dma.h @@ -121,7 +121,7 @@ TRACE_EVENT(dma_alloc, TP_STRUCT__entry( __string(device, dev_name(dev)) - __field(u64, phys_addr) + __field(void *, virt_addr) __field(u64, dma_addr) __field(size_t, size) __field(gfp_t, flags) @@ -130,18 +130,18 @@ TRACE_EVENT(dma_alloc, TP_fast_assign( __assign_str(device); - __entry->phys_addr = virt_to_phys(virt_addr); + __entry->virt_addr = virt_addr; __entry->dma_addr = dma_addr; __entry->size = size; __entry->flags = flags; __entry->attrs = attrs; ), - TP_printk("%s dma_addr=%llx size=%zu phys_addr=%llx flags=%s attrs=%s", + TP_printk("%s dma_addr=%llx size=%zu virt_addr=%p flags=%s attrs=%s", __get_str(device), __entry->dma_addr, __entry->size, - __entry->phys_addr, + __entry->virt_addr, show_gfp_flags(__entry->flags), decode_dma_attrs(__entry->attrs)) ); @@ -153,7 +153,7 @@ TRACE_EVENT(dma_free, TP_STRUCT__entry( __string(device, dev_name(dev)) - __field(u64, phys_addr) + __field(void *, virt_addr) __field(u64, dma_addr) __field(size_t, size) __field(unsigned long, attrs) @@ -161,17 +161,17 @@ TRACE_EVENT(dma_free, TP_fast_assign( __assign_str(device); - __entry->phys_addr = virt_to_phys(virt_addr); + __entry->virt_addr = virt_addr; __entry->dma_addr = dma_addr; __entry->size = size; __entry->attrs = attrs; ), - TP_printk("%s dma_addr=%llx size=%zu phys_addr=%llx attrs=%s", + TP_printk("%s dma_addr=%llx size=%zu virt_addr=%p attrs=%s", __get_str(device), __entry->dma_addr, __entry->size, - __entry->phys_addr, + __entry->virt_addr, decode_dma_attrs(__entry->attrs)) ); diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h index b5f5369b6300..9d5c00b0285c 100644 --- a/include/trace/events/huge_memory.h +++ b/include/trace/events/huge_memory.h @@ -208,7 +208,7 @@ TRACE_EVENT(mm_khugepaged_scan_file, TRACE_EVENT(mm_khugepaged_collapse_file, TP_PROTO(struct mm_struct *mm, struct folio *new_folio, pgoff_t index, - bool is_shmem, unsigned long addr, struct file *file, + unsigned long addr, bool is_shmem, struct file *file, int nr, int result), TP_ARGS(mm, new_folio, index, addr, is_shmem, file, nr, result), TP_STRUCT__entry( @@ -233,7 +233,7 @@ TRACE_EVENT(mm_khugepaged_collapse_file, __entry->result = result; ), - TP_printk("mm=%p, hpage_pfn=0x%lx, index=%ld, addr=%ld, is_shmem=%d, filename=%s, nr=%d, result=%s", + TP_printk("mm=%p, hpage_pfn=0x%lx, index=%ld, addr=%lx, is_shmem=%d, filename=%s, nr=%d, result=%s", __entry->mm, __entry->hpfn, __entry->index, diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h index 1d7c52821e55..69975c9c6823 100644 --- a/include/trace/events/netfs.h +++ b/include/trace/events/netfs.h @@ -172,7 +172,6 @@ EM(netfs_folio_trace_read, "read") \ EM(netfs_folio_trace_read_done, "read-done") \ EM(netfs_folio_trace_read_gaps, "read-gaps") \ - EM(netfs_folio_trace_read_put, "read-put") \ EM(netfs_folio_trace_read_unlock, "read-unlock") \ EM(netfs_folio_trace_redirtied, "redirtied") \ EM(netfs_folio_trace_store, "store") \ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c6cd7c7aeeee..e8241b320c6d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -6047,11 +6047,6 @@ enum { BPF_F_MARK_ENFORCE = (1ULL << 6), }; -/* BPF_FUNC_clone_redirect and BPF_FUNC_redirect flags. */ -enum { - BPF_F_INGRESS = (1ULL << 0), -}; - /* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */ enum { BPF_F_TUNINFO_IPV6 = (1ULL << 0), @@ -6198,10 +6193,12 @@ enum { BPF_F_BPRM_SECUREEXEC = (1ULL << 0), }; -/* Flags for bpf_redirect_map helper */ +/* Flags for bpf_redirect and bpf_redirect_map helpers */ enum { - BPF_F_BROADCAST = (1ULL << 3), - BPF_F_EXCLUDE_INGRESS = (1ULL << 4), + BPF_F_INGRESS = (1ULL << 0), /* used for skb path */ + BPF_F_BROADCAST = (1ULL << 3), /* used for XDP path */ + BPF_F_EXCLUDE_INGRESS = (1ULL << 4), /* used for XDP path */ +#define BPF_F_REDIRECT_FLAGS (BPF_F_INGRESS | BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS) }; #define __bpf_md_ptr(type, name) \ diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h index c8dc5f8ea699..12873639ea96 100644 --- a/include/uapi/linux/ublk_cmd.h +++ b/include/uapi/linux/ublk_cmd.h @@ -175,7 +175,13 @@ /* use ioctl encoding for uring command */ #define UBLK_F_CMD_IOCTL_ENCODE (1UL << 6) -/* Copy between request and user buffer by pread()/pwrite() */ +/* + * Copy between request and user buffer by pread()/pwrite() + * + * Not available for UBLK_F_UNPRIVILEGED_DEV, otherwise userspace may + * deceive us by not filling request buffer, then kernel uninitialized + * data may be leaked. + */ #define UBLK_F_USER_COPY (1UL << 7) /* diff --git a/include/xen/acpi.h b/include/xen/acpi.h index daa96a22d257..c66a8461612e 100644 --- a/include/xen/acpi.h +++ b/include/xen/acpi.h @@ -35,6 +35,8 @@ #include <linux/types.h> +typedef int (*get_gsi_from_sbdf_t)(u32 sbdf); + #ifdef CONFIG_XEN_DOM0 #include <asm/xen/hypervisor.h> #include <xen/xen.h> @@ -72,6 +74,8 @@ int xen_acpi_get_gsi_info(struct pci_dev *dev, int *gsi_out, int *trigger_out, int *polarity_out); +void xen_acpi_register_get_gsi_func(get_gsi_from_sbdf_t func); +int xen_acpi_get_gsi_from_sbdf(u32 sbdf); #else static inline void xen_acpi_sleep_register(void) { @@ -89,12 +93,12 @@ static inline int xen_acpi_get_gsi_info(struct pci_dev *dev, { return -1; } -#endif -#ifdef CONFIG_XEN_PCI_STUB -int pcistub_get_gsi_from_sbdf(unsigned int sbdf); -#else -static inline int pcistub_get_gsi_from_sbdf(unsigned int sbdf) +static inline void xen_acpi_register_get_gsi_func(get_gsi_from_sbdf_t func) +{ +} + +static inline int xen_acpi_get_gsi_from_sbdf(u32 sbdf) { return -1; } diff --git a/init/Kconfig b/init/Kconfig index 530a382ee0fe..c521e1421ad4 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -62,7 +62,7 @@ config LLD_VERSION config RUSTC_VERSION int - default $(shell,$(srctree)/scripts/rustc-version.sh $(RUSTC)) + default $(rustc-version) help It does not depend on `RUST` since that one may need to use the version in a `depends on`. @@ -78,6 +78,10 @@ config RUST_IS_AVAILABLE In particular, the Makefile target 'rustavailable' is useful to check why the Rust toolchain is not being detected. +config RUSTC_LLVM_VERSION + int + default $(rustc-llvm-version) + config CC_CAN_LINK bool default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m64-flag)) if 64BIT @@ -1946,7 +1950,7 @@ config RUST depends on !GCC_PLUGIN_RANDSTRUCT depends on !RANDSTRUCT depends on !DEBUG_INFO_BTF || PAHOLE_HAS_LANG_EXCLUDE - depends on !CFI_CLANG || RUSTC_VERSION >= 107900 && HAVE_CFI_ICALL_NORMALIZE_INTEGERS + depends on !CFI_CLANG || HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC select CFI_ICALL_NORMALIZE_INTEGERS if CFI_CLANG depends on !CALL_PADDING || RUSTC_VERSION >= 108100 depends on !KASAN_SW_TAGS diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h index 9d70b2cf7b1e..70b6675941ff 100644 --- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -284,7 +284,14 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx) { struct io_rings *r = ctx->rings; - return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries; + /* + * SQPOLL must use the actual sqring head, as using the cached_sq_head + * is race prone if the SQPOLL thread has grabbed entries but not yet + * committed them to the ring. For !SQPOLL, this doesn't matter, but + * since this helper is just used for SQPOLL sqring waits (or POLLOUT), + * just read the actual sqring head unconditionally. + */ + return READ_ONCE(r->sq.tail) - READ_ONCE(r->sq.head) == ctx->sq_entries; } static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx) @@ -320,6 +327,7 @@ static inline int io_run_task_work(void) if (current->io_uring) { unsigned int count = 0; + __set_current_state(TASK_RUNNING); tctx_task_work_run(current->io_uring, UINT_MAX, &count); if (count) ret = true; diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 33a3d156a85b..6f3b6de230bd 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -1176,7 +1176,8 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx for (i = 0; i < nbufs; i++) { struct io_mapped_ubuf *src = src_ctx->user_bufs[i]; - refcount_inc(&src->refs); + if (src != &dummy_ubuf) + refcount_inc(&src->refs); user_bufs[i] = src; } diff --git a/io_uring/rw.c b/io_uring/rw.c index 80ae3c2ebb70..354c4e175654 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -807,7 +807,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type) * reliably. If not, or it IOCB_NOWAIT is set, don't retry. */ if (kiocb->ki_flags & IOCB_NOWAIT || - ((file->f_flags & O_NONBLOCK && (req->flags & REQ_F_SUPPORT_NOWAIT)))) + ((file->f_flags & O_NONBLOCK && !(req->flags & REQ_F_SUPPORT_NOWAIT)))) req->flags |= REQ_F_NOWAIT; if (ctx->flags & IORING_SETUP_IOPOLL) { diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index 6292ac5f9bd1..3bc61628ab25 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -339,10 +339,6 @@ BTF_ID(func, bpf_lsm_path_chmod) BTF_ID(func, bpf_lsm_path_chown) #endif /* CONFIG_SECURITY_PATH */ -#ifdef CONFIG_KEYS -BTF_ID(func, bpf_lsm_key_free) -#endif /* CONFIG_KEYS */ - BTF_ID(func, bpf_lsm_mmap_file) BTF_ID(func, bpf_lsm_netlink_send) BTF_ID(func, bpf_lsm_path_notify) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 75e4fe83c509..5cd1c7a23848 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -3523,7 +3523,7 @@ end: * (i + 1) * elem_size * where i is the repeat index and elem_size is the size of an element. */ -static int btf_repeat_fields(struct btf_field_info *info, +static int btf_repeat_fields(struct btf_field_info *info, int info_cnt, u32 field_cnt, u32 repeat_cnt, u32 elem_size) { u32 i, j; @@ -3543,6 +3543,12 @@ static int btf_repeat_fields(struct btf_field_info *info, } } + /* The type of struct size or variable size is u32, + * so the multiplication will not overflow. + */ + if (field_cnt * (repeat_cnt + 1) > info_cnt) + return -E2BIG; + cur = field_cnt; for (i = 0; i < repeat_cnt; i++) { memcpy(&info[cur], &info[0], field_cnt * sizeof(info[0])); @@ -3587,7 +3593,7 @@ static int btf_find_nested_struct(const struct btf *btf, const struct btf_type * info[i].off += off; if (nelems > 1) { - err = btf_repeat_fields(info, ret, nelems - 1, t->size); + err = btf_repeat_fields(info, info_cnt, ret, nelems - 1, t->size); if (err == 0) ret *= nelems; else @@ -3681,10 +3687,10 @@ static int btf_find_field_one(const struct btf *btf, if (ret == BTF_FIELD_IGNORE) return 0; - if (nelems > info_cnt) + if (!info_cnt) return -E2BIG; if (nelems > 1) { - ret = btf_repeat_fields(info, 1, nelems - 1, sz); + ret = btf_repeat_fields(info, info_cnt, 1, nelems - 1, sz); if (ret < 0) return ret; } @@ -8961,6 +8967,7 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, if (!type) { bpf_log(ctx->log, "relo #%u: bad type id %u\n", relo_idx, relo->type_id); + kfree(specs); return -EINVAL; } diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 9e0e3b0a18e4..7878be18e9d2 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -333,9 +333,11 @@ static int dev_map_hash_get_next_key(struct bpf_map *map, void *key, static int dev_map_bpf_prog_run(struct bpf_prog *xdp_prog, struct xdp_frame **frames, int n, - struct net_device *dev) + struct net_device *tx_dev, + struct net_device *rx_dev) { - struct xdp_txq_info txq = { .dev = dev }; + struct xdp_txq_info txq = { .dev = tx_dev }; + struct xdp_rxq_info rxq = { .dev = rx_dev }; struct xdp_buff xdp; int i, nframes = 0; @@ -346,6 +348,7 @@ static int dev_map_bpf_prog_run(struct bpf_prog *xdp_prog, xdp_convert_frame_to_buff(xdpf, &xdp); xdp.txq = &txq; + xdp.rxq = &rxq; act = bpf_prog_run_xdp(xdp_prog, &xdp); switch (act) { @@ -360,7 +363,7 @@ static int dev_map_bpf_prog_run(struct bpf_prog *xdp_prog, bpf_warn_invalid_xdp_action(NULL, xdp_prog, act); fallthrough; case XDP_ABORTED: - trace_xdp_exception(dev, xdp_prog, act); + trace_xdp_exception(tx_dev, xdp_prog, act); fallthrough; case XDP_DROP: xdp_return_frame_rx_napi(xdpf); @@ -388,7 +391,7 @@ static void bq_xmit_all(struct xdp_dev_bulk_queue *bq, u32 flags) } if (bq->xdp_prog) { - to_send = dev_map_bpf_prog_run(bq->xdp_prog, bq->q, cnt, dev); + to_send = dev_map_bpf_prog_run(bq->xdp_prog, bq->q, cnt, dev, bq->dev_rx); if (!to_send) goto out; } diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c index 5aebfc3051e3..4a858fdb6476 100644 --- a/kernel/bpf/log.c +++ b/kernel/bpf/log.c @@ -688,8 +688,7 @@ static void print_reg_state(struct bpf_verifier_env *env, if (t == SCALAR_VALUE && reg->precise) verbose(env, "P"); if (t == SCALAR_VALUE && tnum_is_const(reg->var_off)) { - /* reg->off should be 0 for SCALAR_VALUE */ - verbose_snum(env, reg->var_off.value + reg->off); + verbose_snum(env, reg->var_off.value); return; } diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c index e20b90c36131..de3b681d1d13 100644 --- a/kernel/bpf/ringbuf.c +++ b/kernel/bpf/ringbuf.c @@ -29,7 +29,7 @@ struct bpf_ringbuf { u64 mask; struct page **pages; int nr_pages; - spinlock_t spinlock ____cacheline_aligned_in_smp; + raw_spinlock_t spinlock ____cacheline_aligned_in_smp; /* For user-space producer ring buffers, an atomic_t busy bit is used * to synchronize access to the ring buffers in the kernel, rather than * the spinlock that is used for kernel-producer ring buffers. This is @@ -173,7 +173,7 @@ static struct bpf_ringbuf *bpf_ringbuf_alloc(size_t data_sz, int numa_node) if (!rb) return NULL; - spin_lock_init(&rb->spinlock); + raw_spin_lock_init(&rb->spinlock); atomic_set(&rb->busy, 0); init_waitqueue_head(&rb->waitq); init_irq_work(&rb->work, bpf_ringbuf_notify); @@ -421,10 +421,10 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size) cons_pos = smp_load_acquire(&rb->consumer_pos); if (in_nmi()) { - if (!spin_trylock_irqsave(&rb->spinlock, flags)) + if (!raw_spin_trylock_irqsave(&rb->spinlock, flags)) return NULL; } else { - spin_lock_irqsave(&rb->spinlock, flags); + raw_spin_lock_irqsave(&rb->spinlock, flags); } pend_pos = rb->pending_pos; @@ -450,7 +450,7 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size) */ if (new_prod_pos - cons_pos > rb->mask || new_prod_pos - pend_pos > rb->mask) { - spin_unlock_irqrestore(&rb->spinlock, flags); + raw_spin_unlock_irqrestore(&rb->spinlock, flags); return NULL; } @@ -462,7 +462,7 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size) /* pairs with consumer's smp_load_acquire() */ smp_store_release(&rb->producer_pos, new_prod_pos); - spin_unlock_irqrestore(&rb->spinlock, flags); + raw_spin_unlock_irqrestore(&rb->spinlock, flags); return (void *)hdr + BPF_RINGBUF_HDR_SZ; } diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index a8f1808a1ca5..8cfa7183d2ef 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3565,15 +3565,16 @@ static void bpf_perf_link_dealloc(struct bpf_link *link) } static int bpf_perf_link_fill_common(const struct perf_event *event, - char __user *uname, u32 ulen, + char __user *uname, u32 *ulenp, u64 *probe_offset, u64 *probe_addr, u32 *fd_type, unsigned long *missed) { const char *buf; - u32 prog_id; + u32 prog_id, ulen; size_t len; int err; + ulen = *ulenp; if (!ulen ^ !uname) return -EINVAL; @@ -3581,10 +3582,17 @@ static int bpf_perf_link_fill_common(const struct perf_event *event, probe_offset, probe_addr, missed); if (err) return err; + + if (buf) { + len = strlen(buf); + *ulenp = len + 1; + } else { + *ulenp = 1; + } if (!uname) return 0; + if (buf) { - len = strlen(buf); err = bpf_copy_to_user(uname, buf, ulen, len); if (err) return err; @@ -3609,7 +3617,7 @@ static int bpf_perf_link_fill_kprobe(const struct perf_event *event, uname = u64_to_user_ptr(info->perf_event.kprobe.func_name); ulen = info->perf_event.kprobe.name_len; - err = bpf_perf_link_fill_common(event, uname, ulen, &offset, &addr, + err = bpf_perf_link_fill_common(event, uname, &ulen, &offset, &addr, &type, &missed); if (err) return err; @@ -3617,7 +3625,7 @@ static int bpf_perf_link_fill_kprobe(const struct perf_event *event, info->perf_event.type = BPF_PERF_EVENT_KRETPROBE; else info->perf_event.type = BPF_PERF_EVENT_KPROBE; - + info->perf_event.kprobe.name_len = ulen; info->perf_event.kprobe.offset = offset; info->perf_event.kprobe.missed = missed; if (!kallsyms_show_value(current_cred())) @@ -3639,7 +3647,7 @@ static int bpf_perf_link_fill_uprobe(const struct perf_event *event, uname = u64_to_user_ptr(info->perf_event.uprobe.file_name); ulen = info->perf_event.uprobe.name_len; - err = bpf_perf_link_fill_common(event, uname, ulen, &offset, &addr, + err = bpf_perf_link_fill_common(event, uname, &ulen, &offset, &addr, &type, NULL); if (err) return err; @@ -3648,6 +3656,7 @@ static int bpf_perf_link_fill_uprobe(const struct perf_event *event, info->perf_event.type = BPF_PERF_EVENT_URETPROBE; else info->perf_event.type = BPF_PERF_EVENT_UPROBE; + info->perf_event.uprobe.name_len = ulen; info->perf_event.uprobe.offset = offset; info->perf_event.uprobe.cookie = event->bpf_cookie; return 0; @@ -3673,12 +3682,18 @@ static int bpf_perf_link_fill_tracepoint(const struct perf_event *event, { char __user *uname; u32 ulen; + int err; uname = u64_to_user_ptr(info->perf_event.tracepoint.tp_name); ulen = info->perf_event.tracepoint.name_len; + err = bpf_perf_link_fill_common(event, uname, &ulen, NULL, NULL, NULL, NULL); + if (err) + return err; + info->perf_event.type = BPF_PERF_EVENT_TRACEPOINT; + info->perf_event.tracepoint.name_len = ulen; info->perf_event.tracepoint.cookie = event->bpf_cookie; - return bpf_perf_link_fill_common(event, uname, ulen, NULL, NULL, NULL, NULL); + return 0; } static int bpf_perf_link_fill_perf_event(const struct perf_event *event, diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index 02aa9db8d796..5af9e130e500 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -99,7 +99,7 @@ static struct task_struct *task_seq_get_next(struct bpf_iter_seq_task_common *co rcu_read_lock(); pid = find_pid_ns(common->pid, common->ns); if (pid) { - task = get_pid_task(pid, PIDTYPE_TGID); + task = get_pid_task(pid, PIDTYPE_PID); *tid = common->pid; } rcu_read_unlock(); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9a7ed527e47e..411ab1b57af4 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2750,10 +2750,16 @@ static struct btf *__find_kfunc_desc_btf(struct bpf_verifier_env *env, b->module = mod; b->offset = offset; + /* sort() reorders entries by value, so b may no longer point + * to the right entry after this + */ sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]), kfunc_btf_cmp_by_off, NULL); + } else { + btf = b->btf; } - return b->btf; + + return btf; } void bpf_free_kfunc_btf_tab(struct bpf_kfunc_btf_tab *tab) @@ -6333,10 +6339,10 @@ static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size) /* both of s64_max/s64_min positive or negative */ if ((s64_max >= 0) == (s64_min >= 0)) { - reg->smin_value = reg->s32_min_value = s64_min; - reg->smax_value = reg->s32_max_value = s64_max; - reg->umin_value = reg->u32_min_value = s64_min; - reg->umax_value = reg->u32_max_value = s64_max; + reg->s32_min_value = reg->smin_value = s64_min; + reg->s32_max_value = reg->smax_value = s64_max; + reg->u32_min_value = reg->umin_value = s64_min; + reg->u32_max_value = reg->umax_value = s64_max; reg->var_off = tnum_range(s64_min, s64_max); return; } @@ -14264,12 +14270,13 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env, * r1 += 0x1 * if r2 < 1000 goto ... * use r1 in memory access - * So remember constant delta between r2 and r1 and update r1 after - * 'if' condition. + * So for 64-bit alu remember constant delta between r2 and r1 and + * update r1 after 'if' condition. */ - if (env->bpf_capable && BPF_OP(insn->code) == BPF_ADD && - dst_reg->id && is_reg_const(src_reg, alu32)) { - u64 val = reg_const_value(src_reg, alu32); + if (env->bpf_capable && + BPF_OP(insn->code) == BPF_ADD && !alu32 && + dst_reg->id && is_reg_const(src_reg, false)) { + u64 val = reg_const_value(src_reg, false); if ((dst_reg->id & BPF_ADD_CONST) || /* prevent overflow in sync_linked_regs() later */ @@ -15326,8 +15333,12 @@ static void sync_linked_regs(struct bpf_verifier_state *vstate, struct bpf_reg_s continue; if ((!(reg->id & BPF_ADD_CONST) && !(known_reg->id & BPF_ADD_CONST)) || reg->off == known_reg->off) { + s32 saved_subreg_def = reg->subreg_def; + copy_register_state(reg, known_reg); + reg->subreg_def = saved_subreg_def; } else { + s32 saved_subreg_def = reg->subreg_def; s32 saved_off = reg->off; fake_reg.type = SCALAR_VALUE; @@ -15340,6 +15351,7 @@ static void sync_linked_regs(struct bpf_verifier_state *vstate, struct bpf_reg_s * otherwise another sync_linked_regs() will be incorrect. */ reg->off = saved_off; + reg->subreg_def = saved_subreg_def; scalar32_min_max_add(reg, &fake_reg); scalar_min_max_add(reg, &fake_reg); @@ -22310,7 +22322,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 /* 'struct bpf_verifier_env' can be global, but since it's not small, * allocate/free it every time bpf_check() is called */ - env = kzalloc(sizeof(struct bpf_verifier_env), GFP_KERNEL); + env = kvzalloc(sizeof(struct bpf_verifier_env), GFP_KERNEL); if (!env) return -ENOMEM; @@ -22546,6 +22558,6 @@ err_unlock: mutex_unlock(&bpf_verifier_lock); vfree(env->insn_aux_data); err_free_env: - kfree(env); + kvfree(env); return ret; } diff --git a/kernel/events/core.c b/kernel/events/core.c index e3589c4287cb..cdd09769e6c5 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9251,7 +9251,7 @@ static void perf_event_switch(struct task_struct *task, }, }; - if (!sched_in && task->on_rq) { + if (!sched_in && task_is_runnable(task)) { switch_event.event_id.header.misc |= PERF_RECORD_MISC_SWITCH_OUT_PREEMPT; } diff --git a/kernel/freezer.c b/kernel/freezer.c index 44bbd7dbd2c8..8d530d0949ff 100644 --- a/kernel/freezer.c +++ b/kernel/freezer.c @@ -109,7 +109,12 @@ static int __set_task_frozen(struct task_struct *p, void *arg) { unsigned int state = READ_ONCE(p->__state); - if (p->on_rq) + /* + * Allow freezing the sched_delayed tasks; they will not execute until + * ttwu() fixes them up, so it is safe to swap their state now, instead + * of waiting for them to get fully dequeued. + */ + if (task_is_runnable(p)) return 0; if (p != current && task_curr(p)) diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 6333f4ccf024..4d7ee95df06e 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -986,6 +986,15 @@ static bool rcu_tasks_is_holdout(struct task_struct *t) return false; /* + * t->on_rq && !t->se.sched_delayed *could* be considered sleeping but + * since it is a spurious state (it will transition into the + * traditional blocked state or get woken up without outside + * dependencies), not considering it such should only affect timing. + * + * Be conservative for now and not include it. + */ + + /* * Idle tasks (or idle injection) within the idle loop are RCU-tasks * quiescent states. But CPU boot code performed by the idle task * isn't a quiescent state. diff --git a/kernel/sched/core.c b/kernel/sched/core.c index aeb595514461..dbfb5717d6af 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -548,6 +548,11 @@ sched_core_dequeue(struct rq *rq, struct task_struct *p, int flags) { } * ON_RQ_MIGRATING state is used for migration without holding both * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock(). * + * Additionally it is possible to be ->on_rq but still be considered not + * runnable when p->se.sched_delayed is true. These tasks are on the runqueue + * but will be dequeued as soon as they get picked again. See the + * task_is_runnable() helper. + * * p->on_cpu <- { 0, 1 }: * * is set by prepare_task() and cleared by finish_task() such that it will be @@ -2012,11 +2017,6 @@ void enqueue_task(struct rq *rq, struct task_struct *p, int flags) if (!(flags & ENQUEUE_NOCLOCK)) update_rq_clock(rq); - if (!(flags & ENQUEUE_RESTORE)) { - sched_info_enqueue(rq, p); - psi_enqueue(p, (flags & ENQUEUE_WAKEUP) && !(flags & ENQUEUE_MIGRATED)); - } - p->sched_class->enqueue_task(rq, p, flags); /* * Must be after ->enqueue_task() because ENQUEUE_DELAYED can clear @@ -2024,6 +2024,11 @@ void enqueue_task(struct rq *rq, struct task_struct *p, int flags) */ uclamp_rq_inc(rq, p); + if (!(flags & ENQUEUE_RESTORE)) { + sched_info_enqueue(rq, p); + psi_enqueue(p, flags & ENQUEUE_MIGRATED); + } + if (sched_core_enabled(rq)) sched_core_enqueue(rq, p); } @@ -2041,7 +2046,7 @@ inline bool dequeue_task(struct rq *rq, struct task_struct *p, int flags) if (!(flags & DEQUEUE_SAVE)) { sched_info_dequeue(rq, p); - psi_dequeue(p, flags & DEQUEUE_SLEEP); + psi_dequeue(p, !(flags & DEQUEUE_SLEEP)); } /* @@ -4323,9 +4328,10 @@ static bool __task_needs_rq_lock(struct task_struct *p) * @arg: Argument to function. * * Fix the task in it's current state by avoiding wakeups and or rq operations - * and call @func(@arg) on it. This function can use ->on_rq and task_curr() - * to work out what the state is, if required. Given that @func can be invoked - * with a runqueue lock held, it had better be quite lightweight. + * and call @func(@arg) on it. This function can use task_is_runnable() and + * task_curr() to work out what the state is, if required. Given that @func + * can be invoked with a runqueue lock held, it had better be quite + * lightweight. * * Returns: * Whatever @func returns @@ -6544,6 +6550,7 @@ static void __sched notrace __schedule(int sched_mode) * as a preemption by schedule_debug() and RCU. */ bool preempt = sched_mode > SM_NONE; + bool block = false; unsigned long *switch_count; unsigned long prev_state; struct rq_flags rf; @@ -6629,6 +6636,7 @@ static void __sched notrace __schedule(int sched_mode) * After this, schedule() must not care about p->state any more. */ block_task(rq, prev, flags); + block = true; } switch_count = &prev->nvcsw; } @@ -6674,7 +6682,7 @@ picked: migrate_disable_switch(rq, prev); psi_account_irqtime(rq, prev, next); - psi_sched_switch(prev, next, !task_on_rq_queued(prev)); + psi_sched_switch(prev, next, block); trace_sched_switch(preempt, prev, next, prev_state); @@ -7017,20 +7025,20 @@ int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flag } EXPORT_SYMBOL(default_wake_function); -void __setscheduler_prio(struct task_struct *p, int prio) +const struct sched_class *__setscheduler_class(struct task_struct *p, int prio) { if (dl_prio(prio)) - p->sched_class = &dl_sched_class; - else if (rt_prio(prio)) - p->sched_class = &rt_sched_class; + return &dl_sched_class; + + if (rt_prio(prio)) + return &rt_sched_class; + #ifdef CONFIG_SCHED_CLASS_EXT - else if (task_should_scx(p)) - p->sched_class = &ext_sched_class; + if (task_should_scx(p)) + return &ext_sched_class; #endif - else - p->sched_class = &fair_sched_class; - p->prio = prio; + return &fair_sched_class; } #ifdef CONFIG_RT_MUTEXES @@ -7076,7 +7084,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) { int prio, oldprio, queued, running, queue_flag = DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK; - const struct sched_class *prev_class; + const struct sched_class *prev_class, *next_class; struct rq_flags rf; struct rq *rq; @@ -7134,6 +7142,11 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) queue_flag &= ~DEQUEUE_MOVE; prev_class = p->sched_class; + next_class = __setscheduler_class(p, prio); + + if (prev_class != next_class && p->se.sched_delayed) + dequeue_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_DELAYED | DEQUEUE_NOCLOCK); + queued = task_on_rq_queued(p); running = task_current(rq, p); if (queued) @@ -7171,7 +7184,9 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) p->rt.timeout = 0; } - __setscheduler_prio(p, prio); + p->sched_class = next_class; + p->prio = prio; + check_class_changing(rq, p, prev_class); if (queued) @@ -10465,7 +10480,9 @@ void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) return; if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan))) return; - task_work_add(curr, work, TWA_RESUME); + + /* No page allocation under rq lock */ + task_work_add(curr, work, TWA_RESUME | TWAF_NO_ALLOC); } void sched_mm_cid_exit_signals(struct task_struct *t) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 9ce93d0bf452..be1b917dc8ce 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2385,7 +2385,7 @@ static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first) deadline_queue_push_tasks(rq); - if (hrtick_enabled(rq)) + if (hrtick_enabled_dl(rq)) start_hrtick_dl(rq, &p->dl); } diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 6eae3b69bf6e..5900b06fd036 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -4493,7 +4493,7 @@ static void scx_ops_disable_workfn(struct kthread_work *work) sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx); - __setscheduler_prio(p, p->prio); + p->sched_class = __setscheduler_class(p, p->prio); check_class_changing(task_rq(p), p, old_class); sched_enq_and_set_task(&ctx); @@ -5204,7 +5204,7 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link) sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx); p->scx.slice = SCX_SLICE_DFL; - __setscheduler_prio(p, p->prio); + p->sched_class = __setscheduler_class(p, p->prio); check_class_changing(task_rq(p), p, old_class); sched_enq_and_set_task(&ctx); diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 225b31aaee55..c157d4860a3b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1247,7 +1247,7 @@ static void update_curr(struct cfs_rq *cfs_rq) account_cfs_rq_runtime(cfs_rq, delta_exec); - if (rq->nr_running == 1) + if (cfs_rq->nr_running == 1) return; if (resched || did_preempt_short(cfs_rq, curr)) { @@ -6058,10 +6058,13 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) for_each_sched_entity(se) { struct cfs_rq *qcfs_rq = cfs_rq_of(se); - if (se->on_rq) { - SCHED_WARN_ON(se->sched_delayed); + /* Handle any unfinished DELAY_DEQUEUE business first. */ + if (se->sched_delayed) { + int flags = DEQUEUE_SLEEP | DEQUEUE_DELAYED; + + dequeue_entity(qcfs_rq, se, flags); + } else if (se->on_rq) break; - } enqueue_entity(qcfs_rq, se, ENQUEUE_WAKEUP); if (cfs_rq_is_idle(group_cfs_rq(se))) @@ -13174,22 +13177,6 @@ static void attach_task_cfs_rq(struct task_struct *p) static void switched_from_fair(struct rq *rq, struct task_struct *p) { detach_task_cfs_rq(p); - /* - * Since this is called after changing class, this is a little weird - * and we cannot use DEQUEUE_DELAYED. - */ - if (p->se.sched_delayed) { - /* First, dequeue it from its new class' structures */ - dequeue_task(rq, p, DEQUEUE_NOCLOCK | DEQUEUE_SLEEP); - /* - * Now, clean up the fair_sched_class side of things - * related to sched_delayed being true and that wasn't done - * due to the generic dequeue not using DEQUEUE_DELAYED. - */ - finish_delayed_dequeue_entity(&p->se); - p->se.rel_deadline = 0; - __block_task(rq, p); - } } static void switched_to_fair(struct rq *rq, struct task_struct *p) diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 6085ef50febf..081519ffab46 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3800,7 +3800,7 @@ static inline int rt_effective_prio(struct task_struct *p, int prio) extern int __sched_setscheduler(struct task_struct *p, const struct sched_attr *attr, bool user, bool pi); extern int __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx); -extern void __setscheduler_prio(struct task_struct *p, int prio); +extern const struct sched_class *__setscheduler_class(struct task_struct *p, int prio); extern void set_load_weight(struct task_struct *p, bool update_load); extern void enqueue_task(struct rq *rq, struct task_struct *p, int flags); extern bool dequeue_task(struct rq *rq, struct task_struct *p, int flags); diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index 237780aa3c53..767e098a3bd1 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -119,45 +119,63 @@ static inline void psi_account_irqtime(struct rq *rq, struct task_struct *curr, /* * PSI tracks state that persists across sleeps, such as iowaits and * memory stalls. As a result, it has to distinguish between sleeps, - * where a task's runnable state changes, and requeues, where a task - * and its state are being moved between CPUs and runqueues. + * where a task's runnable state changes, and migrations, where a task + * and its runnable state are being moved between CPUs and runqueues. + * + * A notable case is a task whose dequeue is delayed. PSI considers + * those sleeping, but because they are still on the runqueue they can + * go through migration requeues. In this case, *sleeping* states need + * to be transferred. */ -static inline void psi_enqueue(struct task_struct *p, bool wakeup) +static inline void psi_enqueue(struct task_struct *p, bool migrate) { - int clear = 0, set = TSK_RUNNING; + int clear = 0, set = 0; if (static_branch_likely(&psi_disabled)) return; - if (p->in_memstall) - set |= TSK_MEMSTALL_RUNNING; - - if (!wakeup) { + if (p->se.sched_delayed) { + /* CPU migration of "sleeping" task */ + SCHED_WARN_ON(!migrate); if (p->in_memstall) set |= TSK_MEMSTALL; + if (p->in_iowait) + set |= TSK_IOWAIT; + } else if (migrate) { + /* CPU migration of runnable task */ + set = TSK_RUNNING; + if (p->in_memstall) + set |= TSK_MEMSTALL | TSK_MEMSTALL_RUNNING; } else { + /* Wakeup of new or sleeping task */ if (p->in_iowait) clear |= TSK_IOWAIT; + set = TSK_RUNNING; + if (p->in_memstall) + set |= TSK_MEMSTALL_RUNNING; } psi_task_change(p, clear, set); } -static inline void psi_dequeue(struct task_struct *p, bool sleep) +static inline void psi_dequeue(struct task_struct *p, bool migrate) { if (static_branch_likely(&psi_disabled)) return; /* + * When migrating a task to another CPU, clear all psi + * state. The enqueue callback above will work it out. + */ + if (migrate) + psi_task_change(p, p->psi_flags, 0); + + /* * A voluntary sleep is a dequeue followed by a task switch. To * avoid walking all ancestors twice, psi_task_switch() handles * TSK_RUNNING and TSK_IOWAIT for us when it moves TSK_ONCPU. * Do nothing here. */ - if (sleep) - return; - - psi_task_change(p, p->psi_flags, 0); } static inline void psi_ttwu_dequeue(struct task_struct *p) @@ -190,8 +208,8 @@ static inline void psi_sched_switch(struct task_struct *prev, } #else /* CONFIG_PSI */ -static inline void psi_enqueue(struct task_struct *p, bool wakeup) {} -static inline void psi_dequeue(struct task_struct *p, bool sleep) {} +static inline void psi_enqueue(struct task_struct *p, bool migrate) {} +static inline void psi_dequeue(struct task_struct *p, bool migrate) {} static inline void psi_ttwu_dequeue(struct task_struct *p) {} static inline void psi_sched_switch(struct task_struct *prev, struct task_struct *next, diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index aa70beee9895..0470bcc3d204 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -529,7 +529,7 @@ int __sched_setscheduler(struct task_struct *p, { int oldpolicy = -1, policy = attr->sched_policy; int retval, oldprio, newprio, queued, running; - const struct sched_class *prev_class; + const struct sched_class *prev_class, *next_class; struct balance_callback *head; struct rq_flags rf; int reset_on_fork; @@ -706,6 +706,12 @@ change: queue_flags &= ~DEQUEUE_MOVE; } + prev_class = p->sched_class; + next_class = __setscheduler_class(p, newprio); + + if (prev_class != next_class && p->se.sched_delayed) + dequeue_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_DELAYED | DEQUEUE_NOCLOCK); + queued = task_on_rq_queued(p); running = task_current(rq, p); if (queued) @@ -713,11 +719,10 @@ change: if (running) put_prev_task(rq, p); - prev_class = p->sched_class; - if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) { __setscheduler_params(p, attr); - __setscheduler_prio(p, newprio); + p->sched_class = next_class; + p->prio = newprio; } __setscheduler_uclamp(p, attr); check_class_changing(rq, p, prev_class); diff --git a/kernel/task_work.c b/kernel/task_work.c index 5d14d639ac71..c969f1f26be5 100644 --- a/kernel/task_work.c +++ b/kernel/task_work.c @@ -55,15 +55,26 @@ int task_work_add(struct task_struct *task, struct callback_head *work, enum task_work_notify_mode notify) { struct callback_head *head; + int flags = notify & TWA_FLAGS; + notify &= ~TWA_FLAGS; if (notify == TWA_NMI_CURRENT) { if (WARN_ON_ONCE(task != current)) return -EINVAL; if (!IS_ENABLED(CONFIG_IRQ_WORK)) return -EINVAL; } else { - /* record the work call stack in order to print it in KASAN reports */ - kasan_record_aux_stack(work); + /* + * Record the work call stack in order to print it in KASAN + * reports. + * + * Note that stack allocation can fail if TWAF_NO_ALLOC flag + * is set and new page is needed to expand the stack buffer. + */ + if (flags & TWAF_NO_ALLOC) + kasan_record_aux_stack_noalloc(work); + else + kasan_record_aux_stack(work); } head = READ_ONCE(task->task_works); diff --git a/kernel/time/posix-clock.c b/kernel/time/posix-clock.c index 316a4e8c97d3..1af0bb2cc45c 100644 --- a/kernel/time/posix-clock.c +++ b/kernel/time/posix-clock.c @@ -309,6 +309,9 @@ static int pc_clock_settime(clockid_t id, const struct timespec64 *ts) struct posix_clock_desc cd; int err; + if (!timespec64_valid_strict(ts)) + return -EINVAL; + err = get_clock_desc(id, &cd); if (err) return err; @@ -318,9 +321,6 @@ static int pc_clock_settime(clockid_t id, const struct timespec64 *ts) goto out; } - if (!timespec64_valid_strict(ts)) - return -EINVAL; - if (cd.clk->ops.clock_settime) err = cd.clk->ops.clock_settime(cd.clk, ts); else diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 753a184c7090..f203f000da1a 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -434,6 +434,12 @@ static void tick_nohz_kick_task(struct task_struct *tsk) * smp_mb__after_spin_lock() * tick_nohz_task_switch() * LOAD p->tick_dep_mask + * + * XXX given a task picks up the dependency on schedule(), should we + * only care about tasks that are currently on the CPU instead of all + * that are on the runqueue? + * + * That is, does this want to be: task_on_cpu() / task_curr()? */ if (!sched_task_on_rq(tsk)) return; diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index a582cd25ca87..3bd402fa62a4 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -3133,7 +3133,8 @@ static int bpf_uprobe_multi_link_fill_link_info(const struct bpf_link *link, struct bpf_uprobe_multi_link *umulti_link; u32 ucount = info->uprobe_multi.count; int err = 0, i; - long left; + char *p, *buf; + long left = 0; if (!upath ^ !upath_size) return -EINVAL; @@ -3147,26 +3148,23 @@ static int bpf_uprobe_multi_link_fill_link_info(const struct bpf_link *link, info->uprobe_multi.pid = umulti_link->task ? task_pid_nr_ns(umulti_link->task, task_active_pid_ns(current)) : 0; - if (upath) { - char *p, *buf; - - upath_size = min_t(u32, upath_size, PATH_MAX); - - buf = kmalloc(upath_size, GFP_KERNEL); - if (!buf) - return -ENOMEM; - p = d_path(&umulti_link->path, buf, upath_size); - if (IS_ERR(p)) { - kfree(buf); - return PTR_ERR(p); - } - upath_size = buf + upath_size - p; - left = copy_to_user(upath, p, upath_size); + upath_size = upath_size ? min_t(u32, upath_size, PATH_MAX) : PATH_MAX; + buf = kmalloc(upath_size, GFP_KERNEL); + if (!buf) + return -ENOMEM; + p = d_path(&umulti_link->path, buf, upath_size); + if (IS_ERR(p)) { kfree(buf); - if (left) - return -EFAULT; - info->uprobe_multi.path_size = upath_size; + return PTR_ERR(p); } + upath_size = buf + upath_size - p; + + if (upath) + left = copy_to_user(upath, p, upath_size); + kfree(buf); + if (left) + return -EFAULT; + info->uprobe_multi.path_size = upath_size; if (!uoffsets && !ucookies && !uref_ctr_offsets) return 0; diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c index d7d4fb403f6f..41e7a15dcb50 100644 --- a/kernel/trace/fgraph.c +++ b/kernel/trace/fgraph.c @@ -1160,19 +1160,14 @@ void fgraph_update_pid_func(void) static int start_graph_tracing(void) { unsigned long **ret_stack_list; - int ret, cpu; + int ret; - ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL); + ret_stack_list = kcalloc(FTRACE_RETSTACK_ALLOC_SIZE, + sizeof(*ret_stack_list), GFP_KERNEL); if (!ret_stack_list) return -ENOMEM; - /* The cpu_boot init_task->ret_stack will never be freed */ - for_each_online_cpu(cpu) { - if (!idle_task(cpu)->ret_stack) - ftrace_graph_init_idle_task(idle_task(cpu), cpu); - } - do { ret = alloc_retstack_tasklist(ret_stack_list); } while (ret == -EAGAIN); @@ -1242,14 +1237,34 @@ static void ftrace_graph_disable_direct(bool disable_branch) fgraph_direct_gops = &fgraph_stub; } +/* The cpu_boot init_task->ret_stack will never be freed */ +static int fgraph_cpu_init(unsigned int cpu) +{ + if (!idle_task(cpu)->ret_stack) + ftrace_graph_init_idle_task(idle_task(cpu), cpu); + return 0; +} + int register_ftrace_graph(struct fgraph_ops *gops) { + static bool fgraph_initialized; int command = 0; int ret = 0; int i = -1; mutex_lock(&ftrace_lock); + if (!fgraph_initialized) { + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "fgraph_idle_init", + fgraph_cpu_init, NULL); + if (ret < 0) { + pr_warn("fgraph: Error to init cpu hotplug support\n"); + return ret; + } + fgraph_initialized = true; + ret = 0; + } + if (!fgraph_array[0]) { /* The array must always have real data on it */ for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c index b0e0ec85912e..ebda68ee9abf 100644 --- a/kernel/trace/trace_eprobe.c +++ b/kernel/trace/trace_eprobe.c @@ -912,6 +912,11 @@ static int __trace_eprobe_create(int argc, const char *argv[]) } } + if (argc - 2 > MAX_TRACE_ARGS) { + ret = -E2BIG; + goto error; + } + mutex_lock(&event_mutex); event_call = find_and_get_event(sys_name, sys_event); ep = alloc_event_probe(group, event, event_call, argc - 2); @@ -937,7 +942,7 @@ static int __trace_eprobe_create(int argc, const char *argv[]) argc -= 2; argv += 2; /* parse arguments */ - for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) { + for (i = 0; i < argc; i++) { trace_probe_log_set_index(i + 2); ret = trace_eprobe_tp_update_arg(ep, argv, i); if (ret) diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c index a079abd8955b..c62d1629cffe 100644 --- a/kernel/trace/trace_fprobe.c +++ b/kernel/trace/trace_fprobe.c @@ -1187,6 +1187,10 @@ static int __trace_fprobe_create(int argc, const char *argv[]) argc = new_argc; argv = new_argv; } + if (argc > MAX_TRACE_ARGS) { + ret = -E2BIG; + goto out; + } ret = traceprobe_expand_dentry_args(argc, argv, &dbuf); if (ret) @@ -1203,7 +1207,7 @@ static int __trace_fprobe_create(int argc, const char *argv[]) } /* parse arguments */ - for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) { + for (i = 0; i < argc; i++) { trace_probe_log_set_index(i + 2); ctx.offset = 0; ret = traceprobe_parse_probe_arg(&tf->tp, i, argv[i], &ctx); diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 61a6da808203..263fac44d3ca 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -1013,6 +1013,10 @@ static int __trace_kprobe_create(int argc, const char *argv[]) argc = new_argc; argv = new_argv; } + if (argc > MAX_TRACE_ARGS) { + ret = -E2BIG; + goto out; + } ret = traceprobe_expand_dentry_args(argc, argv, &dbuf); if (ret) @@ -1029,7 +1033,7 @@ static int __trace_kprobe_create(int argc, const char *argv[]) } /* parse arguments */ - for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) { + for (i = 0; i < argc; i++) { trace_probe_log_set_index(i + 2); ctx.offset = 0; ret = traceprobe_parse_probe_arg(&tk->tp, i, argv[i], &ctx); diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 39877c80d6cb..16a5e368e7b7 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -276,7 +276,7 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup, } trace_probe_log_err(offset, NO_EVENT_NAME); return -EINVAL; - } else if (len > MAX_EVENT_NAME_LEN) { + } else if (len >= MAX_EVENT_NAME_LEN) { trace_probe_log_err(offset, EVENT_TOO_LONG); return -EINVAL; } diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c index c4ad7cd7e778..1469dd8075fa 100644 --- a/kernel/trace/trace_selftest.c +++ b/kernel/trace/trace_selftest.c @@ -1485,7 +1485,7 @@ trace_selftest_startup_wakeup(struct tracer *trace, struct trace_array *tr) /* reset the max latency */ tr->max_latency = 0; - while (p->on_rq) { + while (task_is_runnable(p)) { /* * Sleep to make sure the -deadline thread is asleep too. * On virtual machines we can't rely on timings, diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index c40531d2cbad..b30fc8fcd095 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -565,6 +565,8 @@ static int __trace_uprobe_create(int argc, const char **argv) if (argc < 2) return -ECANCELED; + if (argc - 2 > MAX_TRACE_ARGS) + return -E2BIG; if (argv[0][1] == ':') event = &argv[0][2]; @@ -690,7 +692,7 @@ static int __trace_uprobe_create(int argc, const char **argv) tu->filename = filename; /* parse arguments */ - for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) { + for (i = 0; i < argc; i++) { struct traceprobe_parse_context ctx = { .flags = (is_return ? TPARG_FL_RETURN : 0) | TPARG_FL_USER, }; @@ -875,6 +877,7 @@ struct uprobe_cpu_buffer { }; static struct uprobe_cpu_buffer __percpu *uprobe_cpu_buffer; static int uprobe_buffer_refcnt; +#define MAX_UCB_BUFFER_SIZE PAGE_SIZE static int uprobe_buffer_init(void) { @@ -979,6 +982,11 @@ static struct uprobe_cpu_buffer *prepare_uprobe_buffer(struct trace_uprobe *tu, ucb = uprobe_buffer_get(); ucb->dsize = tu->tp.size + dsize; + if (WARN_ON_ONCE(ucb->dsize > MAX_UCB_BUFFER_SIZE)) { + ucb->dsize = MAX_UCB_BUFFER_SIZE; + dsize = MAX_UCB_BUFFER_SIZE - tu->tp.size; + } + store_trace_args(ucb->buf, &tu->tp, regs, NULL, esize, dsize); *ucbp = ucb; @@ -998,9 +1006,6 @@ static void __uprobe_trace_func(struct trace_uprobe *tu, WARN_ON(call != trace_file->event_call); - if (WARN_ON_ONCE(ucb->dsize > PAGE_SIZE)) - return; - if (trace_trigger_soft_disabled(trace_file)) return; diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 7315f643817a..7312ae7c3cc5 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -3060,7 +3060,7 @@ config RUST_BUILD_ASSERT_ALLOW bool "Allow unoptimized build-time assertions" depends on RUST help - Controls how are `build_error!` and `build_assert!` handled during build. + Controls how `build_error!` and `build_assert!` are handled during the build. If calls to them exist in the binary, it may indicate a violated invariant or that the optimizer failed to verify the invariant during compilation. diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan index 98016e137b7f..233ab2096924 100644 --- a/lib/Kconfig.kasan +++ b/lib/Kconfig.kasan @@ -22,8 +22,11 @@ config ARCH_DISABLE_KASAN_INLINE config CC_HAS_KASAN_GENERIC def_bool $(cc-option, -fsanitize=kernel-address) +# GCC appears to ignore no_sanitize_address when -fsanitize=kernel-hwaddress +# is passed. See https://bugzilla.kernel.org/show_bug.cgi?id=218854 (and +# the linked LKML thread) for more details. config CC_HAS_KASAN_SW_TAGS - def_bool $(cc-option, -fsanitize=kernel-hwaddress) + def_bool !CC_IS_GCC && $(cc-option, -fsanitize=kernel-hwaddress) # This option is only required for software KASAN modes. # Old GCC versions do not have proper support for no_sanitize_address. @@ -98,7 +101,7 @@ config KASAN_SW_TAGS help Enables Software Tag-Based KASAN. - Requires GCC 11+ or Clang. + Requires Clang. Supported only on arm64 CPUs and relies on Top Byte Ignore. diff --git a/lib/buildid.c b/lib/buildid.c index 290641d92ac1..c4b0f376fb34 100644 --- a/lib/buildid.c +++ b/lib/buildid.c @@ -5,6 +5,7 @@ #include <linux/elf.h> #include <linux/kernel.h> #include <linux/pagemap.h> +#include <linux/secretmem.h> #define BUILD_ID 3 @@ -64,6 +65,10 @@ static int freader_get_folio(struct freader *r, loff_t file_off) freader_put_folio(r); + /* reject secretmem folios created with memfd_secret() */ + if (secretmem_mapping(r->file->f_mapping)) + return -EFAULT; + r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT); /* if sleeping is allowed, wait for the page, if necessary */ diff --git a/lib/codetag.c b/lib/codetag.c index afa8a2d4f317..d1fbbb7c2ec3 100644 --- a/lib/codetag.c +++ b/lib/codetag.c @@ -228,6 +228,9 @@ bool codetag_unload_module(struct module *mod) if (!mod) return true; + /* await any module's kfree_rcu() operations to complete */ + kvfree_rcu_barrier(); + mutex_lock(&codetag_lock); list_for_each_entry(cttype, &codetag_types, link) { struct codetag_module *found = NULL; diff --git a/lib/crypto/mpi/mpi-mul.c b/lib/crypto/mpi/mpi-mul.c index 892a246216b9..7e6ff1ce3e9b 100644 --- a/lib/crypto/mpi/mpi-mul.c +++ b/lib/crypto/mpi/mpi-mul.c @@ -21,7 +21,7 @@ int mpi_mul(MPI w, MPI u, MPI v) int usign, vsign, sign_product; int assign_wp = 0; mpi_ptr_t tmp_limb = NULL; - int err; + int err = 0; if (u->nlimbs < v->nlimbs) { /* Swap U and V. */ diff --git a/lib/maple_tree.c b/lib/maple_tree.c index 20990ecba2dd..3619301dda2e 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -2196,6 +2196,8 @@ static inline void mas_node_or_none(struct ma_state *mas, /* * mas_wr_node_walk() - Find the correct offset for the index in the @mas. + * If @mas->index cannot be found within the containing + * node, we traverse to the last entry in the node. * @wr_mas: The maple write state * * Uses mas_slot_locked() and does not need to worry about dead nodes. @@ -3532,7 +3534,7 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas) return true; } -static bool mas_wr_walk_index(struct ma_wr_state *wr_mas) +static void mas_wr_walk_index(struct ma_wr_state *wr_mas) { struct ma_state *mas = wr_mas->mas; @@ -3541,11 +3543,9 @@ static bool mas_wr_walk_index(struct ma_wr_state *wr_mas) wr_mas->content = mas_slot_locked(mas, wr_mas->slots, mas->offset); if (ma_is_leaf(wr_mas->type)) - return true; + return; mas_wr_walk_traverse(wr_mas); - } - return true; } /* * mas_extend_spanning_null() - Extend a store of a %NULL to include surrounding %NULLs. @@ -3765,8 +3765,8 @@ static noinline void mas_wr_spanning_store(struct ma_wr_state *wr_mas) memset(&b_node, 0, sizeof(struct maple_big_node)); /* Copy l_mas and store the value in b_node. */ mas_store_b_node(&l_wr_mas, &b_node, l_mas.end); - /* Copy r_mas into b_node. */ - if (r_mas.offset <= r_mas.end) + /* Copy r_mas into b_node if there is anything to copy. */ + if (r_mas.max > r_mas.last) mas_mab_cp(&r_mas, r_mas.offset, r_mas.end, &b_node, b_node.b_end + 1); else @@ -4218,7 +4218,7 @@ static inline void mas_wr_store_type(struct ma_wr_state *wr_mas) /* Potential spanning rebalance collapsing a node */ if (new_end < mt_min_slots[wr_mas->type]) { - if (!mte_is_root(mas->node)) { + if (!mte_is_root(mas->node) && !(mas->mas_flags & MA_STATE_BULK)) { mas->store_type = wr_rebalance; return; } diff --git a/lib/objpool.c b/lib/objpool.c index 234f9d0bd081..fd108fe0d095 100644 --- a/lib/objpool.c +++ b/lib/objpool.c @@ -76,7 +76,7 @@ objpool_init_percpu_slots(struct objpool_head *pool, int nr_objs, * mimimal size of vmalloc is one page since vmalloc would * always align the requested size to page size */ - if (pool->gfp & GFP_ATOMIC) + if ((pool->gfp & GFP_ATOMIC) == GFP_ATOMIC) slot = kmalloc_node(size, pool->gfp, cpu_to_node(i)); else slot = __vmalloc_node(size, sizeof(void *), pool->gfp, diff --git a/mm/damon/tests/sysfs-kunit.h b/mm/damon/tests/sysfs-kunit.h index 1c9b596057a7..7b5c7b307da9 100644 --- a/mm/damon/tests/sysfs-kunit.h +++ b/mm/damon/tests/sysfs-kunit.h @@ -67,6 +67,7 @@ static void damon_sysfs_test_add_targets(struct kunit *test) damon_destroy_ctx(ctx); kfree(sysfs_targets->targets_arr); kfree(sysfs_targets); + kfree(sysfs_target->regions); kfree(sysfs_target); } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 87b49ecc7b1e..2fb328880b50 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -109,18 +109,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, if (!vma->vm_mm) /* vdso */ return 0; - /* - * Explicitly disabled through madvise or prctl, or some - * architectures may disable THP for some mappings, for - * example, s390 kvm. - * */ - if ((vm_flags & VM_NOHUGEPAGE) || - test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)) - return 0; - /* - * If the hardware/firmware marked hugepage support disabled. - */ - if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED)) + if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags)) return 0; /* khugepaged doesn't collapse DAX vma, but page fault is fine. */ diff --git a/mm/kasan/init.c b/mm/kasan/init.c index 89895f38f722..ac607c306292 100644 --- a/mm/kasan/init.c +++ b/mm/kasan/init.c @@ -106,6 +106,10 @@ static void __ref zero_pte_populate(pmd_t *pmd, unsigned long addr, } } +void __weak __meminit kernel_pte_init(void *addr) +{ +} + static int __ref zero_pmd_populate(pud_t *pud, unsigned long addr, unsigned long end) { @@ -126,8 +130,10 @@ static int __ref zero_pmd_populate(pud_t *pud, unsigned long addr, if (slab_is_available()) p = pte_alloc_one_kernel(&init_mm); - else + else { p = early_alloc(PAGE_SIZE, NUMA_NO_NODE); + kernel_pte_init(p); + } if (!p) return -ENOMEM; diff --git a/mm/khugepaged.c b/mm/khugepaged.c index f9c39898eaff..b538c3d48386 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -2227,7 +2227,7 @@ rollback: folio_put(new_folio); out: VM_BUG_ON(!list_empty(&pagelist)); - trace_mm_khugepaged_collapse_file(mm, new_folio, index, is_shmem, addr, file, HPAGE_PMD_NR, result); + trace_mm_khugepaged_collapse_file(mm, new_folio, index, addr, is_shmem, file, HPAGE_PMD_NR, result); return result; } @@ -2252,7 +2252,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr, continue; if (xa_is_value(folio)) { - ++swap; + swap += 1 << xas_get_order(&xas); if (cc->is_khugepaged && swap > khugepaged_max_ptes_swap) { result = SCAN_EXCEED_SWAP_PTE; @@ -2299,7 +2299,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr, * is just too costly... */ - present++; + present += folio_nr_pages(folio); if (need_resched()) { xas_pause(&xas); diff --git a/mm/memory.c b/mm/memory.c index 2366578015ad..3ccee51adfbb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4181,11 +4181,6 @@ fallback: return __alloc_swap_folio(vmf); } #else /* !CONFIG_TRANSPARENT_HUGEPAGE */ -static inline bool can_swapin_thp(struct vm_fault *vmf, pte_t *ptep, int nr_pages) -{ - return false; -} - static struct folio *alloc_swap_folio(struct vm_fault *vmf) { return __alloc_swap_folio(vmf); @@ -4925,6 +4920,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) pmd_t entry; vm_fault_t ret = VM_FAULT_FALLBACK; + /* + * It is too late to allocate a small folio, we already have a large + * folio in the pagecache: especially s390 KVM cannot tolerate any + * PMD mappings, but PTE-mapped THP are fine. So let's simply refuse any + * PMD mappings if THPs are disabled. + */ + if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags)) + return ret; + if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER)) return ret; @@ -6346,7 +6350,8 @@ static inline void pfnmap_args_setup(struct follow_pfnmap_args *args, static inline void pfnmap_lockdep_assert(struct vm_area_struct *vma) { #ifdef CONFIG_LOCKDEP - struct address_space *mapping = vma->vm_file->f_mapping; + struct file *file = vma->vm_file; + struct address_space *mapping = file ? file->f_mapping : NULL; if (mapping) lockdep_assert(lockdep_is_held(&vma->vm_file->f_mapping->i_mmap_rwsem) || diff --git a/mm/mmap.c b/mm/mmap.c index dd4b35a25aeb..9c0fb43064b5 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1371,7 +1371,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr, struct maple_tree mt_detach; unsigned long end = addr + len; bool writable_file_mapping = false; - int error = -ENOMEM; + int error; VMA_ITERATOR(vmi, mm, addr); VMG_STATE(vmg, mm, &vmi, addr, end, vm_flags, pgoff); @@ -1396,8 +1396,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr, } /* Check against address space limit. */ - if (!may_expand_vm(mm, vm_flags, pglen - vms.nr_pages)) + if (!may_expand_vm(mm, vm_flags, pglen - vms.nr_pages)) { + error = -ENOMEM; goto abort_munmap; + } /* * Private writable mapping: check memory availability @@ -1405,8 +1407,11 @@ unsigned long mmap_region(struct file *file, unsigned long addr, if (accountable_mapping(file, vm_flags)) { charged = pglen; charged -= vms.nr_accounted; - if (charged && security_vm_enough_memory_mm(mm, charged)) - goto abort_munmap; + if (charged) { + error = security_vm_enough_memory_mm(mm, charged); + if (error) + goto abort_munmap; + } vms.nr_accounted = 0; vm_flags |= VM_ACCOUNT; @@ -1422,8 +1427,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr, * not unmapped, but the maps are removed from the list. */ vma = vm_area_alloc(mm); - if (!vma) + if (!vma) { + error = -ENOMEM; goto unacct_error; + } vma_iter_config(&vmi, addr, end); vma_set_range(vma, addr, end, pgoff); @@ -1453,9 +1460,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr, * Expansion is handled above, merging is handled below. * Drivers should not alter the address of the VMA. */ - error = -EINVAL; - if (WARN_ON((addr != vma->vm_start))) + if (WARN_ON((addr != vma->vm_start))) { + error = -EINVAL; goto close_and_free_vma; + } vma_iter_config(&vmi, addr, end); /* @@ -1500,13 +1508,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr, } /* Allow architectures to sanity-check the vm_flags */ - error = -EINVAL; - if (!arch_validate_flags(vma->vm_flags)) + if (!arch_validate_flags(vma->vm_flags)) { + error = -EINVAL; goto close_and_free_vma; + } - error = -ENOMEM; - if (vma_iter_prealloc(&vmi, vma)) + if (vma_iter_prealloc(&vmi, vma)) { + error = -ENOMEM; goto close_and_free_vma; + } /* Lock the VMA since it is modified after insertion into VMA tree */ vma_start_write(vma); diff --git a/mm/mremap.c b/mm/mremap.c index 24712f8dbb6b..dda09e957a5d 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -238,6 +238,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, { spinlock_t *old_ptl, *new_ptl; struct mm_struct *mm = vma->vm_mm; + bool res = false; pmd_t pmd; if (!arch_supports_page_table_move()) @@ -277,19 +278,25 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, if (new_ptl != old_ptl) spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING); - /* Clear the pmd */ pmd = *old_pmd; + + /* Racing with collapse? */ + if (unlikely(!pmd_present(pmd) || pmd_leaf(pmd))) + goto out_unlock; + /* Clear the pmd */ pmd_clear(old_pmd); + res = true; VM_BUG_ON(!pmd_none(*new_pmd)); pmd_populate(mm, new_pmd, pmd_pgtable(pmd)); flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE); +out_unlock: if (new_ptl != old_ptl) spin_unlock(new_ptl); spin_unlock(old_ptl); - return true; + return res; } #else static inline bool move_normal_pmd(struct vm_area_struct *vma, diff --git a/mm/shmem.c b/mm/shmem.c index 4f11b5506363..c5adb987b23c 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1664,12 +1664,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode, loff_t i_size; int order; - if (vma && ((vm_flags & VM_NOHUGEPAGE) || - test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))) - return 0; - - /* If the hardware/firmware marked hugepage support disabled. */ - if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED)) + if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags))) return 0; global_huge = shmem_huge_global_enabled(inode, index, write_end, diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index edcc7a6b0f6f..c0388b2e959d 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -184,6 +184,10 @@ static void * __meminit vmemmap_alloc_block_zero(unsigned long size, int node) return p; } +void __weak __meminit kernel_pte_init(void *addr) +{ +} + pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node) { pmd_t *pmd = pmd_offset(pud, addr); @@ -191,6 +195,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node) void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; + kernel_pte_init(p); pmd_populate_kernel(&init_mm, pmd, p); } return pmd; diff --git a/mm/swapfile.c b/mm/swapfile.c index 0cded32414a1..b0915f3fab31 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -194,9 +194,6 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, if (IS_ERR(folio)) return 0; - /* offset could point to the middle of a large folio */ - entry = folio->swap; - offset = swp_offset(entry); nr_pages = folio_nr_pages(folio); ret = -nr_pages; @@ -210,6 +207,10 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, if (!folio_trylock(folio)) goto out; + /* offset could point to the middle of a large folio */ + entry = folio->swap; + offset = swp_offset(entry); + need_reclaim = ((flags & TTRS_ANYWAY) || ((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) || ((flags & TTRS_FULL) && mem_cgroup_swap_full(folio))); @@ -2312,7 +2313,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type) mmap_read_lock(mm); for_each_vma(vmi, vma) { - if (vma->anon_vma) { + if (vma->anon_vma && !is_vm_hugetlb_page(vma)) { ret = unuse_vma(vma, type); if (ret) break; diff --git a/mm/vmscan.c b/mm/vmscan.c index 749cdc110c74..eb4e8440c507 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4963,8 +4963,8 @@ static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control * blk_finish_plug(&plug); done: - /* kswapd should never fail */ - pgdat->kswapd_failures = 0; + if (sc->nr_reclaimed > reclaimed) + pgdat->kswapd_failures = 0; } /****************************************************************************** diff --git a/net/9p/client.c b/net/9p/client.c index 5cd94721d974..09f8ced9f8bb 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -977,8 +977,10 @@ error: struct p9_client *p9_client_create(const char *dev_name, char *options) { int err; + static atomic_t seqno = ATOMIC_INIT(0); struct p9_client *clnt; char *client_id; + char *cache_name; clnt = kmalloc(sizeof(*clnt), GFP_KERNEL); if (!clnt) @@ -1035,15 +1037,23 @@ struct p9_client *p9_client_create(const char *dev_name, char *options) if (err) goto close_trans; + cache_name = kasprintf(GFP_KERNEL, + "9p-fcall-cache-%u", atomic_inc_return(&seqno)); + if (!cache_name) { + err = -ENOMEM; + goto close_trans; + } + /* P9_HDRSZ + 4 is the smallest packet header we can have that is * followed by data accessed from userspace by read */ clnt->fcall_cache = - kmem_cache_create_usercopy("9p-fcall-cache", clnt->msize, + kmem_cache_create_usercopy(cache_name, clnt->msize, 0, 0, P9_HDRSZ + 4, clnt->msize - (P9_HDRSZ + 4), NULL); + kfree(cache_name); return clnt; close_trans: diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c index 67604ccec2f4..0b4d0a8bd361 100644 --- a/net/bluetooth/af_bluetooth.c +++ b/net/bluetooth/af_bluetooth.c @@ -185,6 +185,28 @@ void bt_sock_unlink(struct bt_sock_list *l, struct sock *sk) } EXPORT_SYMBOL(bt_sock_unlink); +bool bt_sock_linked(struct bt_sock_list *l, struct sock *s) +{ + struct sock *sk; + + if (!l || !s) + return false; + + read_lock(&l->lock); + + sk_for_each(sk, &l->head) { + if (s == sk) { + read_unlock(&l->lock); + return true; + } + } + + read_unlock(&l->lock); + + return false; +} +EXPORT_SYMBOL(bt_sock_linked); + void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh) { const struct cred *old_cred; @@ -825,11 +847,14 @@ cleanup_sysfs: bt_sysfs_cleanup(); cleanup_led: bt_leds_cleanup(); + debugfs_remove_recursive(bt_debugfs); return err; } static void __exit bt_exit(void) { + iso_exit(); + mgmt_exit(); sco_exit(); diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c index a3bc0934cc13..d44987d4515c 100644 --- a/net/bluetooth/bnep/core.c +++ b/net/bluetooth/bnep/core.c @@ -745,8 +745,7 @@ static int __init bnep_init(void) if (flt[0]) BT_INFO("BNEP filters: %s", flt); - bnep_sock_init(); - return 0; + return bnep_sock_init(); } static void __exit bnep_exit(void) diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index 629c302f7407..96d097b21d13 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -1644,12 +1644,12 @@ void hci_adv_instances_clear(struct hci_dev *hdev) struct adv_info *adv_instance, *n; if (hdev->adv_instance_timeout) { - cancel_delayed_work(&hdev->adv_instance_expire); + disable_delayed_work(&hdev->adv_instance_expire); hdev->adv_instance_timeout = 0; } list_for_each_entry_safe(adv_instance, n, &hdev->adv_instances, list) { - cancel_delayed_work_sync(&adv_instance->rpa_expired_cb); + disable_delayed_work_sync(&adv_instance->rpa_expired_cb); list_del(&adv_instance->list); kfree(adv_instance); } @@ -2685,11 +2685,11 @@ void hci_unregister_dev(struct hci_dev *hdev) list_del(&hdev->list); write_unlock(&hci_dev_list_lock); - cancel_work_sync(&hdev->rx_work); - cancel_work_sync(&hdev->cmd_work); - cancel_work_sync(&hdev->tx_work); - cancel_work_sync(&hdev->power_on); - cancel_work_sync(&hdev->error_reset); + disable_work_sync(&hdev->rx_work); + disable_work_sync(&hdev->cmd_work); + disable_work_sync(&hdev->tx_work); + disable_work_sync(&hdev->power_on); + disable_work_sync(&hdev->error_reset); hci_cmd_sync_clear(hdev); @@ -2796,8 +2796,14 @@ static void hci_cancel_cmd_sync(struct hci_dev *hdev, int err) { bt_dev_dbg(hdev, "err 0x%2.2x", err); - cancel_delayed_work_sync(&hdev->cmd_timer); - cancel_delayed_work_sync(&hdev->ncmd_timer); + if (hci_dev_test_flag(hdev, HCI_UNREGISTER)) { + disable_delayed_work_sync(&hdev->cmd_timer); + disable_delayed_work_sync(&hdev->ncmd_timer); + } else { + cancel_delayed_work_sync(&hdev->cmd_timer); + cancel_delayed_work_sync(&hdev->ncmd_timer); + } + atomic_set(&hdev->cmd_cnt, 1); hci_cmd_sync_cancel_sync(hdev, err); diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index 40ccdef168d7..ae7a5817883a 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -5131,9 +5131,15 @@ int hci_dev_close_sync(struct hci_dev *hdev) bt_dev_dbg(hdev, ""); - cancel_delayed_work(&hdev->power_off); - cancel_delayed_work(&hdev->ncmd_timer); - cancel_delayed_work(&hdev->le_scan_disable); + if (hci_dev_test_flag(hdev, HCI_UNREGISTER)) { + disable_delayed_work(&hdev->power_off); + disable_delayed_work(&hdev->ncmd_timer); + disable_delayed_work(&hdev->le_scan_disable); + } else { + cancel_delayed_work(&hdev->power_off); + cancel_delayed_work(&hdev->ncmd_timer); + cancel_delayed_work(&hdev->le_scan_disable); + } hci_cmd_sync_cancel_sync(hdev, ENODEV); diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index d5e00d0dd1a0..7a83e400ac77 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -93,6 +93,16 @@ static struct sock *iso_get_sock(bdaddr_t *src, bdaddr_t *dst, #define ISO_CONN_TIMEOUT (HZ * 40) #define ISO_DISCONN_TIMEOUT (HZ * 2) +static struct sock *iso_sock_hold(struct iso_conn *conn) +{ + if (!conn || !bt_sock_linked(&iso_sk_list, conn->sk)) + return NULL; + + sock_hold(conn->sk); + + return conn->sk; +} + static void iso_sock_timeout(struct work_struct *work) { struct iso_conn *conn = container_of(work, struct iso_conn, @@ -100,9 +110,7 @@ static void iso_sock_timeout(struct work_struct *work) struct sock *sk; iso_conn_lock(conn); - sk = conn->sk; - if (sk) - sock_hold(sk); + sk = iso_sock_hold(conn); iso_conn_unlock(conn); if (!sk) @@ -209,9 +217,7 @@ static void iso_conn_del(struct hci_conn *hcon, int err) /* Kill socket */ iso_conn_lock(conn); - sk = conn->sk; - if (sk) - sock_hold(sk); + sk = iso_sock_hold(conn); iso_conn_unlock(conn); if (sk) { @@ -2301,13 +2307,9 @@ int iso_init(void) hci_register_cb(&iso_cb); - if (IS_ERR_OR_NULL(bt_debugfs)) - return 0; - - if (!iso_debugfs) { + if (!IS_ERR_OR_NULL(bt_debugfs)) iso_debugfs = debugfs_create_file("iso", 0444, bt_debugfs, NULL, &iso_debugfs_fops); - } iso_inited = true; diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index a5ac160c592e..1c7252a36866 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -76,6 +76,16 @@ struct sco_pinfo { #define SCO_CONN_TIMEOUT (HZ * 40) #define SCO_DISCONN_TIMEOUT (HZ * 2) +static struct sock *sco_sock_hold(struct sco_conn *conn) +{ + if (!conn || !bt_sock_linked(&sco_sk_list, conn->sk)) + return NULL; + + sock_hold(conn->sk); + + return conn->sk; +} + static void sco_sock_timeout(struct work_struct *work) { struct sco_conn *conn = container_of(work, struct sco_conn, @@ -87,9 +97,7 @@ static void sco_sock_timeout(struct work_struct *work) sco_conn_unlock(conn); return; } - sk = conn->sk; - if (sk) - sock_hold(sk); + sk = sco_sock_hold(conn); sco_conn_unlock(conn); if (!sk) @@ -194,9 +202,7 @@ static void sco_conn_del(struct hci_conn *hcon, int err) /* Kill socket */ sco_conn_lock(conn); - sk = conn->sk; - if (sk) - sock_hold(sk); + sk = sco_sock_hold(conn); sco_conn_unlock(conn); if (sk) { diff --git a/net/core/filter.c b/net/core/filter.c index a88e6924c4c0..58761263176c 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2438,9 +2438,9 @@ out: /* Internal, non-exposed redirect flags. */ enum { - BPF_F_NEIGH = (1ULL << 1), - BPF_F_PEER = (1ULL << 2), - BPF_F_NEXTHOP = (1ULL << 3), + BPF_F_NEIGH = (1ULL << 16), + BPF_F_PEER = (1ULL << 17), + BPF_F_NEXTHOP = (1ULL << 18), #define BPF_F_REDIRECT_INTERNAL (BPF_F_NEIGH | BPF_F_PEER | BPF_F_NEXTHOP) }; @@ -2450,6 +2450,8 @@ BPF_CALL_3(bpf_clone_redirect, struct sk_buff *, skb, u32, ifindex, u64, flags) struct sk_buff *clone; int ret; + BUILD_BUG_ON(BPF_F_REDIRECT_INTERNAL & BPF_F_REDIRECT_FLAGS); + if (unlikely(flags & (~(BPF_F_INGRESS) | BPF_F_REDIRECT_INTERNAL))) return -EINVAL; diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 242c91a6e3d3..07d6aa4e39ef 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -647,6 +647,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb, sk = __sock_map_lookup_elem(map, key); if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk)) + return SK_DROP; skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS); return SK_PASS; @@ -675,6 +677,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg, return SK_DROP; if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) return SK_DROP; + if (sk_is_vsock(sk)) + return SK_DROP; msg->flags = flags; msg->sk_redir = sk; @@ -1249,6 +1253,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb, sk = __sock_hash_lookup_elem(map, key); if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk)) + return SK_DROP; skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS); return SK_PASS; @@ -1277,6 +1283,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg, return SK_DROP; if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) return SK_DROP; + if (sk_is_vsock(sk)) + return SK_DROP; msg->flags = flags; msg->sk_redir = sk; diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index 0294fef577fa..7e1c2faed1ff 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -17,47 +17,43 @@ #include <net/ip.h> #include <net/l3mdev.h> -static struct dst_entry *__xfrm4_dst_lookup(struct net *net, struct flowi4 *fl4, - int tos, int oif, - const xfrm_address_t *saddr, - const xfrm_address_t *daddr, - u32 mark) +static struct dst_entry *__xfrm4_dst_lookup(struct flowi4 *fl4, + const struct xfrm_dst_lookup_params *params) { struct rtable *rt; memset(fl4, 0, sizeof(*fl4)); - fl4->daddr = daddr->a4; - fl4->flowi4_tos = tos; - fl4->flowi4_l3mdev = l3mdev_master_ifindex_by_index(net, oif); - fl4->flowi4_mark = mark; - if (saddr) - fl4->saddr = saddr->a4; - - rt = __ip_route_output_key(net, fl4); + fl4->daddr = params->daddr->a4; + fl4->flowi4_tos = params->tos; + fl4->flowi4_l3mdev = l3mdev_master_ifindex_by_index(params->net, + params->oif); + fl4->flowi4_mark = params->mark; + if (params->saddr) + fl4->saddr = params->saddr->a4; + fl4->flowi4_proto = params->ipproto; + fl4->uli = params->uli; + + rt = __ip_route_output_key(params->net, fl4); if (!IS_ERR(rt)) return &rt->dst; return ERR_CAST(rt); } -static struct dst_entry *xfrm4_dst_lookup(struct net *net, int tos, int oif, - const xfrm_address_t *saddr, - const xfrm_address_t *daddr, - u32 mark) +static struct dst_entry *xfrm4_dst_lookup(const struct xfrm_dst_lookup_params *params) { struct flowi4 fl4; - return __xfrm4_dst_lookup(net, &fl4, tos, oif, saddr, daddr, mark); + return __xfrm4_dst_lookup(&fl4, params); } -static int xfrm4_get_saddr(struct net *net, int oif, - xfrm_address_t *saddr, xfrm_address_t *daddr, - u32 mark) +static int xfrm4_get_saddr(xfrm_address_t *saddr, + const struct xfrm_dst_lookup_params *params) { struct dst_entry *dst; struct flowi4 fl4; - dst = __xfrm4_dst_lookup(net, &fl4, 0, oif, NULL, daddr, mark); + dst = __xfrm4_dst_lookup(&fl4, params); if (IS_ERR(dst)) return -EHOSTUNREACH; diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index b1d81c4270ab..1f19b6f14484 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -23,23 +23,24 @@ #include <net/ip6_route.h> #include <net/l3mdev.h> -static struct dst_entry *xfrm6_dst_lookup(struct net *net, int tos, int oif, - const xfrm_address_t *saddr, - const xfrm_address_t *daddr, - u32 mark) +static struct dst_entry *xfrm6_dst_lookup(const struct xfrm_dst_lookup_params *params) { struct flowi6 fl6; struct dst_entry *dst; int err; memset(&fl6, 0, sizeof(fl6)); - fl6.flowi6_l3mdev = l3mdev_master_ifindex_by_index(net, oif); - fl6.flowi6_mark = mark; - memcpy(&fl6.daddr, daddr, sizeof(fl6.daddr)); - if (saddr) - memcpy(&fl6.saddr, saddr, sizeof(fl6.saddr)); + fl6.flowi6_l3mdev = l3mdev_master_ifindex_by_index(params->net, + params->oif); + fl6.flowi6_mark = params->mark; + memcpy(&fl6.daddr, params->daddr, sizeof(fl6.daddr)); + if (params->saddr) + memcpy(&fl6.saddr, params->saddr, sizeof(fl6.saddr)); - dst = ip6_route_output(net, NULL, &fl6); + fl6.flowi4_proto = params->ipproto; + fl6.uli = params->uli; + + dst = ip6_route_output(params->net, NULL, &fl6); err = dst->error; if (dst->error) { @@ -50,15 +51,14 @@ static struct dst_entry *xfrm6_dst_lookup(struct net *net, int tos, int oif, return dst; } -static int xfrm6_get_saddr(struct net *net, int oif, - xfrm_address_t *saddr, xfrm_address_t *daddr, - u32 mark) +static int xfrm6_get_saddr(xfrm_address_t *saddr, + const struct xfrm_dst_lookup_params *params) { struct dst_entry *dst; struct net_device *dev; struct inet6_dev *idev; - dst = xfrm6_dst_lookup(net, 0, oif, NULL, daddr, mark); + dst = xfrm6_dst_lookup(params); if (IS_ERR(dst)) return -EHOSTUNREACH; @@ -68,7 +68,8 @@ static int xfrm6_get_saddr(struct net *net, int oif, return -EHOSTUNREACH; } dev = idev->dev; - ipv6_dev_get_saddr(dev_net(dev), dev, &daddr->in6, 0, &saddr->in6); + ipv6_dev_get_saddr(dev_net(dev), dev, ¶ms->daddr->in6, 0, + &saddr->in6); dst_release(dst); return 0; } diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c index 5257d5e7eb09..3d64a4511fcf 100644 --- a/net/netfilter/nf_bpf_link.c +++ b/net/netfilter/nf_bpf_link.c @@ -23,6 +23,7 @@ static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb, struct bpf_nf_link { struct bpf_link link; struct nf_hook_ops hook_ops; + netns_tracker ns_tracker; struct net *net; u32 dead; const struct nf_defrag_hook *defrag_hook; @@ -120,6 +121,7 @@ static void bpf_nf_link_release(struct bpf_link *link) if (!cmpxchg(&nf_link->dead, 0, 1)) { nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops); bpf_nf_disable_defrag(nf_link); + put_net_track(nf_link->net, &nf_link->ns_tracker); } } @@ -150,11 +152,12 @@ static int bpf_nf_link_fill_link_info(const struct bpf_link *link, struct bpf_link_info *info) { struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link); + const struct nf_defrag_hook *hook = nf_link->defrag_hook; info->netfilter.pf = nf_link->hook_ops.pf; info->netfilter.hooknum = nf_link->hook_ops.hooknum; info->netfilter.priority = nf_link->hook_ops.priority; - info->netfilter.flags = 0; + info->netfilter.flags = hook ? BPF_F_NETFILTER_IP_DEFRAG : 0; return 0; } @@ -257,6 +260,8 @@ int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) return err; } + get_net_track(net, &link->ns_tracker, GFP_KERNEL); + return bpf_link_settle(&link_primer); } diff --git a/net/netfilter/xt_NFLOG.c b/net/netfilter/xt_NFLOG.c index d80abd6ccaf8..6dcf4bc7e30b 100644 --- a/net/netfilter/xt_NFLOG.c +++ b/net/netfilter/xt_NFLOG.c @@ -79,7 +79,7 @@ static struct xt_target nflog_tg_reg[] __read_mostly = { { .name = "NFLOG", .revision = 0, - .family = NFPROTO_IPV4, + .family = NFPROTO_IPV6, .checkentry = nflog_tg_check, .destroy = nflog_tg_destroy, .target = nflog_tg, diff --git a/net/netfilter/xt_TRACE.c b/net/netfilter/xt_TRACE.c index f3fa4f11348c..a642ff09fc8e 100644 --- a/net/netfilter/xt_TRACE.c +++ b/net/netfilter/xt_TRACE.c @@ -49,6 +49,7 @@ static struct xt_target trace_tg_reg[] __read_mostly = { .target = trace_tg, .checkentry = trace_tg_check, .destroy = trace_tg_destroy, + .me = THIS_MODULE, }, #endif }; diff --git a/net/netfilter/xt_mark.c b/net/netfilter/xt_mark.c index f76fe04fc9a4..65b965ca40ea 100644 --- a/net/netfilter/xt_mark.c +++ b/net/netfilter/xt_mark.c @@ -62,7 +62,7 @@ static struct xt_target mark_tg_reg[] __read_mostly = { { .name = "MARK", .revision = 2, - .family = NFPROTO_IPV4, + .family = NFPROTO_IPV6, .target = mark_tg, .targetsize = sizeof(struct xt_mark_tginfo2), .me = THIS_MODULE, diff --git a/net/sched/act_api.c b/net/sched/act_api.c index c3f8dd7f8e65..839790043256 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -1497,8 +1497,29 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, bool skip_sw = tc_skip_sw(fl_flags); bool skip_hw = tc_skip_hw(fl_flags); - if (tc_act_bind(act->tcfa_flags)) + if (tc_act_bind(act->tcfa_flags)) { + /* Action is created by classifier and is not + * standalone. Check that the user did not set + * any action flags different than the + * classifier flags, and inherit the flags from + * the classifier for the compatibility case + * where no flags were specified at all. + */ + if ((tc_act_skip_sw(act->tcfa_flags) && !skip_sw) || + (tc_act_skip_hw(act->tcfa_flags) && !skip_hw)) { + NL_SET_ERR_MSG(extack, + "Mismatch between action and filter offload flags"); + err = -EINVAL; + goto err; + } + if (skip_sw) + act->tcfa_flags |= TCA_ACT_FLAGS_SKIP_SW; + if (skip_hw) + act->tcfa_flags |= TCA_ACT_FLAGS_SKIP_HW; continue; + } + + /* Action is standalone */ if (skip_sw != tc_act_skip_sw(act->tcfa_flags) || skip_hw != tc_act_skip_hw(act->tcfa_flags)) { NL_SET_ERR_MSG(extack, diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 2af24547a82c..38ec18f73de4 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -512,9 +512,15 @@ static void dev_watchdog(struct timer_list *t) struct netdev_queue *txq; txq = netdev_get_tx_queue(dev, i); - trans_start = READ_ONCE(txq->trans_start); if (!netif_xmit_stopped(txq)) continue; + + /* Paired with WRITE_ONCE() + smp_mb...() in + * netdev_tx_sent_queue() and netif_tx_stop_queue(). + */ + smp_mb(); + trans_start = READ_ONCE(txq->trans_start); + if (time_after(jiffies, trans_start + dev->watchdog_timeo)) { timedout_ms = jiffies_to_msecs(jiffies - trans_start); atomic_long_inc(&txq->trans_timeout); diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index 8498d0606b24..8623dc0bafc0 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -1965,7 +1965,8 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt, taprio_start_sched(sch, start, new_admin); - rcu_assign_pointer(q->admin_sched, new_admin); + admin = rcu_replace_pointer(q->admin_sched, new_admin, + lockdep_rtnl_is_held()); if (admin) call_rcu(&admin->rcu, taprio_free_sched_cb); @@ -2373,9 +2374,6 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) struct tc_mqprio_qopt opt = { 0 }; struct nlattr *nest, *sched_nest; - oper = rtnl_dereference(q->oper_sched); - admin = rtnl_dereference(q->admin_sched); - mqprio_qopt_reconstruct(dev, &opt); nest = nla_nest_start_noflag(skb, TCA_OPTIONS); @@ -2396,18 +2394,23 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay)) goto options_error; + rcu_read_lock(); + + oper = rtnl_dereference(q->oper_sched); + admin = rtnl_dereference(q->admin_sched); + if (oper && taprio_dump_tc_entries(skb, q, oper)) - goto options_error; + goto options_error_rcu; if (oper && dump_schedule(skb, oper)) - goto options_error; + goto options_error_rcu; if (!admin) goto done; sched_nest = nla_nest_start_noflag(skb, TCA_TAPRIO_ATTR_ADMIN_SCHED); if (!sched_nest) - goto options_error; + goto options_error_rcu; if (dump_schedule(skb, admin)) goto admin_error; @@ -2415,11 +2418,15 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb) nla_nest_end(skb, sched_nest); done: + rcu_read_unlock(); return nla_nest_end(skb, nest); admin_error: nla_nest_cancel(skb, sched_nest); +options_error_rcu: + rcu_read_unlock(); + options_error: nla_nest_cancel(skb, nest); diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 884ee128851e..ccbd2bc0d210 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1707,6 +1707,7 @@ int virtio_transport_read_skb(struct vsock_sock *vsk, skb_read_actor_t recv_acto { struct virtio_vsock_sock *vvs = vsk->trans; struct sock *sk = sk_vsock(vsk); + struct virtio_vsock_hdr *hdr; struct sk_buff *skb; int off = 0; int err; @@ -1716,10 +1717,19 @@ int virtio_transport_read_skb(struct vsock_sock *vsk, skb_read_actor_t recv_acto * works for types other than dgrams. */ skb = __skb_recv_datagram(sk, &vvs->rx_queue, MSG_DONTWAIT, &off, &err); + if (!skb) { + spin_unlock_bh(&vvs->rx_lock); + return err; + } + + hdr = virtio_vsock_hdr(skb); + if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM) + vvs->msg_count--; + + virtio_transport_dec_rx_pkt(vvs, le32_to_cpu(hdr->len)); spin_unlock_bh(&vvs->rx_lock); - if (!skb) - return err; + virtio_transport_send_credit_update(vsk); return recv_actor(sk, skb); } diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c index c42c5cc18f32..4aa6e74ec295 100644 --- a/net/vmw_vsock/vsock_bpf.c +++ b/net/vmw_vsock/vsock_bpf.c @@ -114,14 +114,6 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, return copied; } -/* Copy of original proto with updated sock_map methods */ -static struct proto vsock_bpf_prot = { - .close = sock_map_close, - .recvmsg = vsock_bpf_recvmsg, - .sock_is_readable = sk_msg_is_readable, - .unhash = sock_map_unhash, -}; - static void vsock_bpf_rebuild_protos(struct proto *prot, const struct proto *base) { *prot = *base; diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c index f123b7c9ec82..b33c4591e09a 100644 --- a/net/xfrm/xfrm_device.c +++ b/net/xfrm/xfrm_device.c @@ -269,6 +269,8 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, dev = dev_get_by_index(net, xuo->ifindex); if (!dev) { + struct xfrm_dst_lookup_params params; + if (!(xuo->flags & XFRM_OFFLOAD_INBOUND)) { saddr = &x->props.saddr; daddr = &x->id.daddr; @@ -277,9 +279,12 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x, daddr = &x->props.saddr; } - dst = __xfrm_dst_lookup(net, 0, 0, saddr, daddr, - x->props.family, - xfrm_smark_get(0, x)); + memset(¶ms, 0, sizeof(params)); + params.net = net; + params.saddr = saddr; + params.daddr = daddr; + params.mark = xfrm_smark_get(0, x); + dst = __xfrm_dst_lookup(x->props.family, ¶ms); if (IS_ERR(dst)) return (is_packet_offload) ? -EINVAL : 0; diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 914bac03b52a..a2ea9dbac90b 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -270,10 +270,8 @@ static const struct xfrm_if_cb *xfrm_if_get_cb(void) return rcu_dereference(xfrm_if_cb); } -struct dst_entry *__xfrm_dst_lookup(struct net *net, int tos, int oif, - const xfrm_address_t *saddr, - const xfrm_address_t *daddr, - int family, u32 mark) +struct dst_entry *__xfrm_dst_lookup(int family, + const struct xfrm_dst_lookup_params *params) { const struct xfrm_policy_afinfo *afinfo; struct dst_entry *dst; @@ -282,7 +280,7 @@ struct dst_entry *__xfrm_dst_lookup(struct net *net, int tos, int oif, if (unlikely(afinfo == NULL)) return ERR_PTR(-EAFNOSUPPORT); - dst = afinfo->dst_lookup(net, tos, oif, saddr, daddr, mark); + dst = afinfo->dst_lookup(params); rcu_read_unlock(); @@ -296,6 +294,7 @@ static inline struct dst_entry *xfrm_dst_lookup(struct xfrm_state *x, xfrm_address_t *prev_daddr, int family, u32 mark) { + struct xfrm_dst_lookup_params params; struct net *net = xs_net(x); xfrm_address_t *saddr = &x->props.saddr; xfrm_address_t *daddr = &x->id.daddr; @@ -310,7 +309,29 @@ static inline struct dst_entry *xfrm_dst_lookup(struct xfrm_state *x, daddr = x->coaddr; } - dst = __xfrm_dst_lookup(net, tos, oif, saddr, daddr, family, mark); + params.net = net; + params.saddr = saddr; + params.daddr = daddr; + params.tos = tos; + params.oif = oif; + params.mark = mark; + params.ipproto = x->id.proto; + if (x->encap) { + switch (x->encap->encap_type) { + case UDP_ENCAP_ESPINUDP: + params.ipproto = IPPROTO_UDP; + params.uli.ports.sport = x->encap->encap_sport; + params.uli.ports.dport = x->encap->encap_dport; + break; + case TCP_ENCAP_ESPINTCP: + params.ipproto = IPPROTO_TCP; + params.uli.ports.sport = x->encap->encap_sport; + params.uli.ports.dport = x->encap->encap_dport; + break; + } + } + + dst = __xfrm_dst_lookup(family, ¶ms); if (!IS_ERR(dst)) { if (prev_saddr != saddr) @@ -2432,15 +2453,15 @@ int __xfrm_sk_clone_policy(struct sock *sk, const struct sock *osk) } static int -xfrm_get_saddr(struct net *net, int oif, xfrm_address_t *local, - xfrm_address_t *remote, unsigned short family, u32 mark) +xfrm_get_saddr(unsigned short family, xfrm_address_t *saddr, + const struct xfrm_dst_lookup_params *params) { int err; const struct xfrm_policy_afinfo *afinfo = xfrm_policy_get_afinfo(family); if (unlikely(afinfo == NULL)) return -EINVAL; - err = afinfo->get_saddr(net, oif, local, remote, mark); + err = afinfo->get_saddr(saddr, params); rcu_read_unlock(); return err; } @@ -2469,9 +2490,14 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, const struct flowi *fl, remote = &tmpl->id.daddr; local = &tmpl->saddr; if (xfrm_addr_any(local, tmpl->encap_family)) { - error = xfrm_get_saddr(net, fl->flowi_oif, - &tmp, remote, - tmpl->encap_family, 0); + struct xfrm_dst_lookup_params params; + + memset(¶ms, 0, sizeof(params)); + params.net = net; + params.oif = fl->flowi_oif; + params.daddr = remote; + error = xfrm_get_saddr(tmpl->encap_family, &tmp, + ¶ms); if (error) goto fail; local = &tmp; @@ -4180,7 +4206,6 @@ static int __net_init xfrm_policy_init(struct net *net) net->xfrm.policy_count[dir] = 0; net->xfrm.policy_count[XFRM_POLICY_MAX + dir] = 0; - INIT_HLIST_HEAD(&net->xfrm.policy_inexact[dir]); htab = &net->xfrm.policy_bydst[dir]; htab->table = xfrm_hash_alloc(sz); @@ -4234,8 +4259,6 @@ static void xfrm_policy_fini(struct net *net) for (dir = 0; dir < XFRM_POLICY_MAX; dir++) { struct xfrm_policy_hash *htab; - WARN_ON(!hlist_empty(&net->xfrm.policy_inexact[dir])); - htab = &net->xfrm.policy_bydst[dir]; sz = (htab->hmask + 1) * sizeof(struct hlist_head); WARN_ON(!hlist_empty(htab->table)); diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 2b10a45ff124..e3b8ce89831a 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -201,6 +201,7 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, { int err; u8 sa_dir = attrs[XFRMA_SA_DIR] ? nla_get_u8(attrs[XFRMA_SA_DIR]) : 0; + u16 family = p->sel.family; err = -EINVAL; switch (p->family) { @@ -221,7 +222,10 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, goto out; } - switch (p->sel.family) { + if (!family && !(p->flags & XFRM_STATE_AF_UNSPEC)) + family = p->family; + + switch (family) { case AF_UNSPEC: break; @@ -1098,7 +1102,9 @@ static int copy_to_user_auth(struct xfrm_algo_auth *auth, struct sk_buff *skb) if (!nla) return -EMSGSIZE; ap = nla_data(nla); - memcpy(ap, auth, sizeof(struct xfrm_algo_auth)); + strscpy_pad(ap->alg_name, auth->alg_name, sizeof(ap->alg_name)); + ap->alg_key_len = auth->alg_key_len; + ap->alg_trunc_len = auth->alg_trunc_len; if (redact_secret && auth->alg_key_len) memset(ap->alg_key, 0, (auth->alg_key_len + 7) / 8); else diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include index 785a491e5996..33193ca6e803 100644 --- a/scripts/Kconfig.include +++ b/scripts/Kconfig.include @@ -65,6 +65,9 @@ cc-option-bit = $(if-success,$(CC) -Werror $(1) -E -x c /dev/null -o /dev/null,$ m32-flag := $(cc-option-bit,-m32) m64-flag := $(cc-option-bit,-m64) +rustc-version := $(shell,$(srctree)/scripts/rustc-version.sh $(RUSTC)) +rustc-llvm-version := $(shell,$(srctree)/scripts/rustc-llvm-version.sh $(RUSTC)) + # $(rustc-option,<flag>) # Return y if the Rust compiler supports <flag>, n otherwise # Calls to this should be guarded so that they are not evaluated if diff --git a/scripts/Makefile.compiler b/scripts/Makefile.compiler index 057305eae85c..e0842496d26e 100644 --- a/scripts/Makefile.compiler +++ b/scripts/Makefile.compiler @@ -53,13 +53,11 @@ cc-option = $(call __cc-option, $(CC),\ # cc-option-yn # Usage: flag := $(call cc-option-yn,-march=winchip-c6) -cc-option-yn = $(call try-run,\ - $(CC) -Werror $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) $(1) -c -x c /dev/null -o "$$TMP",y,n) +cc-option-yn = $(if $(call cc-option,$1),y,n) # cc-disable-warning # Usage: cflags-y += $(call cc-disable-warning,unused-but-set-variable) -cc-disable-warning = $(call try-run,\ - $(CC) -Werror $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1))) +cc-disable-warning = $(if $(call cc-option,-W$(strip $1)),-Wno-$(strip $1)) # gcc-min-version # Usage: cflags-$(call gcc-min-version, 70100) += -foo @@ -75,8 +73,11 @@ ld-option = $(call try-run, $(LD) $(KBUILD_LDFLAGS) $(1) -v,$(1),$(2),$(3)) # __rustc-option # Usage: MY_RUSTFLAGS += $(call __rustc-option,$(RUSTC),$(MY_RUSTFLAGS),-Cinstrument-coverage,-Zinstrument-coverage) +# TODO: remove RUSTC_BOOTSTRAP=1 when we raise the minimum GNU Make version to 4.4 __rustc-option = $(call try-run,\ - $(1) $(2) $(3) --crate-type=rlib /dev/null --out-dir=$$TMPOUT -o "$$TMP",$(3),$(4)) + echo '#![allow(missing_docs)]#![feature(no_core)]#![no_core]' | RUSTC_BOOTSTRAP=1\ + $(1) --sysroot=/dev/null $(filter-out --sysroot=/dev/null,$(2)) $(3)\ + --crate-type=rlib --out-dir=$(TMPOUT) --emit=obj=- - >/dev/null,$(3),$(4)) # rustc-option # Usage: rustflags-y += $(call rustc-option,-Cinstrument-coverage,-Zinstrument-coverage) @@ -85,5 +86,4 @@ rustc-option = $(call __rustc-option, $(RUSTC),\ # rustc-option-yn # Usage: flag := $(call rustc-option-yn,-Cinstrument-coverage) -rustc-option-yn = $(call try-run,\ - $(RUSTC) $(KBUILD_RUSTFLAGS) $(1) --crate-type=rlib /dev/null --out-dir=$$TMPOUT -o "$$TMP",y,n) +rustc-option-yn = $(if $(call rustc-option,$1),y,n) diff --git a/scripts/rustc-llvm-version.sh b/scripts/rustc-llvm-version.sh new file mode 100755 index 000000000000..b6063cbe5bdc --- /dev/null +++ b/scripts/rustc-llvm-version.sh @@ -0,0 +1,22 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# +# Usage: $ ./rustc-llvm-version.sh rustc +# +# Print the LLVM version that the Rust compiler uses in a 6 digit form. + +# Convert the version string x.y.z to a canonical up-to-6-digits form. +get_canonical_version() +{ + IFS=. + set -- $1 + echo $((10000 * $1 + 100 * $2 + $3)) +} + +if output=$("$@" --version --verbose 2>/dev/null | grep LLVM); then + set -- $output + get_canonical_version $3 +else + echo 0 + exit 1 +fi diff --git a/security/ipe/Kconfig b/security/ipe/Kconfig index 3ab582606ed2..3c75bf267da4 100644 --- a/security/ipe/Kconfig +++ b/security/ipe/Kconfig @@ -31,6 +31,25 @@ config IPE_BOOT_POLICY If unsure, leave blank. +config IPE_POLICY_SIG_SECONDARY_KEYRING + bool "IPE policy update verification with secondary keyring" + default y + depends on SECONDARY_TRUSTED_KEYRING + help + Also allow the secondary trusted keyring to verify IPE policy + updates. + + If unsure, answer Y. + +config IPE_POLICY_SIG_PLATFORM_KEYRING + bool "IPE policy update verification with platform keyring" + default y + depends on INTEGRITY_PLATFORM_KEYRING + help + Also allow the platform keyring to verify IPE policy updates. + + If unsure, answer Y. + menu "IPE Trust Providers" config IPE_PROP_DM_VERITY diff --git a/security/ipe/policy.c b/security/ipe/policy.c index d8e7db857a2e..b628f696e32b 100644 --- a/security/ipe/policy.c +++ b/security/ipe/policy.c @@ -106,8 +106,8 @@ int ipe_update_policy(struct inode *root, const char *text, size_t textlen, goto err; } - if (ver_to_u64(old) > ver_to_u64(new)) { - rc = -EINVAL; + if (ver_to_u64(old) >= ver_to_u64(new)) { + rc = -ESTALE; goto err; } @@ -169,9 +169,21 @@ struct ipe_policy *ipe_new_policy(const char *text, size_t textlen, goto err; } - rc = verify_pkcs7_signature(NULL, 0, new->pkcs7, pkcs7len, NULL, + rc = verify_pkcs7_signature(NULL, 0, new->pkcs7, pkcs7len, +#ifdef CONFIG_IPE_POLICY_SIG_SECONDARY_KEYRING + VERIFY_USE_SECONDARY_KEYRING, +#else + NULL, +#endif VERIFYING_UNSPECIFIED_SIGNATURE, set_pkcs7_data, new); +#ifdef CONFIG_IPE_POLICY_SIG_PLATFORM_KEYRING + if (rc == -ENOKEY || rc == -EKEYREJECTED) + rc = verify_pkcs7_signature(NULL, 0, new->pkcs7, pkcs7len, + VERIFY_USE_PLATFORM_KEYRING, + VERIFYING_UNSPECIFIED_SIGNATURE, + set_pkcs7_data, new); +#endif if (rc) goto err; } else { diff --git a/sound/Kconfig b/sound/Kconfig index 4c036a9a420a..8b40205394fe 100644 --- a/sound/Kconfig +++ b/sound/Kconfig @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only menuconfig SOUND tristate "Sound card support" - depends on HAS_IOMEM || UML + depends on HAS_IOMEM || INDIRECT_IOMEM help If you have a sound card in your computer, i.e. if it can say more than an occasional beep, say Y. diff --git a/sound/hda/intel-sdw-acpi.c b/sound/hda/intel-sdw-acpi.c index 04d6b6beabca..ed530e0dd4dd 100644 --- a/sound/hda/intel-sdw-acpi.c +++ b/sound/hda/intel-sdw-acpi.c @@ -56,18 +56,21 @@ static int sdw_intel_scan_controller(struct sdw_intel_acpi_info *info) { struct acpi_device *adev = acpi_fetch_acpi_dev(info->handle); - u8 count, i; + struct fwnode_handle *fwnode; + unsigned long list; + unsigned int i; + u32 count; + u32 tmp; int ret; if (!adev) return -EINVAL; - /* Found controller, find links supported */ - count = 0; - ret = fwnode_property_read_u8_array(acpi_fwnode_handle(adev), - "mipi-sdw-master-count", &count, 1); + fwnode = acpi_fwnode_handle(adev); /* + * Found controller, find links supported + * * In theory we could check the number of links supported in * hardware, but in that step we cannot assume SoundWire IP is * powered. @@ -78,11 +81,19 @@ sdw_intel_scan_controller(struct sdw_intel_acpi_info *info) * * We will check the hardware capabilities in the startup() step */ - + ret = fwnode_property_read_u32(fwnode, "mipi-sdw-manager-list", &tmp); if (ret) { - dev_err(&adev->dev, - "Failed to read mipi-sdw-master-count: %d\n", ret); - return -EINVAL; + ret = fwnode_property_read_u32(fwnode, "mipi-sdw-master-count", &count); + if (ret) { + dev_err(&adev->dev, + "Failed to read mipi-sdw-master-count: %d\n", + ret); + return ret; + } + list = GENMASK(count - 1, 0); + } else { + list = tmp; + count = hweight32(list); } /* Check count is within bounds */ @@ -101,14 +112,14 @@ sdw_intel_scan_controller(struct sdw_intel_acpi_info *info) info->count = count; info->link_mask = 0; - for (i = 0; i < count; i++) { + for_each_set_bit(i, &list, SDW_INTEL_MAX_LINKS) { if (ctrl_link_mask && !(ctrl_link_mask & BIT(i))) { dev_dbg(&adev->dev, "Link %d masked, will not be enabled\n", i); continue; } - if (!is_link_enabled(acpi_fwnode_handle(adev), i)) { + if (!is_link_enabled(fwnode, i)) { dev_dbg(&adev->dev, "Link %d not selected in firmware\n", i); continue; diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c index b61ce5e6f5ec..c74f6742c359 100644 --- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -303,6 +303,7 @@ enum { CXT_FIXUP_HP_SPECTRE, CXT_FIXUP_HP_GATE_MIC, CXT_FIXUP_MUTE_LED_GPIO, + CXT_FIXUP_HP_ELITEONE_OUT_DIS, CXT_FIXUP_HP_ZBOOK_MUTE_LED, CXT_FIXUP_HEADSET_MIC, CXT_FIXUP_HP_MIC_NO_PRESENCE, @@ -320,6 +321,19 @@ static void cxt_fixup_stereo_dmic(struct hda_codec *codec, spec->gen.inv_dmic_split = 1; } +/* fix widget control pin settings */ +static void cxt_fixup_update_pinctl(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + if (action == HDA_FIXUP_ACT_PROBE) { + /* Unset OUT_EN for this Node pin, leaving only HP_EN. + * This is the value stored in the codec register after + * the correct initialization of the previous windows boot. + */ + snd_hda_set_pin_ctl_cache(codec, 0x1d, AC_PINCTL_HP_EN); + } +} + static void cxt5066_increase_mic_boost(struct hda_codec *codec, const struct hda_fixup *fix, int action) { @@ -971,6 +985,10 @@ static const struct hda_fixup cxt_fixups[] = { .type = HDA_FIXUP_FUNC, .v.func = cxt_fixup_mute_led_gpio, }, + [CXT_FIXUP_HP_ELITEONE_OUT_DIS] = { + .type = HDA_FIXUP_FUNC, + .v.func = cxt_fixup_update_pinctl, + }, [CXT_FIXUP_HP_ZBOOK_MUTE_LED] = { .type = HDA_FIXUP_FUNC, .v.func = cxt_fixup_hp_zbook_mute_led, @@ -1061,6 +1079,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = { SND_PCI_QUIRK(0x103c, 0x83b2, "HP EliteBook 840 G5", CXT_FIXUP_HP_DOCK), SND_PCI_QUIRK(0x103c, 0x83b3, "HP EliteBook 830 G5", CXT_FIXUP_HP_DOCK), SND_PCI_QUIRK(0x103c, 0x83d3, "HP ProBook 640 G4", CXT_FIXUP_HP_DOCK), + SND_PCI_QUIRK(0x103c, 0x83e5, "HP EliteOne 1000 G2", CXT_FIXUP_HP_ELITEONE_OUT_DIS), SND_PCI_QUIRK(0x103c, 0x8402, "HP ProBook 645 G4", CXT_FIXUP_MUTE_LED_GPIO), SND_PCI_QUIRK(0x103c, 0x8427, "HP ZBook Studio G5", CXT_FIXUP_HP_ZBOOK_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x844f, "HP ZBook Studio G5", CXT_FIXUP_HP_ZBOOK_MUTE_LED), diff --git a/sound/pci/hda/patch_cs8409.c b/sound/pci/hda/patch_cs8409.c index 26f3c31600d7..614327218634 100644 --- a/sound/pci/hda/patch_cs8409.c +++ b/sound/pci/hda/patch_cs8409.c @@ -1403,8 +1403,9 @@ void dolphin_fixups(struct hda_codec *codec, const struct hda_fixup *fix, int ac kctrl = snd_hda_gen_add_kctl(&spec->gen, "Line Out Playback Volume", &cs42l42_dac_volume_mixer); /* Update Line Out kcontrol template */ - kctrl->private_value = HDA_COMPOSE_AMP_VAL_OFS(DOLPHIN_HP_PIN_NID, 3, CS8409_CODEC1, - HDA_OUTPUT, CS42L42_VOL_DAC) | HDA_AMP_VAL_MIN_MUTE; + if (kctrl) + kctrl->private_value = HDA_COMPOSE_AMP_VAL_OFS(DOLPHIN_HP_PIN_NID, 3, CS8409_CODEC1, + HDA_OUTPUT, CS42L42_VOL_DAC) | HDA_AMP_VAL_MIN_MUTE; cs8409_enable_ur(codec, 0); snd_hda_codec_set_name(codec, "CS8409/CS42L42"); break; diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 5e2e927656cd..3bbf5fab2881 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7403,6 +7403,49 @@ static void alc245_fixup_hp_spectre_x360_eu0xxx(struct hda_codec *codec, alc245_fixup_hp_gpio_led(codec, fix, action); } +/* some changes for Spectre x360 16, 2024 model */ +static void alc245_fixup_hp_spectre_x360_16_aa0xxx(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + /* + * The Pin Complex 0x14 for the treble speakers is wrongly reported as + * unconnected. + * The Pin Complex 0x17 for the bass speakers has the lowest association + * and sequence values so shift it up a bit to squeeze 0x14 in. + */ + struct alc_spec *spec = codec->spec; + static const struct hda_pintbl pincfgs[] = { + { 0x14, 0x90170110 }, // top/treble + { 0x17, 0x90170111 }, // bottom/bass + { } + }; + + /* + * Force DAC 0x02 for the bass speakers 0x17. + */ + static const hda_nid_t conn[] = { 0x02 }; + + switch (action) { + case HDA_FIXUP_ACT_PRE_PROBE: + /* needed for amp of back speakers */ + spec->gpio_mask |= 0x01; + spec->gpio_dir |= 0x01; + snd_hda_apply_pincfgs(codec, pincfgs); + snd_hda_override_conn_list(codec, 0x17, ARRAY_SIZE(conn), conn); + break; + case HDA_FIXUP_ACT_INIT: + /* need to toggle GPIO to enable the amp of back speakers */ + alc_update_gpio_data(codec, 0x01, true); + msleep(100); + alc_update_gpio_data(codec, 0x01, false); + break; + } + + cs35l41_fixup_i2c_two(codec, fix, action); + alc245_fixup_hp_mute_led_coefbit(codec, fix, action); + alc245_fixup_hp_gpio_led(codec, fix, action); +} + /* * ALC287 PCM hooks */ @@ -7725,6 +7768,7 @@ enum { ALC256_FIXUP_ACER_SFG16_MICMUTE_LED, ALC256_FIXUP_HEADPHONE_AMP_VOL, ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX, + ALC245_FIXUP_HP_SPECTRE_X360_16_AA0XXX, ALC285_FIXUP_ASUS_GA403U, ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC, ALC285_FIXUP_ASUS_GA403U_I2C_SPEAKER2_TO_DAC1, @@ -10011,6 +10055,10 @@ static const struct hda_fixup alc269_fixups[] = { .type = HDA_FIXUP_FUNC, .v.func = alc245_fixup_hp_spectre_x360_eu0xxx, }, + [ALC245_FIXUP_HP_SPECTRE_X360_16_AA0XXX] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc245_fixup_hp_spectre_x360_16_aa0xxx, + }, [ALC285_FIXUP_ASUS_GA403U] = { .type = HDA_FIXUP_FUNC, .v.func = alc285_fixup_asus_ga403u, @@ -10198,6 +10246,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1028, 0x0c1e, "Dell Precision 3540", ALC236_FIXUP_DELL_DUAL_CODECS), SND_PCI_QUIRK(0x1028, 0x0c28, "Dell Inspiron 16 Plus 7630", ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS), SND_PCI_QUIRK(0x1028, 0x0c4d, "Dell", ALC287_FIXUP_CS35L41_I2C_4), + SND_PCI_QUIRK(0x1028, 0x0c94, "Dell Polaris 3 metal", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1028, 0x0c96, "Dell Polaris 2in1", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1028, 0x0cbd, "Dell Oasis 13 CS MTL-U", ALC289_FIXUP_DELL_CS35L41_SPI_2), SND_PCI_QUIRK(0x1028, 0x0cbe, "Dell Oasis 13 2-IN-1 MTL-U", ALC289_FIXUP_DELL_CS35L41_SPI_2), SND_PCI_QUIRK(0x1028, 0x0cbf, "Dell Oasis 13 Low Weight MTU-L", ALC289_FIXUP_DELL_CS35L41_SPI_2), @@ -10448,7 +10498,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8be9, "HP Envy 15", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8bf0, "HP", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8c15, "HP Spectre x360 2-in-1 Laptop 14-eu0xxx", ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX), - SND_PCI_QUIRK(0x103c, 0x8c16, "HP Spectre 16", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x103c, 0x8c16, "HP Spectre x360 2-in-1 Laptop 16-aa0xxx", ALC245_FIXUP_HP_SPECTRE_X360_16_AA0XXX), SND_PCI_QUIRK(0x103c, 0x8c17, "HP Spectre 16", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x103c, 0x8c21, "HP Pavilion Plus Laptop 14-ey0XXX", ALC245_FIXUP_HP_X360_MUTE_LEDS), SND_PCI_QUIRK(0x103c, 0x8c30, "HP Victus 15-fb1xxx", ALC245_FIXUP_HP_MUTE_LED_COEFBIT), @@ -10501,11 +10551,15 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300), SND_PCI_QUIRK(0x1043, 0x106d, "Asus K53BE", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x1043, 0x10a1, "ASUS UX391UA", ALC294_FIXUP_ASUS_SPK), + SND_PCI_QUIRK(0x1043, 0x10a4, "ASUS TP3407SA", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1043, 0x10c0, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x10d0, "ASUS X540LA/X540LJ", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x10d3, "ASUS K6500ZC", ALC294_FIXUP_ASUS_SPK), + SND_PCI_QUIRK(0x1043, 0x1154, "ASUS TP3607SH", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1043, 0x115d, "Asus 1015E", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x1043, 0x11c0, "ASUS X556UR", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1043, 0x1204, "ASUS Strix G615JHR_JMR_JPR", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x1214, "ASUS Strix G615LH_LM_LP", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1043, 0x125e, "ASUS Q524UQK", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1271, "ASUS X430UN", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x1290, "ASUS X441SA", ALC233_FIXUP_EAPD_COEF_AND_MIC_NO_PRESENCE), @@ -10583,6 +10637,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x1e51, "ASUS Zephyrus M15", ALC294_FIXUP_ASUS_GU502_PINS), SND_PCI_QUIRK(0x1043, 0x1e5e, "ASUS ROG Strix G513", ALC294_FIXUP_ASUS_G513_PINS), SND_PCI_QUIRK(0x1043, 0x1e8e, "ASUS Zephyrus G15", ALC289_FIXUP_ASUS_GA401), + SND_PCI_QUIRK(0x1043, 0x1eb3, "ASUS Ally RCLA72", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1043, 0x1ed3, "ASUS HN7306W", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x1043, 0x1ee2, "ASUS UM6702RA/RC", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x1043, 0x1c52, "ASUS Zephyrus G15 2022", ALC289_FIXUP_ASUS_GA401), @@ -10597,6 +10652,13 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x3a40, "ASUS G814JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS), SND_PCI_QUIRK(0x1043, 0x3a50, "ASUS G834JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS), SND_PCI_QUIRK(0x1043, 0x3a60, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS), + SND_PCI_QUIRK(0x1043, 0x3e30, "ASUS TP3607SA", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3ee0, "ASUS Strix G815_JHR_JMR_JPR", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3ef0, "ASUS Strix G635LR_LW_LX", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f00, "ASUS Strix G815LH_LM_LP", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f10, "ASUS Strix G835LR_LW_LX", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f20, "ASUS Strix G615LR_LW", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f30, "ASUS Strix G815LR_LW", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x1043, 0x831a, "ASUS P901", ALC269_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x1043, 0x834a, "ASUS S101", ALC269_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x1043, 0x8398, "ASUS P1005", ALC269_FIXUP_STEREO_DMIC), @@ -10819,11 +10881,14 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x3878, "Lenovo Legion 7 Slim 16ARHA7", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x17aa, 0x387d, "Yoga S780-16 pro Quad AAC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x387e, "Yoga S780-16 pro Quad YC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x387f, "Yoga S780-16 pro dual LX", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x3880, "Yoga S780-16 pro dual YC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual power mode2 YC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3882, "Lenovo Yoga Pro 7 14APH8", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), SND_PCI_QUIRK(0x17aa, 0x3884, "Y780 YG DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3886, "Y780 VECO DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3891, "Lenovo Yoga Pro 7 14AHP9", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), + SND_PCI_QUIRK(0x17aa, 0x38a5, "Y580P AMD dual", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38a7, "Y780P AMD YG dual", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38a8, "Y780P AMD VECO dual", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38a9, "Thinkbook 16P", ALC287_FIXUP_MG_RTKC_CSAMP_CS35L41_I2C_THINKPAD), @@ -10832,6 +10897,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x38b5, "Legion Slim 7 16IRH8", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x17aa, 0x38b6, "Legion Slim 7 16APH8", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x17aa, 0x38b7, "Legion Slim 7 16APH8", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x17aa, 0x38b8, "Yoga S780-14.5 proX AMD YC Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38b9, "Yoga S780-14.5 proX AMD LX Dual", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38ba, "Yoga S780-14.5 Air AMD quad YC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38bb, "Yoga S780-14.5 Air AMD quad AAC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38be, "Yoga S980-14.5 proX YC Dual", ALC287_FIXUP_TAS2781_I2C), @@ -10842,12 +10909,22 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x38cb, "Y790 YG DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38cd, "Y790 VECO DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38d2, "Lenovo Yoga 9 14IMH9", ALC287_FIXUP_YOGA9_14IMH9_BASS_SPK_PIN), + SND_PCI_QUIRK(0x17aa, 0x38d3, "Yoga S990-16 Pro IMH YC Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38d4, "Yoga S990-16 Pro IMH VECO Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38d5, "Yoga S990-16 Pro IMH YC Quad", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38d6, "Yoga S990-16 Pro IMH VECO Quad", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38d7, "Lenovo Yoga 9 14IMH9", ALC287_FIXUP_YOGA9_14IMH9_BASS_SPK_PIN), + SND_PCI_QUIRK(0x17aa, 0x38df, "Yoga Y990 Intel YC Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38e0, "Yoga Y990 Intel VECO Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38f8, "Yoga Book 9i", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38df, "Y990 YG DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38f9, "Thinkbook 16P Gen5", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x17aa, 0x38fa, "Thinkbook 16P Gen5", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x17aa, 0x38fd, "ThinkBook plus Gen5 Hybrid", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x3913, "Lenovo 145", ALC236_FIXUP_LENOVO_INV_DMIC), + SND_PCI_QUIRK(0x17aa, 0x391f, "Yoga S990-16 pro Quad YC Quad", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x3920, "Yoga S990-16 pro Quad VECO Quad", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC), SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x3bf8, "Quanta FL1", ALC269_FIXUP_PCM_44K), diff --git a/sound/usb/line6/capture.c b/sound/usb/line6/capture.c index 970c9bdce0b2..84a9b7b76f43 100644 --- a/sound/usb/line6/capture.c +++ b/sound/usb/line6/capture.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/line6/capture.h b/sound/usb/line6/capture.h index 20e05a5eceb4..90572dae134e 100644 --- a/sound/usb/line6/capture.h +++ b/sound/usb/line6/capture.h @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #ifndef CAPTURE_H diff --git a/sound/usb/line6/driver.c b/sound/usb/line6/driver.c index 9df49a880b75..e9eb5c74d6c7 100644 --- a/sound/usb/line6/driver.c +++ b/sound/usb/line6/driver.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/kernel.h> @@ -20,7 +20,7 @@ #include "midi.h" #include "playback.h" -#define DRIVER_AUTHOR "Markus Grabner <grabner@icg.tugraz.at>" +#define DRIVER_AUTHOR "Markus Grabner <line6@grabner-graz.at>" #define DRIVER_DESC "Line 6 USB Driver" /* diff --git a/sound/usb/line6/driver.h b/sound/usb/line6/driver.h index dbb1d90d3647..5736ad4256a5 100644 --- a/sound/usb/line6/driver.h +++ b/sound/usb/line6/driver.h @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #ifndef DRIVER_H diff --git a/sound/usb/line6/midi.c b/sound/usb/line6/midi.c index 0838632c788e..9b5176086280 100644 --- a/sound/usb/line6/midi.c +++ b/sound/usb/line6/midi.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/line6/midi.h b/sound/usb/line6/midi.h index 918754e79be4..3409c742c173 100644 --- a/sound/usb/line6/midi.h +++ b/sound/usb/line6/midi.h @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #ifndef MIDI_H diff --git a/sound/usb/line6/midibuf.c b/sound/usb/line6/midibuf.c index e7f830f7526c..57fca134b337 100644 --- a/sound/usb/line6/midibuf.c +++ b/sound/usb/line6/midibuf.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/line6/midibuf.h b/sound/usb/line6/midibuf.h index 542e8d836f87..1dae5fac9dde 100644 --- a/sound/usb/line6/midibuf.h +++ b/sound/usb/line6/midibuf.h @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #ifndef MIDIBUF_H diff --git a/sound/usb/line6/pcm.c b/sound/usb/line6/pcm.c index 6a4af725aedd..d4dbbc432505 100644 --- a/sound/usb/line6/pcm.c +++ b/sound/usb/line6/pcm.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/line6/pcm.h b/sound/usb/line6/pcm.h index 9c683042ff06..a15913bf2a7a 100644 --- a/sound/usb/line6/pcm.h +++ b/sound/usb/line6/pcm.h @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ /* diff --git a/sound/usb/line6/playback.c b/sound/usb/line6/playback.c index 8233c61e23f1..9f26f66e6792 100644 --- a/sound/usb/line6/playback.c +++ b/sound/usb/line6/playback.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/line6/playback.h b/sound/usb/line6/playback.h index 2ca832c83851..2e0ec0ade0bf 100644 --- a/sound/usb/line6/playback.h +++ b/sound/usb/line6/playback.h @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #ifndef PLAYBACK_H diff --git a/sound/usb/line6/pod.c b/sound/usb/line6/pod.c index d173971e5f02..6f948c3e8f9e 100644 --- a/sound/usb/line6/pod.c +++ b/sound/usb/line6/pod.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/line6/toneport.c b/sound/usb/line6/toneport.c index e33df58740a9..ca2c6f5de407 100644 --- a/sound/usb/line6/toneport.c +++ b/sound/usb/line6/toneport.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) * Emil Myhrman (emil.myhrman@gmail.com) */ diff --git a/sound/usb/line6/variax.c b/sound/usb/line6/variax.c index c2245aa93b08..b2f6637c84b2 100644 --- a/sound/usb/line6/variax.c +++ b/sound/usb/line6/variax.c @@ -2,7 +2,7 @@ /* * Line 6 Linux USB driver * - * Copyright (C) 2004-2010 Markus Grabner (grabner@icg.tugraz.at) + * Copyright (C) 2004-2010 Markus Grabner (line6@grabner-graz.at) */ #include <linux/slab.h> diff --git a/sound/usb/mixer_scarlett2.c b/sound/usb/mixer_scarlett2.c index 1150cf104985..4cddf84db631 100644 --- a/sound/usb/mixer_scarlett2.c +++ b/sound/usb/mixer_scarlett2.c @@ -5613,6 +5613,8 @@ static int scarlett2_update_filter_values(struct usb_mixer_interface *mixer) info->peq_flt_total_count * SCARLETT2_BIQUAD_COEFFS, peq_flt_values); + if (err < 0) + return err; for (i = 0, dst_idx = 0; i < info->dsp_input_count; i++) { src_idx = i * diff --git a/sound/usb/stream.c b/sound/usb/stream.c index d70c140813d6..c1ea8844a46f 100644 --- a/sound/usb/stream.c +++ b/sound/usb/stream.c @@ -1067,6 +1067,7 @@ found_clock: UAC3_BADD_PD_ID10 : UAC3_BADD_PD_ID11; pd->pd_d1d0_rec = UAC3_BADD_PD_RECOVER_D1D0; pd->pd_d2d0_rec = UAC3_BADD_PD_RECOVER_D2D0; + pd->ctrl_iface = ctrl_intf; } else { fp->attributes = parse_uac_endpoint_attributes(chip, alts, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 1fb3cb2636e6..e8241b320c6d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5519,11 +5519,12 @@ union bpf_attr { * **-EOPNOTSUPP** if the hash calculation failed or **-EINVAL** if * invalid arguments are passed. * - * void *bpf_kptr_xchg(void *map_value, void *ptr) + * void *bpf_kptr_xchg(void *dst, void *ptr) * Description - * Exchange kptr at pointer *map_value* with *ptr*, and return the - * old value. *ptr* can be NULL, otherwise it must be a referenced - * pointer which will be released when this helper is called. + * Exchange kptr at pointer *dst* with *ptr*, and return the old value. + * *dst* can be map value or local kptr. *ptr* can be NULL, otherwise + * it must be a referenced pointer which will be released when this helper + * is called. * Return * The old value of kptr (which can be NULL). The returned pointer * if not NULL, is a reference which must be released using its @@ -6046,11 +6047,6 @@ enum { BPF_F_MARK_ENFORCE = (1ULL << 6), }; -/* BPF_FUNC_clone_redirect and BPF_FUNC_redirect flags. */ -enum { - BPF_F_INGRESS = (1ULL << 0), -}; - /* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */ enum { BPF_F_TUNINFO_IPV6 = (1ULL << 0), @@ -6197,10 +6193,12 @@ enum { BPF_F_BPRM_SECUREEXEC = (1ULL << 0), }; -/* Flags for bpf_redirect_map helper */ +/* Flags for bpf_redirect and bpf_redirect_map helpers */ enum { - BPF_F_BROADCAST = (1ULL << 3), - BPF_F_EXCLUDE_INGRESS = (1ULL << 4), + BPF_F_INGRESS = (1ULL << 0), /* used for skb path */ + BPF_F_BROADCAST = (1ULL << 3), /* used for XDP path */ + BPF_F_EXCLUDE_INGRESS = (1ULL << 4), /* used for XDP path */ +#define BPF_F_REDIRECT_FLAGS (BPF_F_INGRESS | BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS) }; #define __bpf_md_ptr(type, name) \ diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c index 1873ddbe16cc..551ae6898c1d 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -36317,6 +36317,28 @@ static inline int check_vma_modification(struct maple_tree *mt) return 0; } +/* + * test to check that bulk stores do not use wr_rebalance as the store + * type. + */ +static inline void check_bulk_rebalance(struct maple_tree *mt) +{ + MA_STATE(mas, mt, ULONG_MAX, ULONG_MAX); + int max = 10; + + build_full_tree(mt, 0, 2); + + /* erase every entry in the tree */ + do { + /* set up bulk store mode */ + mas_expected_entries(&mas, max); + mas_erase(&mas); + MT_BUG_ON(mt, mas.store_type == wr_rebalance); + } while (mas_prev(&mas, 0) != NULL); + + mas_destroy(&mas); +} + void farmer_tests(void) { struct maple_node *node; @@ -36328,6 +36350,10 @@ void farmer_tests(void) check_vma_modification(&tree); mtree_destroy(&tree); + mt_init(&tree); + check_bulk_rebalance(&tree); + mtree_destroy(&tree); + tree.ma_root = xa_mk_value(0); mt_dump(&tree, mt_dump_dec); @@ -36406,9 +36432,93 @@ void farmer_tests(void) check_nomem(&tree); } +static unsigned long get_last_index(struct ma_state *mas) +{ + struct maple_node *node = mas_mn(mas); + enum maple_type mt = mte_node_type(mas->node); + unsigned long *pivots = ma_pivots(node, mt); + unsigned long last_index = mas_data_end(mas); + + BUG_ON(last_index == 0); + + return pivots[last_index - 1] + 1; +} + +/* + * Assert that we handle spanning stores that consume the entirety of the right + * leaf node correctly. + */ +static void test_spanning_store_regression(void) +{ + unsigned long from = 0, to = 0; + DEFINE_MTREE(tree); + MA_STATE(mas, &tree, 0, 0); + + /* + * Build a 3-level tree. We require a parent node below the root node + * and 2 leaf nodes under it, so we can span the entirety of the right + * hand node. + */ + build_full_tree(&tree, 0, 3); + + /* Descend into position at depth 2. */ + mas_reset(&mas); + mas_start(&mas); + mas_descend(&mas); + mas_descend(&mas); + + /* + * We need to establish a tree like the below. + * + * Then we can try a store in [from, to] which results in a spanned + * store across nodes B and C, with the maple state at the time of the + * write being such that only the subtree at A and below is considered. + * + * Height + * 0 Root Node + * / \ + * pivot = to / \ pivot = ULONG_MAX + * / \ + * 1 A [-----] ... + * / \ + * pivot = from / \ pivot = to + * / \ + * 2 (LEAVES) B [-----] [-----] C + * ^--- Last pivot to. + */ + while (true) { + unsigned long tmp = get_last_index(&mas); + + if (mas_next_sibling(&mas)) { + from = tmp; + to = mas.max; + } else { + break; + } + } + + BUG_ON(from == 0 && to == 0); + + /* Perform the store. */ + mas_set_range(&mas, from, to); + mas_store_gfp(&mas, xa_mk_value(0xdead), GFP_KERNEL); + + /* If the regression occurs, the validation will fail. */ + mt_validate(&tree); + + /* Cleanup. */ + __mt_destroy(&tree); +} + +static void regression_tests(void) +{ + test_spanning_store_regression(); +} + void maple_tree_tests(void) { #if !defined(BENCH) + regression_tests(); farmer_tests(); #endif maple_tree_seed(); diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index f04af11df8eb..75016962f795 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -157,7 +157,8 @@ TEST_GEN_PROGS_EXTENDED = \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \ xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \ - xdp_features bpf_test_no_cfi.ko + xdp_features bpf_test_no_cfi.ko bpf_test_modorder_x.ko \ + bpf_test_modorder_y.ko TEST_GEN_FILES += liburandom_read.so urandom_read sign-file uprobe_multi @@ -263,7 +264,7 @@ $(OUTPUT)/%:%.c ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 riscv)) LLD := lld else -LLD := ld +LLD := $(shell command -v $(LD)) endif # Filter out -static for liburandom_read.so and its dependent targets so that static builds @@ -303,6 +304,19 @@ $(OUTPUT)/bpf_test_no_cfi.ko: $(VMLINUX_BTF) $(RESOLVE_BTFIDS) $(wildcard bpf_te $(Q)$(MAKE) $(submake_extras) RESOLVE_BTFIDS=$(RESOLVE_BTFIDS) -C bpf_test_no_cfi $(Q)cp bpf_test_no_cfi/bpf_test_no_cfi.ko $@ +$(OUTPUT)/bpf_test_modorder_x.ko: $(VMLINUX_BTF) $(RESOLVE_BTFIDS) $(wildcard bpf_test_modorder_x/Makefile bpf_test_modorder_x/*.[ch]) + $(call msg,MOD,,$@) + $(Q)$(RM) bpf_test_modorder_x/bpf_test_modorder_x.ko # force re-compilation + $(Q)$(MAKE) $(submake_extras) RESOLVE_BTFIDS=$(RESOLVE_BTFIDS) -C bpf_test_modorder_x + $(Q)cp bpf_test_modorder_x/bpf_test_modorder_x.ko $@ + +$(OUTPUT)/bpf_test_modorder_y.ko: $(VMLINUX_BTF) $(RESOLVE_BTFIDS) $(wildcard bpf_test_modorder_y/Makefile bpf_test_modorder_y/*.[ch]) + $(call msg,MOD,,$@) + $(Q)$(RM) bpf_test_modorder_y/bpf_test_modorder_y.ko # force re-compilation + $(Q)$(MAKE) $(submake_extras) RESOLVE_BTFIDS=$(RESOLVE_BTFIDS) -C bpf_test_modorder_y + $(Q)cp bpf_test_modorder_y/bpf_test_modorder_y.ko $@ + + DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool ifneq ($(CROSS_COMPILE),) CROSS_BPFTOOL := $(SCRATCH_DIR)/sbin/bpftool @@ -722,6 +736,8 @@ TRUNNER_EXTRA_SOURCES := test_progs.c \ ip_check_defrag_frags.h TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(OUTPUT)/bpf_test_no_cfi.ko \ + $(OUTPUT)/bpf_test_modorder_x.ko \ + $(OUTPUT)/bpf_test_modorder_y.ko \ $(OUTPUT)/liburandom_read.so \ $(OUTPUT)/xdp_synproxy \ $(OUTPUT)/sign-file \ @@ -856,6 +872,8 @@ EXTRA_CLEAN := $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \ $(addprefix $(OUTPUT)/,*.o *.d *.skel.h *.lskel.h *.subskel.h \ no_alu32 cpuv4 bpf_gcc bpf_testmod.ko \ bpf_test_no_cfi.ko \ + bpf_test_modorder_x.ko \ + bpf_test_modorder_y.ko \ liburandom_read.so) \ $(OUTPUT)/FEATURE-DUMP.selftests diff --git a/tools/testing/selftests/bpf/bpf_test_modorder_x/Makefile b/tools/testing/selftests/bpf/bpf_test_modorder_x/Makefile new file mode 100644 index 000000000000..40b25b98ad1b --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_test_modorder_x/Makefile @@ -0,0 +1,19 @@ +BPF_TESTMOD_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST))))) +KDIR ?= $(abspath $(BPF_TESTMOD_DIR)/../../../../..) + +ifeq ($(V),1) +Q = +else +Q = @ +endif + +MODULES = bpf_test_modorder_x.ko + +obj-m += bpf_test_modorder_x.o + +all: + +$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) modules + +clean: + +$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) clean + diff --git a/tools/testing/selftests/bpf/bpf_test_modorder_x/bpf_test_modorder_x.c b/tools/testing/selftests/bpf/bpf_test_modorder_x/bpf_test_modorder_x.c new file mode 100644 index 000000000000..0cc747fa912f --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_test_modorder_x/bpf_test_modorder_x.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <linux/btf.h> +#include <linux/module.h> +#include <linux/init.h> + +__bpf_kfunc_start_defs(); + +__bpf_kfunc int bpf_test_modorder_retx(void) +{ + return 'x'; +} + +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(bpf_test_modorder_kfunc_x_ids) +BTF_ID_FLAGS(func, bpf_test_modorder_retx); +BTF_KFUNCS_END(bpf_test_modorder_kfunc_x_ids) + +static const struct btf_kfunc_id_set bpf_test_modorder_x_set = { + .owner = THIS_MODULE, + .set = &bpf_test_modorder_kfunc_x_ids, +}; + +static int __init bpf_test_modorder_x_init(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, + &bpf_test_modorder_x_set); +} + +static void __exit bpf_test_modorder_x_exit(void) +{ +} + +module_init(bpf_test_modorder_x_init); +module_exit(bpf_test_modorder_x_exit); + +MODULE_DESCRIPTION("BPF selftest ordertest module X"); +MODULE_LICENSE("GPL"); diff --git a/tools/testing/selftests/bpf/bpf_test_modorder_y/Makefile b/tools/testing/selftests/bpf/bpf_test_modorder_y/Makefile new file mode 100644 index 000000000000..52c3ab9d84e2 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_test_modorder_y/Makefile @@ -0,0 +1,19 @@ +BPF_TESTMOD_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST))))) +KDIR ?= $(abspath $(BPF_TESTMOD_DIR)/../../../../..) + +ifeq ($(V),1) +Q = +else +Q = @ +endif + +MODULES = bpf_test_modorder_y.ko + +obj-m += bpf_test_modorder_y.o + +all: + +$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) modules + +clean: + +$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) clean + diff --git a/tools/testing/selftests/bpf/bpf_test_modorder_y/bpf_test_modorder_y.c b/tools/testing/selftests/bpf/bpf_test_modorder_y/bpf_test_modorder_y.c new file mode 100644 index 000000000000..c627ee085d13 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_test_modorder_y/bpf_test_modorder_y.c @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <linux/btf.h> +#include <linux/module.h> +#include <linux/init.h> + +__bpf_kfunc_start_defs(); + +__bpf_kfunc int bpf_test_modorder_rety(void) +{ + return 'y'; +} + +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(bpf_test_modorder_kfunc_y_ids) +BTF_ID_FLAGS(func, bpf_test_modorder_rety); +BTF_KFUNCS_END(bpf_test_modorder_kfunc_y_ids) + +static const struct btf_kfunc_id_set bpf_test_modorder_y_set = { + .owner = THIS_MODULE, + .set = &bpf_test_modorder_kfunc_y_ids, +}; + +static int __init bpf_test_modorder_y_init(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, + &bpf_test_modorder_y_set); +} + +static void __exit bpf_test_modorder_y_exit(void) +{ +} + +module_init(bpf_test_modorder_y_init); +module_exit(bpf_test_modorder_y_exit); + +MODULE_DESCRIPTION("BPF selftest ordertest module Y"); +MODULE_LICENSE("GPL"); diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index 52e6f7570475..f0a3a9c18e9e 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -226,7 +226,7 @@ static void test_task_common_nocheck(struct bpf_iter_attach_opts *opts, ASSERT_OK(pthread_create(&thread_id, NULL, &do_nothing_wait, NULL), "pthread_create"); - skel->bss->tid = getpid(); + skel->bss->tid = gettid(); do_dummy_read_opts(skel->progs.dump_task, opts); @@ -249,25 +249,42 @@ static void test_task_common(struct bpf_iter_attach_opts *opts, int num_unknown, ASSERT_EQ(num_known_tid, num_known, "check_num_known_tid"); } -static void test_task_tid(void) +static void *run_test_task_tid(void *arg) { LIBBPF_OPTS(bpf_iter_attach_opts, opts); union bpf_iter_link_info linfo; int num_unknown_tid, num_known_tid; + ASSERT_NEQ(getpid(), gettid(), "check_new_thread_id"); + memset(&linfo, 0, sizeof(linfo)); - linfo.task.tid = getpid(); + linfo.task.tid = gettid(); opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); test_task_common(&opts, 0, 1); linfo.task.tid = 0; linfo.task.pid = getpid(); - test_task_common(&opts, 1, 1); + /* This includes the parent thread, this thread, + * and the do_nothing_wait thread + */ + test_task_common(&opts, 2, 1); test_task_common_nocheck(NULL, &num_unknown_tid, &num_known_tid); - ASSERT_GT(num_unknown_tid, 1, "check_num_unknown_tid"); + ASSERT_GT(num_unknown_tid, 2, "check_num_unknown_tid"); ASSERT_EQ(num_known_tid, 1, "check_num_known_tid"); + + return NULL; +} + +static void test_task_tid(void) +{ + pthread_t thread_id; + + /* Create a new thread so pid and tid aren't the same */ + ASSERT_OK(pthread_create(&thread_id, NULL, &run_test_task_tid, NULL), + "pthread_create"); + ASSERT_FALSE(pthread_join(thread_id, NULL), "pthread_join"); } static void test_task_pid(void) diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_ancestor.c b/tools/testing/selftests/bpf/prog_tests/cgroup_ancestor.c index 9250a1e9f9af..3f9ffdf71343 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgroup_ancestor.c +++ b/tools/testing/selftests/bpf/prog_tests/cgroup_ancestor.c @@ -35,7 +35,7 @@ static int send_datagram(void) if (!ASSERT_OK_FD(sock, "create socket")) return sock; - if (!ASSERT_OK(connect(sock, &addr, sizeof(addr)), "connect")) { + if (!ASSERT_OK(connect(sock, (struct sockaddr *)&addr, sizeof(addr)), "connect")) { close(sock); return -1; } diff --git a/tools/testing/selftests/bpf/prog_tests/cpumask.c b/tools/testing/selftests/bpf/prog_tests/cpumask.c index 2570bd4b0cb2..e58a04654238 100644 --- a/tools/testing/selftests/bpf/prog_tests/cpumask.c +++ b/tools/testing/selftests/bpf/prog_tests/cpumask.c @@ -23,6 +23,7 @@ static const char * const cpumask_success_testcases[] = { "test_global_mask_array_l2_rcu", "test_global_mask_nested_rcu", "test_global_mask_nested_deep_rcu", + "test_global_mask_nested_deep_array_rcu", "test_cpumask_weight", }; diff --git a/tools/testing/selftests/bpf/prog_tests/fill_link_info.c b/tools/testing/selftests/bpf/prog_tests/fill_link_info.c index f3932941bbaa..d50cbd8040d4 100644 --- a/tools/testing/selftests/bpf/prog_tests/fill_link_info.c +++ b/tools/testing/selftests/bpf/prog_tests/fill_link_info.c @@ -67,8 +67,9 @@ again: ASSERT_EQ(info.perf_event.kprobe.cookie, PERF_EVENT_COOKIE, "kprobe_cookie"); + ASSERT_EQ(info.perf_event.kprobe.name_len, strlen(KPROBE_FUNC) + 1, + "name_len"); if (!info.perf_event.kprobe.func_name) { - ASSERT_EQ(info.perf_event.kprobe.name_len, 0, "name_len"); info.perf_event.kprobe.func_name = ptr_to_u64(&buf); info.perf_event.kprobe.name_len = sizeof(buf); goto again; @@ -79,8 +80,9 @@ again: ASSERT_EQ(err, 0, "cmp_kprobe_func_name"); break; case BPF_PERF_EVENT_TRACEPOINT: + ASSERT_EQ(info.perf_event.tracepoint.name_len, strlen(TP_NAME) + 1, + "name_len"); if (!info.perf_event.tracepoint.tp_name) { - ASSERT_EQ(info.perf_event.tracepoint.name_len, 0, "name_len"); info.perf_event.tracepoint.tp_name = ptr_to_u64(&buf); info.perf_event.tracepoint.name_len = sizeof(buf); goto again; @@ -96,8 +98,9 @@ again: case BPF_PERF_EVENT_URETPROBE: ASSERT_EQ(info.perf_event.uprobe.offset, offset, "uprobe_offset"); + ASSERT_EQ(info.perf_event.uprobe.name_len, strlen(UPROBE_FILE) + 1, + "name_len"); if (!info.perf_event.uprobe.file_name) { - ASSERT_EQ(info.perf_event.uprobe.name_len, 0, "name_len"); info.perf_event.uprobe.file_name = ptr_to_u64(&buf); info.perf_event.uprobe.name_len = sizeof(buf); goto again; @@ -417,6 +420,15 @@ verify_umulti_link_info(int fd, bool retprobe, __u64 *offsets, if (!ASSERT_NEQ(err, -1, "readlink")) return -1; + memset(&info, 0, sizeof(info)); + err = bpf_link_get_info_by_fd(fd, &info, &len); + if (!ASSERT_OK(err, "bpf_link_get_info_by_fd")) + return -1; + + ASSERT_EQ(info.uprobe_multi.count, 3, "info.uprobe_multi.count"); + ASSERT_EQ(info.uprobe_multi.path_size, strlen(path) + 1, + "info.uprobe_multi.path_size"); + for (bit = 0; bit < 8; bit++) { memset(&info, 0, sizeof(info)); info.uprobe_multi.path = ptr_to_u64(path_buf); diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_module_order.c b/tools/testing/selftests/bpf/prog_tests/kfunc_module_order.c new file mode 100644 index 000000000000..48c0560d398e --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_module_order.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> +#include <testing_helpers.h> + +#include "kfunc_module_order.skel.h" + +static int test_run_prog(const struct bpf_program *prog, + struct bpf_test_run_opts *opts) +{ + int err; + + err = bpf_prog_test_run_opts(bpf_program__fd(prog), opts); + if (!ASSERT_OK(err, "bpf_prog_test_run_opts")) + return err; + + if (!ASSERT_EQ((int)opts->retval, 0, bpf_program__name(prog))) + return -EINVAL; + + return 0; +} + +void test_kfunc_module_order(void) +{ + struct kfunc_module_order *skel; + char pkt_data[64] = {}; + int err = 0; + + DECLARE_LIBBPF_OPTS(bpf_test_run_opts, test_opts, .data_in = pkt_data, + .data_size_in = sizeof(pkt_data)); + + err = load_module("bpf_test_modorder_x.ko", + env_verbosity > VERBOSE_NONE); + if (!ASSERT_OK(err, "load bpf_test_modorder_x.ko")) + return; + + err = load_module("bpf_test_modorder_y.ko", + env_verbosity > VERBOSE_NONE); + if (!ASSERT_OK(err, "load bpf_test_modorder_y.ko")) + goto exit_modx; + + skel = kfunc_module_order__open_and_load(); + if (!ASSERT_OK_PTR(skel, "kfunc_module_order__open_and_load()")) { + err = -EINVAL; + goto exit_mods; + } + + test_run_prog(skel->progs.call_kfunc_xy, &test_opts); + test_run_prog(skel->progs.call_kfunc_yx, &test_opts); + + kfunc_module_order__destroy(skel); +exit_mods: + unload_module("bpf_test_modorder_y", env_verbosity > VERBOSE_NONE); +exit_modx: + unload_module("bpf_test_modorder_x", env_verbosity > VERBOSE_NONE); +} diff --git a/tools/testing/selftests/bpf/prog_tests/netfilter_link_attach.c b/tools/testing/selftests/bpf/prog_tests/netfilter_link_attach.c index 4297a2a4cb11..2f52fa2641ba 100644 --- a/tools/testing/selftests/bpf/prog_tests/netfilter_link_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/netfilter_link_attach.c @@ -26,10 +26,43 @@ static const struct nf_link_test nf_hook_link_tests[] = { { .pf = NFPROTO_INET, .priority = 1, .name = "invalid-inet-not-supported", }, - { .pf = NFPROTO_IPV4, .priority = -10000, .expect_success = true, .name = "attach ipv4", }, - { .pf = NFPROTO_IPV6, .priority = 10001, .expect_success = true, .name = "attach ipv6", }, + { + .pf = NFPROTO_IPV4, + .hooknum = NF_INET_POST_ROUTING, + .priority = -10000, + .flags = 0, + .expect_success = true, + .name = "attach ipv4", + }, + { + .pf = NFPROTO_IPV6, + .hooknum = NF_INET_FORWARD, + .priority = 10001, + .flags = BPF_F_NETFILTER_IP_DEFRAG, + .expect_success = true, + .name = "attach ipv6", + }, }; +static void verify_netfilter_link_info(struct bpf_link *link, const struct nf_link_test nf_expected) +{ + struct bpf_link_info info; + __u32 len = sizeof(info); + int err, fd; + + memset(&info, 0, len); + + fd = bpf_link__fd(link); + err = bpf_link_get_info_by_fd(fd, &info, &len); + ASSERT_OK(err, "get_link_info"); + + ASSERT_EQ(info.type, BPF_LINK_TYPE_NETFILTER, "info link type"); + ASSERT_EQ(info.netfilter.pf, nf_expected.pf, "info nf protocol family"); + ASSERT_EQ(info.netfilter.hooknum, nf_expected.hooknum, "info nf hooknum"); + ASSERT_EQ(info.netfilter.priority, nf_expected.priority, "info nf priority"); + ASSERT_EQ(info.netfilter.flags, nf_expected.flags, "info nf flags"); +} + void test_netfilter_link_attach(void) { struct test_netfilter_link_attach *skel; @@ -64,6 +97,8 @@ void test_netfilter_link_attach(void) if (!ASSERT_OK_PTR(link, "program attach successful")) continue; + verify_netfilter_link_info(link, nf_hook_link_tests[i]); + link2 = bpf_program__attach_netfilter(prog, &opts); ASSERT_ERR_PTR(link2, "attach program with same pf/hook/priority"); @@ -73,6 +108,9 @@ void test_netfilter_link_attach(void) link2 = bpf_program__attach_netfilter(prog, &opts); if (!ASSERT_OK_PTR(link2, "program reattach successful")) continue; + + verify_netfilter_link_info(link2, nf_hook_link_tests[i]); + if (!ASSERT_OK(bpf_link__destroy(link2), "link destroy")) break; } else { diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index e26b5150fc43..5356f26bbb3f 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -44,6 +44,7 @@ #include "verifier_ld_ind.skel.h" #include "verifier_ldsx.skel.h" #include "verifier_leak_ptr.skel.h" +#include "verifier_linked_scalars.skel.h" #include "verifier_loops1.skel.h" #include "verifier_lwt.skel.h" #include "verifier_map_in_map.skel.h" @@ -170,6 +171,7 @@ void test_verifier_jit_convergence(void) { RUN(verifier_jit_convergence); } void test_verifier_ld_ind(void) { RUN(verifier_ld_ind); } void test_verifier_ldsx(void) { RUN(verifier_ldsx); } void test_verifier_leak_ptr(void) { RUN(verifier_leak_ptr); } +void test_verifier_linked_scalars(void) { RUN(verifier_linked_scalars); } void test_verifier_loops1(void) { RUN(verifier_loops1); } void test_verifier_lwt(void) { RUN(verifier_lwt); } void test_verifier_map_in_map(void) { RUN(verifier_map_in_map); } diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c index ce6812558287..27ffed17d4be 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c @@ -1,6 +1,9 @@ // SPDX-License-Identifier: GPL-2.0 +#include <arpa/inet.h> #include <uapi/linux/bpf.h> #include <linux/if_link.h> +#include <network_helpers.h> +#include <net/if.h> #include <test_progs.h> #include "test_xdp_devmap_helpers.skel.h" @@ -8,31 +11,36 @@ #include "test_xdp_with_devmap_helpers.skel.h" #define IFINDEX_LO 1 +#define TEST_NS "devmap_attach_ns" static void test_xdp_with_devmap_helpers(void) { - struct test_xdp_with_devmap_helpers *skel; + struct test_xdp_with_devmap_helpers *skel = NULL; struct bpf_prog_info info = {}; struct bpf_devmap_val val = { .ifindex = IFINDEX_LO, }; __u32 len = sizeof(info); - int err, dm_fd, map_fd; + int err, dm_fd, dm_fd_redir, map_fd; + struct nstoken *nstoken = NULL; + char data[10] = {}; __u32 idx = 0; + SYS(out_close, "ip netns add %s", TEST_NS); + nstoken = open_netns(TEST_NS); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto out_close; + SYS(out_close, "ip link set dev lo up"); skel = test_xdp_with_devmap_helpers__open_and_load(); if (!ASSERT_OK_PTR(skel, "test_xdp_with_devmap_helpers__open_and_load")) - return; + goto out_close; - dm_fd = bpf_program__fd(skel->progs.xdp_redir_prog); - err = bpf_xdp_attach(IFINDEX_LO, dm_fd, XDP_FLAGS_SKB_MODE, NULL); + dm_fd_redir = bpf_program__fd(skel->progs.xdp_redir_prog); + err = bpf_xdp_attach(IFINDEX_LO, dm_fd_redir, XDP_FLAGS_SKB_MODE, NULL); if (!ASSERT_OK(err, "Generic attach of program with 8-byte devmap")) goto out_close; - err = bpf_xdp_detach(IFINDEX_LO, XDP_FLAGS_SKB_MODE, NULL); - ASSERT_OK(err, "XDP program detach"); - dm_fd = bpf_program__fd(skel->progs.xdp_dummy_dm); map_fd = bpf_map__fd(skel->maps.dm_ports); err = bpf_prog_get_info_by_fd(dm_fd, &info, &len); @@ -47,6 +55,22 @@ static void test_xdp_with_devmap_helpers(void) ASSERT_OK(err, "Read devmap entry"); ASSERT_EQ(info.id, val.bpf_prog.id, "Match program id to devmap entry prog_id"); + /* send a packet to trigger any potential bugs in there */ + DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &data, + .data_size_in = 10, + .flags = BPF_F_TEST_XDP_LIVE_FRAMES, + .repeat = 1, + ); + err = bpf_prog_test_run_opts(dm_fd_redir, &opts); + ASSERT_OK(err, "XDP test run"); + + /* wait for the packets to be flushed */ + kern_sync_rcu(); + + err = bpf_xdp_detach(IFINDEX_LO, XDP_FLAGS_SKB_MODE, NULL); + ASSERT_OK(err, "XDP program detach"); + /* can not attach BPF_XDP_DEVMAP program to a device */ err = bpf_xdp_attach(IFINDEX_LO, dm_fd, XDP_FLAGS_SKB_MODE, NULL); if (!ASSERT_NEQ(err, 0, "Attach of BPF_XDP_DEVMAP program")) @@ -67,6 +91,8 @@ static void test_xdp_with_devmap_helpers(void) ASSERT_NEQ(err, 0, "Add BPF_XDP program with frags to devmap entry"); out_close: + close_netns(nstoken); + SYS_NOFAIL("ip netns del %s", TEST_NS); test_xdp_with_devmap_helpers__destroy(skel); } @@ -124,6 +150,86 @@ out_close: test_xdp_with_devmap_frags_helpers__destroy(skel); } +static void test_xdp_with_devmap_helpers_veth(void) +{ + struct test_xdp_with_devmap_helpers *skel = NULL; + struct bpf_prog_info info = {}; + struct bpf_devmap_val val = {}; + struct nstoken *nstoken = NULL; + __u32 len = sizeof(info); + int err, dm_fd, dm_fd_redir, map_fd, ifindex_dst; + char data[10] = {}; + __u32 idx = 0; + + SYS(out_close, "ip netns add %s", TEST_NS); + nstoken = open_netns(TEST_NS); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto out_close; + + SYS(out_close, "ip link add veth_src type veth peer name veth_dst"); + SYS(out_close, "ip link set dev veth_src up"); + SYS(out_close, "ip link set dev veth_dst up"); + + val.ifindex = if_nametoindex("veth_src"); + ifindex_dst = if_nametoindex("veth_dst"); + if (!ASSERT_NEQ(val.ifindex, 0, "val.ifindex") || + !ASSERT_NEQ(ifindex_dst, 0, "ifindex_dst")) + goto out_close; + + skel = test_xdp_with_devmap_helpers__open_and_load(); + if (!ASSERT_OK_PTR(skel, "test_xdp_with_devmap_helpers__open_and_load")) + goto out_close; + + dm_fd_redir = bpf_program__fd(skel->progs.xdp_redir_prog); + err = bpf_xdp_attach(val.ifindex, dm_fd_redir, XDP_FLAGS_DRV_MODE, NULL); + if (!ASSERT_OK(err, "Attach of program with 8-byte devmap")) + goto out_close; + + dm_fd = bpf_program__fd(skel->progs.xdp_dummy_dm); + map_fd = bpf_map__fd(skel->maps.dm_ports); + err = bpf_prog_get_info_by_fd(dm_fd, &info, &len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) + goto out_close; + + val.bpf_prog.fd = dm_fd; + err = bpf_map_update_elem(map_fd, &idx, &val, 0); + ASSERT_OK(err, "Add program to devmap entry"); + + err = bpf_map_lookup_elem(map_fd, &idx, &val); + ASSERT_OK(err, "Read devmap entry"); + ASSERT_EQ(info.id, val.bpf_prog.id, "Match program id to devmap entry prog_id"); + + /* attach dummy to other side to enable reception */ + dm_fd = bpf_program__fd(skel->progs.xdp_dummy_prog); + err = bpf_xdp_attach(ifindex_dst, dm_fd, XDP_FLAGS_DRV_MODE, NULL); + if (!ASSERT_OK(err, "Attach of dummy XDP")) + goto out_close; + + /* send a packet to trigger any potential bugs in there */ + DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &data, + .data_size_in = 10, + .flags = BPF_F_TEST_XDP_LIVE_FRAMES, + .repeat = 1, + ); + err = bpf_prog_test_run_opts(dm_fd_redir, &opts); + ASSERT_OK(err, "XDP test run"); + + /* wait for the packets to be flushed */ + kern_sync_rcu(); + + err = bpf_xdp_detach(val.ifindex, XDP_FLAGS_DRV_MODE, NULL); + ASSERT_OK(err, "XDP program detach"); + + err = bpf_xdp_detach(ifindex_dst, XDP_FLAGS_DRV_MODE, NULL); + ASSERT_OK(err, "XDP program detach"); + +out_close: + close_netns(nstoken); + SYS_NOFAIL("ip netns del %s", TEST_NS); + test_xdp_with_devmap_helpers__destroy(skel); +} + void serial_test_xdp_devmap_attach(void) { if (test__start_subtest("DEVMAP with programs in entries")) @@ -134,4 +240,7 @@ void serial_test_xdp_devmap_attach(void) if (test__start_subtest("Verifier check of DEVMAP programs")) test_neg_xdp_devmap_helpers(); + + if (test__start_subtest("DEVMAP with programs in entries on veth")) + test_xdp_with_devmap_helpers_veth(); } diff --git a/tools/testing/selftests/bpf/progs/cpumask_common.h b/tools/testing/selftests/bpf/progs/cpumask_common.h index b979e91f55f0..4ece7873ba60 100644 --- a/tools/testing/selftests/bpf/progs/cpumask_common.h +++ b/tools/testing/selftests/bpf/progs/cpumask_common.h @@ -7,6 +7,11 @@ #include "errno.h" #include <stdbool.h> +/* Should use BTF_FIELDS_MAX, but it is not always available in vmlinux.h, + * so use the hard-coded number as a workaround. + */ +#define CPUMASK_KPTR_FIELDS_MAX 11 + int err; #define private(name) SEC(".bss." #name) __attribute__((aligned(8))) diff --git a/tools/testing/selftests/bpf/progs/cpumask_failure.c b/tools/testing/selftests/bpf/progs/cpumask_failure.c index a988d2823b52..b40b52548ffb 100644 --- a/tools/testing/selftests/bpf/progs/cpumask_failure.c +++ b/tools/testing/selftests/bpf/progs/cpumask_failure.c @@ -10,6 +10,21 @@ char _license[] SEC("license") = "GPL"; +struct kptr_nested_array_2 { + struct bpf_cpumask __kptr * mask; +}; + +struct kptr_nested_array_1 { + /* Make btf_parse_fields() in map_create() return -E2BIG */ + struct kptr_nested_array_2 d_2[CPUMASK_KPTR_FIELDS_MAX + 1]; +}; + +struct kptr_nested_array { + struct kptr_nested_array_1 d_1; +}; + +private(MASK_NESTED) static struct kptr_nested_array global_mask_nested_arr; + /* Prototype for all of the program trace events below: * * TRACE_EVENT(task_newtask, @@ -187,3 +202,23 @@ int BPF_PROG(test_global_mask_rcu_no_null_check, struct task_struct *task, u64 c return 0; } + +SEC("tp_btf/task_newtask") +__failure __msg("has no valid kptr") +int BPF_PROG(test_invalid_nested_array, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *local, *prev; + + local = create_cpumask(); + if (!local) + return 0; + + prev = bpf_kptr_xchg(&global_mask_nested_arr.d_1.d_2[CPUMASK_KPTR_FIELDS_MAX].mask, local); + if (prev) { + bpf_cpumask_release(prev); + err = 3; + return 0; + } + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/cpumask_success.c b/tools/testing/selftests/bpf/progs/cpumask_success.c index fd8106831c32..80ee469b0b60 100644 --- a/tools/testing/selftests/bpf/progs/cpumask_success.c +++ b/tools/testing/selftests/bpf/progs/cpumask_success.c @@ -31,11 +31,59 @@ struct kptr_nested_deep { struct kptr_nested_pair ptr_pairs[3]; }; +struct kptr_nested_deep_array_1_2 { + int dummy; + struct bpf_cpumask __kptr * mask[CPUMASK_KPTR_FIELDS_MAX]; +}; + +struct kptr_nested_deep_array_1_1 { + int dummy; + struct kptr_nested_deep_array_1_2 d_2; +}; + +struct kptr_nested_deep_array_1 { + long dummy; + struct kptr_nested_deep_array_1_1 d_1; +}; + +struct kptr_nested_deep_array_2_2 { + long dummy[2]; + struct bpf_cpumask __kptr * mask; +}; + +struct kptr_nested_deep_array_2_1 { + int dummy; + struct kptr_nested_deep_array_2_2 d_2[CPUMASK_KPTR_FIELDS_MAX]; +}; + +struct kptr_nested_deep_array_2 { + long dummy; + struct kptr_nested_deep_array_2_1 d_1; +}; + +struct kptr_nested_deep_array_3_2 { + long dummy[2]; + struct bpf_cpumask __kptr * mask; +}; + +struct kptr_nested_deep_array_3_1 { + int dummy; + struct kptr_nested_deep_array_3_2 d_2; +}; + +struct kptr_nested_deep_array_3 { + long dummy; + struct kptr_nested_deep_array_3_1 d_1[CPUMASK_KPTR_FIELDS_MAX]; +}; + private(MASK) static struct bpf_cpumask __kptr * global_mask_array[2]; private(MASK) static struct bpf_cpumask __kptr * global_mask_array_l2[2][1]; private(MASK) static struct bpf_cpumask __kptr * global_mask_array_one[1]; private(MASK) static struct kptr_nested global_mask_nested[2]; private(MASK_DEEP) static struct kptr_nested_deep global_mask_nested_deep; +private(MASK_1) static struct kptr_nested_deep_array_1 global_mask_nested_deep_array_1; +private(MASK_2) static struct kptr_nested_deep_array_2 global_mask_nested_deep_array_2; +private(MASK_3) static struct kptr_nested_deep_array_3 global_mask_nested_deep_array_3; static bool is_test_task(void) { @@ -543,12 +591,21 @@ static int _global_mask_array_rcu(struct bpf_cpumask **mask0, goto err_exit; } - /* [<mask 0>, NULL] */ - if (!*mask0 || *mask1) { + /* [<mask 0>, *] */ + if (!*mask0) { err = 2; goto err_exit; } + if (!mask1) + goto err_exit; + + /* [*, NULL] */ + if (*mask1) { + err = 3; + goto err_exit; + } + local = create_cpumask(); if (!local) { err = 9; @@ -632,6 +689,23 @@ int BPF_PROG(test_global_mask_nested_deep_rcu, struct task_struct *task, u64 clo } SEC("tp_btf/task_newtask") +int BPF_PROG(test_global_mask_nested_deep_array_rcu, struct task_struct *task, u64 clone_flags) +{ + int i; + + for (i = 0; i < CPUMASK_KPTR_FIELDS_MAX; i++) + _global_mask_array_rcu(&global_mask_nested_deep_array_1.d_1.d_2.mask[i], NULL); + + for (i = 0; i < CPUMASK_KPTR_FIELDS_MAX; i++) + _global_mask_array_rcu(&global_mask_nested_deep_array_2.d_1.d_2[i].mask, NULL); + + for (i = 0; i < CPUMASK_KPTR_FIELDS_MAX; i++) + _global_mask_array_rcu(&global_mask_nested_deep_array_3.d_1[i].d_2.mask, NULL); + + return 0; +} + +SEC("tp_btf/task_newtask") int BPF_PROG(test_cpumask_weight, struct task_struct *task, u64 clone_flags) { struct bpf_cpumask *local; diff --git a/tools/testing/selftests/bpf/progs/kfunc_module_order.c b/tools/testing/selftests/bpf/progs/kfunc_module_order.c new file mode 100644 index 000000000000..76003d04c95f --- /dev/null +++ b/tools/testing/selftests/bpf/progs/kfunc_module_order.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +extern int bpf_test_modorder_retx(void) __ksym; +extern int bpf_test_modorder_rety(void) __ksym; + +SEC("classifier") +int call_kfunc_xy(struct __sk_buff *skb) +{ + int ret1, ret2; + + ret1 = bpf_test_modorder_retx(); + ret2 = bpf_test_modorder_rety(); + + return ret1 == 'x' && ret2 == 'y' ? 0 : -1; +} + +SEC("classifier") +int call_kfunc_yx(struct __sk_buff *skb) +{ + int ret1, ret2; + + ret1 = bpf_test_modorder_rety(); + ret2 = bpf_test_modorder_retx(); + + return ret1 == 'y' && ret2 == 'x' ? 0 : -1; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c b/tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c index 4139a14f9996..92b65a485d4a 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c @@ -12,7 +12,7 @@ struct { SEC("xdp") int xdp_redir_prog(struct xdp_md *ctx) { - return bpf_redirect_map(&dm_ports, 1, 0); + return bpf_redirect_map(&dm_ports, 0, 0); } /* invalid program on DEVMAP entry; diff --git a/tools/testing/selftests/bpf/progs/verifier_linked_scalars.c b/tools/testing/selftests/bpf/progs/verifier_linked_scalars.c new file mode 100644 index 000000000000..8f755d2464cf --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_linked_scalars.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +SEC("socket") +__description("scalars: find linked scalars") +__failure +__msg("math between fp pointer and 2147483647 is not allowed") +__naked void scalars(void) +{ + asm volatile (" \ + r0 = 0; \ + r1 = 0x80000001 ll; \ + r1 /= 1; \ + r2 = r1; \ + r4 = r1; \ + w2 += 0x7FFFFFFF; \ + w4 += 0; \ + if r2 == 0 goto l1; \ + exit; \ +l1: \ + r4 >>= 63; \ + r3 = 1; \ + r3 -= r4; \ + r3 *= 0x7FFFFFFF; \ + r3 += r10; \ + *(u8*)(r3 - 1) = r0; \ + exit; \ +" ::: __clobber_all); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c index 028ec855587b..994bbc346d25 100644 --- a/tools/testing/selftests/bpf/progs/verifier_movsx.c +++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c @@ -287,6 +287,46 @@ l0_%=: \ : __clobber_all); } +SEC("socket") +__description("MOV64SX, S8, unsigned range_check") +__success __retval(0) +__naked void mov64sx_s8_range_check(void) +{ + asm volatile (" \ + call %[bpf_get_prandom_u32]; \ + r0 &= 0x1; \ + r0 += 0xfe; \ + r0 = (s8)r0; \ + if r0 < 0xfffffffffffffffe goto label_%=; \ + r0 = 0; \ + exit; \ +label_%=: \ + exit; \ +" : + : __imm(bpf_get_prandom_u32) + : __clobber_all); +} + +SEC("socket") +__description("MOV32SX, S8, unsigned range_check") +__success __retval(0) +__naked void mov32sx_s8_range_check(void) +{ + asm volatile (" \ + call %[bpf_get_prandom_u32]; \ + w0 &= 0x1; \ + w0 += 0xfe; \ + w0 = (s8)w0; \ + if w0 < 0xfffffffe goto label_%=; \ + r0 = 0; \ + exit; \ +label_%=: \ + exit; \ + " : + : __imm(bpf_get_prandom_u32) + : __clobber_all); +} + #else SEC("socket") diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c index 2ecf77b623e0..7c5e5e6d10eb 100644 --- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c +++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c @@ -760,4 +760,71 @@ __naked void two_old_ids_one_cur_id(void) : __clobber_all); } +SEC("socket") +/* Note the flag, see verifier.c:opt_subreg_zext_lo32_rnd_hi32() */ +__flag(BPF_F_TEST_RND_HI32) +__success +/* This test was added because of a bug in verifier.c:sync_linked_regs(), + * upon range propagation it destroyed subreg_def marks for registers. + * The subreg_def mark is used to decide whether zero extension instructions + * are needed when register is read. When BPF_F_TEST_RND_HI32 is set it + * also causes generation of statements to randomize upper halves of + * read registers. + * + * The test is written in a way to return an upper half of a register + * that is affected by range propagation and must have it's subreg_def + * preserved. This gives a return value of 0 and leads to undefined + * return value if subreg_def mark is not preserved. + */ +__retval(0) +/* Check that verifier believes r1/r0 are zero at exit */ +__log_level(2) +__msg("4: (77) r1 >>= 32 ; R1_w=0") +__msg("5: (bf) r0 = r1 ; R0_w=0 R1_w=0") +__msg("6: (95) exit") +__msg("from 3 to 4") +__msg("4: (77) r1 >>= 32 ; R1_w=0") +__msg("5: (bf) r0 = r1 ; R0_w=0 R1_w=0") +__msg("6: (95) exit") +/* Verify that statements to randomize upper half of r1 had not been + * generated. + */ +__xlated("call unknown") +__xlated("r0 &= 2147483647") +__xlated("w1 = w0") +/* This is how disasm.c prints BPF_ZEXT_REG at the moment, x86 and arm + * are the only CI archs that do not need zero extension for subregs. + */ +#if !defined(__TARGET_ARCH_x86) && !defined(__TARGET_ARCH_arm64) +__xlated("w1 = w1") +#endif +__xlated("if w0 < 0xa goto pc+0") +__xlated("r1 >>= 32") +__xlated("r0 = r1") +__xlated("exit") +__naked void linked_regs_and_subreg_def(void) +{ + asm volatile ( + "call %[bpf_ktime_get_ns];" + /* make sure r0 is in 32-bit range, otherwise w1 = w0 won't + * assign same IDs to registers. + */ + "r0 &= 0x7fffffff;" + /* link w1 and w0 via ID */ + "w1 = w0;" + /* 'if' statement propagates range info from w0 to w1, + * but should not affect w1->subreg_def property. + */ + "if w0 < 10 goto +0;" + /* r1 is read here, on archs that require subreg zero + * extension this would cause zext patch generation. + */ + "r1 >>= 32;" + "r0 = r1;" + "exit;" + : + : __imm(bpf_ktime_get_ns) + : __clobber_all); +} + char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c index d3c3c3a24150..5e9f16683be5 100644 --- a/tools/testing/selftests/bpf/testing_helpers.c +++ b/tools/testing/selftests/bpf/testing_helpers.c @@ -367,7 +367,7 @@ int delete_module(const char *name, int flags) return syscall(__NR_delete_module, name, flags); } -int unload_bpf_testmod(bool verbose) +int unload_module(const char *name, bool verbose) { int ret, cnt = 0; @@ -375,11 +375,11 @@ int unload_bpf_testmod(bool verbose) fprintf(stdout, "Failed to trigger kernel-side RCU sync!\n"); for (;;) { - ret = delete_module("bpf_testmod", 0); + ret = delete_module(name, 0); if (!ret || errno != EAGAIN) break; if (++cnt > 10000) { - fprintf(stdout, "Unload of bpf_testmod timed out\n"); + fprintf(stdout, "Unload of %s timed out\n", name); break; } usleep(100); @@ -388,41 +388,51 @@ int unload_bpf_testmod(bool verbose) if (ret) { if (errno == ENOENT) { if (verbose) - fprintf(stdout, "bpf_testmod.ko is already unloaded.\n"); + fprintf(stdout, "%s.ko is already unloaded.\n", name); return -1; } - fprintf(stdout, "Failed to unload bpf_testmod.ko from kernel: %d\n", -errno); + fprintf(stdout, "Failed to unload %s.ko from kernel: %d\n", name, -errno); return -1; } if (verbose) - fprintf(stdout, "Successfully unloaded bpf_testmod.ko.\n"); + fprintf(stdout, "Successfully unloaded %s.ko.\n", name); return 0; } -int load_bpf_testmod(bool verbose) +int load_module(const char *path, bool verbose) { int fd; if (verbose) - fprintf(stdout, "Loading bpf_testmod.ko...\n"); + fprintf(stdout, "Loading %s...\n", path); - fd = open("bpf_testmod.ko", O_RDONLY); + fd = open(path, O_RDONLY); if (fd < 0) { - fprintf(stdout, "Can't find bpf_testmod.ko kernel module: %d\n", -errno); + fprintf(stdout, "Can't find %s kernel module: %d\n", path, -errno); return -ENOENT; } if (finit_module(fd, "", 0)) { - fprintf(stdout, "Failed to load bpf_testmod.ko into the kernel: %d\n", -errno); + fprintf(stdout, "Failed to load %s into the kernel: %d\n", path, -errno); close(fd); return -EINVAL; } close(fd); if (verbose) - fprintf(stdout, "Successfully loaded bpf_testmod.ko.\n"); + fprintf(stdout, "Successfully loaded %s.\n", path); return 0; } +int unload_bpf_testmod(bool verbose) +{ + return unload_module("bpf_testmod", verbose); +} + +int load_bpf_testmod(bool verbose) +{ + return load_module("bpf_testmod.ko", verbose); +} + /* * Trigger synchronize_rcu() in kernel. */ diff --git a/tools/testing/selftests/bpf/testing_helpers.h b/tools/testing/selftests/bpf/testing_helpers.h index d55f6ab12433..46d7f7089f63 100644 --- a/tools/testing/selftests/bpf/testing_helpers.h +++ b/tools/testing/selftests/bpf/testing_helpers.h @@ -38,6 +38,8 @@ int unload_bpf_testmod(bool verbose); int kern_sync_rcu(void); int finit_module(int fd, const char *param_values, int flags); int delete_module(const char *name, int flags); +int load_module(const char *path, bool verbose); +int unload_module(const char *name, bool verbose); static inline __u64 get_time_ns(void) { diff --git a/tools/testing/selftests/hid/Makefile b/tools/testing/selftests/hid/Makefile index 38ae31bb07b5..662209f5fabc 100644 --- a/tools/testing/selftests/hid/Makefile +++ b/tools/testing/selftests/hid/Makefile @@ -18,6 +18,7 @@ TEST_PROGS += hid-usb_crash.sh TEST_PROGS += hid-wacom.sh TEST_FILES := run-hid-tools-tests.sh +TEST_FILES += tests CXX ?= $(CROSS_COMPILE)g++ diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 960cf6a77198..156fbfae940f 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -248,6 +248,9 @@ CFLAGS += -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 \ ifeq ($(ARCH),s390) CFLAGS += -march=z10 endif +ifeq ($(ARCH),x86) + CFLAGS += -march=x86-64-v2 +endif ifeq ($(ARCH),arm64) tools_dir := $(top_srcdir)/tools arm64_tools_dir := $(tools_dir)/arch/arm64/tools/ diff --git a/tools/testing/selftests/kvm/aarch64/set_id_regs.c b/tools/testing/selftests/kvm/aarch64/set_id_regs.c index 2a3fe7914b72..b87e53580bfc 100644 --- a/tools/testing/selftests/kvm/aarch64/set_id_regs.c +++ b/tools/testing/selftests/kvm/aarch64/set_id_regs.c @@ -68,6 +68,8 @@ struct test_feature_reg { } static const struct reg_ftr_bits ftr_id_aa64dfr0_el1[] = { + S_REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, DoubleLock, 0), + REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, WRPs, 0), S_REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, PMUVer, 0), REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64DFR0_EL1, DebugVer, ID_AA64DFR0_EL1_DebugVer_IMP), REG_FTR_END, @@ -134,6 +136,13 @@ static const struct reg_ftr_bits ftr_id_aa64pfr0_el1[] = { REG_FTR_END, }; +static const struct reg_ftr_bits ftr_id_aa64pfr1_el1[] = { + REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64PFR1_EL1, CSV2_frac, 0), + REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64PFR1_EL1, SSBS, ID_AA64PFR1_EL1_SSBS_NI), + REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64PFR1_EL1, BT, 0), + REG_FTR_END, +}; + static const struct reg_ftr_bits ftr_id_aa64mmfr0_el1[] = { REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, ECV, 0), REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, EXS, 0), @@ -200,6 +209,7 @@ static struct test_feature_reg test_regs[] = { TEST_REG(SYS_ID_AA64ISAR1_EL1, ftr_id_aa64isar1_el1), TEST_REG(SYS_ID_AA64ISAR2_EL1, ftr_id_aa64isar2_el1), TEST_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0_el1), + TEST_REG(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1_el1), TEST_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0_el1), TEST_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1_el1), TEST_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2_el1), @@ -569,9 +579,9 @@ int main(void) test_cnt = ARRAY_SIZE(ftr_id_aa64dfr0_el1) + ARRAY_SIZE(ftr_id_dfr0_el1) + ARRAY_SIZE(ftr_id_aa64isar0_el1) + ARRAY_SIZE(ftr_id_aa64isar1_el1) + ARRAY_SIZE(ftr_id_aa64isar2_el1) + ARRAY_SIZE(ftr_id_aa64pfr0_el1) + - ARRAY_SIZE(ftr_id_aa64mmfr0_el1) + ARRAY_SIZE(ftr_id_aa64mmfr1_el1) + - ARRAY_SIZE(ftr_id_aa64mmfr2_el1) + ARRAY_SIZE(ftr_id_aa64zfr0_el1) - - ARRAY_SIZE(test_regs) + 2; + ARRAY_SIZE(ftr_id_aa64pfr1_el1) + ARRAY_SIZE(ftr_id_aa64mmfr0_el1) + + ARRAY_SIZE(ftr_id_aa64mmfr1_el1) + ARRAY_SIZE(ftr_id_aa64mmfr2_el1) + + ARRAY_SIZE(ftr_id_aa64zfr0_el1) - ARRAY_SIZE(test_regs) + 2; ksft_set_plan(test_cnt); diff --git a/tools/testing/selftests/kvm/x86_64/cpuid_test.c b/tools/testing/selftests/kvm/x86_64/cpuid_test.c index 8c579ce714e9..fec03b11b059 100644 --- a/tools/testing/selftests/kvm/x86_64/cpuid_test.c +++ b/tools/testing/selftests/kvm/x86_64/cpuid_test.c @@ -60,7 +60,7 @@ static bool is_cpuid_mangled(const struct kvm_cpuid_entry2 *entrie) { int i; - for (i = 0; i < sizeof(mangled_cpuids); i++) { + for (i = 0; i < ARRAY_SIZE(mangled_cpuids); i++) { if (mangled_cpuids[i].function == entrie->function && mangled_cpuids[i].index == entrie->index) return true; diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c index 56d4480e8d3c..8a4d34cce36b 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -1091,7 +1091,7 @@ static void usage(void) fprintf(stderr, "\n\t\"file,all\" mem_type requires kernel built with\n"); fprintf(stderr, "\tCONFIG_READ_ONLY_THP_FOR_FS=y\n"); fprintf(stderr, "\n\tif [dir] is a (sub)directory of a tmpfs mount, tmpfs must be\n"); - fprintf(stderr, "\tmounted with huge=madvise option for khugepaged tests to work\n"); + fprintf(stderr, "\tmounted with huge=advise option for khugepaged tests to work\n"); fprintf(stderr, "\n\tSupported Options:\n"); fprintf(stderr, "\t\t-h: This help message.\n"); fprintf(stderr, "\t\t-s: mTHP size, expressed as page order.\n"); diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c index 717539eddf98..852e7281026e 100644 --- a/tools/testing/selftests/mm/uffd-common.c +++ b/tools/testing/selftests/mm/uffd-common.c @@ -18,7 +18,7 @@ bool test_uffdio_wp = true; unsigned long long *count_verify; uffd_test_ops_t *uffd_test_ops; uffd_test_case_ops_t *uffd_test_case_ops; -atomic_bool ready_for_fork; +pthread_barrier_t ready_for_fork; static int uffd_mem_fd_create(off_t mem_size, bool hugetlb) { @@ -519,7 +519,8 @@ void *uffd_poll_thread(void *arg) pollfd[1].fd = pipefd[cpu*2]; pollfd[1].events = POLLIN; - ready_for_fork = true; + /* Ready for parent thread to fork */ + pthread_barrier_wait(&ready_for_fork); for (;;) { ret = poll(pollfd, 2, -1); diff --git a/tools/testing/selftests/mm/uffd-common.h b/tools/testing/selftests/mm/uffd-common.h index a70ae10b5f62..3e6228d8e0dc 100644 --- a/tools/testing/selftests/mm/uffd-common.h +++ b/tools/testing/selftests/mm/uffd-common.h @@ -33,7 +33,6 @@ #include <inttypes.h> #include <stdint.h> #include <sys/random.h> -#include <stdatomic.h> #include "../kselftest.h" #include "vm_util.h" @@ -105,7 +104,7 @@ extern bool map_shared; extern bool test_uffdio_wp; extern unsigned long long *count_verify; extern volatile bool test_uffdio_copy_eexist; -extern atomic_bool ready_for_fork; +extern pthread_barrier_t ready_for_fork; extern uffd_test_ops_t anon_uffd_test_ops; extern uffd_test_ops_t shmem_uffd_test_ops; diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c index b3d21eed203d..c8a3b1c7edff 100644 --- a/tools/testing/selftests/mm/uffd-unit-tests.c +++ b/tools/testing/selftests/mm/uffd-unit-tests.c @@ -241,6 +241,9 @@ static void *fork_event_consumer(void *data) fork_event_args *args = data; struct uffd_msg msg = { 0 }; + /* Ready for parent thread to fork */ + pthread_barrier_wait(&ready_for_fork); + /* Read until a full msg received */ while (uffd_read_msg(args->parent_uffd, &msg)); @@ -308,8 +311,12 @@ static int pagemap_test_fork(int uffd, bool with_event, bool test_pin) /* Prepare a thread to resolve EVENT_FORK */ if (with_event) { + pthread_barrier_init(&ready_for_fork, NULL, 2); if (pthread_create(&thread, NULL, fork_event_consumer, &args)) err("pthread_create()"); + /* Wait for child thread to start before forking */ + pthread_barrier_wait(&ready_for_fork); + pthread_barrier_destroy(&ready_for_fork); } child = fork(); @@ -774,7 +781,7 @@ static void uffd_sigbus_test_common(bool wp) char c; struct uffd_args args = { 0 }; - ready_for_fork = false; + pthread_barrier_init(&ready_for_fork, NULL, 2); fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK); @@ -791,8 +798,9 @@ static void uffd_sigbus_test_common(bool wp) if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args)) err("uffd_poll_thread create"); - while (!ready_for_fork) - ; /* Wait for the poll_thread to start executing before forking */ + /* Wait for child thread to start before forking */ + pthread_barrier_wait(&ready_for_fork); + pthread_barrier_destroy(&ready_for_fork); pid = fork(); if (pid < 0) @@ -833,7 +841,7 @@ static void uffd_events_test_common(bool wp) char c; struct uffd_args args = { 0 }; - ready_for_fork = false; + pthread_barrier_init(&ready_for_fork, NULL, 2); fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK); if (uffd_register(uffd, area_dst, nr_pages * page_size, @@ -844,8 +852,9 @@ static void uffd_events_test_common(bool wp) if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args)) err("uffd_poll_thread create"); - while (!ready_for_fork) - ; /* Wait for the poll_thread to start executing before forking */ + /* Wait for child thread to start before forking */ + pthread_barrier_wait(&ready_for_fork); + pthread_barrier_destroy(&ready_for_fork); pid = fork(); if (pid < 0) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 05cbb2548d99..6ca7a1045bbb 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3035,24 +3035,12 @@ kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gf } EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic); -kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn) -{ - return gfn_to_pfn_memslot_atomic(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn); -} -EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_pfn_atomic); - kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn) { return gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn); } EXPORT_SYMBOL_GPL(gfn_to_pfn); -kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn) -{ - return gfn_to_pfn_memslot(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn); -} -EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_pfn); - int gfn_to_page_many_atomic(struct kvm_memory_slot *slot, gfn_t gfn, struct page **pages, int nr_pages) { @@ -6387,7 +6375,7 @@ static void kvm_sched_out(struct preempt_notifier *pn, WRITE_ONCE(vcpu->scheduled_out, true); - if (current->on_rq && vcpu->wants_to_run) { + if (task_is_runnable(current) && vcpu->wants_to_run) { WRITE_ONCE(vcpu->preempted, true); WRITE_ONCE(vcpu->ready, true); } |