summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* clk: qcom: krait-cc: also enable secondary mux and div clkChristian Marangi2022-12-021-1/+20
| | | | | | | | | | clk-krait ignore any rate change if clk is not flagged as enabled. Correctly enable the secondary mux and div clk to correctly change rate instead of silently ignoring the request. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221109005631.3189-2-ansuelsmth@gmail.com
* clk: qcom: krait-cc: fix wrong parent order for secondary muxChristian Marangi2022-12-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The secondary mux parent order is swapped. This currently doesn't cause problems as the secondary mux is used for idle clk and as a safe clk source while reprogramming the hfpll. Each mux have 2 or more output but he always have a safe source to switch while reprogramming the connected pll. We use a clk notifier to switch to the correct parent before clk core can apply the correct rate. The parent to switch is hardcoded in the mux struct. For the secondary mux the safe source to use is the qsb parent as it's the only fixed clk as the acpus_aux is a pll that can source from pxo or from pll8. The hardcoded safe parent for the secondary mux is set to index 0 that in the secondary mux map is set to 2. But the index 0 is actually acpu_aux in the parent list. Fix the swapped parents to correctly handle idle frequency and output a sane clk_summary report. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221109005631.3189-1-ansuelsmth@gmail.com
* clk: qcom: krait-cc: use devm variant for clk notifier registerChristian Marangi2022-12-021-1/+1
| | | | | | | | | Use devm variant for clk notifier register and correctly handle free resource on driver remove. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221108215827.30475-1-ansuelsmth@gmail.com
* clk: qcom: clk-krait: fix wrong div2 functionsChristian Marangi2022-12-021-0/+2
| | | | | | | | | | | | | Currently div2 value is applied to the wrong bits. This is caused by a bug in the code where the shift is done only for lpl, for anything else the mask is not shifted to the correct bits. Fix this by correctly shift if lpl is not supported. Fixes: 4d7dc77babfe ("clk: qcom: Add support for Krait clocks") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221108215625.30186-1-ansuelsmth@gmail.com
* clk: qcom: kpss-xcc: register it as clk providerChristian Marangi2022-12-021-4/+9
| | | | | | | | | krait-cc use this driver for the secondary mux. Register it as a clk provider to correctly use this clk in other drivers. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221108211734.3707-1-ansuelsmth@gmail.com
* clk: qcom: ipq8074: add missing networking resetsRobert Marko2022-12-021-0/+14
| | | | | | | | | | | | Downstream QCA 5.4 kernel defines networking resets which are not present in the mainline kernel but are required for the networking drivers. So, port the downstream resets and avoid using magic values for mask, construct mask for resets which require multiple bits to be set/cleared. Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221107132901.489240-3-robimarko@gmail.com
* dt-bindings: clock: qcom: ipq8074: add missing networking resetsRobert Marko2022-12-021-0/+14
| | | | | | | | | Add bindings for the missing networking resets found in IPQ8074 GCC. Signed-off-by: Robert Marko <robimarko@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221107132901.489240-2-robimarko@gmail.com
* clk: qcom: reset: support resetting multiple bitsRobert Marko2022-12-022-2/+3
| | | | | | | | | | | | | | | | | | | | | | This patch adds the support for giving the complete bitmask in reset structure and reset operation will use this bitmask for all reset operations. Currently, reset structure only takes a single bit for each reset and then calculates the bitmask by using the BIT() macro. However, this is not sufficient anymore for newer SoC-s like IPQ8074, IPQ6018 and more, since their networking resets require multiple bits to be asserted in order to properly reset the HW block completely. So, in order to allow asserting multiple bits add "bitmask" field to qcom_reset_map, and then use that bitmask value if its populated in the driver, if its not populated, then we just default to existing behaviour and calculate the bitmask on the fly. Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221107132901.489240-1-robimarko@gmail.com
* clk: qcom: lpass-sc7180: Avoid an extra "struct dev_pm_ops"Douglas Anderson2022-12-021-7/+3
| | | | | | | | | | | | | | The two devices managed by lpasscorecc-sc7180.c each had their own "struct dev_pm_ops". This is not needed. They are exactly the same and the structure is "static const" so it can't possible change. combine the two. This matches what's done for sc7280. This should be a noop other than saving a few bytes. Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104064055.3.I90ba14a47683a484f26531a08f7b46ace7f0a8a9@changeid
* clk: qcom: lpass-sc7180: Fix pm_runtime usageDouglas Anderson2022-12-021-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sc7180 lpass clock controller's pm_runtime usage wasn't broken quite as spectacularly as the sc7280's pm_runtime usage, but it was still broken. Putting some printouts in at boot showed me this (with serial console enabled, which makes the prints slow and thus changes timing): [ 3.109951] DOUG: my_pm_clk_resume, usage=1 [ 3.114767] DOUG: my_pm_clk_resume, usage=1 [ 3.664443] DOUG: my_pm_clk_suspend, usage=0 [ 3.897566] DOUG: my_pm_clk_suspend, usage=0 [ 3.910137] DOUG: my_pm_clk_resume, usage=1 [ 3.923217] DOUG: my_pm_clk_resume, usage=0 [ 4.440116] DOUG: my_pm_clk_suspend, usage=-1 [ 4.444982] DOUG: my_pm_clk_suspend, usage=0 [ 14.170501] DOUG: my_pm_clk_resume, usage=1 [ 14.176245] DOUG: my_pm_clk_resume, usage=0 ...or this w/out serial console: [ 0.556139] DOUG: my_pm_clk_resume, usage=1 [ 0.556279] DOUG: my_pm_clk_resume, usage=1 [ 1.058422] DOUG: my_pm_clk_suspend, usage=-1 [ 1.058464] DOUG: my_pm_clk_suspend, usage=0 [ 1.186250] DOUG: my_pm_clk_resume, usage=1 [ 1.186292] DOUG: my_pm_clk_resume, usage=0 [ 1.731536] DOUG: my_pm_clk_suspend, usage=-1 [ 1.731557] DOUG: my_pm_clk_suspend, usage=0 [ 10.288910] DOUG: my_pm_clk_resume, usage=1 [ 10.289496] DOUG: my_pm_clk_resume, usage=0 It seems to be doing roughly the right sequence of calls, but just like with sc7280 this is more by luck than anything. Having a usage of -1 is just not OK. Let's fix this like we did with sc7280. Signed-off-by: Douglas Anderson <dianders@chromium.org> Fixes: ce8c195e652f ("clk: qcom: lpasscc: Introduce pm autosuspend for SC7180") Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104064055.2.I49b25b9bda9430fc7ea21e5a708ca5a0aced2798@changeid
* clk: qcom: lpass-sc7280: Fix pm_runtime usageDouglas Anderson2022-12-021-34/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pm_runtime usage in lpass-sc7280 was broken in quite a few ways. Specifically: 1. At the end of probe it called "put" twice. This is a no-no and will end us up with a negative usage count. Even worse than calling "put" twice, it never called "get" once. Thus after bootup it could be seen that the runtime usage of the devices managed by this driver was -2. 2. In some error cases it manually called pm_runtime_disable() even though it had previously used devm_add_action_or_reset() to set this up to be called automatically. This meant that in these error cases we'd double-call pm_runtime_disable(). 3. It forgot to call undo pm_runtime_use_autosuspend(), which can sometimes have subtle problems (and the docs specifically mention that you need to undo this function). Overall the above seriously calls into question how this driver is working. It seems like a combination of "it doesn't", "by luck", and "because of the weirdness of runtime_pm". Specifically I put a printout to the serial console every time the runtime suspend/resume was called for the two devices created by this driver (I wrapped the pm_clk calls). When I had serial console enabled, I found that the calls got resumed at bootup (when the clk core probed and before our double-put) and then never touched again. That's no good. [ 0.829997] DOUG: my_pm_clk_resume, usage=1 [ 0.835487] DOUG: my_pm_clk_resume, usage=1 When I disabled serial console (speeding up boot), I got a different pattern, which I guess (?) is better: [ 0.089767] DOUG: my_pm_clk_resume, usage=1 [ 0.090507] DOUG: my_pm_clk_resume, usage=1 [ 0.151885] DOUG: my_pm_clk_suspend, usage=-2 [ 0.151914] DOUG: my_pm_clk_suspend, usage=-2 [ 1.825747] DOUG: my_pm_clk_resume, usage=-1 [ 1.825774] DOUG: my_pm_clk_resume, usage=-1 [ 1.888269] DOUG: my_pm_clk_suspend, usage=-2 [ 1.888282] DOUG: my_pm_clk_suspend, usage=-2 These different patterns have to do with the fact that the core PM Runtime code really isn't designed to be robust to negative usage counts and sometimes may happen to stumble upon a behavior that happens to "work". For instance, you can see that __pm_runtime_suspend() will treat any non-zero value (including negative numbers) as if the device is in use. In any case, let's fix the driver to be correct. We'll hold a pm_runtime reference for the whole probe and then drop it (once!) at the end. We'll get rid of manual pm_runtime_disable() calls in the error handling. We'll also switch to devm_pm_runtime_enable(), which magically handles undoing pm_runtime_use_autosuspend() as of commit b4060db9251f ("PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()"). While we're at this, let's also use devm_pm_clk_create() instead of rolling it ourselves. Note that the above changes make it obvious that lpassaudio_create_pm_clks() was doing more than just creating clocks. It was also setting up pm_runtime parameters. Let's rename it. All of these problems were found by code inspection. I started looking at this driver because it was involved in a deadlock that I reported a while ago [1]. Though I bisected the deadlock to commit 1b771839de05 ("clk: qcom: gdsc: enable optional power domain support"), it was never really clear why that patch affected it other than a luck of timing changes. I'll also note that by fixing the timing (as done in this change) we also seem to aboid the deadlock, which is a nice benefit. Also note that some of the fixes here are much the same type of stuff that Dmitry did in commit 72cfc73f4663 ("clk: qcom: use devm_pm_runtime_enable and devm_pm_clk_create"), but I guess lpassaudiocc-sc7280.c didn't exist then. [1] https://lore.kernel.org/r/20220922154354.2486595-1-dianders@chromium.org Fixes: a9dd26639d05 ("clk: qcom: lpass: Add support for LPASS clock controller for SC7280") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104064055.1.I00a0e4564a25489e85328ec41636497775627564@changeid
* clk: qcom: Add display clock controller driver for SM6375Konrad Dybcio2022-11-153-0/+620
| | | | | | | | | Add support for the display clock controller found on SM6375. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221115155808.10899-2-konrad.dybcio@linaro.org
* dt-bindings: clock: add QCOM SM6375 display clockKonrad Dybcio2022-11-152-0/+96
| | | | | | | | | | | Add device tree bindings for display clock controller for Qualcomm Technology Inc's SM6375 SoC. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221115155808.10899-1-konrad.dybcio@linaro.org
* clk: qcom: Add SC8280XP display clock controllerBjorn Andersson2022-11-103-0/+3228
| | | | | | | | | | | | | | The Qualcomm SC8280XP platform has two display clock controller instances, add support for these. Duplication between the two implementations is reduced by reusing any constant data between the two sets of clock data. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220926203800.16771-3-quic_bjorande@quicinc.com
* dt-bindings: clock: Add Qualcomm SC8280XP display clock bindingsBjorn Andersson2022-11-102-0/+197
| | | | | | | | | | | The Qualcomm SC8280XP platform has two display clock controllers, add a binding for these. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220926203800.16771-2-quic_bjorande@quicinc.com
* dt-bindings: clock: qcom: Clean-up titles and descriptionsKrzysztof Kozlowski2022-11-0860-289/+258
| | | | | | | | | | | | | | | | Clean the Qualcomm SoCs clock bindings: 1. Drop redundant "bindings" in title. 2. Correct language grammar "<independent clause without verb>, which supports" -> "provides". 3. Use full path to the bindings header, so tools can validate it. 4. Drop quotes where not needed. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> # msm8998 Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> # mmcc Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102163153.55460-2-krzysztof.kozlowski@linaro.org
* dt-bindings: clock: qcom,gcc-ipq8074: Use common GCC schemaKrzysztof Kozlowski2022-11-081-21/+4
| | | | | | | | | Reference common Qualcomm GCC schema to remove common pieces. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102163153.55460-1-krzysztof.kozlowski@linaro.org
* dt-bindings: clocks: qcom,gcc-ipq8074: allow XO and sleep clocksRobert Marko2022-11-071-0/+10
| | | | | | | | | | Allow passing XO and sleep clocks to the IPQ8074 to avoid having to do a global matching by name. Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221030175703.1103224-2-robimarko@gmail.com
* clk: qcom: ipq8074: convert to parent dataRobert Marko2022-11-071-968/+813
| | | | | | | | | | | Convert the IPQ8074 GCC driver to use parent data instead of global name matching. Utilize ARRAY_SIZE for num_parents instead of hardcoding the value. Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221030175703.1103224-1-robimarko@gmail.com
* clk: qcom: Add support for QDU1000 and QRU1000 RPMh clocksMelody Olvera2022-11-071-0/+13
| | | | | | | | Add support for RMPh clocks for QDU1000 and QRU1000 SoCs. Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221026190441.4002212-5-quic_molvera@quicinc.com
* dt-bindings: clock: Add RPMHCC for QDU1000 and QRU1000Melody Olvera2022-11-071-0/+1
| | | | | | | | Add compatible strings for RPMHCC for QDU1000 and QRU1000. Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221026190441.4002212-3-quic_molvera@quicinc.com
* clk: qcom: dispcc-sm8250: Disable link_div_clk_src for sm8150Robert Foss2022-11-061-0/+11
| | | | | | | | | | SM8150 does not have any of the link_div_clk_src clocks, so let's disable them for this SoC. Signed-off-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102090140.965450-6-robert.foss@linaro.org
* clk: qcom: dispcc-sm8250: Add missing EDP clocks for sm8350Robert Foss2022-11-061-1/+21
| | | | | | | | | | SM8350 supports embedded displayport, but the clocks for this were previously not accounted for. Signed-off-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102090140.965450-5-robert.foss@linaro.org
* dt-bindings: clock: dispcc-sm8250: Add EDP_LINK_DIV_CLK_SRC indexRobert Foss2022-11-061-0/+1
| | | | | | | | | | Add this previously missing index, since it is supported by the SoCs targeted by the dispcc-sm8250 driver. Signed-off-by: Robert Foss <robert.foss@linaro.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102090140.965450-4-robert.foss@linaro.org
* clk: qcom: dispcc-sm8250: Add RETAIN_FF_ENABLE flag for mdss_gdscRobert Foss2022-11-061-1/+1
| | | | | | | | | | | | | | All SoC supported by this driver supports the RETAIN_FF_ENABLE flag, so it should be enabled here. This feature enables registers to maintain their state after dis/re-enabling the GDSC. Signed-off-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102090140.965450-3-robert.foss@linaro.org
* clk: qcom: dispcc-sm8250: Disable EDP_GTC for sm8350Robert Foss2022-11-061-0/+3
| | | | | | | | | | SM8350 does not have the EDP_GTC clock, so let's disable it for this SoC. Signed-off-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102090140.965450-2-robert.foss@linaro.org
* clk: qcom: gcc-sm8250: Use retention mode for USB GDSCsManivannan Sadhasivam2022-11-061-2/+2
| | | | | | | | | | | | USB controllers on SM8250 doesn't work after coming back from suspend. This can be fixed by keeping the USB GDSCs in retention mode so that hardware can keep them ON and put into rentention mode once the parent domain goes to a low power state. Fixes: 3e5770921a88 ("clk: qcom: gcc: Add global clock controller driver for SM8250") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221102091320.66007-1-manivannan.sadhasivam@linaro.org
* dt-bindings: clock: qcom,audiocc-sm8250: add missing audio clockKrzysztof Kozlowski2022-11-061-1/+4
| | | | | | | | | | | The SM8250 DTS uses three clocks as input to LPASS AudioClock Controller (althopugh Linux driver seems not needing it), so document the missing audio voting clock. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104174656.123979-3-krzysztof.kozlowski@linaro.org
* dt-bindings: clock: qcom,aoncc-sm8250: add missing audio clockKrzysztof Kozlowski2022-11-061-1/+4
| | | | | | | | | | | The SM8250 DTS uses three clocks as input to LPASS AON Clock Controller (althopugh Linux driver seems not needing it), so document the missing audio voting clock. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104174656.123979-2-krzysztof.kozlowski@linaro.org
* dt-bindings: clock: qcom,aoncc-sm8250: fix compatibleKrzysztof Kozlowski2022-11-061-2/+2
| | | | | | | | | | | The SM8250 AON Clock Controller compatible used by Linux driver and DTS is qcom,sm8250-lpass-aoncc. Fixes: 7dbe5a7a3f99 ("dt-bindings: clock: Add support for LPASS Always ON Controller") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104174656.123979-1-krzysztof.kozlowski@linaro.org
* dt-bindings: clock: qcom,sdm845-lpasscc: convert to dtschemaKrzysztof Kozlowski2022-11-062-26/+47
| | | | | | | | | Convert Qualcomm SDM845 LPASS clock controller bindings to DT schema. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221104182108.126515-1-krzysztof.kozlowski@linaro.org
* dt-bindings: clock: Convert qcom,lcc to DT schemaLuca Weiss2022-11-062-22/+86
| | | | | | | | | | Convert the text bindings for the lcc to yaml format. Doing this showed that clocks and clock-names were not documented, so fix that now. Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221016143514.612851-1-luca@z3ntu.xyz
* clk: qcom: dispcc-sm6350: Add CLK_OPS_PARENT_ENABLE to pixel&byte srcKonrad Dybcio2022-11-061-2/+2
| | | | | | | | | | Add the CLK_OPS_PARENT_ENABLE flag to pixel and byte clk srcs to ensure set_rate can succeed. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Fixes: 837519775f1d ("clk: qcom: Add display clock controller driver for SM6350") Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221010155546.73884-1-konrad.dybcio@somainline.org
* clk: qcom: gcc-sm6125: Remove gpll7 from sdcc2_appsMartin Botka2022-11-061-1/+0
| | | | | | | | | This removes gpll7 clock source from sdcc2_apps as it caused issues on the device during testing Signed-off-by: Martin Botka <martin.botka@somainline.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221001185431.493440-1-martin.botka@somainline.org
* clk: qcom: gcc-ipq806x: use parent_data for the last remaining entryDmitry Baryshkov2022-11-061-1/+3
| | | | | | | | | | | Use parent_data for the last remaining entry (pll4). This clock is provided by the lcc device. Fixes: cb02866f9a74 ("clk: qcom: gcc-ipq806x: convert parent_names to parent_data") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220927113826.246241-3-dmitry.baryshkov@linaro.org
* dt-bindings: clock: qcom,gcc-ipq8064: add pll4 to used clocksDmitry Baryshkov2022-11-061-2/+7
| | | | | | | | | | | | | | On IPQ8064 (and related platforms) the GCC uses PLL4 clock provided by the LCC clock controller. Mention this in the bindings. To remain compatible with older bindings, make it optional, as the driver will fallback to getting the `pll4' clock from the system clocks list. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220927113826.246241-2-dmitry.baryshkov@linaro.org
* dt-bindings: clock: split qcom,gcc-sdm660 to the separate fileDmitry Baryshkov2022-10-172-3/+61
| | | | | | | | | | Move schema for the GCC on SDM630/SDM636/SDM660 to a separate file to be able to define device-specific clock properties. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220929091216.471136-1-dmitry.baryshkov@linaro.org
* Linux 6.1-rc1v6.1-rc1Linus Torvalds2022-10-171-2/+2
|
* Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds2022-10-17185-421/+378
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
| * prandom: remove unused functionsJason A. Donenfeld2022-10-123-23/+5
| | | | | | | | | | | | | | | | | | | | | | | | With no callers left of prandom_u32() and prandom_bytes(), as well as get_random_int(), remove these deprecated wrappers, in favor of get_random_u32() and get_random_bytes(). Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * treewide: use get_random_bytes() when possibleJason A. Donenfeld2022-10-1219-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The prandom_bytes() function has been a deprecated inline wrapper around get_random_bytes() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> # powerpc Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * treewide: use get_random_u32() when possibleJason A. Donenfeld2022-10-1271-100/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prandom_u32() function has been a deprecated inline wrapper around get_random_u32() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. The same also applies to get_random_int(), which is just a wrapper around get_random_u32(). This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Acked-by: Helge Deller <deller@gmx.de> # for parisc Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * treewide: use get_random_{u8,u16}() when possible, part 2Jason A. Donenfeld2022-10-126-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value, simply use the get_random_{u8,u16}() functions, which are faster than wasting the additional bytes from a 32-bit value. This was done by hand, identifying all of the places where one of the random integer functions was used in a non-32-bit context. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * treewide: use get_random_{u8,u16}() when possible, part 1Jason A. Donenfeld2022-10-1221-34/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value, simply use the get_random_{u8,u16}() functions, which are faster than wasting the additional bytes from a 32-bit value. This was done mechanically with this coccinelle script: @@ expression E; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; typedef __be16; typedef __le16; typedef u8; @@ ( - (get_random_u32() & 0xffff) + get_random_u16() | - (get_random_u32() & 0xff) + get_random_u8() | - (get_random_u32() % 65536) + get_random_u16() | - (get_random_u32() % 256) + get_random_u8() | - (get_random_u32() >> 16) + get_random_u16() | - (get_random_u32() >> 24) + get_random_u8() | - (u16)get_random_u32() + get_random_u16() | - (u8)get_random_u32() + get_random_u8() | - (__be16)get_random_u32() + (__be16)get_random_u16() | - (__le16)get_random_u32() + (__le16)get_random_u16() | - prandom_u32_max(65536) + get_random_u16() | - prandom_u32_max(256) + get_random_u8() | - E->inet_id = get_random_u32() + E->inet_id = get_random_u16() ) @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; identifier v; @@ - u16 v = get_random_u32(); + u16 v = get_random_u16(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u8; identifier v; @@ - u8 v = get_random_u32(); + u8 v = get_random_u8(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u16; u16 v; @@ - v = get_random_u32(); + v = get_random_u16(); @@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u8; u8 v; @@ - v = get_random_u32(); + v = get_random_u8(); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Examine limits @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value < 256: coccinelle.RESULT = cocci.make_ident("get_random_u8") elif value < 65536: coccinelle.RESULT = cocci.make_ident("get_random_u16") else: print("Skipping large mask of %s" % (literal)) cocci.include_match(False) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; identifier add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + (RESULT() & LITERAL) Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * treewide: use prandom_u32_max() when possible, part 2Jason A. Donenfeld2022-10-124-19/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done by hand, covering things that coccinelle could not do on its own. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> # for ext2, ext4, and sbitmap Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
| * treewide: use prandom_u32_max() when possible, part 1Jason A. Donenfeld2022-10-1289-218/+204
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) | - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) | - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) | - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of ↵Linus Torvalds2022-10-1736-71/+1265
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels when using bperf (perf BPF based counters) with cgroups. - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that monitors bandwidth, latency, bus utilization and buffer occupancy. Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst. - User space tasks can migrate between CPUs, so when tracing selected CPUs, system-wide sideband is still needed, fix it in the setup of Intel PT on hybrid systems. - Fix metricgroups title message in 'perf list', it should state that the metrics groups are to be used with the '-M' option, not '-e'. - Sync the msr-index.h copy with the kernel sources, adding support for using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well as decoding it when printing the MSR tracepoint arguments. - Fix program header size and alignment when generating a JIT ELF in 'perf inject'. - Add multiple new Intel PT 'perf test' entries, including a jitdump one. - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when running on PowerPC due to an invalid topology number in that arch. - Fix the 'perf test' for arm_coresight failures on the ARM Juno system. - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this option to the or expression expected in the intercepted perf_event_open() syscall. - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the 'perf annotate' asm parser. - Fix 'perf mem record -C' option processing, it was being chopped up when preparing the underlying 'perf record -e mem-events' and thus being ignored, requiring using '-- -C CPUs' as a workaround. - Improvements and tidy ups for 'perf test' shell infra. - Fix Intel PT information printing segfault in uClibc, where a NULL format was being passed to fprintf. * tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits) tools arch x86: Sync the msr-index.h copy with the kernel sources perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver perf auxtrace arm: Refactor event list iteration in auxtrace_record__init() perf tests stat+json_output: Include sanity check for topology perf tests stat+csv_output: Include sanity check for topology perf intel-pt: Fix system_wide dummy event for hybrid perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc perf test: Fix attr tests for PERF_FORMAT_LOST perf test: test_intel_pt.sh: Add 9 tests perf inject: Fix GEN_ELF_TEXT_OFFSET for jit perf test: test_intel_pt.sh: Add jitdump test perf test: test_intel_pt.sh: Tidy some alignment perf test: test_intel_pt.sh: Print a message when skipping kernel tracing perf test: test_intel_pt.sh: Tidy some perf record options perf test: test_intel_pt.sh: Fix return checking again perf: Skip and warn on unknown format 'configN' attrs perf list: Fix metricgroups title message perf mem: Fix -C option behavior for perf mem record perf annotate: Add missing condition flags for arm64 ...
| * | tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo2022-10-151-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To pick up the changes in: b8d1d163604bd1e6 ("x86/apic: Don't disable x2APIC if locked") ca5b7c0d9621702e ("perf/x86/amd/lbr: Add LbrExtV2 branch record support") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-10-14 18:06:34.294561729 -0300 +++ after 2022-10-14 18:06:41.285744044 -0300 @@ -264,6 +264,7 @@ [0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE", [0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX", [0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO", + [0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT", [0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG", [0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS", [0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL", $ Now one can trace systemwide asking to see backtraces to where that MSR is being read/written, see this example with a previous update: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB" Using CPUID AuthenticAMD-25-21-0 0x6a0 0x6a8 New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) 0x6a0 0x6a8 New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/lkml/Y0nQkz2TUJxwfXJd@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packetQi Liu2022-10-157-0/+396
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for using 'perf report --dump-raw-trace' to parse PTT packet. Example usage: Output will contain raw PTT data and its textual representation, such as (8DW format): 0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000 offset: 0 ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0 . . ... HISI PTT data: size 4194304 bytes . 00000000: 00 00 00 00 Prefix . 00000004: 08 20 00 60 Header DW0 . 00000008: ff 02 00 01 Header DW1 . 0000000c: 20 08 00 00 Header DW2 . 00000010: 10 e7 44 ab Header DW3 . 00000014: 2a a8 1e 01 Time . 00000020: 00 00 00 00 Prefix . 00000024: 01 00 00 60 Header DW0 . 00000028: 0f 1e 00 01 Header DW1 . 0000002c: 04 00 00 00 Header DW2 . 00000030: 40 00 81 02 Header DW3 . 00000034: ee 02 00 00 Time .... This patch only add basic parsing support according to the definition of the PTT packet described in Documentation/trace/hisi-ptt.rst. And the fields of each packet can be further decoded following the PCIe Spec's definition of TLP packet. Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: John Garry <john.garry@huawei.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-4-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driverQi Liu2022-10-157-1/+273
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HiSilicon PCIe tune and trace device (PTT) could dynamically tune the PCIe link's events, and trace the TLP headers). This patch add support for PTT device in perf tool, so users could use 'perf record' to get TLP headers trace data. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Qi Liu <liuqi115@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Liu <liuqi6124@gmail.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zeng Prime <prime.zeng@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20220927081400.14364-3-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>