PR of Maira's pmdomain/downstream/timeouts branch #7400

Merged
pelwell merged 5 commits into raspberrypi:rpi-6.18.y from mairacanal:pmdomain/downstream/timeouts on May 27, 2026

pelwell commented May 25, 2026

Turn Maira's branch into a PR to get the build artefacts.

pelwell commented May 26, 2026

Consider this an Approve.

mairacanal force-pushed the pmdomain/downstream/timeouts branch from dd930f1 to da2f572 on May 26, 2026 23:55

@rvprudent-lang

Hi @pelwell and @mairacanal,

First, sorry for not getting back sooner — I've been caught up with other things and only just had the chance to test this today.

Thank you both for the quick turnaround on this. The analysis of the MMU/TLB flush issue and the runtime PM work look spot-on based on what I was seeing.

I've just applied the patch on my RPi4 (running headless, Debian 13):

sudo rpi-update pulls/7400/head

Kernel is now 6.18.33-v8+. I removed my --gpu workaround and WayVNC is running again with hardware acceleration enabled. Boot was clean — no v3d errors or MMU pte invalid messages in dmesg.

I'll leave it running for 48 hours and report back. Fingers crossed 🤞

Thanks again for the great work!

pelwell commented May 27, 2026

I think this is looking good enough to merge. Are you happy for me to proceed, @mairacanal?

Commit 18605b1 ("pmdomain: bcm: bcm2835-power: Increase ASB control
timeout") raised the ASB handshake polling budget from 1us to 5us.
Surveying the pmdomain subsystem, 5us is still the smallest polling budget
by a wide margin - comparable handshakes in other drivers use:

  - 100us : starfive jh71xx-pmu, apple pmgr-pwrstate
  - 1ms   : renesas rcar-sysc, rmobile-sysc (power-on)
  - 10ms  : renesas rcar-gen4-sysc, sunxi sun55i-pck600
  - 1s    : mediatek mtk-pm-domains, mtk-scpsys

Raise the bcm2835 timeout to 100us, matching analogous drivers. 100us is
still negligible relative to a power-domain transition and gives the V3D
master ASB substantially more headroom to drain under heavy workloads,
where 5us has been observed to be insufficient in practice.
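
The effect of the polling budget can be illustrated with a minimal user-space sketch. It is not the bcm2835-power driver code: `fake_asb`, `fake_asb_acked()` and `poll_ack()` are hypothetical names, the clock is simulated, and the 20us drain time used in the usage note is an assumed figure chosen only to fall between the old and new budgets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware: the handshake acknowledgement
 * becomes visible once the simulated clock reaches ready_at_us. */
struct fake_asb {
	uint64_t now_us;      /* simulated clock, in microseconds */
	uint64_t ready_at_us; /* time at which the ASB has drained */
};

static bool fake_asb_acked(const struct fake_asb *asb)
{
	return asb->now_us >= asb->ready_at_us;
}

/* Poll for the acknowledgement, advancing the simulated clock by 1us per
 * iteration. The condition is checked before the deadline, so an ack that
 * arrives exactly at the deadline still counts.
 * Returns 0 on success, -1 if the budget is exhausted first. */
static int poll_ack(struct fake_asb *asb, uint64_t budget_us)
{
	assert(asb);

	uint64_t deadline = asb->now_us + budget_us;

	while (!fake_asb_acked(asb)) {
		if (asb->now_us >= deadline)
			return -1;
		asb->now_us++; /* stands in for a 1us delay */
	}
	return 0;
}
```

With an assumed drain time of 20us, `poll_ack(&asb, 5)` times out while `poll_ack(&asb, 100)` succeeds after 20 simulated microseconds, so the larger budget only costs time when the hardware actually needs it.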

Cc: stable@vger.kernel.org
Fixes: b826d2c ("pmdomain: bcm: bcm2835-power: Increase ASB control timeout")
Signed-off-by: Maíra Canal <mcanal@igalia.com>

Make the downstream version match the upstream commit
458f2a712ab4 ("drm/v3d: Introduce Runtime Power Management").

Signed-off-by: Maíra Canal <mcanal@igalia.com>

v3d_mmu_set_page_table() ends by calling v3d_mmu_flush_all() to flush the
MMU cache and clear the TLB after reprogramming V3D_MMU_PT_PA_BASE.
v3d_mmu_flush_all() is gated by pm_runtime_get_if_active(), which returns
0 unless runtime_status == RPM_ACTIVE.

v3d_mmu_set_page_table() is called from two paths that *know* V3D is
reachable, but where the runtime PM status might be wrong:

  1. v3d_power_resume(): the runtime resume callback itself, where
     runtime_status is RPM_RESUMING.

  2. v3d_reset(): called from the DRM scheduler timeout handler with the
     hung job's pm_runtime reference held, so runtime_status is RPM_ACTIVE
     and the flush does run, but taking an extra reference for the
     duration of the flush is unnecessary.

In the first case pm_runtime_get_if_active() returns 0, the flush is
silently skipped, and V3D resumes executing with whatever MMUC/TLB state
happened to survive the last reset. On BCM2711, this leaves stale
translations live across runtime PM cycles, manifesting as random GPU
hangs.

Split the actual flush sequence into a helper that does the writes
unconditionally, and have v3d_mmu_set_page_table() call it directly.
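
The gating problem and the split can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the v3d driver code: `fake_v3d`, `fake_get_if_active()`, `mmu_flush_hw()` and the two `set_page_table_*()` variants are hypothetical names, and the flush itself is reduced to a counter.

```c
#include <assert.h>

/* Simplified model of the runtime PM states relevant here. */
enum fake_rpm_status { FAKE_RPM_SUSPENDED, FAKE_RPM_RESUMING, FAKE_RPM_ACTIVE };

struct fake_v3d {
	enum fake_rpm_status status;
	int usage_count; /* runtime PM references held */
	int flushes;     /* how many times the flush sequence ran */
};

/* Models the behaviour described above: a reference is taken, and a
 * non-zero value returned, only when the device is fully active. */
static int fake_get_if_active(struct fake_v3d *v3d)
{
	if (v3d->status != FAKE_RPM_ACTIVE)
		return 0;
	v3d->usage_count++;
	return 1;
}

static void fake_put(struct fake_v3d *v3d)
{
	v3d->usage_count--;
}

/* Helper: performs the flush unconditionally. */
static void mmu_flush_hw(struct fake_v3d *v3d)
{
	assert(v3d);
	v3d->flushes++;
}

/* Gated entry point for callers that cannot know whether V3D is powered. */
static void mmu_flush_all(struct fake_v3d *v3d)
{
	if (!fake_get_if_active(v3d))
		return; /* silently skipped */
	mmu_flush_hw(v3d);
	fake_put(v3d);
}

/* Before the fix: the flush goes through the gate. */
static void set_page_table_old(struct fake_v3d *v3d)
{
	mmu_flush_all(v3d);
}

/* After the fix: the caller knows V3D is reachable, so it calls the helper. */
static void set_page_table_new(struct fake_v3d *v3d)
{
	mmu_flush_hw(v3d);
}
```

With `status` set to `FAKE_RPM_RESUMING`, `set_page_table_old()` leaves `flushes` at 0 while `set_page_table_new()` increments it, which is the difference the resume path depends on.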

Fixes: 17af1d14deaf ("drm/v3d: Introduce Runtime Power Management")
Signed-off-by: Maíra Canal <mcanal@igalia.com>

v3d_clean_caches() starts the cache-clean sequence by writing
V3D_L2TCACTL_TMUWCF to V3D_CTL_L2TCACTL and then polling for that bit to
clear. It does not, however, check for an L2T flush (L2TFLS) that may
still be in flight from a previous operation.

On pre-V3D 7.1 hardware, kicking off the TMU write-combiner flush while an
L2T flush is still pending can clobber bits in L2TCACTL and cause cache
inconsistencies.

Poll for L2TFLS to clear before writing L2TCACTL on V3D < 7.1, ensuring
any pending flush has completed before a new clean is issued.
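
The ordering can be sketched against a simulated register. Everything here is an assumption made for illustration: the `fake_l2t` model, the bit positions of `FAKE_L2TFLS` and `FAKE_TMUWCF`, the read counts, and the way a write during a pending flush is recorded as `clobbered`. It is not the v3d driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not taken from the hardware headers. */
#define FAKE_L2TFLS (1u << 0) /* L2T flush in flight */
#define FAKE_TMUWCF (1u << 8) /* TMU write-combiner flush in flight */

struct fake_l2t {
	uint32_t ctl;         /* simulated L2TCACTL */
	int flush_reads_left; /* reads until the pending L2T flush completes */
	bool clobbered;       /* set if the register was written mid-flush */
};

/* Each read advances the simulated hardware by one step. */
static uint32_t fake_read(struct fake_l2t *hw)
{
	if (hw->ctl & FAKE_L2TFLS) {
		if (hw->flush_reads_left > 0)
			hw->flush_reads_left--;
		else
			hw->ctl &= ~FAKE_L2TFLS;
	}
	hw->ctl &= ~FAKE_TMUWCF; /* model: the TMU flush completes at once */
	return hw->ctl;
}

static void fake_write(struct fake_l2t *hw, uint32_t val)
{
	if (hw->ctl & FAKE_L2TFLS)
		hw->clobbered = true; /* write landed on an in-flight flush */
	hw->ctl = val;
}

/* Poll until every bit in mask reads back clear; -1 after max_reads. */
static int wait_clear(struct fake_l2t *hw, uint32_t mask, int max_reads)
{
	for (int i = 0; i < max_reads; i++)
		if (!(fake_read(hw) & mask))
			return 0;
	return -1;
}

/* wait_for_l2tfls selects the fixed ordering (true) or the old one (false). */
static int clean_caches(struct fake_l2t *hw, bool wait_for_l2tfls)
{
	assert(hw);

	if (wait_for_l2tfls && wait_clear(hw, FAKE_L2TFLS, 100))
		return -1;
	fake_write(hw, FAKE_TMUWCF);
	return wait_clear(hw, FAKE_TMUWCF, 100);
}
```

Starting from a state with `FAKE_L2TFLS` set, `clean_caches(&hw, false)` returns 0 but leaves `clobbered` set, while `clean_caches(&hw, true)` returns 0 with `clobbered` still false.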

Cc: stable@vger.kernel.org
Fixes: d223f98 ("drm/v3d: Add support for compute shader dispatch.")
Signed-off-by: Maíra Canal <mcanal@igalia.com>

On runtime suspend, clean the V3D caches before suspending so all dirty
lines are written back to memory before the power domain is shut down.

Fixes several system hangs reported in [1][2][3].

Closes: raspberrypi#7381 [1]
Closes: raspberrypi#7396 [2]
Closes: raspberrypi#7397 [3]
Fixes: 17af1d14deaf ("drm/v3d: Introduce Runtime Power Management")
Signed-off-by: Maíra Canal <mcanal@igalia.com>

mairacanal force-pushed the pmdomain/downstream/timeouts branch from da2f572 to 716bd38 on May 27, 2026 13:41

@mairacanal

@pelwell, I just rebased the branch and reviewed the patches (fixing a few nits in the commit messages), so we are good to go. Thanks!

I'll proceed with the upstreaming process.

pelwell merged commit 95b85be into raspberrypi:rpi-6.18.y on May 27, 2026
12 checks passed
pelwell mentioned this pull request on May 27, 2026