feat: Backport VRAM management patches for dmem cgroup (6.6.y) #1890

Open
deepin-wm wants to merge 4 commits into deepin-community:linux-6.6.y from deepin-wm:vram-mgmt-6.6-backport

deepin-wm commented Jun 18, 2026

Summary

Backport VRAM management patches from pixelcluster's dmemcg-aggressive-protect branch to improve VRAM allocation for low-end GPUs, targeting the linux-6.6.y branch.

These patches fix AMDGPU's VRAM management so that allocations for applications protected by dmem cgroup limits (dmem.low/dmem.min) evict unprotected buffers more aggressively. This prevents a protected application's buffers from being forced into GTT (system RAM) while the application is still within its protection limits.

Challenges

Kernel 6.6 does not have the dmem cgroup infrastructure that was introduced in 6.14. This PR backports all prerequisites in addition to the VRAM management patches:

  1. page_counter_calculate_protection (from kernel 6.11) - Generic effective protection calculation for page counters, needed by the dmem cgroup controller
  2. dmem cgroup controller (from kernel 6.14) - The entire device memory cgroup subsystem for tracking and limiting GPU VRAM consumption
  3. TTM dmem cgroup integration - Adapted for 6.6's TTM eviction mechanism (which uses `ttm_mem_evict_first` instead of the newer `ttm_bo_evict_alloc` with `ttm_lru_walk`)

Changes

Commit 1: cgroup/dmem: Add dmem cgroup controller and page_counter_calculate_protection

  • Add `page_counter_calculate_protection()` from kernel 6.11 (guarded by `CONFIG_MEMCG || CONFIG_CGROUP_DMEM`)
  • Add the dmem cgroup controller (`kernel/cgroup/dmem.c`, `include/linux/cgroup_dmem.h`)
  • Add `CONFIG_CGROUP_DMEM` to Kconfig
  • Add `SUBSYS(dmem)` to cgroup_subsys.h
  • Add documentation
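
To illustrate what an effective-protection calculation does, here is a minimal userspace sketch. It is a simplified model, not the backported kernel code: the function name `effective_protection_model` and its parameters are hypothetical, and it covers only the proportional scale-down that applies when sibling cgroups together claim more protection than their parent can pass on. Recursive protection and the other refinements in the kernel implementation are left out.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Simplified model (hypothetical helper, not the kernel function).
 *
 * usage              - amount currently charged to this child
 * setting            - configured protection (e.g. a min or low value)
 * parent_effective   - effective protection of the parent
 * siblings_protected - sum of min(usage, setting) over all children of
 *                      the parent, including this one
 */
static uint64_t effective_protection_model(uint64_t usage, uint64_t setting,
                                           uint64_t parent_effective,
                                           uint64_t siblings_protected)
{
    /* A child can only claim protection for memory it actually uses. */
    uint64_t claim = min_u64(usage, setting);

    /* This child's claim is part of the siblings' total. */
    assert(siblings_protected >= claim);

    /* Overcommitted parent: scale each claim down proportionally. */
    if (siblings_protected > parent_effective)
        return claim * parent_effective / siblings_protected;

    return claim;
}
```

In this model, a child using 400 units with a setting of 300 under a parent whose effective protection is 300, where the siblings' claims total 600, ends up with an effective protection of 150.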

Commit 2: cgroup/dmem: Add queries for protection values (pixelcluster patch 1)

  • Add `dmem_cgroup_below_min()` and `dmem_cgroup_below_low()` helpers
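
As a rough illustration of what these queries answer, the sketch below models a pool with a current usage and precomputed effective protection values. The struct `pool_model` and both function names are hypothetical stand-ins; the comparison mirrors memcg's `mem_cgroup_below_{min,low}`, which the commit message names as the counterpart, and the real helpers also take a root pool and recompute protection first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dmem pool's counters. */
struct pool_model {
    uint64_t usage; /* device memory currently charged */
    uint64_t emin;  /* effective min protection */
    uint64_t elow;  /* effective low protection */
};

/* True while the pool's usage is still covered by its min protection. */
static bool pool_below_min(const struct pool_model *pool)
{
    assert(pool != NULL);
    return pool->usage <= pool->emin;
}

/* True while the pool's usage is still covered by its low protection. */
static bool pool_below_low(const struct pool_model *pool)
{
    assert(pool != NULL);
    return pool->usage <= pool->elow;
}
```

A caller that sees either predicate return true knows the allocating cgroup is still inside its protected range and can justify evicting other buffers to make room.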

Commit 3: cgroup,cgroup/dmem: Add (dmem_)cgroup_common_ancestor helper (pixelcluster patch 2)

  • Add helper for finding common ancestor of two cgroup pool states
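
The ancestor-array lookup can be sketched in userspace as follows. This is a model under stated assumptions: `struct cg_model`, `MAX_DEPTH`, and both functions are hypothetical, with each node storing its depth and a pointer to its ancestor at every level (itself included), similar in spirit to the `level` and `ancestors[]` fields of the kernel's `struct cgroup`.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8 /* arbitrary limit for this sketch */

/* Hypothetical model of a cgroup with a per-level ancestor array. */
struct cg_model {
    int level;                                   /* root is level 0 */
    const struct cg_model *ancestors[MAX_DEPTH]; /* ancestors[level] is self */
};

/* Initialise a node as the root (parent == NULL) or as a child of parent. */
static void cg_model_init(struct cg_model *cg, const struct cg_model *parent)
{
    cg->level = 0;
    if (parent) {
        assert(parent->level + 1 < MAX_DEPTH);
        cg->level = parent->level + 1;
        for (int i = 0; i <= parent->level; i++)
            cg->ancestors[i] = parent->ancestors[i];
    }
    cg->ancestors[cg->level] = cg;
}

/*
 * Walk down from the deepest level both nodes share and return the first
 * ancestor they have in common, or NULL if they are in different trees.
 */
static const struct cg_model *
cg_model_common_ancestor(const struct cg_model *a, const struct cg_model *b)
{
    int level = a->level < b->level ? a->level : b->level;

    for (; level >= 0; level--)
        if (a->ancestors[level] == b->ancestors[level])
            return a->ancestors[level];
    return NULL;
}
```

Two sibling cgroups resolve to their shared parent, and a cgroup paired with one of its own ancestors resolves to that ancestor, which is the subtree in which comparing their protection makes sense.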

Commit 4: drm/ttm: Add dmem cgroup support for VRAM management

Adapted version of pixelcluster's patches 3-6 for 6.6's TTM code:

  • Add `struct ttm_bo_alloc_state` for tracking allocation state
  • Add `ttm_bo_alloc_at_place()` for dmem-aware allocation
  • Add `ttm_resource_try_charge()` for pre-charging cgroups
  • Split cgroup charge from resource allocation
  • Add `css` field to `ttm_resource` and `cg` field to `ttm_resource_manager`
  • Add `ttm_mem_evict_first_dmem()` that skips protected BOs
  • Use common ancestor for correct eviction protection calculation
  • Be more aggressive when allocating below dmem cgroup protection
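
The eviction policy described above (skip protected BOs, and only fall back to low-protected ones when allowed) can be modelled roughly like this. Everything here is a hypothetical userspace sketch rather than the TTM code in this PR: `struct bo_model` folds the owning pool's counters into each buffer, the usage values are not updated after an eviction, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer model carrying its owner's pool counters. */
struct bo_model {
    uint64_t owner_usage; /* usage of the owning pool */
    uint64_t owner_emin;  /* effective min protection of the owner */
    uint64_t owner_elow;  /* effective low protection of the owner */
    bool evicted;
};

/* Decide whether a buffer may be evicted in this pass. */
static bool evict_valuable_model(const struct bo_model *bo, bool try_low,
                                 bool *hit_low)
{
    assert(hit_low != NULL);

    /* Min-protected owners are never evicted. */
    if (bo->owner_usage <= bo->owner_emin)
        return false;

    /* Low-protected owners are skipped unless try_low is set. */
    if (!try_low && bo->owner_usage <= bo->owner_elow) {
        *hit_low = true;
        return false;
    }
    return true;
}

/* Evict the first eligible buffer in LRU order; return its index or -1. */
static int evict_first_model(struct bo_model *lru, size_t n, bool try_low,
                             bool *hit_low)
{
    for (size_t i = 0; i < n; i++) {
        if (lru[i].evicted)
            continue;
        if (!evict_valuable_model(&lru[i], try_low, hit_low))
            continue;
        lru[i].evicted = true;
        return (int)i;
    }
    return -1;
}

/* First pass spares low-protected buffers; retry with them if permitted. */
static int evict_one_with_retry(struct bo_model *lru, size_t n,
                                bool may_try_low)
{
    bool hit_low = false;
    int idx = evict_first_model(lru, n, false, &hit_low);

    if (idx < 0 && hit_low && may_try_low)
        idx = evict_first_model(lru, n, true, &hit_low);
    return idx;
}
```

The two-pass shape keeps low-protected buffers in VRAM for as long as any unprotected buffer remains, while min-protected buffers are never selected in either pass.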

Notes

  • The 6.6 TTM uses a different eviction mechanism (`ttm_mem_evict_first`) vs newer kernels (`ttm_bo_evict_alloc` with `ttm_lru_walk`), so patches 3-6 were completely rewritten to achieve the same functionality
  • Uses `DEEPIN_KABI_RESERVE` slots in `ttm_resource_manager` and `ttm_resource` structs for new fields to minimize ABI impact
  • Userspace utilities (`dmemcg-booster`, `plasma-foreground-booster`) are also needed for full functionality

Source

Patches from: https://pixelcluster.github.io/VRAM-Mgmt-fixed/
Original commits by Natalie Vock <natalie.vock@gmx.de>

Summary by Sourcery

Backport device-memory (dmem) cgroup support and integrate it with TTM-based VRAM management to honor dmem protection when allocating and evicting GPU buffers.

New Features:

  • Introduce a dmem cgroup controller and associated user-facing cgroup-v2 files for configuring protected GPU device-memory usage per region.
  • Add device-memory aware accounting and charging APIs for GPU drivers via the new cgroup_dmem interface.
  • Extend TTM resource management so GPU buffer allocations are charged to dmem cgroups and tracked per resource and manager.

Enhancements:

  • Update TTM VRAM eviction and allocation paths to prefer evicting less-protected buffers and to respect dmem min/low protections when under memory pressure.
  • Add generic page_counter_calculate_protection support to compute effective min/low protections across cgroup hierarchies.
  • Provide helpers to find common cgroup ancestors for correct protection calculations across different cgroup pools.

Documentation:

  • Extend cgroup v2 admin documentation to describe the new dmem controller, its configuration files, and semantics.

deepin-wm and others added 4 commits June 18, 2026 19:23
…otection

Backport the dmem cgroup controller from kernel 6.14 and the
page_counter_calculate_protection function from kernel 6.11.

The dmem cgroup controller allows tracking and limiting device memory
(such as GPU VRAM) consumption via cgroups. It uses the same min/low/max
semantics as the memory cgroup.

page_counter_calculate_protection is needed by dmem to calculate
effective memory protection values. This function was factored out
of memcontrol.c in kernel 6.11.

cgroup/dmem: Add queries for protection values

Callers can use this feedback to be more aggressive in making space for
allocations of a cgroup if they know it is protected.

These are counterparts to memcg's mem_cgroup_below_{min,low}.

Signed-off-by: Natalie Vock <natalie.vock@gmx.de>

cgroup,cgroup/dmem: Add (dmem_)cgroup_common_ancestor helper

This helps to find a common subtree of two resources, which is important
when determining whether it's helpful to evict one resource in favor of
another.

To facilitate this, add a common helper to find the ancestor of two
cgroups using each cgroup's ancestor array.

Signed-off-by: Natalie Vock <natalie.vock@gmx.de>

drm/ttm: Add dmem cgroup support for VRAM management

Add dmem cgroup integration to TTM for kernel 6.6, adapting the
VRAM management improvements from pixelcluster's dmemcg-aggressive-protect
branch for the 6.6 TTM code structure.

Key changes:
- Add ttm_bo_alloc_state for tracking allocation state (charge_pool,
  limit_pool, in_evict, may_try_low)
- Add ttm_bo_alloc_at_place() for dmem-aware allocation attempts
- Add ttm_resource_try_charge() for pre-charging cgroups before
  resource allocation
- Split cgroup charge from resource allocation in ttm_resource_alloc
- Add dmem cgroup pool state (css) to ttm_resource
- Add dmem cgroup region (cg) to ttm_resource_manager
- Add ttm_mem_evict_first_dmem() that skips protected BOs during eviction
- Add ttm_bo_evict_valuable_dmem() for cgroup-aware eviction decisions
  using common ancestor for correct protection calculation
- Be more aggressive when allocating below dmem cgroup protection limits
- Retry eviction with low-protected BOs when may_try_low is set

This is a functional equivalent of pixelcluster's patches 3-6, adapted
for 6.6's TTM eviction mechanism (ttm_mem_evict_first instead of
ttm_bo_evict_alloc with ttm_lru_walk).

Original patches by Natalie Vock <natalie.vock@gmx.de>
Adapted for deepin-community/kernel linux-6.6.y branch.

sourcery-ai Bot commented Jun 18, 2026

Reviewer's Guide

Backports the dmem cgroup controller and its page-counter protection logic from newer kernels into 6.6 and wires it into TTM VRAM allocation/eviction paths so that VRAM usage honors dmem.low/min protections, preferentially evicting unprotected BOs and keeping protected BOs in VRAM.

Sequence diagram for dmem-aware VRAM allocation and eviction in TTM

sequenceDiagram
    participant Proc as Process
    participant BO as ttm_bo_mem_space
    participant Alloc as ttm_bo_alloc_at_place
    participant RMCharge as ttm_resource_try_charge
    participant DmemCharge as dmem_cgroup_try_charge
    participant ResAlloc as ttm_resource_alloc
    participant BelowMin as dmem_cgroup_below_min
    participant BelowLow as dmem_cgroup_below_low
    participant Force as ttm_bo_mem_force_space
    participant EvictDmem as ttm_mem_evict_first_dmem
    participant EvictVal as dmem_cgroup_state_evict_valuable

    Proc->>BO: ttm_bo_mem_space(bo, placement, mem, ctx)
    loop placement entries
        BO->>Alloc: ttm_bo_alloc_at_place(bo, place, ctx, force_space=false)
        Alloc->>RMCharge: ttm_resource_try_charge(bo, place, &charge_pool, &limit_pool)
        RMCharge->>DmemCharge: dmem_cgroup_try_charge(region, size, &pool, &limit_pool)
        DmemCharge-->>RMCharge: 0 or -EAGAIN
        RMCharge-->>Alloc: 0 or -EAGAIN
        alt charge succeeded
            Alloc->>BelowMin: dmem_cgroup_below_min(NULL, charge_pool)
            Alloc->>BelowLow: dmem_cgroup_below_low(NULL, charge_pool)
            Alloc->>ResAlloc: ttm_resource_alloc(bo, place, res, charge_pool)
            ResAlloc-->>Alloc: 0 or -ENOSPC
            alt allocated
                Alloc-->>BO: 0
                BO-->>Proc: success
            else no space but may evict
                Alloc-->>BO: -EBUSY
            end
        else hit dmem limit (-EAGAIN)
            Alloc-->>BO: mapped to -EBUSY or -ENOSPC
        end
    end

    alt BO gets -EBUSY
        BO->>Force: ttm_bo_mem_force_space(bo, place, mem, ctx, alloc_state)
        loop evict until space
            alt manager has cg
                Force->>EvictDmem: ttm_mem_evict_first_dmem(bdev, man, place, ctx, ticket, alloc_state)
                EvictDmem->>EvictVal: dmem_cgroup_state_evict_valuable(limit_pool, test_pool, try_low, &hit_low)
                EvictVal-->>EvictDmem: true/false (evict or skip protected BO)
            else
                Force->>EvictDmem: ttm_mem_evict_first(bdev, man, place, ctx, ticket)
            end
        end
        Force->>ResAlloc: ttm_resource_alloc(bo, place, mem, charge_pool)
        ResAlloc-->>Force: 0
        Force-->>BO: 0
        BO-->>Proc: success
    end

File-Level Changes

Change: Introduce page_counter_calculate_protection and wire it into the page_counter API as a shared primitive for memory protection calculations.
Details:
  • Add effective_protection() and page_counter_calculate_protection() implementations to mm/page_counter.c, using emin/elow state derived from min/low and tree usage
  • Export page_counter_calculate_protection and declare it in include/linux/page_counter.h, guarded by CONFIG_MEMCG

Change: Add a new dmem cgroup controller for device memory accounting, including region registration, per-cgroup pools, protection-based eviction helpers, and cgroup interface files.
Details:
  • Introduce kernel/cgroup/dmem.c implementing dmem_cgrp_subsys, dmemcg_state, dmem_cgroup_region and dmem_cgroup_pool_state, with RCU and spinlock based lifetime management
  • Provide APIs in include/linux/cgroup_dmem.h for registering/unregistering regions, charging/un-charging pools, querying protection (below_low/min), eviction decisions, and finding common ancestors
  • Hook the controller into cgroup core: add CONFIG_CGROUP_DMEM, SUBSYS(dmem), build rules, and dmem v2 interface files (capacity/current/min/low/max)
  • Implement dmem_cgroup_state_evict_valuable(), dmem_cgroup_below_min/low(), and dmem_cgroup_get_common_ancestor() using page_counter_calculate_protection and cgroup_common_ancestor()
Files:
  • kernel/cgroup/dmem.c
  • include/linux/cgroup_dmem.h
  • include/linux/cgroup.h
  • include/linux/cgroup_subsys.h
  • kernel/cgroup/Makefile
  • init/Kconfig
  • Documentation/admin-guide/cgroup-v2.rst

Change: Integrate dmem cgroup accounting into TTM resource management, adding charge-aware allocation and eviction that respects dmem protections.
Details:
  • Extend ttm_resource_manager with a dmem_cgroup_region* cg field and ttm_resource with a dmem_cgroup_pool_state* css field, using DEEPIN_KABI_RESERVE slots where applicable
  • Add ttm_resource_try_charge() to pre-charge a region before allocation and pass the resulting pool into a new ttm_resource_alloc() signature that records css on the resource and uncharges on free
  • Introduce ttm_bo_alloc_state to carry charge/limit pools and eviction state through allocation, and ttm_bo_alloc_at_place() to combine charging, protection checks, and allocation
  • Modify ttm_bo_mem_space() and ttm_bo_mem_force_space() to use ttm_bo_alloc_state, ttm_resource_try_charge, and a dmem-aware eviction loop, handling -EAGAIN/-EBUSY and uncharge/pool_state_put paths correctly
  • Implement ttm_bo_evict_valuable_dmem() and ttm_mem_evict_first_dmem() so eviction walks skip protected BOs and only consider low-protected BOs when allowed, using dmem_cgroup_state_evict_valuable and common ancestor pools
  • Adjust existing ttm paths (e.g., swapout) to call the updated ttm_resource_alloc() with NULL charge_pool for system placements
Files:
  • drivers/gpu/drm/ttm/ttm_bo.c
  • drivers/gpu/drm/ttm/ttm_resource.c
  • include/drm/ttm/ttm_resource.h

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign avenger-285714 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@deepin-ci-robot

Hi @deepin-wm. Thanks for your PR.

I'm waiting for a deepin-community member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • The signature change to ttm_resource_alloc (new charge_pool parameter) will break any out-of-tree users; consider adding a static inline wrapper with the old three-argument signature that forwards to the new helper with a NULL pool to preserve the existing API surface.
  • In ttm_resource_free(), the uncharge uses bo->base.size rather than the resource size; if a manager allocates with alignment / padding or partial usage, this may mismatch the charged amount, so it would be safer to uncharge based on the resource’s actual size field instead of bo->base.size.
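
The first suggestion (keeping the old three-argument entry point as a thin wrapper) might look roughly like the sketch below. This is a userspace model, not TTM code: all struct types are simplified stand-ins, and the names `res_alloc_with_pool` and `res_alloc` are hypothetical. Because C has no overloading, the four-argument variant needs its own name for the old one to survive unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the TTM and dmem types involved. */
struct bo_model { size_t size; };
struct place_model { int mem_type; };
struct pool_model { size_t charged; };
struct res_model {
    size_t size;
    struct pool_model *css; /* pool to uncharge when the resource is freed */
};

/* New-style allocator that records the pre-charged pool (hypothetical name). */
static int res_alloc_with_pool(const struct bo_model *bo,
                               const struct place_model *place,
                               struct res_model *res,
                               struct pool_model *charge_pool)
{
    assert(bo != NULL && res != NULL);
    (void)place; /* placement is irrelevant to this sketch */

    res->size = bo->size;
    res->css = charge_pool;
    return 0;
}

/* Old three-argument signature, forwarding with no pool to charge. */
static inline int res_alloc(const struct bo_model *bo,
                            const struct place_model *place,
                            struct res_model *res)
{
    return res_alloc_with_pool(bo, place, res, NULL);
}
```

Callers that know nothing about dmem keep calling the three-argument form and get an uncharged resource, while cgroup-aware paths call the four-argument form directly.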

3 participants