fix(cuda): include cooperative_groups/reduce.h for CUDA 13 (libkernels.so build) by dndungu · Pull Request #123 · zerfoo/ztensor

dndungu · 2026-06-09T01:27:15Z

Problem

On CUDA 13 (nvcc 13.x), make shared fails compiling gemv_q4k_sm121.cu:

error: namespace "cooperative_groups" has no member "plus"
    acc = cg::reduce(warp, acc, cg::plus<float>());

Under CUDA 13, cg::reduce and cg::plus are no longer reachable through <cooperative_groups.h> alone; they are declared in <cooperative_groups/reduce.h>. gemv_q4k_sm121.cu included only the former, so the whole libkernels.so build breaks on CUDA 13 toolchains (e.g. the GB10 DGX, CUDA 13.0).

Fix

Add #include <cooperative_groups/reduce.h> to gemv_q4k_sm121.cu. Verified: make shared CUDA_ARCH=sm_121 builds cleanly on the GB10 DGX (CUDA 13.0) after this change. No behavior change on older CUDA: the header has shipped since CUDA 11.0, so the extra include is purely additive there.
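For reference, a minimal standalone sketch of the include pattern this PR applies. It is not code from gemv_q4k_sm121.cu: the kernel name warp_sum_kernel and the host helper run_warp_sum are hypothetical, and the only line taken from the repository is the cg::reduce call shown in the error above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>
#include <cooperative_groups/reduce.h>  // declares cg::reduce and cg::plus

namespace cg = cooperative_groups;

// Hypothetical kernel (illustration only): lane i contributes i + 1,
// then the warp-wide sum is computed with cg::reduce.
__global__ void warp_sum_kernel(float* out) {
    cg::thread_block block = cg::this_thread_block();
    cg::thread_block_tile<32> warp = cg::tiled_partition<32>(block);

    float acc = static_cast<float>(warp.thread_rank() + 1);
    // Same call shape as the line that failed to compile in the PR.
    acc = cg::reduce(warp, acc, cg::plus<float>());

    if (warp.thread_rank() == 0) {
        *out = acc;
    }
}

// Hypothetical host helper: launches one warp and returns the reduced value,
// or -1.0f if a CUDA runtime call fails.
float run_warp_sum() {
    float* d_out = nullptr;
    float h_out = -1.0f;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) {
        return -1.0f;
    }
    warp_sum_kernel<<<1, 32>>>(d_out);
    // cudaMemcpy blocks until the kernel has finished.
    if (cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost) != cudaSuccess) {
        h_out = -1.0f;
    }
    cudaFree(d_out);
    return h_out;
}

int main() {
    // With 32 lanes contributing 1..32, the sum should be 528.
    printf("warp sum = %.1f\n", run_warp_sum());
    return 0;
}
```

Removing the reduce.h include from this sketch should reproduce the same class of error on CUDA 13; with the include present it is expected to build on CUDA 11.0 and later.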

Under CUDA 13, cg::reduce and cg::plus are no longer reachable through <cooperative_groups.h> alone; they are declared in <cooperative_groups/reduce.h>. Without the explicit include, gemv_q4k_sm121.cu fails to compile under nvcc 13.x, which breaks the libkernels.so build. Verified: the sm_121 (GB10) kernel build succeeds with this include added.

dndungu merged commit bcbdd9d into main Jun 9, 2026
1 check passed

dndungu deleted the fix/cuda13-cg-reduce-include branch June 9, 2026 06:06

dndungu merged 1 commit into main from fix/cuda13-cg-reduce-include
