oci: support read-only root fs and non-root user #6599

Open: dermetfan wants to merge 4 commits into master from oci-ro

Conversation

dermetfan (Contributor) commented on Jun 10, 2026

Description

Fixes #6470.
Fixes #6484.

  1. The RTS stats flag is no longer emitted unconditionally. It is emitted only when profiling or the eventlog is actually enabled, so by default no stats file is written at shutdown. A new profilingOutputDir option prefixes those output paths; the image sets it to /logs so the output lands on a writable mount.
  2. Env-snapshot writer relocated. writeRootEnv now writes /tmp/cardano-env instead of /usr/local/bin/env, and a symlink preserves the legacy path. Unused variables left over from the long-removed topologyUpdater are dropped.
  3. Merge-mode writes relocated to /tmp/cardano-…-merged.json. The jq merge rewrites relative file references to absolute paths so they still resolve.
  4. Mount points are now group-writable, because fresh docker/podman volumes inherit the permissions of the image's mount-point directories. K8s users can also set securityContext.fsGroup: 0.
  5. Pre-flight check. Entrypoints bail out if /tmp is not writable.

Probably easiest to look at the commits one by one. The commit messages go into more detail.

Checklist

  • Commit sequence broadly makes sense and commits have useful messages
  • New tests are added if needed and existing tests are updated.
  • Any changes are noted in the CHANGELOG.md for affected package
  • The version bounds in .cabal files are updated
  • CI passes. See note on CI. The following CI checks are required:
    • Code is linted with hlint. See .github/workflows/check-hlint.yml to get the hlint version
    • Code is formatted with stylish-haskell. See .github/workflows/stylish-haskell.yml to get the stylish-haskell version
    • Code builds on Linux, MacOS and Windows for ghc-9.6 and ghc-9.12
  • Self-reviewed the diff

dermetfan added 2 commits June 9, 2026 19:50
The cardano-node and cardano-tracer service modules unconditionally emitted
`--machine-readable -tcardano-node.stats -pocardano-node` in the default
`profilingArgs`, regardless of `cfg.profiling`. GHC's `-tFILE` always writes the
stats file on shutdown, which broke startup on read-only filesystems even
though no profiling was configured. Gate the three flags on
`cfg.profiling != "none" || cfg.eventlog` so the default RTS command line is
empty when nothing is requested.

When profiling is enabled, the same flags now consult a new option,
`services.cardano-node.profilingOutputDir` (and the tracer equivalent), to
prefix the output file paths. The option defaults to null, preserving today's
relative-path behavior on NixOS where systemd's `WorkingDirectory` equals
`cfg.stateDir`. `scripts.nix` sets it to `/logs` for the OCI script wrappers, so
profile output under the read-only OCI image lands on the writable `/logs`
mount.
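
The gating described above can be sketched in shell, purely for illustration. The actual change is a Nix expression in the service modules; the helper name `rts_profiling_args` and its positional arguments are hypothetical, and only the three flags quoted in this commit message are taken from the source.

```shell
# Illustrative sketch only; the real logic lives in the Nix service modules.
# rts_profiling_args PROFILING EVENTLOG [OUTPUT_DIR]   (hypothetical helper)
#   PROFILING   "none" or the name of a profiling mode
#   EVENTLOG    "true" or "false"
#   OUTPUT_DIR  optional prefix; empty keeps the paths relative
rts_profiling_args() {
  profiling=$1
  eventlog=$2
  output_dir=${3:-}

  # Nothing requested: emit no flags, so no stats file is ever written.
  if [ "$profiling" = "none" ] && [ "$eventlog" != "true" ]; then
    return 0
  fi

  # Prefix the output paths only when a directory was configured.
  prefix=""
  if [ -n "$output_dir" ]; then
    prefix="$output_dir/"
  fi

  printf '%s\n' "--machine-readable -t${prefix}cardano-node.stats -po${prefix}cardano-node"
}
```

With no output directory the paths stay relative, which mirrors the null default described above; passing `/logs` mirrors what the OCI script wrappers are said to do.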

In response to #6470.

`run-node` and `run-tracer` wrote a sourceable env file to `/usr/local/bin/env` at
every startup, which fails under `--read-only` / `readOnlyRootFilesystem`. The
file was introduced in d9c8317 (#2801) to feed the topologyUpdate script added the same day in
c652f10. topologyUpdate was removed in 56266c0 and never reintroduced;
the writer was preserved by accident when 34a4796 re-created the docker
context, along with a stale `# Mapping for topologyUpdater` comment. Across
all branches, there is no consumer besides the deleted topologyUpdater.

The `CARDANO_*` snapshot remains useful for operators that exec into the
container and want the resolved (post-defaults) config in a shell, so the
writer is kept but redirected to `/tmp/cardano-env` (writable when `/tmp` is
mounted as tmpfs/emptyDir). A build-time symlink at `/usr/local/bin/env ->
/tmp/cardano-env` keeps the legacy path resolving in case any out-of-tree
consumer depends on it.

The variables that existed only for topologyUpdater are dropped from the
snapshot.
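
A minimal sketch of such a snapshot writer, assuming the snapshot is simply every exported `CARDANO_*` variable written as a sourceable `export` line. The function name `write_env_snapshot` is a hypothetical stand-in and is not the entrypoints' actual `writeRootEnv`, whose variable list and output format may differ.

```shell
# Hypothetical stand-in for the entrypoint's writer, not the real writeRootEnv.
# Writes every exported CARDANO_* variable to a sourceable file.
write_env_snapshot() {
  target=${1:-/tmp/cardano-env}

  # Truncate (or create) the snapshot; fail early if the path is not writable.
  : > "$target" || return 1

  # Variable names are restricted to [A-Za-z0-9_], so the eval below is safe.
  for name in $(env | sed -n 's/^\(CARDANO_[A-Za-z0-9_]*\)=.*/\1/p'); do
    eval "value=\$$name"
    # Escape embedded single quotes so the file stays safely sourceable.
    escaped=$(printf '%s' "$value" | sed "s/'/'\\\\''/g")
    printf "export %s='%s'\n" "$name" "$escaped" >> "$target"
  done
}
```

An operator who execs into the container could then run `. /tmp/cardano-env` to load the resolved values into the current shell.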

The README's new "Read-Only Root Filesystem" section documents the required
writable mounts, the new env-snapshot path, and how custom-mode operators
should direct any profile output to a writable mount.

In response to #6470.

dermetfan marked this pull request as ready for review June 10, 2026 14:01
dermetfan requested a review from a team as a code owner June 10, 2026 14:01

The entrypoints wrote `{{,tracer-}config,topology}-merged.json` to
`/opt/cardano/config/$NETWORK/`, which is image content owned by root.
Non-root containers and read-only-root mounts both fail on this write.

Redirect the writes to `/tmp/cardano-{{,tracer-}config,topology}-merged.json`.

The upstream network config keeps relative refs to genesis /
peer-snapshot files, all resolved relative to the config file's own
directory. Naively moving the merged output to `/tmp` would break those
refs. The jq merge now rewrites them to absolute paths anchored at
`/opt/cardano/config/$NETWORK/`. The key match uses a regex
`(test("GenesisFile$|^CheckpointsFile$"))` so future protocol-era keys
pick up the rewrite without code changes.
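
The rewrite step alone might look roughly like the sketch below. It reuses the regex quoted above, but the surrounding jq program, the function name `absolutize_refs`, and the choice to leave already-absolute values untouched are assumptions for illustration; the entrypoints' actual merge may be structured differently.

```shell
# Hypothetical helper: reads a config JSON object on stdin and rewrites matching
# relative file references to absolute paths under CONFIG_DIR.
# Usage: absolutize_refs /opt/cardano/config/$NETWORK < config.json
absolutize_refs() {
  config_dir=$1
  jq --arg dir "$config_dir" '
    with_entries(
      if (.key | test("GenesisFile$|^CheckpointsFile$"))
         and (.value | type == "string")
         and (.value | startswith("/") | not)
      then .value = "\($dir)/\(.value)"   # anchor the relative ref at the config dir
      else .                              # leave every other key untouched
      end
    )
  '
}
```

Matching on the key name rather than on a fixed list is what would let a new `...GenesisFile` key pick up the rewrite without further changes.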

Both entrypoints also bail out early with a clear error message if
`/tmp` is not writable, instead of letting the failure cascade into a
less-obvious jq write error.
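
A pre-flight check of this kind can be sketched as follows. The function name `require_writable_dir` and the error wording are hypothetical; the sketch probes by actually creating a file, which is one possible approach and not necessarily the one the entrypoints use.

```shell
# Hypothetical pre-flight helper: fail fast if DIR cannot be written to.
require_writable_dir() {
  dir=$1
  probe="$dir/.write-test.$$"

  # Try to create a probe file in a subshell so a failed redirection
  # cannot abort the calling script.
  if ! ( : > "$probe" ) 2>/dev/null; then
    echo "ERROR: $dir is not writable; mount a tmpfs or emptyDir there" >&2
    return 1
  fi

  rm -f "$probe"
}
```

An entrypoint could call `require_writable_dir /tmp || exit 1` before any merge output is written.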

In response to #6484.

The mount-point directories `/data`, `/ipc`, `/logs` were created owned by root.
When a volume was first mounted there, Docker propagated the permissions
from the image, so a container running as non-root could not write.

Apply `chmod g+w` to open up the permissions enough for:

- Kubernetes `runAsUser` (assigns primary group 0 by default)
- Docker `--user <uid>` (assigns primary group 0 by default)
- OpenShift's arbitrary-UID assignment (random UID, GID 0)
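
The effect of the permission change can be sketched in shell. The image itself is assembled by the Nix build, so this is illustrative only; the function name `prepare_mount_points` and its root-directory argument are hypothetical.

```shell
# Hypothetical sketch: create the mount points under ROOT and make them
# group-writable, so a non-root UID whose primary group is 0 can write.
prepare_mount_points() {
  root=$1
  for name in data ipc logs; do
    mkdir -p "$root/$name"
    chmod g+w "$root/$name"
  done
}
```

A fresh volume mounted at one of these paths then starts out with the group-write bit set, which is what the three cases listed above rely on.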

For the tracer image, also add `mkdir -p data` and the matching
symlink at `/opt/cardano/data` for parity with the node image's convention.
`run-tracer` defaults `CARDANO_STATE_DIR` to `/data/tracer` and assumed that
this symlink existed; this change finally makes that true. The default state
dir now works for non-root users without an explicit `/data` mount.

In response to #6484.

dermetfan changed the title from "oci: support read-only root filesystem" to "oci: support read-only root fs and non-root user" on Jun 12, 2026

Development

Successfully merging this pull request may close these issues.

[FR] Support running Docker image as non-root user
[BUG] - Docker image incompatible with readOnlyRootFilesystem (Kubernetes security best practice)
