Improve build system and CI/CD pipeline#140
Conversation
There was a problem hiding this comment.
Review completed
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Re-trigger cubic
80f204d to
04719ac
Compare
There was a problem hiding this comment.
3 issues found across 36 files
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Re-trigger cubic
f735efa to
40a4254
Compare
|
@Mes0903 I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
No issues found across 37 files
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Re-trigger cubic
40a4254 to
d182666
Compare
d182666 to
fb64e4a
Compare
Replace 'scripts/build-image.sh' with a source-build path organized around artifact classes. 'scripts/build-artifacts.sh' becomes the local user interface, while the actual source recipes live under 'scripts/prebuilt' and can be shared by local builds and CI. The local CLI owns artifact selection and local cleanup policy. It parses requests such as 'image', 'rootfs', 'test-tools', and 'all', handles flags such as '--clean-build' and '--full-rebuild', and decides whether to derive 'ext4.img'. 'scripts/prebuilt/artifact-recipes.sh' owns the Buildroot and Linux source operations, publishes raw artifacts, and keeps recipe-specific work out of the 'Makefile'. Default rootfs and test-tools rootfs now use separate Buildroot output directories. The default output produces 'rootfs.cpio' and the local 'ext4.img'. The test-tools output produces 'test-tools.img' with the selected payloads. This keeps default rootfs artifacts stable when the test-tools recipe adds a larger filesystem payload. The test-tools recipe is explicit. 'scripts/prebuilt/test-tools-recipe.sh' normalizes selections such as 'x11' and 'directfb2', and 'scripts/build-artifacts.sh' accepts flags like '--x11' and '--directfb2-test' directly. DirectFB2 and X11 are independent recipe entries. Local Make keeps the file-target model for guest artifacts. 'Image', 'rootfs.cpio', and 'test-tools.img' are ordinary targets from 'mk/external.mk', and missing files are fetched through the shared release download helper. Source rebuilds are requested through 'make build-artifacts', so changing a guest config means the developer explicitly asks the source-build path to refresh local artifacts. Cleanup semantics are made explicit as well. Generated stamp files move under '.stamps', 'make clean' removes Buildroot output directories, 'distclean' removes external source trees, '--clean-build' refreshes selected outputs while keeping source checkouts and download cache, and '--full-rebuild' starts from freshly cloned pinned revisions. This keeps the fast local Buildroot-cache workflow available while still giving developers a complete rebuild path when they need one. Move recipe pins into 'artifact-recipe.env' and rename the Meson RISC-V cross file to make its role clear. CI recipe keys own release freshness; local Make is responsible for existing files, missing-file downloads, and explicit source rebuilds.
Replace the old PR drift job and standalone prebuilt workflow with a resolver/materializer model inside the main workflow. The new flow treats guest artifacts as a planned input to the test jobs: first decide where every class should come from, then materialize exactly that plan before tests or publishing consume it. The resolver computes the current recipe key for each registered artifact class from '.ci/prebuilt/artifact-inputs.sh'. It compares those keys with the rolling release manifest and with any '.stamps/prebuilt-local' entries restored from the workflow-local raw cache. The output is a 'KEY=VALUE' plan containing one action per class: 'use-action-cache', 'download-release', or 'build'. The materializer executes that plan without making new policy decisions. Restored cache entries are accepted when their raw artifact and recipe-key stamp match the current recipe key. Release-backed entries are downloaded through the shared transport helper. Build entries are passed to '.ci/prebuilt/build-plan-artifacts.sh', which calls the same 'scripts/prebuilt' source recipes used by local 'make build-artifacts'. The workflow-local cache stores raw artifacts plus '.stamps/prebuilt-local'. The rolling release stores compressed '.bz2' artifacts plus 'prebuilt.sha1'. CI uses raw artifacts for fast same-workflow or nearby-workflow reuse, and publishes a compact rolling release for local Make and test jobs that consume prebuilt artifacts. Move package and verification work into '.ci/prebuilt/package.sh' and '.ci/prebuilt/verify-package.sh'. 'guest-artifact-build' uploads raw artifacts for test jobs and packaged artifacts for 'publish-prebuilt'. The publish job runs on master after the required tests have exercised the same materialized artifacts and the resolver has reported that the rolling release needs an update. The workflow layout is cleaned up at the same time. Device smoke scripts move under '.ci/device-smoke', GitHub output and formatting helpers move under '.ci/github', and 'append-github-output.sh' validates the requested outputs before appending them. Publishing follows the main workflow's resolved artifact plan, so the old '.ci/publish-prebuilt.sh' and '.github/workflows/prebuilt.yml' are removed. Add '.github/actions/provide-guest-artifacts' so Linux and macOS test jobs consume artifacts through the same contract. Test jobs either download a current rolling release or download the raw artifact bundle from the current workflow run. 'PREBUILT_FORBID_BUILD' keeps test jobs from compiling guest artifacts after the build job has already made the release/cache/build decision.
Extend the prebuilt release manifest so it records transport integrity as well as source recipe state. 'prebuilt.sha1' now contains sha1sum entries for each compressed artifact, followed by the virtual '*.recipe-key' entries used by the resolver. The two kinds of entries answer different questions and are intentionally kept in the same manifest. Recipe keys decide freshness. The resolver compares the current class recipe key with the release recipe key to decide whether the rolling release matches the current checkout. Archive checksums verify that the downloaded '.bz2' bytes match the manifest the resolver and materializer are using. Update '.ci/prebuilt/package.sh' to write archive checksum entries for every class output before writing the recipe-key entries. Update '.ci/prebuilt/verify-package.sh' to require both entry types and to recompute each compressed archive checksum before publish. This catches packaging mistakes before the rolling release is updated. Make 'scripts/prebuilt/download-release-artifact.sh' fetch 'prebuilt.sha1' before downloading an artifact. The helper verifies the archive checksum before decompression and can also compare the manifest recipe key with an expected key supplied by CI. Local Make gets byte verification for downloaded release artifacts, and CI gets an extra guard against release-manifest or asset drift between resolve and materialize. Pass the expected class recipe key from '.ci/prebuilt/materialize-artifacts.sh' when materializing a 'download-release' action. After the helper verifies the archive and recipe key, the materializer decompresses the raw artifact and stamps '.stamps/prebuilt-local' with the current recipe key. A mismatch fails before stale bytes are marked as current. Update '.ci/prebuilt/resolve-artifacts.sh' so release recipe keys are accepted from manifests that contain archive checksum entries for the class outputs. A manifest that lacks those entries is outside the current release contract and resolves as stale, which drives CI toward cache or source build. Remove the unused 'release_manifest_available' output from the resolver and the materializer's ignored-key list. Also document the bootstrap boundary in code: forks and mirrors may bootstrap from source when 'prebuilt.sha1' returns HTTP 404, while network failures and other HTTP errors remain fatal so external release-state problems stay visible.
Add 'docs/build-system-and-ci.md' as the build-system and CI overview. The document follows the execution flow through local Make file targets, explicit source builds, the shared artifact class contract, CI recipe keys, resolver decisions, materialization, packaging, testing, publishing, and fork/mirror bootstrap behavior. Update 'README.md' to match the same model. The first-run instructions describe missing guest artifacts as release downloads, local downloads verify the archive checksum in 'prebuilt.sha1', and source rebuilds stay explicit through 'make build-artifacts'. The artifact-build section describes local Make behavior and CI recipe keys as separate responsibilities. Update 'docs/networking.md' after moving device smoke scripts under '.ci/device-smoke'.
PR sysprog21#141 discussed a kernel panic where Linux saw 'vda' from 'virtio-blk' but still reported root device "". The manual run used a DTB generated for the legacy initramfs path, so its bootargs did not provide 'root=/dev/vda'. The command only attached '-d ext4.img' and did not pass '-i rootfs.cpio'. The default 'make check' paths already avoid this. External-root builds generate bootargs with 'root=/dev/vda' and pass '-d ext4.img'; the fakeroot fallback and explicit 'ENABLE_EXTERNAL_ROOT=0' path pass '-i rootfs.cpio' together with the initramfs DTB. Keep custom DTBs outside these default 'minimal.dtb' checks. For the built-in default-DTB cases, validate the option combination before boot so manual invocations get an actionable 'semu' error instead of reaching the guest VFS panic.
macOS runners can fail to materialize 'ext4.img' when the default 'cpio' is not compatible with the fakeroot environment used by 'rootfs_ext4.sh'. The build system already decides whether external root boot is enabled, so the image helper should report missing tools and let callers choose the cpio implementation instead of embedding host-specific fallback policy. Add 'ROOTFS_CPIO' as the Makefile and script interface, and pass it through the fakeroot child with the other explicit environment inputs. The helper now validates both 'ROOTFS_CPIO' and 'MKFS_EXT4', keeps cpio extraction and mkfs in one fakeroot process, and checks that the extracted rootfs contains executable '/sbin/init' before tolerating non-fatal cpio warnings. Use the interface in the macOS CI setup by installing Homebrew 'cpio' and exporting its binary path. The default remains 'cpio' for normal local builds.
fb64e4a to
184be80
Compare
| local inode | ||
|
|
||
| inode=$(ls -id "$BUILDROOT_LOCK_DIR" 2>/dev/null) || inode=missing | ||
| inode=${inode%% *} |
There was a problem hiding this comment.
Since the OS can recycle pid and reuse inodes, is there a risk that this might lead to subtle race conditions?
There was a problem hiding this comment.
I checked this again and the concern is valid. This lock is used to serialize source builds that touch the shared buildroot/ tree. The outer artifact flow may request Image, rootfs.cpio, and test-tools.img separately, but all of those paths can use or mutate Buildroot state such as the checkout, config, host toolchain, and output directories. The lock is there so those sections do not run concurrently while still allowing Buildroot itself to use its own internal parallelism.
The difficult part is automatic stale-lock recovery. The current code uses the lock directory inode plus the recorded PID as a best-effort identity before removing a stale lock. That is not a strict guarantee because both inode numbers and PIDs can be reused. More importantly, with portable shell/filesystem operations we do not have an atomic "remove this lock only if it is still owned by the same observed owner" primitive. A separate read/compare followed by rm can still race with another process that removed the stale lock and then created a fresh one.
The stale-lock case we are trying to recover from should only happen when an artifact source build is interrupted after acquiring the lock, for example when the user kills the build, the shell exits unexpectedly, or a CI runner is terminated at the wrong time. That should be much less common than normal lock contention between concurrent artifact builds.
Given that, I plan to remove the best-effort automatic stale-lock deletion. The portable path will use an atomic symlink lock whose target contains an owner token, so acquiring the lock and publishing the owner identity happen in one operation. Normal release will remove the symlink only when the token still matches the current process. If another process observes a stale owner, it will print an actionable diagnostic asking the user/CI job to remove the lock after confirming that no artifact source build is still running, instead of deleting the lock automatically.
This keeps the common path cross-platform and avoids relying on reusable PID/inode identity for correctness.
|
This is a huge. |
Improve build system and CI/CD pipeline
This pull request restructures the guest artifact build flow used by semu.
The main reason for this change is the ordering problem in the old CI/CD flow. When a branch was merged into
master, the device tests could request the rolling release prebuilt artifacts before the new prebuilt release had been built and published. That meant the tests could boot with stale release artifacts even though the current checkout required new guest artifacts, so the merge-time CI/CD path could fail for reasons unrelated to the emulator changes being tested.The old PR flow also compared guest artifact inputs against the
masterbranch. When a PR changed a guest config, every later push to that PR could trigger another source rebuild, even if the new commit did not touch any guest config or recipe input. This made PR iteration expensive because the decision was tied to branch drift instead of the exact artifact recipe required by the current checkout.The old flow also made the build and test pipeline depend heavily on GitHub-specific behavior. Cache restore, workflow artifacts, release downloads, publish decisions, and test setup were coupled together in the workflow. This PR redraws that architecture around clearer layers: source-build recipes, artifact resolution, artifact materialization, packaging/publishing, and test consumption. GitHub Actions remains the current execution environment, but the GitHub-specific pieces are now adapters around those layers rather than the place where the artifact policy lives.
Local development stays close to Make’s file-target model. Missing guest artifacts can still be downloaded from the rolling prebuilt release, while source rebuilds are explicit through
make build-artifacts. This keeps the normal local path predictable and preserves the Buildroot cache workflow developers rely on when iterating on guest configuration.CI owns the artifact decision model. The workflow computes recipe keys for the current checkout, compares them with restored workflow-local raw artifacts and the rolling release manifest, then materializes a plan as cache reuse, release download, or source build. The resolver decides where artifacts should come from, and the materializer performs that plan.
The rolling release contract is also tightened. Recipe keys continue to decide whether a release artifact matches the current source recipe, while archive checksums verify that downloaded
.bz2bytes match the published manifest before decompression and stamping. These checks cover different failure modes, so the release manifest now records both archive checksum entries and virtual recipe-key entries.Local build flow
make build-artifactsreplaces the old image-oriented source build entrypoint with an artifact-class based flow. The local CLI selects which classes to build, handles cleanup policy, and delegates the actual source recipes to shared helpers underscripts/prebuilt/.The default rootfs and test-tools rootfs now use separate Buildroot output directories. This keeps optional test payloads such as X11 and DirectFB2 from changing the default
rootfs.cpioand derivedext4.imgpath.ext4.imgremains a local derived artifact built fromrootfs.cpio.Test-tools payload selection is explicit. Options such as
--x11and--directfb2-testare passed directly to the build command, and DirectFB2 no longer implicitly selects X11.CI artifact flow
CI is split into resolution and materialization. The resolver reads the current recipe inputs, restored cache stamps, and release manifest, then emits a plan for each artifact class. The materializer executes that plan without making new policy decisions.
Workflow-local raw artifacts are tracked with
.stamps/prebuilt-local/*.recipe-key. Those stamps allow restored CI cache entries to be checked against the current checkout before reuse. Test jobs consume either the raw artifact bundle produced by the current workflow run or the rolling prebuilt release through the same composite action.Publishing is part of the main workflow. After the required tests run against the same materialized artifact set, the publish job verifies the packaged artifacts and updates the rolling
prebuiltrelease when the resolver reports release drift.Release manifest
prebuilt.sha1now records both compressed archive checksums and recipe-key entries. The archive checksum entries verify transport integrity forImage.bz2,rootfs.cpio.bz2, andtest-tools.img.bz2. The recipe-key entries describe the source recipe that produced each artifact class.Local release downloads verify archive bytes before decompression. CI release materialization additionally verifies that the manifest recipe key still matches the resolver plan before stamping the raw artifact as current.
Fork and mirror repositories can bootstrap from source when the rolling release manifest returns HTTP 404. Other release download failures remain visible as release or network state problems.
Documentation
The README now describes missing guest artifacts as release downloads and source guest rebuilds as an explicit
make build-artifactsoperation.docs/build-system-and-ci.mddocuments the local build path, CI artifact contract, resolver/materializer model, package and publish flow, and fork/mirror bootstrap behavior.The networking document is updated to point at the moved device smoke script path.
Summary by cubic
Rebuilt the guest artifact build and CI/CD pipeline around artifact classes with a recipe‑keyed resolver/materializer and verified release archive checksums to prevent stale prebuilts and speed up PRs. Added selectable
ROOTFS_CPIOfor reliableext4.imgon macOS and early validation of defaultminimal.dtbboot options.Refactors
make build-artifactsusingscripts/prebuilt/*; separate Buildroot outputs for default rootfs vs. test-tools; explicit test-tools flags (--x11,--directfb2-test).use-action-cache/download-release/build) with stamps in.stamps/prebuilt-local; tests get artifacts viaprovide-guest-artifacts; removed the standaloneprebuiltworkflow.prebuilt.sha1records*.bz2checksums and*.recipe-key; downloads verify bytes and expected recipe; package/verify via.ci/prebuilt/package.shandverify-package.sh.ROOTFS_CPIOselection with Homebrewcpioon macOS; preflight checks for defaultminimal.dtbwhen using-i/-dandroot=/dev/vda.Migration
make checkdownloads verified prebuilts; rebuild withmake build-artifacts [image|rootfs|test-tools]plus--x11/--directfb2-test,--clean-build, or--full-rebuild. On macOS, setROOTFS_CPIOif needed.prebuiltrelease publishes from the main workflow onmasterafter tests;masterruns aren’t canceled by concurrency.PREBUILT_URLdefaults to this repository; CI bootstraps from source when the release manifest returns 404.Written for commit 184be80. Summary will update on new commits.