Harden code execution readiness#1
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughAdds PUBLIC_BETA_MODE and sandbox/runtime controls, enforces immutable images and safe Docker hosts in production/public-beta, introduces per-session expires_at and execution_count with persisted Redis TTLs that respect idle and absolute lifetime, enforces per-session execution budgets, updates API responses and health payloads, and adds CI guardrails and tests. ChangesPublic Beta Mode & Session Lifecycle Management
Sequence Diagram(s)sequenceDiagram
participant Client
participant Gateway
participant Docker
participant Redis
participant Sandbox
Note over Client,Gateway: Create session
Client->>Gateway: POST /containers
Gateway->>Gateway: compute expires_at, set execution_count=0
Gateway->>Docker: create container (labels: expires-at, execution-count), set runtime if SANDBOX_RUNTIME
Gateway->>Redis: save session (expires_at, execution_count) with TTL=min(idle,lifetime)
Gateway->>Client: 201 ContainerResponse(expires_at, execution_count, max_executions)
Note over Client,Gateway: Run execution
Client->>Gateway: POST /containers/{id}/run
Gateway->>Redis: load session
Gateway->>Gateway: if hard_expired -> remove container + 404
Gateway->>Gateway: else if execution_count >= MAX_EXECUTIONS_PER_SESSION -> remove container + 429
Gateway->>Gateway: else increment execution_count, update last_activity
Gateway->>Redis: persist updated session (new TTL)
Gateway->>Sandbox: run_code_in_sandbox()
Sandbox->>Client: result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
gateway/app.py (1)
1411-1417:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCI blocker: annotate intentional in-container tmp paths to satisfy Bandit B108.
Bandit is failing on Line 1416, Line 1453, and Line 1457. These paths are intentional for sandbox tmpfs usage, but without explicit suppression the
validatejob stays red.Proposed patch
def build_tmpfs_config() -> dict[str, str]: """Build tmpfs mount configuration for sandbox container.""" owner = f"uid={SANDBOX_UID},gid={SANDBOX_GID}" return { "/home/sandbox": f"size={SANDBOX_HOME_TMPFS_SIZE},mode=0700,{owner}", - "/tmp": f"size={SANDBOX_TMP_ROOT_SIZE},mode=1777,{owner}", + "/tmp": f"size={SANDBOX_TMP_ROOT_SIZE},mode=1777,{owner}", # nosec B108 - intentional in-container tmpfs mount } @@ "environment": { "HOME": "/home/sandbox", "MPLBACKEND": "Agg", - "MPLCONFIGDIR": "/tmp/mpl_cache", + "MPLCONFIGDIR": "/tmp/mpl_cache", # nosec B108 - intentional sandbox temp path "PYTHONDONTWRITEBYTECODE": "1", "PYTHONIOENCODING": "utf-8", "PYTHONUNBUFFERED": "1", - "TMPDIR": "/tmp/misc", + "TMPDIR": "/tmp/misc", # nosec B108 - intentional sandbox temp path },Also applies to: 1450-1458
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@gateway/app.py` around lines 1411 - 1417, Bandit flags the tmp path literals in build_tmpfs_config as B108; explicitly mark these intentional in-container tmpfs mounts by adding Bandit suppression comments (e.g., append " # nosec B108 - intentional in-container tmpfs" or place a function-level "# nosec B108" directive) to the offending lines that define the dict entries for "/home/sandbox" and "/tmp" inside build_tmpfs_config so the validate job stops failing while preserving the explicit paths.tests/test_gateway_unit.py (1)
503-525:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMake the version payload assertion deterministic for
beta.After the version change,
betais env-driven. Hardcoding"beta": Falsecan fail under public-beta envs.Proposed patch
async def test_version_endpoint_returns_execution_metadata(self) -> None: - response = await gateway_app.version() + with mock.patch.dict("os.environ", {"APP_ENV": "development", "PUBLIC_BETA_MODE": "false"}): + response = await gateway_app.version() @@ - "beta": False, + "beta": False,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_gateway_unit.py` around lines 503 - 525, The test hardcodes "beta": False which breaks in beta environments; update test_version_endpoint_returns_execution_metadata to assert the response body's "beta" value against the app's runtime beta flag instead of a literal, e.g. call gateway_app.version(), parse the JSON body and assert body["beta"] == gateway_app.BETA (or gateway_app.is_beta / the runtime beta attribute on gateway_app) while leaving the rest of the expected keys unchanged.
🧹 Nitpick comments (1)
README.md (1)
781-781: 💤 Low valueConsider breaking the monitoring list into bullet points.
The monitoring guidance at line 781 includes many important metrics in a single sentence. While comprehensive and accurate, it could be more readable as a nested bullet list.
♻️ Optional refactor for improved readability
-7. Monitor request rates, execution latency, error rates, `429` responses, active executions, active sessions, session expirations, Redis health, Docker daemon health, container restarts, memory, CPU, and disk pressure. +7. Monitor operational metrics: + - Request rates, execution latency, error rates, and `429` responses + - Active executions, active sessions, and session expirations + - Redis health and Docker daemon health + - Container restarts, memory, CPU, and disk pressure🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` at line 781, Split the long monitoring sentence into a readable bullet list in the README: replace the single-line sentence "Monitor request rates, execution latency, error rates, `429` responses, active executions, active sessions, session expirations, Redis health, Docker daemon health, container restarts, memory, CPU, and disk pressure." with a top-level bullet list (e.g., "Performance metrics", "Errors & throttling", "Session & execution state", "Infrastructure health") and nested bullets for the specific items (e.g., under "Performance metrics" list request rates and execution latency; under "Errors & throttling" list error rates and `429` responses; etc.) so the same items are preserved but reorganized for improved readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Line 131: The CI job sets GATEWAY_DOCKER_HOST but the validation path
(hardened-config job / validation script that imports gateway/app.py) reads
DOCKER_HOST, so change the job env to export DOCKER_HOST instead of
GATEWAY_DOCKER_HOST; update the hardened-config job environment variable
(replace GATEWAY_DOCKER_HOST: tcp://remote-docker:2376 with DOCKER_HOST:
tcp://remote-docker:2376) so gateway/app.py picks up the correct value during
the guardrail check.
In `@gateway/state.py`:
- Around line 427-432: The _session_ttl function currently adds
SESSION_TTL_GRACE_SECONDS to lifetime_ttl which extends hard expirations; change
lifetime_ttl calculation in _session_ttl to compute the remaining lifetime as
max(1, int(session.expires_at - time.time())) without adding
SESSION_TTL_GRACE_SECONDS so the hard expiry is not extended, and keep idle_ttl
= session_timeout_seconds + SESSION_TTL_GRACE_SECONDS and the final return as
max(1, min(idle_ttl, lifetime_ttl)) so grace only affects idle-based TTLs.
---
Outside diff comments:
In `@gateway/app.py`:
- Around line 1411-1417: Bandit flags the tmp path literals in
build_tmpfs_config as B108; explicitly mark these intentional in-container tmpfs
mounts by adding Bandit suppression comments (e.g., append " # nosec B108 -
intentional in-container tmpfs" or place a function-level "# nosec B108"
directive) to the offending lines that define the dict entries for
"/home/sandbox" and "/tmp" inside build_tmpfs_config so the validate job stops
failing while preserving the explicit paths.
In `@tests/test_gateway_unit.py`:
- Around line 503-525: The test hardcodes "beta": False which breaks in beta
environments; update test_version_endpoint_returns_execution_metadata to assert
the response body's "beta" value against the app's runtime beta flag instead of
a literal, e.g. call gateway_app.version(), parse the JSON body and assert
body["beta"] == gateway_app.BETA (or gateway_app.is_beta / the runtime beta
attribute on gateway_app) while leaving the rest of the expected keys unchanged.
---
Nitpick comments:
In `@README.md`:
- Line 781: Split the long monitoring sentence into a readable bullet list in
the README: replace the single-line sentence "Monitor request rates, execution
latency, error rates, `429` responses, active executions, active sessions,
session expirations, Redis health, Docker daemon health, container restarts,
memory, CPU, and disk pressure." with a top-level bullet list (e.g.,
"Performance metrics", "Errors & throttling", "Session & execution state",
"Infrastructure health") and nested bullets for the specific items (e.g., under
"Performance metrics" list request rates and execution latency; under "Errors &
throttling" list error rates and `429` responses; etc.) so the same items are
preserved but reorganized for improved readability.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b8237fcf-e1d6-4ee1-8da6-14cd961c5e4e
📒 Files selected for processing (11)
.env.example.github/workflows/ci.yml.github/workflows/release.ymlREADME.mdSECURITY.mddocker-compose.ymlgateway/app.pygateway/state.pygateway/version.pysandbox/Dockerfiletests/test_gateway_unit.py
| def _session_ttl(session: SessionInfo, session_timeout_seconds: int) -> int: | ||
| idle_ttl = session_timeout_seconds + SESSION_TTL_GRACE_SECONDS | ||
| if session.expires_at is None: | ||
| return idle_ttl | ||
| lifetime_ttl = max(1, int(session.expires_at - time.time())) + SESSION_TTL_GRACE_SECONDS | ||
| return max(1, min(idle_ttl, lifetime_ttl)) |
There was a problem hiding this comment.
Hard lifetime is effectively extended by grace in Redis TTL.
lifetime_ttl includes SESSION_TTL_GRACE_SECONDS, so hard-expired sessions can linger ~5 extra minutes. That weakens the absolute lifetime bound and can skew capacity enforcement.
Proposed patch
`@staticmethod`
def _session_ttl(session: SessionInfo, session_timeout_seconds: int) -> int:
idle_ttl = session_timeout_seconds + SESSION_TTL_GRACE_SECONDS
if session.expires_at is None:
- return idle_ttl
- lifetime_ttl = max(1, int(session.expires_at - time.time())) + SESSION_TTL_GRACE_SECONDS
+ return max(1, idle_ttl)
+ lifetime_ttl = max(1, int(session.expires_at - time.time()))
return max(1, min(idle_ttl, lifetime_ttl))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _session_ttl(session: SessionInfo, session_timeout_seconds: int) -> int: | |
| idle_ttl = session_timeout_seconds + SESSION_TTL_GRACE_SECONDS | |
| if session.expires_at is None: | |
| return idle_ttl | |
| lifetime_ttl = max(1, int(session.expires_at - time.time())) + SESSION_TTL_GRACE_SECONDS | |
| return max(1, min(idle_ttl, lifetime_ttl)) | |
| `@staticmethod` | |
| def _session_ttl(session: SessionInfo, session_timeout_seconds: int) -> int: | |
| idle_ttl = session_timeout_seconds + SESSION_TTL_GRACE_SECONDS | |
| if session.expires_at is None: | |
| return max(1, idle_ttl) | |
| lifetime_ttl = max(1, int(session.expires_at - time.time())) | |
| return max(1, min(idle_ttl, lifetime_ttl)) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@gateway/state.py` around lines 427 - 432, The _session_ttl function currently
adds SESSION_TTL_GRACE_SECONDS to lifetime_ttl which extends hard expirations;
change lifetime_ttl calculation in _session_ttl to compute the remaining
lifetime as max(1, int(session.expires_at - time.time())) without adding
SESSION_TTL_GRACE_SECONDS so the hard expiry is not extended, and keep idle_ttl
= session_timeout_seconds + SESSION_TTL_GRACE_SECONDS and the final return as
max(1, min(idle_ttl, lifetime_ttl)) so grace only affects idle-based TTLs.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
gateway/app.py (1)
1643-1651: 💤 Low valueRedundant
touch_sessioncall aftersave_session.The session is already saved with an updated
last_activityat Lines 1643-1649. The subsequenttouch_sessionat Line 1651 is redundant sincesave_sessionalready persists the session with the refreshed timestamp. Consider removing Line 1651.♻️ Proposed fix
session.execution_count += 1 session.last_activity = time.time() await state_backend.save_session( container_id, session, session_timeout_seconds=SESSION_TIMEOUT_SECONDS, ) - await touch_session(container_id) try: await ensure_sandbox_env_file(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@gateway/app.py` around lines 1643 - 1651, The session is updated and persisted via session.execution_count, session.last_activity and await state_backend.save_session(container_id, session, session_timeout_seconds=SESSION_TIMEOUT_SECONDS), so remove the redundant await touch_session(container_id) call to avoid double-touching; locate the call to touch_session (the await touch_session(container_id) line) and delete it, leaving the save_session call as the single source of truth for updating last_activity..github/workflows/release.yml (1)
63-63: 💤 Low valueDocument the rationale for suppressed Bandit checks.
The
-s B102,B108,B404,B603flag skips security checks forexec, hardcoded tmp paths, subprocess imports, and subprocess calls. While these may be legitimate false positives for this codebase, adding a brief inline comment explaining why each is suppressed would help future maintainers understand the security trade-offs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yml at line 63, Update the bandit invocation that contains the suppression flags "-s B102,B108,B404,B603" by adding a short inline comment on the same line (or directly above it) that documents why each check is suppressed: explain why exec usage (B102), hardcoded tmp paths (B108), subprocess imports/calls (B404,B603) are acceptable in this repository and any mitigations in place (e.g., controlled inputs, sandboxing, CI-only usage). Reference the exact flag string "-s B102,B108,B404,B603" so reviewers can find and understand the rationale for each suppressed Bandit check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.github/workflows/release.yml:
- Line 63: Update the bandit invocation that contains the suppression flags "-s
B102,B108,B404,B603" by adding a short inline comment on the same line (or
directly above it) that documents why each check is suppressed: explain why exec
usage (B102), hardcoded tmp paths (B108), subprocess imports/calls (B404,B603)
are acceptable in this repository and any mitigations in place (e.g., controlled
inputs, sandboxing, CI-only usage). Reference the exact flag string "-s
B102,B108,B404,B603" so reviewers can find and understand the rationale for each
suppressed Bandit check.
In `@gateway/app.py`:
- Around line 1643-1651: The session is updated and persisted via
session.execution_count, session.last_activity and await
state_backend.save_session(container_id, session,
session_timeout_seconds=SESSION_TIMEOUT_SECONDS), so remove the redundant await
touch_session(container_id) call to avoid double-touching; locate the call to
touch_session (the await touch_session(container_id) line) and delete it,
leaving the save_session call as the single source of truth for updating
last_activity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 232a4788-b639-476a-8598-24ed32362fab
📒 Files selected for processing (7)
.github/workflows/ci.yml.github/workflows/release.ymlgateway/app.pygateway/requirements.txtgateway/version.jsonsandbox/requirements.txttests/test_gateway_unit.py
✅ Files skipped from review due to trivial changes (3)
- gateway/version.json
- sandbox/requirements.txt
- gateway/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/ci.yml
Summary
Verification
Summary by CodeRabbit
New Features
Improvements
Documentation
Tests
Chores