feat: F8-F15 — Manager/Finance dashboards, sourcing, disputes, pricing audit, RBAC#10
feat: F8-F15 — Manager/Finance dashboards, sourcing, disputes, pricing audit, RBAC#10Aparnap2 wants to merge 4 commits into
Conversation
- Add MockLLM class for testing without real LLM (MOCK_LLM=true) - Skip DB init when MOCK_LLM=true to avoid connection errors - Fix SSE parser to read event: line type instead of JSON type field - Add x-test-mode header forwarding through web proxy to agent-core - Bypass JWT auth when x-test-mode or x-user-id=test-user-id - Fix catalog-grid to check items prop matching LLM output - Strip __ui__ JSON metadata from message display text - Fix /health endpoint for mock mode without DB pool
❌ Deploy Preview for ai-sdk failed. Why did it fail? →
|
|
Warning Review limit reached
More reviews will be available in 54 minutes and 17 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (14)
📒 Files selected for processing (112)
📝 WalkthroughWalkthroughThis PR transitions the application from NextAuth to JWT cookie-based authentication, refactors the chat UI from LangGraph-driven to SSE-based local state, introduces a mock LLM mode with role-based tool routing in the agent core, redesigns the approval workflow with database functions and flexible GenUI callbacks, normalizes Prisma schema embeddings from vector types to strings, and instruments UI components with test IDs. ChangesEnd-to-End Auth, Chat, and Approval Redesign
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Code Review
This pull request transitions the application from next-auth to a custom JWT-based authentication system, implements a manual SSE streaming client for agent interactions, and introduces a mock LLM for testing. Significant backend updates include role-based tool filtering, context compression logic, and atomic budget operations. Feedback focuses on critical security and reliability issues: sensitive JWT tokens are being logged in middleware, the custom SSE parser fails to maintain state across chunks or handle partial JSON deltas, and the authentication layout contains a logic flaw that bypasses role-based access control. Furthermore, the database connection logic is fragile, the build configuration dangerously suppresses type and lint errors, and the removal of native vector types from the schema will break search functionality.
| const hasAuthCookie = cookieHeader.includes('token=') | ||
|
|
||
| // Debug logging (remove in production) | ||
| console.log('[Middleware] Path:', path, '| Has token cookie:', hasAuthCookie, '| Cookie:', cookieHeader.slice(0, 100)) |
There was a problem hiding this comment.
Logging the cookieHeader (which contains the JWT token) is a security risk as it exposes sensitive authentication credentials in the server logs.
| console.log('[Middleware] Path:', path, '| Has token cookie:', hasAuthCookie, '| Cookie:', cookieHeader.slice(0, 100)) | |
| // Debug logging (remove in production) | |
| console.log('[Middleware] Path:', path, '| Has token cookie:', hasAuthCookie); |
| if (user.role !== 'MERCHANT' && user.role !== 'ADMIN') { | ||
| // Allow SHOPPER and SUPPORT through - they have their own routes | ||
| // For now, just let them pass | ||
| } |
There was a problem hiding this comment.
This block is empty and effectively allows any authenticated user (including those with SHOPPER or SUPPORT roles) to access the admin/merchant layout. If this layout is intended to protect admin routes, it must perform a redirect for unauthorized roles.
if (user.role !== 'MERCHANT' && user.role !== 'ADMIN') {
redirect('/chat');
}
| let buffer = '' | ||
|
|
||
| while (true) { | ||
| const { done, value } = await reader.read() | ||
| if (done) break | ||
|
|
||
| buffer += decoder.decode(value, { stream: true }) | ||
|
|
||
| // Parse SSE events | ||
| const lines = buffer.split('\n') | ||
| buffer = lines.pop() || '' | ||
|
|
||
| let eventType = '' |
There was a problem hiding this comment.
The eventType variable is initialized inside the while loop, which means its state is lost between chunks. In SSE, an event: line can arrive in a different chunk than the corresponding data: line. This will cause the parser to fail to identify the event type for data arriving in subsequent chunks.
| let buffer = '' | |
| while (true) { | |
| const { done, value } = await reader.read() | |
| if (done) break | |
| buffer += decoder.decode(value, { stream: true }) | |
| // Parse SSE events | |
| const lines = buffer.split('\n') | |
| buffer = lines.pop() || '' | |
| let eventType = '' | |
| let buffer = ''; | |
| let eventType = ''; | |
| while (true) { | |
| const { done, value } = await reader.read(); | |
| if (done) break; | |
| buffer += decoder.decode(value, { stream: true }); | |
| // Parse SSE events | |
| const lines = buffer.split('\n'); | |
| buffer = lines.pop() || ''; |
| const inner = JSON.parse(content) | ||
| if (inner.__ui__) { | ||
| setUIComponents(prev => [...prev, { | ||
| type: inner.__ui__.name, | ||
| props: inner.__ui__.props, | ||
| }]) | ||
| content = '' // Skip showing JSON metadata in messages | ||
| } | ||
| } catch { | ||
| // Not JSON, use as-is | ||
| } |
There was a problem hiding this comment.
Attempting to JSON.parse the content delta directly will fail when using a real streaming LLM, as the JSON payload will be delivered in multiple small chunks. Furthermore, if the parse fails, the raw JSON fragment is currently appended to the chat messages (line 134), which will result in a poor user experience with visible JSON syntax in the UI.
|
|
||
| const pool = new Pool({ | ||
| connectionString: process.env.DATABASE_URL | ||
| ? process.env.DATABASE_URL.replace('postgres:postgres', 'supabase_admin:postgres').replace('smart_commerce', 'postgres') |
There was a problem hiding this comment.
Using .replace() on the DATABASE_URL to swap credentials and database names is extremely fragile and insecure. It assumes a specific format and can easily break if the URL contains the search strings in other parts (e.g., in the password). Environment-specific configurations should be handled via separate environment variables.
| ignoreBuildErrors: true, | ||
| }, | ||
| eslint: { | ||
| ignoreDuringBuilds: true, |
| for item in parsed["products"]: | ||
| if isinstance(item, dict): | ||
| item.pop("embedding", None) | ||
| msg.content = json.dumps(parsed) |
There was a problem hiding this comment.
The strip_ui_from_messages function mutates the msg.content of the original message objects in place. Since these messages are part of the AgentState, this mutation can lead to unexpected side effects or inconsistent state in LangGraph. It is safer to create a copy of the message before modifying it.
| msg.content = json.dumps(parsed) | |
| msg = msg.model_copy(update={"content": json.dumps(parsed)}) |
| id String @id @default(uuid()) | ||
| productId Int @map("product_id") | ||
| embedding Unsupported("vector") @map("embedding") | ||
| embedding String? @map("embedding") |
There was a problem hiding this comment.
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/web/app/api/agent/route.ts (1)
42-49:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winDon’t forward client-controlled test/auth headers to agent-core.
This route now passes
x-test-modestraight through while still trustingx-user-idfrom the browser. Given the documented agent-core bypass for test-mode/test-user requests, a caller can hit/api/agentwith spoofed headers and reach mock/test paths without a real authenticated identity. Derive the user id server-side from the verified token/cookie and stripx-test-modeoutside dedicated test environments.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/api/agent/route.ts` around lines 42 - 49, Don't forward client-controlled headers: remove req.headers.get('x-test-mode') and the client-provided x-user-id from the fetch headers when calling agentUrl. Instead, derive userId server-side from the verified token/cookie (the same place you produce the local variable userId) and set 'x-user-id' from that server-verified value; only add an 'x-test-mode' header when running in a dedicated test environment/flag (e.g., NODE_ENV/test flag) inside the same route handler before the fetch. Update the fetch call that builds headers (the block using agentUrl, token, req.headers and userId) to use the server-derived userId and conditionally include x-test-mode, and ensure req.headers is not trusted for auth/test fields.
🧹 Nitpick comments (2)
apps/web/app/(chat)/page.tsx (1)
17-20: ⚡ Quick winReplace
anyhere with concrete orunknownshapes.
props: anyand(window as any)remove type safety in the exact path that parses untrusted streaming payloads. Please introduce a typed UI payload plus a smallWindowaugmentation instead of opting out of checking here. As per coding guidelines,Never use TypeScript any type - use proper types or unknown instead.Also applies to: 49-49
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/`(chat)/page.tsx around lines 17 - 20, The UIComponent interface currently uses props: any which removes type safety when parsing untrusted streaming payloads; replace props: any with a concrete discriminated union or at minimum props: unknown and define a typed UI payload interface (e.g., UiPayload / UiComponentProps variants) that describes the allowed component types and their props, then narrow/validate the props at parse time in the function that consumes the stream (referencing UIComponent and the parser logic); also add a small Window augmentation for the global payload accessor instead of using (window as any) so the global variable has a typed signature, and update all usages (including the other occurrence at line ~49) to perform proper runtime validation/type narrowing before treating props as specific shapes.apps/agent-core/requirements.txt (1)
3-3: ⚡ Quick winPin the MCP package to the tested API surface.
apps/agent-core/mcp_server.pystill imports multiple concrete MCP symbols, somcp>=1.0.0makes installs non-reproducible and can break startup the next time MCP moves another import. Prefer an exact pin or a bounded range that matches the version you've actually verified.For the Python `mcp` package, which released versions expose `mcp.server.Server`, `mcp.server.stdio.stdio_server`, `mcp.types.Tool`, `mcp.types.TextContent`, and `mcp.server.lifecycle.LifespanManager`?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/requirements.txt` at line 3, Replace the loose dependency "mcp>=1.0.0" with a precise pin or bounded range that matches the version you actually tested (e.g. mcp==<tested_version> or mcp>=X.Y.Z,<X.Y+1.0>) so installs remain reproducible; then verify the chosen release exposes the concrete symbols imported by apps/agent-core/mcp_server.py (mcp.server.Server, mcp.server.stdio.stdio_server, mcp.types.Tool, mcp.types.TextContent, and mcp.server.lifecycle.LifespanManager) and update the pin if any symbol is missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/agent-core/src/tools.py`:
- Around line 895-898: The finance_tools list currently includes raise_dispute
which performs writes (mutates PR state and writes audit entries), violating the
intended read-only role; remove raise_dispute from the finance_tools array
(leave base_tools only) or replace it with a read-only alternative if finance
users must see dispute info without mutating state — update the finance_tools
declaration (referencing the finance_tools variable and the raise_dispute
symbol) accordingly and add a brief comment explaining that write-capable tools
are intentionally excluded.
In `@apps/web/app/`(admin)/layout.tsx:
- Around line 31-36: The current guard checks user.role but leaves the branch
empty, allowing unauthorized roles to render the admin layout; update the branch
(the if checking user.role !== 'MERCHANT' && user.role !== 'ADMIN') to
immediately block access by performing a redirect or returning an access-denied
response (e.g., call the app's redirect helper to send users to a forbidden page
or throw a 403/Next.js notFound/redirect). Ensure the change is applied inside
the same conditional that inspects user.role so unauthorized users are not
allowed to continue rendering the admin layout.
In `@apps/web/app/`(chat)/page.tsx:
- Around line 75-82: Hardcoding real users to 'employee' in the request
headers/body causes identity collapse; update the fetch call that sets
'x-user-id' and body.user_id (the code using isTestMode, token, 'x-test-mode',
'x-user-id', Authorization, and JSON body with messages) so that when isTestMode
is false it uses the actual authenticated user's identifier (e.g., their email
or userId variable) instead of the literal 'employee'; ensure both the header
'x-user-id' and body.user_id are populated from the same real-user value the app
has (fall back to a safe default only if that value is absent) to preserve
department/role lookup and per-user rate-limiting.
- Around line 67-68: Replace the brittle cookie parsing expression that sets
const token using document.cookie.split('token=') with a call to the existing
getCookie('token') helper; update the assignment for const token to use
getCookie('token') (and fall back to '' if null/undefined) so it correctly
URL-decodes the value and avoids matching names like refresh_token.
- Around line 198-204: The budget card in the 'budget-gauge' render is using
ui.props?.total (which is absent) instead of the payload field returned by
get_budget_status; update the JSX in the 'budget-gauge' case to read the monthly
budget from ui.props?.monthlyBudget (keep the existing division by 100 and
toLocaleString formatting and preserve ui.props?.remaining for the numerator) so
the "of ₹..." value displays the correct monthlyBudget amount.
In `@apps/web/app/api/approvals/`[prId]/decide/route.ts:
- Around line 71-104: The approve_pr_and_debit_budget DB call can commit before
the subsequent audit and pRApproval inserts, causing partial commits and unsafe
retries; fix by performing the raw function call and the follow-up writes
(audit-entry insert, pRApproval insert/follow-up updates and the eventual fetch
of the updated purchaseRequest) inside a single Prisma transaction using
prisma.$transaction (or move the audit and pRApproval inserts into the
approve_pr_and_debit_budget DB function), so that
approve_pr_and_debit_budget(...), the audit/pRApproval inserts, and the final
purchaseRequest read are all executed atomically and a single success/failure is
returned to the route handler.
In `@apps/web/app/api/auth/login/route.ts`:
- Around line 82-87: The JWT and response are using the wrong property name for
department — replace accesses to user.department_id with user.departmentId (the
query selects "departmentId") so signToken(...) gets departmentId and the
response payload includes departmentId; update all occurrences (e.g., the object
passed to signToken and the response construction around the login handler) to
use user.departmentId consistently.
- Around line 18-23: The route handler in apps/web/app/api/auth/login/route.ts
is logging sensitive PII and secrets (the destructured email/password from const
{ email, password } = await req.json() and later password hash and auth token
outputs around the referenced ranges). Remove any console.log statements that
print raw credentials, password hashes, or auth tokens (including the logs at
lines ~18–23, ~49–56, and ~89–103); instead log only non-sensitive events (e.g.,
"login request received", "authentication succeeded/failed") or masked
identifiers (e.g., first char and domain of an email) and never output full
tokens or hashes from functions like the login handler/default export in this
file. Ensure no secrets are hardcoded and continue to store sensitive config in
.env.local.
- Around line 11-15: Replace the hardcoded fallback DSN in the Pool
instantiation: remove the literal localhost/postgres credentials and require
process.env.DATABASE_URL to be present (use the non-null assertion
process.env.DATABASE_URL!); if any transformation is necessary keep it applied
to process.env.DATABASE_URL! only, and if DATABASE_URL is missing throw a clear
startup error so the app fails fast. Update the Pool construction that uses new
Pool({ connectionString: ... }) to reference only the validated/nonnull env var
(and remove the replace-based fallback string).
In `@apps/web/app/auth/login/page.tsx`:
- Around line 38-39: Remove the client-side cookie write that recreates the JWT
as a non-HttpOnly cookie: delete the document.cookie assignment that uses
data.token in apps/web/app/auth/login/page.tsx (the line setting
`token=${data.token}`), and rely on the server's Set-Cookie from /api/auth/login
instead; if you need to detect login success client-side, use the API response
status/body (e.g., data.success) or server-provided flags rather than exposing
the token to JS.
In `@apps/web/components/auth-provider.tsx`:
- Around line 24-26: AuthProviderInner is computing auth state but never exposes
the parsed session to the context consumers, which is why useAuth() returns {}
and useUser() returns null; fix by passing the parsed session, loading flag and
any helper methods (e.g., setSession or signOut) into the AuthContext.Provider
value inside AuthProviderInner so consumers can read them, and ensure useAuth()
and useUser() return the corresponding values from AuthContext (not empty
defaults); update references to AuthProviderInner, AuthContext.Provider, useAuth
and useUser so the provider value shape matches what the hooks destructure.
In `@apps/web/components/genui/ApprovalCard.tsx`:
- Around line 27-39: The onSubmitDecision prop currently only accepts (decision,
prNumber, total, comments) which loses stable identifiers; change the callback
signature to include prId and threadId (e.g. onSubmitDecision: (decision:
'APPROVED' | 'REJECTED', prId?: string, threadId?: string, prNumber?: string,
total?: number, comments?: string) => Promise<void>) and update the ApprovalCard
component to pass through this.prId and this.threadId (not only
prNumber/totalAmount) when calling onSubmitDecision; also update the other
submission call sites inside this component (the other onSubmitDecision
invocations referenced in the diff) to provide prId and threadId so the approval
flow uses stable identifiers.
- Around line 67-75: The badge currently uses PR status for urgency because
safeUrgency is set from safePr.status; update this to read the actual urgency
field instead: change the assignment of safeUrgency to use safePr.urgency ??
urgency ?? 'NORMAL' (and similarly replace any other places where safePr.status
or pr.status is used to derive urgency, e.g., the other mapping around the
urgency/badge rendering at lines ~119-120) so the urgency pill gets the correct
enum values for color mapping.
In `@apps/web/components/genui/PurchaseRequestDraft.tsx`:
- Around line 77-81: The submit button in PurchaseRequestDraft.tsx
(data-testid="submit-pr-btn") is a no-op; either wire it to a submit handler or
render it disabled. Add an onClick that calls the component's submit routine
(e.g., call an existing handleSubmit/submitPurchaseRequest or an onSubmit prop
passed into PurchaseRequestDraft) and ensure the handler updates
state/dispatches the action that notifies the manager, or if the action isn't
available yet, set the button to disabled (disabled attribute and aria-disabled)
and update the className to a visually disabled style so users can’t click it.
Ensure the handler or prop name you use matches the component’s existing symbols
(handleSubmit, submitPurchaseRequest, or onSubmit) and include any necessary
loading/disabled state to prevent double submits.
In `@apps/web/components/shell/Rail.tsx`:
- Line 22: Replace the useState<any> in the Rail component with a concrete typed
shape for session: define a Session type (or import the proper NextAuth Session)
that at minimum models user with optional name and role (e.g., Session { user?:
{ name?: string; role?: string; [key: string]: any } }), then change the hook to
useState<Session | null>(null) and update any usages that assume session fields
to use the typed properties (session, setSession, and places reading
session?.user?.role or session?.user?.name). This restores type safety without
using any.
In `@apps/web/cypress/e2e/b2b-approval.cy.ts`:
- Around line 113-117: The setupMocks() function is currently a no-op causing
the spec to call real /api/agent and produce nondeterministic LLM behavior;
update setupMocks() to install the existing Cypress SSE/mocking helpers (e.g.,
call mockCatalogResponse, mockBudgetResponse, mockApprovalResponse or other
mock* helpers used elsewhere) so the test uses deterministic mocked SSE
responses instead of real agent calls, and remove the misleading
"deepseek-v3.2:cloud" comment; locate and modify the setupMocks definition and
replace the no-op body with invocations of the mock functions used by other
specs (and delete the unused deepseek comment) so the spec consistently sets up
test responses while still allowing page.tsx's window.Cypress headers to remain
untouched.
In `@apps/web/middleware.ts`:
- Around line 17-18: The console.log in middleware.ts is leaking sensitive data
by printing cookieHeader (even truncated); remove or replace it with
non-sensitive debug info: update the logging in the middleware function that
references path, hasAuthCookie and cookieHeader to omit cookieHeader (e.g., log
only path and hasAuthCookie or a boolean/masked indicator) and ensure no other
code in that function prints raw cookies or tokens.
- Around line 13-21: The middleware currently trusts the presence of "token=" in
req.headers.get('cookie') (cookieHeader/hasAuthCookie) — instead extract the
token value from the cookie string, verify it with your JWT verification utility
(e.g., jwt.verify or your existing verifyToken function) and only treat the
request as authenticated if verification succeeds; on verification failure or
missing token return NextResponse.redirect(new URL('/auth/login', req.url));
replace the naive hasAuthCookie check with proper parsing and try/catch around
the verification and remove or restrict the console.log debug output so
sensitive cookie data is not logged.
In `@apps/web/next.config.mjs`:
- Around line 5-10: Replace the hardcoded bypasses for type/lint checks in
next.config.mjs: remove typescript.ignoreBuildErrors: true and
eslint.ignoreDuringBuilds: true and instead read a specific environment flag
(e.g., IGNORE_NEXT_ERRORS or NEXT_ALLOW_BUILD_SKIP) that defaults to false so
production/CI builds fail on TypeScript/ESLint errors; update the entries using
the symbols typescript.ignoreBuildErrors and eslint.ignoreDuringBuilds to use
that env-derived boolean and document the env var for CI usage.
In `@apps/web/package.json`:
- Around line 37-39: The Prisma packages in package.json are version-skewed:
`@prisma/client` and prisma are at 7.8.0 while `@prisma/adapter-pg` and
`@prisma/internals` are at 6.x; update the dependency entries so all
Prisma-related packages use the same major version (e.g., set
"`@prisma/adapter-pg`" and "`@prisma/internals`" to 7.8.0 to match "`@prisma/client`"
and "prisma", or alternatively pin all to 6.7.0 if you intend to stay on v6), by
editing the dependency and devDependency entries for "`@prisma/adapter-pg`",
"`@prisma/internals`", "`@prisma/client`" and "prisma" in package.json and then run
install to regenerate lockfile.
---
Outside diff comments:
In `@apps/web/app/api/agent/route.ts`:
- Around line 42-49: Don't forward client-controlled headers: remove
req.headers.get('x-test-mode') and the client-provided x-user-id from the fetch
headers when calling agentUrl. Instead, derive userId server-side from the
verified token/cookie (the same place you produce the local variable userId) and
set 'x-user-id' from that server-verified value; only add an 'x-test-mode'
header when running in a dedicated test environment/flag (e.g., NODE_ENV/test
flag) inside the same route handler before the fetch. Update the fetch call that
builds headers (the block using agentUrl, token, req.headers and userId) to use
the server-derived userId and conditionally include x-test-mode, and ensure
req.headers is not trusted for auth/test fields.
---
Nitpick comments:
In `@apps/agent-core/requirements.txt`:
- Line 3: Replace the loose dependency "mcp>=1.0.0" with a precise pin or
bounded range that matches the version you actually tested (e.g.
mcp==<tested_version> or mcp>=X.Y.Z,<X.Y+1.0>) so installs remain reproducible;
then verify the chosen release exposes the concrete symbols imported by
apps/agent-core/mcp_server.py (mcp.server.Server, mcp.server.stdio.stdio_server,
mcp.types.Tool, mcp.types.TextContent, and mcp.server.lifecycle.LifespanManager)
and update the pin if any symbol is missing.
In `@apps/web/app/`(chat)/page.tsx:
- Around line 17-20: The UIComponent interface currently uses props: any which
removes type safety when parsing untrusted streaming payloads; replace props:
any with a concrete discriminated union or at minimum props: unknown and define
a typed UI payload interface (e.g., UiPayload / UiComponentProps variants) that
describes the allowed component types and their props, then narrow/validate the
props at parse time in the function that consumes the stream (referencing
UIComponent and the parser logic); also add a small Window augmentation for the
global payload accessor instead of using (window as any) so the global variable
has a typed signature, and update all usages (including the other occurrence at
line ~49) to perform proper runtime validation/type narrowing before treating
props as specific shapes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9951ede0-cf0c-4349-8b39-14bb6c96f769
⛔ Files ignored due to path filters (2)
apps/web/cypress/videos/agent-direct.cy.ts.mp4is excluded by!**/*.mp4pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (28)
.gitignoreapps/agent-core/main.pyapps/agent-core/mcp_server.pyapps/agent-core/requirements.txtapps/agent-core/src/dependencies.pyapps/agent-core/src/graph.pyapps/agent-core/src/tools.pyapps/web/app/(admin)/chat/page.tsxapps/web/app/(admin)/layout.tsxapps/web/app/(chat)/page.tsxapps/web/app/api/agent/route.tsapps/web/app/api/approvals/[prId]/decide/route.tsapps/web/app/api/auth/[...nextauth]/route.tsapps/web/app/api/auth/login/route.tsapps/web/app/api/auth/signup/route.tsapps/web/app/auth/login/page.tsxapps/web/components/auth-provider.tsxapps/web/components/genui/ApprovalCard.tsxapps/web/components/genui/CatalogGrid.tsxapps/web/components/genui/PRList.tsxapps/web/components/genui/PurchaseRequestDraft.tsxapps/web/components/shell/Rail.tsxapps/web/cypress.config.tsapps/web/cypress/e2e/b2b-approval.cy.tsapps/web/middleware.tsapps/web/next.config.mjsapps/web/package.jsonprisma/schema.prisma
| // Admin/Merchant routes require MERCHANT or ADMIN role | ||
| // SHOPPER and SUPPORT can access via separate (chat) route, not this layout | ||
| if (user.role !== 'MERCHANT' && user.role !== 'ADMIN') { | ||
| // Allow SHOPPER and SUPPORT through - they have their own routes | ||
| // For now, just let them pass | ||
| } |
There was a problem hiding this comment.
Block unauthorized roles in this layout.
This guard currently falls through, so any authenticated non-MERCHANT/ADMIN user can still render the admin layout. Redirect or reject here instead of leaving the branch empty.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/web/app/`(admin)/layout.tsx around lines 31 - 36, The current guard
checks user.role but leaves the branch empty, allowing unauthorized roles to
render the admin layout; update the branch (the if checking user.role !==
'MERCHANT' && user.role !== 'ADMIN') to immediately block access by performing a
redirect or returning an access-denied response (e.g., call the app's redirect
helper to send users to a forbidden page or throw a 403/Next.js
notFound/redirect). Ensure the change is applied inside the same conditional
that inspects user.role so unauthorized users are not allowed to continue
rendering the admin layout.
| // Setup - use REAL LLM (no mocks) | ||
| function setupMocks() { | ||
| // Mock catalog search | ||
| cy.intercept('GET', '**/assistants/*/threads/*/runs/*/stream', (req) => { | ||
| const url = req.url | ||
| if (url.includes('laptop') || url.includes('show me laptops') || url.includes('I need a')) { | ||
| req.reply({ statusCode: 200, body: mockCatalogResponse() }) | ||
| } else if (url.includes('budget')) { | ||
| req.reply({ statusCode: 200, body: mockBudgetResponse() }) | ||
| } else if (url.includes('approval')) { | ||
| req.reply({ statusCode: 200, body: mockApprovalsResponse() }) | ||
| } else if (url.includes('MacBook Pro')) { | ||
| req.reply({ statusCode: 200, body: mockBudgetAlertResponse() }) | ||
| } else { | ||
| req.reply({ statusCode: 200, body: mockCatalogResponse() }) | ||
| } | ||
| }).as('mockStream') | ||
|
|
||
| // Also intercept POST to run endpoint | ||
| cy.intercept('POST', '**/assistants/*/threads/*/runs', { | ||
| statusCode: 200, | ||
| body: { run_id: 'mock-run', thread_id: 'mock-thread', assistant_id: 'customer' } | ||
| }).as('mockRun') | ||
| // Using real LLM - no interceptors | ||
| // The app will call actual API endpoints with deepseek-v3.2:cloud | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether this suite still has no deterministic test-mode wiring
# and whether alternate provider strings are present.
rg -n "function setupMocks|deepseek-v3\\.2|cy\\.intercept\\(|x-test-mode|x-user-id"Repository: Aparnap2/smart_commerce_agent
Length of output: 2888
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect the exact setupMocks + surrounding beforeEach usage
echo "---- b2b-approval.cy.ts (around setupMocks) ----"
nl -ba apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '80,150p'
# 2) Find all occurrences of deepseek-v3.2:cloud
echo "---- Occurrences of deepseek-v3.2:cloud ----"
rg -n --hidden --no-ignore -S "deepseek-v3\.2:cloud" . || true
# 3) Broader deepseek references (to see if this is hardcoded elsewhere)
echo "---- Occurrences of deepseek ----"
rg -n --hidden --no-ignore -S "deepseek" . || true
# 4) Confirm whether this cypress file sets x-test-mode/x-user-id anywhere else
echo "---- x-test-mode/x-user-id in b2b-approval.cy.ts ----"
rg -n "x-test-mode|x-user-id" apps/web/cypress/e2e/b2b-approval.cy.ts || true
# 5) Inspect where app sets x-test-mode/x-user-id for chat requests
echo "---- app/(chat)/page.tsx (around x-test-mode headers) ----"
nl -ba "apps/web/app/(chat)/page.tsx" | sed -n '50,110p'Repository: Aparnap2/smart_commerce_agent
Length of output: 183
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- b2b-approval.cy.ts (around setupMocks) ----"
cat -n apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '80,150p'
echo "---- Occurrences of deepseek-v3.2:cloud ----"
rg -n -S "deepseek-v3\.2:cloud" . || true
echo "---- Occurrences of deepseek ----"
rg -n -S "deepseek" . || true
echo "---- x-test-mode/x-user-id in b2b-approval.cy.ts ----"
rg -n "x-test-mode|x-user-id" apps/web/cypress/e2e/b2b-approval.cy.ts || true
echo "---- app/(chat)/page.tsx (around x-test-mode headers) ----"
cat -n "apps/web/app/(chat)/page.tsx" | sed -n '50,110p'Repository: Aparnap2/smart_commerce_agent
Length of output: 7245
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- app/(chat)/page.tsx (where isTestMode is computed) ----"
rg -n "isTestMode" "apps/web/app/(chat)/page.tsx"
cat -n "apps/web/app/(chat)/page.tsx" | sed -n '1,120p'
echo "---- apps/web/app/api/agent/route.ts (x-test-mode handling + model selection) ----"
cat -n apps/web/app/api/agent/route.ts | sed -n '1,140p'
echo "---- apps/web/app/api/agent/route.ts (any llm/provider config) ----"
rg -n -S "deepseek|foundry|Azure|OpenAI|llm|model" apps/web/app/api/agent/route.ts || true
echo "---- occurrences of deepseek-v3.2 in repo ----"
rg -n -S "deepseek-v3\\.2" . || true
echo "---- occurrences of Azure AI Foundry / Foundry in repo ----"
rg -n -S "Azure AI Foundry|Foundry" . || true
echo "---- occurrences of x-test-mode usage ----"
rg -n "x-test-mode" apps/web . || trueRepository: Aparnap2/smart_commerce_agent
Length of output: 50386
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- cy.intercept usages in b2b-approval.cy.ts ----"
rg -n "cy\.intercept" apps/web/cypress/e2e/b2b-approval.cy.ts || true
echo "---- calls to mock*Response helpers in b2b-approval.cy.ts ----"
rg -n "mock[A-Za-z0-9_]*Response" apps/web/cypress/e2e/b2b-approval.cy.ts || true
echo "---- where setupMocks() is used ----"
rg -n "setupMocks\(\)" apps/web/cypress/e2e/b2b-approval.cy.ts || trueRepository: Aparnap2/smart_commerce_agent
Length of output: 542
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- b2b-approval.cy.ts (top to setupMocks) ----"
cat -n apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '1,220p'
echo "---- b2b-approval.cy.ts (rest of file) ----"
cat -n apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '220,420p'Repository: Aparnap2/smart_commerce_agent
Length of output: 12031
Fix nondeterministic E2E LLM behavior in b2b-approval.cy.ts (setupMocks)
apps/web/cypress/e2e/b2b-approval.cy.ts lines 113–117 set setupMocks() to a no-op and no longer install any cy.intercept, so the spec doesn’t use the existing mock SSE helpers (mockCatalogResponse, mockBudgetResponse, etc.) and instead relies on real /api/agent behavior. Although apps/web/app/(chat)/page.tsx sets x-test-mode: true/x-user-id: test-user-id when running under Cypress (window.Cypress), this spec still doesn’t guarantee deterministic agent-core responses.
Remove/replace the misleading deepseek-v3.2:cloud comment (it isn’t referenced anywhere else in code).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/web/cypress/e2e/b2b-approval.cy.ts` around lines 113 - 117, The
setupMocks() function is currently a no-op causing the spec to call real
/api/agent and produce nondeterministic LLM behavior; update setupMocks() to
install the existing Cypress SSE/mocking helpers (e.g., call
mockCatalogResponse, mockBudgetResponse, mockApprovalResponse or other mock*
helpers used elsewhere) so the test uses deterministic mocked SSE responses
instead of real agent calls, and remove the misleading "deepseek-v3.2:cloud"
comment; locate and modify the setupMocks definition and replace the no-op body
with invocations of the mock functions used by other specs (and delete the
unused deepseek comment) so the spec consistently sets up test responses while
still allowing page.tsx's window.Cypress headers to remain untouched.
| typescript: { | ||
| ignoreBuildErrors: true, | ||
| }, | ||
| eslint: { | ||
| ignoreDuringBuilds: true, | ||
| }, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show exact contents around the referenced lines
FILE="apps/web/next.config.mjs"
echo "== File: $FILE =="
nl -ba "$FILE" | sed -n '1,120p'
# Search for env usage or other overrides in this config
echo "== Search for ignoreBuildErrors / ignoreDuringBuilds =="
rg -n "ignoreBuildErrors|ignoreDuringBuilds|ALLOW_UNSAFE_BUILD|process\.env" "$FILE" || true
# Check if there are other next.config.* files that might override behavior
echo "== Other next config files =="
fd "next\.config\.(js|mjs|ts)$" -a . || trueRepository: Aparnap2/smart_commerce_agent
Length of output: 171
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="apps/web/next.config.mjs"
echo "== File: $FILE =="
if [ -f "$FILE" ]; then
# line numbers without relying on nl
awk '{print NR "\t" $0}' "$FILE" | sed -n '1,160p'
else
echo "MISSING: $FILE"
exit 1
fi
echo "== rg in next config =="
rg -n "ignoreBuildErrors|ignoreDuringBuilds|ALLOW_UNSAFE_BUILD|process\.env" "$FILE" || true
echo "== Find other next.config.* files =="
find . -maxdepth 5 -type f \( -name "next.config.*" \) -printRepository: Aparnap2/smart_commerce_agent
Length of output: 2463
🌐 Web query:
Next.js next.config.js typescript.ignoreBuildErrors ignoreDuringBuilds eslint meaning during production build
💡 Result:
In next.config.js: 1) typescript.ignoreBuildErrors Meaning: When set to true, Next.js allows the production build (next build) to complete even if there are TypeScript errors in your project; it “completely skips the TypeScript type checking step.” [1][2] 2) eslint.ignoreDuringBuilds Meaning: When ESLint is detected, Next.js normally fails the production build when there are ESLint errors; setting ignoreDuringBuilds: true disables the built-in linting step during the build so production builds can complete even with ESLint errors. [3] Production build behavior (next build) - With typescript.ignoreBuildErrors: true, Next.js will not run TypeScript type checking during next build. [1] - With eslint.ignoreDuringBuilds: true, Next.js will not run its ESLint check during next build (if ESLint is detected). [3] Example configuration module.exports = { typescript: { // Dangerously allow production builds to successfully complete even if // your project has type errors. ignoreBuildErrors: true, }, eslint: { // Allow production builds to successfully complete even if your project has ESLint errors. ignoreDuringBuilds: true, }, } Both options are documented as “dangerous”/not recommended because you must run equivalent checks elsewhere (for example, in CI using tsc --noEmit and eslint) to avoid deploying incorrect code. [1][2][3]
Citations:
- 1: https://nextjs.org/docs/app/api-reference/config/next-config-js/typescript
- 2: https://nextjs.org/docs/app/api-reference/config/typescript
- 3: https://en.nextjs.im/docs/13/pages/api-reference/next-config-js/eslint/
Avoid disabling Next.js type/lint checks in production builds
apps/web/next.config.mjs hardcodes typescript.ignoreBuildErrors: true and eslint.ignoreDuringBuilds: true, letting next build succeed even with TypeScript and ESLint errors.
Suggested guardrail
const nextConfig = {
typescript: {
- ignoreBuildErrors: true,
+ ignoreBuildErrors: process.env.ALLOW_UNSAFE_BUILD === 'true',
},
eslint: {
- ignoreDuringBuilds: true,
+ ignoreDuringBuilds: process.env.ALLOW_UNSAFE_BUILD === 'true',
},
reactStrictMode: true,
output: 'standalone',📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| typescript: { | |
| ignoreBuildErrors: true, | |
| }, | |
| eslint: { | |
| ignoreDuringBuilds: true, | |
| }, | |
| typescript: { | |
| ignoreBuildErrors: process.env.ALLOW_UNSAFE_BUILD === 'true', | |
| }, | |
| eslint: { | |
| ignoreDuringBuilds: process.env.ALLOW_UNSAFE_BUILD === 'true', | |
| }, |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/web/next.config.mjs` around lines 5 - 10, Replace the hardcoded bypasses
for type/lint checks in next.config.mjs: remove typescript.ignoreBuildErrors:
true and eslint.ignoreDuringBuilds: true and instead read a specific environment
flag (e.g., IGNORE_NEXT_ERRORS or NEXT_ALLOW_BUILD_SKIP) that defaults to false
so production/CI builds fail on TypeScript/ESLint errors; update the entries
using the symbols typescript.ignoreBuildErrors and eslint.ignoreDuringBuilds to
use that env-derived boolean and document the env var for CI usage.
| "@prisma/adapter-pg": "^6.7.0", | ||
| "@prisma/client": "6.7.0", | ||
| "@prisma/client": "7.8.0", | ||
| "@smart-commerce/errors": "workspace:*", |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show relevant sections with line numbers
echo "== apps/web/package.json (around 30-110) =="
nl -ba apps/web/package.json | sed -n '30,110p'
echo
echo "== Extract Prisma-related deps =="
rg -n --no-heading '"(`@prisma/`[^"]+|prisma)"\s*:\s*"' apps/web/package.json || trueRepository: Aparnap2/smart_commerce_agent
Length of output: 178
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== apps/web/package.json (around 30-110) =="
awk 'NR>=30 && NR<=110 {printf "%d\t%s\n", NR, $0}' apps/web/package.json
echo
echo "== Extract Prisma-related deps =="
rg -n --no-heading '"(`@prisma/`[^"]+|prisma)"\s*:\s*"' apps/web/package.jsonRepository: Aparnap2/smart_commerce_agent
Length of output: 2894
Align Prisma package versions to avoid generator/runtime skew
@prisma/client and prisma are on 7.8.0, while @prisma/adapter-pg and @prisma/internals remain 6.x—this version skew often leads to breakage.
apps/web/package.json:@prisma/client7.8.0(line 38) vs@prisma/adapter-pg^6.7.0(line 37)apps/web/package.json:@prisma/internals6.7.0(line 83) vsprisma7.8.0(line 99)
Suggested alignment
- "`@prisma/adapter-pg`": "^6.7.0",
+ "`@prisma/adapter-pg`": "7.8.0",
...
- "`@prisma/internals`": "6.7.0",
+ "`@prisma/internals`": "7.8.0",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/web/package.json` around lines 37 - 39, The Prisma packages in
package.json are version-skewed: `@prisma/client` and prisma are at 7.8.0 while
`@prisma/adapter-pg` and `@prisma/internals` are at 6.x; update the dependency
entries so all Prisma-related packages use the same major version (e.g., set
"`@prisma/adapter-pg`" and "`@prisma/internals`" to 7.8.0 to match "`@prisma/client`"
and "prisma", or alternatively pin all to 6.7.0 if you intend to stay on v6), by
editing the dependency and devDependency entries for "`@prisma/adapter-pg`",
"`@prisma/internals`", "`@prisma/client`" and "prisma" in package.json and then run
install to regenerate lockfile.
…, RBAC - F8: Manager Dashboard (pending approvals, dept spend, recent activity) - F9: Finance Dashboard (budget utilisation, spend trend, flagged items) - F10: Vendor sourcing request flow (genui card + tool + prompt) - F11: Dispute flow end-to-end (5/5 tests passing) - F13: NotificationBell GenUI component - F14: Catalog pricing audit background job (SerpApi comparison, CLI) - F15: RBAC middleware + dynamic tool filtering (40 tests) Test coverage: 106 passed, 5 skipped, 19 xfailed, 0 new TS errors
There was a problem hiding this comment.
Actionable comments posted: 6
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
apps/agent-core/tests/test_notifications.py (1)
75-85:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse the same patch target in the Redis error test.
test_publish_approval_event_redis_errorstill patchesdependencies.get_redis, while the other tests correctly patchsrc.notifications.get_redis. This can miss the mock depending on howpublish_approval_eventresolves imports.Proposed fix
`@pytest.mark.asyncio` async def test_publish_approval_event_redis_error(): """Test handling Redis connection errors gracefully.""" from src.notifications import publish_approval_event - from src import dependencies mock_redis = AsyncMock() mock_redis.publish = AsyncMock(side_effect=Exception("Redis connection failed")) - with patch.object(dependencies, 'get_redis', return_value=mock_redis): + with patch('src.notifications.get_redis', return_value=mock_redis): result = await publish_approval_event( employee_id="emp-123", pr_id="pr-uuid-456", pr_number="PR-101", decision="APPROVED",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_notifications.py` around lines 75 - 85, The test test_publish_approval_event_redis_error patches the wrong target so the mock may not be used; change the patch to target the same get_redis used by publish_approval_event (patch src.notifications.get_redis) instead of dependencies.get_redis. Locate the test function test_publish_approval_event_redis_error and replace the with patch.object(dependencies, 'get_redis', ...) (or patch.object(dependencies, 'get_redis', return_value=mock_redis)) with a patch on 'src.notifications.get_redis' (returning mock_redis) so publish_approval_event uses the mocked Redis publish that raises the intended Exception.apps/web/__tests__/components/genui/CatalogGrid.test.tsx (1)
24-245:⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoffRequired test bootstrap is missing for
*.test.tsx.This file does not include the required
beforeEachsetup withcleanupTestDB()andseedTestFixtures().As per coding guidelines,
**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/__tests__/components/genui/CatalogGrid.test.tsx` around lines 24 - 245, The test file CatalogGrid.test.tsx is missing the required test bootstrap; add a beforeEach that calls cleanupTestDB() then seedTestFixtures() to ensure a clean DB and seeded fixtures before each test; place this at the top of the describe('CatalogGrid', ...) scope (or imported into the test module) and ensure the helper functions cleanupTestDB and seedTestFixtures are imported or available in the test file so all tests run against the real Docker Postgres fixture per the project testing guidelines.apps/web/components/shell/Rail.tsx (1)
10-13:⚠️ Potential issue | 🟠 Major | ⚡ Quick winJWT payload parsing is not base64url-safe and can fail for valid tokens.
atob(token.split('.')[1])assumes standard base64, but JWT payloads are base64url-encoded. Normalize/pad before decoding to prevent intermittent auth-state parsing failures.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/components/shell/Rail.tsx` around lines 10 - 13, The JWT payload decoding uses atob(token.split('.')[1]) which assumes standard base64; instead normalize the base64url segment before decoding: take the payload part (token.split('.')[1]), replace '-' with '+', '_' with '/', pad with '=' to length %4==0, then pass the normalized string to atob and JSON.parse; update the code around the token variable/JSON.parse(atob(...)) usage to perform this base64url-safe normalization so valid JWTs decode reliably.apps/web/app/manager/page.tsx (1)
191-199:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle unauthenticated/null session explicitly instead of infinite spinner.
When token parsing fails or no token exists, this page never exits the loading state.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/manager/page.tsx` around lines 191 - 199, The current check returns a perpetual loading spinner when session is null; change the logic in the component that references session (the conditional returning Shell with the spinner) to explicitly handle unauthenticated state: if session is null/unauthenticated, either redirect to the sign-in route (call your auth redirect method) or render a clear unauthenticated UI (e.g., Shell with Rail and a centered message + "Sign in" button that triggers the auth flow) instead of the spinner; update the conditional around session in page.tsx so only an undefined/loading state shows the spinner while a null/unauthenticated session shows the explicit fallback and an actionable next step.apps/agent-core/src/graph.py (2)
20-50:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid mutating message objects in
strip_ui_from_messages.This currently edits existing
ToolMessageinstances in place, which can leak modified content back into persisted graph state/history.🔧 Suggested fix
+import copy ... def strip_ui_from_messages(messages: list[BaseMessage]) -> list[BaseMessage]: ... stripped = [] - for msg in messages: + for msg in messages: + msg_copy = copy.deepcopy(msg) # Only process ToolMessage with JSON content - if isinstance(msg, ToolMessage) and msg.content: + if isinstance(msg_copy, ToolMessage) and msg_copy.content: try: - parsed = json.loads(msg.content) + parsed = json.loads(msg_copy.content) ... - msg.content = json.dumps(parsed) + msg_copy.content = json.dumps(parsed) except json.JSONDecodeError: # Non-JSON content, leave as-is pass - stripped.append(msg) + stripped.append(msg_copy)Also applies to: 311-316
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/graph.py` around lines 20 - 50, strip_ui_from_messages mutates ToolMessage instances in place; instead, avoid modifying existing messages by creating a new message object for each modified ToolMessage (e.g., deepcopy the original or construct a new ToolMessage with the same metadata and the cleaned content) and append the new object to stripped while leaving the original msg untouched; update the code paths in strip_ui_from_messages (and any other similar block that assigns to msg.content) to produce and append a new message rather than reassigning msg.content.
380-390:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winUse
AIMessagefor the approval gate assistant message (don’t constructBaseMessage(..., role="assistant")).
approval_gate_nodeappendsBaseMessage(..., role="assistant")intostate["messages"], and those messages are later sent tollm.ainvoke. LangChain expects assistant outputs to be represented withAIMessage(orChatMessagefor arbitrary roles), so this construction is likely to break the approval path at runtime.🔧 Suggested fix
-from langchain_core.messages import BaseMessage, SystemMessage, HumanMessage, ToolMessage +from langchain_core.messages import BaseMessage, SystemMessage, HumanMessage, ToolMessage, AIMessage ... - "messages": state["messages"] + [BaseMessage(content=json.dumps({ + "messages": state["messages"] + [AIMessage(content=json.dumps({ "approval_decision": "APPROVED", "message": "The manager has APPROVED the PR." - }), role="assistant")] + }))] ... - "messages": state["messages"] + [BaseMessage(content=json.dumps({ + "messages": state["messages"] + [AIMessage(content=json.dumps({ "approval_decision": "REJECTED", "message": "The manager has REJECTED the PR." - }), role="assistant")] + }))]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/graph.py` around lines 380 - 390, The assistant reply added in approval_gate_node currently constructs BaseMessage(..., role="assistant") and appends it to state["messages"], which will break downstream llm.ainvoke handling; instead create an AIMessage containing the JSON string and append that to state["messages"] (replace the BaseMessage usages inside approval_gate_node with AIMessage with the same content for both APPROVED and REJECTED branches) so LangChain receives proper assistant messages; keep the JSON payload and role semantics identical but use AIMessage rather than BaseMessage with role="assistant".
🟠 Major comments (20)
apps/agent-core/tests/test_context_compression.py-29-48 (1)
29-48:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTests call a local helper instead of the actual implementation.
Lines 35 and 46 call
create_context_summary_from_pr_info, which is defined within this test file (line 51), rather than the importedcreate_context_summaryfromsrc.graph(line 8). This means the tests validate the test helper's behavior, not the real production code. To follow proper TDD (Red → Green → Refactor), these tests should call the actualsrc.graph.create_context_summaryfunction and fail until the implementation is written.✅ Proposed fix to test the real implementation
def test_create_summary_contains_pr_info(self): """Summary should contain PR number and items.""" - # Create a mock tool result with PR info pr_info = {"prId": "PR-2026-0042", "items": [{"name": "MacBook Pro"}], "success": True} - # The summary function extracts prId from tool results - summary = create_context_summary_from_pr_info(pr_info, "employee@company.com") + # Build a message history that includes the PR submission tool result + messages = [ + HumanMessage(content="Create a PR for MacBook Pro"), + ToolMessage(content=json.dumps(pr_info), tool_call_id="call-123", name="submit_for_approval"), + ] + summary = create_context_summary(messages, "employee@company.com") assert "PR-2026-0042" in summary assert "item" in summary.lower()Remove the local helper function (lines 51-59) once tests use the real implementation.
As per coding guidelines: "Write the test FIRST before implementation. Follow Red → Green → Refactor workflow" for Python files.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_context_compression.py` around lines 29 - 48, Tests are calling the local helper create_context_summary_from_pr_info (defined in this test file) instead of the real production function create_context_summary from src.graph; update the tests to call create_context_summary (the symbol imported at the top) in both test_create_summary_contains_pr_info and test_summary_is_concise, and then delete the local helper definition create_context_summary_from_pr_info so the tests exercise the real implementation (ensure the existing import of create_context_summary is used and no duplicate helper remains).apps/agent-core/tests/test_context_compression.py-14-27 (1)
14-27: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winTrivial assertions don't validate compression logic.
These tests simply check whether a string literal appears in a hardcoded array (lines 18, 25), which always passes and doesn't verify the actual
should_compress_contextbehavior. TDD tests should exercise the real compression decision logic—for example, by constructing a message history with or withoutsubmit_for_approvaltool calls and asserting the function's boolean return value.♻️ Example of a meaningful test
def test_compress_after_submit_for_approval(self): """Should compress context after submit_for_approval is called.""" - # Directly test the logic - check if submit_for_approval was called - # by checking for tool call names in the message flow - tool_call_names = ["search_catalog", "manage_purchase_request", "submit_for_approval"] - - # Check if submit_for_approval is in the tool calls - assert "submit_for_approval" in tool_call_names + messages = [ + HumanMessage(content="Create a PR"), + AIMessage(content="", tool_calls=[{"name": "submit_for_approval", "args": {}, "id": "call-1"}]), + ToolMessage(content='{"success": true}', tool_call_id="call-1", name="submit_for_approval"), + ] + assert should_compress_context(messages) is TrueAs per coding guidelines: "Write the test FIRST before implementation" for Python files.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_context_compression.py` around lines 14 - 27, Replace the trivial string-presence assertions in test_compress_after_submit_for_approval and test_no_compress_without_submission with real calls to the compression decision function (e.g., should_compress_context) using constructed message histories that include or omit a tool call with name "submit_for_approval"; build message lists that mimic the real message objects or dicts your code expects, call should_compress_context(history) and assert True when the history contains a submit_for_approval tool call and False when it does not, updating the tests to reference the actual function name should_compress_context and any helper constructors used to create tool-call messages.apps/web/app/api/finance/flagged-items/[id]/resolve/route.ts-14-20 (1)
14-20: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winValidate the
idparameter and verify the update succeeded.The handler does not validate the
idformat before executing the SQL update, nor does it check whether any row was actually updated. An invalid or non-existentidwill return{ success: true }even when nothing changed.✅ Proposed validation and result check
try { const { id } = await params + + // Validate UUID format (adjust if using different ID scheme) + const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i + if (!uuidRegex.test(id)) { + return NextResponse.json({ error: 'Invalid ID format' }, { status: 400 }) + } - await prisma.$executeRaw` + const result = await prisma.$executeRaw` UPDATE "CatalogItem" SET "pricingFlag" = false, "pricingFlaggedAt" = NULL WHERE id = ${id} ` + + if (result === 0) { + return NextResponse.json({ error: 'Item not found' }, { status: 404 }) + } return NextResponse.json({ success: true })🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/api/finance/flagged-items/`[id]/resolve/route.ts around lines 14 - 20, Validate the incoming id from params (e.g., ensure it's a non-empty string and matches your expected format such as a UUID using a regex or schema validator) before running the SQL, and after calling prisma.$executeRaw`... WHERE id = ${id}` check the returned affected-rows count (the value returned by prisma.$executeRaw) to confirm at least one row was updated; if validation fails return a 400 and if affected rows is 0 return a 404 (or appropriate error) instead of always returning { success: true }. Use the existing params extraction and prisma.$executeRaw call sites to implement these checks.apps/agent-core/tests/test_chat.py-4-5 (1)
4-5:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftAlign
xfailwith the actual test scope and remove hybrid mocking from endpoint “integration” testsIn
apps/agent-core/tests/test_chat.py, the module-levelxfail(lines 4-5) correctly targets the missingclientfixture—but the test cases are also written as endpoint calls that fully mockrouters.chat.create_orchestrator(in the SSE/thread/UI action tests). That makes them a hybrid “integration” test that won’t exercise the real agent flow once theclientfixture is fixed (and conflicts with the “no mocks in integration tests” rule).Either:
- Make these true integration tests (add/enable the FastAPI
clientfixture and stop patchingcreate_orchestrator), or- Treat them as unit tests of endpoint behavior (avoid hitting
/agent/chatvia the client; call the handler/service directly) and adjust the xfail/markers accordingly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_chat.py` around lines 4 - 5, The module-level xfail is OK for the missing FastAPI test client but the tests also patch routers.chat.create_orchestrator which creates a hybrid integration/unit test; pick one approach and update tests accordingly: either enable the FastAPI test client fixture and remove the patches of routers.chat.create_orchestrator so the tests call the real /agent/chat endpoint exercising the full orchestrator flow, or convert the tests into pure unit tests by calling the handler/service functions directly (e.g., the function under test in routers.chat) and keep the patches, then replace the module xfail with appropriate unit-test markers for that scope. Ensure references to routers.chat.create_orchestrator and the endpoint tests (SSE/thread/UI action tests) are updated consistently.apps/web/__tests__/components/genui/PRList.test.tsx-26-187 (1)
26-187:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd the required DB fixture setup hook in test setup.
This suite is missing the mandated
beforeEachpattern withcleanupTestDB()andseedTestFixtures().As per coding guidelines "
**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/__tests__/components/genui/PRList.test.tsx` around lines 26 - 187, The test suite is missing the required DB fixture setup/teardown; add a beforeEach hook that calls cleanupTestDB() then seedTestFixtures() before each test so the suite runs against the mandated real Postgres fixtures. Locate the PRList test file and add a top-level beforeEach(async () => { await cleanupTestDB(); await seedTestFixtures(); }) (or equivalent async setup) near the describe('PRList', ...) so all tests use the seeded DB and teardown is performed consistently.apps/web/__tests__/components/SuggestedChips.test.tsx-19-21 (1)
19-21:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUpdate
beforeEachto include required test DB fixture setup.This hook currently clears mocks only; it also needs
cleanupTestDB()andseedTestFixtures().As per coding guidelines "
**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/__tests__/components/SuggestedChips.test.tsx` around lines 19 - 21, The beforeEach currently only calls vi.clearAllMocks(); update the setup to also call cleanupTestDB() and seedTestFixtures() so the test DB is reset and seeded before each test: locate the beforeEach block that calls vi.clearAllMocks() (in SuggestedChips.test.tsx) and add calls to cleanupTestDB() and await seedTestFixtures() (ensure async/await if needed) alongside vi.clearAllMocks() to satisfy the test fixture requirements.apps/web/__tests__/components/genui/PurchaseRequestDraft.test.tsx-24-238 (1)
24-238:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd the required DB fixture setup hook in test setup.
This suite is missing the mandated
beforeEachpattern withcleanupTestDB()andseedTestFixtures().As per coding guidelines "
**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/__tests__/components/genui/PurchaseRequestDraft.test.tsx` around lines 24 - 238, Add the mandated DB fixture setup by adding a beforeEach hook that calls cleanupTestDB() then seedTestFixtures() before each test in this suite; locate the describe('PurchaseRequestDraft', ...) block and insert the beforeEach with cleanupTestDB() and seedTestFixtures() (and ensure those helper functions are imported/available in the test file) so each test runs against a clean seeded Docker Postgres instance.apps/agent-core/routers/chat.py-95-97 (1)
95-97:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not expose raw exception messages to SSE/API clients.
Line 97 and Line 103 forward
str(e)to callers. This can leak internals (stack details, dependency errors, SQL fragments).Suggested fix
- yield {"event": "error", "data": json.dumps({"message": str(e)})} + yield {"event": "error", "data": json.dumps({"message": "Internal server error"})} ... - raise HTTPException(status_code=500, detail=str(e)) + raise HTTPException(status_code=500, detail="Internal server error")Also applies to: 101-103
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/routers/chat.py` around lines 95 - 97, The code currently forwards raw exception text to clients via the yield payload; instead, log the full exception server-side with logger.exception(logger) or logger.error(..., exc_info=True) and return a generic, non-sensitive client message (e.g., "Internal server error") — optionally include a stable error_id (uuid via uuid.uuid4()) in both the log and the SSE payload so clients can reference the incident; update the error handling around the logger.error and yield {"event": "error", "data": json.dumps({"message": str(e)})} sites to implement this pattern and avoid leaking str(e) to callers.apps/web/components/genui/NotificationBell.tsx-267-270 (1)
267-270:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate
notification.linkscheme beforewindow.open.Line 269 opens untrusted link values directly. A
javascript:/data:URL can be abused. Restrict tohttp:/https:before opening.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/components/genui/NotificationBell.tsx` around lines 267 - 270, The click handler in NotificationBell currently calls window.open(notification.link, ...) without validating the URL scheme; update the handler that contains the hasLink && notification.link branch to validate the link before opening by parsing notification.link (e.g., via the URL constructor or an <a> element) and only allow protocols 'http:' or 'https:'; if the scheme is invalid or parsing fails, do not call window.open (optionally log or ignore), ensuring no javascript:, data:, or other unsafe schemes are opened.apps/agent-core/routers/chat.py-90-91 (1)
90-91:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
thread_idevent is populated from the wrong field.Line 91 emits
threadIdusingpending_pr_numberinstead of a thread identifier. This can break frontend SSE contract handling.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/routers/chat.py` around lines 90 - 91, The SSE `thread_id` emission is using the wrong field; in the yield inside the chat SSE generator replace the value currently pulled from event.get("pending_pr_number", "") with the actual thread identifier (e.g., event.get("thread_id") or fallback to event.get("pending_pr_thread_id")). Update the yield in the generator that checks "pending_pr_id" so it emits {"event":"thread_id", "data": json.dumps({"threadId": <thread id from event.get("thread_id") or event.get("pending_pr_thread_id")>})} to satisfy the frontend SSE contract.apps/web/components/genui/PriceComparisonCard.tsx-409-413 (1)
409-413:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSanitize retailer URLs before binding to
href.Line 410 uses untrusted
retailerLinkdirectly in<a href>. Enforce anhttp/httpsallowlist and drop invalid schemes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/components/genui/PriceComparisonCard.tsx` around lines 409 - 413, Sanitize the untrusted retailerLink before assigning it to href (and before passing to onRetailerClick) in the PriceComparisonCard component: validate the scheme using a strict allowlist (only http:// or https://) by parsing retailerLink (e.g., via the URL constructor) and if invalid or missing a safe scheme, replace with a safe fallback like '#' or omit the href; then bind the sanitized value (e.g., sanitizedRetailerLink) to the <a href> and call onRetailerClick with the sanitized URL instead of the raw retailerLink.apps/agent-core/src/catalog_audit.py-205-207 (1)
205-207:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPool ownership tracking is incorrect and may close caller-managed pools.
If a caller passes
pool, cleanup still closes it whenget_pool()is unavailable. Track whether this function created the pool (e.g.,owns_pool) and only close in that case.Also applies to: 305-313
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/catalog_audit.py` around lines 205 - 207, The code unconditionally closes the passed-in pool; change it to track ownership so only pools this function created are closed: introduce a boolean (e.g., owns_pool) set to False by default, set owns_pool = True when you call _fetch_item_pool() (instead of an unconditional pool assignment), and in the cleanup/finally blocks only close/await pool.close() if owns_pool is True; apply the same pattern to the other similar block referenced (lines ~305-313) so caller-managed pools are never closed by this function.apps/agent-core/scripts/run_catalog_audit.py-62-65 (1)
62-65:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
finallycan raiseUnboundLocalErrorifdependencieswas never imported.
dependenciesis referenced infinallybut only imported in thetryblock. If an error occurs earlier, cleanup can mask the original failure.Also applies to: 88-91
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/scripts/run_catalog_audit.py` around lines 62 - 65, The finally block references the module variable dependencies but the import occurs inside the try, so if the import (or anything before it) fails the finally can raise UnboundLocalError and mask the original exception; to fix, initialize dependencies = None before the try (or use a local flag), then in the finally guard any cleanup that touches dependencies by checking it is not None (or use getattr/dependencies attribute checks) before calling its cleanup functions; apply the same pattern for the second occurrence around the block that spans lines referencing dependencies (the blocks near 62-65 and 88-91) to avoid masking the original error.apps/agent-core/scripts/run_catalog_audit.py-53-55 (1)
53-55:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
REDIS_URLoptionality is broken by forcing localhost default.This makes Redis effectively mandatory and can crash the script on machines without local Redis, despite the script describing Redis as optional caching.
Also applies to: 69-73
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/scripts/run_catalog_audit.py` around lines 53 - 55, The code currently forces REDIS_URL to "redis://localhost:6379" via os.environ.setdefault which makes Redis mandatory; remove the hard default and treat REDIS_URL as optional: stop calling os.environ.setdefault("REDIS_URL", "redis://localhost:6379") and instead read os.environ.get("REDIS_URL") where present, only initializing Redis/cache if that value is non-empty. Apply this change for both occurrences (the block referencing REDIS_URL around the os.environ.setdefault call and the similar block at lines 69-73) so the script won’t crash on machines without a local Redis.apps/agent-core/src/catalog_audit.py-181-185 (1)
181-185:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUnflag path can leave stale
pricingFlaggedAtdespite clearing the flag.When no market data is found, code intends to clear prior flag state, but
pricingFlaggedAtis preserved when$2is null. This leaves inconsistent audit metadata.Also applies to: 252-258
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/catalog_audit.py` around lines 181 - 185, The UPDATE currently preserves pricingFlaggedAt when $2 (marketMedianPrice) is NULL, leaving stale audit data; change the CASE expression in the SQL inside catalog_audit.py so pricingFlaggedAt is set to NULL when $2 IS NULL (i.e. CASE WHEN $2 IS NULL THEN NULL ELSE "pricingFlaggedAt" END) and apply the same fix to the other occurrence mentioned (the block around lines 252-258) so both UPDATEs clear pricingFlaggedAt on the unflag/no-market-data path.apps/web/test-genui-mock.mjs-58-74 (1)
58-74:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMock format doesn't match SSE text/event-stream.
The PR fixes SSE parsing to read
event:lines, but this mock returns JSON{ events: [...] }withcontentType: 'application/json'. The frontend's SSE parser won't be exercised. Consider returningtext/event-streamwith proper SSE format:event: metadata\ndata: {...}\n\n event: message\ndata: {...}\n\nOtherwise, this test validates JSON handling but not the SSE fix.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/test-genui-mock.mjs` around lines 58 - 74, The mock currently sends a JSON payload (mockResponse) via route.fulfill with contentType 'application/json', so the frontend SSE parser isn't exercised; update the page.route handler to return contentType 'text/event-stream' and build the body as an SSE stream by iterating the mock events (the objects in mockResponse.events) and serializing each as "event: <type>\ndata: <JSON data>\n\n" (ensure data is JSON-stringified), then pass that string to route.fulfill (keep status 200). Target the page.route closure, the mockResponse/events array, and the route.fulfill call when making this change.apps/agent-core/src/eval_suite.py-154-157 (1)
154-157:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftNo-op checks let critical failure modes pass by default.
PR-collision and context-confusion branches currently do nothing, so the evaluator returns a false pass for those scenarios.
Also applies to: 177-179
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/eval_suite.py` around lines 154 - 157, The if-blocks that check pr_id and expected_outcome == "pr_created" (and the similar no-op at lines 177-179) are left as pass and therefore silently allow PR-collision and context-confusion cases to pass; replace the no-op with concrete validation logic inside the evaluator function (where pr_id and expected_outcome are used) to detect collisions and mismatches: when expected_outcome == "pr_created" verify a new PR was actually created (compare PR IDs/titles against a provided prior_prs list or context snapshot), and when checking for context-confusion ensure that the produced PR content matches the intended context (or flag a failure). On mismatch, return or raise the appropriate failure/result object (instead of pass) so the evaluator reports a failing test; update both the pr_id block and the analogous branch referenced around lines 177-179 (same function or helper) to use these checks and return explicit failure outcomes.apps/agent-core/src/eval_suite.py-143-151 (1)
143-151:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBudget violation logic is tied to a hardcoded threshold, not request size.
This can misclassify outcomes because it ignores the actual amount being submitted.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/eval_suite.py` around lines 143 - 151, The budget check currently compares remaining to a hardcoded 100000 which misclassifies submissions; change the logic in eval_suite.py (the block using budget_status, tool_names and remaining) to compare remaining against the actual request amount instead of a magic constant—obtain the real request size (e.g., total_request, requested_tokens, payload_size or a passed-in request_amount) from the calling context or function parameters and replace the threshold check (remaining < 100000) with remaining < total_request (or remaining < request_amount); ensure the variable used exists in scope or is added to the function signature and keep the same return structure (passed:false, reason/failure_mode) when the remaining is insufficient.apps/agent-core/tests/test_tools_tdd.py-51-52 (1)
51-52:⚠️ Potential issue | 🟠 Major | ⚡ Quick winParameterize
dept_namein the SQL query.Direct string interpolation here is a SQL injection vector and can also break if
dept_namecontains quotes.🔧 Suggested fix
- dept = await conn.fetchrow(f'SELECT id FROM "Department" WHERE name = \'{dept_name}\' LIMIT 1') + dept = await conn.fetchrow( + 'SELECT id FROM "Department" WHERE name = $1 LIMIT 1', + dept_name, + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_tools_tdd.py` around lines 51 - 52, The SQL uses direct string interpolation of dept_name which is a SQL injection risk and breaks on quotes; change the call to conn.fetchrow to use a parameterized query (e.g., use a positional parameter placeholder like $1) and pass dept_name as a separate argument so that the query becomes something like conn.fetchrow('SELECT id FROM "Department" WHERE name = $1 LIMIT 1', dept_name) and keep the existing return logic (dept["id"] if dept else None).apps/web/app/manager/page.tsx-284-286 (1)
284-286:⚠️ Potential issue | 🟠 Major | ⚡ Quick winForward approval comments to
handleDecision.The comment entered in
ApprovalCardis currently discarded.🔧 Suggested fix
- onSubmitDecision={async (decision) => { - await handleDecision(approval.id, decision) + onSubmitDecision={async (decision, _prNumber, _total, comments) => { + await handleDecision(approval.id, decision, comments) }}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/manager/page.tsx` around lines 284 - 286, The ApprovalCard's onSubmitDecision currently only forwards the decision and drops the entered comment; update the callback to accept the comment argument from ApprovalCard and pass it into handleDecision so the comment is preserved (e.g. change onSubmitDecision={(decision) => ...} to accept the comment payload and call handleDecision(approval.id, decision, comment)). Locate the onSubmitDecision prop on the ApprovalCard and the handleDecision function and ensure handleDecision's signature supports a comment parameter (add a third parameter like comment if missing) so the comment is forwarded and saved.
🟡 Minor comments (6)
apps/agent-core/tests/test_prd_requirements.py-45-51 (1)
45-51:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winGlobal DB singleton mutation is not restored, risking cross-test leakage.
Line 50 sets
src.dependencies._db_poolglobally but this helper doesn’t restore prior state. Failures or interleaving can bleed pool state into other tests.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_prd_requirements.py` around lines 45 - 51, The helper setup_pool_for_tools mutates the global singleton src.dependencies._db_pool without restoring it; update it to save the current src.dependencies._db_pool before assignment and restore that saved value after the test (either by returning a teardown callback or converting setup_pool_for_tools into a context manager/fixture). Specifically, in setup_pool_for_tools capture old_pool = src_deps._db_pool, assign src_deps._db_pool = pool, and ensure old_pool is reassigned to src_deps._db_pool in test teardown or when exiting the context; keep references to the existing function name setup_pool_for_tools and to src.dependencies._db_pool and dependencies.get_pool_singleton so reviewers can locate the change.apps/agent-core/tests/test_dispute_flow.py-85-86 (1)
85-86:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReset
dependencies._db_poolafter each test to avoid stale closed-pool state.These tests inject a global pool and close it, but never restore the singleton. Later tests can pick up a closed pool and fail nondeterministically.
Proposed fix
finally: + from src import dependencies + dependencies._db_pool = None await pool.close()Also applies to: 148-150, 168-169, 182-183, 202-203, 219-220, 239-240, 270-271, 290-291, 329-330
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_dispute_flow.py` around lines 85 - 86, Tests assign a global DB pool to dependencies._db_pool and close it, but never restore the original singleton; capture the previous value before overwriting (e.g., save old_pool = dependencies._db_pool), ensure each test restores dependencies._db_pool to old_pool in a finally/teardown block (or use pytest fixture with yield to set and then reset), and apply this pattern for every test that mutates dependencies._db_pool (locations around the assignments at/near lines where dependencies._db_pool is set) so later tests won't observe a closed pool.apps/agent-core/tests/test_eval_suite.py-42-60 (1)
42-60:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
test_pr_creation_unique_iddoesn’t validate uniqueness yet.This only checks
passedtwice; it never asserts that two PR identifiers differ.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_eval_suite.py` around lines 42 - 60, The test_pr_creation_unique_id currently only checks pr1_result["passed"] and pr2_result["passed"] but doesn't assert the PR identifiers are distinct; update the test to extract the created PR identifier from pr1_result and pr2_result (e.g., pr1_result["pr_id"] or pr1_result["data"]["id"] / pr1_result["output"]["id"] depending on evaluate_response return shape) and add an assertion that they are not equal (assert pr1_id != pr2_id) so the test actually verifies uniqueness; keep the existing passed assertions.apps/agent-core/tests/test_tools_tdd.py-187-190 (1)
187-190:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStrengthen assertions to validate actual behavior.
These assertions currently pass even when filtering/validation behavior is wrong, so they can mask regressions.
Also applies to: 399-401
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/tests/test_tools_tdd.py` around lines 187 - 190, The assertion using data.get("found", 0) >= 0 is too weak—replace it with concrete checks: assert that data contains an "items" list, assert data["found"] == len(data["items"]), and for each returned item assert the item's category matches the expected filter (e.g., item.get("category") == expected_category or contains the filter), and if the test expects results assert data["found"] > 0; apply the same replacement to the other similar assertions in this file.apps/web/app/finance/page.tsx-384-385 (1)
384-385:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse a null check for
marketMedianPricerendering.A value of
0is currently treated as missing and rendered as—.🔧 Suggested fix
- {item.marketMedianPrice ? safePrice(item.marketMedianPrice) : '—'} + {item.marketMedianPrice != null ? safePrice(item.marketMedianPrice) : '—'}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/finance/page.tsx` around lines 384 - 385, The rendering treats 0 as missing because it uses a truthy check; update the conditional that renders marketMedianPrice (the expression referencing item.marketMedianPrice and safePrice) to explicitly check for null/undefined (e.g., item.marketMedianPrice != null) so zero values are passed to safePrice and only null/undefined render the placeholder '—'.apps/agent-core/src/tools.py-912-915 (1)
912-915:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
datetime.utcnow()is deprecated in Python 3.12+.Use
datetime.now(timezone.utc)(already imported) for consistency with other parts of the codebase and future compatibility.Proposed fix
- count = await conn.fetchval('SELECT COUNT(*) FROM "Dispute"') - year = datetime.utcnow().year + count = await conn.fetchval('SELECT COUNT(*) FROM "Dispute"') + year = datetime.now(timezone.utc).year reference_number = f"DIS-{year}-{int(count) + 1:04d}" - submitted_at = datetime.utcnow() + submitted_at = datetime.now(timezone.utc)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/src/tools.py` around lines 912 - 915, Replace deprecated datetime.utcnow() calls with timezone-aware datetime.now(timezone.utc) for the variables year, reference_number, and submitted_at in the code that builds the DIS reference and timestamp; compute year = datetime.now(timezone.utc).year, build reference_number using that year and the same count logic, and set submitted_at = datetime.now(timezone.utc) to keep consistency with the rest of the codebase and use the already imported timezone.
🧹 Nitpick comments (6)
apps/agent-core/requirements.txt (1)
16-18: 💤 Low valueConsider pinning versions for reproducible builds.
The newly added dependencies use minimum-version constraints (
>=), while the existing requirements use exact pins (==). This inconsistency may lead to non-reproducible builds if a new minor/major version of these libraries introduces breaking changes. For production deployments, pinning to exact versions (or using a lock file mechanism) ensures consistent behavior across environments.📌 Example of consistent pinning
-python-dotenv>=1.0.0 -loguru>=0.7.0 -sse-starlette>=2.0.0 +python-dotenv==1.0.1 +loguru==0.7.2 +sse-starlette==2.1.3Alternatively, maintain a
requirements.lockor use Poetry/Pipenv for automatic lock file management.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/agent-core/requirements.txt` around lines 16 - 18, The requirements file currently uses minimum-version constraints for python-dotenv, loguru, and sse-starlette which conflicts with the project's exact pinning policy; update the three entries (python-dotenv, loguru, sse-starlette) to use exact versions (replace ">=" with "==" and set the specific version numbers you verified in CI or the current environment), or alternatively add a lock file (requirements.lock or a Poetry/Pipfile lock) and reference it so builds are reproducible; ensure the chosen versions match what tests currently run against to avoid regressions.apps/web/app/api/finance/flagged-items/route.ts (1)
11-21: ⚡ Quick winType assertion bypasses runtime validation.
The type assertion on line 11 assumes the database returns exactly the specified shape, but Prisma's
$queryRawreturnsunknownby default. If the database schema changes or returns unexpected types (e.g.,Decimalfor numeric columns), the subsequentNumber()conversions (lines 24-25) may silently fail or produceNaN.🛡️ Add runtime validation or use Prisma's typed query methods
Option 1: Use Prisma Client's type-safe
findManyinstead of raw SQL:- const rows = await prisma.$queryRaw<Array<{...}>>` - SELECT ... - FROM "CatalogItem" - WHERE "pricingFlag" = true - ORDER BY "pricingFlaggedAt" DESC NULLS LAST - ` + const rows = await prisma.catalogItem.findMany({ + where: { pricingFlag: true }, + orderBy: [{ pricingFlaggedAt: 'desc' }], + select: { + id: true, + name: true, + sku: true, + unitPrice: true, + category: true, + vendor: true, + vendorCode: true, + marketMedianPrice: true, + pricingFlaggedAt: true, + }, + })Option 2: If raw SQL is required, validate each row before processing:
const rows = await prisma.$queryRaw<Array<unknown>>`...` const validatedRows = rows.map(r => { if (!r || typeof r !== 'object' || !('id' in r)) { throw new Error('Invalid row shape from database') } return r as { id: string; name: string; /* ... */ } })🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/api/finance/flagged-items/route.ts` around lines 11 - 21, The raw call to prisma.$queryRaw is asserted to a specific shape which bypasses runtime checks and can break if DB types change (e.g., Decimal), so replace the raw query with Prisma Client's type-safe method (e.g., use prisma.catalogItem.findMany with an explicit select for id,name,sku,unitPrice,category,vendor,vendorCode,marketMedianPrice,pricingFlaggedAt) so returned types are inferred and Decimal values are handled; if you must keep $queryRaw, change the call to return Array<unknown> and add a validation step that checks each row's shape and converts Decimal-like fields to numbers before the Number(...) conversions (refer to prisma.$queryRaw and the code that currently does Number(...) on unitPrice/marketMedianPrice).apps/web/test-genui-mock.mjs (1)
80-82: ⚡ Quick winReplace
waitForTimeoutwith explicit selector waits.Arbitrary timeouts cause flaky tests. Prefer
waitForSelectorwith the component's data-testid:Suggested fix
- // Wait for the components to render - await page.waitForTimeout(2000); + // Wait for GenUI components to render + await page.waitForSelector('[data-testid="catalog-grid"]', { timeout: 5000 });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/test-genui-mock.mjs` around lines 80 - 82, Replace the flaky fixed delay call page.waitForTimeout(2000) with explicit waits for the rendered components by using page.waitForSelector targeting the component data-testid(s) present in the test (e.g., wait for "[data-testid=\"...\" ]" for the main component or child elements). Update the test to await the selector(s) that indicate the UI is ready (use page.waitForSelector for each critical element or a single parent container) instead of the arbitrary timeout so the test becomes deterministic and less flaky.apps/web/lib/auth/rbac.ts (1)
46-55: 💤 Low value
B2B_ROLE_HIERARCHYis defined but never used.
meetsRoleLevelstill evaluates against the B2CROLE_HIERARCHY. If this function is called with B2B roles likeEMPLOYEEorMANAGER,indexOfreturns-1, causing the check to always fail. Either wiremeetsRoleLevelto use the correct hierarchy based on context, or removeB2B_ROLE_HIERARCHYif unused.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/lib/auth/rbac.ts` around lines 46 - 55, meetsRoleLevel currently always uses ROLE_HIERARCHY while B2B_ROLE_HIERARCHY is unused, causing -1 lookups for B2B roles; update meetsRoleLevel to choose the correct hierarchy by checking whether both userRole and minimumRole exist in B2B_ROLE_HIERARCHY (use that) otherwise fall back to ROLE_HIERARCHY, handle missing roles by returning false (or throwing) to avoid comparing -1 indices, and keep the symbols B2B_ROLE_HIERARCHY, ROLE_HIERARCHY, and meetsRoleLevel to locate the change.apps/web/components/genui/BudgetGauge.tsx (1)
126-131: 💤 Low value
maxBudgetprop is unused inCategoryBar.
maxCategoryBudgetis calculated at line 199 and passed toCategoryBar, but the component never uses it. The bar width is based onspent / budget, not relative to max. Remove the unused prop or clarify its intended purpose.Remove unused prop
-const CategoryBar: FC<{ name: string; budget: number; spent: number; maxBudget: number }> = ({ +const CategoryBar: FC<{ name: string; budget: number; spent: number }> = ({ name, budget, spent, - maxBudget, }) => {- const maxCategoryBudget = Math.max(...categories.map(c => c.budget), 1) ... <CategoryBar key={cat.name || i} name={cat.name} budget={cat.budget} spent={cat.spent} - maxBudget={maxCategoryBudget} />Also applies to: 198-199, 258-265
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/components/genui/BudgetGauge.tsx` around lines 126 - 131, CategoryBar currently accepts a maxBudget prop that is never used; either remove maxBudget from the CategoryBar props and all call sites (where maxCategoryBudget is passed) or update the component to compute widths relative to maxBudget (e.g., use spent / maxBudget instead of spent / budget) depending on intended behavior. Locate the CategoryBar declaration and usages (symbol: CategoryBar and the prop name maxBudget) and either (a) delete maxBudget from the prop type and from every invocation (where maxCategoryBudget is passed), or (b) replace the width calculation inside CategoryBar to reference maxBudget (and add a safe-guard for zero) and keep passing maxCategoryBudget from the caller.apps/web/app/(chat)/page.tsx (1)
84-105: 💤 Low valueConsider using
unknown[]instead ofany[]for type assertions.Lines 89, 98, and similar locations use
as any[]type assertions. Per coding guidelines, preferunknownoveranyto maintain type safety.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/`(chat)/page.tsx around lines 84 - 105, In renderUIComponent update the loose any[] casts to unknown[] to preserve type safety: change casts for items and products (currently using "as any[] | undefined") to "as unknown[] | undefined" and propagate that type where CatalogGrid is called (items and products props), keeping loading, error, and onAddToPR usage unchanged; ensure the UIComponent prop typing reflects unknown[] so CatalogGrid receives unknown[] | undefined instead of any[] and adjust any downstream handling or type guards accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fa9bc4d4-bf39-49a0-88b1-a6d3c36847bd
📒 Files selected for processing (65)
apps/agent-core/migrations/005_add_pricing_flag_columns.sqlapps/agent-core/requirements.txtapps/agent-core/routers/__init__.pyapps/agent-core/routers/chat.pyapps/agent-core/scripts/run_catalog_audit.pyapps/agent-core/src/catalog_audit.pyapps/agent-core/src/eval_suite.pyapps/agent-core/src/graph.pyapps/agent-core/src/tools.pyapps/agent-core/tests/conftest.pyapps/agent-core/tests/test_budget_timing.pyapps/agent-core/tests/test_catalog_audit.pyapps/agent-core/tests/test_chat.pyapps/agent-core/tests/test_context_compression.pyapps/agent-core/tests/test_context_stripping.pyapps/agent-core/tests/test_dispute_flow.pyapps/agent-core/tests/test_dynamic_tools.pyapps/agent-core/tests/test_error_formatting.pyapps/agent-core/tests/test_eval_suite.pyapps/agent-core/tests/test_graph.pyapps/agent-core/tests/test_health.pyapps/agent-core/tests/test_notifications.pyapps/agent-core/tests/test_prd_requirements.pyapps/agent-core/tests/test_sanitization.pyapps/agent-core/tests/test_semantic_search.pyapps/agent-core/tests/test_serpapi_market_price.pyapps/agent-core/tests/test_tools_tdd.pyapps/web/__tests__/components/SuggestedChips.test.tsxapps/web/__tests__/components/genui/ApprovalCard.test.tsxapps/web/__tests__/components/genui/BudgetGauge.test.tsxapps/web/__tests__/components/genui/CatalogGrid.test.tsxapps/web/__tests__/components/genui/PRList.test.tsxapps/web/__tests__/components/genui/PurchaseRequestDraft.test.tsxapps/web/app/(chat)/page.tsxapps/web/app/api/agent/route.tsapps/web/app/api/finance/budget-by-department/route.tsapps/web/app/api/finance/flagged-items/[id]/resolve/route.tsapps/web/app/api/finance/flagged-items/route.tsapps/web/app/api/finance/me/route.tsapps/web/app/api/finance/spend-trend/route.tsapps/web/app/api/finance/top-categories/route.tsapps/web/app/finance/page.tsxapps/web/app/login/page.tsxapps/web/app/manager/page.tsxapps/web/components/SuggestedChips.tsxapps/web/components/genui/ApprovalCard.tsxapps/web/components/genui/BudgetAlert.tsxapps/web/components/genui/BudgetGauge.tsxapps/web/components/genui/CatalogGrid.tsxapps/web/components/genui/DisputeCard.tsxapps/web/components/genui/NotificationBell.tsxapps/web/components/genui/PRList.tsxapps/web/components/genui/PRSubmittedCard.tsxapps/web/components/genui/PriceComparisonCard.tsxapps/web/components/genui/PurchaseRequestDraft.tsxapps/web/components/genui/SourcingRequestCard.tsxapps/web/components/genui/index.tsapps/web/components/shell/Rail.tsxapps/web/lib/auth/jwt.tsapps/web/lib/auth/rbac.tsapps/web/lib/genui/__tests__/safe-render.test.tsapps/web/lib/genui/safe-render.tsapps/web/lib/ui-event-types.tsapps/web/middleware.tsapps/web/test-genui-mock.mjs
✅ Files skipped from review due to trivial changes (6)
- apps/agent-core/migrations/005_add_pricing_flag_columns.sql
- apps/agent-core/tests/test_health.py
- apps/web/app/login/page.tsx
- apps/agent-core/routers/init.py
- apps/web/tests/components/genui/BudgetGauge.test.tsx
- apps/agent-core/tests/test_budget_timing.py
| 'Authorization': token, | ||
| 'Content-Type': 'application/json', | ||
| 'Authorization': token, | ||
| 'x-test-mode': req.headers.get('x-test-mode') ?? '', |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check how agent-core handles x-test-mode header
rg -nC10 'x-test-mode|test.mode|TEST_MODE' apps/agent-core/Repository: Aparnap2/smart_commerce_agent
Length of output: 1704
Fix auth bypass via x-test-mode forwarding
apps/web/app/api/agent/route.ts forwards x-test-mode to agent-core; in apps/agent-core/routers/chat.py, auth is skipped when request.headers.get("x-test-mode") == "true" (or x-user-id == "test-user-id"), with no production/non-production gating shown. This lets any caller bypass JWT by sending these headers. Gate test-mode auth bypass strictly to non-production/internal use (e.g., MOCK_LLM=true/NODE_ENV checks and/or an additional server-side secret) and don’t forward x-test-mode from the public route in production.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/web/app/api/agent/route.ts` at line 87, The public API should not
forward the 'x-test-mode' header unconditionally and the agent-core should not
skip auth solely based on that header; in route.ts stop sending 'x-test-mode' to
agent-core when running in production (e.g., only include 'x-test-mode' when
NODE_ENV !== "production" or process.env.MOCK_LLM is true) or replace it with a
server-only secret header, and in apps/agent-core/routers/chat.py tighten the
bypass check that currently reads request.headers.get("x-test-mode") == "true"
(and x-user-id == "test-user-id") so it also verifies a server-side gating
condition (e.g., process.env.MOCK_LLM === "true" or a shared internal secret)
before skipping JWT auth.
| const role = request.headers.get('x-role') | ||
| const userId = request.headers.get('x-user-id') | ||
|
|
||
| if (!role || !['FINANCE', 'ADMIN'].includes(role)) { | ||
| return NextResponse.json({ error: 'Forbidden' }, { status: 403 }) | ||
| } |
There was a problem hiding this comment.
Do not authorize from client-controlled headers.
x-role/x-user-id can be forged by callers, so this endpoint currently authorizes based on untrusted input. Derive role and user identity only from a verified server-side auth source (e.g., validated JWT/cookie context) and treat these headers as non-authoritative.
Also applies to: 17-24
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/web/app/api/finance/me/route.ts` around lines 5 - 10, The code is
currently authorizing using untrusted headers (role/userId); replace reading
request.headers.get('x-role') and 'x-user-id' in the route handler with
server-validated identity from your auth layer (e.g., validate the request
cookie/JWT or call your session helper) and extract role and user id from the
verified token/session claims; update the authorization check (the logic that
uses role and userId, including the similar block at the later check around
lines 17-24) to use those verified values and remove reliance on the
client-controlled headers.
… across 16 dimensions) - Add groq provider to create_llm() with ThinkTagStrippingLLM wrapper - Add openrouter provider to create_llm() + get_openai_client() - Fix SSE handler: parse __ui__ from message content JSON, add user_role field - Add x-test-mode auth bypass header for test env - Create test_agentic_ai_eval.py: 46 tests covering tool calling, output format, RAG, decision making, compaction, state persistence, GenUI payloads, short/long-term memory, context/harness engineering, edge cases, chaos/concurrency, checkpoints, system prompt integrity, performance - Register pytest integration marker in pyproject.toml
❌ Deploy Preview for ai-sdk failed. Why did it fail? →
|
…, docker - agent-core: SSE streaming, graph, tools, salesforce client, eval suite dependencies, 16 test files (unit + integration + e2e + genui + enforcement) - web: support cockpit (three-panel), 9 GenUI support components, UIContext, ContextPanel, RBAC, middleware, auth - mocks: salesforce mock server, slack mock server - infra: docker-compose.supabase.yml, opencode config
ProcureAI — Features F8–F15
What this does
Completes the core procurement loop: employees search (catalog + live market),
build purchase requests, submit for approval, and receive decisions — all via
natural language with Generative UI components rendering real data.
Architecture highlights
interrupt(), manager resumesvia
Command(resume=decision)— no polling, no webhooksat runtime — role boundary enforced in agent, middleware, and API layer
Test coverage
What changed
Demo walkthrough (5 min)
Out of scope (intentional)
Summary by CodeRabbit
New Features
Chores
Tests