feat: F8-F15 — Manager/Finance dashboards, sourcing, disputes, pricing audit, RBAC by Aparnap2 · Pull Request #10 · Aparnap2/SupportPilot

Aparnap2 · 2026-05-21T14:27:54Z

ProcureAI — Features F8–F15

What this does

Completes the core procurement loop: employees search (catalog + live market),
build purchase requests, submit for approval, and receive decisions — all via
natural language with Generative UI components rendering real data.

Architecture highlights

LangGraph HITL: approval freezes thread via interrupt(), manager resumes
via Command(resume=decision) — no polling, no webhooks
SerpApi fallback: catalog miss → live Google Shopping prices, Redis 15min cache
Dynamic toolsets: EMPLOYEE/MANAGER/FINANCE/ADMIN each get a filtered tool list
at runtime — role boundary enforced in agent, middleware, and API layer
GenUI registry: 7 components, all handle loading/empty/null states

Test coverage

106 Python passing, 5 skipped, 19 xfailed (catalog audit needs SERPAPI_KEY in CI)
0 new TypeScript errors
1 pre-existing asyncio race in test_graph.py (passes in isolation, see inline comment)

What changed

Area	Detail
F8: Manager Dashboard	Pending approvals with Approve/Reject inline, dept spend cards, recent activity log - /manager
F9: Finance Dashboard	Budget utilisation bars, spend trend, top categories, flagged items with Resolve - /finance
F10: Vendor sourcing	raise_vendor_sourcing_request tool, SourcingRequestCard GenUI, system prompt search strategy
F11: Dispute flow	Full end-to-end: PR validation, auto ref (DIS-YYYY-NNNN), status transition, audit trail - 5/5 tests
F13: Notifications	NotificationBell GenUI - badge, dropdown, mark-read/mark-all-read, type-based icons
F14: Pricing audit	Background job comparing catalog prices vs SerpApi market median, flags >15% over - CLI + migration
F15: RBAC	Route protection in middleware.ts, x-role + x-user-id header forwarding, get_tools_for_role - 40 tests

Demo walkthrough (5 min)

Login as employee -> /chat -> "I need a standing desk" -> catalog miss -> SerpApi fires -> PriceComparisonCard with live retailer data
"Add the cheapest to PR" -> vendor sourcing guardrail fires (off-catalog)
"Show me approved monitors" -> catalog hit -> CatalogGrid -> add 2 units
"What's my department budget?" -> BudgetGauge with real numbers
"Submit for approval" -> thread freezes, notification fires
Switch to manager -> /manager -> approve PR -> budget debits live
Switch back -> chat shows "PR approved" - thread resumed

Out of scope (intentional)

F16: Supplier scorecards - deferred, no seed data for vendor performance history
F17: Multi-currency - deferred, all demo data is INR; FX rate API integration not warranted pre-demo

Summary by CodeRabbit

New Features
- Mock LLM development mode and SSE streaming chat with richer GenUI components (catalog, price-comparison, sourcing, notifications).
- Cookie/JWT-based auth + role-aware routing (Employee/Manager/Finance/Admin) and new finance dashboard with flagged-items management.
- Catalog pricing audit job and UI for flagged items; expanded PR approval and drafting UIs.
Chores
- Updated .gitignore for test/report artifacts.
Tests
- Expanded unit/integration suites and updated E2E credentials/flows.

- Add MockLLM class for testing without real LLM (MOCK_LLM=true) - Skip DB init when MOCK_LLM=true to avoid connection errors - Fix SSE parser to read event: line type instead of JSON type field - Add x-test-mode header forwarding through web proxy to agent-core - Bypass JWT auth when x-test-mode or x-user-id=test-user-id - Fix catalog-grid to check items prop matching LLM output - Strip __ui__ JSON metadata from message display text - Fix /health endpoint for mock mode without DB pool

netlify · 2026-05-21T14:28:00Z

❌ Deploy Preview for ai-sdk failed. Why did it fail? →

Name	Link
🔨 Latest commit	`a008fff`
🔍 Latest deploy log	https://app.netlify.com/projects/ai-sdk/deploys/6a127f9028fc970008c5ea5b

coderabbitai · 2026-05-21T14:28:09Z

Warning

Review limit reached

@Aparnap2, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 54 minutes and 17 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2eda1c66-e490-46e8-a598-a8a947c59f92

📥 Commits

Reviewing files that changed from the base of the PR and between a008fff and ca15ece.

⛔ Files ignored due to path filters (14)

apps/web/cypress/videos/b2b-approval.cy.ts-compressed.mp4 is excluded by !**/*.mp4
apps/web/cypress/videos/b2b-approval.cy.ts.mp4 is excluded by !**/*.mp4
apps/web/tests/package-lock.json is excluded by !**/package-lock.json
apps/web/tests/playwright-report/data/35f7fd21d87fe285c7b68ac291a36226d3d890fc.webm is excluded by !**/*.webm
apps/web/tests/playwright-report/data/55531981987141568265e2d703ff439650290c21.webm is excluded by !**/*.webm
apps/web/tests/playwright-report/data/a3724b4e45683f238bf79760f78fed2cf0a9d43a.png is excluded by !**/*.png
apps/web/tests/playwright-report/data/cb2b688ca5704231655d6ecdc93636d9467bd140.png is excluded by !**/*.png
apps/web/tests/playwright-report/data/e517bca78c856373c7963aa7acdcc2dc999b09b3.webm is excluded by !**/*.webm
apps/web/tests/test-results/approval-flow-Approval-Flo-636f5-tes-PR-and-manager-approves-employee/test-failed-1.png is excluded by !**/*.png
apps/web/tests/test-results/approval-flow-Approval-Flo-636f5-tes-PR-and-manager-approves-employee/video.webm is excluded by !**/*.webm
apps/web/tests/test-results/approval-flow-Approval-Flo-df74a-an-view-their-submitted-PRs-employee/test-failed-1.png is excluded by !**/*.png
apps/web/tests/test-results/approval-flow-Approval-Flo-df74a-an-view-their-submitted-PRs-employee/video.webm is excluded by !**/*.webm
apps/web/tests/test-results/approval-flow-Approval-Flow-E2E-catalog-search-shows-items-employee/test-failed-1.png is excluded by !**/*.png
apps/web/tests/test-results/approval-flow-Approval-Flow-E2E-catalog-search-shows-items-employee/video.webm is excluded by !**/*.webm

📒 Files selected for processing (112)

.opencode/opencode.json
.opencode/plugins/graphify.js
.tmp/tasks/supportpilot-phase1/subtask_01.json
.tmp/tasks/supportpilot-phase1/subtask_02.json
.tmp/tasks/supportpilot-phase1/subtask_03.json
.tmp/tasks/supportpilot-phase1/subtask_04.json
.tmp/tasks/supportpilot-phase1/task.json
.tmp/tasks/supportpilot-phase2/subtask_01.json
.tmp/tasks/supportpilot-phase2/subtask_02.json
.tmp/tasks/supportpilot-phase2/subtask_03.json
.tmp/tasks/supportpilot-phase2/subtask_04.json
.tmp/tasks/supportpilot-phase2/subtask_05.json
.tmp/tasks/supportpilot-phase2/task.json
PRD.md
apps/agent-core/main.py
apps/agent-core/migrations/006_add_support_tables.sql
apps/agent-core/pyproject.toml
apps/agent-core/routers/chat.py
apps/agent-core/src/dependencies.py
apps/agent-core/src/eval_suite.py
apps/agent-core/src/graph.py
apps/agent-core/src/llm_config.py
apps/agent-core/src/salesforce/__init__.py
apps/agent-core/src/salesforce/client.py
apps/agent-core/src/support/__init__.py
apps/agent-core/src/support/tools.py
apps/agent-core/src/tools.py
apps/agent-core/test_http_mode.py
apps/agent-core/tests/conftest.py
apps/agent-core/tests/test_agentic_ai_eval.py
apps/agent-core/tests/test_budget_timing.py
apps/agent-core/tests/test_chat_mockllm_sse.py
apps/agent-core/tests/test_context_accumulation.py
apps/agent-core/tests/test_dynamic_tools.py
apps/agent-core/tests/test_e2e_mock_llm.py
apps/agent-core/tests/test_e2e_support_flow.py
apps/agent-core/tests/test_enforcement.py
apps/agent-core/tests/test_eval_suite.py
apps/agent-core/tests/test_genui_contracts.py
apps/agent-core/tests/test_genui_emits.py
apps/agent-core/tests/test_graph.py
apps/agent-core/tests/test_langfuse_metadata.py
apps/agent-core/tests/test_prd_requirements.py
apps/agent-core/tests/test_prompt_sensitivity.py
apps/agent-core/tests/test_role_boundaries.py
apps/agent-core/tests/test_salesforce_client.py
apps/agent-core/tests/test_support_integration.py
apps/agent-core/tests/test_support_migration.py
apps/agent-core/tests/test_support_tools.py
apps/agent-core/tests/test_tax_calculation.py
apps/agent-core/tests/test_tool_selection.py
apps/web/__tests__/lib/auth/rbac-support.test.ts
apps/web/app/(chat)/page.tsx
apps/web/app/(chat)/support/page.tsx
apps/web/app/api/agent/route.ts
apps/web/app/api/agent/session/route.ts
apps/web/app/globals.css
apps/web/app/manager/page.tsx
apps/web/app/test/genui/page.tsx
apps/web/components/genui/ApprovalCard.tsx
apps/web/components/genui/BudgetAlert.tsx
apps/web/components/genui/BudgetGauge.tsx
apps/web/components/genui/CatalogGrid.tsx
apps/web/components/genui/DisputeCard.tsx
apps/web/components/genui/PRList.tsx
apps/web/components/genui/PRSubmittedCard.tsx
apps/web/components/genui/PurchaseRequestDraft.tsx
apps/web/components/genui/index.ts
apps/web/components/genui/support/CaseCreatedCard.tsx
apps/web/components/genui/support/CaseDetailCard.tsx
apps/web/components/genui/support/CaseListCard.tsx
apps/web/components/genui/support/CaseUpdatedCard.tsx
apps/web/components/genui/support/CustomerContextCard.tsx
apps/web/components/genui/support/EscalationCard.tsx
apps/web/components/genui/support/KBResultsCard.tsx
apps/web/components/genui/support/ReplyDraftCard.tsx
apps/web/components/genui/support/SimilarTicketsCard.tsx
apps/web/components/genui/support/index.ts
apps/web/components/shell/Rail.tsx
apps/web/components/support/ContextPanel.tsx
apps/web/components/support/UIContext.tsx
apps/web/cypress/e2e/genui-edge-cases.cy.ts
apps/web/cypress/e2e/rail-budget.cy.ts
apps/web/cypress/support/e2e.ts
apps/web/lib/auth/index.ts
apps/web/lib/auth/jwt.ts
apps/web/lib/auth/rbac.ts
apps/web/lib/genui/types.ts
apps/web/lib/stores/index.ts
apps/web/lib/stores/procurement.ts
apps/web/lib/ui-event-types.ts
apps/web/middleware.ts
apps/web/test-results/.last-run.json
apps/web/tests/e2e/approval-flow.spec.ts
apps/web/tests/genui-edge-cases.spec.ts
apps/web/tests/middleware/route-protection.test.ts
apps/web/tests/playwright-report/data/86189765a499e53d242e82d706b9a7c3e6553f16.md
apps/web/tests/playwright-report/data/8d36f51aeec70ed1f815305068c998b84296aa8c.md
apps/web/tests/playwright-report/data/ceb0b2f78ed790c57f044743b6e94d013bbf2189.md
apps/web/tests/playwright-report/index.html
apps/web/tests/playwright.config.ts
apps/web/tests/playwright/.auth/employee.json
apps/web/tests/playwright/.auth/manager.json
apps/web/tests/playwright/auth-smoke.spec.ts
apps/web/tests/playwright/auth.setup.ts
apps/web/tests/test-results/.last-run.json
apps/web/tests/test-results/approval-flow-Approval-Flo-636f5-tes-PR-and-manager-approves-employee/error-context.md
apps/web/tests/test-results/approval-flow-Approval-Flo-df74a-an-view-their-submitted-PRs-employee/error-context.md
apps/web/tests/test-results/approval-flow-Approval-Flow-E2E-catalog-search-shows-items-employee/error-context.md
docker-compose.supabase.yml
mocks/salesforce-mock-server.js
mocks/slack-server.js

📝 Walkthrough

Walkthrough

This PR transitions the application from NextAuth to JWT cookie-based authentication, refactors the chat UI from LangGraph-driven to SSE-based local state, introduces a mock LLM mode with role-based tool routing in the agent core, redesigns the approval workflow with database functions and flexible GenUI callbacks, normalizes Prisma schema embeddings from vector types to strings, and instruments UI components with test IDs.

Changes

End-to-End Auth, Chat, and Approval Redesign

Layer / File(s)	Summary
Backend Mock LLM and Role-Based Tool Routing `apps/agent-core/src/dependencies.py`, `apps/agent-core/src/graph.py`, `apps/agent-core/src/tools.py`, `apps/agent-core/routers/chat.py`, `apps/agent-core/src/catalog_audit.py`, `apps/agent-core/scripts/run_catalog_audit.py`, `apps/agent-core/migrations/005_add_pricing_flag_columns.sql`, `apps/agent-core/requirements.txt`	Agent core supports MOCK_LLM mode with `MockLLM`, added tool set (compare_market_price, vendor_sourcing_request, get_pricing_audit_results), message sanitization helpers, role-aware get_llm(role), new SSE chat router (`/agent/chat` and `/agent/stream`), catalog audit job and script, and migration adding pricing flag fields.
JWT Authentication System (NextAuth→Supabase/Cookie) `apps/web/app/api/auth/[...nextauth]/route.ts`, `apps/web/app/api/auth/login/route.ts`, `apps/web/app/api/auth/signup/route.ts`, `apps/web/components/auth-provider.tsx`, `apps/web/lib/auth/jwt.ts`, `apps/web/lib/auth/rbac.ts`, `apps/web/middleware.ts`, `apps/web/app/(admin)/layout.tsx`	NextAuth disabled; login now uses pg Pool + bcrypt with fallback, tokens include departmentId and AppRole, signup/login cookies use SameSite=Lax, new auth-provider reads token cookie and subscribes via Supabase client, RBAC updated to AppRole and exposes checkRouteAccess, middleware parses token cookie and injects x-role/x-user-id headers.
Chat UI SSE Integration `apps/web/app/(chat)/page.tsx`, `apps/web/app/api/agent/route.ts`, `apps/agent-core/routers/chat.py`	Client chat switched to SSE parsing from POST /api/agent using cookie token; local messages and uiComponents arrays drive rendering; API proxy enforces x-user-id/x-role headers and injects session into proxied body; agent-core streams `delta`, `ui_actions`, `thread_id`, and `complete` events.
Approval Workflow with Database Functions and GenUI Callbacks `apps/web/app/api/approvals/[prId]/decide/route.ts`, `apps/web/components/genui/ApprovalCard.tsx`, `apps/agent-core/src/tools.py`	Approval POST now calls `approve_pr_and_debit_budget` for APPROVED and updates REJECTED separately; ApprovalCard refactored to accept pr object or direct props and to call async `onSubmitDecision` callback; agent-core `process_approval` updated to debit via DB function and reorganize notifications.
GenUI Components and Safe-Render Utilities `apps/web/lib/genui/safe-render.ts`, `apps/web/lib/genui/__tests__/safe-render.test.ts`, `apps/web/components/genui/*`, `apps/web/components/genui/index.ts`	Added safe-render helpers and tests; many GenUI components rewritten/added (CatalogGrid, PRList, PurchaseRequestDraft, BudgetGauge, NotificationBell, PriceComparisonCard, SourcingRequestCard, ApprovalCard, PRSubmittedCard, DisputeCard) with defensive rendering, testids, and new props/contracts; index now lazy-loads some components.
Finance APIs and Dashboard `apps/web/app/api/finance/*`, `apps/web/app/finance/page.tsx`	New finance endpoints for budget-by-department, flagged-items list and resolve, spend trend, top categories, and finance "me"; client finance dashboard page added to consume these endpoints and resolve flagged items.
Schema Normalization, Build Config, and Dependencies `prisma/schema.prisma`, `apps/web/next.config.mjs`, `apps/web/package.json`, `apps/agent-core/requirements.txt`	Prisma preview features narrowed; vector/tsvector embedding fields changed to nullable String; migration adds pricingFlag/marketMedianPrice; Prisma packages bumped; Next.js build configured to ignore TS/ESLint errors; agent-core Python requirements updated.
Catalog Audit Job & Tools Tests `apps/agent-core/src/catalog_audit.py`, `apps/agent-core/tests/test_catalog_audit.py`, `apps/agent-core/tests/test_serpapi_market_price.py`	New catalog pricing audit implementation and script; tests added for audit logic, SerpApi integration, caching, and audit results; tests inject Redis and DB pools.
Tests, Fixtures, and E2E Setup `apps/agent-core/tests/`, `apps/web/__tests__/`, `apps/web/cypress.config.ts`, `apps/web/cypress/e2e/b2b-approval.cy.ts`, `.gitignore`, `apps/web/test-genui-mock.mjs`	Extensive new/updated unit and integration tests across agent-core and web, conftest refactor to function-scoped transactional DB connections, E2E baseUrl updated, Playwright genui mock script added, and Playwright artifacts ignored in .gitignore.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Aparnap2/smart_commerce_agent#9: Overlaps agent-core approval/notification changes touching apps/agent-core/src/tools.py.
Aparnap2/smart_commerce_agent#8: Related to process_approval refactor and tool behavior.
Aparnap2/smart_commerce_agent#6: Related work on tool set and procurement tool changes.

Poem

🐰 A rabbit's nibble at the code tonight,
Cookies now guard the auth just right,
SSE whispers stream the agent's mind,
Mock LLM hops in for tests we find,
GenUI blooms and audits hum — delight!

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/b2b-pivot-complete

gemini-code-assist

Code Review

This pull request transitions the application from next-auth to a custom JWT-based authentication system, implements a manual SSE streaming client for agent interactions, and introduces a mock LLM for testing. Significant backend updates include role-based tool filtering, context compression logic, and atomic budget operations. Feedback focuses on critical security and reliability issues: sensitive JWT tokens are being logged in middleware, the custom SSE parser fails to maintain state across chunks or handle partial JSON deltas, and the authentication layout contains a logic flaw that bypasses role-based access control. Furthermore, the database connection logic is fragile, the build configuration dangerously suppresses type and lint errors, and the removal of native vector types from the schema will break search functionality.

gemini-code-assist · 2026-05-21T14:30:29Z

+  const hasAuthCookie = cookieHeader.includes('token=')
+
+  // Debug logging (remove in production)
+  console.log('[Middleware] Path:', path, '| Has token cookie:', hasAuthCookie, '| Cookie:', cookieHeader.slice(0, 100))


Logging the cookieHeader (which contains the JWT token) is a security risk as it exposes sensitive authentication credentials in the server logs.

Suggested change

console.log('[Middleware] Path:', path, '| Has token cookie:', hasAuthCookie, '| Cookie:', cookieHeader.slice(0, 100))

// Debug logging (remove in production)

console.log('[Middleware] Path:', path, '| Has token cookie:', hasAuthCookie);

gemini-code-assist · 2026-05-21T14:30:29Z

+  if (user.role !== 'MERCHANT' && user.role !== 'ADMIN') {
+    // Allow SHOPPER and SUPPORT through - they have their own routes
+    // For now, just let them pass
  }


This block is empty and effectively allows any authenticated user (including those with SHOPPER or SUPPORT roles) to access the admin/merchant layout. If this layout is intended to protect admin routes, it must perform a redirect for unauthorized roles.

if (user.role !== 'MERCHANT' && user.role !== 'ADMIN') { redirect('/chat'); }

gemini-code-assist · 2026-05-21T14:30:29Z

+      let buffer = ''
+
+      while (true) {
+        const { done, value } = await reader.read()
+        if (done) break
+
+        buffer += decoder.decode(value, { stream: true })
+
+        // Parse SSE events
+        const lines = buffer.split('\n')
+        buffer = lines.pop() || ''
+
+        let eventType = ''


The eventType variable is initialized inside the while loop, which means its state is lost between chunks. In SSE, an event: line can arrive in a different chunk than the corresponding data: line. This will cause the parser to fail to identify the event type for data arriving in subsequent chunks.

Suggested change

let buffer = ''

while (true) {

const { done, value } = await reader.read()

if (done) break

buffer += decoder.decode(value, { stream: true })

// Parse SSE events

const lines = buffer.split('\n')

buffer = lines.pop() || ''

let eventType = ''

let buffer = '';

let eventType = '';

while (true) {

const { done, value } = await reader.read();

if (done) break;

buffer += decoder.decode(value, { stream: true });

// Parse SSE events

const lines = buffer.split('\n');

buffer = lines.pop() || '';

gemini-code-assist · 2026-05-21T14:30:30Z

+                  const inner = JSON.parse(content)
+                  if (inner.__ui__) {
+                    setUIComponents(prev => [...prev, {
+                      type: inner.__ui__.name,
+                      props: inner.__ui__.props,
+                    }])
+                    content = ''  // Skip showing JSON metadata in messages
+                  }
+                } catch {
+                  // Not JSON, use as-is
+                }


Attempting to JSON.parse the content delta directly will fail when using a real streaming LLM, as the JSON payload will be delivered in multiple small chunks. Furthermore, if the parse fails, the raw JSON fragment is currently appended to the chat messages (line 134), which will result in a poor user experience with visible JSON syntax in the UI.

gemini-code-assist · 2026-05-21T14:30:30Z


+const pool = new Pool({
+  connectionString: process.env.DATABASE_URL 
+    ? process.env.DATABASE_URL.replace('postgres:postgres', 'supabase_admin:postgres').replace('smart_commerce', 'postgres')


Using .replace() on the DATABASE_URL to swap credentials and database names is extremely fragile and insecure. It assumes a specific format and can easily break if the URL contains the search strings in other parts (e.g., in the password). Environment-specific configurations should be handled via separate environment variables.

gemini-code-assist · 2026-05-21T14:30:30Z

+    ignoreBuildErrors: true,
+  },
+  eslint: {
+    ignoreDuringBuilds: true,


Disabling TypeScript and ESLint checks during build is dangerous as it allows code with type errors or linting violations to be deployed to production. This should only be used as a temporary measure and not committed to the repository.

gemini-code-assist · 2026-05-21T14:30:30Z

+                    for item in parsed["products"]:
+                        if isinstance(item, dict):
+                            item.pop("embedding", None)
+                msg.content = json.dumps(parsed)


The strip_ui_from_messages function mutates the msg.content of the original message objects in place. Since these messages are part of the AgentState, this mutation can lead to unexpected side effects or inconsistent state in LangGraph. It is safer to create a copy of the message before modifying it.

Suggested change

msg.content = json.dumps(parsed)

msg = msg.model_copy(update={"content": json.dumps(parsed)})

gemini-code-assist · 2026-05-21T14:30:30Z

  id             String                @id @default(uuid())
  productId      Int                   @map("product_id")
-  embedding      Unsupported("vector") @map("embedding")
+  embedding      String? @map("embedding")


Changing the embedding type from Unsupported("vector") to String? disables native vector support in the database schema. This will break any existing vector similarity search functionality that relies on the pgvector extension.

coderabbitai

Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

apps/web/app/api/agent/route.ts (1)
42-49: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Don’t forward client-controlled test/auth headers to agent-core.

This route now passes x-test-mode straight through while still trusting x-user-id from the browser. Given the documented agent-core bypass for test-mode/test-user requests, a caller can hit /api/agent with spoofed headers and reach mock/test paths without a real authenticated identity. Derive the user id server-side from the verified token/cookie and strip x-test-mode outside dedicated test environments.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/api/agent/route.ts` around lines 42 - 49, Don't forward
client-controlled headers: remove req.headers.get('x-test-mode') and the
client-provided x-user-id from the fetch headers when calling agentUrl. Instead,
derive userId server-side from the verified token/cookie (the same place you
produce the local variable userId) and set 'x-user-id' from that server-verified
value; only add an 'x-test-mode' header when running in a dedicated test
environment/flag (e.g., NODE_ENV/test flag) inside the same route handler before
the fetch. Update the fetch call that builds headers (the block using agentUrl,
token, req.headers and userId) to use the server-derived userId and
conditionally include x-test-mode, and ensure req.headers is not trusted for
auth/test fields.

🧹 Nitpick comments (2)

apps/web/app/(chat)/page.tsx (1)
17-20: ⚡ Quick win

Replace any here with concrete or unknown shapes.

props: any and (window as any) remove type safety in the exact path that parses untrusted streaming payloads. Please introduce a typed UI payload plus a small Window augmentation instead of opting out of checking here. As per coding guidelines, Never use TypeScript any type - use proper types or unknown instead.

Also applies to: 49-49
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/`(chat)/page.tsx around lines 17 - 20, The UIComponent interface
currently uses props: any which removes type safety when parsing untrusted
streaming payloads; replace props: any with a concrete discriminated union or at
minimum props: unknown and define a typed UI payload interface (e.g., UiPayload
/ UiComponentProps variants) that describes the allowed component types and
their props, then narrow/validate the props at parse time in the function that
consumes the stream (referencing UIComponent and the parser logic); also add a
small Window augmentation for the global payload accessor instead of using
(window as any) so the global variable has a typed signature, and update all
usages (including the other occurrence at line ~49) to perform proper runtime
validation/type narrowing before treating props as specific shapes.
apps/agent-core/requirements.txt (1)
3-3: ⚡ Quick win

Pin the MCP package to the tested API surface.

apps/agent-core/mcp_server.py still imports multiple concrete MCP symbols, so mcp>=1.0.0 makes installs non-reproducible and can break startup the next time MCP moves another import. Prefer an exact pin or a bounded range that matches the version you've actually verified.
For the Python `mcp` package, which released versions expose `mcp.server.Server`, `mcp.server.stdio.stdio_server`, `mcp.types.Tool`, `mcp.types.TextContent`, and `mcp.server.lifecycle.LifespanManager`?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/requirements.txt` at line 3, Replace the loose dependency
"mcp>=1.0.0" with a precise pin or bounded range that matches the version you
actually tested (e.g. mcp==<tested_version> or mcp>=X.Y.Z,<X.Y+1.0>) so installs
remain reproducible; then verify the chosen release exposes the concrete symbols
imported by apps/agent-core/mcp_server.py (mcp.server.Server,
mcp.server.stdio.stdio_server, mcp.types.Tool, mcp.types.TextContent, and
mcp.server.lifecycle.LifespanManager) and update the pin if any symbol is
missing.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/agent-core/src/tools.py`:
- Around line 895-898: The finance_tools list currently includes raise_dispute
which performs writes (mutates PR state and writes audit entries), violating the
intended read-only role; remove raise_dispute from the finance_tools array
(leave base_tools only) or replace it with a read-only alternative if finance
users must see dispute info without mutating state — update the finance_tools
declaration (referencing the finance_tools variable and the raise_dispute
symbol) accordingly and add a brief comment explaining that write-capable tools
are intentionally excluded.

In `@apps/web/app/`(admin)/layout.tsx:
- Around line 31-36: The current guard checks user.role but leaves the branch
empty, allowing unauthorized roles to render the admin layout; update the branch
(the if checking user.role !== 'MERCHANT' && user.role !== 'ADMIN') to
immediately block access by performing a redirect or returning an access-denied
response (e.g., call the app's redirect helper to send users to a forbidden page
or throw a 403/Next.js notFound/redirect). Ensure the change is applied inside
the same conditional that inspects user.role so unauthorized users are not
allowed to continue rendering the admin layout.

In `@apps/web/app/`(chat)/page.tsx:
- Around line 75-82: Hardcoding real users to 'employee' in the request
headers/body causes identity collapse; update the fetch call that sets
'x-user-id' and body.user_id (the code using isTestMode, token, 'x-test-mode',
'x-user-id', Authorization, and JSON body with messages) so that when isTestMode
is false it uses the actual authenticated user's identifier (e.g., their email
or userId variable) instead of the literal 'employee'; ensure both the header
'x-user-id' and body.user_id are populated from the same real-user value the app
has (fall back to a safe default only if that value is absent) to preserve
department/role lookup and per-user rate-limiting.
- Around line 67-68: Replace the brittle cookie parsing expression that sets
const token using document.cookie.split('token=') with a call to the existing
getCookie('token') helper; update the assignment for const token to use
getCookie('token') (and fall back to '' if null/undefined) so it correctly
URL-decodes the value and avoids matching names like refresh_token.
- Around line 198-204: The budget card in the 'budget-gauge' render is using
ui.props?.total (which is absent) instead of the payload field returned by
get_budget_status; update the JSX in the 'budget-gauge' case to read the monthly
budget from ui.props?.monthlyBudget (keep the existing division by 100 and
toLocaleString formatting and preserve ui.props?.remaining for the numerator) so
the "of ₹..." value displays the correct monthlyBudget amount.

In `@apps/web/app/api/approvals/`[prId]/decide/route.ts:
- Around line 71-104: The approve_pr_and_debit_budget DB call can commit before
the subsequent audit and pRApproval inserts, causing partial commits and unsafe
retries; fix by performing the raw function call and the follow-up writes
(audit-entry insert, pRApproval insert/follow-up updates and the eventual fetch
of the updated purchaseRequest) inside a single Prisma transaction using
prisma.$transaction (or move the audit and pRApproval inserts into the
approve_pr_and_debit_budget DB function), so that
approve_pr_and_debit_budget(...), the audit/pRApproval inserts, and the final
purchaseRequest read are all executed atomically and a single success/failure is
returned to the route handler.

In `@apps/web/app/api/auth/login/route.ts`:
- Around line 82-87: The JWT and response are using the wrong property name for
department — replace accesses to user.department_id with user.departmentId (the
query selects "departmentId") so signToken(...) gets departmentId and the
response payload includes departmentId; update all occurrences (e.g., the object
passed to signToken and the response construction around the login handler) to
use user.departmentId consistently.
- Around line 18-23: The route handler in apps/web/app/api/auth/login/route.ts
is logging sensitive PII and secrets (the destructured email/password from const
{ email, password } = await req.json() and later password hash and auth token
outputs around the referenced ranges). Remove any console.log statements that
print raw credentials, password hashes, or auth tokens (including the logs at
lines ~18–23, ~49–56, and ~89–103); instead log only non-sensitive events (e.g.,
"login request received", "authentication succeeded/failed") or masked
identifiers (e.g., first char and domain of an email) and never output full
tokens or hashes from functions like the login handler/default export in this
file. Ensure no secrets are hardcoded and continue to store sensitive config in
.env.local.
- Around line 11-15: Replace the hardcoded fallback DSN in the Pool
instantiation: remove the literal localhost/postgres credentials and require
process.env.DATABASE_URL to be present (use the non-null assertion
process.env.DATABASE_URL!); if any transformation is necessary keep it applied
to process.env.DATABASE_URL! only, and if DATABASE_URL is missing throw a clear
startup error so the app fails fast. Update the Pool construction that uses new
Pool({ connectionString: ... }) to reference only the validated/nonnull env var
(and remove the replace-based fallback string).

In `@apps/web/app/auth/login/page.tsx`:
- Around line 38-39: Remove the client-side cookie write that recreates the JWT
as a non-HttpOnly cookie: delete the document.cookie assignment that uses
data.token in apps/web/app/auth/login/page.tsx (the line setting
`token=${data.token}`), and rely on the server's Set-Cookie from /api/auth/login
instead; if you need to detect login success client-side, use the API response
status/body (e.g., data.success) or server-provided flags rather than exposing
the token to JS.

In `@apps/web/components/auth-provider.tsx`:
- Around line 24-26: AuthProviderInner is computing auth state but never exposes
the parsed session to the context consumers, which is why useAuth() returns {}
and useUser() returns null; fix by passing the parsed session, loading flag and
any helper methods (e.g., setSession or signOut) into the AuthContext.Provider
value inside AuthProviderInner so consumers can read them, and ensure useAuth()
and useUser() return the corresponding values from AuthContext (not empty
defaults); update references to AuthProviderInner, AuthContext.Provider, useAuth
and useUser so the provider value shape matches what the hooks destructure.

In `@apps/web/components/genui/ApprovalCard.tsx`:
- Around line 27-39: The onSubmitDecision prop currently only accepts (decision,
prNumber, total, comments) which loses stable identifiers; change the callback
signature to include prId and threadId (e.g. onSubmitDecision: (decision:
'APPROVED' | 'REJECTED', prId?: string, threadId?: string, prNumber?: string,
total?: number, comments?: string) => Promise<void>) and update the ApprovalCard
component to pass through this.prId and this.threadId (not only
prNumber/totalAmount) when calling onSubmitDecision; also update the other
submission call sites inside this component (the other onSubmitDecision
invocations referenced in the diff) to provide prId and threadId so the approval
flow uses stable identifiers.
- Around line 67-75: The badge currently uses PR status for urgency because
safeUrgency is set from safePr.status; update this to read the actual urgency
field instead: change the assignment of safeUrgency to use safePr.urgency ??
urgency ?? 'NORMAL' (and similarly replace any other places where safePr.status
or pr.status is used to derive urgency, e.g., the other mapping around the
urgency/badge rendering at lines ~119-120) so the urgency pill gets the correct
enum values for color mapping.

In `@apps/web/components/genui/PurchaseRequestDraft.tsx`:
- Around line 77-81: The submit button in PurchaseRequestDraft.tsx
(data-testid="submit-pr-btn") is a no-op; either wire it to a submit handler or
render it disabled. Add an onClick that calls the component's submit routine
(e.g., call an existing handleSubmit/submitPurchaseRequest or an onSubmit prop
passed into PurchaseRequestDraft) and ensure the handler updates
state/dispatches the action that notifies the manager, or if the action isn't
available yet, set the button to disabled (disabled attribute and aria-disabled)
and update the className to a visually disabled style so users can’t click it.
Ensure the handler or prop name you use matches the component’s existing symbols
(handleSubmit, submitPurchaseRequest, or onSubmit) and include any necessary
loading/disabled state to prevent double submits.

In `@apps/web/components/shell/Rail.tsx`:
- Line 22: Replace the useState<any> in the Rail component with a concrete typed
shape for session: define a Session type (or import the proper NextAuth Session)
that at minimum models user with optional name and role (e.g., Session { user?:
{ name?: string; role?: string; [key: string]: any } }), then change the hook to
useState<Session | null>(null) and update any usages that assume session fields
to use the typed properties (session, setSession, and places reading
session?.user?.role or session?.user?.name). This restores type safety without
using any.

In `@apps/web/cypress/e2e/b2b-approval.cy.ts`:
- Around line 113-117: The setupMocks() function is currently a no-op causing
the spec to call real /api/agent and produce nondeterministic LLM behavior;
update setupMocks() to install the existing Cypress SSE/mocking helpers (e.g.,
call mockCatalogResponse, mockBudgetResponse, mockApprovalResponse or other
mock* helpers used elsewhere) so the test uses deterministic mocked SSE
responses instead of real agent calls, and remove the misleading
"deepseek-v3.2:cloud" comment; locate and modify the setupMocks definition and
replace the no-op body with invocations of the mock functions used by other
specs (and delete the unused deepseek comment) so the spec consistently sets up
test responses while still allowing page.tsx's window.Cypress headers to remain
untouched.

In `@apps/web/middleware.ts`:
- Around line 17-18: The console.log in middleware.ts is leaking sensitive data
by printing cookieHeader (even truncated); remove or replace it with
non-sensitive debug info: update the logging in the middleware function that
references path, hasAuthCookie and cookieHeader to omit cookieHeader (e.g., log
only path and hasAuthCookie or a boolean/masked indicator) and ensure no other
code in that function prints raw cookies or tokens.
- Around line 13-21: The middleware currently trusts the presence of "token=" in
req.headers.get('cookie') (cookieHeader/hasAuthCookie) — instead extract the
token value from the cookie string, verify it with your JWT verification utility
(e.g., jwt.verify or your existing verifyToken function) and only treat the
request as authenticated if verification succeeds; on verification failure or
missing token return NextResponse.redirect(new URL('/auth/login', req.url));
replace the naive hasAuthCookie check with proper parsing and try/catch around
the verification and remove or restrict the console.log debug output so
sensitive cookie data is not logged.

In `@apps/web/next.config.mjs`:
- Around line 5-10: Replace the hardcoded bypasses for type/lint checks in
next.config.mjs: remove typescript.ignoreBuildErrors: true and
eslint.ignoreDuringBuilds: true and instead read a specific environment flag
(e.g., IGNORE_NEXT_ERRORS or NEXT_ALLOW_BUILD_SKIP) that defaults to false so
production/CI builds fail on TypeScript/ESLint errors; update the entries using
the symbols typescript.ignoreBuildErrors and eslint.ignoreDuringBuilds to use
that env-derived boolean and document the env var for CI usage.

In `@apps/web/package.json`:
- Around line 37-39: The Prisma packages in package.json are version-skewed:
`@prisma/client` and prisma are at 7.8.0 while `@prisma/adapter-pg` and
`@prisma/internals` are at 6.x; update the dependency entries so all
Prisma-related packages use the same major version (e.g., set
"`@prisma/adapter-pg`" and "`@prisma/internals`" to 7.8.0 to match "`@prisma/client`"
and "prisma", or alternatively pin all to 6.7.0 if you intend to stay on v6), by
editing the dependency and devDependency entries for "`@prisma/adapter-pg`",
"`@prisma/internals`", "`@prisma/client`" and "prisma" in package.json and then run
install to regenerate lockfile.

---

Outside diff comments:
In `@apps/web/app/api/agent/route.ts`:
- Around line 42-49: Don't forward client-controlled headers: remove
req.headers.get('x-test-mode') and the client-provided x-user-id from the fetch
headers when calling agentUrl. Instead, derive userId server-side from the
verified token/cookie (the same place you produce the local variable userId) and
set 'x-user-id' from that server-verified value; only add an 'x-test-mode'
header when running in a dedicated test environment/flag (e.g., NODE_ENV/test
flag) inside the same route handler before the fetch. Update the fetch call that
builds headers (the block using agentUrl, token, req.headers and userId) to use
the server-derived userId and conditionally include x-test-mode, and ensure
req.headers is not trusted for auth/test fields.

---

Nitpick comments:
In `@apps/agent-core/requirements.txt`:
- Line 3: Replace the loose dependency "mcp>=1.0.0" with a precise pin or
bounded range that matches the version you actually tested (e.g.
mcp==<tested_version> or mcp>=X.Y.Z,<X.Y+1.0>) so installs remain reproducible;
then verify the chosen release exposes the concrete symbols imported by
apps/agent-core/mcp_server.py (mcp.server.Server, mcp.server.stdio.stdio_server,
mcp.types.Tool, mcp.types.TextContent, and mcp.server.lifecycle.LifespanManager)
and update the pin if any symbol is missing.

In `@apps/web/app/`(chat)/page.tsx:
- Around line 17-20: The UIComponent interface currently uses props: any which
removes type safety when parsing untrusted streaming payloads; replace props:
any with a concrete discriminated union or at minimum props: unknown and define
a typed UI payload interface (e.g., UiPayload / UiComponentProps variants) that
describes the allowed component types and their props, then narrow/validate the
props at parse time in the function that consumes the stream (referencing
UIComponent and the parser logic); also add a small Window augmentation for the
global payload accessor instead of using (window as any) so the global variable
has a typed signature, and update all usages (including the other occurrence at
line ~49) to perform proper runtime validation/type narrowing before treating
props as specific shapes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9951ede0-cf0c-4349-8b39-14bb6c96f769

📥 Commits

Reviewing files that changed from the base of the PR and between 46eed51 and 354f75b.

⛔ Files ignored due to path filters (2)

apps/web/cypress/videos/agent-direct.cy.ts.mp4 is excluded by !**/*.mp4
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (28)

.gitignore
apps/agent-core/main.py
apps/agent-core/mcp_server.py
apps/agent-core/requirements.txt
apps/agent-core/src/dependencies.py
apps/agent-core/src/graph.py
apps/agent-core/src/tools.py
apps/web/app/(admin)/chat/page.tsx
apps/web/app/(admin)/layout.tsx
apps/web/app/(chat)/page.tsx
apps/web/app/api/agent/route.ts
apps/web/app/api/approvals/[prId]/decide/route.ts
apps/web/app/api/auth/[...nextauth]/route.ts
apps/web/app/api/auth/login/route.ts
apps/web/app/api/auth/signup/route.ts
apps/web/app/auth/login/page.tsx
apps/web/components/auth-provider.tsx
apps/web/components/genui/ApprovalCard.tsx
apps/web/components/genui/CatalogGrid.tsx
apps/web/components/genui/PRList.tsx
apps/web/components/genui/PurchaseRequestDraft.tsx
apps/web/components/shell/Rail.tsx
apps/web/cypress.config.ts
apps/web/cypress/e2e/b2b-approval.cy.ts
apps/web/middleware.ts
apps/web/next.config.mjs
apps/web/package.json
prisma/schema.prisma

coderabbitai · 2026-05-21T14:39:58Z

+  // Admin/Merchant routes require MERCHANT or ADMIN role
+  // SHOPPER and SUPPORT can access via separate (chat) route, not this layout
+  if (user.role !== 'MERCHANT' && user.role !== 'ADMIN') {
+    // Allow SHOPPER and SUPPORT through - they have their own routes
+    // For now, just let them pass
  }


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Block unauthorized roles in this layout.

This guard currently falls through, so any authenticated non-MERCHANT/ADMIN user can still render the admin layout. Redirect or reject here instead of leaving the branch empty.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/`(admin)/layout.tsx around lines 31 - 36, The current guard checks user.role but leaves the branch empty, allowing unauthorized roles to render the admin layout; update the branch (the if checking user.role !== 'MERCHANT' && user.role !== 'ADMIN') to immediately block access by performing a redirect or returning an access-denied response (e.g., call the app's redirect helper to send users to a forbidden page or throw a 403/Next.js notFound/redirect). Ensure the change is applied inside the same conditional that inspects user.role so unauthorized users are not allowed to continue rendering the admin layout.

coderabbitai · 2026-05-21T14:39:59Z

+// Setup - use REAL LLM (no mocks)
 function setupMocks() {
-  // Mock catalog search
-  cy.intercept('GET', '**/assistants/*/threads/*/runs/*/stream', (req) => {
-    const url = req.url
-    if (url.includes('laptop') || url.includes('show me laptops') || url.includes('I need a')) {
-      req.reply({ statusCode: 200, body: mockCatalogResponse() })
-    } else if (url.includes('budget')) {
-      req.reply({ statusCode: 200, body: mockBudgetResponse() })
-    } else if (url.includes('approval')) {
-      req.reply({ statusCode: 200, body: mockApprovalsResponse() })
-    } else if (url.includes('MacBook Pro')) {
-      req.reply({ statusCode: 200, body: mockBudgetAlertResponse() })
-    } else {
-      req.reply({ statusCode: 200, body: mockCatalogResponse() })
-    }
-  }).as('mockStream')
-
-  // Also intercept POST to run endpoint
-  cy.intercept('POST', '**/assistants/*/threads/*/runs', {
-    statusCode: 200,
-    body: { run_id: 'mock-run', thread_id: 'mock-thread', assistant_id: 'customer' }
-  }).as('mockRun')
+  // Using real LLM - no interceptors
+  // The app will call actual API endpoints with deepseek-v3.2:cloud
 }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify whether this suite still has no deterministic test-mode wiring # and whether alternate provider strings are present. rg -n "function setupMocks|deepseek-v3\\.2|cy\\.intercept\\(|x-test-mode|x-user-id"

Repository: Aparnap2/smart_commerce_agent

Length of output: 2888

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Inspect the exact setupMocks + surrounding beforeEach usage echo "---- b2b-approval.cy.ts (around setupMocks) ----" nl -ba apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '80,150p' # 2) Find all occurrences of deepseek-v3.2:cloud echo "---- Occurrences of deepseek-v3.2:cloud ----" rg -n --hidden --no-ignore -S "deepseek-v3\.2:cloud" . || true # 3) Broader deepseek references (to see if this is hardcoded elsewhere) echo "---- Occurrences of deepseek ----" rg -n --hidden --no-ignore -S "deepseek" . || true # 4) Confirm whether this cypress file sets x-test-mode/x-user-id anywhere else echo "---- x-test-mode/x-user-id in b2b-approval.cy.ts ----" rg -n "x-test-mode|x-user-id" apps/web/cypress/e2e/b2b-approval.cy.ts || true # 5) Inspect where app sets x-test-mode/x-user-id for chat requests echo "---- app/(chat)/page.tsx (around x-test-mode headers) ----" nl -ba "apps/web/app/(chat)/page.tsx" | sed -n '50,110p'

Repository: Aparnap2/smart_commerce_agent

Length of output: 183

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "---- b2b-approval.cy.ts (around setupMocks) ----" cat -n apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '80,150p' echo "---- Occurrences of deepseek-v3.2:cloud ----" rg -n -S "deepseek-v3\.2:cloud" . || true echo "---- Occurrences of deepseek ----" rg -n -S "deepseek" . || true echo "---- x-test-mode/x-user-id in b2b-approval.cy.ts ----" rg -n "x-test-mode|x-user-id" apps/web/cypress/e2e/b2b-approval.cy.ts || true echo "---- app/(chat)/page.tsx (around x-test-mode headers) ----" cat -n "apps/web/app/(chat)/page.tsx" | sed -n '50,110p'

Repository: Aparnap2/smart_commerce_agent

Length of output: 7245

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "---- app/(chat)/page.tsx (where isTestMode is computed) ----" rg -n "isTestMode" "apps/web/app/(chat)/page.tsx" cat -n "apps/web/app/(chat)/page.tsx" | sed -n '1,120p' echo "---- apps/web/app/api/agent/route.ts (x-test-mode handling + model selection) ----" cat -n apps/web/app/api/agent/route.ts | sed -n '1,140p' echo "---- apps/web/app/api/agent/route.ts (any llm/provider config) ----" rg -n -S "deepseek|foundry|Azure|OpenAI|llm|model" apps/web/app/api/agent/route.ts || true echo "---- occurrences of deepseek-v3.2 in repo ----" rg -n -S "deepseek-v3\\.2" . || true echo "---- occurrences of Azure AI Foundry / Foundry in repo ----" rg -n -S "Azure AI Foundry|Foundry" . || true echo "---- occurrences of x-test-mode usage ----" rg -n "x-test-mode" apps/web . || true

Repository: Aparnap2/smart_commerce_agent

Length of output: 50386

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "---- cy.intercept usages in b2b-approval.cy.ts ----" rg -n "cy\.intercept" apps/web/cypress/e2e/b2b-approval.cy.ts || true echo "---- calls to mock*Response helpers in b2b-approval.cy.ts ----" rg -n "mock[A-Za-z0-9_]*Response" apps/web/cypress/e2e/b2b-approval.cy.ts || true echo "---- where setupMocks() is used ----" rg -n "setupMocks" apps/web/cypress/e2e/b2b-approval.cy.ts || true

Repository: Aparnap2/smart_commerce_agent

Length of output: 542

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "---- b2b-approval.cy.ts (top to setupMocks) ----" cat -n apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '1,220p' echo "---- b2b-approval.cy.ts (rest of file) ----" cat -n apps/web/cypress/e2e/b2b-approval.cy.ts | sed -n '220,420p'

Repository: Aparnap2/smart_commerce_agent

Length of output: 12031

Fix nondeterministic E2E LLM behavior in b2b-approval.cy.ts (setupMocks)
apps/web/cypress/e2e/b2b-approval.cy.ts lines 113–117 set setupMocks() to a no-op and no longer install any cy.intercept, so the spec doesn’t use the existing mock SSE helpers (mockCatalogResponse, mockBudgetResponse, etc.) and instead relies on real /api/agent behavior. Although apps/web/app/(chat)/page.tsx sets x-test-mode: true/x-user-id: test-user-id when running under Cypress (window.Cypress), this spec still doesn’t guarantee deterministic agent-core responses.
Remove/replace the misleading deepseek-v3.2:cloud comment (it isn’t referenced anywhere else in code).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/cypress/e2e/b2b-approval.cy.ts` around lines 113 - 117, The setupMocks() function is currently a no-op causing the spec to call real /api/agent and produce nondeterministic LLM behavior; update setupMocks() to install the existing Cypress SSE/mocking helpers (e.g., call mockCatalogResponse, mockBudgetResponse, mockApprovalResponse or other mock* helpers used elsewhere) so the test uses deterministic mocked SSE responses instead of real agent calls, and remove the misleading "deepseek-v3.2:cloud" comment; locate and modify the setupMocks definition and replace the no-op body with invocations of the mock functions used by other specs (and delete the unused deepseek comment) so the spec consistently sets up test responses while still allowing page.tsx's window.Cypress headers to remain untouched.

coderabbitai · 2026-05-21T14:39:59Z

+  typescript: {
+    ignoreBuildErrors: true,
+  },
+  eslint: {
+    ignoreDuringBuilds: true,
+  },


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show exact contents around the referenced lines FILE="apps/web/next.config.mjs" echo "== File: $FILE ==" nl -ba "$FILE" | sed -n '1,120p' # Search for env usage or other overrides in this config echo "== Search for ignoreBuildErrors / ignoreDuringBuilds ==" rg -n "ignoreBuildErrors|ignoreDuringBuilds|ALLOW_UNSAFE_BUILD|process\.env" "$FILE" || true # Check if there are other next.config.* files that might override behavior echo "== Other next config files ==" fd "next\.config\.(js|mjs|ts)$" -a . || true

Repository: Aparnap2/smart_commerce_agent

Length of output: 171

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="apps/web/next.config.mjs" echo "== File: $FILE ==" if [ -f "$FILE" ]; then # line numbers without relying on nl awk '{print NR "\t" $0}' "$FILE" | sed -n '1,160p' else echo "MISSING: $FILE" exit 1 fi echo "== rg in next config ==" rg -n "ignoreBuildErrors|ignoreDuringBuilds|ALLOW_UNSAFE_BUILD|process\.env" "$FILE" || true echo "== Find other next.config.* files ==" find . -maxdepth 5 -type f $ -name "next.config.*" $ -print

Repository: Aparnap2/smart_commerce_agent

Length of output: 2463

🌐 Web query:

Next.js next.config.js typescript.ignoreBuildErrors ignoreDuringBuilds eslint meaning during production build

💡 Result:

In next.config.js: 1) typescript.ignoreBuildErrors Meaning: When set to true, Next.js allows the production build (next build) to complete even if there are TypeScript errors in your project; it “completely skips the TypeScript type checking step.” [1][2] 2) eslint.ignoreDuringBuilds Meaning: When ESLint is detected, Next.js normally fails the production build when there are ESLint errors; setting ignoreDuringBuilds: true disables the built-in linting step during the build so production builds can complete even with ESLint errors. [3] Production build behavior (next build) - With typescript.ignoreBuildErrors: true, Next.js will not run TypeScript type checking during next build. [1] - With eslint.ignoreDuringBuilds: true, Next.js will not run its ESLint check during next build (if ESLint is detected). [3] Example configuration module.exports = { typescript: { // Dangerously allow production builds to successfully complete even if // your project has type errors. ignoreBuildErrors: true, }, eslint: { // Allow production builds to successfully complete even if your project has ESLint errors. ignoreDuringBuilds: true, }, } Both options are documented as “dangerous”/not recommended because you must run equivalent checks elsewhere (for example, in CI using tsc --noEmit and eslint) to avoid deploying incorrect code. [1][2][3]

Citations:

1: https://nextjs.org/docs/app/api-reference/config/next-config-js/typescript

2: https://nextjs.org/docs/app/api-reference/config/typescript

3: https://en.nextjs.im/docs/13/pages/api-reference/next-config-js/eslint/

Avoid disabling Next.js type/lint checks in production builds

apps/web/next.config.mjs hardcodes typescript.ignoreBuildErrors: true and eslint.ignoreDuringBuilds: true, letting next build succeed even with TypeScript and ESLint errors.

Suggested guardrail

const nextConfig = { typescript: { - ignoreBuildErrors: true, + ignoreBuildErrors: process.env.ALLOW_UNSAFE_BUILD === 'true', }, eslint: { - ignoreDuringBuilds: true, + ignoreDuringBuilds: process.env.ALLOW_UNSAFE_BUILD === 'true', }, reactStrictMode: true, output: 'standalone',

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

typescript: {

ignoreBuildErrors: true,

},

eslint: {

ignoreDuringBuilds: true,

},

typescript: {

ignoreBuildErrors: process.env.ALLOW_UNSAFE_BUILD === 'true',

},

eslint: {

ignoreDuringBuilds: process.env.ALLOW_UNSAFE_BUILD === 'true',

},

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/next.config.mjs` around lines 5 - 10, Replace the hardcoded bypasses for type/lint checks in next.config.mjs: remove typescript.ignoreBuildErrors: true and eslint.ignoreDuringBuilds: true and instead read a specific environment flag (e.g., IGNORE_NEXT_ERRORS or NEXT_ALLOW_BUILD_SKIP) that defaults to false so production/CI builds fail on TypeScript/ESLint errors; update the entries using the symbols typescript.ignoreBuildErrors and eslint.ignoreDuringBuilds to use that env-derived boolean and document the env var for CI usage.

coderabbitai · 2026-05-21T14:39:59Z

    "@prisma/adapter-pg": "^6.7.0",
-    "@prisma/client": "6.7.0",
+    "@prisma/client": "7.8.0",
    "@smart-commerce/errors": "workspace:*",


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show relevant sections with line numbers echo "== apps/web/package.json (around 30-110) ==" nl -ba apps/web/package.json | sed -n '30,110p' echo echo "== Extract Prisma-related deps ==" rg -n --no-heading '"(`@prisma/`[^"]+|prisma)"\s*:\s*"' apps/web/package.json || true

Repository: Aparnap2/smart_commerce_agent

Length of output: 178

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== apps/web/package.json (around 30-110) ==" awk 'NR>=30 && NR<=110 {printf "%d\t%s\n", NR, $0}' apps/web/package.json echo echo "== Extract Prisma-related deps ==" rg -n --no-heading '"(`@prisma/`[^"]+|prisma)"\s*:\s*"' apps/web/package.json

Repository: Aparnap2/smart_commerce_agent

Length of output: 2894

Align Prisma package versions to avoid generator/runtime skew

@prisma/client and prisma are on 7.8.0, while @prisma/adapter-pg and @prisma/internals remain 6.x—this version skew often leads to breakage.

apps/web/package.json: @prisma/client 7.8.0 (line 38) vs @prisma/adapter-pg ^6.7.0 (line 37)

apps/web/package.json: @prisma/internals 6.7.0 (line 83) vs prisma 7.8.0 (line 99)

Suggested alignment

- "`@prisma/adapter-pg`": "^6.7.0", + "`@prisma/adapter-pg`": "7.8.0", ... - "`@prisma/internals`": "6.7.0", + "`@prisma/internals`": "7.8.0",

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/package.json` around lines 37 - 39, The Prisma packages in package.json are version-skewed: `@prisma/client` and prisma are at 7.8.0 while `@prisma/adapter-pg` and `@prisma/internals` are at 6.x; update the dependency entries so all Prisma-related packages use the same major version (e.g., set "`@prisma/adapter-pg`" and "`@prisma/internals`" to 7.8.0 to match "`@prisma/client`" and "prisma", or alternatively pin all to 6.7.0 if you intend to stay on v6), by editing the dependency and devDependency entries for "`@prisma/adapter-pg`", "`@prisma/internals`", "`@prisma/client`" and "prisma" in package.json and then run install to regenerate lockfile.

…, RBAC - F8: Manager Dashboard (pending approvals, dept spend, recent activity) - F9: Finance Dashboard (budget utilisation, spend trend, flagged items) - F10: Vendor sourcing request flow (genui card + tool + prompt) - F11: Dispute flow end-to-end (5/5 tests passing) - F13: NotificationBell GenUI component - F14: Catalog pricing audit background job (SerpApi comparison, CLI) - F15: RBAC middleware + dynamic tool filtering (40 tests) Test coverage: 106 passed, 5 skipped, 19 xfailed, 0 new TS errors

coderabbitai

Actionable comments posted: 6

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)

apps/agent-core/tests/test_notifications.py (1)

75-85: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the same patch target in the Redis error test.

test_publish_approval_event_redis_error still patches dependencies.get_redis, while the other tests correctly patch src.notifications.get_redis. This can miss the mock depending on how publish_approval_event resolves imports.

Proposed fix

 `@pytest.mark.asyncio`
 async def test_publish_approval_event_redis_error():
     """Test handling Redis connection errors gracefully."""
     from src.notifications import publish_approval_event
-    from src import dependencies
 
     mock_redis = AsyncMock()
     mock_redis.publish = AsyncMock(side_effect=Exception("Redis connection failed"))
 
-    with patch.object(dependencies, 'get_redis', return_value=mock_redis):
+    with patch('src.notifications.get_redis', return_value=mock_redis):
         result = await publish_approval_event(
             employee_id="emp-123",
             pr_id="pr-uuid-456",
             pr_number="PR-101",
             decision="APPROVED",

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_notifications.py` around lines 75 - 85, The test
test_publish_approval_event_redis_error patches the wrong target so the mock may
not be used; change the patch to target the same get_redis used by
publish_approval_event (patch src.notifications.get_redis) instead of
dependencies.get_redis. Locate the test function
test_publish_approval_event_redis_error and replace the with
patch.object(dependencies, 'get_redis', ...) (or patch.object(dependencies,
'get_redis', return_value=mock_redis)) with a patch on
'src.notifications.get_redis' (returning mock_redis) so publish_approval_event
uses the mocked Redis publish that raises the intended Exception.

apps/web/__tests__/components/genui/CatalogGrid.test.tsx (1)

24-245: ⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

Required test bootstrap is missing for *.test.tsx.

This file does not include the required beforeEach setup with cleanupTestDB() and seedTestFixtures().

As per coding guidelines, **/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/__tests__/components/genui/CatalogGrid.test.tsx` around lines 24 -
245, The test file CatalogGrid.test.tsx is missing the required test bootstrap;
add a beforeEach that calls cleanupTestDB() then seedTestFixtures() to ensure a
clean DB and seeded fixtures before each test; place this at the top of the
describe('CatalogGrid', ...) scope (or imported into the test module) and ensure
the helper functions cleanupTestDB and seedTestFixtures are imported or
available in the test file so all tests run against the real Docker Postgres
fixture per the project testing guidelines.

apps/web/components/shell/Rail.tsx (1)

10-13: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

JWT payload parsing is not base64url-safe and can fail for valid tokens.

atob(token.split('.')[1]) assumes standard base64, but JWT payloads are base64url-encoded. Normalize/pad before decoding to prevent intermittent auth-state parsing failures.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/components/shell/Rail.tsx` around lines 10 - 13, The JWT payload
decoding uses atob(token.split('.')[1]) which assumes standard base64; instead
normalize the base64url segment before decoding: take the payload part
(token.split('.')[1]), replace '-' with '+', '_' with '/', pad with '=' to
length %4==0, then pass the normalized string to atob and JSON.parse; update the
code around the token variable/JSON.parse(atob(...)) usage to perform this
base64url-safe normalization so valid JWTs decode reliably.

apps/web/app/manager/page.tsx (1)

191-199: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle unauthenticated/null session explicitly instead of infinite spinner.

When token parsing fails or no token exists, this page never exits the loading state.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/manager/page.tsx` around lines 191 - 199, The current check
returns a perpetual loading spinner when session is null; change the logic in
the component that references session (the conditional returning Shell with the
spinner) to explicitly handle unauthenticated state: if session is
null/unauthenticated, either redirect to the sign-in route (call your auth
redirect method) or render a clear unauthenticated UI (e.g., Shell with Rail and
a centered message + "Sign in" button that triggers the auth flow) instead of
the spinner; update the conditional around session in page.tsx so only an
undefined/loading state shows the spinner while a null/unauthenticated session
shows the explicit fallback and an actionable next step.

apps/agent-core/src/graph.py (2)

20-50: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid mutating message objects in strip_ui_from_messages.

This currently edits existing ToolMessage instances in place, which can leak modified content back into persisted graph state/history.

🔧 Suggested fix

+import copy
...
 def strip_ui_from_messages(messages: list[BaseMessage]) -> list[BaseMessage]:
 ...
     stripped = []
-    for msg in messages:
+    for msg in messages:
+        msg_copy = copy.deepcopy(msg)
         # Only process ToolMessage with JSON content
-        if isinstance(msg, ToolMessage) and msg.content:
+        if isinstance(msg_copy, ToolMessage) and msg_copy.content:
             try:
-                parsed = json.loads(msg.content)
+                parsed = json.loads(msg_copy.content)
 ...
-                msg.content = json.dumps(parsed)
+                msg_copy.content = json.dumps(parsed)
             except json.JSONDecodeError:
                 # Non-JSON content, leave as-is
                 pass
-        stripped.append(msg)
+        stripped.append(msg_copy)

Also applies to: 311-316

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/graph.py` around lines 20 - 50, strip_ui_from_messages
mutates ToolMessage instances in place; instead, avoid modifying existing
messages by creating a new message object for each modified ToolMessage (e.g.,
deepcopy the original or construct a new ToolMessage with the same metadata and
the cleaned content) and append the new object to stripped while leaving the
original msg untouched; update the code paths in strip_ui_from_messages (and any
other similar block that assigns to msg.content) to produce and append a new
message rather than reassigning msg.content.

380-390: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use AIMessage for the approval gate assistant message (don’t construct BaseMessage(..., role="assistant")).

approval_gate_node appends BaseMessage(..., role="assistant") into state["messages"], and those messages are later sent to llm.ainvoke. LangChain expects assistant outputs to be represented with AIMessage (or ChatMessage for arbitrary roles), so this construction is likely to break the approval path at runtime.

🔧 Suggested fix

-from langchain_core.messages import BaseMessage, SystemMessage, HumanMessage, ToolMessage
+from langchain_core.messages import BaseMessage, SystemMessage, HumanMessage, ToolMessage, AIMessage
...
-            "messages": state["messages"] + [BaseMessage(content=json.dumps({
+            "messages": state["messages"] + [AIMessage(content=json.dumps({
                 "approval_decision": "APPROVED",
                 "message": "The manager has APPROVED the PR."
-            }), role="assistant")]
+            }))]
...
-            "messages": state["messages"] + [BaseMessage(content=json.dumps({
+            "messages": state["messages"] + [AIMessage(content=json.dumps({
                 "approval_decision": "REJECTED",
                 "message": "The manager has REJECTED the PR."
-            }), role="assistant")]
+            }))]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/graph.py` around lines 380 - 390, The assistant reply
added in approval_gate_node currently constructs BaseMessage(...,
role="assistant") and appends it to state["messages"], which will break
downstream llm.ainvoke handling; instead create an AIMessage containing the JSON
string and append that to state["messages"] (replace the BaseMessage usages
inside approval_gate_node with AIMessage with the same content for both APPROVED
and REJECTED branches) so LangChain receives proper assistant messages; keep the
JSON payload and role semantics identical but use AIMessage rather than
BaseMessage with role="assistant".

🟠 Major comments (20)

apps/agent-core/tests/test_context_compression.py-29-48 (1)
29-48: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tests call a local helper instead of the actual implementation.

Lines 35 and 46 call create_context_summary_from_pr_info, which is defined within this test file (line 51), rather than the imported create_context_summary from src.graph (line 8). This means the tests validate the test helper's behavior, not the real production code. To follow proper TDD (Red → Green → Refactor), these tests should call the actual src.graph.create_context_summary function and fail until the implementation is written.
✅ Proposed fix to test the real implementation
  def test_create_summary_contains_pr_info(self):
      """Summary should contain PR number and items."""
-     # Create a mock tool result with PR info
      pr_info = {"prId": "PR-2026-0042", "items": [{"name": "MacBook Pro"}], "success": True}
      
-     # The summary function extracts prId from tool results
-     summary = create_context_summary_from_pr_info(pr_info, "employee@company.com")
+     # Build a message history that includes the PR submission tool result
+     messages = [
+         HumanMessage(content="Create a PR for MacBook Pro"),
+         ToolMessage(content=json.dumps(pr_info), tool_call_id="call-123", name="submit_for_approval"),
+     ]
+     summary = create_context_summary(messages, "employee@company.com")
      
      assert "PR-2026-0042" in summary
      assert "item" in summary.lower()
Remove the local helper function (lines 51-59) once tests use the real implementation.
As per coding guidelines: "Write the test FIRST before implementation. Follow Red → Green → Refactor workflow" for Python files.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_context_compression.py` around lines 29 - 48,
Tests are calling the local helper create_context_summary_from_pr_info (defined
in this test file) instead of the real production function
create_context_summary from src.graph; update the tests to call
create_context_summary (the symbol imported at the top) in both
test_create_summary_contains_pr_info and test_summary_is_concise, and then
delete the local helper definition create_context_summary_from_pr_info so the
tests exercise the real implementation (ensure the existing import of
create_context_summary is used and no duplicate helper remains).
apps/agent-core/tests/test_context_compression.py-14-27 (1)
14-27: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Trivial assertions don't validate compression logic.

These tests simply check whether a string literal appears in a hardcoded array (lines 18, 25), which always passes and doesn't verify the actual should_compress_context behavior. TDD tests should exercise the real compression decision logic—for example, by constructing a message history with or without submit_for_approval tool calls and asserting the function's boolean return value.
♻️ Example of a meaningful test
  def test_compress_after_submit_for_approval(self):
      """Should compress context after submit_for_approval is called."""
-     # Directly test the logic - check if submit_for_approval was called
-     # by checking for tool call names in the message flow
-     tool_call_names = ["search_catalog", "manage_purchase_request", "submit_for_approval"]
-     
-     # Check if submit_for_approval is in the tool calls
-     assert "submit_for_approval" in tool_call_names
+     messages = [
+         HumanMessage(content="Create a PR"),
+         AIMessage(content="", tool_calls=[{"name": "submit_for_approval", "args": {}, "id": "call-1"}]),
+         ToolMessage(content='{"success": true}', tool_call_id="call-1", name="submit_for_approval"),
+     ]
+     assert should_compress_context(messages) is True
As per coding guidelines: "Write the test FIRST before implementation" for Python files.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_context_compression.py` around lines 14 - 27,
Replace the trivial string-presence assertions in
test_compress_after_submit_for_approval and test_no_compress_without_submission
with real calls to the compression decision function (e.g.,
should_compress_context) using constructed message histories that include or
omit a tool call with name "submit_for_approval"; build message lists that mimic
the real message objects or dicts your code expects, call
should_compress_context(history) and assert True when the history contains a
submit_for_approval tool call and False when it does not, updating the tests to
reference the actual function name should_compress_context and any helper
constructors used to create tool-call messages.
apps/web/app/api/finance/flagged-items/[id]/resolve/route.ts-14-20 (1)
14-20: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Validate the id parameter and verify the update succeeded.

The handler does not validate the id format before executing the SQL update, nor does it check whether any row was actually updated. An invalid or non-existent id will return { success: true } even when nothing changed.
✅ Proposed validation and result check
  try {
    const { id } = await params
+   
+   // Validate UUID format (adjust if using different ID scheme)
+   const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
+   if (!uuidRegex.test(id)) {
+     return NextResponse.json({ error: 'Invalid ID format' }, { status: 400 })
+   }

-   await prisma.$executeRaw`
+   const result = await prisma.$executeRaw`
      UPDATE "CatalogItem"
      SET "pricingFlag" = false, "pricingFlaggedAt" = NULL
      WHERE id = ${id}
    `
+   
+   if (result === 0) {
+     return NextResponse.json({ error: 'Item not found' }, { status: 404 })
+   }

    return NextResponse.json({ success: true })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/api/finance/flagged-items/`[id]/resolve/route.ts around lines 14
- 20, Validate the incoming id from params (e.g., ensure it's a non-empty string
and matches your expected format such as a UUID using a regex or schema
validator) before running the SQL, and after calling prisma.$executeRaw`...
WHERE id = ${id}` check the returned affected-rows count (the value returned by
prisma.$executeRaw) to confirm at least one row was updated; if validation fails
return a 400 and if affected rows is 0 return a 404 (or appropriate error)
instead of always returning { success: true }. Use the existing params
extraction and prisma.$executeRaw call sites to implement these checks.
apps/agent-core/tests/test_chat.py-4-5 (1)
4-5: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Align xfail with the actual test scope and remove hybrid mocking from endpoint “integration” tests

In apps/agent-core/tests/test_chat.py, the module-level xfail (lines 4-5) correctly targets the missing client fixture—but the test cases are also written as endpoint calls that fully mock routers.chat.create_orchestrator (in the SSE/thread/UI action tests). That makes them a hybrid “integration” test that won’t exercise the real agent flow once the client fixture is fixed (and conflicts with the “no mocks in integration tests” rule).

Either:

Make these true integration tests (add/enable the FastAPI client fixture and stop patching create_orchestrator), or

Treat them as unit tests of endpoint behavior (avoid hitting /agent/chat via the client; call the handler/service directly) and adjust the xfail/markers accordingly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_chat.py` around lines 4 - 5, The module-level
xfail is OK for the missing FastAPI test client but the tests also patch
routers.chat.create_orchestrator which creates a hybrid integration/unit test;
pick one approach and update tests accordingly: either enable the FastAPI test
client fixture and remove the patches of routers.chat.create_orchestrator so the
tests call the real /agent/chat endpoint exercising the full orchestrator flow,
or convert the tests into pure unit tests by calling the handler/service
functions directly (e.g., the function under test in routers.chat) and keep the
patches, then replace the module xfail with appropriate unit-test markers for
that scope. Ensure references to routers.chat.create_orchestrator and the
endpoint tests (SSE/thread/UI action tests) are updated consistently.
apps/web/__tests__/components/genui/PRList.test.tsx-26-187 (1)
26-187: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add the required DB fixture setup hook in test setup.

This suite is missing the mandated beforeEach pattern with cleanupTestDB() and seedTestFixtures().

As per coding guidelines "**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/__tests__/components/genui/PRList.test.tsx` around lines 26 - 187,
The test suite is missing the required DB fixture setup/teardown; add a
beforeEach hook that calls cleanupTestDB() then seedTestFixtures() before each
test so the suite runs against the mandated real Postgres fixtures. Locate the
PRList test file and add a top-level beforeEach(async () => { await
cleanupTestDB(); await seedTestFixtures(); }) (or equivalent async setup) near
the describe('PRList', ...) so all tests use the seeded DB and teardown is
performed consistently.
apps/web/__tests__/components/SuggestedChips.test.tsx-19-21 (1)
19-21: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update beforeEach to include required test DB fixture setup.

This hook currently clears mocks only; it also needs cleanupTestDB() and seedTestFixtures().

As per coding guidelines "**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/__tests__/components/SuggestedChips.test.tsx` around lines 19 - 21,
The beforeEach currently only calls vi.clearAllMocks(); update the setup to also
call cleanupTestDB() and seedTestFixtures() so the test DB is reset and seeded
before each test: locate the beforeEach block that calls vi.clearAllMocks() (in
SuggestedChips.test.tsx) and add calls to cleanupTestDB() and await
seedTestFixtures() (ensure async/await if needed) alongside vi.clearAllMocks()
to satisfy the test fixture requirements.
apps/web/__tests__/components/genui/PurchaseRequestDraft.test.tsx-24-238 (1)
24-238: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add the required DB fixture setup hook in test setup.

This suite is missing the mandated beforeEach pattern with cleanupTestDB() and seedTestFixtures().

As per coding guidelines "**/*.test.{ts,tsx}: Test pattern: use beforeEach with cleanupTestDB() and seedTestFixtures(), write specific assertions (not snapshots), use real Docker postgres database".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/__tests__/components/genui/PurchaseRequestDraft.test.tsx` around
lines 24 - 238, Add the mandated DB fixture setup by adding a beforeEach hook
that calls cleanupTestDB() then seedTestFixtures() before each test in this
suite; locate the describe('PurchaseRequestDraft', ...) block and insert the
beforeEach with cleanupTestDB() and seedTestFixtures() (and ensure those helper
functions are imported/available in the test file) so each test runs against a
clean seeded Docker Postgres instance.
apps/agent-core/routers/chat.py-95-97 (1)
95-97: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not expose raw exception messages to SSE/API clients.

Line 97 and Line 103 forward str(e) to callers. This can leak internals (stack details, dependency errors, SQL fragments).
Suggested fix
-                yield {"event": "error", "data": json.dumps({"message": str(e)})}
+                yield {"event": "error", "data": json.dumps({"message": "Internal server error"})}
...
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail="Internal server error")
Also applies to: 101-103
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/routers/chat.py` around lines 95 - 97, The code currently
forwards raw exception text to clients via the yield payload; instead, log the
full exception server-side with logger.exception(logger) or logger.error(...,
exc_info=True) and return a generic, non-sensitive client message (e.g.,
"Internal server error") — optionally include a stable error_id (uuid via
uuid.uuid4()) in both the log and the SSE payload so clients can reference the
incident; update the error handling around the logger.error and yield {"event":
"error", "data": json.dumps({"message": str(e)})} sites to implement this
pattern and avoid leaking str(e) to callers.
apps/web/components/genui/NotificationBell.tsx-267-270 (1)
267-270: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate notification.link scheme before window.open.

Line 269 opens untrusted link values directly. A javascript:/data: URL can be abused. Restrict to http:/https: before opening.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/components/genui/NotificationBell.tsx` around lines 267 - 270, The
click handler in NotificationBell currently calls window.open(notification.link,
...) without validating the URL scheme; update the handler that contains the
hasLink && notification.link branch to validate the link before opening by
parsing notification.link (e.g., via the URL constructor or an <a> element) and
only allow protocols 'http:' or 'https:'; if the scheme is invalid or parsing
fails, do not call window.open (optionally log or ignore), ensuring no
javascript:, data:, or other unsafe schemes are opened.
apps/agent-core/routers/chat.py-90-91 (1)
90-91: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

thread_id event is populated from the wrong field.

Line 91 emits threadId using pending_pr_number instead of a thread identifier. This can break frontend SSE contract handling.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/routers/chat.py` around lines 90 - 91, The SSE `thread_id`
emission is using the wrong field; in the yield inside the chat SSE generator
replace the value currently pulled from event.get("pending_pr_number", "") with
the actual thread identifier (e.g., event.get("thread_id") or fallback to
event.get("pending_pr_thread_id")). Update the yield in the generator that
checks "pending_pr_id" so it emits {"event":"thread_id", "data":
json.dumps({"threadId": <thread id from event.get("thread_id") or
event.get("pending_pr_thread_id")>})} to satisfy the frontend SSE contract.
apps/web/components/genui/PriceComparisonCard.tsx-409-413 (1)
409-413: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize retailer URLs before binding to href.

Line 410 uses untrusted retailerLink directly in <a href>. Enforce an http/https allowlist and drop invalid schemes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/components/genui/PriceComparisonCard.tsx` around lines 409 - 413,
Sanitize the untrusted retailerLink before assigning it to href (and before
passing to onRetailerClick) in the PriceComparisonCard component: validate the
scheme using a strict allowlist (only http:// or https://) by parsing
retailerLink (e.g., via the URL constructor) and if invalid or missing a safe
scheme, replace with a safe fallback like '#' or omit the href; then bind the
sanitized value (e.g., sanitizedRetailerLink) to the <a href> and call
onRetailerClick with the sanitized URL instead of the raw retailerLink.
apps/agent-core/src/catalog_audit.py-205-207 (1)
205-207: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pool ownership tracking is incorrect and may close caller-managed pools.

If a caller passes pool, cleanup still closes it when get_pool() is unavailable. Track whether this function created the pool (e.g., owns_pool) and only close in that case.

Also applies to: 305-313
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/catalog_audit.py` around lines 205 - 207, The code
unconditionally closes the passed-in pool; change it to track ownership so only
pools this function created are closed: introduce a boolean (e.g., owns_pool)
set to False by default, set owns_pool = True when you call _fetch_item_pool()
(instead of an unconditional pool assignment), and in the cleanup/finally blocks
only close/await pool.close() if owns_pool is True; apply the same pattern to
the other similar block referenced (lines ~305-313) so caller-managed pools are
never closed by this function.
apps/agent-core/scripts/run_catalog_audit.py-62-65 (1)
62-65: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

finally can raise UnboundLocalError if dependencies was never imported.

dependencies is referenced in finally but only imported in the try block. If an error occurs earlier, cleanup can mask the original failure.

Also applies to: 88-91
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/scripts/run_catalog_audit.py` around lines 62 - 65, The
finally block references the module variable dependencies but the import occurs
inside the try, so if the import (or anything before it) fails the finally can
raise UnboundLocalError and mask the original exception; to fix, initialize
dependencies = None before the try (or use a local flag), then in the finally
guard any cleanup that touches dependencies by checking it is not None (or use
getattr/dependencies attribute checks) before calling its cleanup functions;
apply the same pattern for the second occurrence around the block that spans
lines referencing dependencies (the blocks near 62-65 and 88-91) to avoid
masking the original error.
apps/agent-core/scripts/run_catalog_audit.py-53-55 (1)
53-55: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

REDIS_URL optionality is broken by forcing localhost default.

This makes Redis effectively mandatory and can crash the script on machines without local Redis, despite the script describing Redis as optional caching.

Also applies to: 69-73
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/scripts/run_catalog_audit.py` around lines 53 - 55, The code
currently forces REDIS_URL to "redis://localhost:6379" via os.environ.setdefault
which makes Redis mandatory; remove the hard default and treat REDIS_URL as
optional: stop calling os.environ.setdefault("REDIS_URL",
"redis://localhost:6379") and instead read os.environ.get("REDIS_URL") where
present, only initializing Redis/cache if that value is non-empty. Apply this
change for both occurrences (the block referencing REDIS_URL around the
os.environ.setdefault call and the similar block at lines 69-73) so the script
won’t crash on machines without a local Redis.
apps/agent-core/src/catalog_audit.py-181-185 (1)
181-185: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unflag path can leave stale pricingFlaggedAt despite clearing the flag.

When no market data is found, code intends to clear prior flag state, but pricingFlaggedAt is preserved when $2 is null. This leaves inconsistent audit metadata.

Also applies to: 252-258
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/catalog_audit.py` around lines 181 - 185, The UPDATE
currently preserves pricingFlaggedAt when $2 (marketMedianPrice) is NULL,
leaving stale audit data; change the CASE expression in the SQL inside
catalog_audit.py so pricingFlaggedAt is set to NULL when $2 IS NULL (i.e. CASE
WHEN $2 IS NULL THEN NULL ELSE "pricingFlaggedAt" END) and apply the same fix to
the other occurrence mentioned (the block around lines 252-258) so both UPDATEs
clear pricingFlaggedAt on the unflag/no-market-data path.
apps/web/test-genui-mock.mjs-58-74 (1)
58-74: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Mock format doesn't match SSE text/event-stream.

The PR fixes SSE parsing to read event: lines, but this mock returns JSON { events: [...] } with contentType: 'application/json'. The frontend's SSE parser won't be exercised. Consider returning text/event-stream with proper SSE format:
event: metadata\ndata: {...}\n\n
event: message\ndata: {...}\n\n
Otherwise, this test validates JSON handling but not the SSE fix.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/test-genui-mock.mjs` around lines 58 - 74, The mock currently sends
a JSON payload (mockResponse) via route.fulfill with contentType
'application/json', so the frontend SSE parser isn't exercised; update the
page.route handler to return contentType 'text/event-stream' and build the body
as an SSE stream by iterating the mock events (the objects in
mockResponse.events) and serializing each as "event: <type>\ndata: <JSON
data>\n\n" (ensure data is JSON-stringified), then pass that string to
route.fulfill (keep status 200). Target the page.route closure, the
mockResponse/events array, and the route.fulfill call when making this change.
apps/agent-core/src/eval_suite.py-154-157 (1)
154-157: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

No-op checks let critical failure modes pass by default.

PR-collision and context-confusion branches currently do nothing, so the evaluator returns a false pass for those scenarios.

Also applies to: 177-179
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/eval_suite.py` around lines 154 - 157, The if-blocks that
check pr_id and expected_outcome == "pr_created" (and the similar no-op at lines
177-179) are left as pass and therefore silently allow PR-collision and
context-confusion cases to pass; replace the no-op with concrete validation
logic inside the evaluator function (where pr_id and expected_outcome are used)
to detect collisions and mismatches: when expected_outcome == "pr_created"
verify a new PR was actually created (compare PR IDs/titles against a provided
prior_prs list or context snapshot), and when checking for context-confusion
ensure that the produced PR content matches the intended context (or flag a
failure). On mismatch, return or raise the appropriate failure/result object
(instead of pass) so the evaluator reports a failing test; update both the pr_id
block and the analogous branch referenced around lines 177-179 (same function or
helper) to use these checks and return explicit failure outcomes.
apps/agent-core/src/eval_suite.py-143-151 (1)
143-151: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Budget violation logic is tied to a hardcoded threshold, not request size.

This can misclassify outcomes because it ignores the actual amount being submitted.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/eval_suite.py` around lines 143 - 151, The budget check
currently compares remaining to a hardcoded 100000 which misclassifies
submissions; change the logic in eval_suite.py (the block using budget_status,
tool_names and remaining) to compare remaining against the actual request amount
instead of a magic constant—obtain the real request size (e.g., total_request,
requested_tokens, payload_size or a passed-in request_amount) from the calling
context or function parameters and replace the threshold check (remaining <
100000) with remaining < total_request (or remaining < request_amount); ensure
the variable used exists in scope or is added to the function signature and keep
the same return structure (passed:false, reason/failure_mode) when the remaining
is insufficient.
apps/agent-core/tests/test_tools_tdd.py-51-52 (1)
51-52: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Parameterize dept_name in the SQL query.

Direct string interpolation here is a SQL injection vector and can also break if dept_name contains quotes.
🔧 Suggested fix
-        dept = await conn.fetchrow(f'SELECT id FROM "Department" WHERE name = \'{dept_name}\' LIMIT 1')
+        dept = await conn.fetchrow(
+            'SELECT id FROM "Department" WHERE name = $1 LIMIT 1',
+            dept_name,
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_tools_tdd.py` around lines 51 - 52, The SQL uses
direct string interpolation of dept_name which is a SQL injection risk and
breaks on quotes; change the call to conn.fetchrow to use a parameterized query
(e.g., use a positional parameter placeholder like $1) and pass dept_name as a
separate argument so that the query becomes something like conn.fetchrow('SELECT
id FROM "Department" WHERE name = $1 LIMIT 1', dept_name) and keep the existing
return logic (dept["id"] if dept else None).
apps/web/app/manager/page.tsx-284-286 (1)
284-286: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Forward approval comments to handleDecision.

The comment entered in ApprovalCard is currently discarded.
🔧 Suggested fix
-                        onSubmitDecision={async (decision) => {
-                          await handleDecision(approval.id, decision)
+                        onSubmitDecision={async (decision, _prNumber, _total, comments) => {
+                          await handleDecision(approval.id, decision, comments)
                         }}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/manager/page.tsx` around lines 284 - 286, The ApprovalCard's
onSubmitDecision currently only forwards the decision and drops the entered
comment; update the callback to accept the comment argument from ApprovalCard
and pass it into handleDecision so the comment is preserved (e.g. change
onSubmitDecision={(decision) => ...} to accept the comment payload and call
handleDecision(approval.id, decision, comment)). Locate the onSubmitDecision
prop on the ApprovalCard and the handleDecision function and ensure
handleDecision's signature supports a comment parameter (add a third parameter
like comment if missing) so the comment is forwarded and saved.

🟡 Minor comments (6)

apps/agent-core/tests/test_prd_requirements.py-45-51 (1)

45-51: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Global DB singleton mutation is not restored, risking cross-test leakage.

Line 50 sets src.dependencies._db_pool globally but this helper doesn’t restore prior state. Failures or interleaving can bleed pool state into other tests.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_prd_requirements.py` around lines 45 - 51, The
helper setup_pool_for_tools mutates the global singleton
src.dependencies._db_pool without restoring it; update it to save the current
src.dependencies._db_pool before assignment and restore that saved value after
the test (either by returning a teardown callback or converting
setup_pool_for_tools into a context manager/fixture). Specifically, in
setup_pool_for_tools capture old_pool = src_deps._db_pool, assign
src_deps._db_pool = pool, and ensure old_pool is reassigned to src_deps._db_pool
in test teardown or when exiting the context; keep references to the existing
function name setup_pool_for_tools and to src.dependencies._db_pool and
dependencies.get_pool_singleton so reviewers can locate the change.

apps/agent-core/tests/test_dispute_flow.py-85-86 (1)

85-86: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reset dependencies._db_pool after each test to avoid stale closed-pool state.

These tests inject a global pool and close it, but never restore the singleton. Later tests can pick up a closed pool and fail nondeterministically.

Proposed fix

     finally:
+        from src import dependencies
+        dependencies._db_pool = None
         await pool.close()

Also applies to: 148-150, 168-169, 182-183, 202-203, 219-220, 239-240, 270-271, 290-291, 329-330

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_dispute_flow.py` around lines 85 - 86, Tests
assign a global DB pool to dependencies._db_pool and close it, but never restore
the original singleton; capture the previous value before overwriting (e.g.,
save old_pool = dependencies._db_pool), ensure each test restores
dependencies._db_pool to old_pool in a finally/teardown block (or use pytest
fixture with yield to set and then reset), and apply this pattern for every test
that mutates dependencies._db_pool (locations around the assignments at/near
lines where dependencies._db_pool is set) so later tests won't observe a closed
pool.

apps/agent-core/tests/test_eval_suite.py-42-60 (1)

42-60: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

test_pr_creation_unique_id doesn’t validate uniqueness yet.

This only checks passed twice; it never asserts that two PR identifiers differ.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_eval_suite.py` around lines 42 - 60, The
test_pr_creation_unique_id currently only checks pr1_result["passed"] and
pr2_result["passed"] but doesn't assert the PR identifiers are distinct; update
the test to extract the created PR identifier from pr1_result and pr2_result
(e.g., pr1_result["pr_id"] or pr1_result["data"]["id"] /
pr1_result["output"]["id"] depending on evaluate_response return shape) and add
an assertion that they are not equal (assert pr1_id != pr2_id) so the test
actually verifies uniqueness; keep the existing passed assertions.

apps/agent-core/tests/test_tools_tdd.py-187-190 (1)

187-190: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen assertions to validate actual behavior.

These assertions currently pass even when filtering/validation behavior is wrong, so they can mask regressions.

Also applies to: 399-401

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/tests/test_tools_tdd.py` around lines 187 - 190, The
assertion using data.get("found", 0) >= 0 is too weak—replace it with concrete
checks: assert that data contains an "items" list, assert data["found"] ==
len(data["items"]), and for each returned item assert the item's category
matches the expected filter (e.g., item.get("category") == expected_category or
contains the filter), and if the test expects results assert data["found"] > 0;
apply the same replacement to the other similar assertions in this file.

apps/web/app/finance/page.tsx-384-385 (1)

384-385: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a null check for marketMedianPrice rendering.

A value of 0 is currently treated as missing and rendered as —.

🔧 Suggested fix

-                          {item.marketMedianPrice ? safePrice(item.marketMedianPrice) : '—'}
+                          {item.marketMedianPrice != null ? safePrice(item.marketMedianPrice) : '—'}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/finance/page.tsx` around lines 384 - 385, The rendering treats 0
as missing because it uses a truthy check; update the conditional that renders
marketMedianPrice (the expression referencing item.marketMedianPrice and
safePrice) to explicitly check for null/undefined (e.g., item.marketMedianPrice
!= null) so zero values are passed to safePrice and only null/undefined render
the placeholder '—'.

apps/agent-core/src/tools.py-912-915 (1)

912-915: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

datetime.utcnow() is deprecated in Python 3.12+.

Use datetime.now(timezone.utc) (already imported) for consistency with other parts of the codebase and future compatibility.

Proposed fix

-        count = await conn.fetchval('SELECT COUNT(*) FROM "Dispute"')
-        year = datetime.utcnow().year
+        count = await conn.fetchval('SELECT COUNT(*) FROM "Dispute"')
+        year = datetime.now(timezone.utc).year
         reference_number = f"DIS-{year}-{int(count) + 1:04d}"

-        submitted_at = datetime.utcnow()
+        submitted_at = datetime.now(timezone.utc)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/src/tools.py` around lines 912 - 915, Replace deprecated
datetime.utcnow() calls with timezone-aware datetime.now(timezone.utc) for the
variables year, reference_number, and submitted_at in the code that builds the
DIS reference and timestamp; compute year = datetime.now(timezone.utc).year,
build reference_number using that year and the same count logic, and set
submitted_at = datetime.now(timezone.utc) to keep consistency with the rest of
the codebase and use the already imported timezone.

🧹 Nitpick comments (6)

apps/agent-core/requirements.txt (1)

16-18: 💤 Low value

Consider pinning versions for reproducible builds.

The newly added dependencies use minimum-version constraints (>=), while the existing requirements use exact pins (==). This inconsistency may lead to non-reproducible builds if a new minor/major version of these libraries introduces breaking changes. For production deployments, pinning to exact versions (or using a lock file mechanism) ensures consistent behavior across environments.
📌 Example of consistent pinning
-python-dotenv>=1.0.0
-loguru>=0.7.0
-sse-starlette>=2.0.0
+python-dotenv==1.0.1
+loguru==0.7.2
+sse-starlette==2.1.3
Alternatively, maintain a requirements.lock or use Poetry/Pipenv for automatic lock file management.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/agent-core/requirements.txt` around lines 16 - 18, The requirements file
currently uses minimum-version constraints for python-dotenv, loguru, and
sse-starlette which conflicts with the project's exact pinning policy; update
the three entries (python-dotenv, loguru, sse-starlette) to use exact versions
(replace ">=" with "==" and set the specific version numbers you verified in CI
or the current environment), or alternatively add a lock file (requirements.lock
or a Poetry/Pipfile lock) and reference it so builds are reproducible; ensure
the chosen versions match what tests currently run against to avoid regressions.

apps/web/app/api/finance/flagged-items/route.ts (1)

11-21: ⚡ Quick win

Type assertion bypasses runtime validation.

The type assertion on line 11 assumes the database returns exactly the specified shape, but Prisma's $queryRaw returns unknown by default. If the database schema changes or returns unexpected types (e.g., Decimal for numeric columns), the subsequent Number() conversions (lines 24-25) may silently fail or produce NaN.

🛡️ Add runtime validation or use Prisma's typed query methods

Option 1: Use Prisma Client's type-safe findMany instead of raw SQL:

- const rows = await prisma.$queryRaw<Array<{...}>>`
-   SELECT ...
-   FROM "CatalogItem"
-   WHERE "pricingFlag" = true
-   ORDER BY "pricingFlaggedAt" DESC NULLS LAST
- `
+ const rows = await prisma.catalogItem.findMany({
+   where: { pricingFlag: true },
+   orderBy: [{ pricingFlaggedAt: 'desc' }],
+   select: {
+     id: true,
+     name: true,
+     sku: true,
+     unitPrice: true,
+     category: true,
+     vendor: true,
+     vendorCode: true,
+     marketMedianPrice: true,
+     pricingFlaggedAt: true,
+   },
+ })

Option 2: If raw SQL is required, validate each row before processing:

const rows = await prisma.$queryRaw<Array<unknown>>`...`
const validatedRows = rows.map(r => {
  if (!r || typeof r !== 'object' || !('id' in r)) {
    throw new Error('Invalid row shape from database')
  }
  return r as { id: string; name: string; /* ... */ }
})

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/api/finance/flagged-items/route.ts` around lines 11 - 21, The
raw call to prisma.$queryRaw is asserted to a specific shape which bypasses
runtime checks and can break if DB types change (e.g., Decimal), so replace the
raw query with Prisma Client's type-safe method (e.g., use
prisma.catalogItem.findMany with an explicit select for
id,name,sku,unitPrice,category,vendor,vendorCode,marketMedianPrice,pricingFlaggedAt)
so returned types are inferred and Decimal values are handled; if you must keep
$queryRaw, change the call to return Array<unknown> and add a validation step
that checks each row's shape and converts Decimal-like fields to numbers before
the Number(...) conversions (refer to prisma.$queryRaw and the code that
currently does Number(...) on unitPrice/marketMedianPrice).

apps/web/test-genui-mock.mjs (1)

80-82: ⚡ Quick win

Replace waitForTimeout with explicit selector waits.

Arbitrary timeouts cause flaky tests. Prefer waitForSelector with the component's data-testid:

Suggested fix

-  // Wait for the components to render
-  await page.waitForTimeout(2000);
+  // Wait for GenUI components to render
+  await page.waitForSelector('[data-testid="catalog-grid"]', { timeout: 5000 });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/test-genui-mock.mjs` around lines 80 - 82, Replace the flaky fixed
delay call page.waitForTimeout(2000) with explicit waits for the rendered
components by using page.waitForSelector targeting the component data-testid(s)
present in the test (e.g., wait for "[data-testid=\"...\" ]" for the main
component or child elements). Update the test to await the selector(s) that
indicate the UI is ready (use page.waitForSelector for each critical element or
a single parent container) instead of the arbitrary timeout so the test becomes
deterministic and less flaky.

apps/web/lib/auth/rbac.ts (1)

46-55: 💤 Low value

B2B_ROLE_HIERARCHY is defined but never used.

meetsRoleLevel still evaluates against the B2C ROLE_HIERARCHY. If this function is called with B2B roles like EMPLOYEE or MANAGER, indexOf returns -1, causing the check to always fail. Either wire meetsRoleLevel to use the correct hierarchy based on context, or remove B2B_ROLE_HIERARCHY if unused.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/lib/auth/rbac.ts` around lines 46 - 55, meetsRoleLevel currently
always uses ROLE_HIERARCHY while B2B_ROLE_HIERARCHY is unused, causing -1
lookups for B2B roles; update meetsRoleLevel to choose the correct hierarchy by
checking whether both userRole and minimumRole exist in B2B_ROLE_HIERARCHY (use
that) otherwise fall back to ROLE_HIERARCHY, handle missing roles by returning
false (or throwing) to avoid comparing -1 indices, and keep the symbols
B2B_ROLE_HIERARCHY, ROLE_HIERARCHY, and meetsRoleLevel to locate the change.

apps/web/components/genui/BudgetGauge.tsx (1)

126-131: 💤 Low value

maxBudget prop is unused in CategoryBar.

maxCategoryBudget is calculated at line 199 and passed to CategoryBar, but the component never uses it. The bar width is based on spent / budget, not relative to max. Remove the unused prop or clarify its intended purpose.

Remove unused prop

-const CategoryBar: FC<{ name: string; budget: number; spent: number; maxBudget: number }> = ({
+const CategoryBar: FC<{ name: string; budget: number; spent: number }> = ({
   name,
   budget,
   spent,
-  maxBudget,
 }) => {

-  const maxCategoryBudget = Math.max(...categories.map(c => c.budget), 1)
   ...
               <CategoryBar
                 key={cat.name || i}
                 name={cat.name}
                 budget={cat.budget}
                 spent={cat.spent}
-                maxBudget={maxCategoryBudget}
               />

Also applies to: 198-199, 258-265

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/components/genui/BudgetGauge.tsx` around lines 126 - 131,
CategoryBar currently accepts a maxBudget prop that is never used; either remove
maxBudget from the CategoryBar props and all call sites (where maxCategoryBudget
is passed) or update the component to compute widths relative to maxBudget
(e.g., use spent / maxBudget instead of spent / budget) depending on intended
behavior. Locate the CategoryBar declaration and usages (symbol: CategoryBar and
the prop name maxBudget) and either (a) delete maxBudget from the prop type and
from every invocation (where maxCategoryBudget is passed), or (b) replace the
width calculation inside CategoryBar to reference maxBudget (and add a
safe-guard for zero) and keep passing maxCategoryBudget from the caller.

apps/web/app/(chat)/page.tsx (1)

84-105: 💤 Low value

Consider using unknown[] instead of any[] for type assertions.

Lines 89, 98, and similar locations use as any[] type assertions. Per coding guidelines, prefer unknown over any to maintain type safety.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/web/app/`(chat)/page.tsx around lines 84 - 105, In renderUIComponent
update the loose any[] casts to unknown[] to preserve type safety: change casts
for items and products (currently using "as any[] | undefined") to "as unknown[]
| undefined" and propagate that type where CatalogGrid is called (items and
products props), keeping loading, error, and onAddToPR usage unchanged; ensure
the UIComponent prop typing reflects unknown[] so CatalogGrid receives unknown[]
| undefined instead of any[] and adjust any downstream handling or type guards
accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa9bc4d4-bf39-49a0-88b1-a6d3c36847bd

📥 Commits

Reviewing files that changed from the base of the PR and between 354f75b and a008fff.

📒 Files selected for processing (65)

apps/agent-core/migrations/005_add_pricing_flag_columns.sql
apps/agent-core/requirements.txt
apps/agent-core/routers/__init__.py
apps/agent-core/routers/chat.py
apps/agent-core/scripts/run_catalog_audit.py
apps/agent-core/src/catalog_audit.py
apps/agent-core/src/eval_suite.py
apps/agent-core/src/graph.py
apps/agent-core/src/tools.py
apps/agent-core/tests/conftest.py
apps/agent-core/tests/test_budget_timing.py
apps/agent-core/tests/test_catalog_audit.py
apps/agent-core/tests/test_chat.py
apps/agent-core/tests/test_context_compression.py
apps/agent-core/tests/test_context_stripping.py
apps/agent-core/tests/test_dispute_flow.py
apps/agent-core/tests/test_dynamic_tools.py
apps/agent-core/tests/test_error_formatting.py
apps/agent-core/tests/test_eval_suite.py
apps/agent-core/tests/test_graph.py
apps/agent-core/tests/test_health.py
apps/agent-core/tests/test_notifications.py
apps/agent-core/tests/test_prd_requirements.py
apps/agent-core/tests/test_sanitization.py
apps/agent-core/tests/test_semantic_search.py
apps/agent-core/tests/test_serpapi_market_price.py
apps/agent-core/tests/test_tools_tdd.py
apps/web/__tests__/components/SuggestedChips.test.tsx
apps/web/__tests__/components/genui/ApprovalCard.test.tsx
apps/web/__tests__/components/genui/BudgetGauge.test.tsx
apps/web/__tests__/components/genui/CatalogGrid.test.tsx
apps/web/__tests__/components/genui/PRList.test.tsx
apps/web/__tests__/components/genui/PurchaseRequestDraft.test.tsx
apps/web/app/(chat)/page.tsx
apps/web/app/api/agent/route.ts
apps/web/app/api/finance/budget-by-department/route.ts
apps/web/app/api/finance/flagged-items/[id]/resolve/route.ts
apps/web/app/api/finance/flagged-items/route.ts
apps/web/app/api/finance/me/route.ts
apps/web/app/api/finance/spend-trend/route.ts
apps/web/app/api/finance/top-categories/route.ts
apps/web/app/finance/page.tsx
apps/web/app/login/page.tsx
apps/web/app/manager/page.tsx
apps/web/components/SuggestedChips.tsx
apps/web/components/genui/ApprovalCard.tsx
apps/web/components/genui/BudgetAlert.tsx
apps/web/components/genui/BudgetGauge.tsx
apps/web/components/genui/CatalogGrid.tsx
apps/web/components/genui/DisputeCard.tsx
apps/web/components/genui/NotificationBell.tsx
apps/web/components/genui/PRList.tsx
apps/web/components/genui/PRSubmittedCard.tsx
apps/web/components/genui/PriceComparisonCard.tsx
apps/web/components/genui/PurchaseRequestDraft.tsx
apps/web/components/genui/SourcingRequestCard.tsx
apps/web/components/genui/index.ts
apps/web/components/shell/Rail.tsx
apps/web/lib/auth/jwt.ts
apps/web/lib/auth/rbac.ts
apps/web/lib/genui/__tests__/safe-render.test.ts
apps/web/lib/genui/safe-render.ts
apps/web/lib/ui-event-types.ts
apps/web/middleware.ts
apps/web/test-genui-mock.mjs

✅ Files skipped from review due to trivial changes (6)

apps/agent-core/migrations/005_add_pricing_flag_columns.sql
apps/agent-core/tests/test_health.py
apps/web/app/login/page.tsx
apps/agent-core/routers/init.py
apps/web/tests/components/genui/BudgetGauge.test.tsx
apps/agent-core/tests/test_budget_timing.py

coderabbitai · 2026-05-24T04:42:11Z

-        'Authorization': token,
+        'Content-Type':     'application/json',
+        'Authorization':    token,
+        'x-test-mode':      req.headers.get('x-test-mode') ?? '',


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check how agent-core handles x-test-mode header rg -nC10 'x-test-mode|test.mode|TEST_MODE' apps/agent-core/

Repository: Aparnap2/smart_commerce_agent

Length of output: 1704

Fix auth bypass via x-test-mode forwarding

apps/web/app/api/agent/route.ts forwards x-test-mode to agent-core; in apps/agent-core/routers/chat.py, auth is skipped when request.headers.get("x-test-mode") == "true" (or x-user-id == "test-user-id"), with no production/non-production gating shown. This lets any caller bypass JWT by sending these headers. Gate test-mode auth bypass strictly to non-production/internal use (e.g., MOCK_LLM=true/NODE_ENV checks and/or an additional server-side secret) and don’t forward x-test-mode from the public route in production.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/api/agent/route.ts` at line 87, The public API should not forward the 'x-test-mode' header unconditionally and the agent-core should not skip auth solely based on that header; in route.ts stop sending 'x-test-mode' to agent-core when running in production (e.g., only include 'x-test-mode' when NODE_ENV !== "production" or process.env.MOCK_LLM is true) or replace it with a server-only secret header, and in apps/agent-core/routers/chat.py tighten the bypass check that currently reads request.headers.get("x-test-mode") == "true" (and x-user-id == "test-user-id") so it also verifies a server-side gating condition (e.g., process.env.MOCK_LLM === "true" or a shared internal secret) before skipping JWT auth.

coderabbitai · 2026-05-24T04:42:11Z

+  const role = request.headers.get('x-role')
+  const userId = request.headers.get('x-user-id')
+
+  if (!role || !['FINANCE', 'ADMIN'].includes(role)) {
+    return NextResponse.json({ error: 'Forbidden' }, { status: 403 })
+  }


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Do not authorize from client-controlled headers.

x-role/x-user-id can be forged by callers, so this endpoint currently authorizes based on untrusted input. Derive role and user identity only from a verified server-side auth source (e.g., validated JWT/cookie context) and treat these headers as non-authoritative.

Also applies to: 17-24

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/app/api/finance/me/route.ts` around lines 5 - 10, The code is currently authorizing using untrusted headers (role/userId); replace reading request.headers.get('x-role') and 'x-user-id' in the route handler with server-validated identity from your auth layer (e.g., validate the request cookie/JWT or call your session helper) and extract role and user id from the verified token/session claims; update the authorization check (the logic that uses role and userId, including the similar block at the later check around lines 17-24) to use those verified values and remove reliance on the client-controlled headers.

… across 16 dimensions) - Add groq provider to create_llm() with ThinkTagStrippingLLM wrapper - Add openrouter provider to create_llm() + get_openai_client() - Fix SSE handler: parse __ui__ from message content JSON, add user_role field - Add x-test-mode auth bypass header for test env - Create test_agentic_ai_eval.py: 46 tests covering tool calling, output format, RAG, decision making, compaction, state persistence, GenUI payloads, short/long-term memory, context/harness engineering, edge cases, chaos/concurrency, checkpoints, system prompt integrity, performance - Register pytest integration marker in pyproject.toml

netlify · 2026-06-12T14:08:26Z

❌ Deploy Preview for ai-sdk failed. Why did it fail? →

Name	Link
🔨 Latest commit	`ca15ece`
🔍 Latest deploy log	https://app.netlify.com/projects/ai-sdk/deploys/6a2c13473128770008054230

…, docker - agent-core: SSE streaming, graph, tools, salesforce client, eval suite dependencies, 16 test files (unit + integration + e2e + genui + enforcement) - web: support cockpit (three-panel), 9 GenUI support components, UIContext, ContextPanel, RBAC, middleware, auth - mocks: salesforce mock server, slack mock server - infra: docker-compose.supabase.yml, opencode config

gemini-code-assist Bot reviewed May 21, 2026

View reviewed changes

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

Aparnap2 changed the title ~~fix(genui): SSE parsing, mock LLM, and GenUI rendering~~ feat: F8-F15 — Manager/Finance dashboards, sourcing, disputes, pricing audit, RBAC May 24, 2026

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

	console.log('[Middleware] Path:', path, '\| Has token cookie:', hasAuthCookie, '\| Cookie:', cookieHeader.slice(0, 100))
	// Debug logging (remove in production)
	console.log('[Middleware] Path:', path, '\| Has token cookie:', hasAuthCookie);

	msg.content = json.dumps(parsed)
	msg = msg.model_copy(update={"content": json.dumps(parsed)})

Conversation

Aparnap2 commented May 21, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ProcureAI — Features F8–F15

What this does

Architecture highlights

Test coverage

What changed

Demo walkthrough (5 min)

Out of scope (intentional)

Summary by CodeRabbit

Uh oh!

netlify Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Deploy Preview for ai-sdk failed. Why did it fail? →

Uh oh!

coderabbitai Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Aparnap2 commented May 21, 2026 •

edited by coderabbitai Bot

Loading

netlify Bot commented May 21, 2026 •

edited

Loading

coderabbitai Bot commented May 21, 2026 •

edited

Loading

netlify Bot commented Jun 12, 2026 •

edited

Loading