mborges-dev/docflow

DocFlow

AI document extraction pipeline for logistics and distribution. Automates the manual data entry of paper-based delivery notes, invoices, and shipping documents — turning unstructured scanned documents into structured database records.

Designed for distribution operations processing thousands of delivery notes per week.

Status: Public showcase. The two demos here (app/index.html and dashboard/index.html) are anonymized recreations of the real system I built under a client engagement. The original production code is proprietary and stays with the client; what you see here is the same architecture, the same UX flow, with all client identifiers replaced.


Problem

Distribution companies receive hundreds of paper delivery notes daily from drivers, suppliers, and couriers. Each note contains:

  • Delivery address and contact
  • Line items with product codes, quantities, weights
  • Signatures, timestamps, special instructions
  • Handwritten annotations

Manual data entry into the ERP takes a team of clerks several hours per day. Errors lead to billing discrepancies and stock mismatches.


Solution

DocFlow scans or photographs the documents, runs them through a GPT-4o Vision pipeline, extracts structured data, validates it against known product codes and addresses, and writes it directly to the operational database — with a human review queue for low-confidence extractions.


Architecture

Physical document
       │
  📷 Photo / scan
       │
       ▼
  Cloudflare R2          ← upload via mobile app or web form
  (document storage)
       │
       ▼
  Cloudflare Worker      ← triggers on upload
  (orchestration layer)
       │
       ▼
  GPT-4o Vision          ← structured JSON extraction
  (extraction model)
       │
       ▼
  Validation layer       ← cross-checks against product catalogue + address book
  (Supabase queries)
       │
    ┌──┴──┐
    │     │
  ✓ Pass  ✗ Fail / low confidence
    │           │
    ▼           ▼
  Auto-write  Human review queue
  to ERP DB   (Next.js dashboard)

Extraction pipeline

Each document goes through a multi-step extraction:

Step 1: Pre-processing. The image is resized and contrast-adjusted for legibility. Multi-page documents are split and each page is processed individually.
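
A minimal sketch of the arithmetic behind the resize and contrast steps, assuming an 8-bit grayscale pixel buffer. The function names and the maximum side length passed in are illustrative assumptions, not values from the production system, which would more likely rely on an image library.

```typescript
// Hypothetical helpers; names and limits are assumptions for illustration.

// Scale (width, height) so the longer side is at most maxSide, keeping the aspect ratio.
function fitWithin(width: number, height: number, maxSide: number): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxSide) return { width, height }; // never upscale
  const scale = maxSide / longest;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

// Linear contrast stretch: map the darkest pixel to 0 and the brightest to 255.
function stretchContrast(gray: Uint8Array): Uint8Array {
  let min = 255;
  let max = 0;
  gray.forEach((value) => {
    if (value < min) min = value;
    if (value > max) max = value;
  });
  if (max <= min) return gray.slice(); // flat or empty image, nothing to stretch
  const range = max - min;
  return gray.map((value) => Math.round(((value - min) * 255) / range));
}
```

For example, a 4000 x 3000 photo limited to a 2048-pixel long side would be scaled to 2048 x 1536, and a smaller image would be left untouched.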

Step 2: GPT-4o Vision call. The model receives the image with a structured extraction prompt requesting a typed JSON response:

type DeliveryNoteExtraction = {
  document_type: "delivery_note" | "invoice" | "return_note" | "unknown";
  document_number: string | null;
  date: string | null; // ISO 8601
  supplier: { name: string; vat_number: string | null };
  delivery_address: { street: string; postal_code: string; city: string };
  line_items: Array<{
    product_code: string | null;
    description: string;
    quantity: number;
    unit: string;
    weight_kg: number | null;
  }>;
  signature_present: boolean;
  notes: string | null;
  confidence: number; // 0-1, model's self-assessed confidence
};
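
The sketch below shows one way the request and the response check around this type might look. The request body follows the OpenAI Chat Completions format for image input (a text part plus an `image_url` part, with `response_format` set to `json_object`). The prompt text, the helper names, and the trimmed `ExtractionCore` type are assumptions made for brevity; the production prompt and parser are not published.

```typescript
// Trimmed version of DeliveryNoteExtraction, kept short for the example.
type DocumentType = "delivery_note" | "invoice" | "return_note" | "unknown";
type ExtractionCore = {
  document_type: DocumentType;
  document_number: string | null;
  line_items: Array<{ product_code: string | null; description: string; quantity: number }>;
  confidence: number;
};

const DOCUMENT_TYPES: DocumentType[] = ["delivery_note", "invoice", "return_note", "unknown"];

// Hypothetical prompt text.
const EXTRACTION_PROMPT =
  "Extract the delivery note fields as JSON. Use null for anything you cannot read.";

// Builds a Chat Completions request body with one image attached as a data URL.
function buildExtractionRequest(imageBase64: string, mimeType: string) {
  return {
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: EXTRACTION_PROMPT },
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
        ],
      },
    ],
  };
}

function isRecord(value: unknown): value is { [key: string]: unknown } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function asNullableString(value: unknown, field: string): string | null {
  if (value === null || typeof value === "string") return value;
  throw new Error(`${field} must be a string or null`);
}

// Parses the model's reply and rejects anything that does not match the expected shape.
function parseExtraction(raw: string): ExtractionCore {
  const data: unknown = JSON.parse(raw);
  if (!isRecord(data)) throw new Error("extraction is not a JSON object");

  const documentType = DOCUMENT_TYPES.filter((t) => t === data["document_type"])[0];
  if (documentType === undefined) throw new Error("unexpected document_type");

  const confidence = data["confidence"];
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    throw new Error("confidence must be a number within 0-1");
  }

  const rawItems = data["line_items"];
  if (!Array.isArray(rawItems)) throw new Error("line_items must be an array");
  const lineItems = rawItems.map((item: unknown) => {
    if (!isRecord(item)) throw new Error("line item is not an object");
    const description = item["description"];
    const quantity = item["quantity"];
    if (typeof description !== "string") throw new Error("description must be a string");
    if (typeof quantity !== "number" || !isFinite(quantity)) throw new Error("quantity must be a number");
    return { product_code: asNullableString(item["product_code"], "product_code"), description, quantity };
  });

  return {
    document_type: documentType,
    document_number: asNullableString(data["document_number"], "document_number"),
    line_items: lineItems,
    confidence,
  };
}
```

Checking the shape at runtime matters because a TypeScript type alone does not constrain what the model returns; a reply that fails the check could be sent to the review queue rather than written anywhere.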

Step 3: Validation. Extracted product codes are matched against the product catalogue in Supabase. Addresses are fuzzy-matched against the known delivery address book. Mismatches or low-confidence fields are flagged for human review.
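
As an illustration of the two checks in this step, the sketch below uses in-memory arrays in place of the Supabase queries and Levenshtein distance as the fuzzy-matching measure. Both choices are assumptions: the README does not say which similarity measure the real system uses, and the function names and the 0.8 minimum score are hypothetical.

```typescript
// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Lowercase, strip accents, and collapse punctuation so formatting noise does not count as a mismatch.
function normalizeAddress(text: string): string {
  return text
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
}

// Similarity in 0-1, where 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const x = normalizeAddress(a);
  const y = normalizeAddress(b);
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}

// Returns the closest address-book entry, or null when nothing reaches minScore (assumed 0.8).
function bestAddressMatch(
  candidate: string,
  addressBook: string[],
  minScore = 0.8
): { address: string; score: number } | null {
  let best: { address: string; score: number } | null = null;
  for (const address of addressBook) {
    const score = similarity(candidate, address);
    if (score >= minScore && (best === null || score > best.score)) {
      best = { address, score };
    }
  }
  return best;
}

// Returns the extracted codes that are absent from the catalogue (null codes are skipped).
function unknownProductCodes(codes: Array<string | null>, catalogue: string[]): string[] {
  const known = new Set(catalogue.map((code) => code.trim().toUpperCase()));
  const missing: string[] = [];
  codes.forEach((code) => {
    if (code !== null && !known.has(code.trim().toUpperCase())) missing.push(code);
  });
  return missing;
}
```

With this sketch, an extracted "Rua Augusta 12, Lisbao" would still match an address-book entry "Rua Augusta 12, Lisboa", while a code that is not in the catalogue would be returned as a validation issue.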

Step 4: Write or queue. Documents with confidence ≥ 0.85 and all fields validated are written directly to the ERP integration table. Others go to the review queue.
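
The routing rule can be sketched as a small pure function. The 0.85 threshold comes from the text above; the type and function names are hypothetical.

```typescript
type RoutingInput = { confidence: number; validationIssues: string[] };
type RoutingDecision = { route: "auto_write" | "review"; reasons: string[] };

const AUTO_WRITE_THRESHOLD = 0.85; // threshold stated in the README

// A document is auto-written only if it is confident enough AND has no validation issues.
function routeDocument(input: RoutingInput, threshold = AUTO_WRITE_THRESHOLD): RoutingDecision {
  const reasons = input.validationIssues.slice();
  // Written as a negated >= so that a NaN confidence also lands in review.
  if (!(input.confidence >= threshold)) {
    reasons.push(`confidence ${input.confidence} is below ${threshold}`);
  }
  return { route: reasons.length === 0 ? "auto_write" : "review", reasons };
}
```

Collecting the reasons alongside the route gives the reviewer something to act on: a high-confidence extraction with an unknown product code is queued with that one issue attached.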

Human review dashboard

A Next.js dashboard shows the review queue with:

  • Side-by-side view: original document image + extracted JSON
  • Inline editing for any field
  • "Accept" / "Reject" / "Re-extract" actions
  • Audit trail of every edit

Reviewers process the queue in the morning for documents received overnight.
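
One way an audit trail of edits could be produced is by diffing the extracted record against the reviewer's version when a document is accepted. The sketch below assumes flat records and hypothetical names (`AuditEntry`, `diffForAudit`); the real dashboard's audit schema is not published.

```typescript
type FieldValue = string | number | boolean | null;
type FlatRecord = { [field: string]: FieldValue };
type AuditEntry = {
  field: string;
  before: FieldValue;
  after: FieldValue;
  reviewer: string;
  editedAt: string; // ISO 8601 timestamp
};

// Produces one audit entry per field whose value changed; unchanged fields are omitted.
function diffForAudit(before: FlatRecord, after: FlatRecord, reviewer: string, editedAt: Date): AuditEntry[] {
  const fields = Object.keys(before);
  Object.keys(after).forEach((key) => {
    if (fields.indexOf(key) === -1) fields.push(key);
  });

  const entries: AuditEntry[] = [];
  fields.forEach((field) => {
    // A field missing on one side is treated as null.
    const oldValue = field in before ? before[field] : null;
    const newValue = field in after ? after[field] : null;
    if (oldValue !== newValue) {
      entries.push({ field, before: oldValue, after: newValue, reviewer, editedAt: editedAt.toISOString() });
    }
  });
  return entries;
}
```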

ERP integration

The validated records are written to a staging table in Supabase. A scheduled job (Supabase pg_cron) exports the staged records to the client's ERP via a REST adapter, running every 2 hours during business hours.
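
A hedged sketch of what one export run over the staging table might do. The record shape, the `send` callback standing in for the REST adapter, and the limit of three attempts are all assumptions for illustration.

```typescript
type StagedRecord = {
  id: string;
  payload: unknown;
  status: "pending" | "exported" | "failed";
  attempts: number;
};

// Hypothetical stand-in for the ERP REST adapter; rejects when the ERP call fails.
type SendToErp = (payload: unknown) => Promise<void>;

// Tries to export every pending record once and returns the updated rows.
// A record that keeps failing is marked "failed" after maxAttempts so it stops being retried.
async function exportPending(
  records: StagedRecord[],
  send: SendToErp,
  maxAttempts = 3
): Promise<StagedRecord[]> {
  const result: StagedRecord[] = [];
  for (const record of records) {
    if (record.status !== "pending") {
      result.push(record); // already exported or given up on
      continue;
    }
    const attempts = record.attempts + 1;
    try {
      await send(record.payload);
      result.push({ ...record, attempts, status: "exported" });
    } catch {
      result.push({ ...record, attempts, status: attempts >= maxAttempts ? "failed" : "pending" });
    }
  }
  return result;
}
```

Because a failed send leaves the record in the staging table as "pending", the next scheduled run picks it up again without the extraction step being involved.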


Stack

Layer             Technology
Storage           Cloudflare R2 (document images)
Edge              Cloudflare Workers (upload handler, orchestration)
AI                GPT-4o Vision (OpenAI)
Database          Supabase (PostgreSQL)
Review dashboard  Next.js 14 (App Router)
ERP connector     ERP REST API (custom adapter)
Language          TypeScript

Key decisions

GPT-4o Vision over traditional OCR — classic OCR (Tesseract, etc.) struggles with handwriting, poor scan quality, and mixed layouts. GPT-4o Vision handles all three and returns structured JSON directly, skipping the intermediate parse step.

Confidence-based routing — not all extractions are equal. A simple threshold (0.85) routes clean documents to auto-write and problematic ones to human review. This keeps the review queue manageable (≈5% of documents) while automating the bulk.

Cloudflare R2 + Workers — documents are uploaded directly to R2 from mobile devices in the field. The Worker triggers immediately on upload, keeping latency low without a polling loop or message queue for the typical volume.

Staging table → ERP — writing directly to the ERP API on every extraction would be brittle. The staging table decouples extraction from ERP sync, making retries, rollbacks, and debugging straightforward. The ERP adapter is the only component that knows about the target ERP system.

Supabase for everything except storage — product catalogue, address book, review queue, audit log, and ERP staging all live in Postgres. Using a single database keeps joins simple and avoids distributed transactions.


Results

  • ~95% of documents processed automatically without human review
  • Data entry time reduced from ~4 hours/day to ~20 minutes (review queue only)
  • Error rate in ERP records dropped from ~3% (manual) to ~0.3% (pipeline)

What's in this repo

docflow/
├── README.md              ← this file (architecture documentation)
├── app/
│   └── index.html         ← the field-app demo: capture a document,
│                            extract structured data, send to dashboard.
│                            Single-file standalone — open in a browser.
└── dashboard/
    └── index.html         ← the operator review dashboard: queue of
                             pending docs, approve / reject / send-to-ERP
                             actions, SAP integration log, analytics.

How to run the demos

Both demos are single-file HTML. No build step, no dependencies, no API keys needed.

# Field-app demo (the open command is macOS; use xdg-open on Linux or start on Windows)
open app/index.html

# Dashboard demo
open dashboard/index.html

The dashboard listens for new documents from the field-app via localStorage, so the two pages communicate only when both are open in the same browser. Open both, click "📱 Simular doc. da app" ("Simulate doc. from app") in the dashboard or use the field-app to send a new document, and watch the flow.
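
A minimal sketch of how such a localStorage handoff can work, written against a small storage interface so it also runs outside a browser. The key name, the document shape, and the function names are assumptions and may differ from what the demo files actually use.

```typescript
// Minimal subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type DemoDocument = { id: string; documentNumber: string; sentAt: string };

const QUEUE_KEY = "docflow:pending"; // hypothetical key name

// Reads the pending queue, treating a missing or corrupt value as an empty queue.
function readQueue(store: KeyValueStore): DemoDocument[] {
  const raw = store.getItem(QUEUE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as DemoDocument[]) : [];
  } catch {
    return [];
  }
}

// Field-app side: append a document to the shared queue.
function sendDocument(store: KeyValueStore, doc: DemoDocument): void {
  const queue = readQueue(store);
  queue.push(doc);
  store.setItem(QUEUE_KEY, JSON.stringify(queue));
}

// Dashboard side: take everything currently queued and leave the queue empty.
function drainQueue(store: KeyValueStore): DemoDocument[] {
  const queue = readQueue(store);
  store.setItem(QUEUE_KEY, "[]");
  return queue;
}
```

In a browser, the dashboard side would typically call `drainQueue(window.localStorage)` from a `storage` event listener on `window`. That event fires in other same-origin pages when a key changes, not in the page that made the change, which is why both demos need to be open at once.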

What's anonymized

All client-identifying data has been replaced with generic placeholders:

  • Supplier/brand names → "Acme Bebidas"
  • Real NIFs → fake NIFs in the 500 100 100 format
  • Geographic markers → generic "Lisboa, Portugal"
  • Real operator name → "João Silva"
  • Cost-center codes → generic "ACME-LOG-001"

The architecture, the UX flow, the SAP integration shape, the field-to-dashboard handoff, the OCR-vs-LLM fallback strategy — all real, all preserved.


Notice

This repository is published as a portfolio showcase of work I did under a real client engagement. The code is not licensed for reuse, redistribution, or modification. It is provided here for review purposes only. If you'd like to discuss similar work for your own organisation, get in touch.


Built by Miguel Borges · hello@miguelborges.dev
