F3.2 Invoice Processing — Smart Lean AI Pipeline
CRITICAL BUG FIX: Upload 400 Error
Root cause: server.js line 1342 uses let body = ''; to accumulate request body as a string. This corrupts binary multipart/form-data (PDF uploads). Django receives mangled data and says "No PDF file provided".
Fix: Change the proxy to use Buffer for binary-safe forwarding:
// Line 1342-1358 in server.js — replace string body with Buffer array
const chunks = [];
req.on('data', chunk => chunks.push(chunk));
req.on('end', () => {
const body = Buffer.concat(chunks);
const proxyReq = http.request({
hostname: '127.0.0.1', port: 3001,
path: proxyUrl, method: req.method,
headers: { ...req.headers, host: '127.0.0.1:3001' }
}, (proxyRes) => {
res.writeHead(proxyRes.statusCode, proxyRes.headers);
proxyRes.pipe(res);
});
proxyReq.on('error', (err) => {
console.error('Django proxy error:', err.message);
res.writeHead(502, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ error: 'Backend unavailable' }));
});
if (body.length > 0) proxyReq.write(body);
proxyReq.end();
});
File: /home/claude/projects/unitcycle-demo/server.js lines 1340-1361
Verify: curl -X POST -F "pdf_file=@backend/media/invoices/1abb4cad698a_test_invoice.pdf" https://demo.unitcycle.com/api/invoices/upload/
Context
UnitCycle's invoice feature (F3.2) has the frontend and basic backend built, but the pipeline dead-ends at approve/reject. Invoices go nowhere after approval — no GL coding, no payment tracking, no AI matching. Competitors like Yardi Breeze and Entrata offer 94% AI accuracy with full AP workflows. This plan transforms the existing skeleton into a Smart Lean pipeline where AI handles the tedious work (vendor/property/GL matching, anomaly detection, duplicate flagging) and the PM just reviews and approves.
Design approved by Rafael: Smart Lean workflow (C), 2-panel layout, inline AI badges per field, light mode.
Architecture: Smart Lean Pipeline
Upload PDF → LlamaParse OCR (Cost Effective, 3 credits) → AI Extraction (Gemini Flash via OpenRouter)
→ Auto-match vendor/property/GL → Flag anomalies/dupes/WO matches
→ PM Reviews (inline AI badges, accept/override) → Approved → Scheduled → Paid
6-Stage State Machine
UPLOADED → AI_PROCESSING → PENDING_REVIEW → APPROVED → SCHEDULED → PAID
↓
REJECTED / ON_HOLD
API Keys & Configuration
Environment Setup
Create /home/claude/projects/unitcycle-demo/backend/.env:
LLAMAPARSE_API_KEY=llx-TNRhGRWbPYPukOOnn2xo0nevpsGuLek4yC8Nmbt5f57TWadS
OPENROUTER_API_KEY=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
LlamaParse Config
- Mode: Cost Effective (
gpt4o_mode=true) — 3 credits/page, 3,333 free/month - Current code fix:
backend/invoices/llamaparse.pyline 63: change"premium_mode": "true"→"gpt4o_mode": "true" - API:
https://api.cloud.llamaindex.ai/api/v1/parsing(keep v1 for now, simpler)
OpenRouter Config
- Primary model:
google/gemini-2.0-flash-001— $0.10/$0.40 per M tokens - Fallback:
qwen/qwen3-32b— $0.08/$0.24 per M tokens - API:
https://openrouter.ai/api/v1/chat/completions(OpenAI-compatible) - Use for: Invoice field extraction enhancement, vendor/property/GL matching, anomaly detection, AI summaries, confidence scoring
- Cost: ~$0.0004 per invoice, ~$40 per 100K invoices/year
Django Settings Update
- Install
python-dotenv, load.envinbackend/config/settings.py - Add:
OPENROUTER_API_KEY,OPENROUTER_MODEL(default:google/gemini-2.0-flash-001) - Update:
LLAMAPARSE_API_KEYto read from env instead of hardcoded
Design Spec
Invoice Detail Page (Redesign)
Layout: 2-panel (PDF left, details right) — keep existing structure
New components in right panel (top to bottom):
Workflow Stepper — horizontal 6-stage bar at page top
- Uploaded → AI Extracted → Review → Approved → Scheduled → Paid
- Gold filled dot with pulse animation on active step
- Gold checkmark on completed steps, muted outline on future
Glass Summary Card — hero card with gold top border
- Total amount (24px Manrope bold), AI confidence bar + percentage
- Due date, payment terms
- No glassmorphism blur (light mode), just subtle shadow + gold accent bar
AI Alert Cards — only shown when relevant
- Price anomaly (red): "34% above avg for this vendor"
- Work order match (purple): links to matching WO
- Duplicate warning (amber): fingerprint match detected
- Each alert has icon, title, detail text, and action link
Vendor & Property Card — inline AI badges per field
- Each field: label | value | AI badge (check 95%, eye 78%, warning 45%) | "change" link on hover
- Confidence tiers: High (>=85% gold), Medium (60-84% amber), Low (<60% red)
- Low-confidence fields show alternatives ("Also considered: Oakmere Trace 61%")
- "change" link opens dropdown with search for manual override
Line Items Table — with GL code chips
- Columns: #, Description, GL Code, Qty, Amount
- GL code as clickable chip with inline confidence badge
- Uncertain GL codes have amber border
- Total row with Manrope bold, tabular-nums
Action Bar — pinned at bottom of line items card
- Approve (navy bg, white text, primary), Hold (neutral), Reject (red outline)
- Approve triggers: accept all AI suggestions → move to APPROVED status
Invoice List Page (Minor updates)
- Add "Scheduled" and "Paid" status filter tabs
- Add "Hold" status tab
- Stats cards: add total scheduled, total paid amounts
Color System (Light Mode, oklch())
- Surfaces: white cards on
oklch(0.97 0.005 260)background - Gold accent:
oklch(0.68 0.16 75)— all positive indicators - Navy:
oklch(0.25 0.06 260)— approve button, headings - Confidence high:
oklch(0.58 0.16 75)onoklch(0.96 0.04 80)bg - Confidence medium:
oklch(0.62 0.15 55)onoklch(0.95 0.04 60)bg - Danger:
oklch(0.55 0.22 25)— anomalies, reject only - Purple:
oklch(0.52 0.18 300)— WO match alerts
Implementation Plan
Phase 1: Foundation (Backend)
Step 1: Environment & config setup
- Create
backend/.envwith API keys - Install
python-dotenv - Update
backend/config/settings.pyto load .env, add OPENROUTER settings - Update
LLAMAPARSE_API_KEYto read from env - Files:
backend/.env,backend/config/settings.py,backend/requirements.txt
Step 2: Fix LlamaParse mode (1-line change)
backend/invoices/llamaparse.pyline 63:"premium_mode": "true"→"gpt4o_mode": "true"- Update the new API key
- Test: upload a real PDF, verify extraction still works
Step 3: Create OpenRouter service
- New file:
backend/invoices/openrouter_service.py - OpenAI-compatible client pointing at
openrouter.ai/api/v1 extract_invoice_fields(markdown_text)— structured JSON output with schemamatch_vendor(extracted_name, vendor_list)— fuzzy match + confidencematch_property(extracted_text, property_list)— context-based matchingsuggest_gl_codes(line_items, vendor_category)— per-line GL suggestiondetect_anomalies(invoice_data, vendor_history)— price anomaly detectiongenerate_summary(invoice_data)— human-readable AI summary- All methods return confidence scores
Step 4: Update Invoice model status choices
- Add to status choices:
'on_hold','scheduled' - Add DB columns (raw SQL since managed=False):
approved_by VARCHAR(100)approved_at TIMESTAMPscheduled_at TIMESTAMPscheduled_by VARCHAR(100)paid_at TIMESTAMPhold_reason TEXTgl_code_confirmed BOOLEAN DEFAULT FALSEvendor_confirmed BOOLEAN DEFAULT FALSEproperty_confirmed BOOLEAN DEFAULT FALSE
- Add
gl_codecolumn toinvoice_line_itemstable - Files:
backend/invoices/models.py, raw SQL migration script
Step 5: AI matching pipeline
- New file:
backend/invoices/ai_pipeline.py process_invoice(invoice_id)— orchestrates the full pipeline:- Get LlamaParse markdown → extract JSON via OpenRouter
- Match vendor against
vendorstable (fuzzy + AI) - Match property against
propertiestable (context clues) - Suggest GL codes per line item (based on vendor category + description)
- Check for duplicates (multi-field fingerprint: vendor + amount + date + invoice#)
- Detect price anomalies (compare to vendor's avg invoice amount)
- Find matching work orders (description similarity)
- Calculate per-field confidence scores
- Save all results to Invoice + InvoiceLineItem records
- Files:
backend/invoices/ai_pipeline.py
Step 6: New API endpoints
POST /api/invoices/<id>/hold/— set status to on_hold with reasonPOST /api/invoices/<id>/schedule/— move approved → scheduledPOST /api/invoices/<id>/mark-paid/— move scheduled → paidPOST /api/invoices/<id>/confirm-field/— confirm AI suggestion (vendor/property/GL)POST /api/invoices/<id>/override-field/— override AI suggestionGET /api/invoices/<id>/vendor-history/— vendor's past invoices for anomaly contextGET /api/invoices/aging/— AP aging report data- Update existing
InvoiceDetailSerializerto include all new fields - Files:
backend/invoices/views.py,backend/invoices/urls.py,backend/invoices/serializers.py
Phase 2: Frontend Redesign
Step 7: Update types & service
- Add new fields to
Invoiceinterface (hold_reason, scheduled_at, paid_at, etc.) - Add
InvoiceFieldConfidencetype (per-field confidence + alternatives) - Add new service methods: hold, schedule, markPaid, confirmField, overrideField, getVendorHistory, getAging
- Files:
src/app/features/invoices/invoice.types.ts,src/app/features/invoices/invoice.service.ts
Step 8: Workflow stepper component
- New:
src/app/features/invoices/components/workflow-stepper.component.ts - Input: current status → maps to active step
- 6 dots with connectors, gold pulse on active, animations
Step 9: Redesign invoice-detail.component.ts
- Replace existing right panel with new sections (per design spec):
- Glass summary card with gold top border
- AI alert cards (anomaly, WO match, duplicate) — conditional
- Vendor & Property card with inline AI badges + "change" dropdowns
- Line items table with GL code chips
- Action bar (Approve/Hold/Reject)
- Keep left panel (PDF preview) as-is
- Add workflow stepper above panels
- Files:
src/app/features/invoices/invoice-detail.component.ts
Step 10: Update invoice list
- Add Scheduled, Paid, On Hold filter tabs
- Update stats cards with new status counts
- Files:
src/app/features/invoices/invoices.component.ts
Phase 3: AI Integration & Polish
Step 11: Wire up upload → AI pipeline
- On PDF upload: LlamaParse OCR → OpenRouter extraction → AI matching pipeline
- Polling: frontend checks status while AI_PROCESSING
- When done: reload detail view with all AI suggestions populated
- Files:
backend/invoices/views.py(updateinvoice_upload)
Step 12: Field confirmation UX
- Click "Accept" on AI badge → POST confirm-field → badge turns solid gold check
- Click "change" → dropdown with search → POST override-field → update badge
- GL code chip click → dropdown of GL codes with AI-ranked suggestions
- All confirmations tracked for audit trail
Step 13: Duplicate detection
- On upload: compute fingerprint (vendor_name + amount + invoice_date + invoice_number hash)
- Query existing invoices for fingerprint match
- If match found: show amber alert card with link to existing invoice
- Field:
is_duplicatealready exists in model
Phase 4: Testing & Verification
Step 14: Playwright tests
- Invoice list: filters, search, pagination, stats
- Invoice detail: view, PDF preview, field display
- Upload flow: drag-drop, progress, scanning animation, result
- AI review: badge display, accept, override, alerts
- Approve/reject/hold workflows
- Status transitions through full pipeline
- Files:
tests/invoices/
Step 15: Manual verification
- Upload a real vendor invoice PDF
- Verify LlamaParse extracts correctly with Cost Effective mode
- Verify OpenRouter/Gemini Flash returns accurate structured JSON
- Verify vendor/property/GL matching produces reasonable suggestions
- Verify anomaly detection flags correctly
- Verify full workflow: Upload → AI Processing → Review → Approve → Schedule → Paid
- Check responsive layout on mobile
Key Files to Modify
| File | Changes |
|---|---|
backend/.env |
CREATE — API keys |
backend/config/settings.py |
Load .env, add OPENROUTER settings |
backend/invoices/llamaparse.py |
Fix mode to gpt4o, use new API key |
backend/invoices/openrouter_service.py |
CREATE — OpenRouter client |
backend/invoices/ai_pipeline.py |
CREATE — AI matching orchestrator |
backend/invoices/models.py |
Add new status choices, new fields |
backend/invoices/views.py |
New endpoints, update upload flow |
backend/invoices/serializers.py |
Expand detail serializer |
backend/invoices/urls.py |
New routes |
backend/requirements.txt |
Add python-dotenv, openai |
src/app/features/invoices/invoice.types.ts |
New fields, confidence types |
src/app/features/invoices/invoice.service.ts |
New methods |
src/app/features/invoices/invoice-detail.component.ts |
Full redesign per spec |
src/app/features/invoices/invoices.component.ts |
New filter tabs |
src/app/features/invoices/components/workflow-stepper.component.ts |
CREATE |
| Raw SQL script | Add columns to invoices + invoice_line_items |
Existing Code to Reuse
backend/invoices/llamaparse.py— keep upload/poll/extract flow, just change modebackend/invoices/models.py— Invoice model already has 107 columns including vendor_match_confidence, property_match_confidence, is_duplicate, etc.src/app/features/invoices/invoice-detail.component.ts— keep left PDF panel entirely, redesign right panel onlysrc/app/features/invoices/invoice.service.ts— extend, don't replace- Vendor matching fields already exist in Invoice model (probable_vendor_name, vendor_matched, vendor_match_confidence, vendor_match_strategy)
- Property matching fields already exist (probable_property_name, property_matched, property_match_confidence)
- All CSS custom properties in
src/styles.css— reuse for consistency
Verification
python manage.py check— Django validationng build— Angular compilation- Upload test PDF → verify LlamaParse extraction (Cost Effective mode)
- Verify OpenRouter API call returns valid JSON
- Verify AI matching populates vendor/property/GL suggestions
- Walk through full UI flow: list → upload → scanning → review → approve
- Playwright test suite passes
pm2 restart unitcycle && pm2 save— deploy- Verify at https://demo.unitcycle.com/invoices