Chapter 04 — Technology Stack
4.1 Vector Database & Embeddings
| Component | Technology | Key Specification |
|---|---|---|
| Vector Database | Pinecone Serverless | Dual-namespace: System docs + User uploads |
| Embedding Model | Jina v3 (MRL) | 1024d compressed to 256d |
| MRL Strategy | Matryoshka Representation Learning | Embeddings can be truncated to shorter vectors with minimal quality loss |
| Storage Saving | MRL 256d vs 1024d | 75% reduction in storage cost |
| Accuracy Retained | Post-compression benchmark | ~95% retrieval accuracy maintained |
| Confidence Gating | Cosine similarity threshold | Confidence below 45% triggers a human-in-the-loop (HITL) pause |
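
The truncation and gating rows above can be illustrated with a minimal sketch. It assumes that MRL truncation means keeping the first 256 components and re-normalizing, and that "45% confidence" corresponds to a cosine similarity of 0.45; both readings are assumptions, and the function names are hypothetical rather than taken from the project code.

```python
import math

FULL_DIM = 1024         # Jina v3 output size, from the table
TRUNCATED_DIM = 256     # MRL-truncated size, from the table
HITL_THRESHOLD = 0.45   # assumption: "45% confidence" read as cosine similarity 0.45


def truncate_embedding(vec, dim=TRUNCATED_DIM):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def needs_human_review(best_score, threshold=HITL_THRESHOLD):
    """Return True when the best retrieval score falls below the gate."""
    return best_score < threshold


# Fraction of per-vector storage saved by keeping 256 of 1024 dimensions.
storage_saving = 1 - TRUNCATED_DIM / FULL_DIM
```

Keeping 256 of 1024 dimensions stores one quarter of the original values per vector, which is where the 75% storage figure comes from.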
4.2 Databases & Storage
| Component | Technology | Role & Detail |
|---|---|---|
| Primary DB | MongoDB (Motor Async) | Sliding Window Chat History, User Feedback, HITL Chunk Review queue |
| Data Retention | MongoDB TTL Index | GDPR-compliant automatic deletion of all user data after 30 days |
| File Registry | Supabase (PostgreSQL) | Global file tracking, metadata management, upload dedup records |
| Object Storage | Supabase Storage | Secure PDF file bucket for indexed documents |
| Deduplication | SHA-256 File Fingerprint | Identical re-uploads detected and skipped — no re-indexing cost |
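
The deduplication row can be sketched as follows. The `FileRegistry` class is a hypothetical in-memory stand-in for the Supabase file registry, not the project's actual schema; only the use of SHA-256 as the fingerprint comes from the table.

```python
import hashlib

# 30-day retention expressed in seconds. MongoDB TTL indexes take this kind of
# value through the `expireAfterSeconds` index option.
RETENTION_SECONDS = 30 * 24 * 60 * 60


def file_fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


class FileRegistry:
    """Hypothetical in-memory stand-in for the Supabase file registry."""

    def __init__(self):
        self._by_hash = {}  # fingerprint -> first filename seen

    def register(self, filename: str, data: bytes):
        """Return (fingerprint, is_new); a known fingerprint is not re-indexed."""
        digest = file_fingerprint(data)
        if digest in self._by_hash:
            return digest, False
        self._by_hash[digest] = filename
        return digest, True
```

Because the fingerprint is computed from file contents rather than the filename, a renamed copy of an already indexed PDF is still detected as a duplicate.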
4.3 Caching & Rate Limiting
| Feature | Technology | Configuration |
|---|---|---|
| Response Cache | Upstash Redis | Exact query match cache — 1 hour TTL |
| Chat Rate Limit | Upstash Redis | 10 queries per minute per user |
| Upload Rate Limit | Upstash Redis | 5 file uploads per day per user |
| Session Tracking | Upstash Redis | Active user session state — serverless persistence |
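
The limits and cache TTL in this table can be illustrated with an in-memory sketch. It assumes a fixed-window counter, since the document does not say which rate-limiting algorithm is used, and both classes are hypothetical stand-ins for logic that would run against Upstash Redis.

```python
import time

CHAT_LIMIT = (10, 60)             # 10 queries per 60 seconds, from the table
UPLOAD_LIMIT = (5, 24 * 60 * 60)  # 5 uploads per day, from the table
CACHE_TTL_SECONDS = 60 * 60       # 1 hour, from the table


class FixedWindowLimiter:
    """Counts events per user in fixed time windows (assumed algorithm)."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        # (user_id, window_index) -> count. Old windows are never pruned here;
        # a Redis-backed version would let the counter key expire instead.
        self._counts = {}

    def allow(self, user_id, now=None):
        """Record one event and return True, or return False if over the limit."""
        now = time.time() if now is None else now
        key = (user_id, int(now // self.window_seconds))
        count = self._counts.get(key, 0)
        if count >= self.max_events:
            return False
        self._counts[key] = count + 1
        return True


class ExactMatchCache:
    """Caches responses keyed by the exact query string, with a TTL."""

    def __init__(self, ttl_seconds=CACHE_TTL_SECONDS):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # query -> (expires_at, response)

    def set(self, query, response, now=None):
        now = time.time() if now is None else now
        self._entries[query] = (now + self.ttl_seconds, response)

    def get(self, query, now=None):
        """Return the cached response, or None if missing or expired."""
        now = time.time() if now is None else now
        entry = self._entries.get(query)
        if entry is None or now >= entry[0]:
            self._entries.pop(query, None)
            return None
        return entry[1]
```

The `now` parameter exists only to make the behavior easy to check; with it omitted, both classes fall back to the system clock.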
Agentic Financial Parser v2.0 — Technical DocumentationPage 7
4.4 Parsing Engine — LlamaParse + PyMuPDF
Document parsing follows a multi-tiered strategy. LlamaParse assigns a parsing tier, and therefore a per-page cost, based on document type. PyMuPDF is reserved exclusively as a free, local fallback for temporary user uploads, which protects the LlamaParse API quota from casual or test uploads.
| Tier | Parser | Cost | Use Case |
|---|---|---|---|
| Tier 1 — Agentic Plus | LlamaParse | 45 cr/pg | Visual-heavy documents: diagrams, infographics, charts |
| Tier 2 — Agentic | LlamaParse | 10 cr/pg | Complex structured legal text, dense tables |
| Tier 3 — Cost Effective | LlamaParse | 1 cr/pg | Standard running text documents |
| Fallback (Free) | PyMuPDF | $0 (local) | Temporary user uploads only; API quota fully protected |
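
The tier table can be read as a routing rule plus a cost estimate, sketched below. The document-kind labels (`"visual"`, `"structured"`, `"text"`) and the function names are hypothetical; only the per-page credit figures and the rule that temporary uploads go to PyMuPDF come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    parser: str
    credits_per_page: int


# Credit costs are taken from the table; the dictionary keys are illustrative.
TIERS = {
    "visual": Tier("Tier 1: Agentic Plus", "LlamaParse", 45),
    "structured": Tier("Tier 2: Agentic", "LlamaParse", 10),
    "text": Tier("Tier 3: Cost Effective", "LlamaParse", 1),
}
FALLBACK = Tier("Fallback (Free)", "PyMuPDF", 0)


def select_tier(doc_kind: str, is_temporary_upload: bool) -> Tier:
    """Route temporary uploads to the free fallback, everything else by kind."""
    if is_temporary_upload:
        return FALLBACK
    return TIERS[doc_kind]


def estimate_credits(doc_kind: str, pages: int, is_temporary_upload: bool = False) -> int:
    """Estimate LlamaParse credits for a document of `pages` pages."""
    return select_tier(doc_kind, is_temporary_upload).credits_per_page * pages
```

Under these assumptions, a 10-page visual-heavy document costs 450 credits at Tier 1, while the same file sent as a temporary upload costs nothing because it never reaches LlamaParse.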
4.5 LLM & Observability
| Component | Technology | Detail |
|---|---|---|
| LLM (Generation) | Qwen 2.5 72B via OpenRouter | Primary model for Node 7 generation and Hallucination Guard |
| Web Fallback | Tavily API | Real-time internet search for Node 6 — OOS and HITL-authorized queries |
| Observability | Langfuse | LLM traces, token usage, latency, user feedback across all 9 nodes |
| Metrics Captured | Langfuse Cloud | Trace spans per node, bottleneck detection, cost per query, user satisfaction |