
Chapter 04 — Technology Stack

4.1 Vector Database & Embeddings

| Component | Technology | Key Specification |
| --- | --- | --- |
| Vector Database | Pinecone Serverless | Dual-namespace: System docs + User uploads |
| Embedding Model | Jina v3 (MRL) | 1024d truncated to 256d |
| MRL Strategy | Matryoshka Representation Learning | Embeddings truncatable with minimal quality loss |
| Storage Saving | MRL 256d vs 1024d | 75% reduction in storage cost |
| Accuracy Retained | Post-compression benchmark | ~95% retrieval accuracy maintained |
| Confidence Gating | Cosine similarity threshold | Confidence below 45% triggers HITL pause |
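
The truncation and gating rows above can be illustrated with a minimal sketch. It assumes that MRL truncation means keeping the leading 256 components and re-normalizing, and that the "confidence" compared against 45% is the cosine similarity of the top retrieved chunk; the function names are hypothetical and not taken from the project.

```python
import math

FULL_DIM = 1024         # Jina v3 output size, per the table
TRUNC_DIM = 256         # MRL target size, per the table
HITL_THRESHOLD = 0.45   # assumption: applied to the top-hit cosine similarity


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def truncate_mrl(vec, dim=TRUNC_DIM):
    """Keep the leading `dim` components, then re-normalize."""
    if len(vec) < dim:
        raise ValueError("vector is shorter than the target dimension")
    return l2_normalize(vec[:dim])


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def needs_hitl_pause(top_score, threshold=HITL_THRESHOLD):
    """True when the best match falls below the confidence threshold."""
    return top_score < threshold


def storage_saving(full_dim=FULL_DIM, trunc_dim=TRUNC_DIM):
    """Fraction of per-vector storage saved by truncation."""
    return 1.0 - trunc_dim / full_dim
```

With these dimensions, `storage_saving()` evaluates to 0.75, which matches the 75% figure in the table. Re-normalizing after truncation keeps cosine scores comparable between the query and the stored vectors.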

4.2 Databases & Storage

| Component | Technology | Role & Detail |
| --- | --- | --- |
| Primary DB | MongoDB (Motor Async) | Sliding-window chat history, user feedback, HITL chunk review queue |
| Data Retention | MongoDB TTL Index | 30-day GDPR-compliant auto-deletion on all user data |
| File Registry | Supabase (PostgreSQL) | Global file tracking, metadata management, upload dedup records |
| Object Storage | Supabase Storage | Secure PDF file bucket for indexed documents |
| Deduplication | SHA-256 File Fingerprint | Identical re-uploads detected and skipped; no re-indexing cost |
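
The deduplication row can be sketched as follows. This is an illustrative example only: `FileRegistry` is a hypothetical in-memory stand-in for the Supabase registry table, and the real schema and lookup logic are not described in this chapter.

```python
import hashlib
import io

# 30 days expressed in seconds, the unit a MongoDB TTL index expects
# in its expireAfterSeconds option.
TTL_SECONDS = 30 * 24 * 60 * 60


def sha256_fingerprint(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large PDFs are not loaded whole."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


class FileRegistry:
    """Hypothetical in-memory stand-in for the file registry."""

    def __init__(self):
        self._by_hash = {}  # fingerprint -> first filename seen

    def register(self, filename, stream):
        """Return (fingerprint, is_new); is_new is False for a re-upload."""
        fingerprint = sha256_fingerprint(stream)
        if fingerprint in self._by_hash:
            return fingerprint, False  # identical content: skip re-indexing
        self._by_hash[fingerprint] = filename
        return fingerprint, True


registry = FileRegistry()
_, first_is_new = registry.register("report.pdf", io.BytesIO(b"%PDF-demo"))
_, second_is_new = registry.register("copy.pdf", io.BytesIO(b"%PDF-demo"))
```

Because the fingerprint depends only on file bytes, a renamed copy of the same PDF is still detected as a duplicate (`second_is_new` is `False` here).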

4.3 Caching & Rate Limiting

| Feature | Technology | Configuration |
| --- | --- | --- |
| Response Cache | Upstash Redis | Exact query match cache, 1-hour TTL |
| Chat Rate Limit | Upstash Redis | 10 queries per minute per user |
| Upload Rate Limit | Upstash Redis | 5 file uploads per day per user |
| Session Tracking | Upstash Redis | Active user session state, serverless persistence |
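
The limits and cache TTL above could be implemented in several ways; the chapter does not say which algorithm is used. The sketch below assumes a fixed-window counter (the pattern usually built in Redis from an increment plus a key expiry) and uses plain dictionaries in place of Upstash Redis. Class names and the injectable `clock` parameter are hypothetical.

```python
import hashlib
import time


class FixedWindowLimiter:
    """Allow at most `limit` events per user per window (in-memory sketch)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._counters = {}  # (user_id, window_index) -> count

    def allow(self, user_id):
        window_index = int(self.clock() // self.window)
        key = (user_id, window_index)
        count = self._counters.get(key, 0) + 1
        self._counters[key] = count  # Redis would expire stale keys itself
        return count <= self.limit


class ExactMatchCache:
    """Cache responses keyed by the exact query string, with a TTL."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    @staticmethod
    def _key(query):
        # No normalization: only byte-identical queries share a key.
        return hashlib.sha256(query.encode("utf-8")).hexdigest()

    def get(self, query):
        key = self._key(query)
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]
            return None
        return value

    def set(self, query, value):
        self._store[self._key(query)] = (value, self.clock() + self.ttl)


chat_limiter = FixedWindowLimiter(limit=10, window_seconds=60)
upload_limiter = FixedWindowLimiter(limit=5, window_seconds=24 * 60 * 60)
response_cache = ExactMatchCache(ttl_seconds=3600)
```

A fixed window is simple but lets a user burst at a window boundary; a sliding-window or token-bucket limiter would smooth that at the cost of more state.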

Agentic Financial Parser v2.0 — Technical DocumentationPage 7

4.4 Parsing Engine — LlamaParse + PyMuPDF

Document parsing follows a multi-tiered strategy. The LlamaParse tier, and therefore the per-page cost, is selected dynamically based on document type. PyMuPDF is reserved exclusively as a free, local fallback for temporary user uploads, which protects the LlamaParse API quota from casual or test uploads.

| Tier | Parser | Cost | Use Case |
| --- | --- | --- | --- |
| Tier 1: Agentic Plus | LlamaParse | 45 credits/page | Visual-heavy documents: diagrams, infographics, charts |
| Tier 2: Agentic | LlamaParse | 10 credits/page | Complex structured legal text, dense tables |
| Tier 3: Cost Effective | LlamaParse | 1 credit/page | Standard running-text documents |
| Fallback (Free) | PyMuPDF | $0 (local) | Temporary user uploads only; API quota fully protected |
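
The routing described above can be sketched as a small lookup plus a cost estimate. The per-page credit values come from the table; the document-kind labels (`"visual"`, `"structured"`, `"text"`) and the function names are assumptions made for illustration, since the chapter does not describe how a document's type is detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParseTier:
    name: str
    parser: str
    credits_per_page: int


# Hypothetical document-kind labels mapped to the tiers in the table.
TIERS = {
    "visual": ParseTier("Tier 1: Agentic Plus", "LlamaParse", 45),
    "structured": ParseTier("Tier 2: Agentic", "LlamaParse", 10),
    "text": ParseTier("Tier 3: Cost Effective", "LlamaParse", 1),
}
FALLBACK = ParseTier("Fallback (Free)", "PyMuPDF", 0)


def select_tier(doc_kind, is_temporary_upload):
    """Temporary user uploads always take the free local fallback."""
    if is_temporary_upload:
        return FALLBACK
    try:
        return TIERS[doc_kind]
    except KeyError:
        raise ValueError(f"unknown document kind: {doc_kind!r}") from None


def estimate_credits(tier, page_count):
    """Estimated LlamaParse credits for a document of `page_count` pages."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return tier.credits_per_page * page_count


# A 20-page dense legal document on Tier 2 costs 10 * 20 = 200 credits.
legal_cost = estimate_credits(select_tier("structured", False), 20)
```

Checking the temporary-upload flag first is what keeps casual uploads from consuming API quota, regardless of how visually complex the file is.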

4.5 LLM & Observability

| Component | Technology | Detail |
| --- | --- | --- |
| LLM (Generation) | Qwen 2.5 72B via OpenRouter | Primary model for Node 7 generation and Hallucination Guard |
| Web Fallback | Tavily API | Real-time internet search for Node 6: OOS and HITL-authorized queries |
| Observability | Langfuse | LLM traces, token usage, latency, user feedback across all 9 nodes |
| Metrics Captured | Langfuse Cloud | Trace spans per node, bottleneck detection, cost per query, user satisfaction |
