
Chapter 13: Known Edge Cases & Future Roadmap

| Edge Case | Description | Current Impact | Proposed Fix |
| --- | --- | --- | --- |
| Seventh Schedule Overlap | Schedule entries use numbered lists ("19. Price control", "34. Betting"). The chunker's regex tags these as article_number: "19". | Low: the LLM correctly differentiates in its response but adds an unnecessary note | Enhance the regex to verify that the chunk resides within Parts I–XXII before tagging |
| General Conceptual Queries | Queries like "What are Fundamental Rights?" don't specify an article number, so the metadata filter is not applied. | Medium: relies on semantic search, which is good but not 100% precise | Add part metadata (e.g., part: "III") to enable Part-level filtering |
| Cross-Article References | Some articles reference others (e.g., Article 32 protects rights under Article 19). | Low: the system answers correctly for individual articles but doesn't auto-link related ones | Future: build a Knowledge Graph (Neo4j) to map inter-article relationships |
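The first two proposed fixes can be sketched together. The snippet below is a minimal illustration, not the case study's actual chunker: the regex patterns, the `tag_chunks` function, and the line-by-line input format are all assumptions made for the example. It tracks whether the current line sits under a Part heading or a Schedule heading, and attaches `article_number` and `part` metadata only in the former case.

```python
import re

# Hypothetical patterns: the real chunker's regexes are not shown in the case study.
PART_HEADING = re.compile(r"^\s*PART\s+([IVXLC]+)\b")
SCHEDULE_HEADING = re.compile(r"^\s*(?:[A-Z]+\s+)?SCHEDULE\b")
NUMBERED_ENTRY = re.compile(r"^\s*(\d+[A-Z]?)\.\s+\S")


def tag_chunks(lines):
    """Tag numbered entries as articles only while inside a Part."""
    current_part = None  # None means "not inside any Part" (e.g. in a Schedule)
    tagged = []
    for line in lines:
        part_match = PART_HEADING.match(line)
        if part_match:
            current_part = part_match.group(1)
            continue
        if SCHEDULE_HEADING.match(line):
            # Numbered entries below this heading are list items, not articles.
            current_part = None
            continue
        entry = NUMBERED_ENTRY.match(line)
        if not entry:
            continue
        metadata = {"text": line.strip()}
        if current_part is not None:
            metadata["article_number"] = entry.group(1)
            metadata["part"] = current_part  # enables Part-level filtering
        tagged.append(metadata)
    return tagged


sample = [
    "PART III",
    "19. Protection of certain rights regarding freedom of speech, etc.",
    "SEVENTH SCHEDULE",
    "34. Betting and gambling.",
]
chunks = tag_chunks(sample)
```

With this guard, the Schedule entry is still indexed as text but carries no `article_number`, so an article-level metadata filter can no longer match it by accident.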

Hallucination-Resistant RAG: Constitution of India Case Study

Chapter 14: Key Takeaways (Interview-Ready)

Interview Prep

If asked: "How did you handle RAG for a complex legal document?"

The Architecture Pattern

| Layer | Technique | Industry Term |
| --- | --- | --- |
| Parsing | PyMuPDF + Custom Regex Cleaning | Structure-Aware Document Parsing |
| Chunking | Article-boundary splits with parent → child hierarchy | Hierarchical Chunking / Multi-Granularity Chunking |
| Metadata | Dynamic Article number extraction & injection | Metadata-Enriched Vector Indexing |
| Retrieval | LLM Router → Metadata Filter → Pinecone | Hybrid Retrieval with Metadata Filtering |
| Overall | 8-Node LangGraph Pipeline with classification, retrieval, generation, hallucination guard | Agentic RAG Architecture |
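To make the Retrieval row concrete, here is a minimal sketch of the "Router → Metadata Filter" step. It assumes a regex stands in for the LLM router, and the `build_filter` function and `PART_KEYWORDS` map are hypothetical names introduced for illustration. The returned dictionary uses the MongoDB-style `$eq` operator that Pinecone documents for metadata filters. It would typically be passed as the `filter` argument of a query, which is not shown here.

```python
import re

# Stand-in for the LLM router: a plain regex that spots explicit article references.
ARTICLE_REF = re.compile(r"\barticle\s+(\d+[A-Z]?)\b", re.IGNORECASE)

# Hypothetical keyword-to-Part map for conceptual queries (illustrative, not exhaustive).
PART_KEYWORDS = {
    "fundamental rights": "III",
    "directive principles": "IV",
}


def build_filter(query):
    """Return a metadata filter dict, or None to fall back to pure semantic search."""
    match = ARTICLE_REF.search(query)
    if match:
        # Exact-article lookup: restrict retrieval to chunks tagged with this article.
        return {"article_number": {"$eq": match.group(1).upper()}}
    lowered = query.lower()
    for phrase, part in PART_KEYWORDS.items():
        if phrase in lowered:
            # Part-level filtering for queries that name a topic, not an article.
            return {"part": {"$eq": part}}
    return None


article_filter = build_filter("What does Article 19 guarantee?")
part_filter = build_filter("What are Fundamental Rights?")
no_filter = build_filter("Who chaired the drafting committee?")
```

Returning `None` for unmatched queries keeps the fallback explicit: the caller runs an unfiltered semantic search rather than guessing a filter.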

The One-Liner

"I replaced expensive LLM parsing with a deterministic, structure-aware pipeline using Hierarchical Chunking and strict metadata filtering, achieving hallucination-resistant retrieval on a 400-page legal document at zero parsing cost."

End of Case Study. © Ambuj Kumar Tripathi | All Rights Reserved