
Deployment Challenges & Fixes

These are real bugs from real deployments. Not hypothetical. Not from documentation. From actual crashes, error logs, and 2am debugging sessions.


Dependency Conflicts

| Conflict | Error | Fix |
| --- | --- | --- |
| langchain-core version mismatch | ImportError: cannot import name 'RunnablePassthrough' | Pin langchain-core==0.1.52, langchain==0.1.20 |
| itsdangerous + Starlette | TypeError: argument of type 'NoneType' in session middleware | Pin itsdangerous==2.1.2 |
| langfuse + pydantic v2 | ValidationError on every trace | Pin langfuse==2.7.3 with pydantic<2.0 |

Lesson: Lock every version in requirements.txt. Run pip freeze > requirements.txt after every working state.
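One way to catch version drift before it surfaces as an ImportError is to compare the installed packages against the pins at startup. The sketch below is an illustration, not part of the original project: parse_pins and find_mismatches are hypothetical helper names, and it only understands plain name==version lines (no extras, markers, or ranges).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip().lower()] = pinned.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} where the two differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling find_mismatches(parse_pins(open("requirements.txt").read())) at startup and logging the result makes a drifted environment visible in the first lines of the deploy log.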


Missing Imports That Aren't in Docs

# The old langchain.* import paths fail silently or with cryptic errors. Use these instead:
from langchain_community.vectorstores import Chroma # NOT langchain.vectorstores
from langchain_community.document_loaders import PyMuPDFLoader # NOT langchain.document_loaders

# Always use langchain_community for:
# - Vector stores (Chroma, Qdrant, Pinecone)
# - Document loaders (PyMuPDF, Unstructured, WebBase)
# - Embeddings (HuggingFace, Jina, Cohere)

Embedding Model Deprecation

text-embedding-004 → Deprecated Jan 14, 2026. Returns HTTP 404.
embedding-001 → Deprecated. Returns HTTP 404.
gemini-embedding-001 → Current as of 2026 (but quota problems)

# If a Google embedding call returns 404, check the current model list:
# https://ai.google.dev/gemini-api/docs/models
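A deprecated model is easier to diagnose if the app probes its embedding model once at startup and fails with a clear message. The sketch below shows that idea under stated assumptions: ModelNotFound is a hypothetical stand-in for whatever 404 error your client raises, and embed is any callable you supply that wraps the real API call. Note that each probe spends one request of quota.

```python
class ModelNotFound(Exception):
    """Hypothetical stand-in for the client's HTTP 404 error."""


def first_working_model(candidates, embed):
    """Return the first model name that embed() accepts.

    embed(model_name, text) is assumed to raise ModelNotFound
    when the model has been removed.
    """
    for name in candidates:
        try:
            embed(name, "ping")
        except ModelNotFound:
            continue  # deprecated or unknown model, try the next one
        return name
    raise RuntimeError(f"No usable embedding model among: {candidates}")


def fake_embed(model_name, text):
    """Fake client for illustration only; mimics the 404s listed above."""
    if model_name in {"text-embedding-004", "embedding-001"}:
        raise ModelNotFound(model_name)
    return [0.0]
```

With the fake client, first_working_model(["text-embedding-004", "embedding-001", "gemini-embedding-001"], fake_embed) skips the two removed models and settles on the third.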

Embedding Quota Exhaustion on Deploy

Problem: The Google Gemini free tier allows 1500 requests/month. Every cold start re-embeds the entire corpus (300+ API calls), so about five deploys or restarts exhaust the month's quota.

Fix: Pre-build ChromaDB locally, commit to git.

# Build once locally:
python scripts/build_vectordb.py

# Commit the built DB:
git add chroma_db/
git commit -m "Pre-built ChromaDB — skip embedding at startup"

# Render pulls this on deploy — loads from disk
# Startup: 8s. Zero API calls for embeddings.
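The pre-built approach only helps if startup never falls back to re-embedding by accident. A minimal sketch of that guard follows, assuming the committed directory is chroma_db/ as above; has_prebuilt_db and startup_mode are hypothetical names, and "contains at least one file" is a deliberately crude test for a usable database.

```python
from pathlib import Path


def has_prebuilt_db(db_dir):
    """True if db_dir exists and contains at least one file."""
    path = Path(db_dir)
    return path.is_dir() and any(p.is_file() for p in path.rglob("*"))


def startup_mode(db_dir, allow_rebuild=False):
    """Decide how to start: 'load' from disk or 'rebuild' via the API.

    Rebuilding is opt-in, so a missing directory fails loudly
    instead of silently spending embedding quota.
    """
    if has_prebuilt_db(db_dir):
        return "load"
    if allow_rebuild:
        return "rebuild"
    raise RuntimeError(
        f"{db_dir} is missing or empty; build it locally and commit it"
    )
```

Keeping allow_rebuild off in production means a bad deploy shows up as an explicit startup error rather than an exhausted quota.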

Memory Optimization Journey

| Stage | RAM Usage | What Changed |
| --- | --- | --- |
| Initial | ~850MB (OOM) | HuggingFace model + spaCy lg + full LangChain |
| After embedding fix | ~650MB (OOM) | Switched to API embeddings |
| After spaCy fix | ~450MB | Replaced spaCy with custom regex |
| After model pruning | ~250–400MB | Removed unused LangChain components |
| Final | ~250–400MB ✅ | Stable on 512MB Render |
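Numbers like these can be collected by logging peak memory after each startup stage. The sketch below uses the standard-library resource module, which is Unix-only; peak_rss_mb and log_stage are hypothetical helper names, and the 512 MB default simply mirrors the Render limit mentioned above.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process, in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes; macOS reports bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_stage(stage, limit_mb=512):
    """Print peak memory so far and return it in MB."""
    used = peak_rss_mb()
    print(f"[mem] {stage}: peak {used:.0f} MB of {limit_mb} MB")
    return used
```

Peak RSS never goes down within a process, so comparing two configurations means running each in a fresh process and calling log_stage after the imports and model loads you want to measure.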

ISP DNS Block (JioFiber + Qdrant)

Problem: JioFiber blocked *.cloud.qdrant.io at the DNS level during local development. Connections failed with ConnectionError: Name or service not known.

Fix: Monkey-patch Python's socket resolver for Qdrant hostnames only.

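import socket
import dns.resolver  # pip install dnspython

_original = socket.getaddrinfo

# dns.resolver.resolve() takes no nameservers argument,
# so configure a dedicated Resolver instead.
_resolver = dns.resolver.Resolver(configure=False)
_resolver.nameservers = ['8.8.8.8']  # Google DNS

def _patched(host, port, *args, **kwargs):
    # Intercept Qdrant hostnames only; everything else resolves normally.
    if isinstance(host, str) and host.endswith('qdrant.io'):
        resolved = _resolver.resolve(host, 'A')
        host = resolved[0].address
    return _original(host, port, *args, **kwargs)

socket.getaddrinfo = _patched
# Only needed for local development; Render has normal DNS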

ChromaDB Telemetry Deadlock

Problem: ChromaDB sends telemetry on startup. Under Render's network restrictions, the DNS lookup for that call hung indefinitely, blocking the entire startup.

Fix:

import chromadb
from chromadb.config import Settings

client = chromadb.Client(Settings(
    anonymized_telemetry=False  # Disable the DNS call entirely
))

Lesson: Any telemetry/analytics call in your dependencies can deadlock startup on restricted networks. Always check for telemetry settings.
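When a dependency offers no switch for its network calls, a defensive option is to run the risky call with a deadline so a hang cannot block startup. The sketch below is one generic way to do that with a daemon thread; run_with_timeout is a hypothetical helper, not something ChromaDB or Render provides, and the abandoned thread keeps running in the background until the process exits.

```python
import threading


def run_with_timeout(fn, timeout_s, default=None):
    """Run fn() in a daemon thread; return default if it does not finish in time."""
    result = {}

    def _target():
        try:
            result["value"] = fn()
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    worker = threading.Thread(target=_target, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        # Still hanging: give up waiting. The daemon thread will not
        # keep the process alive at shutdown.
        return default
    if "error" in result:
        raise result["error"]
    return result["value"]
```

This suits optional startup work such as telemetry or warm-up pings. It is not a fix for calls the app genuinely depends on, since those still need to succeed.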