Deploying AI-powered sales assistants in developing-economy markets is constrained not by model capability but by three interdependent barriers: vector database infrastructure costs that price AI out of reach for micro-enterprises; model vendor lock-in that prevents cost optimization through provider switching; and hallucination in LLM outputs that undermines trust in commercial contexts. We present SalesCatalog, a production multi-tenant SaaS platform that addresses these barriers through four subsystems: (1) a pure-SQLite vector database with a hybrid retrieval pipeline combining BM25 keyword search, Reciprocal Rank Fusion (k=60), Maximal Marginal Relevance diversity re-ranking (λ=0.7), and adaptive thresholding with similarity gap analysis, requiring zero native extensions and eliminating per-tenant infrastructure costs; (2) a 3-gate LLM orchestrator that routes prompts across four specialized AI agents, OBIT-1 (catalog queries, Llama 3.1 8B), OBIT-2 (complex analysis, GPT-OSS 20B), OBIT-Docs (documentation retrieval, hybrid search), and Alex (legacy), using dynamic confidence thresholds and session-aware context-change detection; (3) an 11-step anti-hallucination pipeline with chain-of-thought grounding verification that resolved all factual errors on our validation set (N=19); and (4) a three-tier context management system spanning per-turn rolling summaries, cross-conversation session memory, and domain-level grand summaries for context-aware routing. We evaluate each subsystem on accuracy, latency, and cost benchmarks, and discuss implications for sustainable
Publication Date: 2026-06-13
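
To make the fusion step named in the abstract concrete, the following is a minimal sketch of Reciprocal Rank Fusion with the stated constant k=60. It is not the SalesCatalog implementation: the function name `reciprocal_rank_fusion`, the assumption that the fused inputs are a BM25 ranking and a vector-similarity ranking, and the sample SKU identifiers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one scored ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank is its 1-based position in that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID so the
    # output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings from two retrievers over the same catalog.
bm25_ranking = ["sku-12", "sku-07", "sku-33"]
vector_ranking = ["sku-07", "sku-33", "sku-12"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.6f}")
```

Because the fusion depends only on rank positions, it needs no score normalization between the keyword and vector retrievers, which is one plausible reason to prefer it in a setting with no native extensions.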