Abstract

Project Overview: ContextForge

Subtitle: An Information-Theoretic Approach to Adversarial Resilience and Predictive Failover in Agentic Memory

Abstract

Stateless Retrieval-Augmented Generation (RAG) systems exhibit three systematic failure modes: zero adversarial block rate, high cold-start failover latency (~480 ms), and unconstrained context injection. We propose ContextForge, an agentic memory architecture that addresses these failures through information-theoretic mechanisms.

The central contribution is the Entropy-Gated Soft-Gate Ledger: an append-only event store that filters writes using two signals, Shannon entropy ($H$) and Lempel-Ziv Compression Density ($\rho$). This combination intercepts both obfuscated (high-$H$) payloads and repetition-based (low-$\rho$) attacks. To maintain developer velocity, a Tiered Clearance Logic grants verified-origin traffic an entropy buffer, while a Soft-Gate Quarantine preserves system availability during secondary validation.

Complementary mechanisms include a tri-core circuit-breaker router with entropy-triggered predictive failover and a Differential Context Injection (DCI) layer. Experimental results from high-fidelity synthetic benchmarking show a Weighted Composite Performance & Safety Index of $\Phi = 80.7\%$, a 68.9% reduction in failover latency, and an 87.4% reduction in token noise.

Core Architectural Pillars

Dual-Signal Security Gate ($H + \rho$):
Unlike standard pattern-matching, ContextForge treats security as a statistical signal. By measuring the "randomness" (entropy) and "compressibility" (density) of incoming data, the system can detect malicious injections, including base64-obfuscated commands and repetitive "mantra" attacks, without requiring a pre-defined library of threats.

Predictive Failover Engine:
The system monitors the health of LLM providers (Groq, Gemini, Ollama) via a circuit-breaker state machine. If an incoming prompt shows high entropy, the router anticipates a potential provider failure or rate limit and pre-warms a backup connection in the background, cutting recovery time by over 330 ms.

Differential Context Injection (DCI):
To solve "context clutter," a cosine-gated layer measures the semantic similarity of retrieved memory chunks. Chunks that fall below a strict relevance threshold are pruned before they reach the LLM, saving token costs and preventing model hallucinations.

Tiered Clearance & Soft-Quarantine:
Acknowledging the trade-off between security and usability, the system uses Verified Origin Headers (VOH) to trust known administrative traffic. Suspicious but non-definitive writes are moved to a quarantine state for asynchronous review, ensuring the primary agent loop never "locks up" due to a false positive.

Key Performance Metrics

$\Phi$ Index (Composite Safety): 80.7%
Adversarial Block Rate: +85.0 pp (vs. 0% baseline)
Failover Latency Reduction: 68.9% (480 ms $\to$ 149.5 ms)
Token Noise Reduction: 87.4%
Validation: 100% pass rate across 375 independent OMEGA-75 tests.

Technical Stack

Protocol: Model Context Protocol (MCP)
Languages & Libraries: Python, NumPy, AsyncIO, Sentence-Transformers
Data Layer: SQLite (append-only ledger)
Theories Applied: Information Theory (Shannon Entropy), Data Compression (LZ77), Vector Semantics (Cosine Similarity)

Author & Reproducibility

Author: Trilochan Sharma (Parnish)
Affiliation: Independent Researcher / AI Engineer
Reproducibility: All benchmarks and the OMEGA-75 suite are fully reproducible via the open-source engine at benchmark/engine.py.
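
As an illustration of the dual-signal gate ($H + \rho$) described above, the following is a minimal sketch and not the ContextForge implementation. It makes several assumptions that the text does not state: entropy is measured over the byte distribution, $\rho$ is taken to be compressed size divided by raw size, `zlib` (DEFLATE, which builds on LZ77) stands in for the Lempel-Ziv coder, and the thresholds `H_MAX`, `RHO_MIN`, and `VERIFIED_BUFFER` are hypothetical values chosen only for the example.

```python
import math
import zlib
from collections import Counter

# Hypothetical thresholds for illustration only; the source does not state them.
H_MAX = 5.5            # bits per byte above which a payload looks obfuscated
RHO_MIN = 0.20         # compressed/raw ratio below which a payload looks repetitive
VERIFIED_BUFFER = 0.3  # extra entropy headroom granted to verified-origin traffic


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def compression_density(data: bytes) -> float:
    """Compressed size over raw size; zlib is an assumed stand-in for an LZ77 coder."""
    if not data:
        return 1.0
    return len(zlib.compress(data, 9)) / len(data)


def gate(payload: str, verified_origin: bool = False) -> str:
    """Return 'accept' or 'quarantine' for a candidate ledger write."""
    data = payload.encode("utf-8")
    h_limit = H_MAX + (VERIFIED_BUFFER if verified_origin else 0.0)
    if shannon_entropy(data) > h_limit:
        return "quarantine"  # high-H signal: possibly obfuscated content
    if compression_density(data) < RHO_MIN:
        return "quarantine"  # low-rho signal: highly repetitive content
    return "accept"
```

Under these assumed thresholds, an ordinary English sentence passes both checks, a long base64 string of random bytes trips the entropy check, and a phrase repeated dozens of times trips the density check. Returning a quarantine verdict rather than a hard rejection mirrors the soft-gate idea: the write is held for secondary validation instead of blocking the caller.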
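
The cosine gating behind Differential Context Injection (DCI) can be sketched in a similar way. This is a hedged example under stated assumptions, not the project's code: the function names, the `(text, vector)` chunk layout, and the `RELEVANCE_THRESHOLD` value are all hypothetical, and the short vectors in the usage note stand in for real embeddings (for instance from Sentence-Transformers, which the technical stack lists).

```python
import math
from typing import List, Sequence, Tuple

RELEVANCE_THRESHOLD = 0.75  # hypothetical cutoff; the source does not give a value


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune_chunks(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    threshold: float = RELEVANCE_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Keep only chunks whose similarity to the query meets the threshold,
    returned as (text, score) pairs sorted from most to least relevant."""
    kept = []
    for text, vec in chunks:
        score = cosine_similarity(query_vec, vec)
        if score >= threshold:
            kept.append((text, score))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept
```

For example, with a query vector `[1.0, 0.0]` and chunks embedded as `[1.0, 0.0]`, `[1.0, 1.0]`, and `[0.0, 1.0]`, the similarities are 1.0, roughly 0.707, and 0.0, so only the first chunk survives the assumed 0.75 cutoff. Lowering the threshold admits more context at the cost of more tokens, which is the trade-off the DCI layer is meant to control.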