Machine Learning One

Production Hardening and Future Directions

Defense in depth, error recovery, performance tuning, and where the project goes next.

Over the previous thirteen articles we built an agentic runtime from scratch: a WASM engine, a capability plugin system, a preprocessor, an LLM-driven ReAct loop, an HTTP server with SSE streaming, and a React frontend. This final article steps back to examine the security model, resilience patterns, performance characteristics, and where the project goes from here.

Defense in Depth

The system's security model is not a single wall -- it is seven concentric layers. Each layer assumes the layers outside it might fail. This matters because the core threat model is adversarial: an LLM generates executable code based on user input, and both the user and the LLM can produce surprising things.

Layer 1: WASM Sandbox (Structural Isolation)

WebAssembly provides memory isolation by construction. A WASM module can only access its own linear memory. It cannot read the host process's heap, call arbitrary system functions, or access the file system. This is not a policy -- it is a structural property of the WASM execution model enforced by the Wasmtime runtime at the virtual machine level.

What it prevents: Memory corruption, arbitrary code execution, access to host process state.

Layer 2: Capability-Based Imports (Explicit Privilege)

A WASM module has zero host capabilities by default. Every function it can call -- HTTP, filesystem, AI, KV storage -- must be explicitly registered on the linker by the host. The guest program can only use what the host chooses to provide.

This is the principle of least privilege applied at the API boundary. If you don't register FsCapability, no WASM program can touch the filesystem, regardless of what the LLM generates.

What it prevents: Unauthorized access to host resources, privilege escalation.

Layer 3: Fuel Metering (Instruction Budget)

Wasmtime's fuel system assigns a finite instruction budget to each execution. Every WASM instruction consumes fuel. When fuel runs out, execution halts with a trap -- not a graceful error, but a hard stop.

This prevents infinite loops and excessive computation. Without fuel metering, a malicious or buggy program could spin forever and lock the async runtime.

What it prevents: Denial of service via infinite loops, runaway computation.

Layer 4: Memory Limits (Allocation Budget)

The engine configuration caps linear memory growth:

pub struct EngineConfig {
    pub max_memory_bytes: u64,  // Default: 16MB
    pub initial_pages: u32,     // Default: 1 (64KB)
    // ...
}

If a WASM program tries to grow memory beyond the limit, the memory.grow instruction returns -1 (failure). Combined with the bump allocator, this means a single execution cannot consume unbounded memory.

What it prevents: Memory exhaustion, host OOM.

Layer 5: Domain Allowlists (Network Boundary)

The HTTP capability accepts an optional list of allowed domains. If configured, any request to a domain not on the list is rejected before the network call is made.

What it prevents: Server-side request forgery (SSRF), data exfiltration to arbitrary endpoints.

Layer 6: Path Jailing (Filesystem Boundary)

The filesystem capability takes a root path at construction time. All file operations are resolved relative to this root, and path traversal attempts (e.g., ../../etc/passwd) are caught and rejected.

What it prevents: Directory traversal, unauthorized file access outside the jail.

Layer 7: Content Guard (Semantic Boundary)

The outermost layer operates on meaning rather than mechanism. It uses an LLM to classify user inputs and agent outputs for safety violations before they enter or leave the system.

Content Guard Details

The content guard is an AI-based pre/post-processing filter. It runs separately from the agent's main LLM and classifies text into safety categories.

GuardVerdict

The guard returns one of two verdicts:

pub enum GuardVerdict {
    Safe,
    Blocked {
        category: String,
        severity: String,
    },
}

The categories are: SAFE, HARASSMENT, HATE_SPEECH, SEXUALLY_EXPLICIT, DANGEROUS_CONTENT, and JAILBREAK. Each has an associated severity: none, low, medium, or high. Content classified as SAFE, or with low severity regardless of category, passes through. Anything medium or high is blocked.

Classification Flow

The guard sends the text to the LLM with a classification prompt and expects a JSON response:

const CLASSIFICATION_PROMPT: &str = r#"You are a content safety classifier.
Analyze the following text and classify it into exactly one category.
...
Respond with ONLY a JSON object, no other text:
{"category": "<CATEGORY>", "severity": "<none|low|medium|high>"}"#;

The response parser (extract_json) handles the common case where the LLM wraps its JSON in extra text -- it finds the first { and matches braces to extract the JSON object.

Graceful Degradation

If the LLM returns malformed output that cannot be parsed as JSON at all, the guard defaults to Safe:

let parsed: serde_json::Value = serde_json::from_str(json_str)
    .unwrap_or_else(|_| serde_json::json!({"category": "SAFE", "severity": "none"}));

This is a deliberate choice. The guard is a best-effort filter, not a gatekeeper. A transient LLM failure should not block legitimate requests. The other six security layers still apply regardless of whether the guard is functioning.

Pre-Guard and Post-Guard

The guard runs twice per request:

  1. Pre-guard: Before the ReAct loop begins, the user's message is classified. If blocked, a category-specific safety message is returned immediately and no WASM execution occurs.

  2. Post-guard: After the agent produces its final response, the output text is classified. If blocked, the response is replaced with a safety message. This catches cases where the agent's generated content is problematic even though the user's input was not.

Each safety category has a tailored message:

pub fn safety_message(category: &str) -> &'static str {
    match category {
        "HARASSMENT" => "Your message was flagged for potentially harmful content...",
        "JAILBREAK" => "Your message appears to be attempting to circumvent...",
        // ...
    }
}

Resilience Patterns

Compilation Retry

The LLM generates WAT code, and sometimes that code has syntax errors. Rather than failing immediately, the agent asks the LLM to fix its own mistake:

if retries < self.config.max_retries
    && (error_msg.contains("compilation failed")
        || error_msg.contains("Compilation"))
{
    retries += 1;
    messages.push(Message::system(&format!(
        "Your WAT failed to compile: {}. Fix it and respond with \
         the corrected ToolCall::Wat(```wat ... ```) block.",
        error_msg
    )));

    if let Ok(retry_response) = self.client.complete(messages).await {
        if let Parsed::Wat { body: new_body, .. } =
            parse_response(&retry_response)
        {
            current_body = new_body;
            continue;
        }
    }
}

The compilation error message is fed back as context. The LLM sees what went wrong and produces a corrected version. This self-healing loop runs up to max_retries times (default: 2). It only triggers on compilation errors -- runtime traps are not retried because the WAT was syntactically valid.

Session TTL

Sessions accumulate conversation history and KV state in memory. Without cleanup, abandoned sessions would leak memory indefinitely. The SessionStore runs a passive cleanup on every access:

pub fn cleanup_expired(&mut self) {
    let now = Instant::now();
    self.sessions
        .retain(|_, s| now.duration_since(s.last_active) < self.ttl);
}

This is called inside get_or_create, so expired sessions are reaped whenever any session is accessed. The default TTL is one hour. This approach avoids the complexity of a background reaper thread while still bounding memory growth.

Graceful Shutdown

The server installs a Ctrl-C handler via tokio::signal::ctrl_c() and passes it to Axum's graceful shutdown:

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await?;

When the signal arrives, Axum stops accepting new connections and waits for in-flight requests (including active SSE streams) to complete before shutting down. No requests are dropped mid-response.

Structured Error Codes

Error codes propagate through the entire stack, from WASM exit codes to SSE events:

  Code  Name     Meaning
  0     SUCCESS  Execution completed normally
  1     ETRFM    Serialization/deserialization error
  2     ENOMEM   Memory allocation failed
  3     EACCESS  Permission denied (capability restriction)
  4     ENOENT   Resource not found
  5     EBOUND   Bounds check failed
  6     EREMOTE  Remote call failed (HTTP, AI)
  7     EPARSE   Parse error

A WASM program returns one of these as its i32 exit code. The agent formats it into an observation message. The SSE stream delivers it to the frontend as a structured tool_result event with success: false. At no point does the error turn into an unstructured string that loses information.

Performance Patterns

Module Caching by Content Hash

The engine maintains a compile cache keyed by the content hash of WAT/WASM source:

pub fn compile_wat(&self, wat: &str) -> Result<Module> {
    let hash = hash_bytes(wat.as_bytes());
    {
        let cache = self.cache.read().unwrap();
        if let Some(cached) = cache.get(&hash) {
            return Ok(Module::new(cached.clone(), hash));
        }
    }
    let module = wasmtime::Module::new(&self.inner, wat)
        .map_err(|e| Error::Compilation(e.to_string()))?;
    self.cache.write().unwrap().insert(hash, module.clone());
    Ok(Module::new(module, hash))
}

Compilation is the most expensive operation in the hot path (tens of milliseconds for non-trivial programs). Caching by content hash means identical programs compile once and are reused. The agent layer adds a second cache for pre-linked modules (compilation + linking done together), and both caches use Arc<RwLock> for concurrent access.

Pre-Warming Catalog Programs

At startup, the agent compiles and pre-links every program in the WAT catalog:

let pre_cache: HashMap<u64, PreCacheEntry> = {
    let template = Self::make_template(&config);
    let mut cache = HashMap::new();
    for program in catalog.programs() {
        match template.assemble(&program.body) {
            Ok(ar) => {
                let hash = content_hash(ar.wat.as_bytes());
                match engine.compile_wat(&ar.wat) {
                    Ok(module) => match linker.pre_link(&module) {
                        Ok(linked) => {
                            cache.insert(hash, (linked, ar.data_sections, ar.initial_top));
                        }
                        // ... error handling
                    }
                    // ... error handling
                }
            }
            // ... error handling
        }
    }
    cache
};

This eliminates first-request latency for library programs. When the agent dispatches @use http_get(url), the compiled module is already in the cache. The execution path goes directly to instantiation, skipping template assembly, WAT compilation, and linking.

Arc<RwLock> for Shared KV State

Each session's KV state is wrapped in Arc<RwLock<KvState>>. The Arc is cloned (O(1)) when passing state to WASM executions. The RwLock allows concurrent reads from multiple ReAct steps while serializing writes. Since KV operations within a single execution are sequential (WASM is single-threaded), lock contention is minimal in practice.

ArcSwap for Lock-Free Router Hot-Swapping

The dynamic integration router uses arc_swap::ArcSwap<Router>:

pub struct ServerState {
    pub agent: Arc<Agent>,
    pub integration_router: ArcSwap<Router>,
    // ...
}

When a new integration is registered, the entire router is rebuilt and swapped atomically. Readers (incoming HTTP requests) never block -- they load the current router pointer with a single atomic read. Writers (integration registration) build the new router, then swap it in. There is no mutex on the read path.

Architecture Retrospective

What Worked

The Capability trait as a plugin system. Adding new host functions requires implementing a single trait with three methods (name, imports, setup). The template system automatically includes the WAT import declarations. The linker automatically registers the functions. No modification to the engine or agent core is needed. This has been the cleanest abstraction in the project.

The preprocessor. Converting argv!, resv!, and check! macros into raw WAT instructions made the LLM's job dramatically easier. The LLM generates code using these concise macros, and the preprocessor expands them into the verbose memory operations that WASM actually requires. This reduced generation errors significantly.

Pre-caching catalog programs. Moving compilation cost to startup time rather than first-request time makes the system feel responsive from the first interaction. The tradeoff (slightly longer startup) is invisible in a long-running server.

Trade-Offs

Tree-sitter dependency for preprocessing. The preprocessor uses tree-sitter with a custom WAT grammar to parse and transform WAT source. This adds a native dependency and build-time complexity. A simpler regex-based approach would have been less accurate but much lighter. For a production system, the accuracy is worth it; for a prototype, it might not be.

Single-threaded bump allocator. The WasmMemory bump allocator is simple and fast but never frees memory. Within a single execution this is fine -- executions are short-lived and the entire memory is reclaimed when the instance is dropped. But it means a long-running WASM program that allocates and discards data will eventually hit the memory limit even if its live data is small.

No garbage collection. Related to the above: WASM programs must manage their memory manually. The LLM-generated programs work around this by being short -- they allocate, compute, store results, and exit. This is a deliberate design constraint, not an oversight.

The LLM as Programmer

WAT turns out to be a surprisingly good compilation target for LLMs. It is a simple, regular language with a small instruction set. There are no libraries to import, no build systems to configure, no type inference to get wrong. The LLM writes a function body, the preprocessor handles the boilerplate, and the engine executes it.

The ReAct loop's self-correction mechanism (compilation retry) is important here. LLMs make syntax errors, but they are generally good at fixing them when shown the error message. Two retries is enough to recover from most generation mistakes.

Future Directions

Several features are planned but not yet implemented:

Context Compaction

As conversations grow, the message history sent to the LLM becomes long and expensive. Context compaction would summarize older messages, keeping recent exchanges verbatim but compressing earlier ones into a condensed summary. This reduces token usage without losing essential context.

Local Model Fallback

Not every request needs a large cloud model. Simple tasks (KV lookups, basic HTTP fetches) could be handled by a smaller local model, reserving the cloud model for complex reasoning. This reduces latency and cost for straightforward operations.

Outgoing Webhooks

Currently the agent only responds to incoming requests. Outgoing webhooks would allow the agent to proactively notify external systems -- pushing results to Slack, triggering CI pipelines, or updating dashboards when certain conditions are met.

WASM-Loaded Integrations

The current integration system dispatches requests to the agent's ReAct loop. A future version could load integration logic as pre-compiled WASM modules, bypassing the LLM entirely for well-defined endpoints. This would combine the sandboxing guarantees of WASM with the performance of pre-compiled code.

Frontend Completion

The chat interface needs message rendering components, conversation history management, and session persistence. The SSE client pattern and embedding mechanism are in place; the remaining work is standard React development on top of a well-defined event stream.

Series Conclusion

We started in Article 1 by asking why AI agents need a virtual machine. The answer was sandboxing: if an LLM is going to write and execute code, that code needs to run in a controlled environment with explicit resource limits and capability boundaries.

From there we built the system layer by layer:

  • Articles 2-3: Chose WASM as the substrate and built the execution engine with Wasmtime.
  • Article 4: Designed a binary wire protocol for crossing the WASM boundary.
  • Articles 5-6: Created the Capability trait, the linker, and the template system.
  • Articles 7-8: Confronted the pain of raw WAT and built a preprocessor to tame it.
  • Articles 9-10: Assembled the LLM integration and ReAct orchestration loop.
  • Article 11: Added the WAT program catalog for pre-built tools.
  • Article 12: Put it online with an HTTP server and SSE streaming.
  • Article 13: Built the frontend shell and the SSE client pattern.
  • Article 14: Reviewed the security model, resilience patterns, and future roadmap.

The result is a system where a user sends a message, an LLM reasons about it, generates a sandboxed WebAssembly program, executes it with controlled access to the outside world, observes the result, and iterates until it has an answer. Every step is metered, jailed, and auditable.

The code is not perfect. There are trade-offs, missing features, and rough edges. But the architecture is sound: capabilities as plugins, the preprocessor as a complexity absorber, content-hash caching for performance, and seven layers of security for defense in depth. These patterns extend naturally as the system grows.
