Machine Learning One

The Pain of Raw WAT (and Why We Need a Preprocessor)

Discovering the limits of raw WAT through building HTTP and KV capabilities.

Our runtime works. We can compile WAT, register capabilities, assemble modules, and run programs. Time to build something real.

In this article, we'll write a program that fetches a URL and stores the result in a key-value store. Along the way, we'll build two new capabilities — HTTP and KV — and discover that raw WAT is painful to write. This pain is deliberate: you need to feel the problem before the preprocessor (next article) will make sense.

The HTTP Capability

The HTTP capability gives WASM programs the ability to fetch URLs. It's our first async capability — HTTP requests take time, so the host function must be async.

Structure

pub struct HttpCapability {
    client: reqwest::Client,
    allowed_domains: Option<Vec<String>>,
    timeout: Duration,
}

Three fields:

  • client: A reusable reqwest HTTP client
  • allowed_domains: Optional allowlist — if set, only these domains can be fetched
  • timeout: Per-request timeout (default: 30 seconds)

WAT Imports

fn wat_imports(&self) -> &str {
    r#"(import "http" "get" (func $http.get (param i32) (result i32 i32)))
(import "http" "post" (func $http.post (param i32 i32 i32) (result i32 i32)))"#
}

Both functions follow our convention: pointers in, (pointer, error_code) out.

Domain Allowlisting

Before making any request, we check if the domain is allowed:

fn check_domain(url: &str, allowed: &Option<Vec<String>>) -> Result<(), u32> {
    if let Some(domains) = allowed {
        // Parse with the same URL parser reqwest uses, so userinfo
        // ("user:pass@host"), ports, and query strings cannot spoof the host.
        let parsed = reqwest::Url::parse(url).map_err(|_| rt::EACCESS)?;
        let host = parsed.host_str().unwrap_or("");
        if !domains.iter().any(|d| d.eq_ignore_ascii_case(host)) {
            return Err(rt::EACCESS);
        }
    }
    Ok(())
}

If allowed_domains is None, all domains are permitted. If it's Some, the host portion of the URL must appear in the list. This is capability-based security at the application level — the orchestrator decides which domains the LLM-generated program can access.

The Async Host Function

Here's http.get — our first func_wrap_async:

linker.func_wrap_async(
    "http",
    "get",
    move |mut caller: Caller<'_, State>, (url_ptr,): (u32,)| {
        let client = client.clone();
        let allowed = allowed.clone();
        Box::new(async move {
            // 1. Read URL from WASM memory
            let url_bytes = match caller_read_bytes(&mut caller, url_ptr) {
                Ok(b) => b,
                Err(_) => return Ok((0u32, rt::ETRFM)),
            };
            let url: String = match protocol::from_bytes(&url_bytes) {
                Ok(u) => u,
                Err(_) => return Ok((0u32, rt::ETRFM)),
            };

            // 2. Check domain allowlist
            if let Err(code) = check_domain(&url, &allowed) {
                return Ok((0u32, code));
            }

            // 3. Make the HTTP request
            let resp = match client.get(&url).timeout(timeout).send().await {
                Ok(r) => r,
                Err(_) => return Ok((0u32, rt::EREMOTE)),
            };
            let body = match resp.bytes().await {
                Ok(b) => b,
                Err(_) => return Ok((0u32, rt::EREMOTE)),
            };

            // 4. Serialize and write response body to WASM memory
            let serialized = match protocol::to_bytes(&body.to_vec()) {
                Ok(s) => s,
                Err(_) => return Ok((0u32, rt::ETRFM)),
            };
            match caller_write_raw(&mut caller, &serialized) {
                Ok(ptr) => Ok((ptr, rt::SUCCESS)),
                Err(_) => Ok((0u32, rt::ENOMEM)),
            }
        })
    },
)?;

The pattern for every async host function:

  1. Read input pointer(s) from WASM memory
  2. Deserialize via the protocol crate
  3. Do the actual work (HTTP request)
  4. Serialize the result
  5. Write it back to WASM memory
  6. Return (pointer, error_code)
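The six steps can be tried out without wasmtime. What follows is a minimal sketch, not the runtime's actual code: Memory is a toy stand-in for WASM linear memory, the 4-byte little-endian length prefix is an assumed framing (the real layout is whatever caller_read_bytes and the protocol crate use), the error-code values are hypothetical, and uppercasing stands in for the HTTP request.

```rust
// Hypothetical error-code values; the real ones live in the rt module.
const SUCCESS: u32 = 0;
const ETRFM: u32 = 1;
const ENOMEM: u32 = 2;

/// Toy stand-in for WASM linear memory with a bump allocator.
struct Memory {
    bytes: Vec<u8>,
    next: usize,
}

impl Memory {
    fn new(size: usize) -> Self {
        // Start allocating at 8 so that pointer 0 can mean "no value".
        Memory { bytes: vec![0; size], next: 8 }
    }

    /// Read a value framed as [u32 LE length][payload] (assumed framing).
    fn read(&self, ptr: u32) -> Option<&[u8]> {
        let start = ptr as usize;
        let header = self.bytes.get(start..start.checked_add(4)?)?;
        let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
        let body_start = start + 4;
        self.bytes.get(body_start..body_start.checked_add(len)?)
    }

    /// Append a framed value and return its pointer, or None when out of space.
    fn write(&mut self, payload: &[u8]) -> Option<u32> {
        let ptr = self.next;
        let end = ptr.checked_add(4)?.checked_add(payload.len())?;
        if end > self.bytes.len() {
            return None;
        }
        self.bytes[ptr..ptr + 4].copy_from_slice(&(payload.len() as u32).to_le_bytes());
        self.bytes[ptr + 4..end].copy_from_slice(payload);
        self.next = end;
        Some(ptr as u32)
    }
}

/// Same shape as a host function: pointer in, (pointer, error_code) out.
fn host_uppercase(mem: &mut Memory, in_ptr: u32) -> (u32, u32) {
    // Steps 1 and 2: read the input and decode it.
    let text = match mem.read(in_ptr).map(|b| String::from_utf8(b.to_vec())) {
        Some(Ok(t)) => t,
        _ => return (0, ETRFM),
    };
    // Step 3: do the actual work.
    let result = text.to_uppercase();
    // Steps 4 to 6: encode, write back, and return (pointer, error_code).
    match mem.write(result.as_bytes()) {
        Some(ptr) => (ptr, SUCCESS),
        None => (0, ENOMEM),
    }
}
```

Every early return carries a zero pointer and a non-zero error code, which is what lets the guest check the second result before it touches the first.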

The http.post implementation follows the same pattern with three input pointers (URL, body, content-type) instead of one. We leave it as an exercise for the reader; the structure is identical.

The KV Capability

The key-value store introduces a new pattern: shared state across executions. In a ReAct loop, the agent might execute 5 programs in sequence. If program 1 stores a value, program 3 should be able to retrieve it. This requires state that outlives a single WASM instance.

The Arc Pattern

#[derive(Default)]
pub struct KvState {
    pub store: HashMap<Vec<u8>, Vec<u8>>,
}

pub struct KvCapability {
    state: Option<Arc<RwLock<KvState>>>,
}

impl KvCapability {
    pub fn new() -> Self {
        Self { state: None }
    }

    pub fn with_arc_state(state: Arc<RwLock<KvState>>) -> Self {
        Self { state: Some(state) }
    }
}

When state is None, init_state creates a fresh Arc, so the store lives only for that one execution. When state is Some, it clones the provided Arc, so every execution sees the same store. The orchestrator creates one Arc<RwLock<KvState>> per session and passes it to every execution:

fn init_state(&self, ext: &mut ExtensionMap) -> rt::Result<()> {
    let arc = self.state.clone()
        .unwrap_or_else(|| Arc::new(RwLock::new(KvState::default())));
    ext.insert(arc);
    Ok(())
}
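To see the difference between a shared and a fresh Arc, here is a small std-only sketch. It leaves out the ExtensionMap and the WASM instance entirely; state_for_execution and shared_vs_isolated are hypothetical names, and only the clone-or-create logic mirrors init_state.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[derive(Default)]
pub struct KvState {
    pub store: HashMap<Vec<u8>, Vec<u8>>,
}

/// Mirrors init_state: reuse the session's Arc if one was provided,
/// otherwise create a fresh store for this execution only.
fn state_for_execution(provided: Option<&Arc<RwLock<KvState>>>) -> Arc<RwLock<KvState>> {
    provided
        .cloned()
        .unwrap_or_else(|| Arc::new(RwLock::new(KvState::default())))
}

/// Returns what a later execution sees with a shared Arc and with no Arc.
fn shared_vs_isolated() -> (Option<Vec<u8>>, Option<Vec<u8>>) {
    let session = Arc::new(RwLock::new(KvState::default()));

    // Program 1 stores a value, then its handle goes away.
    let first = state_for_execution(Some(&session));
    first.write().unwrap().store.insert(b"k".to_vec(), b"v".to_vec());
    drop(first);

    // Program 3 gets a clone of the same Arc and still finds the value.
    let later = state_for_execution(Some(&session));
    let shared = later.read().unwrap().store.get(&b"k"[..]).cloned();

    // Without a provided Arc, the execution starts from an empty store.
    let fresh = state_for_execution(None);
    let isolated = fresh.read().unwrap().store.get(&b"k"[..]).cloned();

    (shared, isolated)
}
```

Cloning an Arc copies the pointer, not the map, which is why dropping the first handle does not lose the stored value.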

Inside host functions, the Arc is retrieved from the ExtensionMap:

// kv.set
|mut caller: Caller<'_, State>, key_ptr: u32, value_ptr: u32| -> u32 {
    let key = match caller_read_bytes(&mut caller, key_ptr) {
        Ok(k) => k,
        Err(_) => return rt::ETRFM,
    };
    let value = match caller_read_bytes(&mut caller, value_ptr) {
        Ok(v) => v,
        Err(_) => return rt::ETRFM,
    };
    match caller.data().extensions.get::<Arc<RwLock<KvState>>>().cloned() {
        Some(arc) => {
            arc.write().unwrap().store.insert(key, value);
            rt::SUCCESS
        }
        None => rt::ETRFM,
    }
}

Note: we store and retrieve raw serialized bytes. The KV store doesn't care about the type of the value — it's opaque bytes. This means you can store a string, a struct, or a sequence, and the guest retrieves whatever was stored.
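The listing above covers kv.set only. As a rough sketch of how the set/get pair might behave on opaque bytes, the functions below work directly on the RwLock and skip the WASM memory plumbing. ENOENT for a missing key and the choice to map a poisoned lock to ETRFM instead of panicking are assumptions, as are all the numeric code values.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical error-code values; the real ones live in the rt module.
const SUCCESS: u32 = 0;
const ETRFM: u32 = 1;
const ENOENT: u32 = 2; // assumed code for "key not found"

#[derive(Default)]
pub struct KvState {
    pub store: HashMap<Vec<u8>, Vec<u8>>,
}

/// Store the value bytes as-is; the store never inspects them.
fn kv_set(state: &RwLock<KvState>, key: Vec<u8>, value: Vec<u8>) -> u32 {
    match state.write() {
        Ok(mut guard) => {
            guard.store.insert(key, value);
            SUCCESS
        }
        // A poisoned lock becomes an error code instead of a host panic.
        Err(_) => ETRFM,
    }
}

/// Return exactly the bytes that were stored, plus an error code.
fn kv_get(state: &RwLock<KvState>, key: &[u8]) -> (Option<Vec<u8>>, u32) {
    match state.read() {
        Ok(guard) => match guard.store.get(key) {
            Some(value) => (Some(value.clone()), SUCCESS),
            None => (None, ENOENT),
        },
        Err(_) => (None, ETRFM),
    }
}
```

Because both keys and values are plain Vec<u8>, the guest gets back byte-for-byte what it stored, whatever type those bytes encode.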

Now Write a Real Program

Let's write a program that:

  1. Takes a key and a value as arguments
  2. Stores the value under the key via kv.set
  3. Retrieves it back via kv.get
  4. Returns the retrieved value

Here's the raw WAT:

;; Arguments: argv[0] = key, argv[1] = value
;; Stores value under key, then retrieves it and returns as result.

(local $key_ptr i32)
(local $key_err i32)
(local $val_ptr i32)
(local $val_err i32)
(local $set_err i32)
(local $get_ptr i32)
(local $get_err i32)

;; Read key from argv[0]
(call $sys.argv (i32.const 0))
(local.set $key_err)
(local.set $key_ptr)
(if (i32.ne (local.get $key_err) (i32.const 0))
    (then (return (local.get $key_err)))
)

;; Read value from argv[1]
(call $sys.argv (i32.const 1))
(local.set $val_err)
(local.set $val_ptr)
(if (i32.ne (local.get $val_err) (i32.const 0))
    (then (return (local.get $val_err)))
)

;; Set key=value
(call $kv.set (local.get $key_ptr) (local.get $val_ptr))
(local.set $set_err)
(if (i32.ne (local.get $set_err) (i32.const 0))
    (then (return (local.get $set_err)))
)

;; Get key back
(call $kv.get (local.get $key_ptr))
(local.set $get_err)
(local.set $get_ptr)
(if (i32.ne (local.get $get_err) (i32.const 0))
    (then (return (local.get $get_err)))
)

;; Store retrieved value as result
(call $sys.resv (local.get $get_ptr))
(i32.const 0)

Count the lines: 35 lines for a program that sets and gets a single key-value pair. That's absurd.

The Boilerplate Inventory

Let's count what's redundant:

Loading each argument: 5 lines

(call $sys.argv (i32.const 0))    ;; call
(local.set $key_err)              ;; pop error
(local.set $key_ptr)              ;; pop pointer
(if (i32.ne (local.get $key_err) (i32.const 0))
    (then (return (local.get $key_err)))
)

We do this twice (two arguments) = 10 lines of boilerplate.
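The reverse-order pop in this pattern follows from how multi-value results land on the operand stack: a function with result type (i32 i32) leaves its first result deepest and its last result on top. The toy model below uses a plain Vec as the stack; the function names are illustrative only.

```rust
/// Toy operand stack. A call returning (ptr, err) pushes ptr first, then err.
fn push_call_results(stack: &mut Vec<u32>, ptr: u32, err: u32) {
    stack.push(ptr);
    stack.push(err);
}

/// Correct order: the first local.set receives err, the second receives ptr.
fn pop_err_then_ptr(stack: &mut Vec<u32>) -> Option<(u32, u32)> {
    let err = stack.pop()?;
    let ptr = stack.pop()?;
    Some((ptr, err))
}

/// The common mistake: popping in declaration order swaps the two values.
fn pop_ptr_then_err(stack: &mut Vec<u32>) -> Option<(u32, u32)> {
    let ptr = stack.pop()?;
    let err = stack.pop()?;
    Some((ptr, err))
}
```

With a successful call that returns pointer 1024 and error 0, the mistaken order yields a null pointer and a bogus error code of 1024. Both are still i32 values, so the module validates and the bug only shows up at runtime.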

Each error check: 3 lines

(if (i32.ne (local.get $set_err) (i32.const 0))
    (then (return (local.get $set_err)))
)

We do this for every fallible call. In a real program with 5 host function calls, that's 15 lines of error checking.

Local declarations: must be at the top

WAT requires all (local ...) declarations before any instructions. In our 35-line program, 7 of those lines are just declaring locals. If you forget one and put a (local ...) after an instruction, compilation fails.

No string literals

Want to log a message? You can't write (call $dbg.log "hello"). You need to: serialize the string, allocate memory, write it, and pass the pointer. That's 4+ lines just to pass a constant string.

A More Realistic Example: Fetch and Summarize

Now imagine a program the agent would actually generate — fetch a URL, send the response to an AI for summarization, and store the summary:

;; argv[0] = url to fetch

;; === LOCAL DECLARATIONS (must be at top) ===
(local $url_ptr i32)
(local $url_err i32)
(local $body_ptr i32)
(local $body_err i32)
(local $summary_ptr i32)
(local $summary_err i32)
(local $store_err i32)

;; === Load URL argument ===
(call $sys.argv (i32.const 0))
(local.set $url_err)
(local.set $url_ptr)
(if (i32.ne (local.get $url_err) (i32.const 0))
    (then (return (local.get $url_err)))
)

;; === Fetch URL ===
(call $http.get (local.get $url_ptr))
(local.set $body_err)
(local.set $body_ptr)
(if (i32.ne (local.get $body_err) (i32.const 0))
    (then (return (local.get $body_err)))
)

;; === Summarize with AI ===
;; PROBLEM: We need to pass a system prompt string.
;; But we can't write string literals in WAT!
;; We would need to serialize "Summarize the following text in one paragraph."
;; into memory manually. For now, pretend we have a pointer to it somehow.
;;
;; (call $ai.assist (i32.const ???) (local.get $body_ptr) (call $sys.nil))
;; (local.set $summary_err)
;; (local.set $summary_ptr)
;; (if (i32.ne (local.get $summary_err) (i32.const 0))
;;     (then (return (local.get $summary_err)))
;; )

;; === Store result ===
(call $sys.resv (local.get $body_ptr))
(i32.const 0)

We had to give up on the AI step entirely because there's no way to pass a string constant inline as a call argument. The program would need the system prompt as an additional argument, or we'd need to pre-allocate the string in memory and hardcode the offset. Both are terrible ergonomics for an LLM that wants to write (call $ai.assist "Summarize this" (local.get $body_ptr) (call $sys.nil)).
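For a sense of what "pre-allocate the string and hardcode the offset" involves, here is a hedged sketch of a host-side helper that emits a WAT data segment. WAT does accept string literals there, with \hh hex escapes for arbitrary bytes. The helper name is hypothetical, and it assumes the caller already has the serialized bytes (the protocol crate's encoding is not shown here), can add module-level fields, and knows an offset that nothing else uses.

```rust
/// Render bytes as an active data segment at a fixed memory offset.
/// Every byte is hex-escaped, so quotes and non-ASCII need no special cases.
fn wat_data_segment(offset: u32, bytes: &[u8]) -> String {
    let mut escaped = String::with_capacity(bytes.len() * 3);
    for b in bytes {
        escaped.push_str(&format!("\\{:02x}", b));
    }
    format!("(data (i32.const {}) \"{}\")", offset, escaped)
}
```

For example, wat_data_segment(1024, b"hi") produces (data (i32.const 1024) "\68\69"), after which the program can pass (i32.const 1024) as the pointer. Someone still has to pick offsets that never collide, which is exactly the bookkeeping an LLM writing in a single pass cannot do reliably.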

The Three Problems, Summarized

  1. Argument loading boilerplate: Every sys.argv call requires declaring two locals, calling the function, popping results in reverse order, and checking the error. 5 lines that should be 1.

  2. No inline string literals: WAT has no string operands, so a constant string cannot appear as a call argument. It must be serialized into memory separately (for example, in a data segment) and referenced by pointer offset.

  3. Locals at the top: All (local ...) declarations must come before any instructions. This forces the LLM to plan every variable upfront, which conflicts with its natural tendency to declare variables as needed.

These aren't theoretical concerns. In production, the LLM generates WAT in a single pass. It can't go back and add a local declaration it forgot. It can't pre-compute string offsets. And it frequently makes mistakes with the 5-line argv pattern — forgetting the reverse-order pop, omitting the error check, or declaring the locals with the wrong names.

What We Want

Instead of the 35-line raw WAT program, we want to write:

(argv 0 $key_ptr)
(argv 1 $val_ptr)
(call $kv.set (local.get $key_ptr) (local.get $val_ptr))
(local $set_err i32)
(local.set $set_err)
(check $set_err)
(call $kv.get (local.get $key_ptr))
(local $get_err i32) (local $get_ptr i32)
(local.set $get_err)
(local.set $get_ptr)
(check $get_err)
(resv $get_ptr)
(i32.const 0)

And the fetch-and-summarize program as:

(argv 0 $url_ptr)
(call $http.get (local.get $url_ptr))
(local $body_err i32) (local $body_ptr i32)
(local.set $body_err)
(local.set $body_ptr)
(check $body_err)
(call $ai.assist "Summarize in one paragraph." (local.get $body_ptr) (call $sys.nil))
(local $sum_err i32) (local $sum_ptr i32)
(local.set $sum_err)
(local.set $sum_ptr)
(check $sum_err)
(resv $sum_ptr)
(i32.const 0)

Notice:

  • (argv 0 $url_ptr) replaces 5 lines.
  • (check $body_err) replaces 3 lines.
  • "Summarize in one paragraph." is an inline string literal.
  • (local ...) declarations appear next to the code that uses them.
  • (resv $sum_ptr) is shorthand for (call $sys.resv (local.get $sum_ptr)).
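Two of these shorthands are fully specified by the raw WAT shown earlier, so their expansions can be written down as plain string templates. This is only an illustration of the target output, not the preprocessor itself; the function names are hypothetical, and argv, string literals, and local hoisting are deliberately left out because they need real parsing.

```rust
/// Expand (check $x) into the three-line early-return error check.
fn expand_check(local: &str) -> String {
    format!(
        "(if (i32.ne (local.get {0}) (i32.const 0))\n    (then (return (local.get {0})))\n)",
        local
    )
}

/// Expand (resv $x) into the explicit sys.resv call.
fn expand_resv(local: &str) -> String {
    format!("(call $sys.resv (local.get {}))", local)
}
```

Calling expand_check("$set_err") reproduces the error check from the set/get program character for character, which suggests the boilerplate is mechanical enough to generate.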

In the next article, we build the preprocessor that makes this transformation.
