Expanding the Toolkit: Strings, Arrays, Debug, Filesystem, and More
Giving the agent a practical toolkit by building capabilities for strings, arrays, AI, filesystem, and more.
Article 5 built the Capability trait and the foundational CoreCapability. Article 7 showed the pain of raw WAT. Now that we have macros and string inlining (Article 8), building new capabilities is mostly mechanical: declare imports, register host functions, serialize through the wire protocol. This article adds six capabilities in rapid succession, giving the agent a practical toolkit.
All new capabilities live in the capability crate. By this point, capability/Cargo.toml has accumulated the dependencies we need for HTTP, async I/O, and JSON handling:
[package]
name = "capability"
version = "0.1.0"
edition = "2021"
[dependencies]
rt = { path = "../rt" }
protocol = { workspace = true }
wasmtime = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
tokio = { workspace = true }
reqwest = { workspace = true }
anyhow = { workspace = true }
[dev-dependencies]
tempfile = "3"

The reqwest dependency powers HttpCapability and AiCapability. The tempfile dev-dependency is used in FsCapability tests for creating sandboxed jail directories.
We will cover three capabilities in detail — arrays (because they introduce a new pattern: host-side handles), string chunking (because the algorithm is interesting), and filesystem (because path jailing is security-critical). The rest follow the same pattern and are described with enough detail to implement as exercises.
ArrayCapability: Host-Side Handles
Every capability so far has exchanged data through WASM linear memory: serialize on one side, deserialize on the other. Arrays break this pattern. A string fits neatly in a byte buffer, but a dynamically-growing list of heterogeneous elements does not. We could serialize an entire array into memory, but then appending an element would require re-serializing the whole thing.
The solution: arrays live entirely on the Rust heap. WASM programs hold opaque integer handles — indices into a host-side Vec. No array data ever crosses the WASM boundary. Only handles (i32 values) do.
ArrayState
// capability/src/array.rs
pub struct ArrayState {
pub arrays: Vec<Vec<Vec<u8>>>, // array_id -> elements (serialized bytes each)
}
impl ArrayState {
pub fn new() -> Self {
Self { arrays: Vec::new() }
}
pub fn create(&mut self) -> u32 {
let id = self.arrays.len() as u32;
self.arrays.push(Vec::new());
id
}
pub fn push(&mut self, id: u32, data: Vec<u8>) -> Result<(), u32> {
self.arrays.get_mut(id as usize).ok_or(rt::EBOUND)?.push(data);
Ok(())
}
pub fn get(&self, id: u32, index: u32) -> Result<&[u8], u32> {
let arr = self.arrays.get(id as usize).ok_or(rt::EBOUND)?;
arr.get(index as usize).map(|v| v.as_slice()).ok_or(rt::EBOUND)
}
pub fn len(&self, id: u32) -> u32 {
self.arrays.get(id as usize).map(|a| a.len() as u32).unwrap_or(0)
}
}

Each array is a Vec<Vec<u8>> — a list of serialized elements. The outer Vec maps handle IDs to arrays. create() pushes an empty Vec and returns its index. push() appends raw serialized bytes. get() returns a slice to the serialized element, which the host function then writes into WASM memory.
This is stored in the ExtensionMap and shared with StringCapability (which creates arrays when splitting and chunking text).
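Outside the runtime, the handle flow is easy to exercise. A minimal sketch of the same ArrayState, with a placeholder EBOUND constant standing in for the rt error code:

```rust
// Simplified ArrayState: same handle logic as above, with a placeholder
// error code standing in for rt::EBOUND (an assumption in this sketch).
const EBOUND: u32 = 1;

pub struct ArrayState {
    pub arrays: Vec<Vec<Vec<u8>>>,
}

impl ArrayState {
    pub fn new() -> Self {
        Self { arrays: Vec::new() }
    }
    pub fn create(&mut self) -> u32 {
        let id = self.arrays.len() as u32;
        self.arrays.push(Vec::new());
        id
    }
    pub fn push(&mut self, id: u32, data: Vec<u8>) -> Result<(), u32> {
        self.arrays.get_mut(id as usize).ok_or(EBOUND)?.push(data);
        Ok(())
    }
    pub fn len(&self, id: u32) -> u32 {
        self.arrays.get(id as usize).map(|a| a.len() as u32).unwrap_or(0)
    }
}

fn main() {
    let mut state = ArrayState::new();
    let handle = state.create(); // a WASM program would hold this integer
    state.push(handle, b"one".to_vec()).unwrap();
    state.push(handle, b"two".to_vec()).unwrap();
    assert_eq!(state.len(handle), 2);
    // An unknown handle is rejected instead of touching anyone's data.
    assert!(state.push(99, b"x".to_vec()).is_err());
}
```

The guest never sees the Vec; it only round-trips the integer.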
The Imports
fn wat_imports(&self) -> &str {
r#"(import "arr" "new" (func $arr.new (result i32)))
(import "arr" "push" (func $arr.push (param i32 i32) (result i32)))
(import "arr" "get" (func $arr.get (param i32 i32) (result i32 i32)))
(import "arr" "len" (func $arr.len (param i32) (result i32)))
(import "arr" "join" (func $arr.join (param i32 i32) (result i32 i32)))"#
}

Notice that arr.new takes no parameters and returns a plain i32 — the handle. arr.push takes a handle and a pointer (serialized data in WASM memory), reads the bytes, and stores them host-side. arr.get takes a handle and an index, retrieves serialized bytes from the host, writes them into WASM memory, and returns (ptr, err). The data only crosses the boundary at push (WASM to host) and get (host to WASM).
arr.join is the most useful function in practice. It takes a handle and a separator pointer, deserializes all elements as strings, joins them with the separator, serializes the result, and writes it back into WASM memory:
// arr.join(arr_id, sep_ptr) -> (ptr, err)
linker
.func_wrap(
"arr",
"join",
|mut caller: Caller<'_, State>, arr_id: u32, sep_ptr: u32| -> (u32, u32) {
let separator = if sep_ptr != 0 {
match caller_read_bytes(&mut caller, sep_ptr) {
Ok(bytes) => protocol::from_bytes::<String>(&bytes).unwrap_or_default(),
Err(_) => String::new(),
}
} else {
String::new()
};
let strings = {
let state = caller.data();
let arr_state = match state.extensions.get::<ArrayState>() {
Some(s) => s,
None => return (0u32, rt::ETRFM),
};
let arr = match arr_state.arrays.get(arr_id as usize) {
Some(a) => a,
None => return (0u32, rt::EBOUND),
};
let mut strings = Vec::new();
for elem in arr {
match protocol::from_bytes::<String>(elem) {
Ok(s) => strings.push(s),
Err(_) => return (0u32, rt::EPARSE),
}
}
strings
};
let result = strings.join(&separator);
let serialized = match protocol::to_bytes(&result) {
Ok(s) => s,
Err(_) => return (0u32, rt::ETRFM),
};
match caller_write_raw(&mut caller, &serialized) {
Ok(ptr) => (ptr, rt::SUCCESS),
Err(_) => (0u32, rt::ENOMEM),
}
},
)
.map_err(rt::Error::Other)?;

The borrow gymnastics here are typical: we borrow caller.data() immutably to read the array elements, collect into an owned Vec<String> to release the borrow, then call caller_write_raw() which needs &mut caller. This two-phase pattern — read into owned data, release borrow, write — appears in nearly every host function.
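The shape of the pattern can be shown in miniature. This sketch uses a simplified stand-in Caller struct (not Wasmtime's), where data() hands out an immutable borrow and write_raw() needs the mutable one:

```rust
// Stand-in for Wasmtime's Caller: data() borrows immutably,
// write_raw() needs &mut. Both names mimic the real API shape only.
struct Caller {
    data: Vec<String>,
    memory: Vec<u8>,
}

impl Caller {
    fn data(&self) -> &Vec<String> {
        &self.data
    }
    fn write_raw(&mut self, bytes: &[u8]) -> usize {
        let ptr = self.memory.len();
        self.memory.extend_from_slice(bytes);
        ptr
    }
}

fn join_all(caller: &mut Caller, sep: &str) -> usize {
    // Phase 1: read through the immutable borrow into owned data.
    // Keeping `strings` as a borrowed slice here would block phase 2.
    let strings: Vec<String> = caller.data().to_vec();
    // Phase 2: the read borrow has ended, so &mut caller is available.
    let joined = strings.join(sep);
    caller.write_raw(joined.as_bytes())
}

fn main() {
    let mut caller = Caller {
        data: vec!["a".into(), "b".into()],
        memory: Vec::new(),
    };
    let ptr = join_all(&mut caller, ",");
    assert_eq!(ptr, 0);
    assert_eq!(&caller.memory, b"a,b");
}
```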
The Handle Pattern
This handle-based design is worth naming because it recurs throughout the system. DebugCapability uses handles for tracked tasks. If we added a socket capability, it would use handles for connections. The pattern is:
- Host-side state holds a Vec<T>
- create() returns the next index
- WASM programs pass the index to subsequent operations
- No T data crosses the WASM boundary — only the integer handle
This is a capability-based security pattern within the capability system itself: WASM code can only operate on arrays it created (or received handles for). It cannot forge a handle to access another execution's arrays because ArrayState is per-instance.
StringCapability: Smart Chunking
Most string operations are straightforward wrappers — str.len deserializes a string and returns its byte length, str.cat concatenates two strings, str.slice extracts a substring. The interesting function is str.chunk.
Why Chunking Matters
LLMs have context limits. When the agent fetches a large document and wants to process it, it needs to split it into pieces. Naive splitting at a fixed byte offset breaks words, sentences, and paragraphs. The chunking algorithm prefers natural boundaries.
The Algorithm
// capability/src/string.rs
pub fn chunk_text(text: &str, max_bytes: usize) -> Vec<String> {
if text.len() <= max_bytes {
return vec![text.to_string()];
}
let mut chunks = Vec::new();
let mut current_pos = 0;
while current_pos < text.len() {
let remaining = &text[current_pos..];
if remaining.len() <= max_bytes {
chunks.push(remaining.to_string());
break;
}
// Make sure search_end is on a UTF-8 char boundary
let mut search_end = (current_pos + max_bytes).min(text.len());
while search_end > current_pos && !text.is_char_boundary(search_end) {
search_end -= 1;
}
let search_text = &text[current_pos..search_end];
// 1. Try paragraph boundary (\n\n)
if let Some(pos) = search_text.rfind("\n\n") {
let end = current_pos + pos + 2;
chunks.push(text[current_pos..end].to_string());
current_pos = end;
continue;
}
// 2. Try sentence boundary (. ! ? followed by space or newline)
let mut best_sentence_end = None;
for pattern in &[". ", ".\n", "! ", "!\n", "? ", "?\n"] {
if let Some(pos) = search_text.rfind(pattern) {
let candidate = current_pos + pos + pattern.len();
if best_sentence_end.is_none() || candidate > best_sentence_end.unwrap() {
best_sentence_end = Some(candidate);
}
}
}
if let Some(end) = best_sentence_end {
chunks.push(text[current_pos..end].to_string());
current_pos = end;
continue;
}
// 3. Try word boundary (space)
if let Some(pos) = search_text.rfind(' ') {
let end = current_pos + pos + 1;
chunks.push(text[current_pos..end].to_string());
current_pos = end;
continue;
}
// 4. Last resort: hard split at UTF-8 char boundary
let mut end = current_pos + max_bytes;
while end > current_pos && !text.is_char_boundary(end) {
end -= 1;
}
if end == current_pos {
end = current_pos + 1;
while end < text.len() && !text.is_char_boundary(end) {
end += 1;
}
}
chunks.push(text[current_pos..end].to_string());
current_pos = end;
}
chunks
}

The priority cascade:
1. Paragraph boundary (\n\n): The strongest natural boundary. If the text has paragraph breaks within the chunk window, split there. This keeps paragraphs intact.
2. Sentence boundary (., !, ? followed by space or newline): If no paragraph break fits, find the last sentence ending. The algorithm searches for all six patterns and picks the one furthest into the chunk — maximizing chunk size while respecting sentence structure.
3. Word boundary (space): If no sentence boundary fits (e.g., a single very long sentence), split at the last space. This avoids breaking words.
4. Hard UTF-8 boundary: If there are no spaces at all (e.g., a long URL or CJK text without spaces), split at the byte limit, but adjust backward to the nearest UTF-8 character boundary. The is_char_boundary() check prevents splitting in the middle of a multi-byte character.
Every search uses rfind — searching backward from the chunk limit. This maximizes chunk size at each step.
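Both mechanics, the backward rfind search and the is_char_boundary adjustment, can be checked directly with the standard library alone:

```rust
fn main() {
    // "héllo": 'h' is 1 byte, 'é' is 2 bytes, so byte index 2 falls
    // inside 'é' and is not a valid char boundary.
    let text = "héllo";
    let mut end = 2usize;
    while end > 0 && !text.is_char_boundary(end) {
        end -= 1;
    }
    assert_eq!(end, 1); // adjusted back to the boundary after 'h'
    assert_eq!(&text[..end], "h");

    // rfind searches backward, so the split lands on the *last*
    // sentence boundary inside the window, maximizing chunk size.
    let window = "One. Two. Three";
    assert_eq!(window.rfind(". "), Some(8));
}
```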
How str.chunk Uses Arrays
str.chunk returns an array handle, not a string. The host function chunks the text, creates a new array via ArrayState, pushes each chunk as a serialized string element, and returns the handle:
// str.chunk(ptr, max_bytes) -> (arr_id, err)
linker
.func_wrap(
"str",
"chunk",
|mut caller: Caller<'_, State>, ptr: u32, max_bytes: i32| -> (u32, u32) {
let bytes = match caller_read_bytes(&mut caller, ptr) {
Ok(b) => b,
Err(_) => return (0u32, rt::ETRFM),
};
let s: String = match protocol::from_bytes(&bytes) {
Ok(s) => s,
Err(_) => return (0u32, rt::ETRFM),
};
let chunks = chunk_text(&s, max_bytes as usize);
let state = caller.data_mut();
let arr_state = match state.extensions.get_mut::<ArrayState>() {
Some(s) => s,
None => return (0u32, rt::ETRFM),
};
let arr_id = arr_state.create();
for chunk in chunks {
let serialized = match protocol::to_bytes(&chunk) {
Ok(s) => s,
Err(_) => return (0u32, rt::ETRFM),
};
if arr_state.push(arr_id, serialized).is_err() {
return (0u32, rt::ENOMEM);
}
}
(arr_id, rt::SUCCESS)
},
)
.map_err(rt::Error::Other)?;

This is the bridge between the two capabilities: StringCapability depends on ArrayState, so its init_state ensures ArrayState exists:
fn init_state(&self, ext: &mut ExtensionMap) -> rt::Result<()> {
if ext.get::<ArrayState>().is_none() {
ext.insert(ArrayState::new());
}
Ok(())
}

The if guard prevents double-initialization when both ArrayCapability and StringCapability are registered.
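The guard relies on ExtensionMap holding at most one value per type. A minimal sketch of that shape (the real type's internals are an assumption here, modeled with std::any):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Sketch of an ExtensionMap-shaped container: one value per type,
// looked up by TypeId. Internals are assumed, not taken from the source.
#[derive(Default)]
struct ExtensionMap(HashMap<TypeId, Box<dyn Any>>);

impl ExtensionMap {
    fn insert<T: Any>(&mut self, value: T) {
        self.0.insert(TypeId::of::<T>(), Box::new(value));
    }
    fn get<T: Any>(&self) -> Option<&T> {
        self.0.get(&TypeId::of::<T>()).and_then(|b| b.downcast_ref())
    }
}

struct ArrayState {
    arrays: Vec<Vec<Vec<u8>>>,
}

fn main() {
    let mut ext = ExtensionMap::default();
    // The init_state guard: insert only if absent, so registering both
    // ArrayCapability and StringCapability initializes the state once.
    for _ in 0..2 {
        if ext.get::<ArrayState>().is_none() {
            ext.insert(ArrayState { arrays: Vec::new() });
        }
    }
    assert!(ext.get::<ArrayState>().is_some());
}
```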
The Remaining String Functions
str.split works the same way as str.chunk — splits on a delimiter, creates an array, returns a handle. str.parse_int scans for the first digit sequence in a string and returns the raw i32 value (useful for extracting numbers from LLM responses). str.from_int does the reverse, converting an i32 to its decimal string representation. These are straightforward and follow the same serialize-deserialize-return pattern.
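As one possible reading of str.parse_int's scan (the exact implementation is not shown in this article), a digit-run scanner might look like this; sign and overflow handling are left unspecified, matching the text:

```rust
// Hedged sketch of str.parse_int's core: find the first run of ASCII
// digits and parse it as i32. parse_first_int is an illustrative name.
fn parse_first_int(s: &str) -> Option<i32> {
    let bytes = s.as_bytes();
    let start = bytes.iter().position(|b| b.is_ascii_digit())?;
    let end = bytes[start..]
        .iter()
        .position(|b| !b.is_ascii_digit())
        .map(|p| start + p)
        .unwrap_or(bytes.len());
    s[start..end].parse().ok()
}

fn main() {
    // Useful for extracting numbers from LLM responses.
    assert_eq!(parse_first_int("The answer is 42."), Some(42));
    assert_eq!(parse_first_int("no digits"), None);
}
```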
FsCapability: Path Jailing
The filesystem capability gives WASM programs read/write access to files. This is the most dangerous capability in the system — an unconstrained filesystem write could overwrite system files, leak credentials, or corrupt the runtime itself. The defense is path jailing: every requested path is resolved relative to a root directory, and any attempt to escape that root is rejected.
The Jailing Functions
There are two path resolution functions — one for reads (file must exist) and one for writes (parent directory must exist, file may not):
// capability/src/fs.rs
/// Resolve a path for reading (file must exist).
fn resolve_read(root: &Path, requested: &str) -> Result<PathBuf, u32> {
let path = root.join(requested);
let canonical = path.canonicalize().map_err(|_| rt::ENOENT)?;
let canonical_root = root.canonicalize().map_err(|_| rt::EACCESS)?;
if !canonical.starts_with(&canonical_root) {
return Err(rt::EACCESS);
}
Ok(canonical)
}
/// Resolve a path for writing (parent must exist, file may not).
fn resolve_write(root: &Path, requested: &str) -> Result<PathBuf, u32> {
let path = root.join(requested);
let parent = path.parent().ok_or(rt::EACCESS)?;
let canonical_parent = parent.canonicalize().map_err(|_| rt::ENOENT)?;
let canonical_root = root.canonicalize().map_err(|_| rt::EACCESS)?;
if !canonical_parent.starts_with(&canonical_root) {
return Err(rt::EACCESS);
}
let file_name = path.file_name().ok_or(rt::EACCESS)?;
Ok(canonical_parent.join(file_name))
}

The critical line in both functions is canonical.starts_with(&canonical_root). The canonicalize() call resolves symlinks and normalizes .. segments. If a program requests ../../etc/passwd, canonicalize() resolves it to an absolute path that does not start with the jail root, and the function returns EACCESS.
resolve_write has an extra subtlety: the file being written might not exist yet (you're creating it), so you can't canonicalize the full path. Instead, it canonicalizes the parent directory and appends the filename. This ensures the parent is inside the jail while allowing new file creation.
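The read-side resolver can be exercised without the full engine setup. A std-only sketch of the same check, with placeholder constants standing in for rt::ENOENT and rt::EACCESS:

```rust
use std::path::{Path, PathBuf};

// Placeholder error codes standing in for rt::ENOENT / rt::EACCESS.
const ENOENT: u32 = 2;
const EACCESS: u32 = 13;

// Same logic as resolve_read above, detached from the runtime.
fn resolve_read(root: &Path, requested: &str) -> Result<PathBuf, u32> {
    let canonical = root.join(requested).canonicalize().map_err(|_| ENOENT)?;
    let canonical_root = root.canonicalize().map_err(|_| EACCESS)?;
    if !canonical.starts_with(&canonical_root) {
        return Err(EACCESS);
    }
    Ok(canonical)
}

fn main() {
    let jail = std::env::temp_dir().join("jail_demo");
    std::fs::create_dir_all(&jail).unwrap();
    std::fs::write(jail.join("ok.txt"), b"data").unwrap();

    // Inside the jail: resolves successfully.
    assert!(resolve_read(&jail, "ok.txt").is_ok());
    // Traversal attempts fail either way: the path escapes the jail
    // root (EACCESS) or does not survive canonicalization (ENOENT).
    assert!(resolve_read(&jail, "../../../../../../etc/passwd").is_err());
}
```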
Directory Traversal Blocked
The test for this is straightforward:
#[tokio::test]
async fn fs_path_traversal_blocked() {
let tmp = tempfile::tempdir().unwrap();
let outside = tmp.path().parent().unwrap().join("secret.txt");
fs::write(&outside, b"secret").unwrap();
let fs_cap = FsCapability::new(tmp.path().to_path_buf());
// ... setup engine, template, linker ...
// WAT: try to read ../secret.txt
let args = vec![rt::serialize_arg(&"../secret.txt").unwrap()];
let mut instance = linked
.instantiate(&engine, ExtensionMap::new(), args, vec![], 0)
.await
.unwrap();
let code = instance.run().await.unwrap();
assert_eq!(code, rt::EACCESS);
}

fs.tree: Recursive Directory Listing
fs.tree provides the agent with a compact view of project structure — essential for understanding what files are available before reading them. It has two safety limits:
const SKIP_DIRS: &[&str] = &[
".git", "target", "node_modules", ".next", "dist",
"__pycache__", ".venv", "venv", "vendor", "build", ".cache",
];
const MAX_TREE_DEPTH: usize = 4;
const MAX_TREE_BYTES: usize = 32 * 1024; // 32 KB

The depth limit prevents runaway recursion. The byte budget prevents a massive directory tree from consuming all of WASM memory. The blocked directories skip known junk that would clutter the listing without providing useful information.
The output format is minimal — indentation-based, no box-drawing characters, file sizes in compact notation (4.2K, 1.3M):
src/
main.rs 4.2K
lib.rs 1.1K
components/
ui/
button.tsx 2.3K
App.tsx 856B

Directories are listed before files at each level, both sorted alphabetically. This format was chosen specifically for LLM consumption: easy to parse, minimal token usage, sufficient information to decide which files to read.
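The compact sizes suggest a small formatting helper. human_size is a hypothetical name with assumed thresholds, chosen to match the sample output above:

```rust
// Hypothetical helper for fs.tree's compact size notation (856B, 4.2K,
// 1.3M). Name and exact thresholds are assumptions, not from the source.
fn human_size(bytes: u64) -> String {
    if bytes < 1024 {
        format!("{}B", bytes)
    } else if bytes < 1024 * 1024 {
        format!("{:.1}K", bytes as f64 / 1024.0)
    } else {
        format!("{:.1}M", bytes as f64 / (1024.0 * 1024.0))
    }
}

fn main() {
    assert_eq!(human_size(856), "856B");
    assert_eq!(human_size(4300), "4.2K");
}
```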
DebugCapability: Observability
The debug capability provides three simple output functions and a task tracking system. The output functions (dbg.inspect, dbg.trace, dbg.log) write to a configurable Write sink — stdout in production, a shared buffer in tests:
pub struct DebugCapability {
writer: Arc<Mutex<Box<dyn Write + Send>>>,
}
impl DebugCapability {
pub fn stdout() -> Self {
Self { writer: Arc::new(Mutex::new(Box::new(std::io::stdout()))) }
}
pub fn buffer() -> (Self, Arc<Mutex<Vec<u8>>>) {
let buf = Arc::new(Mutex::new(Vec::new()));
let writer = BufferWriter(buf.clone());
(Self { writer: Arc::new(Mutex::new(Box::new(writer))) }, buf)
}
}

The buffer() constructor returns both the capability and a shared reference to the output buffer. Tests create the capability with buffer(), run a WASM program, and assert against the buffer contents.
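BufferWriter is referenced above but not shown. A minimal sketch of what it plausibly looks like: a newtype that forwards writes into the shared Vec<u8>:

```rust
use std::io::Write;
use std::sync::{Arc, Mutex};

// Sketch of the BufferWriter referenced above (assumed shape): every
// write is appended to the shared buffer so tests can inspect output.
struct BufferWriter(Arc<Mutex<Vec<u8>>>);

impl Write for BufferWriter {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }
    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}

fn main() {
    let buf = Arc::new(Mutex::new(Vec::new()));
    let mut writer = BufferWriter(buf.clone());
    writeln!(writer, "dbg: hello").unwrap();
    let contents = String::from_utf8(buf.lock().unwrap().clone()).unwrap();
    assert_eq!(contents, "dbg: hello\n");
}
```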
Task Tracking
For multi-step programs, the agent can track progress via dbg.task, dbg.step, and dbg.done:
struct TrackedTask {
name: String,
current: u32,
total: Option<u32>,
}
pub struct DebugTaskState {
tasks: Vec<TrackedTask>,
}

- dbg.task(name_ptr, total) creates a tracked task. If total > 0, the task is bounded (prints "Processing: 3/10"). If total <= 0, it's unbounded (prints "Processing: step 3"). Returns a handle.
- dbg.step(handle) increments the counter and prints progress.
- dbg.done(handle, summary_ptr) prints a completion message with an optional summary.
This is the handle pattern again — tasks live host-side as Vec<TrackedTask>, WASM holds integer handles. A WASM program that processes 10 chunks of text might look like:
(argv 0 $text_ptr)
(call $str.chunk (local.get $text_ptr) (i32.const 4096))
(local $arr_err i32) (local $arr_id i32)
(local.set $arr_err) (local.set $arr_id)
(check $arr_err)
;; Start tracking
(local $task i32)
(local.set $task (call $dbg.task "Processing chunks" (call $arr.len (local.get $arr_id))))
;; ... loop over chunks, calling dbg.step each iteration ...
(call $dbg.done (local.get $task) "All chunks processed")

RandCapability: Deterministic PRNG
The random number capability uses xorshift32 — a minimal PRNG that fits in a single u32:
pub struct RandState {
state: u32,
}
impl RandState {
pub fn new() -> Self {
RandState { state: 12345 }
}
fn next(&mut self) -> u32 {
let mut x = self.state;
x ^= x << 13;
x ^= x >> 17;
x ^= x << 5;
self.state = x;
x
}
}

Two functions: rand.seed(val) resets the state (zero is treated as the default seed 12345), and rand.sample() returns a non-negative i32 (the high bit is masked off with & 0x7FFF_FFFF).
Determinism is the key property. Same seed, same sequence, every time. This matters for testing — a WASM program that uses randomness can be made reproducible by seeding before execution.
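The determinism claim is directly testable with the xorshift32 core from above:

```rust
// The same xorshift32 core as above, exercised for determinism.
struct RandState {
    state: u32,
}

impl RandState {
    fn new() -> Self {
        RandState { state: 12345 }
    }
    fn next(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }
}

fn main() {
    let mut a = RandState::new();
    let mut b = RandState::new();
    let seq_a: Vec<i32> = (0..5).map(|_| (a.next() & 0x7FFF_FFFF) as i32).collect();
    let seq_b: Vec<i32> = (0..5).map(|_| (b.next() & 0x7FFF_FFFF) as i32).collect();
    // Same seed, same sequence; masking keeps every sample non-negative.
    assert_eq!(seq_a, seq_b);
    assert!(seq_a.iter().all(|&v| v >= 0));
}
```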
AiCapability: LLM Access from WASM
The AI capability lets WASM programs query the same LLM that drives the agent. Two functions:
- ai.query(prompt_ptr, params_ptr) — single user message
- ai.assist(prompt_ptr, system_ptr, params_ptr) — system message + user message
Both make async HTTP requests to an OpenAI-compatible API (Cerebras by default). The params_ptr accepts an optional JSON object with temperature, max_tokens, and model overrides — or sys.nil() / i32.const 0 for defaults.
The key implementation detail is func_wrap_async:
linker
.func_wrap_async(
"ai",
"query",
move |mut caller: Caller<'_, State>, (prompt_ptr, params_ptr): (u32, u32)| {
let client = client.clone();
let api_key = api_key.clone();
// ...
Box::new(async move {
// Read prompt from WASM memory
// Make HTTP request
// Write response back to WASM memory
// ...
})
},
)
.map_err(rt::Error::Other)?;

This is the only capability that uses func_wrap_async instead of func_wrap. Wasmtime suspends the WASM execution while the future is pending, then resumes it with the result. The WASM program sees a synchronous function call — it has no idea an HTTP request happened in between.
The request body is minimal:
let body = serde_json::json!({
"model": actual_model,
"messages": messages,
"max_tokens": actual_max_tokens,
"temperature": temperature,
});

OpenAI-compatible, so it works with Cerebras, OpenAI, Ollama, or any other provider that speaks the same protocol.
Summary of All Imports
After this article, the runtime provides the following host functions:
| Namespace | Functions | Pattern |
|---|---|---|
| sys | alloc, argv, resv, nil | Core ABI (Article 5) |
| http | get, post | Async, returns (ptr, err) |
| kv | get, set | Shared via Arc across ReAct steps |
| ai | query, assist | Async, OpenAI-compatible |
| str | len, cat, slice, split, chunk, parse_int, from_int | Returns array handles for split/chunk |
| arr | new, push, get, len, join | Host-side handle pattern |
| dbg | inspect, trace, log, task, step, done | Handle pattern for tasks |
| fs | read, write, append, rm, tree | Path-jailed, sync |
| rand | seed, sample | Deterministic xorshift32 |
Every capability follows the same structure: implement Capability, declare WAT imports, register host functions on the linker, optionally initialize per-instance state. The trait keeps each capability self-contained — adding a new one never requires modifying existing code.
The agent now has enough tools to do real work: fetch URLs, process text, query AI models, read and write files, track progress. What it lacks is a brain — something to decide which tools to use, when, and how. That's the ReAct agent, coming next.