Interchange Formats: How Your Tools Should Talk to Each Other
Most AI-generated code fails at integration boundaries. Contracts, event logs, and manifest-driven dispatch fix this structurally.
In Part 2, I covered the structural patterns — single-responsibility files, config-driven behavior, dispatchers, contracts. Those patterns keep individual tools safe.
This part is about what happens between tools. Integration boundaries. The seams.
And the seams are where AI-modified codebases actually die.
Why integration is where it breaks
A single tool with clean boundaries can be modified by an AI session without much risk. The session reads the file, understands the scope, makes the change. Fine.
But the moment a change involves two tools talking to each other — tool A writes data that tool B reads, or tool A triggers tool B — you're in dangerous territory. The AI session sees tool A's code and tool B's code. It doesn't see the contract between them. It doesn't know that tool B expects dates in ISO 8601, or that tool A appends to a log that tool B reads with tail -1, or that tool B runs on a different machine entirely and gets its data via sync.
Without declared interchange formats, every integration point is an implicit contract. Implicit contracts are broken by people who don't know they exist. LLMs never know they exist.
Principle 1: Append-only event logs
The simplest interchange format is a line of JSON appended to a file. One line, one event, one timestamp.
{"ts":"2026-04-03T14:22:00Z","type":"task.created","id":"T-041","owner":"stuart","title":"Review cascade metrics","due":"2026-04-05"}
{"ts":"2026-04-03T14:25:00Z","type":"task.completed","id":"T-039","owner":"stuart","completed_at":"2026-04-03T14:25:00Z"}
{"ts":"2026-04-03T15:01:00Z","type":"task.created","id":"T-042","owner":"jason","title":"Submit weekly scorecard","due":"2026-04-07"}
This is NDJSON — newline-delimited JSON. Each line is a self-contained event. The file is append-only. Nothing is ever modified or deleted.
This format solves three problems at once:
It's safe under concurrent access. Two processes appending to the same file don't corrupt each other: with O_APPEND, each small, single-call write lands as one contiguous line on local POSIX filesystems. No locks, no coordination, no race conditions.
It's a complete audit trail. You don't just have the current state — you have every state transition. "When did this task get completed?" Grep the log. "Who changed this metric?" Grep the log. "What happened at 3am when nobody was watching?" Grep the log.
It's LLM-friendly. An AI session that needs to record an event appends one line. It doesn't need to read the existing file. It doesn't need to understand the schema beyond "here's what a line looks like." The blast radius of getting it wrong is one malformed line — not a corrupted database.
Compare this to a mutable JSON file: tasks.json that contains an array of task objects. Every modification requires reading the whole file, parsing it, finding the right object, modifying it, and writing the whole file back. An LLM session that does this while another process is also doing it corrupts the file. An LLM session that misunderstands the nesting structure corrupts the file. An LLM session that writes a trailing comma corrupts the file.
Append-only logs make the wrong thing hard to do. You can't accidentally corrupt history by appending to it.
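In practice, an append helper is a few lines. This is a sketch, not a prescribed API — the append_event name and the compact timestamp format are assumptions chosen to match the sample lines above:

```python
import datetime
import json
import os

def append_event(path, event):
    """Append one NDJSON event line. Opening with O_APPEND means
    concurrent writers each land their line without clobbering others."""
    stamped = {
        "ts": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        **event,
    }
    line = json.dumps(stamped, separators=(",", ":")) + "\n"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, line.encode())  # single write call: one contiguous line
    finally:
        os.close(fd)
```

Note what the helper never does: read the file, parse the file, or rewrite the file. That is the whole point.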
Materialized views for current state
"But I need to know the current list of tasks, not the full history."
Build a materialized view. A script that reads the event log and produces the current state as a JSON file. Run it when you need it, or run it on a schedule.
# Materialize current open tasks from event log
jq -s '
  group_by(.id) |
  map(last) |
  map(select(.type != "task.completed"))
' task_log.ndjson > tasks_current.json
The event log is the source of truth. The materialized view is a convenience. If the view ever gets corrupted, you regenerate it from the log. No data loss. No ambiguity about what happened.
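The same reduction is a few lines of Python if jq isn't available — a sketch assuming the event fields shown above (last event per id wins, completed tasks drop out):

```python
import json

def materialize(log_path):
    """Reduce an NDJSON event log to the current set of open tasks."""
    latest = {}
    for line in open(log_path):
        event = json.loads(line)
        latest[event["id"]] = event  # later events overwrite earlier ones
    return [e for e in latest.values() if e["type"] != "task.completed"]
```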
Principle 2: Declared contracts
I covered contract registries in Part 2 as a design pattern. Here I want to go deeper on what the contract actually declares at the interchange level.
A contract between two tools needs to answer four questions:
- What format is the data in? (NDJSON lines? JSON object? CSV? Plain text?)
- What fields are required? (Every event must have ts, type, and id.)
- Who writes and who reads? (task_ctl writes task_log.ndjson. The dashboard reads it. Nobody else writes to it.)
- What's the delivery mechanism? (Shared file? HTTP POST? Shell command? stdout pipe?)
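Assembled into a registry entry, the answers to those four questions might look like this — a sketch, not a fixed schema; the field names here are illustrative:

```json
{
  "name": "task_log",
  "format": "ndjson",
  "required_fields": ["ts", "type", "id"],
  "writers": ["task_ctl"],
  "readers": ["dashboard"],
  "delivery": "shared-file:task_log.ndjson"
}
```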
When these four things are declared in a registry file, an AI session that needs to integrate with an existing tool has everything it needs. It doesn't guess the format. It doesn't reach into internals. It reads the contract and builds to spec.
Without declared contracts, the AI session reads the source code and infers the contract. Inference is where integration bugs live. The session sees that tool A writes to /data/crm/contacts.json and reasonably concludes it can also write there. It can't — because tool A uses an atomic write pattern (write to temp file, then rename) and direct writes break the sync. That invariant isn't in the code. It's in the history of why the code is shaped that way.
Contracts make the invisible visible.
Principle 3: Manifest-driven dispatch
This is where the patterns from Parts 1 and 2 converge into something powerful.
Imagine you have a system that processes meeting transcripts. Each transcript produces structured output: action items, decisions, metrics, pipeline updates. That output needs to go to different tools — action items to the task tracker, metrics to the dashboard, pipeline updates to the CRM.
The naive implementation is a router function:
def route(digest):
    if digest.has_actions:
        task_ctl.add(digest.actions)
    if digest.has_metrics:
        dashboard.update(digest.metrics)
    if digest.has_pipeline:
        crm.log_touch(digest.pipeline)
Three destinations, three if-blocks. Add a fourth and you modify the router. Add a tenth and you have a function nobody wants to touch because breaking it breaks everything.
Manifest-driven dispatch inverts this. Each subscriber declares what it wants:
// subscriptions/task_ctl.json
{
  "event": "digest.ready",
  "filter": "has_field:action_items",
  "extract": ["action_items"],
  "deliver": "shell:task_ctl.sh add-batch"
}
// subscriptions/crm.json
{
  "event": "digest.ready",
  "filter": "has_field:pipeline",
  "extract": ["pipeline", "contact_name"],
  "deliver": "shell:crm_sync.sh add-touch"
}
The dispatcher reads all subscription files, matches them against the event, and delivers. Adding a new subscriber means dropping a new JSON file in the subscriptions/ directory. Zero code changes to the dispatcher or any existing subscriber.
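The dispatcher itself stays small. This is a sketch under the subscription format shown above — the has_field filter syntax and shell: delivery scheme are the only conventions it knows about:

```python
import glob
import json
import shlex
import subprocess

def load_subscriptions(directory="subscriptions"):
    """Read every subscriber manifest from the subscriptions/ directory."""
    return [json.load(open(path)) for path in sorted(glob.glob(f"{directory}/*.json"))]

def matches(sub, event_name, payload):
    """A subscription matches when the event name agrees and its filter passes."""
    if sub["event"] != event_name:
        return False
    kind, _, field = sub.get("filter", "").partition(":")
    if kind == "has_field":
        return field in payload
    return True  # no filter declared: match every event of this type

def extract(sub, payload):
    """Pull out only the fields the subscriber asked for."""
    return {key: payload[key] for key in sub["extract"] if key in payload}

def dispatch(subs, event_name, payload):
    """Deliver the extracted fields to each matching subscriber's command."""
    for sub in subs:
        if not matches(sub, event_name, payload):
            continue
        scheme, _, command = sub["deliver"].partition(":")
        if scheme == "shell":
            subprocess.run(shlex.split(command),
                           input=json.dumps(extract(sub, payload)), text=True)
```

Notice that the dispatcher has no knowledge of task_ctl, the CRM, or any other subscriber. Everything it routes is described by the files it loads.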
This is the architectural equivalent of an event bus, but simpler. No message broker. No pub/sub framework. Just config files in a directory.
Why this matters for LLM-modified systems
An AI session that needs to add a new integration creates one subscription file. It doesn't open the dispatcher. It doesn't modify existing subscriptions. It can't break existing routing because it never touches it.
And here's the thing that makes this better than the hardcoded router: you can see all integrations by listing the directory. ls subscriptions/ tells you everything. A hardcoded router requires reading and understanding 300 lines of code to know what it does.
The directory listing is self-documenting in a way that code never is. An AI session, a human developer, a new team member — they all get the same instant understanding. No institutional knowledge required.
The compound effect
These three principles — append-only event logs, declared contracts, manifest-driven dispatch — create a system where integration is the easy part instead of the hard part.
Each tool writes events to its own log. Contracts declare the format. Subscribers register via config. The dispatcher routes events to subscribers. Adding a new tool means: create the tool, declare its contract, create subscriptions for any events it cares about.
No modification of existing tools. No implicit contracts to discover. No routing logic to untangle.
An AI session working on this system can build and integrate a new tool without understanding the rest of the system — because the interchange formats are explicit and the subscription model is self-documenting.
In Part 4, I'll cover what happens between sessions — how to keep documentation alive, detect architectural drift, and make sure today's clean patterns don't rot into tomorrow's legacy code.