In Automation, Reliability Is the Product
Browser automation looks like a demo problem and is actually a reliability problem. What building Snappit taught me about why self-healing beats clever.
A browser automation demo is easy. You record a flow, it plays back, the room claps. Then a button moves three pixels, a modal loads half a second late, a session expires — and the whole thing falls over. The demo was never the product. Reliability is the product.
This is the core insight behind Snappit, and it changed how I think about AI agents entirely.
The web is not an API
APIs have contracts. The web has vibes. Pages render asynchronously, lie about being ready, change layout between visits, and expire your session when you look away. Any automation that assumes the page is what it was a moment ago is already broken — it just doesn't know it yet.
So the design principle becomes: assume brittleness is the default.
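One small way to act on that principle is to poll for the condition you actually need instead of trusting a "page ready" signal. The sketch below is illustrative only: `waitFor`, `timeoutMs`, and `intervalMs` are hypothetical names, not part of Snappit or any library.

```javascript
// Hypothetical helper: poll a condition rather than assume the page is ready.
// `check` is any async function that returns a truthy value once it is safe to proceed.
async function waitFor(check, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result; // condition observed: safe to proceed
    if (Date.now() >= deadline) {
      // Fail loudly instead of continuing against a page that never settled.
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With a Puppeteer- or Playwright-style `page` object, a call might look like `await waitFor(() => page.$("#submit"))`, where `#submit` is a made-up selector. The point is that the check runs against the live page on every iteration.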
Perceive, act, verify, recover
Instead of a script that executes blindly, Snappit's agent runs a loop:
    async function runStep(step, page) {
      for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
        const state = await perceive(page); // re-ground in real DOM
        const action = await agent.decide(step, state); // adapt to reality
        try {
          await execute(action, page);
          if (await verify(step.goal, page)) return ok(action);
        } catch (err) {
          await recover(page, err); // heal, don't fail silently
        }
      }
      return escalate(step); // loud, explicit failure
    }
Every critical action re-grounds in the actual DOM before it runs. The agent adapts to what's there, not what it expected. And when it genuinely can't proceed, it fails loudly — because in automation, a silent failure is far worse than a loud one.
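To see the loop behave end to end, here is a self-contained toy version that runs against a fake page instead of a browser. Everything in it is a stand-in made up for illustration: the fake page, the helper bodies, the return shapes, and `MAX_RETRIES = 3`. None of it is Snappit's actual implementation.

```javascript
// Illustrative simulation only: all helpers below are hypothetical stubs.
const MAX_RETRIES = 3; // assumed value for the demo

// A fake page whose modal blocks the first click, as a late-loading dialog would.
function makeFakePage() {
  return { modalOpen: true, submitted: false, log: [] };
}

const perceive = async (page) => ({ modalOpen: page.modalOpen });

// A real agent would use `state` to choose an action; this stub always clicks.
const decide = async (step, state) => ({ type: "click", target: step.target });

async function execute(action, page) {
  if (page.modalOpen) throw new Error("click intercepted by modal");
  page.submitted = true;
}

// The goal is modelled as the name of a flag that must be true on the page.
const verify = async (goal, page) => page[goal] === true;

async function recover(page, err) {
  page.log.push(`recovering from: ${err.message}`);
  page.modalOpen = false; // e.g. dismiss the blocking dialog
}

async function runStep(step, page) {
  for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
    const state = await perceive(page);
    const action = await decide(step, state);
    try {
      await execute(action, page);
      if (await verify(step.goal, page)) {
        return { status: "ok", attempts: attempt + 1 };
      }
    } catch (err) {
      await recover(page, err);
    }
  }
  return { status: "escalated", step }; // explicit failure for a human to see
}
```

Calling `runStep({ target: "#submit", goal: "submitted" }, makeFakePage())` fails once on the modal, recovers, and succeeds on the second attempt. A step whose goal can never be verified runs out of retries and comes back as `escalated` rather than quietly passing.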
Constrain the model to make it more useful
It's tempting to let an LLM click around freely. Don't. Snappit compiles intent into a verifiable action plan first. Constraining the model makes runs observable, reproducible, and safe — and counter-intuitively, more capable, because it can reason about a plan instead of improvising forever.
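As a rough sketch of what "verifiable" could mean in practice, a plan can be checked before anything runs: every step must use an allowed action type and carry a goal that can be checked afterwards. The plan shape, the `ALLOWED_ACTIONS` list, and `validatePlan` are assumptions made for this example, not Snappit's real format.

```javascript
// Hypothetical allowlist: the only action types a plan may contain.
const ALLOWED_ACTIONS = new Set(["navigate", "click", "type", "waitFor"]);

// Returns a list of problems; an empty list means the plan is acceptable.
function validatePlan(plan) {
  if (!Array.isArray(plan) || plan.length === 0) {
    return ["plan must be a non-empty array of steps"];
  }
  const problems = [];
  plan.forEach((rawStep, i) => {
    const step = rawStep ?? {}; // tolerate null or undefined entries
    if (!ALLOWED_ACTIONS.has(step.action)) {
      problems.push(`step ${i}: action "${step.action}" is not allowed`);
    }
    if (typeof step.goal !== "string" || step.goal.trim() === "") {
      problems.push(`step ${i}: missing a checkable goal`);
    }
  });
  return problems;
}

// Example plan: the second step uses a disallowed action and has no goal.
const examplePlan = [
  { action: "navigate", target: "https://example.com/login", goal: "login form visible" },
  { action: "eval", target: "document.cookie", goal: "" },
];
console.log(validatePlan(examplePlan));
```

Rejecting a plan up front is cheap, and it means every step that does run has a goal the verify stage can check.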
Where AI actually earns its place
AI is most valuable where it removes drudgery, not where it looks impressive. The flashy part of Snappit is the natural-language input. The valuable part is the recovery layer nobody sees. That layer is what lets a non-engineer trust the system with real work — and trust, in automation, is everything.
The same recovery layer now powers QA Copilot. Reliable foundations compound: build them once, reuse them everywhere.