Snappit
AI-powered browser automation platform.
The problem
Browser automation is notoriously brittle. Selectors break on the smallest UI change, sessions drift and expire, and running real browsers at scale is expensive and flaky. Most tools require code and constant babysitting — which means automation never reaches the people who need it most.
Context
Snappit was born from a simple observation: an enormous amount of real work still happens by a human clicking through web apps. RPA tools exist, but they’re rigid and engineering-heavy. The arrival of capable LLMs made a different approach possible — agents that understand intent and adapt, rather than scripts that shatter.
Architecture
Snappit compiles plain-language intent into a sequence of robust, observable browser actions, executed by an AI agent loop that can perceive the page and recover when reality diverges from the plan. A control plane orchestrates a fleet of headless browsers that scale on demand, with durable session state so long-running workflows survive restarts.
- Intent layer: natural language → structured action plan.
- Agent loop: perceive → act → verify → recover, with the DOM as ground truth.
- Session orchestration: durable, resumable sessions backed by Redis.
- Browser fleet: headless browsers autoscaled on Google Cloud, pooled for cost.
- Observability: every action traced so failures are debuggable, not mysterious.
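To make the observability bullet concrete, here is a minimal sketch of a wrapper that records a trace event for every action, whether it succeeds or fails. It is illustrative only and not Snappit's actual code: the `TraceEvent` shape, the `TraceSink` type, and the `traced` helper are hypothetical names chosen for this example.

```typescript
// Hypothetical trace record; the field names are illustrative, not Snappit's schema.
interface TraceEvent {
  step: string;
  status: "ok" | "error";
  durationMs: number;
  error?: string;
}

// Anything that can receive events: an in-memory array, a logger, an exporter.
type TraceSink = (event: TraceEvent) => void;

// Runs one action and emits exactly one trace event for it, on success or failure.
async function traced<T>(
  step: string,
  sink: TraceSink,
  action: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  try {
    const result = await action();
    sink({ step, status: "ok", durationMs: Date.now() - startedAt });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    sink({ step, status: "error", durationMs: Date.now() - startedAt, error: message });
    throw err; // re-throw so the caller's recovery logic still sees the failure
  }
}
```

Because the wrapper re-throws, it can sit around any step without changing how retries or escalation behave.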
async function runStep(step: ActionPlanStep, page: Page) {
  for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
    const state = await perceive(page); // re-ground in real DOM
    const action = await agent.decide(step, state); // adapt to what's actually there
    try {
      await execute(action, page);
      if (await verify(step.goal, page)) return ok(action);
    } catch (err) {
      await recover(page, err); // heal, don't fail silently
    }
  }
  return escalate(step); // loud, explicit failure
}
Technical challenges
Resilience against a moving DOM
The web is not an API. Pages change, load asynchronously, and lie about being ready. I built a recovery layer that treats brittleness as the default and re-grounds the agent in the actual page state before every critical action.
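One simple way to guard against pages that "lie about being ready" is to wait until two consecutive snapshots of the page match before acting. The sketch below shows that idea under stated assumptions and is not the actual recovery layer: `waitForStable`, `Snapshot`, and `StableOptions` are hypothetical names, and comparing raw snapshot strings is a deliberately crude stability signal.

```typescript
// A snapshot is any async function that returns a string describing the page.
type Snapshot = () => Promise<string>;

interface StableOptions {
  intervalMs?: number; // pause between snapshots
  maxChecks?: number; // total snapshots to take before giving up
  sleep?: (ms: number) => Promise<void>; // injectable so tests need no real timers
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves with the first snapshot that equals the one taken just before it.
// Throws if the page keeps changing for maxChecks snapshots.
async function waitForStable(
  snapshot: Snapshot,
  options: StableOptions = {},
): Promise<string> {
  const { intervalMs = 100, maxChecks = 20, sleep = defaultSleep } = options;
  let previous = await snapshot();
  for (let check = 1; check < maxChecks; check++) {
    await sleep(intervalMs);
    const current = await snapshot();
    if (current === previous) return current; // two identical reads in a row
    previous = current;
  }
  throw new Error(`page did not settle after ${maxChecks} snapshots`);
}
```

With Playwright, the snapshot could be something like `() => page.content()`. A production version would more likely compare a reduced view of the DOM, since full HTML can differ between reads for reasons that do not matter to the next action.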
Scaling real browsers cost-effectively
Real browsers are heavy. Pooling, warm-start strategies, and aggressive autoscaling keep the fleet responsive without paying for idle capacity — a direct application of the cost discipline I’d built earlier in my career.
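The pooling and warm-start idea can be sketched as a small generic pool that keeps a capped number of idle resources ready and destroys the surplus. This is a simplified illustration, not the fleet's real implementation: `WarmPool` and its method names are hypothetical, and the pool is generic so the example runs without a browser.

```typescript
// A capped pool of reusable resources (for example, browser instances).
class WarmPool<T> {
  private readonly idle: T[] = [];
  private readonly create: () => Promise<T>;
  private readonly destroy: (resource: T) => Promise<void>;
  private readonly maxIdle: number;

  constructor(
    create: () => Promise<T>,
    destroy: (resource: T) => Promise<void>,
    maxIdle: number,
  ) {
    this.create = create;
    this.destroy = destroy;
    this.maxIdle = maxIdle;
  }

  // Warm start: pre-create resources so the first callers do not pay startup cost.
  async warm(count: number): Promise<void> {
    const target = Math.min(count, this.maxIdle);
    while (this.idle.length < target) {
      this.idle.push(await this.create());
    }
  }

  // Reuse an idle resource when one exists, otherwise create a new one.
  async acquire(): Promise<T> {
    const warm = this.idle.pop();
    return warm !== undefined ? warm : await this.create();
  }

  // Keep the resource for reuse up to maxIdle; destroy the rest to avoid idle cost.
  // A real browser pool would also reset state (cookies, storage) before reuse.
  async release(resource: T): Promise<void> {
    if (this.idle.length < this.maxIdle) {
      this.idle.push(resource);
    } else {
      await this.destroy(resource);
    }
  }

  get idleCount(): number {
    return this.idle.length;
  }
}
```

With Playwright, `create` could be `() => chromium.launch()` and `destroy` could be `(browser) => browser.close()`. The `maxIdle` cap is the knob that trades responsiveness against paying for idle capacity.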
Agents that recover, not fail silently
A silent failure in automation is worse than a loud one. The agent loop is designed to detect divergence, retry with new strategies, and escalate clearly when it truly can’t proceed.
Engineering decisions
Playwright over raw CDP
Playwright’s auto-waiting and cross-browser model gave a more reliable foundation than hand-rolled Chrome DevTools Protocol, letting me spend my complexity budget on the agent layer instead of plumbing.
Intent compiled to actions, not free-form clicking
Rather than letting the model click around unconstrained, intent compiles to a verifiable action plan. This makes runs observable, reproducible, and far safer.
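As an illustration of what "compiles to a verifiable action plan" can mean in practice, the sketch below validates untrusted model output against a closed set of action kinds before anything touches a browser. The `PlanStep` shape and the three action kinds are assumptions made for this example, standing in for whatever the real `ActionPlanStep` contains; only the `goal` field is suggested by the code shown earlier.

```typescript
// Hypothetical plan schema: a closed set of actions, each with a checkable goal.
type PlanStep =
  | { kind: "navigate"; url: string; goal: string }
  | { kind: "click"; target: string; goal: string }
  | { kind: "fill"; target: string; value: string; goal: string };

// Reads a required non-empty string field or throws with the step index.
function requireString(
  record: Record<string, unknown>,
  key: string,
  index: number,
): string {
  const value = record[key];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`step ${index}: "${key}" must be a non-empty string`);
  }
  return value;
}

// Converts one untrusted value into a PlanStep, rejecting unknown action kinds.
function parseStep(raw: unknown, index: number): PlanStep {
  if (typeof raw !== "object" || raw === null) {
    throw new Error(`step ${index}: not an object`);
  }
  const record = raw as Record<string, unknown>;
  const goal = requireString(record, "goal", index);
  switch (record["kind"]) {
    case "navigate":
      return { kind: "navigate", url: requireString(record, "url", index), goal };
    case "click":
      return { kind: "click", target: requireString(record, "target", index), goal };
    case "fill":
      return {
        kind: "fill",
        target: requireString(record, "target", index),
        value: requireString(record, "value", index),
        goal,
      };
    default:
      throw new Error(`step ${index}: unsupported action kind`);
  }
}

// Validates a whole plan (for example, parsed JSON from the model) up front.
function validatePlan(raw: unknown): PlanStep[] {
  if (!Array.isArray(raw)) throw new Error("plan must be an array of steps");
  return raw.map((step, index) => parseStep(step, index));
}
```

Rejecting the plan before execution is what makes a run reproducible: the same validated plan can be logged, replayed, and inspected step by step.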
Redis as the session backbone
Durable, fast session state means long workflows can pause, resume, and survive infrastructure churn without losing progress.
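A minimal sketch of pause-and-resume, assuming each workflow stores a checkpoint after every completed step. It is not the actual session layer: `KeyValueStore` is a hypothetical interface reduced to the `get` and `set` calls a Redis client would typically provide, and the key format and `Checkpoint` shape are invented for this example.

```typescript
// The only store operations this sketch needs; an assumption, not a real client API.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<unknown>;
}

// Hypothetical checkpoint: which step to run next for a given session.
interface Checkpoint {
  sessionId: string;
  nextStep: number;
}

const checkpointKey = (sessionId: string): string =>
  `session:${sessionId}:checkpoint`;

// Runs steps in order, starting from the last saved checkpoint if one exists.
async function runResumable(
  store: KeyValueStore,
  sessionId: string,
  steps: Array<() => Promise<void>>,
): Promise<void> {
  const key = checkpointKey(sessionId);
  const saved = await store.get(key);
  let next = saved === null ? 0 : (JSON.parse(saved) as Checkpoint).nextStep;
  for (const step of steps.slice(next)) {
    await step();
    next += 1;
    // Persist progress only after the step succeeds.
    const checkpoint: Checkpoint = { sessionId, nextStep: next };
    await store.set(key, JSON.stringify(checkpoint));
  }
}
```

If a process dies after a step finishes but before its checkpoint is written, that step runs again on resume, so steps in a design like this should be safe to repeat.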
Technologies
Results
A platform where non-engineers automate real web workflows reliably, and engineers extend them without fighting the browser. Automation that recovers from the messiness of the real web instead of breaking on it.
- Plain-language workflows accessible to non-engineers.
- Self-healing execution that survives DOM and session churn.
- On-demand browser scaling that keeps cost in line with usage.
Lessons learned
- In automation, reliability is the product — not a feature.
- AI earns its place where it removes drudgery, not where it looks impressive.
- Constraining the model (intent → plan) makes it more useful, not less.
What I’d improve today
- A richer simulation harness to test agents against recorded site variations.
- A visual workflow editor that compiles to the same intent layer.
- Per-step confidence scoring surfaced to the user before execution.