Shipped · AI · Automation · 2024

Snappit

AI-powered browser automation platform.

AI-native: plain-language workflows
Self-healing: resilient automation
On-demand: browser scaling

The problem

Browser automation is notoriously brittle. Selectors break on the smallest UI change, sessions drift and expire, and running real browsers at scale is expensive and flaky. Most tools require code and constant babysitting — which means automation never reaches the people who need it most.

Context

Snappit was born from a simple observation: an enormous amount of real work still gets done by humans clicking through web apps. RPA tools exist, but they’re rigid and engineering-heavy. The arrival of capable LLMs made a different approach possible: agents that understand intent and adapt, rather than scripts that shatter.

Architecture

Snappit compiles plain-language intent into a sequence of robust, observable browser actions, executed by an AI agent loop that can perceive the page and recover when reality diverges from the plan. A control plane orchestrates a fleet of headless browsers that scale on demand, with durable session state so long-running workflows survive restarts.

Intent layer: natural language → structured action plan.
Agent loop: perceive → act → verify → recover, with the DOM as ground truth.
Session orchestration: durable, resumable sessions backed by Redis.
Browser fleet: headless browsers autoscaled on Google Cloud, pooled for cost.
Observability: every action traced so failures are debuggable, not mysterious.
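
To make the observability bullet concrete, here is a minimal sketch of a tracing wrapper. The `TraceEvent` shape, the `traced` helper, and the in-memory `sink` array are all hypothetical names chosen for illustration; the actual trace format is not described here.

```typescript
// Hypothetical trace record; field names are illustrative only.
interface TraceEvent {
  stepId: string;
  action: string;
  status: "ok" | "error";
  durationMs: number;
  error?: string;
}

// Wraps any async action so that success and failure both leave a trace entry.
async function traced<T>(
  sink: TraceEvent[],
  stepId: string,
  action: string,
  run: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  try {
    const result = await run();
    sink.push({ stepId, action, status: "ok", durationMs: Date.now() - startedAt });
    return result;
  } catch (err) {
    sink.push({
      stepId,
      action,
      status: "error",
      durationMs: Date.now() - startedAt,
      error: err instanceof Error ? err.message : String(err),
    });
    throw err; // re-throw so the caller's recovery logic still runs
  }
}
```

Because the wrapper re-throws, it can sit around any browser action without changing how failures propagate; it only guarantees that nothing happens untraced.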

The agent recovery loop (illustrative)

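import type { Page } from "playwright";

// ActionPlanStep, agent, and the helpers perceive / execute / verify /
// recover / ok / escalate are defined elsewhere and omitted for brevity.
const MAX_RETRIES = 3; // illustrative value, not a tuned setting

async function runStep(step: ActionPlanStep, page: Page) {
  for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
    const state = await perceive(page);             // re-ground in real DOM
    const action = await agent.decide(step, state); // adapt to what's actually there
    try {
      await execute(action, page);
      if (await verify(step.goal, page)) return ok(action);
      // goal not met yet: fall through and retry from a fresh perception
    } catch (err) {
      await recover(page, err);                     // heal, don't fail silently
    }
  }
  return escalate(step);                            // loud, explicit failure
}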

Technical challenges

Resilience against a moving DOM

The web is not an API. Pages change, load asynchronously, and lie about being ready. I built a recovery layer that treats brittleness as the default and re-grounds the agent in the actual page state before every critical action.
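
One simple way to re-ground before a critical action is to wait until the page stops changing. The sketch below is an assumption about how that could be done, not the recovery layer itself: `waitForStable` and its options are hypothetical, and `snapshot` stands in for whatever page fingerprint the agent uses (with Playwright it could be `() => page.content()`).

```typescript
// Polls a snapshot function until two consecutive reads match, or gives up.
async function waitForStable(
  snapshot: () => Promise<string>,
  opts: { intervalMs: number; maxChecks: number },
): Promise<string | null> {
  let previous = await snapshot();
  for (let i = 0; i < opts.maxChecks; i++) {
    // Give asynchronous rendering a moment to make progress.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
    const current = await snapshot();
    if (current === previous) return current; // page stopped changing
    previous = current;
  }
  return null; // never settled: the caller should treat the page as not ready
}
```

Returning `null` instead of the last snapshot keeps "not ready" explicit, so the caller has to decide whether to retry or escalate rather than act on a page that is still moving.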

Scaling real browsers cost-effectively

Real browsers are heavy. Pooling, warm-start strategies, and aggressive autoscaling keep the fleet responsive without paying for idle capacity — a direct application of the cost discipline I’d built earlier in my career.
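
The pooling and warm-start idea can be sketched as a small generic pool. `WarmPool` and its methods are hypothetical names, and the `create` factory is a placeholder for whatever launches a browser; autoscaling policy and health checks are deliberately left out.

```typescript
// A bounded pool of pre-created instances (for example, browser contexts).
class WarmPool<T> {
  private readonly idle: T[] = [];
  private readonly create: () => Promise<T>;
  private readonly maxIdle: number;

  constructor(create: () => Promise<T>, maxIdle: number) {
    this.create = create;
    this.maxIdle = maxIdle;
  }

  // Pre-create instances so the first callers skip cold-start latency.
  async warm(count: number): Promise<void> {
    const target = Math.min(count, this.maxIdle);
    while (this.idle.length < target) this.idle.push(await this.create());
  }

  // Reuse a warm instance when one is available, otherwise start a new one.
  async acquire(): Promise<T> {
    const item = this.idle.pop();
    return item !== undefined ? item : this.create();
  }

  // Returns true if the instance was kept for reuse, false if the pool is
  // full and the caller should dispose of it instead of paying for idle capacity.
  release(item: T): boolean {
    if (this.idle.length >= this.maxIdle) return false;
    this.idle.push(item);
    return true;
  }

  get idleCount(): number {
    return this.idle.length;
  }
}
```

The `maxIdle` cap is the cost lever in this sketch: a higher value buys faster starts under bursty load, a lower one keeps idle spend down.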

Agents that recover, not fail silently

A silent failure in automation is worse than a loud one. The agent loop is designed to detect divergence, retry with new strategies, and escalate clearly when it truly can’t proceed.
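
As a sketch of "retry with new strategies, then escalate", the helper below tries alternatives in order and throws an explicit error listing everything it attempted. `Strategy`, `EscalationError`, and `tryStrategies` are hypothetical names for illustration, not the agent loop's real interface.

```typescript
type Strategy<T> = { name: string; run: () => Promise<T> };

// Carries the full attempt history so an escalation is never a bare "failed".
class EscalationError extends Error {
  readonly attempts: string[];

  constructor(goal: string, attempts: string[]) {
    super(`Could not complete "${goal}" after: ${attempts.join("; ")}`);
    this.name = "EscalationError";
    this.attempts = attempts;
  }
}

async function tryStrategies<T>(
  goal: string,
  strategies: Strategy<T>[],
  verify: (result: T) => Promise<boolean>,
): Promise<T> {
  const attempts: string[] = [];
  for (const strategy of strategies) {
    try {
      const result = await strategy.run();
      if (await verify(result)) return result;
      // The action ran but the goal was not reached: record it and move on.
      attempts.push(`${strategy.name}: verification failed`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      attempts.push(`${strategy.name}: ${reason}`);
    }
  }
  throw new EscalationError(goal, attempts);
}
```

Treating a failed verification the same as a thrown error is the point: an action that "succeeded" without reaching the goal is exactly the silent failure this design is meant to rule out.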

Engineering decisions

Playwright over raw CDP

Playwright’s auto-waiting and cross-browser model gave a more reliable foundation than hand-rolled Chrome DevTools Protocol, letting me spend my complexity budget on the agent layer instead of plumbing.

Intent compiled to actions, not free-form clicking

Rather than letting the model click around unconstrained, intent compiles to a verifiable action plan. This makes runs observable, reproducible, and far safer.
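
A minimal sketch of what "compiles to a verifiable action plan" could mean in practice: the model's output is parsed as untrusted data and checked against a closed set of step kinds before anything touches a browser. The `PlanStep` variants and `validatePlan` are assumptions made for illustration; the real plan schema is not specified here.

```typescript
// Hypothetical closed vocabulary of actions the agent is allowed to take.
type PlanStep =
  | { kind: "navigate"; url: string }
  | { kind: "click"; target: string }
  | { kind: "fill"; target: string; value: string }
  | { kind: "assertText"; target: string; expected: string };

// Required string fields for each allowed kind.
const REQUIRED_FIELDS = new Map<string, string[]>([
  ["navigate", ["url"]],
  ["click", ["target"]],
  ["fill", ["target", "value"]],
  ["assertText", ["target", "expected"]],
]);

// Validates untrusted model output; throws on anything outside the vocabulary.
function validatePlan(raw: unknown): PlanStep[] {
  if (!Array.isArray(raw)) throw new Error("plan must be an array of steps");
  return raw.map((step: unknown, i: number) => {
    if (typeof step !== "object" || step === null) {
      throw new Error(`step ${i}: not an object`);
    }
    const fields = step as Record<string, unknown>;
    const kind = fields["kind"];
    const required = typeof kind === "string" ? REQUIRED_FIELDS.get(kind) : undefined;
    if (!required) throw new Error(`step ${i}: unknown kind ${String(kind)}`);
    for (const name of required) {
      if (typeof fields[name] !== "string") {
        throw new Error(`step ${i}: missing string field "${name}"`);
      }
    }
    return step as PlanStep;
  });
}
```

Rejecting unknown kinds outright, rather than ignoring them, keeps the executor's surface area small: every step that reaches the browser is one the plan format explicitly allows.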

Redis as the session backbone

Durable, fast session state means long workflows can pause, resume, and survive infrastructure churn without losing progress.
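
The pause-and-resume behaviour can be sketched as a checkpoint written after every completed step. Everything here is an assumption for illustration: `KeyValueStore` is a hypothetical two-method interface, not a real Redis client API, and `InMemoryStore` stands in for Redis so the example runs on its own.

```typescript
// Hypothetical minimal store interface; a Redis-backed adapter would implement it.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// In-memory stand-in so the sketch is self-contained.
class InMemoryStore implements KeyValueStore {
  private readonly data = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }

  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

interface Checkpoint {
  workflowId: string;
  nextStep: number;
}

// Runs steps in order, resuming from the last saved checkpoint if one exists.
async function runResumable(
  store: KeyValueStore,
  workflowId: string,
  steps: Array<() => Promise<void>>,
): Promise<void> {
  const key = `session:${workflowId}`;
  const saved = await store.get(key);
  const start = saved ? (JSON.parse(saved) as Checkpoint).nextStep : 0;
  let next = start;
  for (const step of steps.slice(start)) {
    await step();
    next += 1;
    // Persist progress only after the step has fully completed.
    const checkpoint: Checkpoint = { workflowId, nextStep: next };
    await store.set(key, JSON.stringify(checkpoint));
  }
}
```

Note the trade-off in this sketch: a crash between finishing a step and writing its checkpoint causes that step to run again on resume, so steps need to be safe to repeat.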

Technologies

Playwright · TypeScript · LLM Agents · Node.js · Redis · Google Cloud

Results

A platform where non-engineers automate real web workflows reliably, and engineers extend them without fighting the browser. Automation that recovers from the messiness of the real web instead of breaking on it.

Plain-language workflows accessible to non-engineers.
Self-healing execution that survives DOM and session churn.
On-demand browser scaling that keeps cost in line with usage.

Lessons learned

In automation, reliability is the product — not a feature.
AI earns its place where it removes drudgery, not where it looks impressive.
Constraining the model (intent → plan) makes it more useful, not less.

What I’d improve today

A richer simulation harness to test agents against recorded site variations.
A visual workflow editor that compiles to the same intent layer.
Per-step confidence scoring surfaced to the user before execution.
