Otto Architecture

Otto is a remote browser automation system with a clear role boundary: controllers issue commands, the relay brokers trust and routing, and extension nodes execute browser work. This page explains how those pieces cooperate, what guarantees the platform provides, and where implementation authority lives.

System roles

| Component | Primary responsibility | Why it exists |
| --- | --- | --- |
| Controller | Command creation and user workflows | Keeps automation intent outside browser runtime |
| Relay | Auth, routing, locking, redaction, terminalization | Central policy enforcement point |
| Node (Extension) | Site-aware execution and listener capture | Executes browser actions close to the target tab |

The split is intentional. Control-plane concerns (auth, routing, audit) stay in the relay. Execution-plane concerns (tab access, DOM, network) stay in the extension node.
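To make the control-plane side of the split concrete, here is a minimal sketch of a relay-style routing check. Only targetNodeId is a name taken from this page; the envelope shape, the `route` function, and both error codes are hypothetical and are not the relay's actual API.

```typescript
// Sketch under stated assumptions: the envelope fields (other than
// targetNodeId) and the error codes are illustrative, not Otto's protocol.
type CommandEnvelope = {
  commandId: string;
  name: string;
  targetNodeId?: string;
};

type RouteResult =
  | { ok: true; nodeId: string }
  | { ok: false; code: string };

// Routing never falls back to a default node: a missing or unknown
// target is an explicit failure.
function route(
  envelope: CommandEnvelope,
  connectedNodes: Set<string>,
): RouteResult {
  const target = envelope.targetNodeId;
  if (!target) {
    return { ok: false, code: "target_node_required" }; // hypothetical code
  }
  if (!connectedNodes.has(target)) {
    return { ok: false, code: "node_not_connected" }; // hypothetical code
  }
  return { ok: true, nodeId: target };
}
```

The point of the sketch is the absence of a fallback branch: the controller must state where a command runs, and the relay only decides whether that destination is valid.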

Command lifecycle

For site commands, the node runtime runs this sequence before invoking command logic:

  1. Resolve site bundle and command metadata.
  2. Validate active tab URL against the declared site scope.
  3. Validate and sanitize declared input metadata (inputFields, optional inputAtLeastOneOf).
  4. If requiresAuth, run checkLogin and optional gotoLogin (no credential automation).
  5. Ensure preloadHost via auto-navigation when needed.
  6. Invoke command execute and return structured output.

This sequence exists to fail early with explicit error codes and prevent command handlers from running in ambiguous page state.
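The six steps above can be sketched as a single preflight function that returns either sanitized input or an explicit error code. This is an illustration, not the node runtime's implementation: inputFields, inputAtLeastOneOf, requiresAuth, preloadHost, and manual_login_required are names from this page, while the types, `scopeHosts`, `hostOf`, `preflight`, and the other error codes are assumptions.

```typescript
// Sketch only. Anything not named on this page is hypothetical.
type CommandMeta = {
  scopeHosts: string[]; // hypothetical: hosts the site bundle declares
  inputFields: string[];
  inputAtLeastOneOf?: string[];
  requiresAuth?: boolean;
  preloadHost?: string;
};

type Preflight =
  | { ok: true; input: Record<string, unknown>; navigateTo?: string }
  | { ok: false; code: string };

// Extract the hostname without depending on a URL implementation.
function hostOf(url: string): string | undefined {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^/:?#]+)/i.exec(url);
  return match?.[1]?.toLowerCase();
}

function preflight(
  registry: Map<string, CommandMeta>,
  commandName: string,
  tabUrl: string,
  rawInput: Record<string, unknown>,
  isLoggedIn: boolean, // stands in for the result of checkLogin
): Preflight {
  // 1. Resolve command metadata.
  const meta = registry.get(commandName);
  if (!meta) return { ok: false, code: "unknown_command" };

  // 2. The active tab must be inside the declared site scope.
  const host = hostOf(tabUrl);
  if (!host || !meta.scopeHosts.includes(host)) {
    return { ok: false, code: "site_scope_mismatch" };
  }

  // 3. Keep only declared fields, then enforce inputAtLeastOneOf.
  const input: Record<string, unknown> = {};
  for (const field of meta.inputFields) {
    if (rawInput[field] !== undefined) input[field] = rawInput[field];
  }
  const anyOf = meta.inputAtLeastOneOf;
  if (anyOf && !anyOf.some((f) => input[f] !== undefined)) {
    return { ok: false, code: "invalid_input" };
  }

  // 4. No credential automation: hand off to the user instead.
  if (meta.requiresAuth && !isLoggedIn) {
    return { ok: false, code: "manual_login_required" };
  }

  // 5. Tell the caller to navigate to preloadHost before executing.
  if (meta.preloadHost && host !== meta.preloadHost) {
    return { ok: true, input, navigateTo: meta.preloadHost };
  }

  // 6. The caller may now invoke the command's execute function.
  return { ok: true, input };
}
```

Returning a tagged result instead of throwing keeps every early exit visible to the caller, which matches the stated goal of failing early with explicit error codes.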

Runtime model (MV3)

The extension uses a Chrome MV3 split runtime so WebSocket continuity does not depend on service worker uptime.

| Component | File | Responsibility |
| --- | --- | --- |
| Background script | background.ts | Command orchestration and browser API access |
| Offscreen client | offscreen-client.ts | Persistent relay WebSocket and heartbeat |

Stream handling is also split by responsibility:

  • Listener transport: generic and site-agnostic. Captures raw network events.
  • Site command adapters: parse raw payloads into shared domain objects.
  • Transport deduplication: suppresses equivalent responses that hybrid capture reports from more than one source.
  • Adapter deduplication: suppresses semantic replay duplicates from site payloads.
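As an illustration of the transport-level deduplication described above, the sketch below drops a response when an equivalent one was already seen from any source within a short window. The event shape, the class name, the window and size limits, and the choice of an FNV-1a body hash are all assumptions made for this example; the page does not say how the extension decides equivalence.

```typescript
// Hypothetical event shape; the page does not define one.
type CapturedResponse = {
  source: string; // illustrative label for the capture source
  url: string;
  status: number;
  body: string;
};

// 32-bit FNV-1a hash, used here only to keep dedup keys small.
// A collision would wrongly drop an event; real code might compare bodies.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16);
}

class TransportDeduper {
  private seen = new Map<string, number>(); // key -> last seen time (ms)
  private windowMs: number;
  private maxEntries: number;

  constructor(windowMs = 2000, maxEntries = 500) {
    this.windowMs = windowMs;
    this.maxEntries = maxEntries;
  }

  // Returns true when the event should be forwarded, false for a duplicate.
  // `source` is deliberately excluded from the key so that the same
  // response seen through two sources collapses into one.
  accept(event: CapturedResponse, now: number): boolean {
    const key = `${event.status} ${event.url} ${fnv1a(event.body)}`;
    const last = this.seen.get(key);

    // Re-insert so the Map's iteration order tracks recency.
    this.seen.delete(key);
    this.seen.set(key, now);

    // Evict the least recently seen key once the table is full.
    if (this.seen.size > this.maxEntries) {
      const oldest = this.seen.keys().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }

    return last === undefined || now - last > this.windowMs;
  }
}
```

Adapter deduplication would sit one layer up and compare parsed domain objects rather than raw payloads, so it is not covered by this sketch.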

Platform guarantees

| Guarantee | Effect |
| --- | --- |
| targetNodeId required | Commands route intentionally, never by implicit default |
| Terminal outcomes preserved | Every command ends as completed, failed, timed_out, or cancelled |
| Per-tab serial / cross-tab parallel | Prevents conflicting tab mutations without sacrificing throughput |
| Pre-ingress redaction | Sensitive values masked before persistence and stream fan-out |
| Site-scoped execution | Command logic cannot run against the wrong domain |
| No credential automation | requiresAuth commands use manual_login_required handoff |
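The per-tab serial, cross-tab parallel guarantee can be pictured as a lock table keyed by node and tab. The sketch below is one possible shape, assuming a lock is taken when a command starts and released when it reaches a terminal outcome; `TabLocks`, its method names, and the key format are hypothetical, and the relay may queue contending commands rather than reject them.

```typescript
// Hypothetical lock table; the relay's real data structures are not
// described on this page.
class TabLocks {
  // "nodeId:tabId" -> commandId currently holding the tab
  private holders = new Map<string, string>();

  private key(nodeId: string, tabId: number): string {
    return `${nodeId}:${tabId}`;
  }

  // Succeeds only when no other command holds this tab.
  // Different tabs never contend with each other.
  tryAcquire(nodeId: string, tabId: number, commandId: string): boolean {
    const key = this.key(nodeId, tabId);
    if (this.holders.has(key)) return false;
    this.holders.set(key, commandId);
    return true;
  }

  // Only the holding command may release, for example once it has ended
  // as completed, failed, timed_out, or cancelled.
  release(nodeId: string, tabId: number, commandId: string): boolean {
    const key = this.key(nodeId, tabId);
    if (this.holders.get(key) !== commandId) return false;
    this.holders.delete(key);
    return true;
  }
}

// Usage: two commands on one tab serialize, a second tab runs in parallel.
const demoLocks = new TabLocks();
demoLocks.tryAcquire("node-a", 1, "cmd-1"); // true
demoLocks.tryAcquire("node-a", 1, "cmd-2"); // false, tab 1 is busy
demoLocks.tryAcquire("node-a", 2, "cmd-3"); // true, different tab
```

Tying release to the terminal outcome is what makes the terminalization guarantee matter for throughput: a command that never terminated would hold its tab indefinitely.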

Setup and ownership boundaries

otto setup configures the controller side. Controller preferences and tokens are stored in ~/.otto/config.json.

Extension relay URL, pairing code state, and node credentials are stored in chrome.storage.* and are extension-owned. These stores may point at the same relay host, but they remain role-scoped (controller vs node).

Setup is release-driven for end users: extension artifacts come from release assets with checksum verification. Non-interactive mode emits machine-readable JSON; interactive TTY mode provides human-oriented onboarding output.

Source of truth

| Concern | Path |
| --- | --- |
| Protocol contracts | packages/shared-protocol/src/index.ts |
| Relay routing and locks | packages/relay/src/index.ts |
| CLI UX and envelopes | packages/cli/src/index.ts |
| Extension background orchestration | extension/entrypoints/background.ts |
| Offscreen transport lifecycle | extension/src/runtime/offscreen-client.ts |
