Skip to main content

Protocol Reference

This page covers wire-level behavior and command-routing guarantees. If you are implementing product workflows, start with Architecture and the Controller Implementation Guide first, then use this page as the strict contract reference.

Current protocol types are defined in packages/shared-protocol/src/index.ts.

Source-of-truth code paths

ConcernSource
Shared envelope and payload contractspackages/shared-protocol/src/index.ts
Relay protocol handling and routingpackages/relay/src/index.ts
Controller envelope emissionpackages/cli/src/index.ts

Message flow

Envelope contract

Every WebSocket frame uses one envelope shape for stable cross-component correlation.

FieldDescription
protocolVersionCurrently 1.0
messageTypeFrame family: hello, auth, command, result, event, etc.
requestIdPrimary correlation key across controller, relay, and node
timestampISO-8601; enforced against replay skew windows
senderRolecontroller, relay, or node
payloadMessage-specific object

Message families

FamilyPurpose
hello / hello_ackRole and capability negotiation
auth / auth_ackAccess-token authentication
refresh / refresh_ackAccess-token renewal
commandController-issued actions
result / errorTerminal outcomes
eventProgress and listener updates
ping / pongSession liveness
tab_lock / tab_unlockLock lifecycle signals
command_cancelExplicit stream or command cancellation

Listener lifecycle

listener.subscribe and listener.unsubscribe each return normal terminal outcomes (result or error). Streaming data is delivered later as event frames tied to the original subscribe requestId.

ActionRequired payloadTerminal behavior
listener.subscribelistener, optional optionsImmediate result or error
listener.unsubscribetargetRequestIdImmediate result or error

After successful unsubscribe, further updates for that subscribe requestId are rejected with listener_not_found.

network.http_intercept options

OptionRequiredDescription
tabSessionIdYesMust resolve to an active managed session
siteYesNormalized to lowercase; validated against tab URL
urlPatternsNoGlob filters
requestHostAllowlistNoExplicit cross-host allowlist
modeNonetwork, fetch, or hybrid (default network)
includeBodyNoDefault true
includeHeadersNoDefault false; sensitive headers are redacted
maxBodyBytesNoDefault 256000; positive numeric only
mimeTypesNoMIME prefix allowlist
streamAdapterNoCommand-owned adapter hint
selfUserIdNoCommand-owned context value

Listener update shape

Listener updates are messageType=event frames with payload.type=listener_update and requestId equal to the original subscribe request. payload.data carries either raw transport payloads or command-owned shared-domain objects.

Shared object discriminators: chat.message, chat.typing, chat.participant, chat.message_deleted, content.article, content.post, content.post_comment.

Command contract

Every command payload must identify a target node and include replay protection fields.

FieldRequiredDescription
targetNodeIdYesRequired by protocol invariant
tabSessionIdConditionalRequired for tab-scoped actions
actionYesAction ID including command and listener actions
payloadYesAction-specific data
timeoutMsNoRelay timeout budget
waitPolicyNofail_fast or wait_with_timeout
idempotencyKeyNoReplay dedupe key at runtime
replayNonceYesRequired for acceptance

Command actions

ActionDescription
command.listAdvertise site command metadata
command.runExecute command logic
command.testExecute test hook; falls back to execute when none declared
command.reddit_feedLegacy alias for command.run with site=reddit.com, command=getFeed
primitive.tab.openOpen a managed tab
primitive.tab.closeClose a managed tab
primitive.tab.navigateNavigate a managed tab
primitive.tab.queryQuery managed tab state
primitive.dom.extract_textExtract visible text from tab
primitive.dom.extract_htmlExtract page HTML
primitive.dom.extract_clean_htmlExtract DOM with semantic attributes, scripts/styles removed
primitive.dom.extract_distilled_htmlExtract distilled HTML (readability)
primitive.dom.extract_markdownExtract Markdown representation
primitive.page.screenshotScreenshot the tab or a URL

command.list descriptor metadata

command.list returns command descriptors that include command metadata used by controllers for validation, preload behavior, and timeout planning.

FieldRequiredDescription
siteYesSite scope the command is valid for
idYesCommand id within the site scope
displayNameYesHuman-readable command name
descriptionYesCommand summary
requiresAuthNoIndicates manual-login handoff may be required
preloadHostNoPreferred URL host for auto-open flows
inputFieldsNoInput schema used for command input validation
timeoutPolicyNoCommand timeout hints for controllers

When provided, timeoutPolicy supports both fixed and input-scaled timeout guidance:

timeoutPolicy fieldRequiredDescription
defaultMsNoSuggested default timeout for this command
scaling.inputFieldNoInput field name used for scaling
scaling.baseMsNoBase timeout budget before scaling
scaling.perUnitMsNoAdditional timeout budget per input unit
scaling.minMsNoLower clamp for resolved timeout
scaling.maxMsNoUpper clamp for resolved timeout

Controllers may use this metadata to compute effective timeout budgets while preserving explicit user-provided timeout overrides.

Content extraction is also exposed through first-class controller interfaces:

  • CLI: otto extract-content [url] --format markdown|distilled_html|clean_html|raw_html|text (defaults to markdown)
  • MCP: otto_extract_content tool with format selector and shared targeting (url or tabSession)

These interfaces map to the primitive DOM actions above and provide one consolidated surface for agents and CLI users.

command.test may return a stream manifest in result.payload.data. Controllers should keep follow-up subscribe traffic on the same authenticated WebSocket, maintain heartbeat (ping/pong) for long sessions, and use command_cancel against the original test requestId when shutting down active stream tests.

Routing, queueing, and reliability

  • Relay routes each command to exactly one node session and tracks requestId ownership so results always return to the originating controller.
  • Same-tab work (targetNodeId:tabSessionId) is FIFO; cross-tab work is parallel.
  • Queue depth limits are enforced per tab and per controller.
  • Commands waiting under wait_with_timeout can terminate as queue_wait_timed_out.
  • Timeout windows produce terminal timeout outcomes; node disconnects produce node_disconnected; lock contention produces lock_conflict.
  • Replay nonces and timestamp skew windows are enforced on ingress.
  • Controller disconnect cleanup purges owned queued work and triggers owner-scoped tab cleanup.

Tab ownership and cleanup

  • Relay injects internal tab ownership metadata when forwarding controller-created tab-open commands.
  • Node stores ownership by tabSessionId.
  • On controller disconnect or heartbeat timeout, relay dispatches primitive.tab.close_owned to close only tabs owned by that controller identity.
  • Lock keys are targetNodeId:tabSessionId; only one controller can hold a lock at a time.
  • Lease expiration auto-releases locks. Lock events include lease metadata (lockOwnerControllerId, lockLeaseMs, lockExpiresAt) for observability.

Versioning

Current version: 1.0. Additive changes are preferred; breaking changes require a new major version.

command.reddit_feed alias is maintained during migration to command.run.

Next steps