Architecture - OpenAnalyst CLI Docs

Runtime Shape

OpenAnalyst is built around one shared agent runtime. The default local TUI, backend-only CLI mode, and hosted session API all drive that same core conversation loop so prompts, tool calls, and session state stay consistent.

High-level flow

User input
  -> Session runtime
  -> Model response / tool planning
  -> Tool execution
  -> Streaming events
  -> Persistent session state

Sessions

Each conversation runs inside a persistent session. A session carries the conversation history, working context, selected model, permission state, and runtime metadata needed to continue later without losing continuity.

Field	Purpose
`user_id`	Identifies the owning user or tenant
`session_id`	Stable handle for the running agent session
`conversation_id`	Tracks the conversation thread across turns and resume events
`timestamps`	Creation, update, and resume timing for persistence and auditing

Streaming Events

The hosted API streams the same kind of runtime information the TUI uses locally: assistant text deltas, tool calls, tool results, usage updates, and state transitions.

POST /sessions
POST /sessions/{id}/message
GET  /sessions/{id}/events

Hosted Sessions

The canonical remote contract is session-based, not stateless chat-completions only. That keeps tool calls, resume behavior, and conversation persistence aligned with the local CLI experience.

Tool Execution

OpenAnalyst can expose different tool surfaces depending on where it runs. Local interactive sessions can use the full workstation-aware toolset. Hosted backend sessions can be restricted to advisory or hosted-safe tools so remote callers do not gain unintended host-machine access.

Persistence

Conversation state is stored so users can resume work later, switch between local and hosted usage, and keep isolated histories per user and session. In hosted deployments, the database is the source of truth for session metadata and conversation continuity.

Deployment Modes

Mode	Behavior
`openanalyst`	Starts the default interactive TUI
`openanalyst --notui`	Runs backend-only CLI behavior without the interactive UI
`openanalyst --serve`	Exposes hosted session endpoints for remote users

Automatic Provider Failover

When a provider fails due to a rate limit, timeout, or unexpected error, the engine automatically attempts the next configured provider. This happens transparently — the caller never sees a provider-level failure unless every configured provider is exhausted.

Each provider is only attempted if its API key is configured
Retries use intelligent backoff to avoid overwhelming providers
The chain resets on the next request — a previously-failed provider is tried again
Result: automatic high availability as long as at least one provider is healthy

All 8 supported providers participate in the failover chain: OpenAnalyst, Anthropic, OpenAI, xAI, OpenRouter, Bedrock, Gemini, and Ollama (local). The engine determines the optimal order based on your configuration and availability.

Tip

Configure multiple provider keys for maximum resilience. The failover chain costs nothing when the primary provider is healthy — backup providers are only contacted on failure.

Session Isolation

Every session is fully isolated per user. The persistent store enforces strict boundaries — cross-tenant data access is impossible by design.

Per-user scoping — all data is partitioned by user identity, ensuring no tenant can see another tenant's sessions or conversations
Concurrent access — the engine handles multiple simultaneous sessions safely without data corruption
Clean deletion — removing a session automatically cleans up all related data (conversations, messages, metadata)
Duplicate protection — the engine prevents race conditions when multiple requests target the same session simultaneously

Each session carries the following information:

session_id — stable handle for the running agent session
conversation_id — tracks the conversation thread across turns and resume events
user_id — identifies the owning user or tenant
timestamps — creation, update, and resume timing
configuration — model, provider, permission mode, and runtime settings

Tenant Architecture

The server supports two tenant modes. Choose single-tenant for personal use and development, or multi-tenant when deploying for teams and enterprises.

Single-Tenant (Default)

No authentication required — all requests default to the "owner" user
One user, one server — the simplest deployment model
Tools operate in the server's working directory
Ideal for personal use, local development, and CI/CD pipelines

Multi-Tenant (Enterprise)

Pass X-Tenant-ID header on every request to identify the tenant
Each tenant gets isolated sessions, conversations, and workspace
Tenant workspaces are created at ~/.openanalyst/server/workspaces/{tenant_id}/
No authentication baked in — wrap with your own auth layer (IAM, OAuth, API gateway)
Designed so enterprises can add their own rate limiting, billing, and access control

Tenant routing

Single-Tenant:
  Client → Server (owner) → AI Engine → Tools (server CWD)

Multi-Tenant:
  Client A [X-Tenant-ID: alpha] → Server → AI Engine → Tools (workspaces/alpha/)
  Client B [X-Tenant-ID: beta]  → Server → AI Engine → Tools (workspaces/beta/)

Enterprise Deployment

Multi-tenant mode does not include authentication. Place a reverse proxy or API gateway in front of the server to handle auth, then pass the resolved tenant identity via the X-Tenant-ID header.

Sandbox Model

Tool execution is sandboxed to prevent unintended access to the host system. Every tool call runs within the boundaries of the tenant's workspace directory.

Workspace-scoped writes — tools can read and write within the tenant's workspace, but nothing outside it
Protected paths — server files, system files, and other tenants' directories are blocked at the engine level
Directory jail — each tenant's tool execution is confined to their workspace directory
Code execution sandbox — the ExecuteCode tool runs in a sandboxed subprocess with configurable timeout and path protection
Self-contained — complete workspace isolation built into the engine, no third-party sandbox dependency required

Security Note

The sandbox prevents accidental cross-tenant file access, but it is not a hardened security boundary. For high-security deployments, combine the sandbox with OS-level isolation (containers, VMs) and network segmentation.

SSE Streaming

The server uses Server-Sent Events (SSE) for real-time communication. The same event types are consumed by both the TUI and remote API clients, ensuring consistent behavior across interfaces.

Event Type	Description
`snapshot`	Full session state on initial connection
`message`	Complete message (user or assistant)
`assistant_delta`	Incremental text chunk from the model
`tool_call`	Tool invocation with name and arguments
`tool_result`	Tool execution output
`usage`	Token counts and cost tracking
`state`	Session state transitions (thinking, tool_use, idle)
`error`	Error events with message and code

Broadcast channel per session — multiple clients can subscribe to the same session's event stream
Immediate flush — events are sent as they occur, with no buffering
Disconnect detection — the server detects dropped connections via stream timeout

Subscribing to events

GET /sessions/{id}/events
Accept: text/event-stream

data: {"type":"assistant_delta","content":"Hello"}
data: {"type":"tool_call","name":"ReadFile","args":{"path":"src/main.rs"}}
data: {"type":"tool_result","output":"fn main() { ... }"}
data: {"type":"usage","input_tokens":1024,"output_tokens":256}

Auto-Configuration

On startup, the server scans the environment for API keys and automatically configures every provider that has a valid key. No manual provider setup is required.

Environment Variable	Provider
`OPENANALYST_API_KEY`	OpenAnalyst
`ANTHROPIC_API_KEY`	Anthropic
`OPENAI_API_KEY`	OpenAI
`XAI_API_KEY`	xAI
`OPENROUTER_API_KEY`	OpenRouter
`AWS_ACCESS_KEY_ID`	Bedrock
`GOOGLE_API_KEY` / `GEMINI_API_KEY`	Gemini

If environment variables are not set, the server also checks these fallback locations:

~/.openanalyst/.env — dotenv-style key-value file
~/.openanalyst/credentials.json — structured credentials file

CORS

The server enables CORS for browser-based API clients by default — all origins and all methods are permitted. For production deployments, restrict allowed origins via a reverse proxy.

Runtime Architecture

Runtime Shape

Sessions

Streaming Events

Tool Execution

Persistence

Deployment Modes

Automatic Provider Failover

Session Isolation

Tenant Architecture

Single-Tenant (Default)

Multi-Tenant (Enterprise)

Sandbox Model

SSE Streaming

Auto-Configuration