Runtime Architecture
A high-level view of how sessions, tools, streaming, persistence, and hosted execution fit together.
Runtime Shape
OpenAnalyst is built around one shared agent runtime. The default local TUI, backend-only CLI mode, and hosted session API all drive that same core conversation loop so prompts, tool calls, and session state stay consistent.
User input
-> Session runtime
-> Model response / tool planning
-> Tool execution
-> Streaming events
-> Persistent session state
Sessions
Each conversation runs inside a persistent session. A session carries the conversation history, working context, selected model, permission state, and runtime metadata needed to resume later without losing context.
| Field | Purpose |
|---|---|
| user_id | Identifies the owning user or tenant |
| session_id | Stable handle for the running agent session |
| conversation_id | Tracks the conversation thread across turns and resume events |
| timestamps | Creation, update, and resume timing for persistence and auditing |
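The fields in the table could be modeled roughly as follows. This is an illustrative sketch only: the class name `SessionRecord`, the split of `timestamps` into three attributes, and the use of UUIDs for identifiers are assumptions, not the documented schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class SessionRecord:
    user_id: str  # owning user or tenant
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    conversation_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    resumed_at: Optional[datetime] = None  # set on the first resume

    def resume(self) -> None:
        """Mark the session as resumed without changing its identifiers."""
        now = _now()
        self.resumed_at = now
        self.updated_at = now
```

The identifiers stay fixed across `resume()` calls, which is what lets a client pick up the same conversation thread later.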
Streaming Events
The hosted API streams the same kind of runtime information the TUI uses locally: assistant text deltas, tool calls, tool results, usage updates, and state transitions.
POST /sessions
POST /sessions/{id}/message
GET /sessions/{id}/events
Tool Execution
OpenAnalyst can expose different tool surfaces depending on where it runs. Local interactive sessions can use the full workstation-aware toolset. Hosted backend sessions can be restricted to advisory or hosted-safe tools so remote callers do not gain unintended host-machine access.
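One way to picture mode-dependent tool surfaces is a registry filtered by a safety flag. The sketch below is hypothetical: `Tool`, `REGISTRY`, `tool_surface`, the `hosted_safe` flag, and the `Summarize` tool are invented for illustration, and the flags assigned to `ReadFile` and `ExecuteCode` are assumptions rather than OpenAnalyst's real policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tool:
    name: str
    hosted_safe: bool  # assumption: may be offered to remote callers


# Illustrative registry; the flags are made up for this example.
REGISTRY = [
    Tool("ReadFile", hosted_safe=True),
    Tool("ExecuteCode", hosted_safe=False),
    Tool("Summarize", hosted_safe=True),
]


def tool_surface(mode: str) -> List[str]:
    """Return the tool names exposed in the given runtime mode."""
    if mode == "local":
        return [t.name for t in REGISTRY]
    if mode == "hosted":
        return [t.name for t in REGISTRY if t.hosted_safe]
    raise ValueError(f"unknown mode: {mode!r}")
```

Filtering at registration time, rather than rejecting calls later, means a hosted model never sees tools it is not allowed to invoke.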
Persistence
Conversation state is stored so users can resume work later, switch between local and hosted usage, and keep isolated histories per user and session. In hosted deployments, the database is the source of truth for session metadata and conversation continuity.
Deployment Modes
| Mode | Behavior |
|---|---|
| openanalyst | Starts the default interactive TUI |
| openanalyst --notui | Runs backend-only CLI behavior without the interactive UI |
| openanalyst --serve | Exposes hosted session endpoints for remote users |
Automatic Provider Failover
When a provider call fails because of a rate limit, timeout, or unexpected error, the engine automatically tries the next configured provider. The caller sees a provider-level failure only when every configured provider has been exhausted.
- Each provider is only attempted if its API key is configured
- Retries use intelligent backoff to avoid overwhelming providers
- The chain resets on the next request — a previously-failed provider is tried again
- Result: automatic high availability as long as at least one provider is healthy
All 8 supported providers participate in the failover chain: OpenAnalyst, Anthropic, OpenAI, xAI, OpenRouter, Bedrock, Gemini, and Ollama (local). The engine determines the optimal order based on your configuration and availability.
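The failover behavior described above might look like the following sketch. It is not the engine's code: `complete_with_failover` and `AllProvidersFailed` are hypothetical names, the provider order is simply the order of the list passed in, and exponential backoff with two attempts per provider is an assumed stand-in for whatever backoff policy the engine actually uses.

```python
import time
from typing import Callable, Dict, List, Set, Tuple


class AllProvidersFailed(Exception):
    def __init__(self, errors: Dict[str, Exception]):
        super().__init__("all configured providers failed: " + ", ".join(errors))
        self.errors = errors


def complete_with_failover(
    prompt: str,
    providers: List[Tuple[str, Callable[[str], str]]],
    configured: Set[str],
    retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Try each configured provider in order; return (provider, text)."""
    errors: Dict[str, Exception] = {}
    for name, call in providers:
        if name not in configured:
            continue  # skip providers with no credentials
        for attempt in range(retries):
            try:
                return name, call(prompt)
            except Exception as exc:  # rate limit, timeout, or other error
                errors[name] = exc
                if attempt + 1 < retries:
                    sleep(base_delay * 2 ** attempt)  # assumed exponential backoff
    raise AllProvidersFailed(errors)
```

No failure state is kept between calls, so a provider that failed on one request is tried again on the next, matching the "chain resets" behavior in the list above.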
Session Isolation
Every session is fully isolated per user. The persistent store enforces strict boundaries — cross-tenant data access is impossible by design.
- Per-user scoping — all data is partitioned by user identity, ensuring no tenant can see another tenant's sessions or conversations
- Concurrent access — the engine handles multiple simultaneous sessions safely without data corruption
- Clean deletion — removing a session automatically cleans up all related data (conversations, messages, metadata)
- Duplicate protection — the engine prevents race conditions when multiple requests target the same session simultaneously
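The isolation properties listed above can be illustrated with a small SQLite store. This is a sketch under assumptions: the document does not name the database or schema, so the tables, columns, and function names here are hypothetical. It shows per-user scoping on every query, idempotent session creation, and cascading deletion.

```python
import sqlite3
from typing import List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS sessions (
            session_id TEXT PRIMARY KEY,
            user_id    TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS messages (
            id         INTEGER PRIMARY KEY,
            session_id TEXT NOT NULL
                       REFERENCES sessions(session_id) ON DELETE CASCADE,
            content    TEXT NOT NULL
        );
        """
    )
    return db


def create_session(db: sqlite3.Connection, user_id: str, session_id: str) -> bool:
    """Create the session once; a duplicate request is ignored, not an error."""
    cur = db.execute(
        "INSERT OR IGNORE INTO sessions (session_id, user_id) VALUES (?, ?)",
        (session_id, user_id),
    )
    db.commit()
    return cur.rowcount == 1


def add_message(db: sqlite3.Connection, user_id: str, session_id: str, content: str) -> None:
    owned = db.execute(
        "SELECT 1 FROM sessions WHERE session_id = ? AND user_id = ?",
        (session_id, user_id),
    ).fetchone()
    if owned is None:
        raise PermissionError("session does not belong to this user")
    db.execute(
        "INSERT INTO messages (session_id, content) VALUES (?, ?)", (session_id, content)
    )
    db.commit()


def list_sessions(db: sqlite3.Connection, user_id: str) -> List[str]:
    rows = db.execute(
        "SELECT session_id FROM sessions WHERE user_id = ? ORDER BY session_id", (user_id,)
    )
    return [r[0] for r in rows]


def delete_session(db: sqlite3.Connection, user_id: str, session_id: str) -> None:
    # Messages are removed by the foreign-key cascade.
    db.execute(
        "DELETE FROM sessions WHERE session_id = ? AND user_id = ?", (session_id, user_id)
    )
    db.commit()
```

Every read and write carries the `user_id` in its `WHERE` clause, so a request scoped to one user has no query path to another user's rows.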
Each session carries the following information:
- session_id: stable handle for the running agent session
- conversation_id: tracks the conversation thread across turns and resume events
- user_id: identifies the owning user or tenant
- timestamps: creation, update, and resume timing
- configuration: model, provider, permission mode, and runtime settings
Tenant Architecture
The server supports two tenant modes. Choose single-tenant for personal use and development, or multi-tenant when deploying for teams and enterprises.
Single-Tenant (Default)
- No authentication required — all requests default to the "owner" user
- One user, one server — the simplest deployment model
- Tools operate in the server's working directory
- Ideal for personal use, local development, and CI/CD pipelines
Multi-Tenant (Enterprise)
- Pass the X-Tenant-ID header on every request to identify the tenant
- Each tenant gets isolated sessions, conversations, and workspace
- Tenant workspaces are created at ~/.openanalyst/server/workspaces/{tenant_id}/
- No authentication baked in; wrap with your own auth layer (IAM, OAuth, API gateway)
- Designed so enterprises can add their own rate limiting, billing, and access control
Single-Tenant:
Client → Server (owner) → AI Engine → Tools (server CWD)
Multi-Tenant:
Client A [X-Tenant-ID: alpha] → Server → AI Engine → Tools (workspaces/alpha/)
Client B [X-Tenant-ID: beta] → Server → AI Engine → Tools (workspaces/beta/)
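The request routing in the diagram can be sketched as a header lookup plus a path join. The helper names `resolve_tenant` and `workspace_for` are hypothetical, and the tenant ID character rule is an assumption added for safety in this example; the document does not specify which characters a tenant ID may contain.

```python
import re
from pathlib import Path
from typing import Mapping

# Workspace root taken from the path documented above.
WORKSPACES = Path.home() / ".openanalyst" / "server" / "workspaces"

# Assumed rule: letters, digits, underscore, hyphen; blocks "../" style IDs.
_TENANT_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def resolve_tenant(headers: Mapping[str, str], multi_tenant: bool) -> str:
    """Return the tenant for a request, or "owner" in single-tenant mode."""
    if not multi_tenant:
        return "owner"
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    tenant = lowered.get("x-tenant-id", "")
    if not _TENANT_RE.fullmatch(tenant):
        raise ValueError("missing or invalid X-Tenant-ID header")
    return tenant


def workspace_for(tenant_id: str, root: Path = WORKSPACES) -> Path:
    """Directory where a tenant's tools run in multi-tenant mode."""
    return root / tenant_id
```

Since no authentication is built in, a check like this only identifies the tenant; verifying that the caller is allowed to claim that ID remains the job of the auth layer in front of the server.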
Sandbox Model
Tool execution is sandboxed to prevent unintended access to the host system. Every tool call runs within the boundaries of the tenant's workspace directory.
- Workspace-scoped writes — tools can read and write within the tenant's workspace, but nothing outside it
- Protected paths — server files, system files, and other tenants' directories are blocked at the engine level
- Directory jail — each tenant's tool execution is confined to their workspace directory
- Code execution sandbox — the ExecuteCode tool runs in a sandboxed subprocess with configurable timeout and path protection
- Self-contained — complete workspace isolation built into the engine, no third-party sandbox dependency required
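The directory-jail idea can be shown with a path check that resolves a requested path and rejects anything outside the workspace. This is a minimal sketch, not the engine's sandbox: `resolve_in_workspace` is a hypothetical helper, and a production jail would also need to handle races between the check and the file access.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Resolve a tool-supplied path, refusing anything outside the workspace."""
    root = workspace.resolve()
    # Joining an absolute path replaces the root, and resolve() collapses
    # ".." segments and symlinks, so both escape routes end up outside root.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {requested}")
    return candidate
```

A tool would call this before every read or write, for example `resolve_in_workspace(workspace, "src/main.rs")`, and treat `PermissionError` as a blocked request.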
SSE Streaming
The server uses Server-Sent Events (SSE) for real-time communication. The same event types are consumed by both the TUI and remote API clients, ensuring consistent behavior across interfaces.
| Event Type | Description |
|---|---|
| snapshot | Full session state on initial connection |
| message | Complete message (user or assistant) |
| assistant_delta | Incremental text chunk from the model |
| tool_call | Tool invocation with name and arguments |
| tool_result | Tool execution output |
| usage | Token counts and cost tracking |
| state | Session state transitions (thinking, tool_use, idle) |
| error | Error events with message and code |
- Broadcast channel per session — multiple clients can subscribe to the same session's event stream
- Immediate flush — events are sent as they occur, with no buffering
- Disconnect detection — the server detects dropped connections via stream timeout
GET /sessions/{id}/events
Accept: text/event-stream
data: {"type":"assistant_delta","content":"Hello"}
data: {"type":"tool_call","name":"ReadFile","args":{"path":"src/main.rs"}}
data: {"type":"tool_result","output":"fn main() { ... }"}
data: {"type":"usage","input_tokens":1024,"output_tokens":256}
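A client consuming this stream needs to split it into events and decode each JSON payload. The sketch below assumes standard SSE framing, where each event's `data:` lines are terminated by a blank line, and assumes every payload is a JSON object with a `type` field as in the sample above. `iter_events` and `collect_text` are hypothetical helper names.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded JSON payloads from an iterable of SSE text lines."""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield json.loads("\n".join(data))
                data = []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
    # Per the SSE format, an event with no terminating blank line is dropped.


def collect_text(events: Iterable[Dict[str, Any]]) -> str:
    """Concatenate assistant_delta chunks into the full assistant text."""
    return "".join(e["content"] for e in events if e.get("type") == "assistant_delta")
```

With an HTTP client that yields decoded lines from `GET /sessions/{id}/events`, the same generator can feed a UI incrementally, since it yields each event as soon as its blank-line terminator arrives.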
Auto-Configuration
On startup, the server scans the environment for API keys and automatically configures every provider that has a valid key. No manual provider setup is required.
| Environment Variable | Provider |
|---|---|
| OPENANALYST_API_KEY | OpenAnalyst |
| ANTHROPIC_API_KEY | Anthropic |
| OPENAI_API_KEY | OpenAI |
| XAI_API_KEY | xAI |
| OPENROUTER_API_KEY | OpenRouter |
| AWS_ACCESS_KEY_ID | Bedrock |
| GOOGLE_API_KEY / GEMINI_API_KEY | Gemini |
If environment variables are not set, the server also checks these fallback locations:
- ~/.openanalyst/.env: dotenv-style key-value file
- ~/.openanalyst/credentials.json: structured credentials file
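The detection step can be sketched as a lookup over the variables in the table, with dotenv values used only when the environment does not define a key. The function names are hypothetical, "valid" is approximated here as "non-empty", and `credentials.json` is left out because its structure is not described in this document.

```python
from typing import Dict, List, Mapping

# Variable-to-provider mapping copied from the table above.
ENV_TO_PROVIDER = {
    "OPENANALYST_API_KEY": "OpenAnalyst",
    "ANTHROPIC_API_KEY": "Anthropic",
    "OPENAI_API_KEY": "OpenAI",
    "XAI_API_KEY": "xAI",
    "OPENROUTER_API_KEY": "OpenRouter",
    "AWS_ACCESS_KEY_ID": "Bedrock",
    "GOOGLE_API_KEY": "Gemini",
    "GEMINI_API_KEY": "Gemini",
}


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def detect_providers(environ: Mapping[str, str], dotenv_text: str = "") -> List[str]:
    """Return providers that have a non-empty key; environment values win."""
    merged = {**parse_dotenv(dotenv_text), **environ}
    found: List[str] = []
    for var, provider in ENV_TO_PROVIDER.items():
        if merged.get(var) and provider not in found:
            found.append(provider)
    return found
```

In practice this would be called with `os.environ` and the contents of `~/.openanalyst/.env` if that file exists.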