Server API - OpenAnalyst CLI Docs

Getting Started

Start OpenAnalyst in server mode with the --serve flag. The server listens on port 3080 by default and exposes a full REST + SSE API for programmatic access to all AI agent capabilities.

Start the server

# Default port (3080)
openanalyst --serve

# Custom port
openanalyst --serve 8080

On startup, the server:

Creates an owner user automatically (single-tenant mode)
Auto-loads provider API keys from environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.)
Enables CORS for browser clients
Requires no authentication by default — zero-friction setup

Quick test

Once the server is running, verify with: curl http://localhost:3080/health — you should see {"status":"ok"}.

Authentication Model

The server supports two operating modes depending on your deployment needs:

Single-Tenant (Default)

No authentication required. All requests default to the owner user. One user, one server, zero friction. Ideal for local development, CI/CD pipelines, and personal automation.

Single-tenant request

# No headers needed — you're the owner
curl http://localhost:3080/health
curl http://localhost:3080/me
curl http://localhost:3080/sessions

Multi-Tenant

Pass the X-Tenant-ID header to isolate sessions, conversations, and workspace directories per tenant. Each tenant gets a sandboxed environment with no access to other tenants' data.

Multi-tenant request

# Each tenant is fully isolated
curl -H "X-Tenant-ID: team-alpha" http://localhost:3080/sessions
curl -H "X-Tenant-ID: team-beta" http://localhost:3080/sessions

Enterprise pattern

The server is designed as a single-tenant core that you wrap with your own auth middleware. Add your own JWT verification, OAuth, or API key validation in front of the server — then pass X-Tenant-ID from your middleware to the server.
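
The wrapping pattern above might look roughly like the following sketch. Only the X-Tenant-ID header comes from the server's documented interface. The API_KEYS table and the resolve_tenant and forward_headers helpers are hypothetical names for illustration, and a real gateway would replace the in-memory table with JWT or OAuth verification.

```python
import hmac

# Hypothetical mapping of client API keys to tenant IDs (illustration only).
# A real gateway would verify a JWT or look keys up in a secret store.
API_KEYS = {
    "key-alpha-123": "team-alpha",
    "key-beta-456": "team-beta",
}


def resolve_tenant(api_key):
    """Return the tenant ID for a client API key, or None if the key is unknown."""
    for known_key, tenant_id in API_KEYS.items():
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(known_key.encode(), api_key.encode()):
            return tenant_id
    return None


def forward_headers(incoming_headers):
    """Build the headers to send upstream to the OpenAnalyst server.

    A client-supplied X-Tenant-ID is never copied through, so callers cannot
    impersonate another tenant; the value is derived from the verified key.
    Assumes incoming_headers is a plain dict with an "Authorization" entry.
    """
    auth = incoming_headers.get("Authorization", "")
    api_key = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
    tenant_id = resolve_tenant(api_key)
    if tenant_id is None:
        raise PermissionError("unknown API key")
    return {"Content-Type": "application/json", "X-Tenant-ID": tenant_id}
```

The design point is that the tenant ID is decided by your middleware, never by the end client.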

API Reference

Complete list of all endpoints exposed by the server.

Health & User

Method	Path	Description
`GET`	`/health`	Health check — returns `{"status":"ok"}`
`GET`	`/me`	Get current user info
`GET`	`/api/ai/account`	Get account details (email, plan, credits)

Provider Management

Method	Path	Description
`GET`	`/me/providers`	List configured provider credentials
`PUT`	`/me/providers/{name}`	Add or update a provider API key

Quick Chat

Method	Path	Description
`POST`	`/v1/chat`	Simple chat — send a message, get SSE response
`POST`	`/v1/query`	Alias for `/v1/chat`

Skills

Method	Path	Description
`GET`	`/v1/skills`	List all available skills
`POST`	`/v1/skills/match`	Find skills matching a query
`POST`	`/v1/skills/execute`	Execute matched skills
`POST`	`/v1/skills/{name}/run`	Run a specific skill by name

Sessions

Method	Path	Description
`POST`	`/sessions`	Create a new persistent session
`GET`	`/sessions`	List all sessions for the user
`GET`	`/sessions/{id}`	Get session details + conversation history
`DELETE`	`/sessions/{id}`	Delete a session and its data
`POST`	`/sessions/{id}/message`	Send a message to a session (SSE response)
`GET`	`/sessions/{id}/events`	Subscribe to real-time session events (SSE)

OpenAnalyst Account

Email + 6-digit OTP login flow. On successful verification the server stores an OAuth credential locally and uses it automatically for inference — clients never see the token.

Method	Path	Description
`POST`	`/me/auth/otp/start`	Send a 6-digit code to an email address
`POST`	`/me/auth/otp/verify`	Exchange the code for a stored OAuth session
`GET`	`/me/auth/status`	Current sign-in state, plan, and credit balance
`POST`	`/me/auth/logout`	Wipe the stored OAuth credential (`204`)
`PUT`	`/me/providers/openanalyst/preferred-source`	Choose `oauth` vs `manual` key (`204`)

Quick Chat

The simplest way to use the server. Send a message and receive a streaming SSE response as the AI generates text. No session management is required: each request stands alone.

curl

curl -X POST http://localhost:3080/v1/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain Rust ownership in 3 sentences"}'

The response is an SSE stream with assistant_delta events as the AI generates text in real time.

Request Fields

Field	Type	Required	Description
`message`	string	Yes	The prompt or question to send to the AI
`model`	string	No	Override the model (e.g., `"gpt-4o"`, `"claude-sonnet-4-6"`)
`system_prompt`	string	No	Custom system prompt for this request
`effort`	string	No	Thinking effort level: `"low"`, `"medium"`, `"high"`, `"max"`
`stream`	boolean	No	Default `true`. Set to `false` for a non-streaming JSON response

With options

curl -X POST http://localhost:3080/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Write a Python web scraper for Hacker News",
    "model": "claude-sonnet-4-6",
    "effort": "high",
    "system_prompt": "You are a senior Python developer. Write clean, typed code."
  }'
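
The same request body can be assembled in Python. The sketch below is one possible helper, not part of any official client: build_chat_payload is a hypothetical name, and the client-side check on effort simply mirrors the four values listed above (the server may accept others).

```python
import json

# Effort levels listed in the field reference above.
ALLOWED_EFFORT = {"low", "medium", "high", "max"}


def build_chat_payload(message, model=None, system_prompt=None,
                       effort=None, stream=None):
    """Assemble the JSON body for POST /v1/chat, omitting unset optional fields."""
    if not message:
        raise ValueError("message is required")
    if effort is not None and effort not in ALLOWED_EFFORT:
        raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORT)}")
    payload = {"message": message}
    optional = {
        "model": model,
        "system_prompt": system_prompt,
        "effort": effort,
        "stream": stream,
    }
    # Keep only the options the caller actually set (False is a valid value).
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def encode_body(payload):
    """Serialize the payload to UTF-8 bytes for use as an HTTP request body."""
    return json.dumps(payload).encode("utf-8")
```

Leaving unset options out of the body lets the server apply its own defaults, such as stream defaulting to true.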

OpenAnalyst Account

The server can sign the user into their OpenAnalyst account directly — no manual API key needed. The flow is email + 6-digit OTP. The resulting access token is stored in the local provider_credentials store with auth_method="oauth", and all subsequent inference calls pick it up automatically. Clients never handle the token themselves; they only ever hit these local endpoints.

Start the OTP flow

curl

curl -X POST http://localhost:3080/me/auth/otp/start \
  -H "Content-Type: application/json" \
  -d '{"email": "alice@example.com"}'

Returns an opaque challenge_id to round-trip into /verify. A 6-digit code is emailed to the user.

Response

{
  "challenge_id": "chl_...",
  "code_delivery": { "destination": "a***@example.com", "delivery_medium": "EMAIL" },
  "expires_at": 1731000000000,
  "rate_limited": false,
  "retry_after_seconds": null
}

If rate_limited is true, wait retry_after_seconds before retrying.
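
A client might interpret the start response along these lines. This is a sketch with hypothetical helper names; it also assumes expires_at is a Unix timestamp in milliseconds, which the sample value suggests but the text does not state.

```python
import time


def seconds_until_retry(start_response):
    """Return how long to wait before calling /me/auth/otp/start again.

    Returns 0 when the response is not rate limited.
    """
    if not start_response.get("rate_limited"):
        return 0
    # retry_after_seconds is null when no wait is required.
    return start_response.get("retry_after_seconds") or 0


def challenge_expired(start_response, now_ms=None):
    """Report whether the OTP challenge has expired.

    Assumption: expires_at is Unix epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms >= start_response["expires_at"]
```

Checking expiry before prompting for the code avoids sending a verify request that cannot succeed.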

Verify the code

curl

curl -X POST http://localhost:3080/me/auth/otp/verify \
  -H "Content-Type: application/json" \
  -d '{
    "email": "alice@example.com",
    "code": "123456",
    "challenge_id": "chl_..."
  }'

On success, the server stores the OAuth credential and returns the same shape as GET /me/auth/status (see below). Inference calls now work without any further client setup.

Check sign-in status

curl

curl http://localhost:3080/me/auth/status

Reports current sign-in state, plan, and credit balance. If the access token is within 60 seconds of expiry, the server refreshes it transparently before responding.

Response

{
  "signed_in": true,
  "email": "alice@example.com",
  "expires_at": "2026-05-18T14:00:00Z",
  "balance": {
    "plan_name": "Pro",
    "plan_limit": 1000,
    "plan_used": 142,
    "plan_remaining": 858,
    "purchases_total": 0,
    "purchases_remaining": 0,
    "total_available": 858
  },
  "manual_key_present": false,
  "preferred_source": "oauth"
}

When signed out, signed_in is false and email, expires_at, and balance are null. manual_key_present reports whether a manual sk-oa-v1-* key is also configured.
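
Because several fields are null when signed out, client code should guard before reading them. The following is a minimal sketch; summarize_auth_status is a hypothetical helper that uses only the field names shown in the response above.

```python
def summarize_auth_status(status):
    """Return a one-line summary of a GET /me/auth/status response body."""
    if not status.get("signed_in"):
        # email, expires_at, and balance are null in this state.
        if status.get("manual_key_present"):
            return "signed out (manual key configured)"
        return "signed out"
    balance = status.get("balance") or {}
    return "{} on {} plan, {} of {} used, source: {}".format(
        status.get("email"),
        balance.get("plan_name"),
        balance.get("plan_used"),
        balance.get("plan_limit"),
        status.get("preferred_source"),
    )
```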

Sign out

curl

curl -X POST http://localhost:3080/me/auth/logout

Wipes the stored OAuth credential. Returns 204 No Content. A manual key (if configured) is left intact.

Choose preferred source

When both an OAuth login and a manual sk-oa-v1-* key are present, this picks which one inference uses. The choice is persisted in provider_credentials.

curl

curl -X PUT http://localhost:3080/me/providers/openanalyst/preferred-source \
  -H "Content-Type: application/json" \
  -d '{"source": "oauth"}'     # or "manual"

Returns 204 No Content.

Session-Based Conversations

For multi-turn conversations with persistent history, use the session endpoints. Sessions preserve the full conversation context, tool results, and agent state across multiple interactions.

Create a session

curl

curl -X POST http://localhost:3080/sessions \
  -H "Content-Type: application/json"

# Response:
# {"session_id": "sess_abc123", "created_at": "2026-04-10T..."}

Send messages

curl

curl -X POST http://localhost:3080/sessions/sess_abc123/message \
  -H "Content-Type: application/json" \
  -d '{"message": "Read the main.rs file and explain the architecture"}'

# Returns SSE stream with assistant_delta, tool_call, tool_result events

Subscribe to real-time events

curl

curl -N http://localhost:3080/sessions/sess_abc123/events

# Long-lived SSE connection — receives all session events in real time

Resume later

curl

# Get full conversation history
curl http://localhost:3080/sessions/sess_abc123

# Continue the conversation
curl -X POST http://localhost:3080/sessions/sess_abc123/message \
  -H "Content-Type: application/json" \
  -d '{"message": "Now refactor that module to use async/await"}'

SSE Event Types

All streaming responses use Server-Sent Events (SSE). Each event carries an event name on its event: line and a JSON payload on its data: line. Connect with any SSE client: EventSource in browsers, curl -N for testing, or any HTTP client with streaming support.

Event	Description
`snapshot`	Initial session state on connect
`message`	Complete message added to conversation
`assistant_delta`	Streaming text chunk from the AI (most frequent event)
`tool_call`	AI invoked a tool (name, arguments)
`tool_result`	Tool execution result (output, duration, errors)
`usage`	Token usage update (input tokens, output tokens, cost)
`state`	Session state change (`thinking`, `tool_use`, `idle`)
`error`	Error occurred during processing

Event Format

Each SSE event follows the standard format with an event: line and a data: line containing JSON:

SSE stream output

event: assistant_delta
data: {"text": "Rust ownership ensures "}

event: assistant_delta
data: {"text": "memory safety without garbage collection."}

event: tool_call
data: {"tool_name": "read_file", "arguments": {"path": "src/main.rs"}}

event: tool_result
data: {"tool_name": "read_file", "output": "fn main() {...}", "duration_ms": 12}

event: usage
data: {"input_tokens": 1250, "output_tokens": 340, "cost_usd": 0.0087}

event: state
data: {"state": "idle"}
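
A stream in this format can be parsed with a few lines of Python. The sketch below assumes the lines have already been decoded to str (for example, by iterating over a streaming HTTP response); parse_sse and collect_text are hypothetical helper names, and the parser handles only the event: and data: fields shown above.

```python
import json


def _field_value(line, prefix):
    """Return the text after 'prefix', dropping one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of decoded SSE lines."""
    event_name, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = _field_value(line, "event:")
        elif line.startswith("data:"):
            data_parts.append(_field_value(line, "data:"))
    if data_parts:
        # Lenient: flush a final event even without a trailing blank line.
        yield event_name, json.loads("\n".join(data_parts))


def collect_text(events):
    """Concatenate the text of all assistant_delta events."""
    return "".join(
        data["text"] for name, data in events if name == "assistant_delta"
    )
```

Passing the example stream above through parse_sse yields one pair per event, and collect_text joins the two assistant_delta chunks into a single sentence.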

Browser client

Use the native EventSource API: const es = new EventSource('/sessions/sess_abc123/events') — events arrive automatically as they happen.

Provider Configuration

Configure AI provider API keys at runtime via the API, or set them as environment variables before starting the server. The server auto-detects all configured providers on startup.

Runtime Configuration

curl

# Add or update a provider key at runtime
curl -X PUT http://localhost:3080/me/providers/anthropic \
  -H "Content-Type: application/json" \
  -d '{"api_key": "sk-ant-api03-..."}'

# List all configured providers
curl http://localhost:3080/me/providers

Environment Variables

Set these before starting the server — they are auto-detected on startup:

Shell

# Set provider keys, then start
ANTHROPIC_API_KEY=sk-ant-... openanalyst --serve

Environment Variable	Provider
`ANTHROPIC_API_KEY`	Anthropic (Claude)
`OPENAI_API_KEY`	OpenAI (GPT-4o, o1, o3)
`GOOGLE_API_KEY` or `GEMINI_API_KEY`	Google Gemini
`XAI_API_KEY`	xAI (Grok)
`OPENROUTER_API_KEY`	OpenRouter
`AWS_ACCESS_KEY_ID`	AWS Bedrock
`OPENANALYST_API_KEY`	OpenAnalyst hosted
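
Before starting the server, a deployment script could check which of these variables are present. The sketch below is a pre-flight check only: configured_providers is a hypothetical helper, and the server's own detection logic may differ (for instance, AWS Bedrock normally needs more than an access key ID).

```python
import os

# Provider names and variables copied from the table above.
PROVIDER_ENV_VARS = {
    "Anthropic": ("ANTHROPIC_API_KEY",),
    "OpenAI": ("OPENAI_API_KEY",),
    "Google Gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
    "xAI": ("XAI_API_KEY",),
    "OpenRouter": ("OPENROUTER_API_KEY",),
    "AWS Bedrock": ("AWS_ACCESS_KEY_ID",),
    "OpenAnalyst": ("OPENANALYST_API_KEY",),
}


def configured_providers(environ=None):
    """List providers with at least one non-empty variable in the environment."""
    if environ is None:
        environ = os.environ
    return [
        name
        for name, variables in PROVIDER_ENV_VARS.items()
        if any(environ.get(var) for var in variables)
    ]
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests.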

Sandbox & Security

The server enforces a strict sandbox model to prevent unintended access to the host system.

Owner Mode (Default)

Tools operate in the server's working directory. Full access to the local filesystem within the workspace.

Tenant Mode

Each tenant gets an isolated workspace at ~/.openanalyst/server/workspaces/{tenant_id}/. No cross-tenant access.

Sandboxed Permissions

Tools can read and write within the workspace but cannot modify server files, system files, or other tenants' directories.
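
To make the workspace boundary concrete, the following sketch shows one common way to test whether a path stays inside a tenant workspace. It illustrates the idea only and is not the server's implementation; tenant_workspace and is_inside are hypothetical helpers, and only the directory layout comes from the text above.

```python
from pathlib import Path

# Directory layout described above for tenant mode.
WORKSPACES_ROOT = Path.home() / ".openanalyst" / "server" / "workspaces"


def tenant_workspace(tenant_id, root=WORKSPACES_ROOT):
    """Return the workspace directory for a tenant, rejecting unsafe IDs."""
    unsafe = (
        not tenant_id
        or "/" in tenant_id
        or "\\" in tenant_id
        or tenant_id in (".", "..")
    )
    if unsafe:
        raise ValueError("invalid tenant id")
    return Path(root) / tenant_id


def is_inside(workspace, candidate):
    """Report whether candidate, resolved against workspace, stays within it."""
    workspace = Path(workspace).resolve()
    # Joining with an absolute candidate discards the workspace prefix,
    # so absolute paths outside the workspace are rejected as well.
    target = (workspace / candidate).resolve()
    return target == workspace or workspace in target.parents
```

Resolving both paths before comparing them is what defeats `..` segments and symlinked prefixes.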

Available Tools

The server exposes 22 tools to the AI agent, all operating within the sandbox boundary:

File operations — read_file, write_file, edit_file
Search — grep_search, glob_search
Execution — bash, powershell, repl
Web — web_search, web_fetch
Code execution — Python, JavaScript, and shell code in a jailed subprocess
Agent — agent (sub-agent spawning), todo_write, notebook_edit
And more — skill, tool_search, config, structured_output, sleep, send_user_message

ExecuteCode sandbox

Python, JavaScript, and shell code execution runs in a jailed subprocess with restricted filesystem access. The subprocess cannot escape the workspace boundary.

Building on Top

Developer Guide

The server is designed as a single-tenant core that enterprises can wrap with their own authentication, authorization, and multi-tenancy. Add your own auth middleware, rate limiting, and user management — the server handles all AI agent capabilities.

Here is a minimal Python wrapper to get started:

Python client

import requests

class OpenAnalystClient:
    """Thin wrapper around the server's chat and session endpoints."""

    def __init__(self, base_url="http://localhost:3080"):
        self.base = base_url

    def chat(self, message, model=None, effort=None):
        # One-off request; returns the streaming response for the caller to read
        payload = {"message": message}
        if model: payload["model"] = model
        if effort: payload["effort"] = effort
        return requests.post(f"{self.base}/v1/chat", json=payload, stream=True)

    def create_session(self):
        # Returns the parsed JSON body, which includes "session_id"
        return requests.post(f"{self.base}/sessions").json()

    def send_message(self, session_id, message):
        # Returns the streaming SSE response for this turn
        return requests.post(
            f"{self.base}/sessions/{session_id}/message",
            json={"message": message},
            stream=True
        )

    def get_history(self, session_id):
        # Session details plus the full conversation history
        return requests.get(f"{self.base}/sessions/{session_id}").json()


# Usage
client = OpenAnalystClient()
session = client.create_session()
response = client.send_message(session["session_id"], "Analyze this codebase")

JavaScript client

class OpenAnalystClient {
  constructor(baseUrl = 'http://localhost:3080') {
    this.base = baseUrl;
  }

  async chat(message, options = {}) {
    return fetch(`${this.base}/v1/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message, ...options })
    });
  }

  subscribeToSession(sessionId, onEvent) {
    const es = new EventSource(`${this.base}/sessions/${sessionId}/events`);
    es.addEventListener('assistant_delta', e => onEvent('delta', JSON.parse(e.data)));
    es.addEventListener('tool_call', e => onEvent('tool', JSON.parse(e.data)));
    // 'error' fires for connection failures as well as for server-sent error events
    es.addEventListener('error', e => onEvent('error', e));
    return es;
  }
}

Automatic Provider Failover

OpenAnalyst automatically fails over between configured providers. If a provider is rate-limited, returns an error, or times out, the next configured provider is tried transparently, so a single provider's outage does not interrupt your API consumers.

All 8 supported providers participate in the failover: OpenAnalyst, Anthropic, OpenAI, xAI, OpenRouter, Bedrock, Gemini, and Ollama (local). The engine determines the failover order based on your configuration and provider availability. Client code does not need to handle individual provider errors: the server retries automatically and returns the response from whichever provider succeeds.
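
Conceptually, failover is an ordered retry across providers. The sketch below illustrates that idea only and is not the server's implementation: ProviderError and call_with_failover are hypothetical names, and the real engine chooses the order itself.

```python
class ProviderError(Exception):
    """Raised when a provider call is rate-limited, errors, or times out."""


def call_with_failover(providers, prompt):
    """Try each (name, call) pair in order and return the first success.

    Returns a (provider_name, result) tuple. Raises ProviderError with a
    summary of every failure if no provider succeeds.
    """
    failures = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider.
            failures.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(failures))
```

Even with failover, a loop like this can exhaust every provider, which is why a streaming client should still be prepared to receive an error event.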

Best practice

Configure at least 2–3 providers for production deployments. If your primary provider (e.g., Anthropic) hits rate limits during peak traffic, the server fails over to your secondary (e.g., OpenAI) with no client-side changes.