Symposium

Collaborative AI built collaboratively


About Symposium

Symposium is a set of components that make AI agents work better. These components are compatible with any ACP-based editing system, including Zed and VSCode (with our plugin); support for IntelliJ and NeoVim is coming soon.

We focus on three areas:

Rust Expertise

Built-in capabilities for understanding the Rust language, its errors, and its idioms.

rust-crate-sources tool
IDE operations
Error explanations

Ecosystem-Powered Knowledge

Crate authors provide specialized AI tooling through Cargo.toml metadata, bringing domain knowledge directly to the agent.

Crate-provided skills, context, and capabilities

Rich Collaboration

Interactive patterns for human-AI partnership beyond simple text exchanges.

Sparkle collaborative patterns
Walkthroughs
Taskspaces

Open Source Community

Symposium is an open-source project and we are actively soliciting contributors. We welcome users as well, but given the exploratory nature of Symposium, expect frequent changes. Currently, the only way to install Symposium is from source.

We maintain a code of conduct and operate as an independent community focused on exploring what AI has to offer for software development.

Rust Crate Sources Tool

The rust-crate-sources MCP tool allows AI agents to access the source code of published Rust crates from crates.io.

Just like humans, AI agents work best when they have a few examples of what to do rather than having to sort through reams of documentation. Often the best way to provide this is to expose them to the crate's source code, since well-maintained crates come equipped with examples and usage patterns.

When an agent needs to understand how a particular crate works, it can fetch and examine the actual implementation, including any examples or tests the crate provides.

Rich Collaboration

Symposium integrates with Sparkle, a framework for AI collaboration that creates partnership dynamics through intuitive signals, partnership behaviors, and meta-collaboration tools.

For complete information about Sparkle patterns and capabilities, see the Sparkle documentation.

Implementation Overview

Symposium appears to external clients as a single ACP proxy, but internally uses a conductor to orchestrate a dynamic chain of component proxies. This architecture allows Symposium to adapt to different client capabilities and provide consistent functionality regardless of what the editor or agent natively supports.

Architecture

External View

From the outside, Symposium is a standard ACP proxy that sits between an editor and an agent:

flowchart LR
    Editor --> Symposium --> Agent

Internal Structure

Internally, Symposium runs a conductor in proxy mode that orchestrates multiple component proxies:

flowchart LR
    Editor --> S[Symposium Conductor]
    S --> C1[Component 1]
    C1 --> A1[Adapter 1]
    A1 --> C2[Component 2]
    C2 --> Agent

The conductor dynamically builds this chain based on what capabilities the editor and agent provide.

Component Pattern

Some Symposium features are implemented as component/adapter pairs:

Components

Components provide functionality to agents through MCP tools and other mechanisms. They:

Expose high-level capabilities (e.g., Dialect-based IDE operations)
May rely on primitive capabilities from upstream (the editor)
Are always included in the chain when their functionality is relevant

Adapters

Adapters "shim" for missing primitive capabilities by providing fallback implementations. They:

Check whether required primitive capabilities exist upstream
Provide the capability if it's missing (e.g., spawn rust-analyzer to provide IDE operations)
Pass through transparently if the capability already exists
Are conditionally included only when needed

Capability-Driven Assembly

During initialization, Symposium:

Receives capabilities from the editor - examines what the upstream client provides
Queries the agent - discovers what capabilities the downstream agent supports
Builds the proxy chain - spawns components and adapters based on detected gaps and opportunities
Advertises enriched capabilities - tells the editor what the complete chain provides

This approach allows Symposium to work with minimal ACP clients (by providing fallback implementations) while taking advantage of native capabilities when available (by passing through directly).
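
The assembly steps above can be illustrated with a small sketch. It is not Symposium's actual implementation (the conductor is part of the Rust codebase), and every name in it (`ComponentSpec`, `buildChain`, the component and adapter names) is hypothetical; only the `ide_operations` capability name comes from this documentation. It assumes each component declares at most one required primitive capability and at most one adapter that can supply it.

```typescript
// Hypothetical types and names; a sketch of capability-driven assembly only.
type Capability = string;

interface ComponentSpec {
  name: string;
  // Primitive capability this component needs from upstream, if any.
  requires?: Capability;
  // Adapter that can supply `requires` when the editor does not.
  adapter?: string;
}

// Returns proxy names in chain order, editor side first.
function buildChain(
  editorCapabilities: ReadonlySet<Capability>,
  components: readonly ComponentSpec[],
): string[] {
  const chain: string[] = [];
  for (const component of components) {
    const missing =
      component.requires !== undefined &&
      !editorCapabilities.has(component.requires);
    if (missing) {
      if (component.adapter === undefined) {
        // Assumption: with no fallback available, the component is left out.
        continue;
      }
      // The adapter sits upstream of (closer to the editor than) its component.
      chain.push(component.adapter);
    }
    chain.push(component.name);
  }
  return chain;
}

const specs: ComponentSpec[] = [
  { name: "crate-sources" }, // standalone: no upstream requirements
  {
    name: "ide-operations",
    requires: "ide_operations",
    adapter: "ide-operations-adapter",
  },
];

// A minimal editor gets the adapter; a capable editor does not need it.
const minimalEditorChain = buildChain(new Set<Capability>(), specs);
const richEditorChain = buildChain(new Set<Capability>(["ide_operations"]), specs);
```

With these inputs, the minimal editor's chain includes the adapter ahead of the IDE operations component, while the richer editor's chain omits it.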

For detailed information about the initialization sequence and capability negotiation, see Initialization Sequence.

Components

Symposium's functionality is delivered through component proxies that are orchestrated by the internal conductor. Some features use a component/adapter pattern while others are standalone components.

Component Types

Standalone Components

Some components provide functionality that doesn't depend on upstream capabilities. These components work with any editor and add features purely through the proxy layer.

Example: A component that provides git history analysis through MCP tools doesn't need special editor support - it can work with the filesystem directly.

Component/Adapter Pairs

Other components rely on primitive capabilities from the upstream editor. For these, Symposium uses a two-layer approach:

Adapter Layer

The adapter sits upstream in the proxy chain and provides primitive capabilities that the component needs.

Responsibilities:

Check for required capabilities during initialization
Pass requests through if the editor provides the capability
Provide fallback implementation if the capability is missing
Abstract away editor differences from the component

Example: The IDE Operations adapter checks if the editor supports ide_operations. If not, it can spawn a language server (like rust-analyzer) to provide that capability.

Component Layer

The component sits downstream from its adapter and enriches primitive capabilities into higher-level MCP tools.

Responsibilities:

Expose MCP tools to the agent
Process tool invocations
Send requests upstream through the adapter
Return results to the agent

Example: The IDE Operations component exposes an ide_operation MCP tool that accepts Dialect programs and translates them into IDE operation requests sent upstream.

Component Lifecycle

For component/adapter pairs:

Initialization - Adapter receives initialize request from upstream (editor)
Capability Check - Adapter examines editor capabilities
Conditional Spawning - Adapter spawns fallback if capability is missing
Chain Assembly - Conductor wires adapter → component → downstream
Request Flow - Agent calls MCP tool → component → adapter → editor
Response Flow - Results flow back: editor → adapter → component → agent

Proxy Chain Direction

The proxy chain flows from editor to agent:

Editor → [Adapter] → [Component] → Agent

Upstream = toward the editor
Downstream = toward the agent

Adapters sit closer to the editor, components sit closer to the agent.

Current Components

Rust Crate Sources

Provides access to published Rust crate source code through an MCP server.

Type: Standalone component
Implementation: Injects an MCP server that exposes the rust-crate-sources tool
Function: Allows agents to fetch and examine source code from crates.io

Sparkle

Provides an AI collaboration framework through prompt injection and MCP tooling.

Type: Standalone component
Implementation: Injects Sparkle MCP server with collaboration tools
Function: Enables partnership dynamics, pattern anchors, and meta-collaboration capabilities
Documentation: Sparkle docs

Future Components

Additional components can be added following these patterns:

IDE Operations - Code navigation and search (likely component/adapter pair)
Walkthroughs - Interactive code explanations
Git Operations - Repository analysis
Build Integration - Compilation and testing workflows

Rust Crate Sources Component

The Rust Crate Sources component provides agents with the ability to research published Rust crate source code through a sub-agent architecture.

Architecture Overview

The component uses a sub-agent research pattern: when an agent needs information about a Rust crate, the component spawns a dedicated research session with its own agent to investigate the crate sources and return findings.

Message Flow

sequenceDiagram
    participant Client
    participant Proxy as Crate Sources Proxy
    participant Agent

    Note over Client,Proxy: Initial Session Setup
    Client->>Proxy: NewSessionRequest
    Note right of Proxy: Adds user-facing MCP server<br/>(rust_crate_query tool)
    Proxy->>Agent: NewSessionRequest (with user-facing MCP)
    Agent-->>Proxy: NewSessionResponse(session_id)
    Proxy-->>Client: NewSessionResponse(session_id)

    Note over Agent,Proxy: Research Request
    Agent->>Proxy: ToolRequest(rust_crate_query, crate, prompt)
    Note right of Proxy: Create research session
    Proxy->>Agent: NewSessionRequest (with sub-agent MCP)
    Note right of Proxy: Sub-agent MCP has:<br/>- get_rust_crate_source<br/>- return_response_to_user
    Agent-->>Proxy: NewSessionResponse(research_session_id)
    Proxy->>Agent: PromptRequest(research_session_id, prompt)
    
    Note over Agent: Sub-agent researches crate<br/>Uses get_rust_crate_source<br/>Reads files (auto-approved)
    
    Agent->>Proxy: RequestPermissionRequest(Read)
    Proxy-->>Agent: RequestPermissionResponse(approved)
    
    Agent->>Proxy: ToolRequest(return_response_to_user, findings)
    Proxy-->>Agent: ToolResponse(success)
    Note right of Proxy: Response sent via internal channel
    Proxy-->>Agent: ToolResponse(rust_crate_query result)

Two MCP Servers

The component provides two distinct MCP servers:

User-facing MCP Server - Exposed to the main agent session
- Tool: rust_crate_query - Initiates crate research
Sub-agent MCP Server - Provided only to research sessions
- Tool: get_rust_crate_source - Locates crate sources and returns path
- Tool: return_response_to_user - Returns research findings and ends the session

User-Facing Tool: `rust_crate_query`

Parameters

{
  crate_name: string,      // Name of the Rust crate
  crate_version?: string,  // Optional semver range (defaults to latest)
  prompt: string           // What to research about the crate
}

Examples

{
  "crate_name": "serde",
  "prompt": "How do I use the derive macro for custom field names?"
}

{
  "crate_name": "tokio",
  "crate_version": "1.0",
  "prompt": "What are the signatures of all methods on tokio::runtime::Runtime?"
}

Behavior

Creates a new research session via NewSessionRequest
Attaches the sub-agent MCP server to that session
Sends the user's prompt via PromptRequest
Waits for the sub-agent to call return_response_to_user
Returns the sub-agent's findings as the tool result

Sub-Agent Tools

`get_rust_crate_source`

Locates and extracts the source code for a Rust crate from crates.io.

Parameters:

{
  crate_name: string,
  version?: string  // Semver range
}

Returns:

{
  "crate_name": "serde",
  "version": "1.0.210",
  "checkout_path": "/Users/user/.cargo/registry/src/.../serde-1.0.210",
  "message": "Crate 'serde' version 1.0.210 extracted to ..."
}

The sub-agent can then use Read tool calls (which are auto-approved) to examine the source code.

`return_response_to_user`

Signals completion of the research and returns findings to the waiting rust_crate_query call.

Parameters:

{
  response: string  // The research findings to return
}

Behavior:

Sends the response through an internal channel to the waiting tool handler
The original rust_crate_query call completes with this response
The research session can then be terminated

Permission Auto-Approval

The component implements a message handler that intercepts RequestPermissionRequest messages from research sessions and automatically approves all permission requests.

Permission Rules

Research sessions → All permissions automatically approved
Other sessions → Passed through unchanged

Rationale

Research sessions are sandboxed and disposable - they investigate crate sources and return findings. Auto-approving all permissions eliminates the need for dozens of permission prompts while maintaining safety:

Research sessions operate on read-only crate sources in the cargo registry cache
Sessions are short-lived and focused on a single research task
Any side effects are contained within the research session's scope

Implementation

The handler checks if a permission request comes from a registered research session and automatically selects the first available option (typically "allow"):

if self.state.is_research_session(&req.session_id) {
    // Select first option (typically "allow").
    // Note: `unwrap()` panics if the request carries no options.
    let response = RequestPermissionResponse {
        outcome: RequestPermissionOutcome::Selected {
            option_id: req.options.first().unwrap().id.clone(),
        },
        meta: None,
    };
    request_cx.respond(response)?;
    return Ok(Handled::Yes);
}
return Ok(Handled::No(message));  // Not our session, propagate unchanged

Session Lifecycle

Agent calls rust_crate_query
- Handler creates oneshot::channel() for response
- Registers session in active sessions map
Handler sends NewSessionRequest
- Includes sub-agent MCP server configuration
- Receives session_id in response
Handler sends PromptRequest
- Sends user's research prompt to the session
- Awaits response on the oneshot channel
Sub-agent performs research
- Calls get_rust_crate_source to locate crate
- Reads source files (auto-approved by permission handler)
- Analyzes code to answer the prompt
Sub-agent calls return_response_to_user
- Sends findings through internal channel
- Original rust_crate_query call receives response
Session cleanup
- Remove session from active sessions map
- Session termination (if ACP supports explicit session end)

Shared State

The component uses shared state to coordinate between:

The rust_crate_query tool handler (creates sessions, waits for responses)
The return_response_to_user tool handler (sends responses)
The permission request handler (auto-approves permission requests from research sessions)

State Structure

struct ResearchSession {
    session_id: SessionId,
    response_tx: oneshot::Sender<String>,
}

// Shared across all handlers (the alias name is illustrative)
type SharedSessions = Arc<Mutex<HashMap<SessionId, ResearchSession>>>;

Design Decisions

Why Sub-Agents Instead of Direct Pattern Search?

Previous approach: The component exposed get_rust_crate_source with a pattern parameter that performed regex searches across crate sources.

Problems:

Agents had to construct exact regex patterns
Limited to simple pattern matching
No semantic understanding of code structure
Single-shot queries couldn't follow up on findings

Sub-agent approach:

The agent describes the information it needs in natural language
Sub-agent can perform multiple reads, follow references, understand context
Can navigate code structure intelligently
Returns synthesized answers, not raw pattern matches

Why Auto-Approve All Permissions?

Research sessions need extensive file access to examine crate sources. Requiring user approval for every operation would create dozens of permission prompts, making the feature unusable.

Safety considerations:

Research sessions are sandboxed and disposable
Scope is limited to investigating crate sources in cargo registry cache
Sessions are short-lived with a focused task
Any side effects are contained within the research session

Why Oneshot Channels for Response Coordination?

Each rust_crate_query call creates exactly one research session and expects exactly one response. A oneshot::channel models this perfectly:

Type-safe guarantee of single response
Clear ownership transfer
Automatic cleanup on drop
No need to poll or maintain complex state

Integration with Symposium

The component is registered with the conductor in symposium-acp/src/lib.rs:

components.push(sacp::DynComponent::new(
    symposium_crate_sources_proxy::CrateSourcesProxy {},
));

The component implements Component::serve() to:

Register the user-facing MCP server via McpServiceRegistry
Implement message handling for permission requests
Forward all other messages to the successor component

Future Enhancements

Session timeouts - Terminate research sessions that take too long
Concurrent research - Support multiple research sessions simultaneously
Caching - Cache common queries to avoid redundant research
Progressive responses - Stream findings as they're discovered rather than waiting for completion
Research history - Allow agents to reference previous research results

VSCode Extension Architecture

The Symposium VSCode extension provides a chat interface for interacting with AI agents. The architecture divides responsibilities across three layers to handle VSCode's webview constraints while maintaining clean separation of concerns.

Components Overview

mynah-ui: AWS's open-source chat interface library (github.com/aws/mynah-ui). Provides the chat UI rendering, tab management, and message display. The webview layer uses mynah-ui for all visual presentation.

Agent: Currently a mock implementation (HomerActor) that responds with Homer Simpson quotes. A future implementation will spawn an ACP-compatible agent process (see the ACP Integration chapter when available).

Extension activation: VSCode activates the extension when the user first opens the Symposium sidebar or runs a Symposium command. The extension spawns the agent process during activation (or lazily on first use) and keeps it alive for the entire VSCode session.

Three-Layer Model

┌─────────────────────────────────────────────────┐
│  Webview (Browser Context)                      │
│  - mynah-ui rendering                           │
│  - User interaction capture                     │
│  - Tab management                               │
└─────────────────┬───────────────────────────────┘
                  │ VSCode postMessage API
┌─────────────────▼───────────────────────────────┐
│  Extension (Node.js Context)                    │
│  - Message routing                              │
│  - Agent lifecycle                              │
│  - Webview lifecycle                            │
└─────────────────┬───────────────────────────────┘
                  │ Process spawning / stdio
┌─────────────────▼───────────────────────────────┐
│  Agent (Separate Process)                       │
│  - Session management                           │
│  - AI interaction                               │
│  - Streaming responses                          │
└─────────────────────────────────────────────────┘

Why Three Layers?

Webview Isolation

VSCode webviews run in isolated browser contexts without Node.js APIs. This security boundary prevents direct file system access, process spawning, or network operations. The webview can only communicate with the extension through VSCode's postMessage API.

Design consequence: UI code must be pure browser JavaScript. All privileged operations (spawning agents, workspace access, persistence) happen in the extension layer.

Extension as Coordinator

The extension runs in Node.js with full VSCode API access. It bridges between the isolated webview and external agent processes.

Key responsibilities:

Message routing - Translates between webview UI events and agent protocol messages
Agent lifecycle - Spawns and manages the agent process
Webview lifecycle - Handles visibility changes and ensures messages reach the UI

The extension deliberately avoids understanding message semantics. It routes based on IDs (tab ID, message ID) without interpreting content.

Agent Independence

The agent runs as a separate process communicating via stdio. This isolation provides:

Flexibility - Agent can be any executable (Rust, Python, TypeScript)
Stability - Agent crashes don't kill the extension
Multiple sessions - Single agent process handles all tabs/conversations

The agent owns all session state and conversation logic. The extension only tracks which tab corresponds to which session.

Communication Boundaries

Webview ↔ Extension

Transport: postMessage API (asynchronous, JSON-serializable messages only)

Direction:

Webview → Extension: User actions (new tab, send prompt, close tab)
Extension → Webview: Agent responses (response chunks, completion signals)

Why not synchronous? VSCode's webview API is inherently asynchronous. This forces the UI to be resilient to message delays and webview lifecycle events.

Extension ↔ Agent

Transport: ACP (Agent Client Protocol) over stdio

Direction:

Extension → Agent: Session commands (new session, process prompt)
Agent → Extension: Streaming responses, session state updates

Why ACP over stdio? ACP provides a standardized protocol for agent communication. Stdio is simple, universal, and works with any language. No need for network sockets or IPC complexity.

The extension uses AgentConfiguration to determine when agent processes can be shared across tabs. An AgentConfiguration consists of:

Agent name (e.g., "ElizACP", "Claude")
Enabled components (e.g., "symposium-acp")
Workspace folder (the VSCode workspace the agent operates in)

Sharing strategy: Tabs with identical configurations share the same agent actor (process), but each tab gets its own session within that process.

Workspace folder selection:

Single workspace: Automatically uses that workspace
Multiple workspaces: Prompts user to select which workspace folder to use
Each session is created with the workspace folder as its working directory

Rationale:

Resource efficiency - Shared actor means one process for multiple tabs with the same config
Workspace isolation - Different workspace folders get different actors to maintain proper working directory context
Session isolation - Each tab gets its own session ID for conversation independence

Trade-off: Agent must implement multiplexing. Messages include session/tab IDs for routing. Extension maps UI tab IDs to agent session IDs.
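
As a rough illustration of the sharing strategy, the sketch below keys agent actors by their configuration and gives each tab its own session. The three `AgentConfiguration` fields follow the list above, but the field names, `configKey`, `ActorPool`, and the `AgentActor` stand-in are assumptions made for this example, not the extension's actual code. It also assumes that the order of enabled components does not matter when comparing configurations.

```typescript
interface AgentConfiguration {
  agentName: string;
  components: string[];
  workspaceFolder: string;
}

// Stand-in for a spawned agent process (hypothetical).
class AgentActor {
  readonly config: AgentConfiguration;
  private sessionCount = 0;

  constructor(config: AgentConfiguration) {
    this.config = config;
  }

  // Session IDs here are only unique within one actor.
  newSession(): string {
    this.sessionCount += 1;
    return `session-${this.sessionCount}`;
  }
}

// Canonical key: two configurations are "identical" when all three parts match.
function configKey(config: AgentConfiguration): string {
  return JSON.stringify([
    config.agentName,
    [...config.components].sort(),
    config.workspaceFolder,
  ]);
}

class ActorPool {
  private actors = new Map<string, AgentActor>();
  private sessions = new Map<string, string>(); // tabId -> sessionId

  openTab(tabId: string, config: AgentConfiguration): AgentActor {
    const key = configKey(config);
    let actor = this.actors.get(key);
    if (actor === undefined) {
      actor = new AgentActor(config); // first tab with this configuration
      this.actors.set(key, actor);
    }
    this.sessions.set(tabId, actor.newSession()); // every tab gets its own session
    return actor;
  }

  sessionFor(tabId: string): string | undefined {
    return this.sessions.get(tabId);
  }

  get actorCount(): number {
    return this.actors.size;
  }
}

const pool = new ActorPool();
const baseConfig: AgentConfiguration = {
  agentName: "ElizACP",
  components: ["symposium-acp"],
  workspaceFolder: "/work/project-a",
};
const actorA = pool.openTab("tab-1", baseConfig);
const actorB = pool.openTab("tab-2", { ...baseConfig });
const actorC = pool.openTab("tab-3", { ...baseConfig, workspaceFolder: "/work/project-b" });
```

Here `tab-1` and `tab-2` share one actor but hold different sessions, while `tab-3` gets a separate actor because its workspace folder differs.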

Design Principles

Opaque state: Each layer owns its state format. Extension stores but doesn't parse webview UI state or agent session state.

Graceful degradation: Webview can be hidden/shown at any time. Extension buffers messages when webview is inactive.

UUID-based identity: Tab IDs and message IDs use UUIDs to avoid collisions. Both are generated at the source (the webview generates a tab ID when a tab is created and a message ID when a prompt is sent), which eliminates coordination overhead.

Minimal coupling: Layers communicate through well-defined message protocols. Webview doesn't know about agents. Agent doesn't know about webviews. Extension coordinates without understanding semantics.

End-to-End Flow

Here's how a complete user interaction flows through the system:

sequenceDiagram
    participant User
    participant VSCode
    participant Extension
    participant Webview
    participant Agent
    
    User->>VSCode: Opens Symposium sidebar
    VSCode->>Extension: activate()
    Extension->>Extension: Generate session ID
    Extension->>Agent: Spawn process
    
    Extension->>Webview: Create webview (inject session ID)
    Webview->>Webview: Load, check session ID vs saved state
    Webview->>Webview: Restore or clear tabs, initialize mynah-ui
    Webview->>Extension: webview-ready (last-seen-index)
    
    User->>Webview: Creates new tab
    Webview->>Webview: Generate tab UUID
    Webview->>Extension: new-tab (tabId)
    Extension->>Agent: new-session
    Agent->>Agent: Initialize session
    Agent->>Extension: session-created (sessionId)
    Extension->>Extension: Store tabId ↔ sessionId mapping
    
    User->>Webview: Sends prompt
    Webview->>Webview: Generate message UUID
    Webview->>Extension: prompt (tabId, messageId, text)
    Extension->>Extension: Lookup sessionId for tabId
    Extension->>Agent: process-prompt (sessionId, text)
    
    loop Streaming response
        Agent->>Extension: response-chunk (sessionId, chunk)
        Extension->>Extension: Lookup tabId for sessionId
        Extension->>Webview: response-chunk (tabId, messageId, chunk)
        Webview->>Webview: Render chunk in mynah-ui
    end
    
    Agent->>Extension: response-complete (sessionId)
    Extension->>Webview: response-complete (tabId, messageId)
    Webview->>Webview: End message stream
    Webview->>Webview: setState() - persist session ID and tabs

The extension maintains tab↔session mappings and handles webview visibility, while the agent maintains session state and generates responses.

Message Protocol

The extension coordinates message flow between the webview UI and agent process. Messages are identified by UUIDs and routed based on tab/session mappings.

Message Identity

The system uses two separate identification mechanisms:

Message IDs (UUIDs): Identify specific prompt/response conversations. When a user sends a prompt, the webview generates a UUID message ID. All response chunks for that prompt include the same message ID, allowing the UI to associate chunks with the correct prompt and render them in the right place. Message IDs enable multiple concurrent prompts (user sends prompt in tab A while tab B is still streaming a response).

Message indices (numbers): Monotonically increasing integers assigned by the extension per tab, used exclusively for deduplication. When the webview is hidden and shown, the extension may replay messages to ensure nothing was missed. The webview tracks the last index it saw per tab (via lastSeenIndex map) and ignores messages with index <= lastSeenIndex[tabId]. This prevents duplicate response chunks from appearing in the UI.

Why both? Message IDs provide semantic identity ("which conversation is this?"). Message indices provide delivery tracking ("have I seen this before?"). The extension assigns indices sequentially as messages flow through; the webview uses UUIDs for UI routing and indices for deduplication.
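
The deduplication rule can be sketched on the webview side as follows. This is a minimal illustration rather than the extension's actual code: the `IndexedMessage` shape and the `Deduplicator` class are hypothetical, and the sketch assumes indices start at 0 for each tab.

```typescript
// Hypothetical message shape: a per-tab index plus the UUID message ID.
interface IndexedMessage {
  tabId: string;
  index: number;
  messageId: string;
  chunk: string;
}

class Deduplicator {
  private lastSeenIndex = new Map<string, number>();

  // Returns true if the message is new and should be rendered.
  accept(message: IndexedMessage): boolean {
    // Assumption: indices start at 0, so -1 means "nothing seen yet".
    const lastSeen = this.lastSeenIndex.get(message.tabId) ?? -1;
    if (message.index <= lastSeen) {
      return false; // replayed message, already rendered
    }
    this.lastSeenIndex.set(message.tabId, message.index);
    return true;
  }
}

const dedup = new Deduplicator();
const rendered: string[] = [];
const incoming: IndexedMessage[] = [
  { tabId: "tab-a", index: 0, messageId: "m1", chunk: "Hel" },
  { tabId: "tab-a", index: 1, messageId: "m1", chunk: "lo" },
  // The webview was hidden and shown again; the extension replays 0 and 1.
  { tabId: "tab-a", index: 0, messageId: "m1", chunk: "Hel" },
  { tabId: "tab-a", index: 1, messageId: "m1", chunk: "lo" },
  { tabId: "tab-a", index: 2, messageId: "m1", chunk: "!" },
  // Indices are tracked per tab, so tab-b starts again from 0.
  { tabId: "tab-b", index: 0, messageId: "m2", chunk: "Hi" },
];
for (const message of incoming) {
  if (dedup.accept(message)) {
    rendered.push(`${message.tabId}:${message.chunk}`);
  }
}
```

The replayed chunks are dropped, so `rendered` contains each chunk once, in arrival order.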

Message Flow Patterns

Opening a New Tab

sequenceDiagram
    participant User
    participant Webview
    participant Extension
    participant Agent
    
    User->>Webview: Opens new tab
    Webview->>Webview: Generate tab ID (UUID)
    Webview->>Extension: new-tab (tabId)
    Extension->>Agent: new-session
    Agent->>Agent: Initialize session
    Agent->>Extension: session-created (sessionId)
    Extension->>Extension: Store tabId → sessionId mapping

Why UUID generation in webview? The webview owns tab lifecycle. Generating IDs at the source avoids round-trip coordination with the extension.

Why separate session IDs? The agent owns session identity. Tab IDs are UI concepts; session IDs are agent concepts. The extension maps between them without understanding either.

Sending a Prompt

sequenceDiagram
    participant User
    participant Webview
    participant Extension
    participant Agent
    
    User->>Webview: Types message
    Webview->>Extension: prompt (tabId, messageId, text)
    Extension->>Extension: Lookup sessionId for tabId
    Extension->>Agent: process-prompt (sessionId, text)
    
    loop Streaming response
        Agent->>Extension: response-chunk (sessionId, chunk)
        Extension->>Extension: Lookup tabId for sessionId
        alt Webview visible
            Extension->>Webview: response-chunk (tabId, messageId, chunk)
            Webview->>Webview: Append to message stream
        else Webview hidden
            Extension->>Extension: Buffer message
        end
    end
    
    Agent->>Extension: response-complete (sessionId)
    Extension->>Webview: response-complete (tabId, messageId)
    Webview->>Webview: End message stream

Why streaming? AI responses can take seconds to complete. Streaming provides immediate feedback and allows users to start reading while generation continues.

Why message IDs? Multiple prompts can be in flight simultaneously (user sends prompt in tab A while tab B is still receiving a response). Message IDs ensure response chunks are associated with the correct prompt.

Why buffer when hidden? VSCode can hide webviews at any time (user switches away, collapses sidebar). Buffering ensures the UI sees all messages when it becomes visible again.
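
A simplified sketch of buffering on the extension side is shown below. The `WebviewChannel` class and `OutboundMessage` shape are hypothetical, and the `post` callback stands in for whatever actually delivers a message to the webview (in a real extension this would presumably wrap the webview's `postMessage` call). The sketch simply flushes the buffer when the webview becomes visible; it does not model replay driven by the webview's last-seen index.

```typescript
interface OutboundMessage {
  tabId: string;
  index: number; // per-tab, monotonically increasing
  payload: string;
}

class WebviewChannel {
  private visible = false;
  private buffer: OutboundMessage[] = [];
  private nextIndex = new Map<string, number>();
  private readonly post: (message: OutboundMessage) => void;

  constructor(post: (message: OutboundMessage) => void) {
    this.post = post;
  }

  // Assigns the next index for the tab, then delivers or buffers the message.
  send(tabId: string, payload: string): void {
    const index = this.nextIndex.get(tabId) ?? 0;
    this.nextIndex.set(tabId, index + 1);
    const message: OutboundMessage = { tabId, index, payload };
    if (this.visible) {
      this.post(message);
    } else {
      this.buffer.push(message);
    }
  }

  // Call from the webview's visibility-change handler.
  setVisible(visible: boolean): void {
    this.visible = visible;
    if (!visible) {
      return;
    }
    const pending = this.buffer;
    this.buffer = [];
    for (const message of pending) {
      this.post(message); // replay in the original order
    }
  }
}

const delivered: OutboundMessage[] = [];
const channel = new WebviewChannel((message) => {
  delivered.push(message);
});

channel.send("tab-a", "Hel"); // hidden: buffered
channel.send("tab-a", "lo"); // hidden: buffered
const deliveredWhileHidden = delivered.length;
channel.setVisible(true); // flushes both buffered messages in order
channel.send("tab-a", "!"); // visible: delivered immediately
```

Because indices are assigned when a message enters the channel rather than when it is delivered, buffered messages keep their original order and numbering after a flush.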

Closing a Tab

sequenceDiagram
    participant User
    participant Webview
    participant Extension
    participant Agent
    
    User->>Webview: Closes tab
    Webview->>Extension: close-tab (tabId)
    Extension->>Extension: Lookup sessionId for tabId
    Extension->>Agent: close-session (sessionId)
    Agent->>Agent: Cleanup session state
    Extension->>Extension: Remove tabId → sessionId mapping

Why explicit close messages? Allows agent to clean up resources (free memory, close file handles) rather than leaking session state indefinitely.

Message Identification Strategy

Tab IDs

Generated by: Webview (when user creates new tab)
Format: UUID v4
Scope: UI-only concept
Lifetime: From tab creation to tab close

Session IDs

Generated by: Agent (in response to new-session)
Format: Agent-defined (typically UUID)
Scope: Agent-only concept
Lifetime: From session creation to session close

Message IDs

Generated by: Webview (when user sends prompt)
Format: UUID v4
Scope: Used by both webview and extension for response routing
Lifetime: From prompt send to response complete

Why three separate ID spaces? Each layer owns its identity domain. This avoids coupling and eliminates coordination overhead.

Bidirectional Mapping

The extension maintains two maps:

tabId → sessionId    (for extension → agent messages)
sessionId → tabId    (for agent → extension messages)

Synchronization: Maps are updated atomically when session creation completes. Both directions always stay consistent.

Cleanup: Both mappings are removed when either the tab closes or the session ends.
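
One way to keep the two maps consistent is to hide them behind a single object that always updates both directions together. The sketch below is illustrative only; `TabSessionMap` and its method names are assumptions, not the extension's actual API.

```typescript
class TabSessionMap {
  private tabToSession = new Map<string, string>();
  private sessionToTab = new Map<string, string>();

  // Records both directions in one step, replacing any stale entries.
  link(tabId: string, sessionId: string): void {
    this.unlinkTab(tabId);
    this.unlinkSession(sessionId);
    this.tabToSession.set(tabId, sessionId);
    this.sessionToTab.set(sessionId, tabId);
  }

  sessionFor(tabId: string): string | undefined {
    return this.tabToSession.get(tabId);
  }

  tabFor(sessionId: string): string | undefined {
    return this.sessionToTab.get(sessionId);
  }

  // Called when a tab closes: removes both directions.
  unlinkTab(tabId: string): void {
    const sessionId = this.tabToSession.get(tabId);
    if (sessionId === undefined) {
      return;
    }
    this.tabToSession.delete(tabId);
    this.sessionToTab.delete(sessionId);
  }

  // Called when a session ends: removes both directions.
  unlinkSession(sessionId: string): void {
    const tabId = this.sessionToTab.get(sessionId);
    if (tabId === undefined) {
      return;
    }
    this.sessionToTab.delete(sessionId);
    this.tabToSession.delete(tabId);
  }
}

const mapping = new TabSessionMap();
mapping.link("tab-1", "session-9");
const routedSession = mapping.sessionFor("tab-1"); // used for extension → agent messages
const routedTab = mapping.tabFor("session-9"); // used for messages arriving from the agent
mapping.unlinkSession("session-9"); // the session ended, so both entries go away
```

Since no caller touches the underlying maps directly, there is no point at which one direction exists without the other.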

Message Ordering Guarantees

Within a session: Agent processes prompts sequentially. A second prompt won't start processing until the first response completes.

Across sessions: No ordering guarantees. Tabs are independent. Multiple sessions can stream responses simultaneously.

Webview messages: Delivered in order sent, but delivery timing depends on webview visibility. Buffered messages are replayed in order when webview becomes visible.

Error Handling

Agent crashes: Extension detects process exit, notifies all active tabs. Tabs display error state. User can trigger agent restart.

Webview disposal: Extension maintains agent sessions. If webview is recreated (VSCode restart), extension can restore tab → session mappings and continue existing sessions.

Message delivery failure: If webview is disposed while messages are buffered, messages are discarded. Agent sessions may continue running. Next webview instantiation can restore session state.

Design Rationale

Why not request/response? Streaming responses require continuous message flow, not single request/reply pairs. The protocol is inherently asynchronous.

Why not share IDs across layers? Each layer has different lifecycle concerns. Decoupling identity spaces allows independent evolution. Extension acts as impedance matcher between UI tab identity and agent session identity.

Why buffer in extension instead of agent? Agent shouldn't need to know about webview lifecycle. Extension handles VSCode-specific concerns (visibility, disposal) to keep agent implementation portable.

Tool Use Authorization

When agents request permission to execute tools (file operations, terminal commands, etc.), the extension provides a user approval mechanism. This chapter describes how authorization requests flow through the system and how per-agent policies are enforced.

Architecture

The authorization flow bridges three layers:

Agent (ACP requestPermission) → Extension (Promise-based routing) → Webview (MynahUI approval card)

The extension acts as the coordination point:

Receives requestPermission callbacks from the ACP agent, which waits for each one to be answered
Checks per-agent bypass settings
Routes approval requests to the webview when user input is needed
Blocks the agent using promises until the user responds

Authorization Flow

With Bypass Disabled

sequenceDiagram
    participant Agent
    participant Extension
    participant Settings
    participant Webview
    participant User
    
    Agent->>Extension: requestPermission(toolCall, options)
    Extension->>Settings: Check agents[agentName].bypassPermissions
    Settings-->>Extension: false
    Extension->>Extension: Generate approval ID, create pending promise
    Extension->>Webview: approval-request message
    Webview->>User: Display approval card (MynahUI)
    User->>Webview: Click approve/deny/bypass
    Webview->>Extension: approval-response message
    
    alt User selected "Bypass Permissions"
        Extension->>Settings: Set agents[agentName].bypassPermissions = true
    end
    
    Extension->>Extension: Resolve promise with user's choice
    Extension-->>Agent: return RequestPermissionResponse

With Bypass Enabled

sequenceDiagram
    participant Agent
    participant Extension
    participant Settings
    
    Agent->>Extension: requestPermission(toolCall, options)
    Extension->>Settings: Check agents[agentName].bypassPermissions
    Settings-->>Extension: true
    Extension-->>Agent: return allow_once (auto-approved)

Promise-Based Blocking

The ACP SDK's requestPermission callback must return a Promise<RequestPermissionResponse>, and the agent waits until that promise settles. The extension creates a promise that resolves when the user responds:

async requestPermission(params) {
  // Check bypass setting first
  if (agentConfig.bypassPermissions) {
    return { outcome: { outcome: "selected", optionId: allowOptionId } };
  }
  
  // Generate a UUID so the webview's response can be matched to this request
  const approvalId = randomUUID();  // from "node:crypto"
  
  // Create promise that will resolve when user responds
  const promise = new Promise((resolve, reject) => {
    pendingApprovals.set(approvalId, { resolve, reject, agentName });
  });
  
  // Send request to webview
  sendToWebview({ type: "approval-request", approvalId, ... });
  
  // Return the pending promise (the agent waits until it settles)
  return promise;
}

When the webview sends approval-response, the extension resolves the promise:

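case "approval-response": {
  const pending = pendingApprovals.get(message.approvalId);
  pendingApprovals.delete(message.approvalId);  // Each approval is answered once
  pending?.resolve(message.response);  // Unblocks agent; no-op for unknown IDs
  break;
}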

This allows the agent to block on permission requests without blocking the extension's event loop.
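The two fragments above can be folded into one small registry. The following is a minimal sketch under stated assumptions, not the extension's actual code: `ApprovalRegistry`, `PermissionResponse`, the `send` callback, and the injected `newId` generator are names invented for illustration. In the extension, `newId` would be a UUID generator and `send` would post to the webview.

```typescript
// Simplified response shape, as described in this chapter.
interface PermissionResponse {
  outcome: { outcome: "selected"; optionId: string };
}

interface PendingApproval {
  resolve: (response: PermissionResponse) => void;
  reject: (reason: Error) => void;
  agentName: string;
}

interface ApprovalRequest {
  type: "approval-request";
  approvalId: string;
  agentName: string;
}

// Hypothetical helper: tracks approvals that are waiting on the user.
class ApprovalRegistry {
  private readonly pending = new Map<string, PendingApproval>();
  private readonly newId: () => string;

  constructor(newId: () => string) {
    this.newId = newId;
  }

  // Registers a pending approval, then notifies the webview through `send`.
  request(agentName: string, send: (message: ApprovalRequest) => void): Promise<PermissionResponse> {
    const approvalId = this.newId();
    const promise = new Promise<PermissionResponse>((resolve, reject) => {
      this.pending.set(approvalId, { resolve, reject, agentName });
    });
    send({ type: "approval-request", approvalId, agentName });
    return promise;
  }

  // Resolves the matching promise. Returns false for unknown or repeated IDs.
  respond(approvalId: string, response: PermissionResponse): boolean {
    const entry = this.pending.get(approvalId);
    if (!entry) return false;
    this.pending.delete(approvalId);
    entry.resolve(response);
    return true;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Removing the entry before resolving means a duplicate approval-response for the same ID is ignored instead of resolving twice.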

Per-Agent Settings

Authorization policies are scoped per-agent in symposium.agents configuration:

{
  "symposium.agents": {
    "Claude Code": {
      "command": "npx",
      "args": ["@zed-industries/claude-code-acp"],
      "bypassPermissions": true
    },
    "ElizACP": {
      "command": "elizacp",
      "bypassPermissions": false
    }
  }
}

Why per-agent? Different agents have different trust levels. A user might trust Claude Code with unrestricted file access but want to review every tool call from an experimental agent.

Scope: Settings are stored globally (VSCode user settings), so bypass policies persist across workspaces and sessions.
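As a rough illustration of how the per-agent lookup might work, the sketch below treats the symposium.agents value as a plain object. The helper names (`getAgent`, `shouldBypass`, `withBypassEnabled`) are hypothetical. It also assumes, without confirmation from the source, that an unknown agent or a missing bypassPermissions flag means "keep prompting".

```typescript
interface AgentConfig {
  command: string;
  args?: string[];
  bypassPermissions?: boolean;
}

type AgentsSetting = Record<string, AgentConfig>;

// Looks up an agent by its own key only (ignores inherited object properties).
function getAgent(agents: AgentsSetting, agentName: string): AgentConfig | undefined {
  return Object.prototype.hasOwnProperty.call(agents, agentName) ? agents[agentName] : undefined;
}

// Assumption: anything other than an explicit `true` means the user is prompted.
function shouldBypass(agents: AgentsSetting, agentName: string): boolean {
  return getAgent(agents, agentName)?.bypassPermissions === true;
}

// Returns a copy of the settings with bypass enabled for one agent,
// mirroring what the "Bypass Permissions" button is described as doing.
function withBypassEnabled(agents: AgentsSetting, agentName: string): AgentsSetting {
  const existing = getAgent(agents, agentName);
  if (!existing) return agents; // Unknown agent: leave settings untouched
  return { ...agents, [agentName]: { ...existing, bypassPermissions: true } };
}
```

Returning a new object rather than mutating in place keeps the helper easy to test; the extension would then write the result back to the user settings.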

User Approval Options

When bypass is disabled, the webview displays three options:

Approve - Allow this single tool call, continue prompting for future tools
Deny - Reject this single tool call, continue prompting for future tools
Bypass Permissions - Approve this call AND set bypassPermissions = true for this agent permanently

The "Bypass Permissions" option provides a quick path to trusted status without requiring manual settings edits.

Webview UI Implementation

The webview uses MynahUI primitives to display approval requests:

Chat item - Approval request appears as a chat message in the conversation
Buttons - Three buttons (Approve, Deny, Bypass) using MynahUI's button status colors
Tool details - Tool name, parameters (formatted as JSON), and any available metadata
Card dismissal - Cards auto-dismiss after the user clicks a button (keepCardAfterClick: false)

The specific MynahUI API usage is documented in the MynahUI GUI reference.

Approval Request Message

Extension → Webview:

{
  type: "approval-request",
  tabId: string,
  approvalId: string,        // UUID for matching response
  agentName: string,          // Which agent is requesting permission
  toolCall: {
    toolCallId: string,       // ACP tool call identifier
    title?: string,           // Human-readable tool name (may be absent)
    kind?: ToolKind,          // "read", "edit", "execute", etc.
    rawInput?: object         // Tool parameters
  },
  options: PermissionOption[] // Available approval options from ACP
}

Approval Response Message

Webview → Extension:

{
  type: "approval-response",
  approvalId: string,         // Matches approval-request
  response: {
    outcome: {
      outcome: "selected",
      optionId: string        // Which option was chosen
    }
  },
  bypassAll: boolean          // True if "Bypass Permissions" clicked
}
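Because webview messages arrive as untyped JSON, the extension may want to validate them before resolving a pending promise. The type guard below is one possible sketch; `ApprovalResponseMessage`, `isRecord`, and `isApprovalResponse` are illustrative names, and the checks only cover the fields listed in the message shape above.

```typescript
interface ApprovalResponseMessage {
  type: "approval-response";
  approvalId: string;
  response: { outcome: { outcome: "selected"; optionId: string } };
  bypassAll: boolean;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns true only when every field of the approval-response shape is present
// and has the expected primitive type.
function isApprovalResponse(value: unknown): value is ApprovalResponseMessage {
  if (!isRecord(value)) return false;
  if (value["type"] !== "approval-response") return false;
  if (typeof value["approvalId"] !== "string") return false;
  if (typeof value["bypassAll"] !== "boolean") return false;

  const response = value["response"];
  if (!isRecord(response)) return false;
  const outcome = response["outcome"];
  if (!isRecord(outcome)) return false;

  return outcome["outcome"] === "selected" && typeof outcome["optionId"] === "string";
}
```

A message handler could call `isApprovalResponse(message)` first and drop anything that fails, so a malformed message never reaches the pending promise.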

Design Decisions

Why block the agent? Tool execution should wait for user consent. Continuing execution while waiting for approval would allow the agent to make progress on non-tool operations, potentially creating race conditions where the user approves a tool call that's no longer relevant.

Why promise-based? A pending promise suspends the request without blocking any thread. The extension's handler returns immediately, so its event loop stays free, while the agent's call completes only once the promise resolves with the user's decision.

Why store in settings? Bypass permissions should persist across sessions. VSCode settings provide durable storage with UI for manual editing if needed.

Why auto-dismiss cards? Once the user responds, the approval card is no longer actionable. Dismissing it keeps the conversation history clean and focused on the actual work.

Future Enhancements

Potential extensions to the authorization system:

Per-tool policies - Trust specific tools (e.g., "always allow Read") while prompting for others
Resource-based rules - Auto-approve file reads within certain directories
Temporary sessions - "Bypass for this session" option that doesn't persist
Approval history - Log of past approvals for security auditing
Batch approvals - Approve multiple pending tool calls at once

Webview State Persistence

The webview must preserve chat history and UI state across hide/show cycles, but clear state when VSCode restarts. This requires distinguishing between temporary hiding and permanent disposal.

The Problem

VSCode webviews face two distinct lifecycle events that look identical from the webview's perspective:

User collapses sidebar - Webview is hidden but should restore exactly when reopened
VSCode restarts - Webview is disposed and recreated, should start fresh

Both events destroy and recreate the webview DOM. The webview cannot distinguish between them without additional context.

User expectation: Chat history persists within a VSCode session but doesn't carry over to the next session. Draft text should survive sidebar collapse but not VSCode restart.

Session ID Solution

The extension generates a session ID (UUID) once per VSCode session at activation. This ID is embedded in the webview HTML as a global JavaScript variable (window.SYMPOSIUM_SESSION_ID) in a script tag. The webview reads this variable synchronously on load and compares it against the session ID stored in saved state.

sequenceDiagram
    participant VSCode
    participant Extension
    participant Webview
    
    Note over VSCode: Extension activation
    Extension->>Extension: Generate session ID
    
    Note over VSCode: User opens sidebar
    Extension->>Webview: Create webview with session ID
    Webview->>Webview: Load saved state
    
    alt Session IDs match
        Webview->>Webview: Restore chat history
    else Session IDs don't match (or no saved ID)
        Webview->>Webview: Clear state, start fresh
    end

Why this works:

Within a session: Same session ID embedded every time, state restores
After restart: New session ID generated, mismatch detected, state cleared

State Structure

The webview maintains three pieces of state:

Session ID - Embedded from extension, used for freshness detection
Last seen index - Message deduplication tracking (see Webview Lifecycle chapter)
Mynah UI tabs - Opaque blob from mynahUI.getAllTabs() containing tab metadata, chat history, and UI configuration for all open tabs

Ownership: Webview owns this state entirely. Extension provides session ID but doesn't read or interpret webview state. The mynah-ui tabs structure is treated as opaque—the webview saves whatever getAllTabs() returns and restores it via mynah-ui's initialization config.

Storage: VSCode's getState()/setState() API. Persists across hide/show cycles and VSCode restarts.

State Lifecycle

Initial Load

Webview reads embedded session ID from window.SYMPOSIUM_SESSION_ID
Webview calls vscode.getState() to load saved state
If savedState.sessionId === window.SYMPOSIUM_SESSION_ID, restore tabs
Otherwise, call vscode.setState(undefined) to clear stale state
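The freshness check in these steps reduces to a single comparison. The sketch below assumes a saved-state shape with `sessionId`, `lastSeenIndex`, and `mynahTabs` fields; those field names and the `restoreIfFresh` helper are illustrative, not taken from the implementation.

```typescript
// Assumed layout of the object passed to vscode.setState().
interface SavedState {
  sessionId: string;
  lastSeenIndex: Record<string, number>;
  mynahTabs: unknown; // Opaque blob from mynahUI.getAllTabs()
}

// Returns the saved state when it belongs to the current VSCode session,
// or undefined when it is missing, malformed, or from an earlier session.
function restoreIfFresh(saved: unknown, currentSessionId: string): SavedState | undefined {
  if (typeof saved !== "object" || saved === null) return undefined;
  const candidate = saved as Partial<SavedState>;
  return candidate.sessionId === currentSessionId ? (candidate as SavedState) : undefined;
}
```

In the webview, `saved` would come from `vscode.getState()` and `currentSessionId` from `window.SYMPOSIUM_SESSION_ID`; an `undefined` result is the cue to call `vscode.setState(undefined)` and start fresh.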

During Use

State is saved after any UI change:

User sends a message
User opens or closes a tab
Agent response is received and rendered

Performance: VSCode's setState() is optimized for frequent calls. No need to debounce or throttle state saves.

On Restart

Extension activation generates new session ID
Webview loads with new session ID embedded
Session ID mismatch detected (old state has previous session's ID)
State cleared, webview starts fresh

Message Deduplication

When the webview is hidden and shown, the extension may resend messages to ensure nothing was missed. The webview tracks the last message index seen per tab to avoid duplicates.

Last seen index map: { [tabId: string]: number }

Logic: If incoming message has index <= lastSeenIndex[tabId], ignore it. Otherwise, process and update lastSeenIndex[tabId].

Why needed? Extension buffers messages when webview is hidden (see Webview Lifecycle chapter). Replay strategy is "send everything since last known state" rather than tracking exactly which messages were delivered. Webview deduplicates to avoid showing duplicate response chunks.
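The deduplication rule is small enough to show in full. This is a sketch of the logic as described, with `acceptMessage` as a hypothetical helper name; it assumes message indices increase monotonically within a tab.

```typescript
type LastSeenIndex = Record<string, number>;

// Returns true if the message is new for its tab and records its index.
// Returns false for a replayed message the webview has already processed.
function acceptMessage(lastSeen: LastSeenIndex, tabId: string, index: number): boolean {
  const seen = lastSeen[tabId];
  if (seen !== undefined && index <= seen) {
    return false; // Duplicate from a replay: ignore
  }
  lastSeen[tabId] = index;
  return true;
}
```

A message handler would call `acceptMessage` before rendering a chunk and skip the chunk when it returns false.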

Design Trade-offs

Why not `retainContextWhenHidden`?

VSCode offers retainContextWhenHidden: true to keep webview alive when hidden. This would eliminate the need for state persistence entirely.

Trade-off: Microsoft documentation warns of "much higher performance overhead." The webview remains in memory consuming resources even when not visible.

Decision: Use state persistence for lightweight chat interfaces. Reserve retainContextWhenHidden for complex UIs (e.g., embedded IDEs) that cannot be easily serialized.

Why not global state in extension?

Extension could store chat history in globalState instead of webview managing its own state.

Trade-off: Violates state ownership principle. Webview understands mynah-ui structure; extension shouldn't need to parse or manipulate UI state.

Decision: Webview owns UI state, extension provides coordination (session ID injection). Keeps extension simple and allows mynah-ui to evolve independently.

Why clear on restart instead of persisting?

Chat history could persist across VSCode restarts using globalState or workspace storage.

Trade-off: Users expect fresh sessions on restart. Long-lived history creates stale context and memory accumulation. Workspace-specific persistence could be added later if needed.

Decision: Session-scoped state matches user expectations and reduces complexity. Each VSCode session starts clean.

Migration and Compatibility

Old state without session ID: Treated as stale, cleared on first load. Ensures smooth upgrade path when session ID feature is added.

Future state format changes: Session ID check happens before parsing state structure. Mismatched session ID clears everything, eliminating need for explicit version migration.

Webview Lifecycle Management

VSCode can hide and show webviews at any time based on user actions. The extension must handle visibility changes gracefully to ensure no messages are lost and the UI appears responsive when shown.

Visibility States

A webview has three lifecycle states from the extension's perspective:

Visible - User can see the webview, messages can be delivered immediately
Hidden - Webview exists but is not visible (sidebar collapsed, tab not focused)
Disposed - Webview destroyed, no communication possible

Key constraint: Hidden webviews cannot receive messages. Attempting to send via postMessage succeeds (no error) but messages are silently dropped.

The Hidden Webview Problem

sequenceDiagram
    participant User
    participant Extension
    participant Webview
    participant Agent
    
    User->>Webview: Sends prompt
    Webview->>Extension: prompt message
    Extension->>Agent: Forward prompt
    Agent->>Extension: Start streaming response
    
    Note over User: User collapses sidebar
    Extension->>Extension: Webview hidden (visible = false)
    
    loop Agent still streaming
        Agent->>Extension: response-chunk
        Extension->>Webview: postMessage (silently dropped!)
        Note over Webview: Message lost
    end
    
    Note over User: User reopens sidebar
    Extension->>Extension: Webview visible again
    Note over Webview: Missing chunks, partial response

Without buffering: Messages sent while webview is hidden are lost. When user reopens the sidebar, they see incomplete responses or missing messages entirely.

Message Buffering Strategy

The extension tracks webview visibility and buffers messages when hidden:

sequenceDiagram
    participant Extension
    participant Webview
    participant Agent
    
    Agent->>Extension: response-chunk
    
    alt Webview visible
        Extension->>Webview: Send immediately
    else Webview hidden
        Extension->>Extension: Add to buffer
    end
    
    Note over Extension: Webview becomes visible
    Extension->>Webview: webview-ready request
    Webview->>Extension: last-seen-index
    
    loop For each buffered message
        Extension->>Webview: Send buffered message
        Webview->>Webview: Deduplicate if already seen
    end
    
    Extension->>Extension: Clear buffer

Buffer contents: Any message destined for the webview (response chunks, completion signals, error notifications).

Buffer lifetime: From webview hidden to webview shown. Cleared after replay.

Replay strategy: Send all buffered messages in order. Webview uses last-seen-index tracking (see State Persistence chapter) to ignore duplicates.
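A minimal sketch of the buffering behaviour described above, written without any VSCode dependency so it can be exercised in isolation. `WebviewMessageBuffer` and its method names are assumptions for illustration; `post` stands in for `webview.postMessage`.

```typescript
class WebviewMessageBuffer<T> {
  private buffer: T[] = [];
  private ready = false; // Created-but-not-ready is treated like hidden
  private readonly post: (message: T) => void;

  constructor(post: (message: T) => void) {
    this.post = post;
  }

  // Delivers immediately when the webview is ready, otherwise queues.
  send(message: T): void {
    if (this.ready) {
      this.post(message);
    } else {
      this.buffer.push(message);
    }
  }

  // Call when the webview is hidden.
  markHidden(): void {
    this.ready = false;
  }

  // Call when webview-ready arrives: replay in order, then clear the buffer.
  markReady(): void {
    this.ready = true;
    const queued = this.buffer;
    this.buffer = [];
    for (const message of queued) {
      this.post(message);
    }
  }

  // Call on disposal: buffered messages are dropped.
  discard(): void {
    this.buffer = [];
    this.ready = false;
  }
}
```

Driving `markReady` from the webview-ready message rather than from the visibility event keeps replay from racing the webview's own initialization.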

Visibility Detection

The extension monitors visibility through the WebviewView's onDidChangeVisibility event and its visible property (for a WebviewPanel, the equivalent event is onDidChangeViewState):

stateDiagram-v2
    [*] --> Created: resolveWebviewView
    Created --> Visible: visible = true
    Visible --> Hidden: visible = false
    Hidden --> Visible: visible = true
    Visible --> Disposed: onDidDispose
    Hidden --> Disposed: onDidDispose
    Disposed --> [*]

Event timing:

onDidChangeVisibility fires when the visible property changes
onDidDispose fires after webview is destroyed (too late for cleanup)

Race condition: Messages can arrive between "webview created" and "webview visible." Extension treats created-but-not-visible as hidden state and buffers messages.

Webview-Ready Handshake

When the webview becomes visible (including initial creation), it announces readiness:

Webview finishes initialization - DOM loads, webview script executes, session ID is checked, state is restored or cleared, mynah-ui is constructed with restored tabs (if any)
Webview sends webview-ready - After mynah-ui initialization completes, webview sends message to extension including current last-seen-index map
Extension replays buffered messages - Extension sends any messages that accumulated while webview was hidden
Extension resumes normal message delivery - New messages are sent immediately as they arrive

Why handshake? Webview needs time to initialize mynah-ui and restore state. Sending messages immediately after visibility change could arrive before UI is ready to process them. The webview signals when it's actually ready to receive messages rather than the extension guessing based on visibility events.

Why include last-seen-index? Allows extension to avoid resending messages the webview already processed before hiding. Reduces redundant replay.

What triggers webview-ready? The webview sends this message during its initialization script, after the mynah-ui constructor completes and before setting up event handlers. On subsequent hide/show cycles, if mynah-ui remains initialized, the webview can send webview-ready immediately after becoming visible.
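To illustrate how the last-seen-index map in webview-ready could trim the replay, here is a hedged sketch. It assumes each buffered message carries a `tabId` and a per-tab `index`; the `IndexedMessage` shape and the `selectForReplay` name are illustrative only.

```typescript
interface IndexedMessage {
  tabId: string;
  index: number;
}

// Keeps only the buffered messages the webview has not yet processed,
// preserving their original order.
function selectForReplay<M extends IndexedMessage>(
  buffered: readonly M[],
  lastSeen: Readonly<Record<string, number>>
): M[] {
  return buffered.filter((message) => {
    const seen = lastSeen[message.tabId];
    return seen === undefined || message.index > seen;
  });
}
```

This filter is only an optimization: the webview still deduplicates on its side, so sending a message twice remains harmless.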

Agent Independence

The agent continues running regardless of webview visibility:

Prompts sent while webview is hidden are still processed
Responses generated while webview is hidden are buffered
Sessions remain active across webview hide/show cycles

Why? Agent should not need to know about VSCode-specific concerns. Extension insulates agent from webview lifecycle complexity.

Trade-off: Long-running agent operations may complete while webview is hidden, buffering large amounts of data. If webview remains hidden for extended periods, memory usage grows. Current implementation has no buffer size limit.

Disposal Handling

When the webview is disposed (user closes sidebar permanently, workspace switch), buffered messages are discarded:

Buffer is cleared
Agent sessions continue running
Next webview creation can restore tab → session mappings

Why not save buffered messages? Messages are ephemeral rendering updates. State persistence (see State Persistence chapter) handles durable state. Buffering is purely a delivery mechanism for real-time updates.

Design Rationale

Why buffer in extension instead of agent? Webview lifecycle is VSCode-specific. Agent shouldn't need VSCode-specific logic. Extension handles UI framework concerns.

Why replay all messages instead of tracking delivered? Simpler implementation. Webview deduplication is cheap (index comparison). Tracking exactly which messages were delivered requires more complex state management.

Why not queue in webview? Webview is destroyed/recreated when hidden in some cases. Can't rely on webview maintaining queue across lifecycle events. Extension has stable lifecycle tied to VSCode session.

Why immediate send when visible? Minimize latency. Users expect real-time streaming responses. Buffering only when necessary provides best UX.

VSCode Extension Integration Testing Guide

Overview

VSCode extension testing involves multiple layers, with integration tests being crucial for verifying that your extension works correctly with the VSCode API in a real VSCode environment.

Why Integration Tests Matter:

Unit tests can't verify VSCode API interactions
Extensions can break due to VSCode API changes
Manual testing doesn't scale as extensions grow
Integration tests catch issues that unit tests miss

Key Principle: Follow the test pyramid - most tests should be fast unit tests, with a smaller number of integration tests for critical workflows.

Testing Types

Unit Tests

Test pure logic in isolation
No VSCode API required
Fast and can run in any environment
Use standard frameworks (Mocha, Jest, etc.)
Good for: utility functions, data transformations, business logic

Integration Tests

Run inside a real VSCode instance (Extension Development Host)
Have access to full VSCode API
Test extension behavior with actual VSCode
Slower but more realistic
Good for: command execution, UI interactions, API integrations

End-to-End Tests

Automate the full VSCode UI using tools like WebdriverIO or Playwright
Most complex to set up
Test complete user workflows
Good for: complex UIs, webviews, full user journeys

Setting Up Integration Tests

Option 1: Using @vscode/test-cli (Recommended)

The modern approach using the official VSCode test CLI.

Installation:

npm install --save-dev @vscode/test-cli @vscode/test-electron

package.json configuration:

{
  "scripts": {
    "test": "vscode-test"
  }
}

Create .vscode-test.mjs (or .vscode-test.js if your package.json sets "type": "module", since the config below uses ES module syntax):

import { defineConfig } from '@vscode/test-cli';

export default defineConfig({
  files: 'out/test/**/*.test.js',
  version: 'stable', // or 'insiders' or specific version like '1.85.0'
  workspaceFolder: './test-workspace',
  mocha: {
    ui: 'tdd',
    timeout: 20000
  }
});

Run tests:

npm test

Option 2: Using @vscode/test-electron Directly

For more control over the test runner.

Installation:

npm install --save-dev @vscode/test-electron mocha @types/mocha glob

Create src/test/runTest.ts:

import * as path from 'path';
import { runTests } from '@vscode/test-electron';

async function main() {
  try {
    // The folder containing the Extension Manifest package.json
    const extensionDevelopmentPath = path.resolve(__dirname, '../../');
    
    // The path to test runner
    const extensionTestsPath = path.resolve(__dirname, './suite/index');
    
    // Optional: specific workspace to open
    const testWorkspace = path.resolve(__dirname, '../../test-fixtures');
    
    // Download VS Code, unzip it and run the integration test
    await runTests({
      extensionDevelopmentPath,
      extensionTestsPath,
      launchArgs: [
        testWorkspace,
        '--disable-extensions' // Disable other extensions during testing
      ]
    });
  } catch (err) {
    console.error('Failed to run tests:', err);
    process.exit(1);
  }
}

main();

Create src/test/suite/index.ts (test runner):

import * as path from 'path';
import * as Mocha from 'mocha';
import { glob } from 'glob';

export function run(): Promise<void> {
  const mocha = new Mocha({
    ui: 'tdd',
    color: true,
    timeout: 20000
  });

  const testsRoot = path.resolve(__dirname, '.');

  return new Promise((resolve, reject) => {
    glob('**/**.test.js', { cwd: testsRoot }).then((files) => {
      // Add files to the test suite
      files.forEach(f => mocha.addFile(path.resolve(testsRoot, f)));

      try {
        // Run the mocha test
        mocha.run(failures => {
          if (failures > 0) {
            reject(new Error(`${failures} tests failed.`));
          } else {
            resolve();
          }
        });
      } catch (err) {
        reject(err);
      }
    }).catch((err) => {
      reject(err);
    });
  });
}

Project Structure

your-extension/
├── src/
│   ├── extension.ts
│   └── test/
│       ├── runTest.ts
│       └── suite/
│           ├── index.ts
│           ├── extension.test.ts
│           └── other.test.ts
├── test-fixtures/          # Optional test workspace
│   └── sample-file.txt
├── .vscode/
│   └── launch.json         # Debug configuration
└── package.json

Writing Integration Tests

Basic Test Structure

import * as assert from 'assert';
import * as vscode from 'vscode';

suite('Extension Test Suite', () => {
  vscode.window.showInformationMessage('Start all tests.');

  test('Sample test', () => {
    assert.strictEqual(-1, [1, 2, 3].indexOf(5));
    assert.strictEqual(-1, [1, 2, 3].indexOf(0));
  });

  test('Extension should be present', () => {
    assert.ok(vscode.extensions.getExtension('your-publisher.your-extension'));
  });

  test('Should register commands', async () => {
    const commands = await vscode.commands.getCommands(true);
    assert.ok(commands.includes('your-extension.yourCommand'));
  });
});

Testing Commands

test('Execute command should work', async () => {
  // executeCommand is generic; declare the result shape your command returns
  const result = await vscode.commands.executeCommand<{ status: string }>(
    'your-extension.yourCommand'
  );
  assert.ok(result);
  assert.strictEqual(result?.status, 'success');
});

Testing with Documents and Editors

test('Should modify document', async () => {
  // Create a new document
  const doc = await vscode.workspace.openTextDocument({
    content: 'Hello World',
    language: 'plaintext'
  });

  // Open it in an editor
  const editor = await vscode.window.showTextDocument(doc);

  // Execute your command that modifies the document
  await vscode.commands.executeCommand('your-extension.formatDocument');

  // Assert the document was modified
  assert.strictEqual(doc.getText(), 'HELLO WORLD');

  // Clean up
  await vscode.commands.executeCommand('workbench.action.closeActiveEditor');
});

Asynchronous Operations and Waiting

function waitForCondition(
  condition: () => boolean,
  timeout: number = 5000,
  message?: string
): Promise<void> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();
    const interval = setInterval(() => {
      if (condition()) {
        clearInterval(interval);
        resolve();
      } else if (Date.now() - startTime > timeout) {
        clearInterval(interval);
        reject(new Error(message || 'Timeout waiting for condition'));
      }
    }, 50);
  });
}

test('Wait for extension activation', async () => {
  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  
  if (!extension!.isActive) {
    await extension!.activate();
  }

  await waitForCondition(
    () => extension!.isActive,
    5000,
    'Extension did not activate'
  );

  assert.ok(extension!.isActive);
});

Testing Events

test('Should trigger onDidChangeTextDocument', async () => {
  const doc = await vscode.workspace.openTextDocument({
    content: 'Test',
    language: 'plaintext'
  });

  let eventFired = false;
  const disposable = vscode.workspace.onDidChangeTextDocument(e => {
    if (e.document === doc) {
      eventFired = true;
    }
  });

  const editor = await vscode.window.showTextDocument(doc);
  await editor.edit(edit => {
    edit.insert(new vscode.Position(0, 0), 'Hello ');
  });

  await waitForCondition(() => eventFired, 2000);
  assert.ok(eventFired, 'Event should have fired');

  disposable.dispose();
});
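Polling with waitForCondition works, but an event can also be awaited directly. The helper below is a sketch under assumptions: `onceEvent` is a hypothetical name, and the local `Event<T>` type mirrors only the listener-in, disposable-out part of VSCode's event signature so the snippet stays self-contained.

```typescript
interface Disposable {
  dispose(): void;
}

// Minimal structural stand-in for the part of vscode.Event<T> used here.
type Event<T> = (listener: (e: T) => void) => Disposable;

// Resolves with the first event that satisfies `predicate`, or rejects after
// `timeoutMs`. The subscription is disposed in both cases.
function onceEvent<T>(
  event: Event<T>,
  predicate: (e: T) => boolean = () => true,
  timeoutMs: number = 2000
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      subscription.dispose();
      reject(new Error('Timed out waiting for event'));
    }, timeoutMs);

    const subscription = event((e) => {
      if (!predicate(e)) {
        return; // Not the event we are waiting for
      }
      clearTimeout(timer);
      subscription.dispose();
      resolve(e);
    });
  });
}
```

In a test this might look like `const changed = onceEvent(vscode.workspace.onDidChangeTextDocument, e => e.document === doc);`, created before the edit and awaited after it, which removes the shared eventFired flag.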

Testing Webviews

Testing webviews is challenging because they run in an isolated context. There are several approaches:

Approach 1: Message-Based Testing (Recommended for Integration Tests)

Extension Side - Add Test Hooks:

class ChatPanel {
  private panel: vscode.WebviewPanel;
  private messageHandlers: Map<string, (message: any) => void> = new Map();

  constructor(extensionUri: vscode.Uri) {
    this.panel = vscode.window.createWebviewPanel(
      'chat',
      'Chat',
      vscode.ViewColumn.One,
      {
        enableScripts: true,
        retainContextWhenHidden: true
      }
    );

    this.panel.webview.onDidReceiveMessage(message => {
      // Handle normal messages
      if (message.type === 'userMessage') {
        this.handleUserMessage(message.text);
      }
      
      // Handle test messages (only in test environment)
      if (process.env.VSCODE_TEST_MODE === 'true') {
        if (message.type === 'test:state') {
          const handler = this.messageHandlers.get('state');
          handler?.(message);
        }
      }
    });
  }

  // Public method for tests to get state
  public requestState(): Promise<any> {
    return new Promise((resolve) => {
      this.messageHandlers.set('state', (message) => {
        resolve(message.data);
        this.messageHandlers.delete('state');
      });
      this.panel.webview.postMessage({ type: 'test:getState' });
    });
  }

  // Method to send messages to webview
  public sendMessage(text: string) {
    this.handleUserMessage(text);
  }

  private handleUserMessage(text: string) {
    // Your normal message handling logic
    // ...
    
    // Send to webview
    this.panel.webview.postMessage({
      type: 'agentResponse',
      text: 'Response to: ' + text
    });
  }
}

Webview Side - Add Test Handlers:

// In your webview HTML/JS
const vscode = acquireVsCodeApi();

let messages = [];

// Handle messages from extension
window.addEventListener('message', event => {
  const message = event.data;
  
  if (message.type === 'agentResponse') {
    messages.push(message);
    updateUI();
  }
  
  // Test-specific handlers
  if (message.type === 'test:getState') {
    vscode.postMessage({
      type: 'test:state',
      data: {
        messages: messages,
        // other state...
      }
    });
  }
});

// Handle user input
function sendMessage(text) {
  vscode.postMessage({
    type: 'userMessage',
    text: text
  });
}

Integration Test:

suite('Chat Webview Tests', () => {
  let chatPanel: ChatPanel;

  setup(async () => {
    // Set test mode
    process.env.VSCODE_TEST_MODE = 'true';
    
    // Create chat panel
    chatPanel = new ChatPanel(extensionUri);
  });

  teardown(async () => {
    // Clean up
    await vscode.commands.executeCommand('workbench.action.closeAllEditors');
    process.env.VSCODE_TEST_MODE = 'false';
  });

  test('Chat state persistence', async () => {
    // Send a message
    chatPanel.sendMessage('Hello');
    
    // Wait for response
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Get state before closing
    const stateBefore = await chatPanel.requestState();
    assert.strictEqual(stateBefore.messages.length, 1);
    
    // Close and reopen (the webview panel lives in the editor area)
    await vscode.commands.executeCommand('workbench.action.closeActiveEditor');
    await new Promise(resolve => setTimeout(resolve, 100));
    
    // Reopen chat
    chatPanel = new ChatPanel(extensionUri);
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Verify state persisted
    const stateAfter = await chatPanel.requestState();
    assert.strictEqual(stateAfter.messages.length, 1);
    assert.strictEqual(stateAfter.messages[0].text, 'Response to: Hello');
  });
});

Approach 2: Direct Extension-Side Testing

If your webview logic mostly lives on the extension side, test the handlers directly:

test('Handle user message', async () => {
  const chatPanel = new ChatPanel(extensionUri);
  
  // Simulate message from webview by calling the handler directly
  await chatPanel.handleWebviewMessage({
    type: 'userMessage',
    text: 'Test message'
  });
  
  // Verify the extension's state changed
  const messages = chatPanel.getMessages();
  assert.strictEqual(messages.length, 1);
  assert.strictEqual(messages[0].user, 'Test message');
});

Approach 3: Using WebdriverIO for True E2E Webview Testing

For complex webview UIs where you need to test the actual DOM:

Installation:

npm install --save-dev @wdio/cli @wdio/mocha-framework wdio-vscode-service

wdio.conf.ts:

import path from 'path';

export const config = {
  specs: ['./test/e2e/**/*.test.ts'],
  capabilities: [{
    browserName: 'vscode',
    browserVersion: 'stable',
    'wdio:vscodeOptions': {
      extensionPath: path.join(__dirname, '.'),
      userSettings: {
        'window.dialogStyle': 'custom'
      }
    }
  }],
  services: ['vscode'],
  framework: 'mocha',
  mochaOpts: {
    ui: 'bdd',
    timeout: 60000
  }
};

E2E Test:

describe('Chat Webview E2E', () => {
  it('should allow typing and sending messages', async () => {
    const workbench = await browser.getWorkbench();
    
    // Open your chat panel
    await browser.executeWorkbench((vscode) => {
      vscode.commands.executeCommand('your-extension.openChat');
    });
    
    // Wait for webview to appear
    await browser.pause(1000);
    
    // Switch to webview frame
    const webview = await $('iframe.webview');
    await browser.switchToFrame(webview);
    
    // Interact with webview DOM
    const input = await $('input[type="text"]');
    await input.setValue('Hello from E2E test');
    
    const sendButton = await $('button[type="submit"]');
    await sendButton.click();
    
    // Verify response appears
    const messages = await $$('.message');
    expect(messages).toHaveLength(2); // User message + bot response
  });
});

Advanced Testing Scenarios

Testing with Mock Dependencies

// Create a mock agent for deterministic testing
class MockAgent {
  async sendMessage(text: string): Promise<string> {
    // Return deterministic responses for testing
    if (text.includes('hello')) {
      return 'Hi there!';
    }
    return 'I received: ' + text;
  }
}

// Inject mock in tests
test('Chat with mock agent', async () => {
  const mockAgent = new MockAgent();
  const chatPanel = new ChatPanel(extensionUri, mockAgent);
  
  chatPanel.sendMessage('hello');
  await waitForCondition(() => chatPanel.getMessages().length > 0);
  
  const messages = chatPanel.getMessages();
  assert.strictEqual(messages[0].response, 'Hi there!');
});

Testing State Serialization

test('Serialize and restore webview state', async () => {
  const chatPanel = new ChatPanel(extensionUri);
  
  // Add some state
  chatPanel.sendMessage('First message');
  await new Promise(resolve => setTimeout(resolve, 200));
  
  chatPanel.sendMessage('Second message');
  await new Promise(resolve => setTimeout(resolve, 200));
  
  // Get serialized state
  const state = chatPanel.getSerializedState();
  assert.ok(state);
  assert.ok(state.messages);
  
  // Close panel
  chatPanel.dispose();
  
  // Create new panel with saved state
  const newChatPanel = ChatPanel.restore(extensionUri, state);
  
  // Verify state was restored
  const messages = newChatPanel.getMessages();
  assert.strictEqual(messages.length, 2);
  assert.strictEqual(messages[0].text, 'First message');
});

Testing with File System

import * as assert from 'assert';
import * as fs from 'fs/promises';
import * as os from 'os';
import * as path from 'path';
import * as vscode from 'vscode';

suite('File Operations', () => {
  let tempDir: string;

  setup(async () => {
    // Create temp directory for test files
    tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'vscode-test-'));
  });

  teardown(async () => {
    // Clean up temp files
    await fs.rm(tempDir, { recursive: true, force: true });
  });

  test('Should read and process files', async () => {
    // Create test file
    const testFile = path.join(tempDir, 'test.txt');
    await fs.writeFile(testFile, 'test content');
    
    // Open file in VSCode
    const doc = await vscode.workspace.openTextDocument(testFile);
    await vscode.window.showTextDocument(doc);
    
    // Execute your command
    await vscode.commands.executeCommand('your-extension.processFile');
    
    // Verify results
    const content = await fs.readFile(testFile, 'utf-8');
    assert.strictEqual(content, 'PROCESSED: test content');
  });
});

Testing Extension Configuration

test('Should respect configuration changes', async () => {
  const config = vscode.workspace.getConfiguration('your-extension');
  
  // Set test configuration
  await config.update('someSetting', 'testValue', 
    vscode.ConfigurationTarget.Global);
  
  try {
    // Execute command that uses config; the result shape is specific to your command
    const result = await vscode.commands.executeCommand<{ settingValue: string }>(
      'your-extension.useConfig');
    
    assert.strictEqual(result?.settingValue, 'testValue');
  } finally {
    // Clean up even if the assertion fails
    await config.update('someSetting', undefined, 
      vscode.ConfigurationTarget.Global);
  }
});

Testing Best Practices

1. Isolation

Each test should be independent
Clean up resources in teardown()
Don't rely on test execution order
Close editors and panels after tests

2. Determinism

Use mock agents or services for predictable behavior
Avoid timing dependencies where possible
Use proper wait conditions instead of arbitrary sleeps
Control randomness (use seeds for random data)
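
One way to replace fixed sleeps with a wait condition is a small polling helper. The sketch below is an illustration rather than part of any VSCode API: `pollUntil`, its default timeout, and its polling interval are hypothetical names and values.

```typescript
// Hypothetical helper: polls `condition` until it returns true or the
// timeout elapses. The defaults (5000ms, 50ms) are illustrative assumptions.
export async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs: number = 5000,
  intervalMs: number = 50
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Check first so an already-true condition returns without waiting
    if (await condition()) {
      return;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}
```

A test would then wait on the state it cares about, for example the number of messages in a panel, and fail with a clear error if that state never arrives.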

3. Speed

Keep integration tests focused
Don't test every edge case in integration tests
Use unit tests for detailed logic testing
Disable unnecessary extensions with --disable-extensions

4. Clarity

Use descriptive test names
Comment complex setup/teardown logic
Group related tests in suites
Keep tests readable and maintainable

5. Reliability

Handle asynchronous operations properly
Use appropriate timeouts
Add retry logic for flaky operations
Log failures for debugging

Test Helpers

Create reusable test utilities:

// test/helpers.ts
import * as vscode from 'vscode';
export async function createTestDocument(
  content: string, 
  language: string = 'plaintext'
): Promise<vscode.TextDocument> {
  const doc = await vscode.workspace.openTextDocument({
    content,
    language
  });
  return doc;
}

export async function closeAllEditors(): Promise<void> {
  await vscode.commands.executeCommand('workbench.action.closeAllEditors');
}

export async function waitForExtensionActivation(
  extensionId: string
): Promise<void> {
  const extension = vscode.extensions.getExtension(extensionId);
  if (!extension) {
    throw new Error(`Extension ${extensionId} not found`);
  }
  
  // activate() returns a Thenable, which has no .catch(); await it instead
  if (!extension.isActive) {
    await extension.activate();
  }
}

export class Deferred<T> {
  promise: Promise<T>;
  resolve!: (value: T) => void;
  reject!: (error: Error) => void;

  constructor() {
    this.promise = new Promise((resolve, reject) => {
      this.resolve = resolve;
      this.reject = reject;
    });
  }
}

Debugging Tests

VSCode Launch Configuration

Add to .vscode/launch.json:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Extension Tests",
      "type": "extensionHost",
      "request": "launch",
      "runtimeExecutable": "${execPath}",
      "args": [
        "--extensionDevelopmentPath=${workspaceFolder}",
        "--extensionTestsPath=${workspaceFolder}/out/test/suite/index",
        "--disable-extensions"
      ],
      "outFiles": [
        "${workspaceFolder}/out/test/**/*.js"
      ],
      "preLaunchTask": "npm: compile"
    }
  ]
}

Debugging Tips

Set breakpoints in your test files
Use Debug Console to inspect variables

Run single tests by using .only():

test.only('This test will run alone', () => {
  // ...
});

Use console.log for quick debugging
Check Extension Development Host output for extension logs

Running Specific Tests

# Run all tests
npm test

# Run tests matching pattern
npm test -- --grep "specific test name"

# Run with more verbose output
npm test -- --reporter spec

Common Patterns

Pattern: Testing Command Registration

test('Commands should be registered', async () => {
  const commands = await vscode.commands.getCommands(true);
  const expectedCommands = [
    'your-extension.command1',
    'your-extension.command2',
    'your-extension.command3'
  ];
  
  for (const cmd of expectedCommands) {
    assert.ok(
      commands.includes(cmd),
      `Command ${cmd} should be registered`
    );
  }
});

Pattern: Testing Status Bar Items

test('Should show status bar item', async () => {
  // Trigger action that creates status bar item
  await vscode.commands.executeCommand('your-extension.showStatus');
  
  // Status bar items aren't directly testable via API,
  // so test the underlying state
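  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  assert.ok(extension, 'Extension should be installed');
  // activate() resolves to the extension's exported API
  const statusItem = ((await extension.activate()) as any).statusBarItem;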
  
  assert.ok(statusItem);
  assert.strictEqual(statusItem.text, '$(check) Ready');
});

Pattern: Testing Tree Views

test('Tree view should show items', async () => {
  // Get your tree data provider
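  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  assert.ok(extension, 'Extension should be installed');
  // activate() resolves to the extension's exported API
  const treeProvider = ((await extension.activate()) as any).treeDataProvider;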
  
  // Get root items
  const items = await treeProvider.getChildren();
  
  assert.ok(items.length > 0);
  assert.strictEqual(items[0].label, 'Expected Item');
});

Pattern: Testing Quick Picks

test('Quick pick should show options', async () => {
  // This is tricky - quick picks block execution
  // One approach is to test the logic that generates options
  
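  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  assert.ok(extension, 'Extension should be installed');
  // activate() resolves to the extension's exported API
  const getQuickPickItems = ((await extension.activate()) as any).getQuickPickItems;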
  
  const items = await getQuickPickItems();
  
  assert.strictEqual(items.length, 3);
  assert.strictEqual(items[0].label, 'Option 1');
});

Tools and Libraries

Core Testing Tools

@vscode/test-cli: Official CLI for running tests (recommended)
@vscode/test-electron: Lower-level test runner for Desktop VSCode
@vscode/test-web: Test runner for web extensions
Mocha: Test framework used by VSCode (TDD or BDD style)

Additional Testing Tools

WebdriverIO + wdio-vscode-service: E2E testing with webview support
vscode-extension-tester: Alternative E2E testing tool by Red Hat
Sinon: Mocking and stubbing library
Chai: Assertion library (alternative to Node's assert)

Useful Utilities

// Helper to wait for promises with timeout
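export function withTimeout<T>(
  promise: Promise<T>, 
  timeoutMs: number
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((_, reject) => {
    timer = setTimeout(() => reject(new Error('Timeout')), timeoutMs);
  });
  // Clear the timer once either promise settles so it cannot outlive the test
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}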

// Helper to retry flaky operations
export async function retry<T>(
  fn: () => Promise<T>,
  attempts: number = 3,
  delay: number = 100
): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      if (i === attempts - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Retry failed');
}

Example: Complete Test Suite

Here's a complete example putting it all together:

import * as assert from 'assert';
import * as vscode from 'vscode';
import { ChatPanel } from '../../chatPanel';

suite('Chat Extension Test Suite', () => {
  let extensionUri: vscode.Uri;
  let chatPanel: ChatPanel | undefined;

  suiteSetup(async () => {
    // Run once before all tests
    const extension = vscode.extensions.getExtension('your-publisher.your-extension');
    assert.ok(extension);
    
    if (!extension.isActive) {
      await extension.activate();
    }
    
    extensionUri = extension.extensionUri;
  });

  setup(() => {
    // Run before each test
    process.env.VSCODE_TEST_MODE = 'true';
  });

  teardown(async () => {
    // Run after each test
    if (chatPanel) {
      chatPanel.dispose();
      chatPanel = undefined;
    }
    await vscode.commands.executeCommand('workbench.action.closeAllEditors');
    process.env.VSCODE_TEST_MODE = 'false';
  });

  test('Extension should be present', () => {
    assert.ok(vscode.extensions.getExtension('your-publisher.your-extension'));
  });

  test('Chat command should be registered', async () => {
    const commands = await vscode.commands.getCommands(true);
    assert.ok(commands.includes('your-extension.openChat'));
  });

  test('Should create chat panel', async () => {
    chatPanel = new ChatPanel(extensionUri);
    assert.ok(chatPanel);
  });

  test('Should send and receive messages', async function() {
    this.timeout(5000);
    
    chatPanel = new ChatPanel(extensionUri);
    
    // Send message
    chatPanel.sendMessage('Hello');
    
    // Wait for response
    await new Promise(resolve => setTimeout(resolve, 1000));
    
    const state = await chatPanel.requestState();
    assert.ok(state.messages.length > 0);
  });

  test('Should persist state across panel close/reopen', async function() {
    this.timeout(10000);
    
    // Create panel and send message
    chatPanel = new ChatPanel(extensionUri);
    chatPanel.sendMessage('Test message');
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Get state
    const stateBefore = await chatPanel.requestState();
    const messageCount = stateBefore.messages.length;
    
    // Serialize and dispose
    const serialized = chatPanel.getSerializedState();
    chatPanel.dispose();
    chatPanel = undefined;
    
    // Wait a bit
    await new Promise(resolve => setTimeout(resolve, 200));
    
    // Restore
    chatPanel = ChatPanel.restore(extensionUri, serialized);
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Verify
    const stateAfter = await chatPanel.requestState();
    assert.strictEqual(stateAfter.messages.length, messageCount);
  });
});

Summary

Integration testing for VSCode extensions requires:

Proper setup using @vscode/test-cli or @vscode/test-electron
Strategic testing - focus on critical workflows, use unit tests for details
Webview testing via message-passing or E2E tools like WebdriverIO
Good practices - isolation, determinism, proper cleanup
Debugging support with launch configurations

Testing webviews specifically requires creative approaches since they run in isolated contexts. The message-passing pattern works well for integration tests, while WebdriverIO is better for true E2E testing of complex UIs.

Remember: integration tests are slower than unit tests, so use them strategically for testing VSCode API interactions and critical user workflows.

Testing Implementation

This chapter documents the testing framework architecture for the VSCode extension, explaining how tests are structured and how to extend the testing system with new capabilities.

Architecture

Test Infrastructure

The test suite uses @vscode/test-cli which downloads and runs a VSCode instance, loads the extension in development mode, and executes Mocha tests in the extension host context.

Configuration in .vscode-test.mjs:

import { defineConfig } from "@vscode/test-cli";

export default defineConfig({
  files: "out/test/**/*.test.js",
  version: "stable",
  workspaceFolder: "./test-workspace",
  mocha: { ui: "tdd", timeout: 20000 }
});

Tests run with:

npm test

Testing API Design

Rather than coupling tests to implementation details, the extension exposes a command-based testing API. Tests invoke VSCode commands which delegate to public testing methods on ChatViewProvider.

Pattern:

// In extension.ts - register test command
context.subscriptions.push(
  vscode.commands.registerCommand("symposium.test.commandName", 
    async (arg1, arg2) => {
      return await chatProvider.testingMethod(arg1, arg2);
    }
  )
);

// In test - invoke via command
const result = await vscode.commands.executeCommand(
  "symposium.test.commandName", 
  arg1, 
  arg2
);

Current Testing Commands:

symposium.test.simulateNewTab(tabId) - Create a tab
symposium.test.getTabs() - Get list of tab IDs
symposium.test.sendPrompt(tabId, prompt) - Send prompt to tab
symposium.test.startCapturingResponses(tabId) - Begin capturing agent responses
symposium.test.getResponse(tabId) - Get accumulated response text
symposium.test.stopCapturingResponses(tabId) - Stop capturing
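
The commands above are meant to be chained. The following is a minimal sketch of one possible sequence, written against an injected `exec` function so it can run outside the extension host; in a real test you would pass `vscode.commands.executeCommand`. The helper name `sendAndCapture`, the `settleMs` delay, and the assumption that `symposium.test.getResponse` resolves to a string are all hypothetical.

```typescript
// Shaped like vscode.commands.executeCommand, injected so the flow can be
// exercised without a running extension host.
export type ExecuteCommand = (
  command: string,
  ...args: unknown[]
) => PromiseLike<unknown>;

// Hypothetical helper: creates a tab, sends one prompt, and returns whatever
// response text was captured. Assumes getResponse resolves to a string.
export async function sendAndCapture(
  exec: ExecuteCommand,
  tabId: string,
  prompt: string,
  settleMs: number = 1000
): Promise<string> {
  await exec("symposium.test.simulateNewTab", tabId);
  await exec("symposium.test.startCapturingResponses", tabId);
  try {
    await exec("symposium.test.sendPrompt", tabId, prompt);
    // Give the agent time to respond (assumed delay; tune for your agent)
    await new Promise<void>(resolve => setTimeout(resolve, settleMs));
    const response = await exec("symposium.test.getResponse", tabId);
    return typeof response === "string" ? response : "";
  } finally {
    // Stop capturing even if sending or reading the response failed
    await exec("symposium.test.stopCapturingResponses", tabId);
  }
}
```

Wrapping the capture in `try`/`finally` keeps a failed prompt from leaving response capture switched on for later tests.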

Adding New Test Commands

To test new behavior:

Add public method to ChatViewProvider (or relevant class):

export class ChatViewProvider {
  // Existing test methods...
  
  public async newTestingMethod(param: string): Promise<ResultType> {
    // Implementation that exposes needed behavior
    return result;
  }
}

Register command in extension.ts:

context.subscriptions.push(
  vscode.commands.registerCommand(
    "symposium.test.newCommand",
    async (param: string) => {
      return await chatProvider.newTestingMethod(param);
    }
  )
);

Use in tests:

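test("Should test new behavior", async () => {
  // executeCommand resolves to an untyped value, so state the expected shape
  const result = await vscode.commands.executeCommand<{ expected: boolean }>(
    "symposium.test.newCommand",
    "test-param"
  );
  assert.strictEqual(result?.expected, true);
});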

Structured Logging for Assertions

Tests verify behavior through structured log events rather than console scraping.

Logger Architecture:

export class Logger {
  private outputChannel: vscode.OutputChannel;
  private eventEmitter = new vscode.EventEmitter<LogEvent>();
  
  public get onLog(): vscode.Event<LogEvent> {
    return this.eventEmitter.event;
  }
  
  public info(category: string, message: string, data?: any): void {
    const event: LogEvent = { 
      timestamp: new Date(), 
      level: "info", 
      category, 
      message, 
      data 
    };
    this.eventEmitter.fire(event);
    this.outputChannel.appendLine(/* formatted output */);
  }
}

Dual Purpose:

Testing - Event emitter allows tests to capture and assert on events
Live Debugging - Output channel shows logs in VSCode Output panel
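
To make the dual-purpose idea concrete, here is a minimal editor-independent sketch. It is not the extension's Logger: `SketchLogger` is a hypothetical stand-in that replaces `vscode.EventEmitter` and the output channel with plain callbacks. The `LogEvent` fields follow the excerpt above, while the level values other than "info" are assumptions.

```typescript
// Field names follow the Logger excerpt; levels other than "info" are assumed.
export interface LogEvent {
  timestamp: Date;
  level: "debug" | "info" | "warn" | "error";
  category: string;
  message: string;
  data?: unknown;
}

type Listener = (event: LogEvent) => void;

// Hypothetical stand-in for the Logger: listeners play the role of the event
// emitter, and `writeLine` plays the role of the output channel.
export class SketchLogger {
  private listeners: Listener[] = [];

  constructor(private writeLine: (line: string) => void = () => {}) {}

  // Mirrors onLog: returns an object whose dispose() removes the listener.
  onLog(listener: Listener): { dispose(): void } {
    this.listeners.push(listener);
    return {
      dispose: () => {
        this.listeners = this.listeners.filter(l => l !== listener);
      },
    };
  }

  info(category: string, message: string, data?: unknown): void {
    const event: LogEvent = {
      timestamp: new Date(),
      level: "info",
      category,
      message,
      data,
    };
    // Structured event for tests, formatted line for live debugging
    this.listeners.forEach(listener => listener(event));
    this.writeLine(`[${event.level}] [${category}] ${message}`);
  }
}
```

Every call to `info` feeds both consumers, so tests and the output panel never drift apart.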

Usage in Tests:

const logEvents: LogEvent[] = [];
const disposable = logger.onLog((event) => logEvents.push(event));

// ... perform test actions ...

const relevantEvents = logEvents.filter(
  e => e.category === "agent" && e.message === "Session created"
);
assert.strictEqual(relevantEvents.length, 2);

Adding New Log Points

To make behavior testable:

Add log statement in implementation:

logger.info("category", "Descriptive message", {
  relevantData: value,
  moreContext: other
});

Filter and assert in tests:

const events = logEvents.filter(
  e => e.category === "category" && e.message === "Descriptive message"
);
assert.ok(events.length > 0);
assert.strictEqual(events[0].data.relevantData, expectedValue);
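
Because log events arrive asynchronously, a test can also wait for a specific event rather than filter after a fixed delay. The helper below is a sketch under assumptions: `waitForLogEvent` and its types are hypothetical, and `subscribe` is expected to behave like `logger.onLog` (it takes a listener and returns an object with `dispose()`).

```typescript
// Minimal event shape needed for matching; mirrors the fields used above.
interface LogEventLike {
  category: string;
  message: string;
  data?: unknown;
}

// Anything shaped like logger.onLog: registers a listener, returns a disposable.
type Subscribe = (
  listener: (event: LogEventLike) => void
) => { dispose(): void };

// Hypothetical helper: resolves with the first matching event, or rejects
// after `timeoutMs`. The listener is disposed in both cases.
export function waitForLogEvent(
  subscribe: Subscribe,
  matches: (event: LogEventLike) => boolean,
  timeoutMs: number = 5000
): Promise<LogEventLike> {
  return new Promise((resolve, reject) => {
    let subscription: { dispose(): void } | undefined;
    const timer = setTimeout(() => {
      subscription?.dispose();
      reject(new Error(`No matching log event within ${timeoutMs}ms`));
    }, timeoutMs);
    subscription = subscribe(event => {
      if (!matches(event)) {
        return;
      }
      clearTimeout(timer);
      subscription?.dispose();
      resolve(event);
    });
  });
}
```

Start waiting before triggering the action under test, so an event fired immediately is not missed.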

Log Categories:

webview - Webview lifecycle events
agent - Agent spawning, sessions, communication
Add new categories as needed for different subsystems

Design Decisions

Command-Based Testing API

Alternative: Direct access to ChatViewProvider internals from tests

Chosen: Command-based testing API

Rationale:

Decouples tests from implementation details
Tests the same code paths as real usage
Allows refactoring without breaking tests
Commands document the testing interface

Real Agents vs Mocks

Alternative: Mock agent responses with canned data

Chosen: Real ElizACP over ACP protocol

Rationale:

Tests the full protocol stack (JSON-RPC, stdio, conductor)
Verifies conductor integration
Catches protocol-level bugs
Provides realistic timing and behavior

ElizACP is lightweight, deterministic, and fast enough for testing.

Event-Based Logging

Alternative: Console output scraping with regex

Chosen: Event emitter with structured data

Rationale:

Enables precise assertions on event counts and data
Provides rich context for debugging
Output panel visibility for live debugging
No brittle string matching
Same infrastructure serves testing and development

Test Isolation

Challenge: Tests share VSCode instance, agent processes persist across tests

Strategy: Make tests order-independent:

Assert "spawned OR reused" rather than exact counts
Focus on test-specific events (e.g., prompts sent, responses received)
Capture logs from test start, not globally
Don't assume clean state between tests

This allows the test suite to pass regardless of execution order.
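
The strategy above can be sketched as a few small helpers. This is an illustration under assumptions, not code from the test suite: the helper names and the message strings "Agent spawned", "Agent reused", and "Prompt sent" are hypothetical.

```typescript
// Minimal event shape; mirrors the category/message fields used in this chapter.
export interface CapturedEvent {
  category: string;
  message: string;
}

// Events recorded from `startIndex` onward, i.e. since the current test began.
export function eventsSince(
  events: CapturedEvent[],
  startIndex: number
): CapturedEvent[] {
  return events.slice(startIndex);
}

// True if an agent was either spawned or reused. Both message strings are
// hypothetical; substitute whatever your log points actually emit.
export function agentIsAvailable(events: CapturedEvent[]): boolean {
  return events.some(
    e =>
      e.category === "agent" &&
      (e.message === "Agent spawned" || e.message === "Agent reused")
  );
}

// Counts events with an exact category and message.
export function countEvents(
  events: CapturedEvent[],
  category: string,
  message: string
): number {
  return events.filter(e => e.category === category && e.message === message)
    .length;
}
```

A test records the log length when it starts, then asserts only on `eventsSince(log, start)`, so events left over from earlier tests cannot change the outcome.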

Writing Tests

Tests live in src/test/*.test.ts and use Mocha's TDD interface:

suite("Feature Tests", () => {
  test("Should do something", async function() {
    this.timeout(20000); // Extend timeout for async operations
    
    // Setup log capture
    const logEvents: LogEvent[] = [];
    const disposable = logger.onLog((event) => logEvents.push(event));
    
    try {
      // Perform test actions via commands
      await vscode.commands.executeCommand("symposium.test.doSomething");
      
      // Wait for async completion
      await new Promise(resolve => setTimeout(resolve, 1000));
      
      // Assert on results
      const events = logEvents.filter(/* ... */);
      assert.ok(events.length > 0);
    } finally {
      // Dispose the listener even when an assertion fails
      disposable.dispose();
    }
  });
});

Key Patterns:

Use async function() (not arrow functions) to access this.timeout()
Extend timeout for operations involving agent spawning
Always dispose log listeners
Add delays for async operations (agent responses, UI updates)

Related chapters:

Message Protocol - Extension ↔ webview communication
State Persistence - How state survives webview lifecycle

Implementation Status

This chapter tracks what's been implemented, what's in progress, and what's planned for the VSCode extension.

Core Architecture

Three-layer architecture (webview/extension/agent)
Message routing with UUID-based identification
HomerActor mock agent with session support
Webview state persistence with session ID checking
Message buffering when webview is hidden
Message deduplication via last-seen-index tracking
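
To illustrate how buffering and last-seen-index deduplication can fit together, here is a minimal sketch. It is not the extension's implementation: `MessageBuffer`, `Deduplicator`, and the `IndexedMessage` shape are hypothetical names chosen for this example.

```typescript
// Hypothetical message shape: each message carries a monotonically
// increasing index assigned by the sender.
export interface IndexedMessage {
  index: number;
  text: string;
}

// Sender side: queues messages while the webview is hidden and flushes
// them in order once it becomes visible again.
export class MessageBuffer {
  private pending: IndexedMessage[] = [];
  private nextIndex = 0;
  private visible = false;

  constructor(private post: (message: IndexedMessage) => void) {}

  send(text: string): void {
    const message: IndexedMessage = { index: this.nextIndex++, text };
    if (this.visible) {
      this.post(message);
    } else {
      this.pending.push(message);
    }
  }

  setVisible(visible: boolean): void {
    this.visible = visible;
    if (visible) {
      const queued = this.pending;
      this.pending = [];
      queued.forEach(message => this.post(message));
    }
  }
}

// Receiver side: drops any message at or below the last index already seen,
// so a replayed message is not rendered twice.
export class Deduplicator {
  private lastSeenIndex = -1;

  accept(message: IndexedMessage): boolean {
    if (message.index <= this.lastSeenIndex) {
      return false;
    }
    this.lastSeenIndex = message.index;
    return true;
  }
}
```

This design relies on messages being delivered in index order; an out-of-order transport would need a set of seen indices instead of a single counter.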

Error Handling

Agent crash detection (partially implemented - detection works, UI error display incomplete)
Complete error recovery UX (restart agent button, error notifications)
Agent health monitoring and automatic restart

Agent Lifecycle

Agent spawn on extension activation (partially implemented - spawn/restart works, graceful shutdown incomplete)
Graceful agent shutdown on extension deactivation
Agent process supervision and restart on crash

ACP Protocol Support

Connection & Lifecycle

Client-side connection (ClientSideConnection)
Protocol initialization and capability negotiation
Session creation (newSession)
Prompt sending (prompt)
Streaming response handling (sessionUpdate)
Session cancellation (session/cancel)
Session mode switching (session/set_mode)
Model selection (session/set_model)
Authentication flow

Tool Permissions

Permission request callback (requestPermission)
MynahUI approval cards with approve/deny/bypass options
Per-agent bypass permissions in settings
Settings UI for managing bypass permissions
Automatic approval when bypass enabled

Session Updates

The client receives sessionUpdate notifications from the agent. Current support:

agent_message_chunk - Display streaming text in chat UI
tool_call - Logged to console (not displayed in UI)
tool_call_update - Logged to console (not displayed in UI)
Execution plans - Not implemented
Thinking steps - Not implemented
Custom update types - Not implemented

Gap: Tool calls are logged but not visually displayed. Users don't see which tools are being executed or their progress.
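
The current handling can be pictured as a dispatch over the update type. The sketch below is deliberately simplified and rests on assumptions: the payload shapes, the `kind` discriminator, and the `ChatSink` interface are hypothetical and do not reproduce the real ACP schema, which carries more fields.

```typescript
// Simplified, assumed payload shapes keyed by the update names listed above.
export type SessionUpdate =
  | { kind: "agent_message_chunk"; text: string }
  | { kind: "tool_call"; toolCallId: string; title: string }
  | { kind: "tool_call_update"; toolCallId: string; status: string };

// Hypothetical sink: the chat UI on one side, a log on the other.
export interface ChatSink {
  appendText(text: string): void;
  log(line: string): void;
}

// Streams message chunks to the UI and only logs tool activity, matching
// the support level described above.
export function handleSessionUpdate(update: SessionUpdate, sink: ChatSink): void {
  switch (update.kind) {
    case "agent_message_chunk":
      sink.appendText(update.text);
      break;
    case "tool_call":
      sink.log(`tool_call ${update.toolCallId}: ${update.title}`);
      break;
    case "tool_call_update":
      sink.log(`tool_call_update ${update.toolCallId}: ${update.status}`);
      break;
  }
}
```

Closing the gap would mean replacing the two `sink.log` branches with calls that render or update a card in the chat UI.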

File System Capabilities

readTextFile - Stub implemented (throws "not yet implemented")
writeTextFile - Stub implemented (throws "not yet implemented")

Current state: We advertise fs.readTextFile: false and fs.writeTextFile: false in capabilities, so agents know we don't support file operations.

Why not implemented: Requires VSCode workspace API integration and security considerations (which files can be accessed, path validation, etc.).

Terminal Capabilities

createTerminal - Not implemented
Terminal output streaming - Not implemented
Terminal lifecycle (kill, release) - Not implemented

Why not implemented: Requires integrating with VSCode's terminal API and managing terminal lifecycle. Also involves security considerations around command execution.

Extension Points

Extension methods (extMethod) - Not implemented
Extension notifications (extNotification) - Not implemented

These allow protocol extensions beyond the ACP specification. Not currently needed but could be useful for custom features.

State Management

Webview state persistence within session
Chat history persistence across hide/show cycles
Draft text persistence (FIXME: partially typed prompts are lost on hide/show)
Session restoration after VSCode restart
Workspace-specific state persistence
Tab history and conversation export

MynahUI GUI Capabilities Guide

Overview

MynahUI is a data and event-driven chat interface library for browsers and webviews. This guide focuses on the interactive GUI capabilities relevant for building tool permission and approval workflows.

Core Concepts

Chat Items

Chat items are the fundamental building blocks of the conversation UI. Each chat item is a "card" that can contain various interactive elements.

Basic Structure:

interface ChatItem {
  type: ChatItemType;           // Determines positioning and styling
  messageId?: string;            // Unique identifier for updates
  body?: string;                 // Markdown content
  buttons?: ChatItemButton[];    // Action buttons
  formItems?: ChatItemFormItem[]; // Form inputs
  fileList?: FileList;           // File tree display
  followUp?: FollowUpOptions;    // Quick action pills
  // ... many more options
}

Chat Item Types:

ANSWER / ANSWER_STREAM / CODE_RESULT → Left-aligned (AI responses)
PROMPT / SYSTEM_PROMPT → Right-aligned (user messages)
DIRECTIVE → Transparent, no background

Interactive Components

1. Buttons (`ChatItemButton`)

Buttons are the primary action mechanism for user approval/denial workflows.

Interface:

interface ChatItemButton {
  id: string;                    // Unique identifier for the button
  text?: string;                 // Button label
  icon?: MynahIcons;             // Optional icon
  status?: 'main' | 'primary' | 'clear' | 'dimmed-clear' | 'info' | 'success' | 'warning' | 'error';
  keepCardAfterClick?: boolean;  // If false, removes card after click
  waitMandatoryFormItems?: boolean; // Disables until mandatory form items are filled
  disabled?: boolean;
  description?: string;          // Tooltip text
}

Status Colors:

main - Primary brand color
primary - Accent color
success - Green (for approval actions)
error - Red (for denial/rejection actions)
warning - Yellow/orange
info - Blue
clear - Transparent background

Event Handler:

onInBodyButtonClicked: (tabId: string, messageId: string, action: {
  id: string;
  text?: string;
  // ... other button properties
}) => void

Example - Approval Buttons:

{
  type: ChatItemType.ANSWER,
  messageId: 'tool-approval-123',
  body: 'Tool execution request...',
  buttons: [
    {
      id: 'approve-once',
      text: 'Approve',
      status: 'primary',
      icon: MynahIcons.OK
    },
    {
      id: 'approve-session',
      text: 'Approve for Session',
      status: 'success',
      icon: MynahIcons.OK_CIRCLED
    },
    {
      id: 'deny',
      text: 'Deny',
      status: 'error',
      icon: MynahIcons.CANCEL,
      keepCardAfterClick: false  // Card disappears on denial
    }
  ]
}
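
On the handling side, the button `id` received by `onInBodyButtonClicked` has to be turned into a decision. A minimal sketch, assuming the three ids from the example above; `ApprovalDecision` and `decisionForButton` are hypothetical names, not part of MynahUI.

```typescript
// Hypothetical result type for an approval card.
export type ApprovalDecision =
  | { outcome: "approved"; scope: "once" | "session" }
  | { outcome: "denied" };

// Maps a clicked button id to a decision. Returns undefined for ids that
// do not belong to the approval card, so callers can ignore them.
export function decisionForButton(buttonId: string): ApprovalDecision | undefined {
  switch (buttonId) {
    case "approve-once":
      return { outcome: "approved", scope: "once" };
    case "approve-session":
      return { outcome: "approved", scope: "session" };
    case "deny":
      return { outcome: "denied" };
    default:
      return undefined;
  }
}
```

Keeping this mapping in a pure function makes the approval logic testable without rendering any UI.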

2. Form Items (`ChatItemFormItem`)

Form items allow collecting structured user input alongside button actions.

Available Form Types:

textinput / textarea / numericinput / email
select (dropdown)
radiogroup / toggle
checkbox / switch
stars (rating)
list (dynamic list of items)
pillbox (tag/pill input)

Common Properties:

interface BaseFormItem {
  id: string;                // Unique identifier
  type: string;              // Form type
  mandatory?: boolean;       // Required field
  title?: string;            // Label
  description?: string;      // Help text
  tooltip?: string;          // Tooltip
  value?: string;            // Initial/current value
  disabled?: boolean;
}

Example - Checkbox for "Remember Choice":

formItems: [
  {
    type: 'checkbox',
    id: 'remember-approval',
    label: 'Remember this choice for similar requests',
    value: 'false',
    tooltip: 'If checked, future requests for this tool will be automatically approved'
  }
]

Example - Toggle for Options:

formItems: [
  {
    type: 'toggle',
    id: 'approval-scope',
    title: 'Approval Scope',
    value: 'once',
    options: [
      { value: 'once', label: 'Once', icon: MynahIcons.CHECK },
      { value: 'session', label: 'Session', icon: MynahIcons.STACK },
      { value: 'always', label: 'Always', icon: MynahIcons.OK_CIRCLED }
    ]
  }
]

Event Handlers:

onFormChange: (tabId: string, messageId: string, item: ChatItemFormItem, value: any) => void
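
One way to have form values at hand when a button is later clicked is to record them as `onFormChange` fires. The tracker below is a sketch under assumptions: `FormValueTracker` is a hypothetical helper, and it relies only on the form item's `id` from the handler arguments.

```typescript
// Hypothetical helper: remembers the latest value of each form item,
// grouped by the tab and message that own the form.
export class FormValueTracker {
  private values = new Map<string, Map<string, unknown>>();

  // JSON-encode the pair so ids containing separators cannot collide
  private key(tabId: string, messageId: string): string {
    return JSON.stringify([tabId, messageId]);
  }

  // Call from onFormChange with item.id and the new value.
  record(tabId: string, messageId: string, itemId: string, value: unknown): void {
    const key = this.key(tabId, messageId);
    let bucket = this.values.get(key);
    if (!bucket) {
      bucket = new Map<string, unknown>();
      this.values.set(key, bucket);
    }
    bucket.set(itemId, value);
  }

  // Returns the last recorded value, or undefined if none was recorded.
  get(tabId: string, messageId: string, itemId: string): unknown {
    return this.values.get(this.key(tabId, messageId))?.get(itemId);
  }

  // Forget a card's values once it has been handled or dismissed.
  clear(tabId: string, messageId: string): void {
    this.values.delete(this.key(tabId, messageId));
  }
}
```

Note that an item the user never touched has no recorded value, so callers should fall back to the item's initial `value`.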

3. Content Display Options

Markdown Body

The body field supports full markdown including:

Headings (#, ##, ###)
Code blocks with syntax highlighting
Inline code
Links
Lists (ordered/unordered)
Blockquotes
Tables

Example - Displaying Tool Parameters:

body: `### Tool Execution Request

**Tool:** \`read_file\`

**Parameters:**
\`\`\`json
{
  "file_path": "/Users/niko/src/config.ts",
  "offset": 0,
  "limit": 100
}
\`\`\`

Do you want to allow this tool to execute?`
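
Rather than writing such a body by hand for every tool, it can be generated from the tool name and its parameters. A minimal sketch; `formatToolRequest` is a hypothetical helper, and it assumes the tool name contains no backticks.

```typescript
// Hypothetical helper: builds the markdown body for a tool approval card.
// Assumes `tool` contains no backticks; escape it first if it might.
export function formatToolRequest(
  tool: string,
  params: Record<string, unknown>
): string {
  // Built at runtime to avoid a literal code fence inside this source
  const fence = "`".repeat(3);
  return [
    "### Tool Execution Request",
    "",
    `**Tool:** \`${tool}\``,
    "",
    "**Parameters:**",
    `${fence}json`,
    JSON.stringify(params, null, 2),
    fence,
    "",
    "Do you want to allow this tool to execute?",
  ].join("\n");
}
```

The result can be assigned directly to the `body` field of a chat item.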

Custom Renderer

For complex layouts beyond markdown, use customRenderer with HTML markup:

customRenderer: `
<div>
  <h4>Tool: <code>read_file</code></h4>
  <table>
    <tr>
      <th>Parameter</th>
      <th>Value</th>
    </tr>
    <tr>
      <td>file_path</td>
      <td><code>/Users/niko/src/config.ts</code></td>
    </tr>
    <tr>
      <td>offset</td>
      <td><code>0</code></td>
    </tr>
  </table>
</div>
`

Information Cards

For hierarchical content with status indicators:

informationCard: {
  title: 'Security Notice',
  status: {
    status: 'warning',
    icon: MynahIcons.WARNING,
    body: 'This tool will access filesystem resources'
  },
  description: 'Review the parameters carefully',
  content: {
    body: '... detailed information ...'
  }
}

4. File Lists

Display file paths with actions and metadata:

fileList: {
  fileTreeTitle: 'Files to be accessed',
  filePaths: ['/src/config.ts', '/src/main.ts'],
  details: {
    '/src/config.ts': {
      icon: MynahIcons.FILE,
      description: 'Configuration file',
      clickable: true
    }
  },
  actions: {
    '/src/config.ts': [
      {
        name: 'view-details',
        icon: MynahIcons.EYE,
        description: 'View file details'
      }
    ]
  }
}

Event Handler:

onFileActionClick: (tabId: string, messageId: string, filePath: string, actionName: string) => void

5. Follow-Up Pills

Quick action buttons displayed as pills:

followUp: {
  text: 'Quick actions',
  options: [
    {
      pillText: 'Approve All',
      icon: MynahIcons.OK,
      status: 'success',
      prompt: 'approve-all'  // Can trigger automatic actions
    },
    {
      pillText: 'Deny All',
      icon: MynahIcons.CANCEL,
      status: 'error',
      prompt: 'deny-all'
    }
  ]
}

Event Handler:

onFollowUpClicked: (tabId: string, messageId: string, followUp: ChatItemAction) => void

Card Behavior Options

Visual States

{
  status?: 'info' | 'success' | 'warning' | 'error';  // Colors the card border/icon
  shimmer?: boolean;         // Loading animation
  canBeVoted?: boolean;      // Show thumbs up/down
  canBeDismissed?: boolean;  // Show dismiss button
  snapToTop?: boolean;       // Pin to top of chat
  border?: boolean;          // Show border
  hoverEffect?: boolean;     // Highlight on hover
}

Layout Options

{
  fullWidth?: boolean;               // Stretch to container width
  padding?: boolean;                 // Internal padding
  contentHorizontalAlignment?: 'default' | 'center';
}

Card Lifecycle

{
  keepCardAfterClick?: boolean;      // On buttons - set to false to remove the card after click
  autoCollapse?: boolean;            // Auto-collapse long content
}

Updating Chat Items

Chat items can be updated after creation:

// Add new chat item
mynahUI.addChatItem(tabId, chatItem);

// Update by message ID
mynahUI.updateChatAnswerWithMessageId(tabId, messageId, updatedChatItem);

// Update last streaming answer
mynahUI.updateLastChatAnswer(tabId, partialChatItem);

Complete Example: Tool Approval Workflow

// 1. Show tool approval request
mynahUI.addChatItem('main-tab', {
  type: ChatItemType.ANSWER,
  messageId: 'tool-approval-read-file-001',
  status: 'warning',
  icon: MynahIcons.LOCK,
  body: `### Tool Execution Request

**Tool:** \`read_file\`

**Description:** Read file contents from the filesystem

**Parameters:**
\`\`\`json
{
  "file_path": "/Users/nikomat/dev/mynah-ui/src/config.ts",
  "offset": 0,
  "limit": 2000
}
\`\`\`

**Security:** This tool will access local filesystem resources.`,
  
  formItems: [
    {
      type: 'checkbox',
      id: 'remember-read-file',
      label: 'Trust this tool for the remainder of the session',
      value: 'false'
    }
  ],
  
  buttons: [
    {
      id: 'approve',
      text: 'Approve',
      status: 'success',
      icon: MynahIcons.OK,
      keepCardAfterClick: false
    },
    {
      id: 'deny',
      text: 'Deny',
      status: 'error',
      icon: MynahIcons.CANCEL,
      keepCardAfterClick: false
    },
    {
      id: 'details',
      text: 'More Details',
      status: 'clear',
      icon: MynahIcons.INFO
    }
  ]
});

// 2. Handle button clicks (handlers are MynahUIProps, so in practice pass this in the MynahUI constructor props)
mynahUI.onInBodyButtonClicked = (tabId, messageId, action) => {
  if (messageId === 'tool-approval-read-file-001') {
    const formState = mynahUI.getFormState(tabId, messageId);
    const rememberChoice = formState['remember-read-file'] === 'true';
    
    switch (action.id) {
      case 'approve':
        // Execute tool
        // If rememberChoice, add to session whitelist
        break;
      case 'deny':
        // Cancel tool execution
        break;
      case 'details':
        // Show additional information
        mynahUI.updateChatAnswerWithMessageId(tabId, messageId, {
          informationCard: {
            title: 'Tool Details',
            content: {
              body: 'Detailed tool documentation...'
            }
          }
        });
        break;
    }
  }
};

Progressive Updates

For multi-step approval flows, you can progressively update the same card:

// Initial request
mynahUI.addChatItem(tabId, {
  messageId: 'approval-001',
  type: ChatItemType.ANSWER,
  body: 'Waiting for approval...',
  shimmer: true
});

// User approves
mynahUI.updateChatAnswerWithMessageId(tabId, 'approval-001', {
  body: 'Approved! Executing tool...',
  shimmer: true,
  buttons: []  // Remove buttons
});

// Execution complete
mynahUI.updateChatAnswerWithMessageId(tabId, 'approval-001', {
  body: 'Tool execution complete!',
  shimmer: false,
  status: 'success',
  icon: MynahIcons.OK_CIRCLED
});

Sticky Cards

For persistent approval requests that stay above the prompt:

mynahUI.updateStore(tabId, {
  promptInputStickyCard: {
    messageId: 'persistent-approval',
    body: 'Multiple tools are waiting for approval',
    status: 'warning',
    icon: MynahIcons.WARNING,
    buttons: [
      {
        id: 'review-pending',
        text: 'Review Pending',
        status: 'info'
      }
    ]
  }
});

// Clear sticky card
mynahUI.updateStore(tabId, {
  promptInputStickyCard: null
});

Best Practices for Tool Approval UI

Clear Tool Identity: Always show tool name prominently
Parameter Visibility: Display all parameters the tool will receive
Security Context: Indicate security implications (file access, network, etc.)
Action Clarity: Use clear "Approve" vs "Deny" with appropriate status colors
Scope Options: Provide "once", "session", "always" choices when appropriate
Non-blocking: Use keepCardAfterClick: false to auto-dismiss after approval
Progressive Disclosure: Start simple, show details on demand
Feedback: Update card state to show execution progress after approval

Key Event Handlers

interface MynahUIProps {
  onInBodyButtonClicked?: (tabId: string, messageId: string, action: ChatItemButton) => void;
  onFollowUpClicked?: (tabId: string, messageId: string, followUp: ChatItemAction) => void;
  onFormChange?: (tabId: string, messageId: string, item: ChatItemFormItem, value: any) => void;
  onFileActionClick?: (tabId: string, messageId: string, filePath: string, actionName: string) => void;
  // ... many more
}

Reference

Full documentation: mynah-ui/docs/DATAMODEL.md
Type definitions: mynah-ui/src/static.ts
Examples: mynah-ui/example/src/samples/

VSCode Webview State Preservation: Complete Guide for Chat Interfaces

Your mynah-ui chat extension can preserve draft text automatically using VSCode's built-in APIs. The key insight: there's no "last chance" event before destruction, so you must save continuously. The official VSCode documentation shows setState() being called every 100ms without performance concerns, and popular extensions use debounced saves at 300-500ms intervals.

VSCode webview lifecycle: No beforeunload safety net

VSCode webviews do not expose a beforeunload or similar "last chance" event through the extension API. This is the most critical finding for your implementation. You have exactly two lifecycle events to work with:

onDidChangeViewState fires when the webview's visibility changes or when it moves to a different editor column. It provides access to the webviewPanel.visible and webviewPanel.viewColumn properties. Critically, this event does NOT fire when the webview is disposed; it fires only when the panel becomes hidden or changes position. (This is a WebviewPanel event. A sidebar WebviewView reports visibility through onDidChangeVisibility instead.) The browser's beforeunload event exists within the webview iframe itself, but a handler there cannot count on an asynchronous round trip to your extension completing before teardown, making it unreliable for state preservation.

onDidDispose fires after the webview is already destroyed—too late for state saving. Use it only for cleanup operations like canceling timers or removing subscriptions. By the time this event fires, your webview context is gone and any unsaved state is lost.

The recommended pattern is to save state continuously rather than trying to intercept disposal. VSCode's official documentation explicitly shows this approach, with their example calling setState() every 100ms in a setInterval without any warnings about performance impact.

setState performance: Call it freely with light debouncing

The performance cost of vscode.setState() is remarkably low. Microsoft's official documentation states that "getState and setState are the preferred way to persist state, as they have much lower performance overhead than retainContextWhenHidden." The API appears to be synchronous, accepts JSON-serializable objects, and has no documented size limits or throttling mechanisms.

The official VSCode webview sample demonstrates calling setState() 10 times per second (every 100ms) without any performance warnings or caveats. This suggests the operation is highly optimized and suitable for frequent updates. Real-world extension analysis shows a community consensus around 300-500ms debounce intervals for text input, which balances responsiveness with minimal overhead.

Is it acceptable to call on every keystroke? Technically yes, but practically you should debounce. Here's why: while setState itself is lightweight, debouncing serves UX purposes more than performance. A 300-500ms debounce provides a better user experience by avoiding excessive state churn while ensuring draft preservation happens quickly enough that users rarely lose more than half a second of typing if they close the sidebar mid-sentence.
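
A debounce helper with an explicit flush makes this trade-off concrete. This is a sketch under assumptions: debounce, flush, and cancel are names chosen here for illustration (they are not VSCode or mynah-ui APIs), and the save callback stands in for a call to vscode.setState().

```typescript
interface Debounced<A extends unknown[]> {
  (...args: A): void;
  /** Run the pending call now, if there is one. */
  flush(): void;
  /** Drop the pending call without running it. */
  cancel(): void;
}

function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): Debounced<A> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pending: A | undefined;

  const run = (): void => {
    timer = undefined;
    if (pending !== undefined) {
      const args = pending;
      pending = undefined;
      fn(...args);
    }
  };

  const debounced = ((...args: A): void => {
    pending = args; // keep only the latest arguments
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(run, waitMs);
  }) as Debounced<A>;

  debounced.flush = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    run();
  };

  debounced.cancel = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
    pending = undefined;
  };

  return debounced;
}

// Stand-in for vscode.setState(): records what would have been persisted.
const saved: string[] = [];
const saveDraft = debounce((text: string) => { saved.push(text); }, 500);

saveDraft("h");
saveDraft("he");
saveDraft("hello");
saveDraft.flush(); // e.g. on tab switch: persist the latest text immediately
```

Three keystrokes produce a single write containing the latest text. Calling cancel() after a message is sent prevents a pending save from re-creating a draft that was just cleared.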

Popular extension patterns: The REST Client extension saves request history to globalState immediately on submission. The GistPad extension uses a 1500ms debounce for search input updates. The Continue AI extension relies on message passing between webview and extension for complex state management rather than setState alone. Most extensions combine approaches—using setState for immediate UI state and globalState for data that must survive webview disposal.

mynah-ui API: Event-driven architecture with limited draft access

mynah-ui does not expose a direct API to retrieve current draft text from input fields in its public documentation. The library follows a strictly event-driven pattern where user input is captured through the onChatPrompt callback, which fires when users submit messages—not during typing.

The getAllTabs() method is not explicitly documented as including unsent draft messages. Based on the library's architecture, tabs contain conversation history and submitted messages, not draft state. You'll need to implement your own draft tracking by monitoring the underlying DOM input elements or maintaining draft state in your extension code.

Events you can hook into:

onChatPrompt: Fires when users submit a message (your primary input capture point)
onTabChange: Fires when switching between tabs (good opportunity to save current draft)
onTabAdd/onTabRemove: Tab lifecycle events

mynah-ui uses a centralized reactive data store where updates automatically trigger re-renders of subscribed components. The library prioritizes declarative state management over imperative queries, which is why draft access methods aren't prominent. For your use case, you'll likely need to access the input DOM elements directly or maintain a parallel draft state structure outside mynah-ui.
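
One way to maintain that parallel draft state is a small set of pure functions over a plain object, which keeps the data JSON-serializable. This is a sketch only: DraftState, setDraft, clearDraft, and getDraft are hypothetical names, and the shape simply mirrors a "drafts keyed by tab ID" layout rather than anything defined by mynah-ui.

```typescript
interface Draft {
  text: string;
  timestamp: number;
}

interface DraftState {
  drafts: Record<string, Draft>; // keyed by tab/conversation ID
  activeTab: string | null;
}

function emptyState(): DraftState {
  return { drafts: {}, activeTab: null };
}

/** Return a new state with the draft removed; the input is left untouched. */
function clearDraft(state: DraftState, tabId: string): DraftState {
  const drafts = { ...state.drafts };
  delete drafts[tabId];
  return { ...state, drafts };
}

/** Return a new state with the draft stored; empty text removes the entry. */
function setDraft(
  state: DraftState,
  tabId: string,
  text: string,
  now: number = Date.now()
): DraftState {
  if (text === "") {
    return clearDraft(state, tabId);
  }
  return {
    ...state,
    drafts: { ...state.drafts, [tabId]: { text, timestamp: now } },
  };
}

function getDraft(state: DraftState, tabId: string): string {
  return state.drafts[tabId]?.text ?? "";
}

let draftState = emptyState();
draftState = setDraft(draftState, "tab-1", "How do I", 1000);
draftState = setDraft(draftState, "tab-2", "Explain this error", 2000);
```

Because each function returns a fresh object, the result can be handed to a persistence call without worrying about later mutation.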

User expectations: Auto-save is non-negotiable

Users expect automatic draft preservation based on industry-standard chat applications. Research into Slack, Teams, Discord, and even recent iOS updates reveals consistent patterns:

Automatic per-conversation drafts are table stakes. Slack saves drafts automatically per channel, Teams maintains drafts per conversation, and Discord preserves drafts across app restarts. All provide visual indicators (bold channel names, "[Draft]" labels, or draft count badges) showing where unsent messages exist.

VSCode users are already frustrated by draft loss in existing extensions. GitHub issues show significant pain points: users lose hours of work when chat history disappears during workspace switches, and Claude Code extension users report losing conversation context due to inadequate state preservation. One user complaint: "Lost chats today and am here to express how insane it is that this is even possible."

Expected behavior for your sidebar: When users close the sidebar while typing, they expect that text to reappear when they reopen it—period. This expectation comes from every major communication platform they use daily. Losing draft text is not acceptable. Your implementation must preserve this state automatically, invisibly, and reliably.

VSCode's built-in GitHub Copilot Chat demonstrates the acceptable standard: chat sessions persist within a workspace, history is accessible via "Show Chats...", and sessions can be exported. However, even Copilot Chat has limitations—history loss when switching workspaces causes major user frustration, proving that inadequate persistence is a critical UX failure.

Recommended implementation: Hybrid approach with debounced auto-save

The optimal pattern combines immediate setState() for UI state with debounced saves for draft content, backed by globalState for persistence beyond webview lifecycle. Here's the complete implementation strategy:

Pattern 1: Continuous state preservation in webview

// Inside your webview script
const vscode = acquireVsCodeApi();

// Restore previous state immediately
const previousState = vscode.getState() || { 
  drafts: {},  // keyed by tab/conversation ID
  activeTab: null 
};

// Debounced save function (500ms is the sweet spot)
let saveTimeout;
function saveDraftDebounced(tabId, draftText) {
  clearTimeout(saveTimeout);
  saveTimeout = setTimeout(() => {
    const currentState = vscode.getState() || { drafts: {} };
    currentState.drafts[tabId] = {
      text: draftText,
      timestamp: Date.now()
    };
    vscode.setState(currentState);
    
    // Also notify extension for globalState backup
    vscode.postMessage({
      command: 'saveDraft',
      tabId: tabId,
      text: draftText
    });
  }, 500);
}

// Hook into mynah-ui or direct DOM events
// Since mynah-ui doesn't expose input change events, access the DOM
const chatInput = document.querySelector('[data-mynah-chat-input]'); // adjust selector
if (chatInput) {
  chatInput.addEventListener('input', (e) => {
    const currentTab = getCurrentTabId(); // your function to get active tab
    saveDraftDebounced(currentTab, e.target.value);
  });
}

// Immediate save on tab switch (use mynah-ui's onTabChange)
const mynahUI = new MynahUI({
  onTabChange: (tabId) => {
    // Save current draft immediately before switching
    const currentDraft = getCurrentDraftText();
    if (currentDraft) {
      const state = vscode.getState() || { drafts: {} };
      state.drafts[getCurrentTabId()] = {
        text: currentDraft,
        timestamp: Date.now()
      };
      vscode.setState(state);
    }
    
    // Restore draft for new tab
    const newState = vscode.getState();
    if (newState?.drafts?.[tabId]) {
      restoreDraftToInput(newState.drafts[tabId].text);
    }
  },
  
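  onChatPrompt: (tabId, prompt) => {
    // Cancel any pending debounced save so it cannot re-create the draft after send
    clearTimeout(saveTimeout);

    // Clear draft after successful send
    const state = vscode.getState() || { drafts: {} };
    delete state.drafts[tabId];
    vscode.setState(state);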
    
    vscode.postMessage({
      command: 'clearDraft',
      tabId: tabId
    });
  }
});

// Restore drafts on load
window.addEventListener('load', () => {
  const state = vscode.getState();
  const activeTab = getCurrentTabId();
  if (state?.drafts?.[activeTab]?.text) {
    restoreDraftToInput(state.drafts[activeTab].text);
  }
});

Pattern 2: Extension-side backup with globalState

// In your extension code (extension.ts)
export function activate(context: vscode.ExtensionContext) {
  
  // Handle messages from webview.
  // webviewPanel is assumed to have been created earlier (for example with
  // vscode.window.createWebviewPanel); its creation is omitted here.
  webviewPanel.webview.onDidReceiveMessage(
    message => {
      switch (message.command) {
        case 'saveDraft': {
          // Save to globalState as backup. Copy the stored object before
          // changing it so the default value is never mutated.
          const drafts = { ...context.globalState.get<Record<string, unknown>>('chatDrafts', {}) };
          drafts[message.tabId] = {
            text: message.text,
            timestamp: Date.now(),
            workspace: vscode.workspace.name || 'default'
          };
          context.globalState.update('chatDrafts', drafts);
          break;
        }

        case 'clearDraft': {
          const drafts = { ...context.globalState.get<Record<string, unknown>>('chatDrafts', {}) };
          delete drafts[message.tabId];
          context.globalState.update('chatDrafts', drafts);
          break;
        }

        case 'getDrafts': {
          // Send stored drafts back to webview for restoration
          const storedDrafts = context.globalState.get<Record<string, unknown>>('chatDrafts', {});
          webviewPanel.webview.postMessage({
            command: 'restoreDrafts',
            drafts: storedDrafts
          });
          break;
        }
      }
    },
    undefined,
    context.subscriptions
  );
  
  // Implement WebviewPanelSerializer for cross-restart persistence
  vscode.window.registerWebviewPanelSerializer('yourViewType', {
    async deserializeWebviewPanel(webviewPanel: vscode.WebviewPanel, state: any) {
      // Restore webview with saved state
      webviewPanel.webview.html = getWebviewContent();
      
      // Send drafts from globalState
      const drafts = context.globalState.get('chatDrafts', {});
      webviewPanel.webview.postMessage({
        command: 'restoreDrafts',
        drafts: drafts
      });
    }
  });
}

Pattern 3: Flush on critical visibility changes

// Listen to visibility changes
webviewPanel.onDidChangeViewState(
  e => {
    if (!e.webviewPanel.visible) {
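      // Webview is becoming hidden - request final state save.
      // Best effort only: without retainContextWhenHidden the webview content
      // may already be gone, so delivery of this message is not guaranteed.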
      webviewPanel.webview.postMessage({
        command: 'flushState'
      });
    }
  },
  null,
  context.subscriptions
);

// In webview: handle flush command
window.addEventListener('message', event => {
  const message = event.data;
  if (message.command === 'flushState') {
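    // Immediately save current state without debouncing.
    // Merge into the existing state so drafts for other tabs are not overwritten.
    const currentDraft = getCurrentDraftText();
    if (currentDraft) {
      const state = vscode.getState() || { drafts: {} };
      state.drafts[getCurrentTabId()] = {
        text: currentDraft,
        timestamp: Date.now()
      };
      vscode.setState(state);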
      
      vscode.postMessage({
        command: 'saveDraft',
        tabId: getCurrentTabId(),
        text: currentDraft
      });
    }
  }
});

Trade-offs and performance considerations

Debounce intervals tested in the wild:

100ms (VSCode official example): No debounce, continuous updates, perfect for demos but potentially excessive
300-500ms (community standard): Optimal balance between responsiveness and efficiency—recommended for most chat interfaces
1500ms (GistPad search): Too long for draft preservation, risks losing 1.5 seconds of typing
Immediate (on send/tab switch): Essential for critical actions where data loss is unacceptable

The undo/redo conflict: Custom text editors that debounce updates face a specific problem—hitting undo before the debounce fires causes undo to jump back to a previous state instead of the last edit. For chat interfaces this is less critical since most chat inputs don't implement complex undo stacks, but be aware if you're building rich text editing features.

Memory and storage considerations: setState() stores data in memory until the webview is disposed. globalState persists to disk and survives VSCode restarts but should be used judiciously for data that truly needs long-term persistence. For your chat extension, draft text is lightweight (typically under 10KB per draft) and appropriate for globalState backup.
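
Using globalState judiciously can include dropping drafts that are empty or stale before writing the backup. The sketch below is one possible policy, not something VSCode prescribes: pruneDrafts and the maximum-age threshold are assumptions introduced here, and the record shape follows the text/timestamp pairs used in the patterns above.

```typescript
interface StoredDraft {
  text: string;
  timestamp: number; // milliseconds since epoch, as produced by Date.now()
}

/**
 * Return a copy of the draft map without empty drafts and without drafts
 * older than maxAgeMs. The input object is not modified.
 */
function pruneDrafts(
  drafts: Record<string, StoredDraft>,
  now: number,
  maxAgeMs: number
): Record<string, StoredDraft> {
  const kept: Record<string, StoredDraft> = {};
  for (const [tabId, draft] of Object.entries(drafts)) {
    const fresh = now - draft.timestamp <= maxAgeMs;
    if (draft.text !== "" && fresh) {
      kept[tabId] = draft;
    }
  }
  return kept;
}

// Hypothetical threshold: keep drafts for seven days.
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

const pruned = pruneDrafts(
  {
    "tab-old": { text: "stale question", timestamp: 0 },
    "tab-empty": { text: "", timestamp: SEVEN_DAYS_MS },
    "tab-new": { text: "current question", timestamp: SEVEN_DAYS_MS },
  },
  SEVEN_DAYS_MS + 1000,
  SEVEN_DAYS_MS
);
```

Running the prune step inside the saveDraft message handler keeps the stored object from growing without bound as tabs come and go.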

retainContextWhenHidden alternative: You could set retainContextWhenHidden: true in your webview options to keep the entire webview context alive when hidden. This would eliminate the need for state persistence entirely, but Microsoft explicitly warns about "much higher performance overhead." Only use this for complex UIs that cannot be quickly serialized and restored. For a chat interface with text drafts, setState/getState is definitively the right choice.

Specific recommendations for your mynah-ui extension

Your implementation checklist:

Implement debounced auto-save at 500ms intervals for draft text as users type
Save immediately on tab switches using mynah-ui's onTabChange event
Clear drafts after successful message submission in the onChatPrompt handler
Back up drafts to globalState via message passing to your extension for persistence beyond webview lifecycle
Restore drafts on webview load by checking both vscode.getState() and requesting globalState from your extension
Use onDidChangeViewState to trigger immediate flush when the webview becomes hidden
Implement WebviewPanelSerializer if you want drafts to survive VSCode restarts (optional but recommended)
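
Restoring from both vscode.getState() and the globalState backup raises the question of which copy wins when they disagree. A reasonable rule, sketched here as an assumption rather than an established convention, is to keep the draft with the newest timestamp per tab. mergeDrafts is a hypothetical helper name.

```typescript
interface TimestampedDraft {
  text: string;
  timestamp: number;
}

type DraftMap = Record<string, TimestampedDraft>;

/**
 * Combine the webview's own state with the extension-side backup.
 * For each tab the draft with the newer timestamp is kept; on a tie the
 * local (webview) copy wins.
 */
function mergeDrafts(local: DraftMap, backup: DraftMap): DraftMap {
  const merged: DraftMap = { ...backup };
  for (const [tabId, draft] of Object.entries(local)) {
    const other: TimestampedDraft | undefined = merged[tabId];
    if (other === undefined || draft.timestamp >= other.timestamp) {
      merged[tabId] = draft;
    }
  }
  return merged;
}

const restored = mergeDrafts(
  {
    "tab-1": { text: "newer local text", timestamp: 20 },
    "tab-2": { text: "only in webview state", timestamp: 5 },
  },
  {
    "tab-1": { text: "older backup text", timestamp: 10 },
    "tab-3": { text: "only in backup", timestamp: 1 },
  }
);
```

The merged map can then be written back with a single state update before the input field is populated.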

Accessing mynah-ui input fields: Since mynah-ui doesn't expose a direct draft text API, you'll need to either:

Query the DOM directly for the input element (look for textarea or input fields within mynah-ui's rendered structure)
Maintain a parallel state object that tracks input as users type by monitoring DOM events
Wrap mynah-ui's initialization and hook into its input element references after construction

Visual indicators to add: Following industry standards, consider adding:

"[Draft]" label next to tabs with unsaved text
Badge count showing number of tabs with drafts
Timestamp showing when draft was last saved
Notification when a webview is disposed while drafts remain (VSCode provides no hook to intercept closing, so the extension can only inform the user after the fact, based on the globalState backup)

Testing your implementation:

Type draft text and close the sidebar—text should reappear on reopen
Type draft in one tab, switch tabs, return—draft should persist
Reload the webview (Developer: Reload Webview command)—draft should restore
Restart VSCode—draft should restore if using WebviewPanelSerializer
Type draft, wait only 200ms, close sidebar: the draft should still save. This exercises the flush path, because the 500ms debounce has not fired yet.

Code you can ship today

Here's a minimal implementation you can adapt to your existing code. The DOM selector is a placeholder that you must adjust to match the markup your mynah-ui version renders:

// Add to your webview script
class DraftManager {
  constructor(vscode, mynahUI) {
    this.vscode = vscode;
    this.mynahUI = mynahUI;
    this.saveTimeout = null;
    this.DEBOUNCE_MS = 500;
    
    this.init();
  }
  
  init() {
    // Restore drafts on load
    this.restoreAllDrafts();
    
    // Hook into input changes
    this.monitorInput();
    
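    // Best-effort flush when the page unloads. As noted above, this event is
    // not a reliable save point, so the debounced saves remain the primary mechanism.
    window.addEventListener('beforeunload', () => this.flushAll());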
  }
  
  monitorInput() {
    // Find mynah-ui input element (adjust selector as needed)
    const inputObserver = new MutationObserver(() => {
      const input = document.querySelector('textarea[data-mynah-input]');
      if (input && !input.dataset.draftHandlerAttached) {
        input.dataset.draftHandlerAttached = 'true';
        input.addEventListener('input', (e) => {
          this.saveDraft(this.getCurrentTabId(), e.target.value);
        });
      }
    });
    
    inputObserver.observe(document.body, { 
      childList: true, 
      subtree: true 
    });
  }
  
  saveDraft(tabId, text) {
    clearTimeout(this.saveTimeout);
    this.saveTimeout = setTimeout(() => {
      const state = this.vscode.getState() || { drafts: {} };
      state.drafts[tabId] = { text, timestamp: Date.now() };
      this.vscode.setState(state);
      
      // Backup to extension
      this.vscode.postMessage({
        command: 'saveDraft',
        tabId,
        text
      });
    }, this.DEBOUNCE_MS);
  }
  
  flushAll() {
    clearTimeout(this.saveTimeout);
    const tabId = this.getCurrentTabId();
    const text = this.getCurrentDraftText();
    if (text) {
      const state = this.vscode.getState() || { drafts: {} };
      state.drafts[tabId] = { text, timestamp: Date.now() };
      this.vscode.setState(state);
    }
  }
  
  restoreAllDrafts() {
    const state = this.vscode.getState();
    if (state?.drafts) {
      const currentTab = this.getCurrentTabId();
      const draft = state.drafts[currentTab];
      if (draft?.text) {
        this.setInputText(draft.text);
      }
    }
  }
  
  getCurrentTabId() {
    // Your logic to get active tab ID
    return this.mynahUI.getSelectedTabId?.() || 'default';
  }
  
  getCurrentDraftText() {
    const input = document.querySelector('textarea[data-mynah-input]');
    return input?.value || '';
  }
  
  setInputText(text) {
    const input = document.querySelector('textarea[data-mynah-input]');
    if (input) {
      input.value = text;
      input.dispatchEvent(new Event('input', { bubbles: true }));
    }
  }
}

// Initialize
const vscode = acquireVsCodeApi();
const draftManager = new DraftManager(vscode, mynahUI);

// Integrate with mynah-ui events
mynahUI.onTabChange = (tabId) => {
  draftManager.flushAll(); // Save current before switching
  draftManager.restoreAllDrafts(); // Restore for new tab
};

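mynahUI.onChatPrompt = (tabId, prompt) => {
  // Cancel any pending debounced save so it cannot re-create the draft after send
  clearTimeout(draftManager.saveTimeout);

  // Clear draft after send
  const state = vscode.getState() || { drafts: {} };
  delete state.drafts[tabId];
  vscode.setState(state);
};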

This implementation provides automatic draft preservation with minimal overhead and follows the VSCode guidance discussed above. Because saving is debounced and the flush hooks are best-effort, a user who closes the sidebar mid-sentence can still lose up to roughly 500ms of typing; lower DEBOUNCE_MS if that window is too large for your users.

Key documentation references

VSCode Official:

Webview API Guide: https://code.visualstudio.com/api/extension-guides/webview
Webview UX Guidelines: https://code.visualstudio.com/api/ux-guidelines/webviews
Extension Samples (webview-sample): https://github.com/microsoft/vscode-extension-samples

mynah-ui:

GitHub Repository: https://github.com/aws/mynah-ui
Documentation files: STARTUP.md, CONFIG.md, DATAMODEL.md, USAGE.md

Open Source Extension Examples:

Continue (AI chat): https://github.com/continuedev/continue
REST Client: https://github.com/Huachao/vscode-restclient
Jupyter: https://github.com/microsoft/vscode-jupyter

Performance and UX Research:

VSCode GitHub Issues #66939, #109521, #127006 (lifecycle events)
Community Discussion #68362 (draft loss frustration)
Issue #251340 (chat history preservation requests)

Language Server Protocol (LSP) - Comprehensive Overview

Executive Summary

The Language Server Protocol (LSP) defines the protocol used between an editor or IDE and a language server that provides language features such as autocomplete, go to definition, and find all references. A companion specification, the Language Server Index Format (LSIF, pronounced like "else if"), aims to support rich code navigation in development tools or a web UI without needing a local copy of the source code.

The idea behind the Language Server Protocol (LSP) is to standardize the protocol for how tools and servers communicate, so a single Language Server can be re-used in multiple development tools, and tools can support languages with minimal effort.

Key Benefits:

Reduces M×N complexity to M+N (one server per language instead of one implementation per editor per language)
Enables language providers to focus on a single high-quality implementation
Allows editors to support multiple languages with minimal effort
Standardized JSON-RPC based communication

Architecture & Core Concepts

Problem Statement

Prior to the design and implementation of the Language Server Protocol for the development of Visual Studio Code, most language services were generally tied to a given IDE or other editor. In the absence of the Language Server Protocol, language services are typically implemented by using a tool-specific extension API.

This created a classic M×N complexity problem where:

M = Number of editors/IDEs
N = Number of programming languages
Total implementations needed = M × N

LSP Solution

The idea behind a Language Server is to provide the language-specific smarts inside a server that can communicate with development tooling over a protocol that enables inter-process communication.

Architecture Components:

Language Client: The editor/IDE that requests language services
Language Server: A separate process providing language intelligence
LSP: The standardized communication protocol between them

Communication Model:

JSON-RPC 2.0 based messaging
A language server runs as a separate process and development tools communicate with the server using the language protocol over JSON-RPC.
Bi-directional communication (client ↔ server)
Support for requests (each answered by a response) and notifications (fire-and-forget, with no response)

Supported Languages & Environments

LSP is not restricted to programming languages. It can be used for any kind of text-based language, like specifications or domain-specific languages (DSL).

Transport Options:

stdio (standard input/output)
Named pipes (Windows) / Unix domain sockets
TCP sockets
Node.js IPC

This comprehensive overview provides the foundation for understanding and implementing Language Server Protocol solutions. Each section can be expanded into detailed implementation guides as needed.

Base Protocol

Message Structure

The base protocol consists of a header and a content part (comparable to HTTP). The header and content part are separated by a '\r\n'.

Header Format

Content-Length: <number>\r\n
Content-Type: application/vscode-jsonrpc; charset=utf-8\r\n
\r\n

Header Fields:

Content-Length: Length of content in bytes (mandatory)
Content-Type: MIME type (optional, defaults to application/vscode-jsonrpc; charset=utf-8)

Content Format

Contains the actual content of the message. The content part of a message uses JSON-RPC to describe requests, responses and notifications.

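Example Message (Content-Length must equal the byte length of the JSON body exactly as sent, so the value is elided here):

Content-Length: ...\r\n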
\r\n
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "textDocument/completion",
  "params": {
    "textDocument": { "uri": "file:///path/to/file.js" },
    "position": { "line": 5, "character": 10 }
  }
}
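
Framing a message therefore comes down to serializing the JSON, measuring it in bytes (not characters), and prepending the header. The sketch below shows one way to do this; encodeMessage and decodeMessage are illustrative names rather than functions from an LSP library, and the decoder assumes it is given one complete frame and handles only the Content-Length field.

```typescript
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

/** Serialize a JSON-RPC payload and prepend the base-protocol header. */
function encodeMessage(payload: unknown): Uint8Array {
  const body = utf8Encoder.encode(JSON.stringify(payload));
  // body.length is a byte count, which is what Content-Length requires.
  const header = utf8Encoder.encode(`Content-Length: ${body.length}\r\n\r\n`);
  const frame = new Uint8Array(header.length + body.length);
  frame.set(header, 0);
  frame.set(body, header.length);
  return frame;
}

/** Parse a single frame; returns undefined if it is malformed or incomplete. */
function decodeMessage(frame: Uint8Array): unknown {
  // Locate the blank line (\r\n\r\n) that ends the header part.
  let headerEnd = -1;
  for (let i = 0; i + 3 < frame.length; i++) {
    if (frame[i] === 13 && frame[i + 1] === 10 && frame[i + 2] === 13 && frame[i + 3] === 10) {
      headerEnd = i;
      break;
    }
  }
  if (headerEnd < 0) return undefined;

  const header = utf8Decoder.decode(frame.subarray(0, headerEnd));
  const match = /Content-Length: *(\d+)/i.exec(header);
  if (match === null) return undefined;

  const length = Number(match[1]);
  const bodyStart = headerEnd + 4;
  if (frame.length < bodyStart + length) return undefined; // body not fully received

  return JSON.parse(utf8Decoder.decode(frame.subarray(bodyStart, bodyStart + length)));
}

const completionRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "textDocument/completion",
  params: {
    textDocument: { uri: "file:///path/to/file.js" },
    position: { line: 5, character: 10 },
  },
};
const framed = encodeMessage(completionRequest);
const roundTripped = decodeMessage(framed);
```

A real transport would also buffer partial reads and handle several frames arriving in one chunk; that bookkeeping is left out to keep the example short.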

JSON-RPC Structure

Base Message

interface Message {
  jsonrpc: string; // Always "2.0"
}

Request Message

interface RequestMessage extends Message {
  id: integer | string;
  method: string;
  params?: array | object;
}

Response Message

interface ResponseMessage extends Message {
  id: integer | string | null;
  result?: any;
  error?: ResponseError;
}

Notification Message

interface NotificationMessage extends Message {
  method: string;
  params?: array | object;
}
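
The three message shapes can be told apart by which fields are present: a request has both id and method, a notification has method but no id, and a response has id plus result or error. The classifier below is a small sketch of that rule; classifyMessage is a name invented for this example and it performs only shallow checks.

```typescript
type MessageKind = "request" | "response" | "notification" | "invalid";

function classifyMessage(raw: unknown): MessageKind {
  if (typeof raw !== "object" || raw === null) return "invalid";
  const msg = raw as Record<string, unknown>;
  if (msg["jsonrpc"] !== "2.0") return "invalid";

  const hasId = "id" in msg;
  const hasMethod = typeof msg["method"] === "string";

  if (hasMethod) {
    // With an id the sender expects a response; without one it does not.
    return hasId ? "request" : "notification";
  }
  if (hasId && ("result" in msg || "error" in msg)) {
    return "response";
  }
  return "invalid";
}

const kinds = [
  classifyMessage({ jsonrpc: "2.0", id: 1, method: "textDocument/hover", params: {} }),
  classifyMessage({ jsonrpc: "2.0", method: "initialized", params: {} }),
  classifyMessage({ jsonrpc: "2.0", id: 1, result: null }),
  classifyMessage({ jsonrpc: "2.0", id: null, error: { code: -32700, message: "Parse error" } }),
  classifyMessage({ jsonrpc: "1.0", id: 1, method: "x" }),
];
```

A dispatcher can switch on the returned kind before looking up a handler by method name or matching a response to its pending request by id.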

Error Handling

Standard Error Codes:

-32700: Parse error
-32600: Invalid Request
-32601: Method not found
-32602: Invalid params
-32603: Internal error

LSP-Specific Error Codes:

-32803: RequestFailed
-32802: ServerCancelled
-32801: ContentModified
-32800: RequestCancelled
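
These codes appear in the error member of a response message. As a brief sketch, the table can be captured as a constant and used to build error responses; ErrorCodes mirrors the names listed above, while errorResponse is a hypothetical helper.

```typescript
const ErrorCodes = {
  // JSON-RPC standard codes
  ParseError: -32700,
  InvalidRequest: -32600,
  MethodNotFound: -32601,
  InvalidParams: -32602,
  InternalError: -32603,
  // LSP-specific codes
  RequestFailed: -32803,
  ServerCancelled: -32802,
  ContentModified: -32801,
  RequestCancelled: -32800,
} as const;

type ErrorCode = (typeof ErrorCodes)[keyof typeof ErrorCodes];

/** Build a response message that reports a failure for the given request id. */
function errorResponse(id: number | string | null, code: ErrorCode, message: string) {
  return {
    jsonrpc: "2.0" as const,
    id,
    error: { code, message },
  };
}

const unknownMethodReply = errorResponse(7, ErrorCodes.MethodNotFound, "Unknown method: example/unsupported");
const staleReply = errorResponse(8, ErrorCodes.ContentModified, "Document changed during computation");
```

The id echoes the request being answered, and a response carries either result or error, never both.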

Language Features

Language Features provide the actual smarts in the language server protocol. They are usually executed on a [text document, position] tuple. The main language feature categories are code comprehension features, such as Hover or Go to Definition, and coding features, such as diagnostics, code completion, or code actions.

Go to Definition

textDocument/definition: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Go to Declaration

textDocument/declaration: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Go to Type Definition

textDocument/typeDefinition: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Go to Implementation

textDocument/implementation: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Find References

textDocument/references: ReferenceParams → Location[] | null

interface ReferenceParams extends TextDocumentPositionParams {
  context: { includeDeclaration: boolean; }
}

Information Features

Hover

textDocument/hover: TextDocumentPositionParams → Hover | null

interface Hover {
  contents: MarkedString | MarkedString[] | MarkupContent;
  range?: Range;
}

Signature Help

textDocument/signatureHelp: SignatureHelpParams → SignatureHelp | null

interface SignatureHelp {
  signatures: SignatureInformation[];
  activeSignature?: uinteger;
  activeParameter?: uinteger;
}

Document Symbols

textDocument/documentSymbol: DocumentSymbolParams → DocumentSymbol[] | SymbolInformation[] | null

Workspace Symbols

workspace/symbol: WorkspaceSymbolParams → SymbolInformation[] | WorkspaceSymbol[] | null

Code Intelligence Features

Code Completion

textDocument/completion: CompletionParams → CompletionItem[] | CompletionList | null

interface CompletionList {
  isIncomplete: boolean;
  items: CompletionItem[];
}

interface CompletionItem {
  label: string;
  kind?: CompletionItemKind;
  detail?: string;
  documentation?: string | MarkupContent;
  sortText?: string;
  filterText?: string;
  insertText?: string;
  textEdit?: TextEdit;
  additionalTextEdits?: TextEdit[];
}

Completion Triggers:

User invoked (Ctrl+Space)
Trigger characters (., ->, etc.)
Incomplete completion re-trigger
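
Because the result can be a bare array, a CompletionList, or null, a client typically normalizes it before display. The sketch below assumes the rule from the LSP specification that a bare array is treated as a complete list and that sortText falls back to label when omitted; normalizeCompletion and sortForDisplay are illustrative names, and the item type is trimmed to the two fields used here.

```typescript
interface CompletionItemLite {
  label: string;
  sortText?: string;
}

interface CompletionListLite {
  isIncomplete: boolean;
  items: CompletionItemLite[];
}

/** Collapse the three possible result shapes into a single list shape. */
function normalizeCompletion(
  result: CompletionItemLite[] | CompletionListLite | null
): CompletionListLite {
  if (result === null) return { isIncomplete: false, items: [] };
  if (Array.isArray(result)) return { isIncomplete: false, items: result };
  return result;
}

/** Order items by sortText, falling back to label; the input is not mutated. */
function sortForDisplay(items: CompletionItemLite[]): CompletionItemLite[] {
  const key = (item: CompletionItemLite): string => item.sortText ?? item.label;
  return [...items].sort((a, b) => (key(a) < key(b) ? -1 : key(a) > key(b) ? 1 : 0));
}

const normalized = normalizeCompletion([
  { label: "toString" },
  { label: "valueOf", sortText: "a" },
  { label: "apply", sortText: "z" },
]);
const displayOrder = sortForDisplay(normalized.items).map((item) => item.label);
```

When isIncomplete is true, the client should send the request again as the user keeps typing instead of filtering the cached items locally.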

Code Actions

textDocument/codeAction: CodeActionParams → (Command | CodeAction)[] | null

interface CodeAction {
  title: string;
  kind?: CodeActionKind;
  diagnostics?: Diagnostic[];
  isPreferred?: boolean;
  disabled?: { reason: string; };
  edit?: WorkspaceEdit;
  command?: Command;
}

Code Action Kinds:

quickfix - Fix problems
refactor - Refactoring operations
source - Source code actions (organize imports, etc.)

Code Lens

textDocument/codeLens: CodeLensParams → CodeLens[] | null

interface CodeLens {
  range: Range;
  command?: Command;
  data?: any; // For resolve support
}

Formatting Features

Document Formatting

textDocument/formatting: DocumentFormattingParams → TextEdit[] | null

Range Formatting

textDocument/rangeFormatting: DocumentRangeFormattingParams → TextEdit[] | null

On-Type Formatting

textDocument/onTypeFormatting: DocumentOnTypeFormattingParams → TextEdit[] | null

Semantic Features

Semantic Tokens

Since version 3.16.0. The request is sent from the client to the server to resolve semantic tokens for a given file. Semantic tokens are used to add additional color information to a file that depends on language specific symbol information.

textDocument/semanticTokens/full: SemanticTokensParams → SemanticTokens | null
textDocument/semanticTokens/range: SemanticTokensRangeParams → SemanticTokens | null
textDocument/semanticTokens/full/delta: SemanticTokensDeltaParams → SemanticTokens | SemanticTokensDelta | null

Token Encoding:

5 integers per token: [deltaLine, deltaStart, length, tokenType, tokenModifiers]
Relative positioning for efficiency
Bit flags for modifiers
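
Decoding reverses the relative encoding: deltaLine is added to the previous token's line, and deltaStart is relative to the previous token's start only when both tokens are on the same line. The function below is a sketch of that decoding; decodeSemanticTokens and the output shape are choices made for this example, and the sample data follows the layout described above with a small hypothetical legend.

```typescript
interface TokenLegend {
  tokenTypes: string[];
  tokenModifiers: string[];
}

interface DecodedToken {
  line: number;
  startChar: number;
  length: number;
  tokenType: string;
  tokenModifiers: string[];
}

function decodeSemanticTokens(data: number[], legend: TokenLegend): DecodedToken[] {
  const tokens: DecodedToken[] = [];
  let line = 0;
  let startChar = 0;

  for (let i = 0; i + 5 <= data.length; i += 5) {
    const [deltaLine = 0, deltaStart = 0, length = 0, typeIndex = 0, modifierBits = 0] =
      data.slice(i, i + 5);

    line += deltaLine;
    // On a new line the start is absolute; on the same line it is relative.
    startChar = deltaLine === 0 ? startChar + deltaStart : deltaStart;

    tokens.push({
      line,
      startChar,
      length,
      tokenType: legend.tokenTypes[typeIndex] ?? `unknown(${typeIndex})`,
      // Bit n set means the n-th modifier in the legend applies.
      tokenModifiers: legend.tokenModifiers.filter((_, bit) => (modifierBits & (1 << bit)) !== 0),
    });
  }
  return tokens;
}

const sampleLegend: TokenLegend = {
  tokenTypes: ["property", "type", "class"],
  tokenModifiers: ["private", "static"],
};

// Three tokens: two on line 2, one on line 5.
const decodedTokens = decodeSemanticTokens(
  [2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0],
  sampleLegend
);
```

The first token carries modifier bits 3 (binary 11), so both legend modifiers apply; the second token's start of 10 is the previous start of 5 plus a deltaStart of 5.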

Inlay Hints

textDocument/inlayHint: InlayHintParams → InlayHint[] | null

interface InlayHint {
  position: Position;
  label: string | InlayHintLabelPart[];
  kind?: InlayHintKind; // Type | Parameter
  tooltip?: string | MarkupContent;
  paddingLeft?: boolean;
  paddingRight?: boolean;
}

Diagnostics

Push Model (Traditional)

textDocument/publishDiagnostics: PublishDiagnosticsParams

interface PublishDiagnosticsParams {
  uri: DocumentUri;
  version?: integer;
  diagnostics: Diagnostic[];
}

Pull Model (Since 3.17)

textDocument/diagnostic: DocumentDiagnosticParams → DocumentDiagnosticReport
workspace/diagnostic: WorkspaceDiagnosticParams → WorkspaceDiagnosticReport

Diagnostic Structure:

interface Diagnostic {
  range: Range;
  severity?: DiagnosticSeverity; // Error | Warning | Information | Hint
  code?: integer | string;
  source?: string; // e.g., "typescript"
  message: string;
  tags?: DiagnosticTag[]; // Unnecessary | Deprecated
  relatedInformation?: DiagnosticRelatedInformation[];
}

Implementation Guide

Performance Guidelines

Message Ordering: Responses to requests should be sent in roughly the same order as the requests appear on the server or client side.

State Management:

Servers should handle partial/incomplete requests gracefully
Use ContentModified error for outdated results
Implement proper cancellation support

Resource Management:

Language servers run in separate processes
Avoid memory leaks in long-running servers
Implement proper cleanup on shutdown

Error Handling

Client Responsibilities:

Restart crashed servers (with exponential backoff)
Handle ContentModified errors gracefully
Validate server responses
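
Exponential backoff for restarts can be as simple as doubling a base delay per consecutive crash and capping it. The numbers below (a one-second base and a thirty-second cap) are assumptions picked for illustration, and restartDelayMs is a hypothetical helper name.

```typescript
/**
 * Delay before the next restart attempt.
 * attempt is the number of consecutive crashes so far (0 for the first restart).
 */
function restartDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
  const exponent = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * 2 ** exponent);
}

const restartSchedule = [0, 1, 2, 3, 4, 5].map((attempt) => restartDelayMs(attempt));
```

A client would reset the attempt counter once the server has stayed up for some time, so that an isolated crash later on is retried quickly again.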

Server Responsibilities:

Return appropriate error codes
Handle malformed/outdated requests
Monitor client process health

Transport Considerations

Command Line Arguments:

language-server --stdio                    # Use stdio
language-server --pipe=<n>             # Use named pipe/socket
language-server --socket --port=<port>    # Use TCP socket  
language-server --node-ipc                # Use Node.js IPC
language-server --clientProcessId=<pid>   # Monitor client process
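
A server entry point could interpret these flags roughly as follows. This is a minimal sketch: parseLaunchArgs and its result types are hypothetical names, and only the `--flag=value` forms listed above are handled.

```typescript
type Transport =
  | { kind: "stdio" }
  | { kind: "pipe"; name: string }
  | { kind: "socket"; port: number }
  | { kind: "node-ipc" };

interface LaunchOptions { transport: Transport; clientProcessId: number | null; }

// Parse the flags listed above; throws if no usable transport is given.
function parseLaunchArgs(argv: string[]): LaunchOptions {
  let transport: Transport | undefined;
  let port: number | undefined;
  let wantsSocket = false;
  let clientProcessId: number | null = null;

  for (const arg of argv) {
    if (arg === "--stdio") transport = { kind: "stdio" };
    else if (arg === "--node-ipc") transport = { kind: "node-ipc" };
    else if (arg.startsWith("--pipe=")) transport = { kind: "pipe", name: arg.slice("--pipe=".length) };
    else if (arg === "--socket") wantsSocket = true;
    else if (arg.startsWith("--port=")) port = Number(arg.slice("--port=".length));
    else if (arg.startsWith("--clientProcessId=")) clientProcessId = Number(arg.slice("--clientProcessId=".length));
  }

  if (wantsSocket) {
    if (port === undefined || !Number.isInteger(port)) {
      throw new Error("--socket requires --port=<port>");
    }
    transport = { kind: "socket", port };
  }
  if (!transport) throw new Error("no transport specified");
  return { transport, clientProcessId };
}
```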

Testing Strategies

Unit Testing:

Mock LSP message exchange
Test individual feature implementations
Validate message serialization/deserialization
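
For unit tests, a small in-memory stand-in can replace a real server process. The MockServer class below is a hypothetical sketch, not part of any library: it takes a serialized JSON-RPC request, dispatches it to a registered handler, and returns the serialized response. Transport and header framing are deliberately left out.

```typescript
interface RequestMessage { jsonrpc: "2.0"; id: number | string; method: string; params?: unknown; }
interface ResponseMessage {
  jsonrpc: "2.0";
  id: number | string | null;
  result?: unknown;
  error?: { code: number; message: string };
}

type Handler = (params: unknown) => unknown;

// Hypothetical in-memory stand-in for a language server.
class MockServer {
  private handlers = new Map<string, Handler>();

  onRequest(method: string, handler: Handler): void {
    this.handlers.set(method, handler);
  }

  // Takes one serialized request and returns the serialized response.
  handle(raw: string): string {
    const req = JSON.parse(raw) as RequestMessage;
    const handler = this.handlers.get(req.method);
    const res: ResponseMessage = handler
      ? { jsonrpc: "2.0", id: req.id, result: handler(req.params) }
      : { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found: " + req.method } };
    return JSON.stringify(res);
  }
}
```

A test can then register a handler for textDocument/hover, send a request string through handle, and assert on the parsed response, which exercises both serialization directions.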

Integration Testing:

End-to-end editor integration
Multi-document scenarios
Error condition handling

Performance Testing:

Large file handling
Memory usage patterns
Response time benchmarks

Advanced Topics

Custom Extensions

Experimental Capabilities:

interface ClientCapabilities {
  experimental?: {
    customFeature?: boolean;
    vendorSpecificExtension?: any;
  };
}

Custom Methods:

Use vendor prefixes: mycompany/customFeature
Document custom protocol extensions
Ensure graceful degradation

Security Considerations

Process Isolation:

Language servers run in separate processes
Limit file system access appropriately
Validate all input from untrusted sources

Content Validation:

Sanitize file paths and URIs
Validate document versions
Implement proper input validation
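
As one possible shape for URI sanitization, the sketch below accepts only file: URIs, resolves "." and ".." segments, and rejects anything that ends up outside a workspace root. The resolveInsideRoot helper is hypothetical, assumes POSIX-style paths, and does not handle Windows drive letters or symbolic links.

```typescript
// Decode percent-escapes, returning null on malformed input.
function safeDecode(s: string): string | null {
  try {
    return decodeURIComponent(s);
  } catch {
    return null;
  }
}

// Returns the normalized path of a file: URI if it lies inside rootPath, else null.
function resolveInsideRoot(uri: string, rootPath: string): string | null {
  if (!uri.startsWith("file:///")) return null;
  const decoded = safeDecode(uri.slice("file://".length));
  if (decoded === null) return null;

  // Resolve "." and ".." segments without touching the file system.
  const parts: string[] = [];
  for (const seg of decoded.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (parts.length === 0) return null; // attempts to climb above "/"
      parts.pop();
      continue;
    }
    parts.push(seg);
  }
  const normalized = "/" + parts.join("/");

  const rootNoSlash = rootPath.length > 1 ? rootPath.replace(/\/$/, "") : rootPath;
  const rootWithSlash = rootNoSlash.endsWith("/") ? rootNoSlash : rootNoSlash + "/";
  const inside = normalized === rootNoSlash || normalized.startsWith(rootWithSlash);
  return inside ? normalized : null;
}
```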

Multi-Language Support

Language Identification:

interface TextDocumentItem {
  uri: DocumentUri;
  languageId: string; // "typescript", "python", etc.
  version: integer;
  text: string;
}

Document Selectors:

type DocumentSelector = DocumentFilter[];

interface DocumentFilter {
  language?: string;    // "typescript"
  scheme?: string;      // "file", "untitled"  
  pattern?: string;     // "**/*.{ts,js}"
}
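
A selector matches a document when at least one of its filters matches. The sketch below checks only the language and scheme fields; glob matching for pattern is left out, so filters that carry a pattern are treated as non-matching here. The DocInfo type and the helper names are hypothetical.

```typescript
interface DocumentFilter { language?: string; scheme?: string; pattern?: string; }
type DocumentSelector = DocumentFilter[];

// Hypothetical minimal view of an open document.
interface DocInfo { uri: string; languageId: string; }

// Extract the scheme ("file", "untitled", ...) from a URI string.
function schemeOf(uri: string): string {
  const i = uri.indexOf(":");
  return i < 0 ? "" : uri.slice(0, i);
}

// True if any filter in the selector matches the document.
function matchesSelector(selector: DocumentSelector, doc: DocInfo): boolean {
  return selector.some((f) => {
    if (f.pattern !== undefined) return false; // glob matching not implemented in this sketch
    if (f.language !== undefined && f.language !== doc.languageId) return false;
    if (f.scheme !== undefined && f.scheme !== schemeOf(doc.uri)) return false;
    return true;
  });
}
```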

Message Reference

Message Types

Request/Response Pattern

Client-to-Server Requests:

initialize - Server initialization
textDocument/hover - Get hover information
textDocument/completion - Get code completions
textDocument/definition - Go to definition

Server-to-Client Requests:

client/registerCapability - Register new capabilities
workspace/configuration - Get configuration settings
window/showMessageRequest - Show message with actions

Notification Pattern

Client-to-Server Notifications:

initialized - Initialization complete
textDocument/didOpen - Document opened
textDocument/didChange - Document changed
textDocument/didSave - Document saved
textDocument/didClose - Document closed

Server-to-Client Notifications:

textDocument/publishDiagnostics - Send diagnostics
window/showMessage - Display message
telemetry/event - Send telemetry data

Special Messages

Dollar Prefixed Messages: Notifications and requests whose methods start with '$/' are messages which are protocol implementation dependent and might not be implementable in all clients or servers.

Examples:

$/cancelRequest - Cancel ongoing request
$/progress - Progress reporting
$/setTrace - Set trace level
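
Because support for these messages is optional, a receiver needs a defined reaction to the ones it does not implement. To the best of our reading, the specification lets an unsupported '$/' notification be ignored and requires an unsupported '$/' request to be answered with MethodNotFound (-32601). The sketch below applies that rule; the Incoming and Reaction types and the reactTo helper are hypothetical.

```typescript
const METHOD_NOT_FOUND = -32601;

// A request carries an id; a notification does not.
interface Incoming { id?: number | string; method: string; }

type Reaction =
  | { action: "dispatch" }
  | { action: "ignore" }
  | { action: "error"; code: number };

// Decide what to do with an incoming message given the supported methods.
function reactTo(msg: Incoming, supported: ReadonlySet<string>): Reaction {
  if (supported.has(msg.method)) return { action: "dispatch" };
  const isRequest = msg.id !== undefined;
  if (isRequest) return { action: "error", code: METHOD_NOT_FOUND };
  return { action: "ignore" }; // unsupported notification: nothing to reply to
}
```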

Capabilities System

Not every language server can support all features defined by the protocol. LSP therefore provides 'capabilities'. A capability groups a set of language features.

Capability Exchange

During Initialization:

Client announces capabilities in initialize request
Server announces capabilities in initialize response
Both sides adapt behavior based on announced capabilities

Client Capabilities Structure

interface ClientCapabilities {
  workspace?: WorkspaceClientCapabilities;
  textDocument?: TextDocumentClientCapabilities;
  window?: WindowClientCapabilities;
  general?: GeneralClientCapabilities;
  experimental?: any;
}

Key Client Capabilities:

textDocument.hover.dynamicRegistration - Support dynamic hover registration
textDocument.completion.contextSupport - Support completion context
workspace.workspaceFolders - Multi-root workspace support
window.workDoneProgress - Progress reporting support

Server Capabilities Structure

interface ServerCapabilities {
  textDocumentSync?: TextDocumentSyncKind | TextDocumentSyncOptions;
  completionProvider?: CompletionOptions;
  hoverProvider?: boolean | HoverOptions;
  definitionProvider?: boolean | DefinitionOptions;
  referencesProvider?: boolean | ReferenceOptions;
  documentSymbolProvider?: boolean | DocumentSymbolOptions;
  workspaceSymbolProvider?: boolean | WorkspaceSymbolOptions;
  codeActionProvider?: boolean | CodeActionOptions;
  // ... many more
}
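
Since most provider fields accept either a boolean or an options object, clients typically normalize them before deciding whether to send a request. A minimal sketch, using a hypothetical subset of ServerCapabilities and hypothetical helper names:

```typescript
interface HoverOptions { workDoneProgress?: boolean; }
interface CompletionOptions { triggerCharacters?: string[]; resolveProvider?: boolean; }

// Only the fields needed for this sketch.
interface ServerCapabilitiesSubset {
  hoverProvider?: boolean | HoverOptions;
  completionProvider?: CompletionOptions;
}

// A provider is available if the field is `true` or an options object.
function supportsHover(caps: ServerCapabilitiesSubset): boolean {
  const p = caps.hoverProvider;
  return p !== undefined && p !== false;
}

// Characters that should trigger completion; empty if none were announced.
function completionTriggers(caps: ServerCapabilitiesSubset): string[] {
  return caps.completionProvider?.triggerCharacters ?? [];
}
```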

Dynamic Registration

Servers can register/unregister capabilities after initialization:

// Register new capability
client/registerCapability: {
  registrations: [{
    id: "uuid",
    method: "textDocument/willSaveWaitUntil",
    registerOptions: { documentSelector: [{ language: "javascript" }] }
  }]
}

// Unregister capability
client/unregisterCapability: {
  unregisterations: [{ id: "uuid", method: "textDocument/willSaveWaitUntil" }]
}
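
On the client side, these two requests amount to maintaining a table of active registrations keyed by id. The RegistrationStore class below is a hypothetical sketch of that bookkeeping. Note that the parameter field really is spelled unregisterations in the protocol.

```typescript
interface Registration { id: string; method: string; registerOptions?: unknown; }
interface Unregistration { id: string; method: string; }

// Hypothetical client-side table of dynamically registered capabilities.
class RegistrationStore {
  private byId = new Map<string, Registration>();

  // Handle the params of client/registerCapability.
  register(registrations: Registration[]): void {
    for (const r of registrations) this.byId.set(r.id, r);
  }

  // Handle the params of client/unregisterCapability.
  unregister(unregisterations: Unregistration[]): void {
    for (const u of unregisterations) this.byId.delete(u.id);
  }

  // True if at least one active registration exists for the method.
  isRegistered(method: string): boolean {
    let found = false;
    this.byId.forEach((r) => {
      if (r.method === method) found = true;
    });
    return found;
  }
}
```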

Lifecycle Management

Initialization Sequence

Client → Server: initialize request

interface InitializeParams {
  processId: integer | null;
  clientInfo?: { name: string; version?: string; };
  rootUri: DocumentUri | null;
  initializationOptions?: any;
  capabilities: ClientCapabilities;
  workspaceFolders?: WorkspaceFolder[] | null;
}

Server → Client: initialize response

interface InitializeResult {
  capabilities: ServerCapabilities;
  serverInfo?: { name: string; version?: string; };
}

Client → Server: initialized notification
- Signals completion of initialization
- Server can now send requests to client

Shutdown Sequence

Client → Server: shutdown request
- After shutdown, the server should reject any further request with an InvalidRequest error; only the exit notification is still expected
- Server should finish processing ongoing requests

Client → Server: exit notification
- Server should exit immediately
- Exit code: 0 if shutdown was called, 1 otherwise
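
The initialization and shutdown rules can be tracked with a small state machine. The ServerLifecycle class below is a hypothetical sketch: it returns the error code to reply with, or null when a request may proceed. Rejecting a second initialize request with InvalidRequest is an assumption of this sketch; the other codes follow the protocol.

```typescript
const ServerNotInitialized = -32002;
const InvalidRequest = -32600;

type LifecycleState = "uninitialized" | "running" | "shutdown";

class ServerLifecycle {
  private state: LifecycleState = "uninitialized";

  // Returns an error code to reply with, or null if the request may be processed.
  onRequest(method: string): number | null {
    if (method === "initialize") {
      if (this.state !== "uninitialized") return InvalidRequest; // assumption, see lead-in
      this.state = "running";
      return null;
    }
    if (this.state === "uninitialized") return ServerNotInitialized;
    if (this.state === "shutdown") return InvalidRequest;
    if (method === "shutdown") this.state = "shutdown";
    return null;
  }

  // Exit code to use when the exit notification arrives.
  exitCode(): number {
    return this.state === "shutdown" ? 0 : 1;
  }
}
```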

Process Monitoring

Client Process Monitoring:

Server can monitor client process via processId from initialize
Server should exit if client process dies

Server Crash Handling:

Client should restart crashed servers
Implement exponential backoff to prevent restart loops
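
A restart policy along these lines might look as follows. The base delay, cap, crash window, and crash limit are illustrative defaults chosen for this sketch, not values taken from the protocol, and both function names are hypothetical.

```typescript
// Delay before the n-th restart attempt (attempt starts at 0):
// doubles each time, capped at maxMs.
function restartDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Give up once too many crashes happened within the recent time window.
function shouldRestart(
  crashTimestamps: number[],
  now: number,
  windowMs: number = 180000,
  maxCrashes: number = 5
): boolean {
  const recent = crashTimestamps.filter((t) => now - t <= windowMs);
  return recent.length < maxCrashes;
}
```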

Document Synchronization

Client support for the textDocument/didOpen, textDocument/didChange and textDocument/didClose notifications is mandatory in the protocol; clients cannot opt out of supporting them.

Text Document Sync Modes

enum TextDocumentSyncKind {
  None = 0,        // No synchronization
  Full = 1,        // Full document sync on every change
  Incremental = 2  // Incremental sync (deltas only)
}

Document Lifecycle

Document Open

textDocument/didOpen: {
  textDocument: {
    uri: "file:///path/to/file.js",
    languageId: "javascript", 
    version: 1,
    text: "console.log('hello');"
  }
}

Document Change

textDocument/didChange: {
  textDocument: { uri: "file:///path/to/file.js", version: 2 },
  contentChanges: [{
    range: { start: { line: 0, character: 13 }, end: { line: 0, character: 18 } },
    text: "world"
  }]
}

Change Event Types:

Full text: Replace entire document
Incremental: Specify range and replacement text
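
A server that keeps its own copy of each document has to apply these events itself. The sketch below handles both kinds, assuming the default UTF-16 position encoding (which matches JavaScript string indices) and "\n" or "\r\n" line endings. The offsetAt and applyChange helpers are hypothetical names.

```typescript
interface Position { line: number; character: number; }
interface Range { start: Position; end: Position; }
type ContentChange = { range: Range; text: string } | { text: string };

// Offset of a position in `text`, counted in UTF-16 code units.
// Positions beyond the end of a line or document are clamped.
function offsetAt(text: string, pos: Position): number {
  let offset = 0;
  for (let line = 0; line < pos.line; line++) {
    const nl = text.indexOf("\n", offset);
    if (nl < 0) return text.length; // past the last line
    offset = nl + 1;
  }
  let lineEnd = text.indexOf("\n", offset);
  if (lineEnd < 0) lineEnd = text.length;
  return Math.min(offset + pos.character, lineEnd);
}

// Apply a single change event and return the new document text.
function applyChange(text: string, change: ContentChange): string {
  if (!("range" in change)) return change.text; // full text: replace everything
  const start = offsetAt(text, change.range.start);
  const end = offsetAt(text, change.range.end);
  return text.slice(0, start) + change.text + text.slice(end);
}
```

For example, replacing characters 13 to 18 on line 0 of `console.log('hello');` with `world` yields `console.log('world');`.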

Document Save

// Optional: Before save
textDocument/willSave: {
  textDocument: { uri: "file:///path/to/file.js" },
  reason: TextDocumentSaveReason.Manual
}

// Optional: Before save with text edits
textDocument/willSaveWaitUntil → TextEdit[]

// After save
textDocument/didSave: {
  textDocument: { uri: "file:///path/to/file.js" },
  text: "optional full text" // present only if the server's save options set includeText
}

Document Close

textDocument/didClose: {
  textDocument: { uri: "file:///path/to/file.js" }
}

Position Encoding

Prior to 3.17 the offsets were always based on a UTF-16 string representation. Since 3.17 clients and servers can agree on a different string encoding representation (e.g. UTF-8).

Supported Encodings:

utf-16 (default, mandatory)
utf-8
utf-32

Position Structure:

interface Position {
  line: uinteger;     // Zero-based line number
  character: uinteger; // Zero-based character offset
}

interface Range {
  start: Position;
  end: Position;
}
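
The encoding matters as soon as a line contains non-ASCII text, because the same character position maps to different offsets. As a rough illustration, the hypothetical helper below converts a UTF-16 character offset within one line to the corresponding UTF-8 byte offset.

```typescript
// Convert a UTF-16 code unit offset within one line to a UTF-8 byte offset.
// An offset that falls inside a surrogate pair is rounded up to the end of
// that character.
function utf16ToUtf8Offset(line: string, utf16Offset: number): number {
  let bytes = 0;
  let i = 0;
  while (i < utf16Offset && i < line.length) {
    const cp = line.codePointAt(i)!;
    if (cp <= 0x7f) bytes += 1;
    else if (cp <= 0x7ff) bytes += 2;
    else if (cp <= 0xffff) bytes += 3;
    else bytes += 4;
    i += cp > 0xffff ? 2 : 1; // astral code points occupy two UTF-16 units
  }
  return bytes;
}
```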

Workspace Features

Multi-Root Workspaces

workspace/workspaceFolders → WorkspaceFolder[] | null

interface WorkspaceFolder {
  uri: URI;
  name: string;
}

// Notification when folders change
workspace/didChangeWorkspaceFolders: DidChangeWorkspaceFoldersParams

Configuration Management

// Server requests configuration from client
workspace/configuration: ConfigurationParams → any[]

interface ConfigurationItem {
  scopeUri?: URI;     // Scope (file/folder) for the setting
  section?: string;   // Setting name (e.g., "typescript.preferences")
}

// Client notifies server of configuration changes
workspace/didChangeConfiguration: DidChangeConfigurationParams

File Operations

File Watching

workspace/didChangeWatchedFiles: DidChangeWatchedFilesParams

interface FileEvent {
  uri: DocumentUri;
  type: FileChangeType; // Created | Changed | Deleted
}

File System Operations

// Before operations (can return WorkspaceEdit)
workspace/willCreateFiles: CreateFilesParams → WorkspaceEdit | null
workspace/willRenameFiles: RenameFilesParams → WorkspaceEdit | null  
workspace/willDeleteFiles: DeleteFilesParams → WorkspaceEdit | null

// After operations (notifications)
workspace/didCreateFiles: CreateFilesParams
workspace/didRenameFiles: RenameFilesParams
workspace/didDeleteFiles: DeleteFilesParams

Command Execution

workspace/executeCommand: ExecuteCommandParams → any

interface ExecuteCommandParams {
  command: string;           // Command identifier
  arguments?: any[];         // Command arguments
}

// Server applies edits to workspace
workspace/applyEdit: ApplyWorkspaceEditParams → ApplyWorkspaceEditResult

Window Features

Show Message

window/showMessage: ShowMessageParams

interface ShowMessageParams {
  type: MessageType; // Error | Warning | Info | Log (Debug is proposed for 3.18)
  message: string;
}

Show Message Request

window/showMessageRequest: ShowMessageRequestParams → MessageActionItem | null

interface ShowMessageRequestParams {
  type: MessageType;
  message: string;
  actions?: MessageActionItem[]; // Buttons to show
}

Show Document

window/showDocument: ShowDocumentParams → ShowDocumentResult

interface ShowDocumentParams {
  uri: URI;
  external?: boolean;    // Open in external program
  takeFocus?: boolean;   // Focus the document
  selection?: Range;     // Select range in document
}

Progress Reporting

Work Done Progress

// Server creates progress token
window/workDoneProgress/create: WorkDoneProgressCreateParams → void

// Report progress using $/progress
$/progress: ProgressParams<WorkDoneProgressBegin | WorkDoneProgressReport | WorkDoneProgressEnd>

// Client can cancel progress
window/workDoneProgress/cancel: WorkDoneProgressCancelParams

Progress Reporting Pattern

// Begin
{ kind: "begin", title: "Indexing", cancellable: true, percentage: 0 }

// Report
{ kind: "report", message: "Processing file.ts", percentage: 25 }

// End  
{ kind: "end", message: "Indexing complete" }
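
Put together, a server might produce the sequence of $/progress parameters for a job like this. The indexingProgress helper and its inputs are hypothetical; the value shapes mirror the begin, report, and end payloads shown above.

```typescript
type ProgressValue =
  | { kind: "begin"; title: string; cancellable?: boolean; percentage?: number }
  | { kind: "report"; message?: string; percentage?: number }
  | { kind: "end"; message?: string };

interface ProgressParams { token: number | string; value: ProgressValue; }

// Build the full list of $/progress params for indexing the given files:
// one begin, one report per file, one end.
function indexingProgress(token: string, files: string[]): ProgressParams[] {
  const out: ProgressParams[] = [
    { token, value: { kind: "begin", title: "Indexing", percentage: 0 } },
  ];
  files.forEach((file, i) => {
    const percentage = Math.round(((i + 1) / files.length) * 100);
    out.push({ token, value: { kind: "report", message: "Processing " + file, percentage } });
  });
  out.push({ token, value: { kind: "end", message: "Indexing complete" } });
  return out;
}
```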

Logging & Telemetry

window/logMessage: LogMessageParams     // Development logs
telemetry/event: any                   // Usage analytics

Version History

LSP 3.17 (Current)

The major new features are type hierarchy, inline values, inlay hints, notebook document support and a meta model that describes the 3.17 LSP version.

Key Features:

Type hierarchy support
Inline value provider
Inlay hints
Notebook document synchronization
Diagnostic pull model
Position encoding negotiation

LSP 3.16

Key Features:

Semantic tokens
Call hierarchy
Moniker support
File operation events
Linked editing ranges
Code action resolve

LSP 3.15

Key Features:

Progress reporting
Selection ranges
Signature help context

LSP 3.0

Breaking Changes:

Client capabilities system
Dynamic registration
Workspace folders
Document link support