About Symposium
Symposium is a set of components that make AI agents work better. These components are compatible with any ACP-based editing system, including Zed, VSCode (with our plugin), and IntelliJ and NeoVim (coming soon).
We focus on three areas:
Rust Expertise
Built-in capabilities for understanding the Rust language, its errors, and its idioms.
- rust-crate-sources tool
- IDE operations
- Error explanations
Ecosystem-Powered Knowledge
Crate authors provide specialized AI tooling through Cargo.toml metadata, bringing domain knowledge directly to the agent.
- Crate-provided skills, context, and capabilities
Rich Collaboration
Interactive patterns for human-AI partnership beyond simple text exchanges.
- Sparkle collaborative patterns
- Walkthroughs
- Taskspaces
Open Source Community
Symposium is an open-source project and we are actively soliciting contributors. We welcome users as well, but given the exploratory nature of Symposium, expect frequent changes. Currently, the only way to install Symposium is from source.
We maintain a code of conduct and operate as an independent community focused on exploring what AI has to offer for software development.
Rust Crate Sources Tool
The rust-crate-sources MCP tool allows AI agents to access the source code of published Rust crates from crates.io.
Just like humans, AI agents work best when they have a few examples of what to do rather than having to sort through reams of documentation. Often the best way to provide this is to expose them to the crate's source code, since well-maintained crates come equipped with examples and usage patterns.
When an agent needs to understand how a particular crate works, it can fetch and examine the actual implementation, including any examples or tests the crate provides.
Rich Collaboration
Symposium integrates with Sparkle, a framework for AI collaboration that creates partnership dynamics through intuitive signals, partnership behaviors, and meta-collaboration tools.
For complete information about Sparkle patterns and capabilities, see the Sparkle documentation.
Implementation Overview
Symposium appears to external clients as a single ACP proxy, but internally uses a conductor to orchestrate a dynamic chain of component proxies. This architecture allows Symposium to adapt to different client capabilities and provide consistent functionality regardless of what the editor or agent natively supports.
Architecture
External View
From the outside, Symposium is a standard ACP proxy that sits between an editor and an agent:
flowchart LR
Editor --> Symposium --> Agent
Internal Structure
Internally, Symposium runs a conductor in proxy mode that orchestrates multiple component proxies:
flowchart LR
Editor --> S[Symposium Conductor]
S --> C1[Component 1]
C1 --> A1[Adapter 1]
A1 --> C2[Component 2]
C2 --> Agent
The conductor dynamically builds this chain based on what capabilities the editor and agent provide.
Component Pattern
Some Symposium features are implemented as component/adapter pairs:
Components
Components provide functionality to agents through MCP tools and other mechanisms. They:
- Expose high-level capabilities (e.g., Dialect-based IDE operations)
- May rely on primitive capabilities from upstream (the editor)
- Are always included in the chain when their functionality is relevant
Adapters
Adapters "shim" for missing primitive capabilities by providing fallback implementations. They:
- Check whether required primitive capabilities exist upstream
- Provide the capability if it's missing (e.g., spawn rust-analyzer to provide IDE operations)
- Pass through transparently if the capability already exists
- Are conditionally included only when needed
Capability-Driven Assembly
During initialization, Symposium:
- Receives capabilities from the editor - examines what the upstream client provides
- Queries the agent - discovers what capabilities the downstream agent supports
- Builds the proxy chain - spawns components and adapters based on detected gaps and opportunities
- Advertises enriched capabilities - tells the editor what the complete chain provides
This approach allows Symposium to work with minimal ACP clients (by providing fallback implementations) while taking advantage of native capabilities when available (by passing through directly).
For detailed information about the initialization sequence and capability negotiation, see Initialization Sequence.
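The assembly steps above can be sketched roughly as follows. This is a minimal illustration in TypeScript, not Symposium's conductor code: the names `Capability`, `FeatureSpec`, `ChainLink`, and `buildChain` are hypothetical, and the sketch only looks at editor capabilities (the agent query and the final capability advertisement are left out).

```typescript
// Hypothetical names throughout; none of these come from the Symposium codebase.
type Capability = string;

interface FeatureSpec {
  name: string;          // e.g. "ide-operations"
  requires?: Capability; // primitive capability needed from upstream, if any
}

type ChainLink =
  | { kind: "adapter"; feature: string; provides: Capability }
  | { kind: "component"; feature: string };

// Build the chain in editor-to-agent order: an adapter (when needed) sits
// upstream of the component it supports.
function buildChain(editorCaps: Set<Capability>, features: FeatureSpec[]): ChainLink[] {
  const chain: ChainLink[] = [];
  features.forEach((feature) => {
    const needed = feature.requires;
    if (needed !== undefined && !editorCaps.has(needed)) {
      // The editor lacks the primitive capability, so add a fallback adapter.
      chain.push({ kind: "adapter", feature: feature.name, provides: needed });
    }
    // The component is included whether or not an adapter was needed.
    chain.push({ kind: "component", feature: feature.name });
  });
  return chain;
}

const features: FeatureSpec[] = [
  { name: "ide-operations", requires: "ide_operations" },
  { name: "rust-crate-sources" },
];

// A minimal editor gets an adapter; an editor with native support does not.
const minimalEditor = buildChain(new Set<Capability>(), features);
const richEditor = buildChain(new Set<Capability>(["ide_operations"]), features);
```

With no editor capabilities the chain contains the IDE operations adapter followed by both components; with native `ide_operations` support the adapter is omitted.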
Components
Symposium's functionality is delivered through component proxies that are orchestrated by the internal conductor. Some features use a component/adapter pattern while others are standalone components.
Component Types
Standalone Components
Some components provide functionality that doesn't depend on upstream capabilities. These components work with any editor and add features purely through the proxy layer.
Example: A component that provides git history analysis through MCP tools doesn't need special editor support - it can work with the filesystem directly.
Component/Adapter Pairs
Other components rely on primitive capabilities from the upstream editor. For these, Symposium uses a two-layer approach:
Adapter Layer
The adapter sits upstream in the proxy chain and provides primitive capabilities that the component needs.
Responsibilities:
- Check for required capabilities during initialization
- Pass requests through if the editor provides the capability
- Provide fallback implementation if the capability is missing
- Abstract away editor differences from the component
Example: The IDE Operations adapter checks if the editor supports ide_operations. If not, it can spawn a language server (like rust-analyzer) to provide that capability.
Component Layer
The component sits downstream from its adapter and enriches primitive capabilities into higher-level MCP tools.
Responsibilities:
- Expose MCP tools to the agent
- Process tool invocations
- Send requests upstream through the adapter
- Return results to the agent
Example: The IDE Operations component exposes an ide_operation MCP tool that accepts Dialect programs and translates them into IDE operation requests sent upstream.
Component Lifecycle
For component/adapter pairs:
- Initialization - Adapter receives initialize request from upstream (editor)
- Capability Check - Adapter examines editor capabilities
- Conditional Spawning - Adapter spawns fallback if capability is missing
- Chain Assembly - Conductor wires adapter → component → downstream
- Request Flow - Agent calls MCP tool → component → adapter → editor
- Response Flow - Results flow back: editor → adapter → component → agent
Proxy Chain Direction
The proxy chain flows from editor to agent:
Editor → [Adapter] → [Component] → Agent
- Upstream = toward the editor
- Downstream = toward the agent
Adapters sit closer to the editor, components sit closer to the agent.
Current Components
Rust Crate Sources
Provides access to published Rust crate source code through an MCP server.
- Type: Standalone component
- Implementation: Injects an MCP server that exposes the rust-crate-sources tool
- Function: Allows agents to fetch and examine source code from crates.io
Sparkle
Provides an AI collaboration framework through prompt injection and MCP tooling.
- Type: Standalone component
- Implementation: Injects Sparkle MCP server with collaboration tools
- Function: Enables partnership dynamics, pattern anchors, and meta-collaboration capabilities
- Documentation: Sparkle docs
Future Components
Additional components can be added following these patterns:
- IDE Operations - Code navigation and search (likely component/adapter pair)
- Walkthroughs - Interactive code explanations
- Git Operations - Repository analysis
- Build Integration - Compilation and testing workflows
Rust Crate Sources Component
The Rust Crate Sources component provides agents with the ability to research published Rust crate source code through a sub-agent architecture.
Architecture Overview
The component uses a sub-agent research pattern: when an agent needs information about a Rust crate, the component spawns a dedicated research session with its own agent to investigate the crate sources and return findings.
Message Flow
sequenceDiagram
participant Client
participant Proxy as Crate Sources Proxy
participant Agent
Note over Client,Proxy: Initial Session Setup
Client->>Proxy: NewSessionRequest
Note right of Proxy: Adds user-facing MCP server<br/>(rust_crate_query tool)
Proxy->>Agent: NewSessionRequest (with user-facing MCP)
Agent-->>Proxy: NewSessionResponse(session_id)
Proxy-->>Client: NewSessionResponse(session_id)
Note over Agent,Proxy: Research Request
Agent->>Proxy: ToolRequest(rust_crate_query, crate, prompt)
Note right of Proxy: Create research session
Proxy->>Agent: NewSessionRequest (with sub-agent MCP)
Note right of Proxy: Sub-agent MCP has:<br/>- get_rust_crate_source<br/>- return_response_to_user
Agent-->>Proxy: NewSessionResponse(research_session_id)
Proxy->>Agent: PromptRequest(research_session_id, prompt)
Note over Agent: Sub-agent researches crate<br/>Uses get_rust_crate_source<br/>Reads files (auto-approved)
Agent->>Proxy: RequestPermissionRequest(Read)
Proxy-->>Agent: RequestPermissionResponse(approved)
Agent->>Proxy: ToolRequest(return_response_to_user, findings)
Proxy-->>Agent: ToolResponse(success)
Note right of Proxy: Response sent via internal channel
Proxy-->>Agent: ToolResponse(rust_crate_query result)
Two MCP Servers
The component provides two distinct MCP servers:
- User-facing MCP Server - Exposed to the main agent session
  - Tool: rust_crate_query - Initiates crate research
- Sub-agent MCP Server - Provided only to research sessions
  - Tool: get_rust_crate_source - Locates crate sources and returns path
  - Tool: return_response_to_user - Returns research findings and ends the session
User-Facing Tool: rust_crate_query
Parameters
{
crate_name: string, // Name of the Rust crate
crate_version?: string, // Optional semver range (defaults to latest)
prompt: string // What to research about the crate
}
Examples
{
"crate_name": "serde",
"prompt": "How do I use the derive macro for custom field names?"
}
{
"crate_name": "tokio",
"crate_version": "1.0",
"prompt": "What are the signatures of all methods on tokio::runtime::Runtime?"
}
Behavior
- Creates a new research session via NewSessionRequest
- Attaches the sub-agent MCP server to that session
- Sends the user's prompt via PromptRequest
- Waits for the sub-agent to call return_response_to_user
- Returns the sub-agent's findings as the tool result
Sub-Agent Tools
get_rust_crate_source
Locates and extracts the source code for a Rust crate from crates.io.
Parameters:
{
crate_name: string,
version?: string // Semver range
}
Returns:
{
"crate_name": "serde",
"version": "1.0.210",
"checkout_path": "/Users/user/.cargo/registry/src/.../serde-1.0.210",
"message": "Crate 'serde' version 1.0.210 extracted to ..."
}
The sub-agent can then use Read tool calls (which are auto-approved) to examine the source code.
return_response_to_user
Signals completion of the research and returns findings to the waiting rust_crate_query call.
Parameters:
{
response: string // The research findings to return
}
Behavior:
- Sends the response through an internal channel to the waiting tool handler
- The original rust_crate_query call completes with this response
- The research session can then be terminated
Permission Auto-Approval
The component implements a message handler that intercepts RequestPermissionRequest messages from research sessions and automatically approves all permission requests.
Permission Rules
- Research sessions → All permissions automatically approved
- Other sessions → Passed through unchanged
Rationale
Research sessions are sandboxed and disposable - they investigate crate sources and return findings. Auto-approving all permissions eliminates the need for dozens of permission prompts while maintaining safety:
- Research sessions operate on read-only crate sources in the cargo registry cache
- Sessions are short-lived and focused on a single research task
- Any side effects are contained within the research session's scope
Implementation
The handler checks if a permission request comes from a registered research session and automatically selects the first available option (typically "allow"):
if self.state.is_research_session(&req.session_id) {
    // Select first option (typically "allow")
    let response = RequestPermissionResponse {
        outcome: RequestPermissionOutcome::Selected {
            option_id: req.options.first().unwrap().id.clone(),
        },
        meta: None,
    };
    request_cx.respond(response)?;
    return Ok(Handled::Yes);
}
return Ok(Handled::No(message)); // Not our session, propagate unchanged
Session Lifecycle
- Agent calls rust_crate_query
  - Handler creates oneshot::channel() for response
  - Registers session in active sessions map
- Handler sends NewSessionRequest
  - Includes sub-agent MCP server configuration
  - Receives session_id in response
- Handler sends PromptRequest
  - Sends user's research prompt to the session
  - Awaits response on the oneshot channel
- Sub-agent performs research
  - Calls get_rust_crate_source to locate crate
  - Reads source files (auto-approved by permission handler)
  - Analyzes code to answer the prompt
- Sub-agent calls return_response_to_user
  - Sends findings through internal channel
  - Original rust_crate_query call receives response
- Session cleanup
  - Remove session from active sessions map
  - Session termination (if ACP supports explicit session end)
Shared State
The component uses shared state to coordinate between:
- The rust_crate_query tool handler (creates sessions, waits for responses)
- The return_response_to_user tool handler (sends responses)
- The permission request handler (auto-approves permission requests from research sessions)
State Structure
struct ResearchSession {
    session_id: SessionId,
    response_tx: oneshot::Sender<String>,
}

// Shared across all handlers
Arc<Mutex<HashMap<SessionId, ResearchSession>>>
Design Decisions
Why Sub-Agents Instead of Direct Pattern Search?
Previous approach: The component exposed get_rust_crate_source with a pattern parameter that performed regex searches across crate sources.
Problems:
- Agents had to construct exact regex patterns
- Limited to simple pattern matching
- No semantic understanding of code structure
- Single-shot queries couldn't follow up on findings
Sub-agent approach:
- Agent describes what information it needs in natural language
- Sub-agent can perform multiple reads, follow references, understand context
- Can navigate code structure intelligently
- Returns synthesized answers, not raw pattern matches
Why Auto-Approve All Permissions?
Research sessions need extensive file access to examine crate sources. Requiring user approval for every operation would create dozens of permission prompts, making the feature unusable.
Safety considerations:
- Research sessions are sandboxed and disposable
- Scope is limited to investigating crate sources in cargo registry cache
- Sessions are short-lived with a focused task
- Any side effects are contained within the research session
Why Oneshot Channels for Response Coordination?
Each rust_crate_query call creates exactly one research session and expects exactly one response. A oneshot::channel models this perfectly:
- Type-safe guarantee of single response
- Clear ownership transfer
- Automatic cleanup on drop
- No need to poll or maintain complex state
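The coordination described above (one session, one response, shared between three handlers) can be mirrored in a small TypeScript sketch. The component itself is written in Rust and uses a oneshot channel; the version below is only an analogue built on a promise, and the class and method names (`ResearchSessions`, `register`, `respond`) are hypothetical.

```typescript
// Hypothetical TypeScript analogue of the Rust shared state; names are illustrative.
class ResearchSessions {
  // Maps a research session ID to the resolver of its pending response.
  private pending = new Map<string, (findings: string) => void>();

  // Called by the rust_crate_query handler: registers the session and returns
  // a promise that settles when the sub-agent reports back.
  register(sessionId: string): Promise<string> {
    return new Promise<string>((resolve) => {
      this.pending.set(sessionId, resolve);
    });
  }

  // Used by the permission handler to decide whether to auto-approve.
  isResearchSession(sessionId: string): boolean {
    return this.pending.has(sessionId);
  }

  // Called by the return_response_to_user handler. The entry is removed
  // before resolving, so each session delivers at most one response.
  respond(sessionId: string, findings: string): boolean {
    const resolve = this.pending.get(sessionId);
    if (resolve === undefined) {
      return false;
    }
    this.pending.delete(sessionId);
    resolve(findings);
    return true;
  }
}

const sessions = new ResearchSessions();
const findings = sessions.register("research-1");
const wasResearch = sessions.isResearchSession("research-1");
const firstResponse = sessions.respond("research-1", "example findings");
const secondResponse = sessions.respond("research-1", "ignored duplicate");
```

The second `respond` call is rejected because the session entry was already consumed, which is the single-response property the oneshot channel provides in the Rust implementation.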
Integration with Symposium
The component is registered with the conductor in symposium-acp/src/lib.rs:
components.push(sacp::DynComponent::new(
    symposium_crate_sources_proxy::CrateSourcesProxy {},
));
The component implements Component::serve() to:
- Register the user-facing MCP server via McpServiceRegistry
- Implement message handling for permission requests
- Forward all other messages to the successor component
Future Enhancements
- Session timeouts - Terminate research sessions that take too long
- Concurrent research - Support multiple research sessions simultaneously
- Caching - Cache common queries to avoid redundant research
- Progressive responses - Stream findings as they're discovered rather than waiting for completion
- Research history - Allow agents to reference previous research results
VSCode Extension Architecture
The Symposium VSCode extension provides a chat interface for interacting with AI agents. The architecture divides responsibilities across three layers to handle VSCode's webview constraints while maintaining clean separation of concerns.
Components Overview
mynah-ui: AWS's open-source chat interface library (github.com/aws/mynah-ui). Provides the chat UI rendering, tab management, and message display. The webview layer uses mynah-ui for all visual presentation.
Agent: Currently a mock implementation (HomerActor) that responds with Homer Simpson quotes. Future implementation will spawn an ACP-compatible agent process (see ACP Integration chapter when available).
Extension activation: VSCode activates the extension when the user first opens the Symposium sidebar or runs a Symposium command. The extension spawns the agent process during activation (or lazily on first use) and keeps it alive for the entire VSCode session.
Three-Layer Model
┌─────────────────────────────────────────────────┐
│ Webview (Browser Context) │
│ - mynah-ui rendering │
│ - User interaction capture │
│ - Tab management │
└─────────────────┬───────────────────────────────┘
│ VSCode postMessage API
┌─────────────────▼───────────────────────────────┐
│ Extension (Node.js Context) │
│ - Message routing │
│ - Agent lifecycle │
│ - Webview lifecycle │
└─────────────────┬───────────────────────────────┘
│ Process spawning / stdio
┌─────────────────▼───────────────────────────────┐
│ Agent (Separate Process) │
│ - Session management │
│ - AI interaction │
│ - Streaming responses │
└─────────────────────────────────────────────────┘
Why Three Layers?
Webview Isolation
VSCode webviews run in isolated browser contexts without Node.js APIs. This security boundary prevents direct file system access, process spawning, or network operations. The webview can only communicate with the extension through VSCode's postMessage API.
Design consequence: UI code must be pure browser JavaScript. All privileged operations (spawning agents, workspace access, persistence) happen in the extension layer.
Extension as Coordinator
The extension runs in Node.js with full VSCode API access. It bridges between the isolated webview and external agent processes.
Key responsibilities:
- Message routing - Translates between webview UI events and agent protocol messages
- Agent lifecycle - Spawns and manages the agent process
- Webview lifecycle - Handles visibility changes and ensures messages reach the UI
The extension deliberately avoids understanding message semantics. It routes based on IDs (tab ID, message ID) without interpreting content.
Agent Independence
The agent runs as a separate process communicating via stdio. This isolation provides:
- Flexibility - Agent can be any executable (Rust, Python, TypeScript)
- Stability - Agent crashes don't kill the extension
- Multiple sessions - Single agent process handles all tabs/conversations
The agent owns all session state and conversation logic. The extension only tracks which tab corresponds to which session.
Communication Boundaries
Webview ↔ Extension
Transport: postMessage API (asynchronous, JSON-serializable messages only)
Direction:
- Webview → Extension: User actions (new tab, send prompt, close tab)
- Extension → Webview: Agent responses (response chunks, completion signals)
Why not synchronous? VSCode's webview API is inherently asynchronous. This forces the UI to be resilient to message delays and webview lifecycle events.
Extension ↔ Agent
Transport: ACP (Agent Client Protocol) over stdio
Direction:
- Extension → Agent: Session commands (new session, process prompt)
- Agent → Extension: Streaming responses, session state updates
Why ACP over stdio? ACP provides a standardized protocol for agent communication. Stdio is simple, universal, and works with any language. No need for network sockets or IPC complexity.
Agent Configuration and Sharing
The extension uses AgentConfiguration to determine when agent processes can be shared across tabs. An AgentConfiguration consists of:
- Agent name (e.g., "ElizACP", "Claude")
- Enabled components (e.g., "symposium-acp")
- Workspace folder (the VSCode workspace the agent operates in)
Sharing strategy: Tabs with identical configurations share the same agent actor (process), but each tab gets its own session within that process.
Workspace folder selection:
- Single workspace: Automatically uses that workspace
- Multiple workspaces: Prompts user to select which workspace folder to use
- Each session is created with the workspace folder as its working directory
Rationale:
- Resource efficiency - Shared actor means one process for multiple tabs with the same config
- Workspace isolation - Different workspace folders get different actors to maintain proper working directory context
- Session isolation - Each tab gets its own session ID for conversation independence
Trade-off: Agent must implement multiplexing. Messages include session/tab IDs for routing. Extension maps UI tab IDs to agent session IDs.
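One way to picture the sharing strategy is a registry keyed by the configuration. The sketch below is illustrative only: the field names, `configKey`, `AgentActor`, and `ActorRegistry` are assumptions rather than the extension's actual types, and treating component order as irrelevant is also an assumption.

```typescript
// Illustrative sketch; field and class names are assumptions.
interface AgentConfiguration {
  agentName: string;
  components: string[];
  workspaceFolder: string;
}

// Serialize a configuration into a stable key. Components are sorted so that
// ordering differences do not produce distinct actors (an assumption).
function configKey(config: AgentConfiguration): string {
  return JSON.stringify([
    config.agentName,
    config.components.slice().sort(),
    config.workspaceFolder,
  ]);
}

class AgentActor {
  private nextSession = 0;
  constructor(readonly config: AgentConfiguration) {}

  // Stand-in for sending a new-session request to the agent process.
  newSession(): string {
    this.nextSession += 1;
    return `session-${this.nextSession}`;
  }
}

class ActorRegistry {
  private actors = new Map<string, AgentActor>();

  // Reuse the actor for an identical configuration; otherwise create one.
  actorFor(config: AgentConfiguration): AgentActor {
    const key = configKey(config);
    let actor = this.actors.get(key);
    if (actor === undefined) {
      actor = new AgentActor(config);
      this.actors.set(key, actor);
    }
    return actor;
  }
}

const registry = new ActorRegistry();
const configA: AgentConfiguration = {
  agentName: "ElizACP",
  components: ["symposium-acp"],
  workspaceFolder: "/work/a",
};
const configB: AgentConfiguration = { ...configA, workspaceFolder: "/work/b" };

const tab1Actor = registry.actorFor(configA);
const tab2Actor = registry.actorFor({ ...configA }); // identical config: shared actor
const tab3Actor = registry.actorFor(configB);        // different workspace: new actor
const tab1Session = tab1Actor.newSession();
const tab2Session = tab2Actor.newSession();
```

Tabs 1 and 2 share one actor but receive distinct session IDs, while tab 3 gets its own actor because its workspace folder differs.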
Design Principles
Opaque state: Each layer owns its state format. Extension stores but doesn't parse webview UI state or agent session state.
Graceful degradation: Webview can be hidden/shown at any time. Extension buffers messages when webview is inactive.
UUID-based identity: Tab IDs and message IDs use UUIDs to avoid collisions. Both are generated at the source (the webview generates tab IDs when a tab is created and message IDs when a prompt is sent) to eliminate coordination overhead.
Minimal coupling: Layers communicate through well-defined message protocols. Webview doesn't know about agents. Agent doesn't know about webviews. Extension coordinates without understanding semantics.
End-to-End Flow
Here's how a complete user interaction flows through the system:
sequenceDiagram
participant User
participant VSCode
participant Extension
participant Webview
participant Agent
User->>VSCode: Opens Symposium sidebar
VSCode->>Extension: activate()
Extension->>Extension: Generate session ID
Extension->>Agent: Spawn process
Extension->>Webview: Create webview (inject session ID)
Webview->>Webview: Load, check session ID vs saved state
Webview->>Webview: Restore or clear tabs, initialize mynah-ui
Webview->>Extension: webview-ready (last-seen-index)
User->>Webview: Creates new tab
Webview->>Webview: Generate tab UUID
Webview->>Extension: new-tab (tabId)
Extension->>Agent: new-session
Agent->>Agent: Initialize session
Agent->>Extension: session-created (sessionId)
Extension->>Extension: Store tabId ↔ sessionId mapping
User->>Webview: Sends prompt
Webview->>Webview: Generate message UUID
Webview->>Extension: prompt (tabId, messageId, text)
Extension->>Extension: Lookup sessionId for tabId
Extension->>Agent: process-prompt (sessionId, text)
loop Streaming response
Agent->>Extension: response-chunk (sessionId, chunk)
Extension->>Extension: Lookup tabId for sessionId
Extension->>Webview: response-chunk (tabId, messageId, chunk)
Webview->>Webview: Render chunk in mynah-ui
end
Agent->>Extension: response-complete (sessionId)
Extension->>Webview: response-complete (tabId, messageId)
Webview->>Webview: End message stream
Webview->>Webview: setState() - persist session ID and tabs
The extension maintains tab↔session mappings and handles webview visibility, while the agent maintains session state and generates responses.
Message Protocol
The extension coordinates message flow between the webview UI and agent process. Messages are identified by UUIDs and routed based on tab/session mappings.
Message Identity
The system uses two separate identification mechanisms:
Message IDs (UUIDs): Identify specific prompt/response conversations. When a user sends a prompt, the webview generates a UUID message ID. All response chunks for that prompt include the same message ID, allowing the UI to associate chunks with the correct prompt and render them in the right place. Message IDs enable multiple concurrent prompts (user sends prompt in tab A while tab B is still streaming a response).
Message indices (numbers): Monotonically increasing integers assigned by the extension per tab, used exclusively for deduplication. When the webview is hidden and shown, the extension may replay messages to ensure nothing was missed. The webview tracks the last index it saw per tab (via lastSeenIndex map) and ignores messages with index <= lastSeenIndex[tabId]. This prevents duplicate response chunks from appearing in the UI.
Why both? Message IDs provide semantic identity ("which conversation is this?"). Message indices provide delivery tracking ("have I seen this before?"). The extension assigns indices sequentially as messages flow through; the webview uses UUIDs for UI routing and indices for deduplication.
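The index-based deduplication can be sketched as below. This is a minimal illustration under assumptions: the `IndexedMessage` shape and the `WebviewInbox` class are hypothetical, and only the `lastSeenIndex` map and the `index <= lastSeenIndex[tabId]` rule come from the description above.

```typescript
// Hypothetical message shape; only `index` and `tabId` matter for deduplication.
interface IndexedMessage {
  tabId: string;
  index: number;     // assigned by the extension, increasing per tab
  messageId: string; // UUID of the prompt this chunk answers
  chunk: string;
}

class WebviewInbox {
  // Last index seen per tab, mirroring the lastSeenIndex map described above.
  private lastSeenIndex = new Map<string, number>();
  readonly rendered: string[] = [];

  // Returns true if the message was new and rendered, false if it was a
  // replayed duplicate that should be ignored.
  receive(msg: IndexedMessage): boolean {
    const lastSeen = this.lastSeenIndex.get(msg.tabId);
    if (lastSeen !== undefined && msg.index <= lastSeen) {
      return false;
    }
    this.lastSeenIndex.set(msg.tabId, msg.index);
    this.rendered.push(msg.chunk);
    return true;
  }
}

const inbox = new WebviewInbox();
const first: IndexedMessage = { tabId: "tab-a", index: 0, messageId: "m1", chunk: "Hel" };
const second: IndexedMessage = { tabId: "tab-a", index: 1, messageId: "m1", chunk: "lo" };
const third: IndexedMessage = { tabId: "tab-a", index: 2, messageId: "m1", chunk: "!" };

inbox.receive(first);
inbox.receive(second);
// The webview was hidden and shown again, so the extension replays both messages.
const replayedFirst = inbox.receive(first);
const replayedSecond = inbox.receive(second);
inbox.receive(third);
```

The replayed messages are dropped, so each chunk is rendered exactly once even though it was delivered twice.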
Message Flow Patterns
Opening a New Tab
sequenceDiagram
participant User
participant Webview
participant Extension
participant Agent
User->>Webview: Opens new tab
Webview->>Webview: Generate tab ID (UUID)
Webview->>Extension: new-tab (tabId)
Extension->>Agent: new-session
Agent->>Agent: Initialize session
Agent->>Extension: session-created (sessionId)
Extension->>Extension: Store tabId → sessionId mapping
Why UUID generation in webview? The webview owns tab lifecycle. Generating IDs at the source avoids round-trip coordination with the extension.
Why separate session IDs? The agent owns session identity. Tab IDs are UI concepts; session IDs are agent concepts. The extension maps between them without understanding either.
Sending a Prompt
sequenceDiagram
participant User
participant Webview
participant Extension
participant Agent
User->>Webview: Types message
Webview->>Extension: prompt (tabId, messageId, text)
Extension->>Extension: Lookup sessionId for tabId
Extension->>Agent: process-prompt (sessionId, text)
loop Streaming response
Agent->>Extension: response-chunk (sessionId, chunk)
Extension->>Extension: Lookup tabId for sessionId
alt Webview visible
Extension->>Webview: response-chunk (tabId, messageId, chunk)
Webview->>Webview: Append to message stream
else Webview hidden
Extension->>Extension: Buffer message
end
end
Agent->>Extension: response-complete (sessionId)
Extension->>Webview: response-complete (tabId, messageId)
Webview->>Webview: End message stream
Why streaming? AI responses can take seconds to complete. Streaming provides immediate feedback and allows users to start reading while generation continues.
Why message IDs? Multiple prompts can be in flight simultaneously (user sends prompt in tab A while tab B is still receiving a response). Message IDs ensure response chunks are associated with the correct prompt.
Why buffer when hidden? VSCode can hide webviews at any time (user switches away, collapses sidebar). Buffering ensures the UI sees all messages when it becomes visible again.
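A rough sketch of the buffering behavior follows. It assumes a hypothetical `WebviewChannel` wrapper with an injected `post` function standing in for the webview's `postMessage`; the real extension may structure this differently, for example by replaying from the last index the webview reports.

```typescript
// Hypothetical message shape and class; `post` stands in for webview.postMessage.
interface OutgoingMessage {
  type: string;
  tabId: string;
  index: number;
}

class WebviewChannel {
  private visible = false;
  private buffer: OutgoingMessage[] = [];

  constructor(private post: (msg: OutgoingMessage) => void) {}

  // Deliver immediately when the webview is visible; otherwise hold the message.
  send(msg: OutgoingMessage): void {
    if (this.visible) {
      this.post(msg);
    } else {
      this.buffer.push(msg);
    }
  }

  // When the webview becomes visible, flush buffered messages in send order.
  setVisible(visible: boolean): void {
    this.visible = visible;
    if (visible) {
      const pending = this.buffer;
      this.buffer = [];
      pending.forEach((msg) => this.post(msg));
    }
  }
}

const delivered: number[] = [];
const channel = new WebviewChannel((msg) => {
  delivered.push(msg.index);
});

channel.send({ type: "response-chunk", tabId: "tab-a", index: 0 }); // hidden: buffered
channel.send({ type: "response-chunk", tabId: "tab-a", index: 1 }); // hidden: buffered
const deliveredWhileHidden = delivered.length;
channel.setVisible(true);                                            // flushes 0 and 1
channel.send({ type: "response-chunk", tabId: "tab-a", index: 2 }); // visible: immediate
```

Nothing reaches the webview while it is hidden, and the buffered chunks arrive in their original order once it is shown.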
Closing a Tab
sequenceDiagram
participant User
participant Webview
participant Extension
participant Agent
User->>Webview: Closes tab
Webview->>Extension: close-tab (tabId)
Extension->>Extension: Lookup sessionId for tabId
Extension->>Agent: close-session (sessionId)
Agent->>Agent: Cleanup session state
Extension->>Extension: Remove tabId → sessionId mapping
Why explicit close messages? Allows agent to clean up resources (free memory, close file handles) rather than leaking session state indefinitely.
Message Identification Strategy
Tab IDs
- Generated by: Webview (when user creates new tab)
- Format: UUID v4
- Scope: UI-only concept
- Lifetime: From tab creation to tab close
Session IDs
- Generated by: Agent (in response to new-session)
- Format: Agent-defined (typically UUID)
- Scope: Agent-only concept
- Lifetime: From session creation to session close
Message IDs
- Generated by: Webview (when user sends prompt)
- Format: UUID v4
- Scope: Used by both webview and extension for response routing
- Lifetime: From prompt send to response complete
Why three separate ID spaces? Each layer owns its identity domain. This avoids coupling and eliminates coordination overhead.
Bidirectional Mapping
The extension maintains two maps:
tabId → sessionId (for extension → agent messages)
sessionId → tabId (for agent → extension messages)
Synchronization: Maps are updated atomically when session creation completes. Both directions always stay consistent.
Cleanup: Both mappings are removed when either tab closes or session ends.
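The two maps can be kept consistent by only ever updating them together. The class below is a minimal sketch of that idea; `TabSessionMap` and its method names are hypothetical, not the extension's API.

```typescript
// Hypothetical helper that keeps both directions in sync.
class TabSessionMap {
  private tabToSession = new Map<string, string>();
  private sessionToTab = new Map<string, string>();

  // Record both directions in one call so they never disagree.
  link(tabId: string, sessionId: string): void {
    this.tabToSession.set(tabId, sessionId);
    this.sessionToTab.set(sessionId, tabId);
  }

  sessionFor(tabId: string): string | undefined {
    return this.tabToSession.get(tabId);
  }

  tabFor(sessionId: string): string | undefined {
    return this.sessionToTab.get(sessionId);
  }

  // Cleanup from the tab side (the user closed the tab).
  unlinkTab(tabId: string): void {
    const sessionId = this.tabToSession.get(tabId);
    this.tabToSession.delete(tabId);
    if (sessionId !== undefined) {
      this.sessionToTab.delete(sessionId);
    }
  }

  // Cleanup from the session side (the session ended).
  unlinkSession(sessionId: string): void {
    const tabId = this.sessionToTab.get(sessionId);
    this.sessionToTab.delete(sessionId);
    if (tabId !== undefined) {
      this.tabToSession.delete(tabId);
    }
  }
}

const mapping = new TabSessionMap();
mapping.link("tab-1", "session-9");
const routedSession = mapping.sessionFor("tab-1"); // used for extension -> agent messages
const routedTab = mapping.tabFor("session-9");     // used for agent -> extension messages
mapping.unlinkTab("tab-1");
const tabAfterClose = mapping.tabFor("session-9");
```

Closing the tab removes both entries, so a late message for `session-9` finds no tab to route to.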
Message Ordering Guarantees
Within a session: Agent processes prompts sequentially. A second prompt won't start processing until the first response completes.
Across sessions: No ordering guarantees. Tabs are independent. Multiple sessions can stream responses simultaneously.
Webview messages: Delivered in order sent, but delivery timing depends on webview visibility. Buffered messages are replayed in order when webview becomes visible.
Error Handling
Agent crashes: Extension detects process exit, notifies all active tabs. Tabs display error state. User can trigger agent restart.
Webview disposal: Extension maintains agent sessions. If webview is recreated (VSCode restart), extension can restore tab → session mappings and continue existing sessions.
Message delivery failure: If webview is disposed while messages are buffered, messages are discarded. Agent sessions may continue running. Next webview instantiation can restore session state.
Design Rationale
Why not request/response? Streaming responses require continuous message flow, not single request/reply pairs. The protocol is inherently asynchronous.
Why not share IDs across layers? Each layer has different lifecycle concerns. Decoupling identity spaces allows independent evolution. Extension acts as impedance matcher between UI tab identity and agent session identity.
Why buffer in extension instead of agent? Agent shouldn't need to know about webview lifecycle. Extension handles VSCode-specific concerns (visibility, disposal) to keep agent implementation portable.
Tool Use Authorization
When agents request permission to execute tools (file operations, terminal commands, etc.), the extension provides a user approval mechanism. This chapter describes how authorization requests flow through the system and how per-agent policies are enforced.
Architecture
The authorization flow bridges three layers:
Agent (ACP requestPermission) → Extension (Promise-based routing) → Webview (MynahUI approval card)
The extension acts as the coordination point:
- Receives requestPermission callbacks from the ACP agent
- Checks per-agent bypass settings
- Routes approval requests to the webview when user input is needed
- Blocks the agent using promises until the user responds
Authorization Flow
With Bypass Disabled
sequenceDiagram
participant Agent
participant Extension
participant Settings
participant Webview
participant User
Agent->>Extension: requestPermission(toolCall, options)
Extension->>Settings: Check agents[agentName].bypassPermissions
Settings-->>Extension: false
Extension->>Extension: Generate approval ID, create pending promise
Extension->>Webview: approval-request message
Webview->>User: Display approval card (MynahUI)
User->>Webview: Click approve/deny/bypass
Webview->>Extension: approval-response message
alt User selected "Bypass Permissions"
Extension->>Settings: Set agents[agentName].bypassPermissions = true
end
Extension->>Extension: Resolve promise with user's choice
Extension-->>Agent: return RequestPermissionResponse
With Bypass Enabled
sequenceDiagram
participant Agent
participant Extension
participant Settings
Agent->>Extension: requestPermission(toolCall, options)
Extension->>Settings: Check agents[agentName].bypassPermissions
Settings-->>Extension: true
Extension-->>Agent: return allow_once (auto-approved)
Promise-Based Blocking
The ACP SDK's requestPermission callback is asynchronous: it must return a Promise<RequestPermissionResponse>, and the agent waits until that promise settles. The extension creates a promise that resolves when the user responds:
async requestPermission(params) {
  // Check bypass setting first
  if (agentConfig.bypassPermissions) {
    return { outcome: { outcome: "selected", optionId: allowOptionId } };
  }
  // Generate an ID so the webview's response can be matched to this request
  const approvalId = randomUUID(); // from node:crypto
  // Create promise that will resolve when user responds
  const promise = new Promise((resolve, reject) => {
    pendingApprovals.set(approvalId, { resolve, reject, agentName });
  });
  // Send request to webview
  sendToWebview({ type: "approval-request", approvalId, ... });
  // Return promise (blocks agent until resolved)
  return promise;
}
When the webview sends approval-response, the extension resolves the promise:
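case "approval-response": {
  const pending = pendingApprovals.get(message.approvalId);
  pendingApprovals.delete(message.approvalId); // Drop the entry so it cannot be resolved twice
  pending?.resolve(message.response); // Unblocks agent
  break;
}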
This allows the agent to block on permission requests without blocking the extension's event loop.
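The pending-approval bookkeeping can be sketched in isolation as follows. This is a minimal sketch rather than the extension's actual code: the ApprovalBroker class and its method names are hypothetical, and the transport to the webview is reduced to a callback.

```typescript
import { randomUUID } from "node:crypto";

// Response shape described in this chapter.
interface PermissionResponse {
  outcome: { outcome: "selected"; optionId: string };
}

// Hypothetical helper that tracks approvals waiting for the user.
class ApprovalBroker {
  private pending = new Map<string, (response: PermissionResponse) => void>();
  private send: (approvalId: string) => void;

  // `send` stands in for posting an approval-request message to the webview.
  constructor(send: (approvalId: string) => void) {
    this.send = send;
  }

  // Returns a promise that stays pending until respond() is called.
  request(): Promise<PermissionResponse> {
    const approvalId = randomUUID();
    const promise = new Promise<PermissionResponse>((resolve) => {
      this.pending.set(approvalId, resolve);
    });
    this.send(approvalId);
    return promise;
  }

  // Called when an approval-response arrives; returns false for unknown IDs.
  respond(approvalId: string, response: PermissionResponse): boolean {
    const resolve = this.pending.get(approvalId);
    if (!resolve) return false;
    this.pending.delete(approvalId);
    resolve(response);
    return true;
  }

  get pendingCount(): number {
    return this.pending.size;
  }
}
```

Deleting the entry before resolving means a duplicate approval-response for the same ID is ignored instead of resolving the promise a second time.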
Per-Agent Settings
Authorization policies are scoped per-agent in symposium.agents configuration:
{
"symposium.agents": {
"Claude Code": {
"command": "npx",
"args": ["@zed-industries/claude-code-acp"],
"bypassPermissions": true
},
"ElizACP": {
"command": "elizacp",
"bypassPermissions": false
}
}
}
Why per-agent? Different agents have different trust levels. A user might trust Claude Code with unrestricted file access but want to review every tool call from an experimental agent.
Scope: Settings are stored globally (VSCode user settings), so bypass policies persist across workspaces and sessions.
User Approval Options
When bypass is disabled, the webview displays three options:
- Approve - Allow this single tool call, continue prompting for future tools
- Deny - Reject this single tool call, continue prompting for future tools
- Bypass Permissions - Approve this call AND set bypassPermissions = true for this agent permanently
The "Bypass Permissions" option provides a quick path to trusted status without requiring manual settings edits.
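How a button click might translate into the fields of an approval-response can be sketched as below. The UserChoice type and the buildResult function are hypothetical names, and the sketch assumes the agent offered options whose kinds include allow_once and reject_once.

```typescript
// Assumed subset of an ACP permission option.
interface PermissionOption {
  optionId: string;
  kind: "allow_once" | "allow_always" | "reject_once" | "reject_always";
}

// The three buttons shown on the approval card.
type UserChoice = "approve" | "deny" | "bypass";

interface ApprovalResult {
  optionId: string; // Which agent-supplied option was selected
  bypassAll: boolean; // True only for "Bypass Permissions"
}

function buildResult(choice: UserChoice, options: PermissionOption[]): ApprovalResult {
  // "Bypass" approves the current call, so it maps to the same option as "approve".
  const wantedKind = choice === "deny" ? "reject_once" : "allow_once";
  const option = options.find((o) => o.kind === wantedKind);
  if (!option) {
    throw new Error(`No ${wantedKind} option offered by the agent`);
  }
  return { optionId: option.optionId, bypassAll: choice === "bypass" };
}
```

Persisting the bypass flag stays a separate step on the extension side; this function only decides what the response should say.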
Webview UI Implementation
The webview uses MynahUI primitives to display approval requests:
- Chat item - Approval request appears as a chat message in the conversation
- Buttons - Three buttons (Approve, Deny, Bypass) using MynahUI's button status colors
- Tool details - Tool name, parameters (formatted as JSON), and any available metadata
- Card dismissal - Cards auto-dismiss after the user clicks a button (keepCardAfterClick: false)
The specific MynahUI API usage is documented in the MynahUI GUI reference.
Approval Request Message
Extension → Webview:
{
type: "approval-request",
tabId: string,
approvalId: string, // UUID for matching response
agentName: string, // Which agent is requesting permission
toolCall: {
toolCallId: string, // ACP tool call identifier
title?: string, // Human-readable tool name (may be absent)
kind?: ToolKind, // "read", "edit", "execute", etc.
rawInput?: object // Tool parameters
},
options: PermissionOption[] // Available approval options from ACP
}
Approval Response Message
Webview → Extension:
{
type: "approval-response",
approvalId: string, // Matches approval-request
response: {
outcome: {
outcome: "selected",
optionId: string // Which option was chosen
}
},
bypassAll: boolean // True if "Bypass Permissions" clicked
}
Design Decisions
Why block the agent? Tool execution should wait for user consent. Continuing execution while waiting for approval would allow the agent to make progress on non-tool operations, potentially creating race conditions where the user approves a tool call that's no longer relevant.
Why promise-based? A pending promise lets the agent's request wait without blocking the extension's event loop. The extension returns the promise immediately and keeps handling other events, while the agent sees the call complete only once the user has responded.
Why store in settings? Bypass permissions should persist across sessions. VSCode settings provide durable storage with UI for manual editing if needed.
Why auto-dismiss cards? Once the user responds, the approval card is no longer actionable. Dismissing it keeps the conversation history clean and focused on the actual work.
Future Enhancements
Potential extensions to the authorization system:
- Per-tool policies - Trust specific tools (e.g., "always allow Read") while prompting for others
- Resource-based rules - Auto-approve file reads within certain directories
- Temporary sessions - "Bypass for this session" option that doesn't persist
- Approval history - Log of past approvals for security auditing
- Batch approvals - Approve multiple pending tool calls at once
Webview State Persistence
The webview must preserve chat history and UI state across hide/show cycles, but clear state when VSCode restarts. This requires distinguishing between temporary hiding and permanent disposal.
The Problem
VSCode webviews face two distinct lifecycle events that look identical from the webview's perspective:
- User collapses sidebar - Webview is hidden but should restore exactly when reopened
- VSCode restarts - Webview is disposed and recreated, should start fresh
Both events destroy and recreate the webview DOM. The webview cannot distinguish between them without additional context.
User expectation: Chat history persists within a VSCode session but doesn't carry over to the next session. Draft text should survive sidebar collapse but not VSCode restart.
Session ID Solution
The extension generates a session ID (UUID) once per VSCode session at activation. This ID is embedded in the webview HTML as a global JavaScript variable (window.SYMPOSIUM_SESSION_ID) in a script tag. The webview reads this variable synchronously on load and compares it against the session ID stored in saved state.
sequenceDiagram
participant VSCode
participant Extension
participant Webview
Note over VSCode: Extension activation
Extension->>Extension: Generate session ID
Note over VSCode: User opens sidebar
Extension->>Webview: Create webview with session ID
Webview->>Webview: Load saved state
alt Session IDs match
Webview->>Webview: Restore chat history
else Session IDs don't match (or no saved ID)
Webview->>Webview: Clear state, start fresh
end
Why this works:
- Within a session: Same session ID embedded every time, state restores
- After restart: New session ID generated, mismatch detected, state cleared
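The extension side of this scheme might look like the following sketch. The renderWebviewHtml function is a hypothetical name, and a real webview page would also need a Content-Security-Policy and its script tags, which are omitted here.

```typescript
import { randomUUID } from "node:crypto";

// Generated once per extension activation, so it changes on every VSCode restart.
const sessionId = randomUUID();

// Builds the webview HTML with the session ID available before any other script runs.
function renderWebviewHtml(id: string): string {
  // JSON.stringify yields a quoted JavaScript string literal; that is enough for a UUID,
  // which contains only hex digits and dashes.
  return `<!DOCTYPE html>
<html>
  <head>
    <script>window.SYMPOSIUM_SESSION_ID = ${JSON.stringify(id)};</script>
  </head>
  <body></body>
</html>`;
}

const html = renderWebviewHtml(sessionId);
```

Because the ID is part of the HTML itself, the webview can read it synchronously on load without a round trip to the extension.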
State Structure
The webview maintains three pieces of state:
- Session ID - Embedded from extension, used for freshness detection
- Last seen index - Message deduplication tracking (see Webview Lifecycle chapter)
- Mynah UI tabs - Opaque blob from mynahUI.getAllTabs() containing tab metadata, chat history, and UI configuration for all open tabs
Ownership: Webview owns this state entirely. Extension provides session ID but doesn't read or interpret webview state. The mynah-ui tabs structure is treated as opaque—the webview saves whatever getAllTabs() returns and restores it via mynah-ui's initialization config.
Storage: VSCode's getState()/setState() API. Persists across hide/show cycles and VSCode restarts.
State Lifecycle
Initial Load
- Webview reads embedded session ID from window.SYMPOSIUM_SESSION_ID
- Webview calls vscode.getState() to load saved state
- If savedState.sessionId === window.SYMPOSIUM_SESSION_ID, restore tabs
- Otherwise, call vscode.setState(undefined) to clear stale state
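The initial-load check can be sketched as a pure function. The SavedState shape follows the three pieces of state listed earlier, while selectFreshState is a hypothetical name used only for illustration.

```typescript
interface SavedState {
  sessionId?: string;
  lastSeenIndex?: Record<string, number>;
  tabs?: unknown; // Opaque blob from mynahUI.getAllTabs()
}

// Returns the state to restore, or undefined when it is missing or from another session.
function selectFreshState(
  saved: SavedState | undefined,
  currentSessionId: string,
): SavedState | undefined {
  if (saved !== undefined && saved.sessionId === currentSessionId) {
    return saved;
  }
  return undefined;
}
```

In the webview, saved would come from vscode.getState() and currentSessionId from window.SYMPOSIUM_SESSION_ID; an undefined result is the cue to call vscode.setState(undefined). State written before the session ID existed has no sessionId field, so it falls into the stale branch as well.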
During Use
State is saved after any UI change:
- User sends a message
- User opens or closes a tab
- Agent response is received and rendered
Performance: VSCode's setState() is optimized for frequent calls. No need to debounce or throttle state saves.
On Restart
- Extension activation generates new session ID
- Webview loads with new session ID embedded
- Session ID mismatch detected (old state has previous session's ID)
- State cleared, webview starts fresh
Message Deduplication
When the webview is hidden and shown, the extension may resend messages to ensure nothing was missed. The webview tracks the last message index seen per tab to avoid duplicates.
Last seen index map: { [tabId: string]: number }
Logic: If incoming message has index <= lastSeenIndex[tabId], ignore it. Otherwise, process and update lastSeenIndex[tabId].
Why needed? Extension buffers messages when webview is hidden (see Webview Lifecycle chapter). Replay strategy is "send everything since last known state" rather than tracking exactly which messages were delivered. Webview deduplicates to avoid showing duplicate response chunks.
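The deduplication rule fits in a few lines. This sketch assumes message indices start at 0 and increase within each tab; acceptMessage is a hypothetical name.

```typescript
type LastSeenIndex = Record<string, number>;

// Returns true when the message is new and records its index; false for a duplicate.
function acceptMessage(lastSeen: LastSeenIndex, tabId: string, index: number): boolean {
  const seen = lastSeen[tabId] ?? -1; // -1 means nothing has been seen for this tab yet
  if (index <= seen) {
    return false;
  }
  lastSeen[tabId] = index;
  return true;
}
```

Each tab is tracked separately, so a replay for one tab never suppresses fresh messages for another.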
Design Trade-offs
Why not retainContextWhenHidden?
VSCode offers retainContextWhenHidden: true to keep webview alive when hidden. This would eliminate the need for state persistence entirely.
Trade-off: Microsoft documentation warns of "much higher performance overhead." The webview remains in memory consuming resources even when not visible.
Decision: Use state persistence for lightweight chat interfaces. Reserve retainContextWhenHidden for complex UIs (e.g., embedded IDEs) that cannot be easily serialized.
Why not global state in extension?
Extension could store chat history in globalState instead of webview managing its own state.
Trade-off: Violates state ownership principle. Webview understands mynah-ui structure; extension shouldn't need to parse or manipulate UI state.
Decision: Webview owns UI state, extension provides coordination (session ID injection). Keeps extension simple and allows mynah-ui to evolve independently.
Why clear on restart instead of persisting?
Chat history could persist across VSCode restarts using globalState or workspace storage.
Trade-off: Users expect fresh sessions on restart. Long-lived history creates stale context and memory accumulation. Workspace-specific persistence could be added later if needed.
Decision: Session-scoped state matches user expectations and reduces complexity. Each VSCode session starts clean.
Migration and Compatibility
Old state without session ID: Treated as stale, cleared on first load. Ensures smooth upgrade path when session ID feature is added.
Future state format changes: Session ID check happens before parsing state structure. Mismatched session ID clears everything, eliminating need for explicit version migration.
Webview Lifecycle Management
VSCode can hide and show webviews at any time based on user actions. The extension must handle visibility changes gracefully to ensure no messages are lost and the UI appears responsive when shown.
Visibility States
A webview has three lifecycle states from the extension's perspective:
- Visible - User can see the webview, messages can be delivered immediately
- Hidden - Webview exists but is not visible (sidebar collapsed, tab not focused)
- Disposed - Webview destroyed, no communication possible
Key constraint: Hidden webviews cannot receive messages. Attempting to send via postMessage succeeds (no error) but messages are silently dropped.
The Hidden Webview Problem
sequenceDiagram
participant User
participant Extension
participant Webview
participant Agent
User->>Webview: Sends prompt
Webview->>Extension: prompt message
Extension->>Agent: Forward prompt
Agent->>Extension: Start streaming response
Note over User: User collapses sidebar
Extension->>Extension: Webview hidden (visible = false)
loop Agent still streaming
Agent->>Extension: response-chunk
Extension->>Webview: postMessage (silently dropped!)
Note over Webview: Message lost
end
Note over User: User reopens sidebar
Extension->>Extension: Webview visible again
Note over Webview: Missing chunks, partial response
Without buffering: Messages sent while webview is hidden are lost. When user reopens the sidebar, they see incomplete responses or missing messages entirely.
Message Buffering Strategy
The extension tracks webview visibility and buffers messages when hidden:
sequenceDiagram
participant Extension
participant Webview
participant Agent
Agent->>Extension: response-chunk
alt Webview visible
Extension->>Webview: Send immediately
else Webview hidden
Extension->>Extension: Add to buffer
end
Note over Extension: Webview becomes visible
Extension->>Webview: webview-ready request
Webview->>Extension: last-seen-index
loop For each buffered message
Extension->>Webview: Send buffered message
Webview->>Webview: Deduplicate if already seen
end
Extension->>Extension: Clear buffer
Buffer contents: Any message destined for the webview (response chunks, completion signals, error notifications).
Buffer lifetime: From webview hidden to webview shown. Cleared after replay.
Replay strategy: Send all buffered messages in order. Webview uses last-seen-index tracking (see State Persistence chapter) to ignore duplicates.
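A minimal sketch of the buffering strategy is shown below. WebviewOutbox and its method names are hypothetical, and the deliver callback stands in for the webview's postMessage call.

```typescript
// Delivers messages immediately while the webview is ready and queues them otherwise.
class WebviewOutbox<T> {
  private buffer: T[] = [];
  private ready = false; // A created-but-not-yet-visible webview counts as hidden
  private deliver: (message: T) => void;

  constructor(deliver: (message: T) => void) {
    this.deliver = deliver;
  }

  send(message: T): void {
    if (this.ready) {
      this.deliver(message);
    } else {
      this.buffer.push(message);
    }
  }

  // Called when the webview is hidden.
  markHidden(): void {
    this.ready = false;
  }

  // Called when the webview reports it is ready: replay in order, then clear the buffer.
  markReady(): void {
    this.ready = true;
    const queued = this.buffer;
    this.buffer = [];
    for (const message of queued) {
      this.deliver(message);
    }
  }
}
```

Swapping the buffer out before replaying keeps the order stable even if deliver triggers another send.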
Visibility Detection
The extension monitors visibility using VSCode's onDidChangeVisibility event on the WebviewView (WebviewPanel exposes the equivalent onDidChangeViewState):
stateDiagram-v2
[*] --> Created: resolveWebviewView
Created --> Visible: visible = true
Visible --> Hidden: visible = false
Hidden --> Visible: visible = true
Visible --> Disposed: onDidDispose
Hidden --> Disposed: onDidDispose
Disposed --> [*]
Event timing:
- onDidChangeVisibility fires when the visible property changes
- onDidDispose fires after webview is destroyed (too late for cleanup)
Race condition: Messages can arrive between "webview created" and "webview visible." Extension treats created-but-not-visible as hidden state and buffers messages.
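The lifecycle can be modeled as a small transition function. The type and function names below are hypothetical, and the event objects are simplified stand-ins for the VSCode callbacks.

```typescript
type ViewState = "created" | "visible" | "hidden" | "disposed";

type ViewEvent =
  | { type: "visibility"; visible: boolean }
  | { type: "dispose" };

// Computes the next lifecycle state; "disposed" is terminal.
function nextState(state: ViewState, event: ViewEvent): ViewState {
  if (state === "disposed" || event.type === "dispose") {
    return "disposed";
  }
  return event.visible ? "visible" : "hidden";
}

// Created-but-not-visible is treated like hidden, so messages are buffered in both states.
function shouldBuffer(state: ViewState): boolean {
  return state === "created" || state === "hidden";
}
```

Treating "created" as a buffering state closes the window in which messages arrive before the first visibility event.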
Webview-Ready Handshake
When the webview becomes visible (including initial creation), it announces readiness:
- Webview finishes initialization - DOM loads, webview script executes, session ID is checked, state is restored or cleared, mynah-ui is constructed with restored tabs (if any)
- Webview sends webview-ready - After mynah-ui initialization completes, webview sends message to extension including current last-seen-index map
- Extension replays buffered messages - Extension sends any messages that accumulated while webview was hidden
- Extension resumes normal message delivery - New messages are sent immediately as they arrive
Why handshake? Webview needs time to initialize mynah-ui and restore state. Sending messages immediately after visibility change could arrive before UI is ready to process them. The webview signals when it's actually ready to receive messages rather than the extension guessing based on visibility events.
Why include last-seen-index? Allows extension to avoid resending messages the webview already processed before hiding. Reduces redundant replay.
What triggers webview-ready? The webview sends this message during its initialization script, after the mynah-ui constructor completes and before setting up event handlers. On subsequent hide/show cycles, if mynah-ui remains initialized, the webview can send webview-ready immediately after becoming visible.
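How the extension might use the last-seen-index map from webview-ready to trim the replay can be sketched as follows. The IndexedMessage shape and selectForReplay are assumptions made for illustration; the real message format may differ.

```typescript
// Hypothetical shape of a buffered message.
interface IndexedMessage {
  tabId: string;
  index: number;
  payload: string;
}

// Keeps only the messages the webview has not processed yet, preserving order.
function selectForReplay(
  buffered: IndexedMessage[],
  lastSeen: Record<string, number>,
): IndexedMessage[] {
  return buffered.filter((m) => m.index > (lastSeen[m.tabId] ?? -1));
}
```

The webview still deduplicates on its side, so this filter is an optimization and not a correctness requirement.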
Agent Independence
The agent continues running regardless of webview visibility:
- Prompts sent while webview is hidden are still processed
- Responses generated while webview is hidden are buffered
- Sessions remain active across webview hide/show cycles
Why? Agent should not need to know about VSCode-specific concerns. Extension insulates agent from webview lifecycle complexity.
Trade-off: Long-running agent operations may complete while webview is hidden, buffering large amounts of data. If webview remains hidden for extended periods, memory usage grows. Current implementation has no buffer size limit.
Disposal Handling
When the webview is disposed (user closes sidebar permanently, workspace switch), buffered messages are discarded:
- Buffer is cleared
- Agent sessions continue running
- Next webview creation can restore tab → session mappings
Why not save buffered messages? Messages are ephemeral rendering updates. State persistence (see State Persistence chapter) handles durable state. Buffering is purely a delivery mechanism for real-time updates.
Design Rationale
Why buffer in extension instead of agent? Webview lifecycle is VSCode-specific. Agent shouldn't need VSCode-specific logic. Extension handles UI framework concerns.
Why replay all messages instead of tracking delivered? Simpler implementation. Webview deduplication is cheap (index comparison). Tracking exactly which messages were delivered requires more complex state management.
Why not queue in webview? Webview is destroyed/recreated when hidden in some cases. Can't rely on webview maintaining queue across lifecycle events. Extension has stable lifecycle tied to VSCode session.
Why immediate send when visible? Minimize latency. Users expect real-time streaming responses. Buffering only when necessary provides best UX.
VSCode Extension Integration Testing Guide
Table of Contents
- Overview
- Testing Types
- Setting Up Integration Tests
- Writing Integration Tests
- Testing Webviews
- Advanced Testing Scenarios
- Testing Best Practices
- Debugging Tests
- Common Patterns
- Tools and Libraries
Overview
VSCode extension testing involves multiple layers, with integration tests being crucial for verifying that your extension works correctly with the VSCode API in a real VSCode environment.
Why Integration Tests Matter:
- Unit tests can't verify VSCode API interactions
- Extensions can break due to VSCode API changes
- Manual testing doesn't scale as extensions grow
- Integration tests catch issues that unit tests miss
Key Principle: Follow the test pyramid - most tests should be fast unit tests, with a smaller number of integration tests for critical workflows.
Testing Types
Unit Tests
- Test pure logic in isolation
- No VSCode API required
- Fast and can run in any environment
- Use standard frameworks (Mocha, Jest, etc.)
- Good for: utility functions, data transformations, business logic
Integration Tests
- Run inside a real VSCode instance (Extension Development Host)
- Have access to full VSCode API
- Test extension behavior with actual VSCode
- Slower but more realistic
- Good for: command execution, UI interactions, API integrations
End-to-End Tests
- Automate the full VSCode UI using tools like WebdriverIO or Playwright
- Most complex to set up
- Test complete user workflows
- Good for: complex UIs, webviews, full user journeys
Setting Up Integration Tests
Option 1: Using @vscode/test-cli (Recommended)
The modern approach using the official VSCode test CLI.
Installation:
npm install --save-dev @vscode/test-cli @vscode/test-electron
package.json configuration:
{
"scripts": {
"test": "vscode-test"
}
}
Create .vscode-test.js or .vscode-test.mjs:
import { defineConfig } from '@vscode/test-cli';
export default defineConfig({
files: 'out/test/**/*.test.js',
version: 'stable', // or 'insiders' or specific version like '1.85.0'
workspaceFolder: './test-workspace',
mocha: {
ui: 'tdd',
timeout: 20000
}
});
Run tests:
npm test
Option 2: Using @vscode/test-electron Directly
For more control over the test runner.
Installation:
npm install --save-dev @vscode/test-electron mocha
Create src/test/runTest.ts:
import * as path from 'path';
import { runTests } from '@vscode/test-electron';
async function main() {
try {
// The folder containing the Extension Manifest package.json
const extensionDevelopmentPath = path.resolve(__dirname, '../../');
// The path to test runner
const extensionTestsPath = path.resolve(__dirname, './suite/index');
// Optional: specific workspace to open
const testWorkspace = path.resolve(__dirname, '../../test-fixtures');
// Download VS Code, unzip it and run the integration test
await runTests({
extensionDevelopmentPath,
extensionTestsPath,
launchArgs: [
testWorkspace,
'--disable-extensions' // Disable other extensions during testing
]
});
} catch (err) {
console.error('Failed to run tests', err);
process.exit(1);
}
}
main();
Create src/test/suite/index.ts (test runner):
import * as path from 'path';
import * as Mocha from 'mocha';
import { glob } from 'glob';
export function run(): Promise<void> {
const mocha = new Mocha({
ui: 'tdd',
color: true,
timeout: 20000
});
const testsRoot = path.resolve(__dirname, '.');
return new Promise((resolve, reject) => {
glob('**/**.test.js', { cwd: testsRoot }).then((files) => {
// Add files to the test suite
files.forEach(f => mocha.addFile(path.resolve(testsRoot, f)));
try {
// Run the mocha test
mocha.run(failures => {
if (failures > 0) {
reject(new Error(`${failures} tests failed.`));
} else {
resolve();
}
});
} catch (err) {
reject(err);
}
}).catch((err) => {
reject(err);
});
});
}
Project Structure
your-extension/
├── src/
│ ├── extension.ts
│ └── test/
│ ├── runTest.ts
│ └── suite/
│ ├── index.ts
│ ├── extension.test.ts
│ └── other.test.ts
├── test-fixtures/ # Optional test workspace
│ └── sample-file.txt
├── .vscode/
│ └── launch.json # Debug configuration
└── package.json
Writing Integration Tests
Basic Test Structure
import * as assert from 'assert';
import * as vscode from 'vscode';
suite('Extension Test Suite', () => {
vscode.window.showInformationMessage('Start all tests.');
test('Sample test', () => {
assert.strictEqual(-1, [1, 2, 3].indexOf(5));
assert.strictEqual(-1, [1, 2, 3].indexOf(0));
});
test('Extension should be present', () => {
assert.ok(vscode.extensions.getExtension('your-publisher.your-extension'));
});
test('Should register commands', async () => {
const commands = await vscode.commands.getCommands(true);
assert.ok(commands.includes('your-extension.yourCommand'));
});
});
Testing Commands
test('Execute command should work', async () => {
  // Type the result explicitly; executeCommand resolves to unknown by default
  const result = await vscode.commands.executeCommand<{ status: string }>('your-extension.yourCommand');
  assert.ok(result);
  assert.strictEqual(result.status, 'success');
});
Testing with Documents and Editors
test('Should modify document', async () => {
// Create a new document
const doc = await vscode.workspace.openTextDocument({
content: 'Hello World',
language: 'plaintext'
});
// Open it in an editor
const editor = await vscode.window.showTextDocument(doc);
// Execute your command that modifies the document
await vscode.commands.executeCommand('your-extension.formatDocument');
// Assert the document was modified
assert.strictEqual(doc.getText(), 'HELLO WORLD');
// Clean up
await vscode.commands.executeCommand('workbench.action.closeActiveEditor');
});
Asynchronous Operations and Waiting
function waitForCondition(
condition: () => boolean,
timeout: number = 5000,
message?: string
): Promise<void> {
return new Promise((resolve, reject) => {
const startTime = Date.now();
const interval = setInterval(() => {
if (condition()) {
clearInterval(interval);
resolve();
} else if (Date.now() - startTime > timeout) {
clearInterval(interval);
reject(new Error(message || 'Timeout waiting for condition'));
}
}, 50);
});
}
test('Wait for extension activation', async () => {
const extension = vscode.extensions.getExtension('your-publisher.your-extension');
if (!extension!.isActive) {
await extension!.activate();
}
await waitForCondition(
() => extension!.isActive,
5000,
'Extension did not activate'
);
assert.ok(extension!.isActive);
});
Testing Events
test('Should trigger onDidChangeTextDocument', async () => {
const doc = await vscode.workspace.openTextDocument({
content: 'Test',
language: 'plaintext'
});
let eventFired = false;
const disposable = vscode.workspace.onDidChangeTextDocument(e => {
if (e.document === doc) {
eventFired = true;
}
});
const editor = await vscode.window.showTextDocument(doc);
await editor.edit(edit => {
edit.insert(new vscode.Position(0, 0), 'Hello ');
});
await waitForCondition(() => eventFired, 2000);
assert.ok(eventFired, 'Event should have fired');
disposable.dispose();
});
Testing Webviews
Testing webviews is challenging because they run in an isolated context. There are several approaches:
Approach 1: Message-Based Testing (Recommended for Integration Tests)
Extension Side - Add Test Hooks:
class ChatPanel {
private panel: vscode.WebviewPanel;
private messageHandlers: Map<string, (message: any) => void> = new Map();
constructor(extensionUri: vscode.Uri) {
this.panel = vscode.window.createWebviewPanel(
'chat',
'Chat',
vscode.ViewColumn.One,
{
enableScripts: true,
retainContextWhenHidden: true
}
);
this.panel.webview.onDidReceiveMessage(message => {
// Handle normal messages
if (message.type === 'userMessage') {
this.handleUserMessage(message.text);
}
// Handle test messages (only in test environment)
if (process.env.VSCODE_TEST_MODE === 'true') {
if (message.type === 'test:state') {
const handler = this.messageHandlers.get('state');
handler?.(message);
}
}
});
}
// Public method for tests to get state
public requestState(): Promise<any> {
return new Promise((resolve) => {
this.messageHandlers.set('state', (message) => {
resolve(message.data);
this.messageHandlers.delete('state');
});
this.panel.webview.postMessage({ type: 'test:getState' });
});
}
// Method to send messages to webview
public sendMessage(text: string) {
this.handleUserMessage(text);
}
private handleUserMessage(text: string) {
// Your normal message handling logic
// ...
// Send to webview
this.panel.webview.postMessage({
type: 'agentResponse',
text: 'Response to: ' + text
});
}
}
Webview Side - Add Test Handlers:
// In your webview HTML/JS
const vscode = acquireVsCodeApi();
let messages = [];
// Handle messages from extension
window.addEventListener('message', event => {
const message = event.data;
if (message.type === 'agentResponse') {
messages.push(message);
updateUI();
}
// Test-specific handlers
if (message.type === 'test:getState') {
vscode.postMessage({
type: 'test:state',
data: {
messages: messages,
// other state...
}
});
}
});
// Handle user input
function sendMessage(text) {
vscode.postMessage({
type: 'userMessage',
text: text
});
}
Integration Test:
suite('Chat Webview Tests', () => {
let chatPanel: ChatPanel;
setup(async () => {
// Set test mode
process.env.VSCODE_TEST_MODE = 'true';
// Create chat panel
chatPanel = new ChatPanel(extensionUri);
});
teardown(async () => {
// Clean up
await vscode.commands.executeCommand('workbench.action.closeAllEditors');
process.env.VSCODE_TEST_MODE = 'false';
});
test('Chat state persistence', async () => {
// Send a message
chatPanel.sendMessage('Hello');
// Wait for response
await new Promise(resolve => setTimeout(resolve, 500));
// Get state before closing
const stateBefore = await chatPanel.requestState();
assert.strictEqual(stateBefore.messages.length, 1);
// Close and reopen (the chat panel lives in an editor column, so close the active editor)
await vscode.commands.executeCommand('workbench.action.closeActiveEditor');
await new Promise(resolve => setTimeout(resolve, 100));
// Reopen chat
chatPanel = new ChatPanel(extensionUri);
await new Promise(resolve => setTimeout(resolve, 500));
// Verify state persisted
const stateAfter = await chatPanel.requestState();
assert.strictEqual(stateAfter.messages.length, 1);
assert.strictEqual(stateAfter.messages[0].text, 'Response to: Hello');
});
});
Approach 2: Direct Extension-Side Testing
If your webview logic mostly lives on the extension side, test the handlers directly:
test('Handle user message', async () => {
const chatPanel = new ChatPanel(extensionUri);
// Simulate message from webview by calling the handler directly
await chatPanel.handleWebviewMessage({
type: 'userMessage',
text: 'Test message'
});
// Verify the extension's state changed
const messages = chatPanel.getMessages();
assert.strictEqual(messages.length, 1);
assert.strictEqual(messages[0].user, 'Test message');
});
Approach 3: Using WebdriverIO for True E2E Webview Testing
For complex webview UIs where you need to test the actual DOM:
Installation:
npm install --save-dev @wdio/cli @wdio/mocha-framework wdio-vscode-service
wdio.conf.ts:
import path from 'path';
export const config = {
specs: ['./test/e2e/**/*.test.ts'],
capabilities: [{
browserName: 'vscode',
browserVersion: 'stable',
'wdio:vscodeOptions': {
extensionPath: path.join(__dirname, '.'),
userSettings: {
'window.dialogStyle': 'custom'
}
}
}],
services: ['vscode'],
framework: 'mocha',
mochaOpts: {
ui: 'bdd',
timeout: 60000
}
};
E2E Test:
describe('Chat Webview E2E', () => {
it('should allow typing and sending messages', async () => {
const workbench = await browser.getWorkbench();
// Open your chat panel
await browser.executeWorkbench((vscode) => {
vscode.commands.executeCommand('your-extension.openChat');
});
// Wait for webview to appear
await browser.pause(1000);
// Switch to webview frame
const webview = await $('iframe.webview');
await browser.switchToFrame(webview);
// Interact with webview DOM
const input = await $('input[type="text"]');
await input.setValue('Hello from E2E test');
const sendButton = await $('button[type="submit"]');
await sendButton.click();
// Verify response appears
const messages = await $$('.message');
expect(messages).toHaveLength(2); // User message + bot response
});
});
Advanced Testing Scenarios
Testing with Mock Dependencies
// Create a mock agent for deterministic testing
class MockAgent {
async sendMessage(text: string): Promise<string> {
// Return deterministic responses for testing
if (text.includes('hello')) {
return 'Hi there!';
}
return 'I received: ' + text;
}
}
// Inject mock in tests
test('Chat with mock agent', async () => {
const mockAgent = new MockAgent();
const chatPanel = new ChatPanel(extensionUri, mockAgent);
chatPanel.sendMessage('hello');
await waitForCondition(() => chatPanel.getMessages().length > 0);
const messages = chatPanel.getMessages();
assert.strictEqual(messages[0].response, 'Hi there!');
});
Testing State Serialization
test('Serialize and restore webview state', async () => {
const chatPanel = new ChatPanel(extensionUri);
// Add some state
chatPanel.sendMessage('First message');
await new Promise(resolve => setTimeout(resolve, 200));
chatPanel.sendMessage('Second message');
await new Promise(resolve => setTimeout(resolve, 200));
// Get serialized state
const state = chatPanel.getSerializedState();
assert.ok(state);
assert.ok(state.messages);
// Close panel
chatPanel.dispose();
// Create new panel with saved state
const newChatPanel = ChatPanel.restore(extensionUri, state);
// Verify state was restored
const messages = newChatPanel.getMessages();
assert.strictEqual(messages.length, 2);
assert.strictEqual(messages[0].text, 'First message');
});
Testing with File System
import * as fs from 'fs/promises';
import * as path from 'path';
import * as os from 'os';
suite('File Operations', () => {
let tempDir: string;
setup(async () => {
// Create temp directory for test files
tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'vscode-test-'));
});
teardown(async () => {
// Clean up temp files
await fs.rm(tempDir, { recursive: true, force: true });
});
test('Should read and process files', async () => {
// Create test file
const testFile = path.join(tempDir, 'test.txt');
await fs.writeFile(testFile, 'test content');
// Open file in VSCode
const doc = await vscode.workspace.openTextDocument(testFile);
await vscode.window.showTextDocument(doc);
// Execute your command
await vscode.commands.executeCommand('your-extension.processFile');
// Verify results
const content = await fs.readFile(testFile, 'utf-8');
assert.strictEqual(content, 'PROCESSED: test content');
});
});
Testing Extension Configuration
test('Should respect configuration changes', async () => {
const config = vscode.workspace.getConfiguration('your-extension');
// Set test configuration
await config.update('someSetting', 'testValue',
vscode.ConfigurationTarget.Global);
// Execute command that uses config (typed explicitly; executeCommand resolves to unknown by default)
const result = await vscode.commands.executeCommand<{ settingValue: string }>('your-extension.useConfig');
assert.strictEqual(result.settingValue, 'testValue');
// Clean up
await config.update('someSetting', undefined,
vscode.ConfigurationTarget.Global);
});
Testing Best Practices
1. Isolation
- Each test should be independent
- Clean up resources in teardown()
- Don't rely on test execution order
- Close editors and panels after tests
2. Determinism
- Use mock agents or services for predictable behavior
- Avoid timing dependencies where possible
- Use proper wait conditions instead of arbitrary sleeps
- Control randomness (use seeds for random data)
3. Speed
- Keep integration tests focused
- Don't test every edge case in integration tests
- Use unit tests for detailed logic testing
- Disable unnecessary extensions with --disable-extensions
4. Clarity
- Use descriptive test names
- Comment complex setup/teardown logic
- Group related tests in suites
- Keep tests readable and maintainable
5. Reliability
- Handle asynchronous operations properly
- Use appropriate timeouts
- Add retry logic for flaky operations
- Log failures for debugging
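For the "control randomness" point under Determinism, one option is a small seeded generator so that generated test data is reproducible across runs. The sketch below uses a mulberry32-style generator; `seededRandom` and `randomString` are illustrative names, not part of any VSCode or Mocha API.

```typescript
// Returns a deterministic pseudo-random generator producing values in [0, 1).
// The same seed always yields the same sequence.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds a lowercase string of the given length from the supplied generator.
function randomString(next: () => number, length: number): string {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  let out = '';
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(next() * alphabet.length));
  }
  return out;
}
```

Creating the generator inside setup() with a fixed seed gives each test the same data regardless of execution order.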
Test Helpers
Create reusable test utilities:
// test/helpers.ts
import * as vscode from 'vscode';
export async function createTestDocument(
content: string,
language: string = 'plaintext'
): Promise<vscode.TextDocument> {
const doc = await vscode.workspace.openTextDocument({
content,
language
});
return doc;
}
export async function closeAllEditors(): Promise<void> {
await vscode.commands.executeCommand('workbench.action.closeAllEditors');
}
export async function waitForExtensionActivation(
extensionId: string
): Promise<void> {
const extension = vscode.extensions.getExtension(extensionId);
if (!extension) {
throw new Error(`Extension ${extensionId} not found`);
}
// activate() returns a Thenable, which has no .catch();
// awaiting it propagates activation failures to the caller
if (!extension.isActive) {
await extension.activate();
}
}
export class Deferred<T> {
promise: Promise<T>;
resolve!: (value: T) => void;
reject!: (error: Error) => void;
constructor() {
this.promise = new Promise((resolve, reject) => {
this.resolve = resolve;
this.reject = reject;
});
}
}
Debugging Tests
VSCode Launch Configuration
Add to .vscode/launch.json:
{
"version": "0.2.0",
"configurations": [
{
"name": "Extension Tests",
"type": "extensionHost",
"request": "launch",
"runtimeExecutable": "${execPath}",
"args": [
"--extensionDevelopmentPath=${workspaceFolder}",
"--extensionTestsPath=${workspaceFolder}/out/test/suite/index",
"--disable-extensions"
],
"outFiles": [
"${workspaceFolder}/out/test/**/*.js"
],
"preLaunchTask": "npm: compile"
}
]
}
Debugging Tips
- Set breakpoints in your test files
- Use Debug Console to inspect variables
- Run single tests by using .only():
test.only('This test will run alone', () => {
// ...
});
- Use console.log for quick debugging
- Check Extension Development Host output for extension logs
Running Specific Tests
# Run all tests
npm test
# Run tests matching pattern
npm test -- --grep "specific test name"
# Run with more verbose output
npm test -- --reporter spec
Common Patterns
Pattern: Testing Command Registration
test('Commands should be registered', async () => {
const commands = await vscode.commands.getCommands(true);
const expectedCommands = [
'your-extension.command1',
'your-extension.command2',
'your-extension.command3'
];
for (const cmd of expectedCommands) {
assert.ok(
commands.includes(cmd),
`Command ${cmd} should be registered`
);
}
});
Pattern: Testing Status Bar Items
test('Should show status bar item', async () => {
// Trigger action that creates status bar item
await vscode.commands.executeCommand('your-extension.showStatus');
// Status bar items aren't directly testable via API,
// so test the underlying state
const extension = vscode.extensions.getExtension('your-publisher.your-extension');
const statusItem = (extension?.exports as any).statusBarItem;
assert.ok(statusItem);
assert.strictEqual(statusItem.text, '$(check) Ready');
});
Pattern: Testing Tree Views
test('Tree view should show items', async () => {
// Get your tree data provider
const extension = vscode.extensions.getExtension('your-publisher.your-extension');
const treeProvider = (extension?.exports as any).treeDataProvider;
// Get root items
const items = await treeProvider.getChildren();
assert.ok(items.length > 0);
assert.strictEqual(items[0].label, 'Expected Item');
});
Pattern: Testing Quick Picks
test('Quick pick should show options', async () => {
// This is tricky - quick picks block execution
// One approach is to test the logic that generates options
const extension = vscode.extensions.getExtension('your-publisher.your-extension');
const getQuickPickItems = (extension?.exports as any).getQuickPickItems;
const items = await getQuickPickItems();
assert.strictEqual(items.length, 3);
assert.strictEqual(items[0].label, 'Option 1');
});
Tools and Libraries
Core Testing Tools
- @vscode/test-cli: Official CLI for running tests (recommended)
- @vscode/test-electron: Lower-level test runner for Desktop VSCode
- @vscode/test-web: Test runner for web extensions
- Mocha: Test framework used by VSCode (TDD or BDD style)
Additional Testing Tools
- WebdriverIO + wdio-vscode-service: E2E testing with webview support
- vscode-extension-tester: Alternative E2E testing tool by Red Hat
- Sinon: Mocking and stubbing library
- Chai: Assertion library (alternative to Node's assert)
Useful Utilities
// Helper to wait for promises with timeout
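export function withTimeout<T>(
promise: Promise<T>,
timeoutMs: number
): Promise<T> {
let timer: ReturnType<typeof setTimeout> | undefined;
const timeout = new Promise<T>((_, reject) => {
timer = setTimeout(() => reject(new Error('Timeout')), timeoutMs);
});
// Clear the timer once either promise settles so it cannot outlive the test
return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}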
// Helper to retry flaky operations
export async function retry<T>(
fn: () => Promise<T>,
attempts: number = 3,
delay: number = 100
): Promise<T> {
for (let i = 0; i < attempts; i++) {
try {
return await fn();
} catch (error) {
if (i === attempts - 1) throw error;
await new Promise(resolve => setTimeout(resolve, delay));
}
}
throw new Error('Retry failed');
}
Example: Complete Test Suite
Here's a complete example putting it all together:
import * as assert from 'assert';
import * as vscode from 'vscode';
import { ChatPanel } from '../../chatPanel';
suite('Chat Extension Test Suite', () => {
let extensionUri: vscode.Uri;
let chatPanel: ChatPanel | undefined;
suiteSetup(async () => {
// Run once before all tests
const extension = vscode.extensions.getExtension('your-publisher.your-extension');
assert.ok(extension);
if (!extension.isActive) {
await extension.activate();
}
extensionUri = extension.extensionUri;
});
setup(() => {
// Run before each test
process.env.VSCODE_TEST_MODE = 'true';
});
teardown(async () => {
// Run after each test
if (chatPanel) {
chatPanel.dispose();
chatPanel = undefined;
}
await vscode.commands.executeCommand('workbench.action.closeAllEditors');
process.env.VSCODE_TEST_MODE = 'false';
});
test('Extension should be present', () => {
assert.ok(vscode.extensions.getExtension('your-publisher.your-extension'));
});
test('Chat command should be registered', async () => {
const commands = await vscode.commands.getCommands(true);
assert.ok(commands.includes('your-extension.openChat'));
});
test('Should create chat panel', async () => {
chatPanel = new ChatPanel(extensionUri);
assert.ok(chatPanel);
});
test('Should send and receive messages', async function() {
this.timeout(5000);
chatPanel = new ChatPanel(extensionUri);
// Send message
chatPanel.sendMessage('Hello');
// Wait for response
await new Promise(resolve => setTimeout(resolve, 1000));
const state = await chatPanel.requestState();
assert.ok(state.messages.length > 0);
});
test('Should persist state across panel close/reopen', async function() {
this.timeout(10000);
// Create panel and send message
chatPanel = new ChatPanel(extensionUri);
chatPanel.sendMessage('Test message');
await new Promise(resolve => setTimeout(resolve, 500));
// Get state
const stateBefore = await chatPanel.requestState();
const messageCount = stateBefore.messages.length;
// Serialize and dispose
const serialized = chatPanel.getSerializedState();
chatPanel.dispose();
chatPanel = undefined;
// Wait a bit
await new Promise(resolve => setTimeout(resolve, 200));
// Restore
chatPanel = ChatPanel.restore(extensionUri, serialized);
await new Promise(resolve => setTimeout(resolve, 500));
// Verify
const stateAfter = await chatPanel.requestState();
assert.strictEqual(stateAfter.messages.length, messageCount);
});
});
Summary
Integration testing for VSCode extensions requires:
- Proper setup using @vscode/test-cli or @vscode/test-electron
- Strategic testing - focus on critical workflows, use unit tests for details
- Webview testing via message-passing or E2E tools like WebdriverIO
- Good practices - isolation, determinism, proper cleanup
- Debugging support with launch configurations
Testing webviews specifically requires creative approaches since they run in isolated contexts. The message-passing pattern works well for integration tests, while WebdriverIO is better for true E2E testing of complex UIs.
Remember: integration tests are slower than unit tests, so use them strategically for testing VSCode API interactions and critical user workflows.
Testing Implementation
This chapter documents the testing framework architecture for the VSCode extension, explaining how tests are structured and how to extend the testing system with new capabilities.
Architecture
Test Infrastructure
The test suite uses @vscode/test-cli, which downloads and runs a VSCode instance, loads the extension in development mode, and executes Mocha tests in the extension host context.
Configuration in .vscode-test.mjs:
import { defineConfig } from "@vscode/test-cli";
export default defineConfig({
files: "out/test/**/*.test.js",
version: "stable",
workspaceFolder: "./test-workspace",
mocha: { ui: "tdd", timeout: 20000 }
});
Tests run with:
npm test
Testing API Design
Rather than coupling tests to implementation details, the extension exposes a command-based testing API. Tests invoke VSCode commands which delegate to public testing methods on ChatViewProvider.
Pattern:
// In extension.ts - register test command
context.subscriptions.push(
vscode.commands.registerCommand("symposium.test.commandName",
async (arg1, arg2) => {
return await chatProvider.testingMethod(arg1, arg2);
}
)
);
// In test - invoke via command
const result = await vscode.commands.executeCommand(
"symposium.test.commandName",
arg1,
arg2
);
Current Testing Commands:
- symposium.test.simulateNewTab(tabId) - Create a tab
- symposium.test.getTabs() - Get list of tab IDs
- symposium.test.sendPrompt(tabId, prompt) - Send prompt to tab
- symposium.test.startCapturingResponses(tabId) - Begin capturing agent responses
- symposium.test.getResponse(tabId) - Get accumulated response text
- symposium.test.stopCapturingResponses(tabId) - Stop capturing
Adding New Test Commands
To test new behavior:
- Add public method to ChatViewProvider (or relevant class):
export class ChatViewProvider {
// Existing test methods...
public async newTestingMethod(param: string): Promise<ResultType> {
// Implementation that exposes needed behavior
return result;
}
}
- Register command in extension.ts:
context.subscriptions.push(
vscode.commands.registerCommand(
"symposium.test.newCommand",
async (param: string) => {
return await chatProvider.newTestingMethod(param);
}
)
);
- Use in tests:
test("Should test new behavior", async () => {
const result = await vscode.commands.executeCommand<{ expected: boolean }>(
"symposium.test.newCommand",
"test-param"
);
assert.strictEqual(result.expected, true);
});
Structured Logging for Assertions
Tests verify behavior through structured log events rather than console scraping.
Logger Architecture:
export class Logger {
private outputChannel: vscode.OutputChannel;
private eventEmitter = new vscode.EventEmitter<LogEvent>();
public get onLog(): vscode.Event<LogEvent> {
return this.eventEmitter.event;
}
public info(category: string, message: string, data?: any): void {
const event: LogEvent = {
timestamp: new Date(),
level: "info",
category,
message,
data
};
this.eventEmitter.fire(event);
this.outputChannel.appendLine(/* formatted output */);
}
}
Dual Purpose:
- Testing - Event emitter allows tests to capture and assert on events
- Live Debugging - Output channel shows logs in VSCode Output panel
Usage in Tests:
const logEvents: LogEvent[] = [];
const disposable = logger.onLog((event) => logEvents.push(event));
// ... perform test actions ...
const relevantEvents = logEvents.filter(
e => e.category === "agent" && e.message === "Session created"
);
assert.strictEqual(relevantEvents.length, 2);
Adding New Log Points
To make behavior testable:
- Add log statement in implementation:
logger.info("category", "Descriptive message", {
relevantData: value,
moreContext: other
});
- Filter and assert in tests:
const events = logEvents.filter(
e => e.category === "category" && e.message === "Descriptive message"
);
assert.ok(events.length > 0);
assert.strictEqual(events[0].data.relevantData, expectedValue);
Log Categories:
- webview - Webview lifecycle events
- agent - Agent spawning, sessions, communication
- Add new categories as needed for different subsystems
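The capture-and-filter steps shown in this section can be wrapped in a small helper so each test does not repeat the filter logic. This is a minimal sketch: `LogCapture` is a hypothetical name, and the `LogEvent` shape is inferred from the fields the Logger example populates rather than copied from the extension's source.

```typescript
// Shape inferred from the Logger example; `data` is left loosely typed.
interface LogEvent {
  timestamp: Date;
  level: string;
  category: string;
  message: string;
  data?: unknown;
}

// Collects events through a listener and exposes a query helper for assertions.
class LogCapture {
  private readonly events: LogEvent[] = [];

  // Pass this as the listener, e.g. logger.onLog(capture.record)
  readonly record = (event: LogEvent): void => {
    this.events.push(event);
  };

  // Returns events in the category and, if a message is given, with that exact message.
  matching(category: string, message?: string): LogEvent[] {
    return this.events.filter(
      e => e.category === category && (message === undefined || e.message === message)
    );
  }
}
```

Because `record` is an arrow-function property, it can be handed to `onLog` directly without binding.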
Design Decisions
Command-Based Testing API
Alternative: Direct access to ChatViewProvider internals from tests
Chosen: Command-based testing API
Rationale:
- Decouples tests from implementation details
- Tests the same code paths as real usage
- Allows refactoring without breaking tests
- Commands document the testing interface
Real Agents vs Mocks
Alternative: Mock agent responses with canned data
Chosen: Real ElizACP over ACP protocol
Rationale:
- Tests the full protocol stack (JSON-RPC, stdio, conductor)
- Verifies conductor integration
- Catches protocol-level bugs
- Provides realistic timing and behavior
ElizACP is lightweight, deterministic, and fast enough for testing.
Event-Based Logging
Alternative: Console output scraping with regex
Chosen: Event emitter with structured data
Rationale:
- Enables precise assertions on event counts and data
- Provides rich context for debugging
- Output panel visibility for live debugging
- No brittle string matching
- Same infrastructure serves testing and development
Test Isolation
Challenge: Tests share VSCode instance, agent processes persist across tests
Strategy: Make tests order-independent:
- Assert "spawned OR reused" rather than exact counts
- Focus on test-specific events (e.g., prompts sent, responses received)
- Capture logs from test start, not globally
- Don't assume clean state between tests
This allows the test suite to pass regardless of execution order.
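As a rough illustration of the first and third points, the helpers below scope assertions to events recorded since the test began and accept either a spawn or a reuse. The message strings `'Agent spawned'` and `'Agent reused'` are placeholders assumed for this sketch; the actual log messages may differ.

```typescript
interface AgentLogEvent {
  category: string;
  message: string;
}

// Hypothetical message names; substitute whatever the extension actually logs.
const AGENT_SPAWNED = 'Agent spawned';
const AGENT_REUSED = 'Agent reused';

// Events recorded since the test began, given the log length captured at its start.
function eventsSince(allEvents: AgentLogEvent[], startIndex: number): AgentLogEvent[] {
  return allEvents.slice(startIndex);
}

// True if an agent was spawned or reused; either outcome is acceptable
// because an earlier test may have left an agent process running.
function agentSpawnedOrReused(events: AgentLogEvent[]): boolean {
  return events.some(
    e => e.category === 'agent' &&
      (e.message === AGENT_SPAWNED || e.message === AGENT_REUSED)
  );
}
```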
Writing Tests
Tests live in src/test/*.test.ts and use Mocha's TDD interface:
suite("Feature Tests", () => {
test("Should do something", async function() {
this.timeout(20000); // Extend timeout for async operations
// Setup log capture
const logEvents: LogEvent[] = [];
const disposable = logger.onLog((event) => logEvents.push(event));
try {
// Perform test actions via commands
await vscode.commands.executeCommand("symposium.test.doSomething");
// Wait for async completion
await new Promise(resolve => setTimeout(resolve, 1000));
// Assert on results
const events = logEvents.filter(/* ... */);
assert.ok(events.length > 0);
} finally {
// Dispose even when an assertion fails so the listener does not leak into later tests
disposable.dispose();
}
});
});
Key Patterns:
- Use async function() (not arrow functions) to access this.timeout()
- Extend timeout for operations involving agent spawning
- Always dispose log listeners
- Add delays for async operations (agent responses, UI updates)
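Where a fixed delay proves flaky or slow, polling for a condition is one alternative, in line with the earlier advice to prefer wait conditions over arbitrary sleeps. The `waitFor` helper below is a hypothetical utility written for this sketch, not something the extension's test suite is known to provide.

```typescript
// Polls `condition` until it returns true, or rejects once `timeoutMs` has elapsed.
async function waitFor(
  condition: () => boolean | Promise<boolean>,
  timeoutMs: number = 5000,
  intervalMs: number = 50
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) {
      return;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}
```

In a test, this could replace a fixed sleep with something like `await waitFor(() => logEvents.length > 0)`, which returns as soon as the first event arrives.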
Related Documentation
- Message Protocol - Extension ↔ webview communication
- State Persistence - How state survives webview lifecycle
Implementation Status
This chapter tracks what's been implemented, what's in progress, and what's planned for the VSCode extension.
Core Architecture
- Three-layer architecture (webview/extension/agent)
- Message routing with UUID-based identification
- HomerActor mock agent with session support
- Webview state persistence with session ID checking
- Message buffering when webview is hidden
- Message deduplication via last-seen-index tracking
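To make the last two items concrete, here is one way buffering and last-seen-index deduplication could fit together. This is an illustrative sketch under assumed names (`OutgoingBuffer`, `DedupingReceiver`, `IndexedMessage`); the extension's actual classes and message shapes may differ.

```typescript
interface IndexedMessage {
  index: number; // monotonically increasing per tab (assumed)
  text: string;
}

// Extension side: holds messages while the webview is hidden.
class OutgoingBuffer {
  private pending: IndexedMessage[] = [];
  private visible = false;
  private readonly deliver: (message: IndexedMessage) => void;

  constructor(deliver: (message: IndexedMessage) => void) {
    this.deliver = deliver;
  }

  send(message: IndexedMessage): void {
    if (this.visible) {
      this.deliver(message);
    } else {
      this.pending.push(message);
    }
  }

  // Flushes everything queued while hidden, in order, when the webview reappears.
  setVisible(visible: boolean): void {
    this.visible = visible;
    if (visible) {
      const queued = this.pending;
      this.pending = [];
      for (const message of queued) {
        this.deliver(message);
      }
    }
  }
}

// Webview side: drops any message at or below the last index already seen.
class DedupingReceiver {
  private lastSeenIndex = -1;
  readonly received: string[] = [];

  accept(message: IndexedMessage): boolean {
    if (message.index <= this.lastSeenIndex) {
      return false;
    }
    this.lastSeenIndex = message.index;
    this.received.push(message.text);
    return true;
  }
}
```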
Error Handling
- Agent crash detection (partially implemented - detection works, UI error display incomplete)
- Complete error recovery UX (restart agent button, error notifications)
- Agent health monitoring and automatic restart
Agent Lifecycle
- Agent spawn on extension activation (partially implemented - spawn/restart works, graceful shutdown incomplete)
- Graceful agent shutdown on extension deactivation
- Agent process supervision and restart on crash
ACP Protocol Support
Connection & Lifecycle
- Client-side connection (ClientSideConnection)
- Protocol initialization and capability negotiation
- Session creation (newSession)
- Prompt sending (prompt)
- Streaming response handling (sessionUpdate)
- Session cancellation (session/cancel)
- Session mode switching (session/set_mode)
- Model selection (session/set_model)
- Authentication flow
Tool Permissions
- Permission request callback (requestPermission)
- MynahUI approval cards with approve/deny/bypass options
- Per-agent bypass permissions in settings
- Settings UI for managing bypass permissions
- Automatic approval when bypass enabled
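The interaction between the approve/deny/bypass options and per-agent bypass settings might look roughly like the following. The settings shape and the function names are assumptions made for illustration; they do not describe the extension's actual configuration schema.

```typescript
type ApprovalChoice = 'approve' | 'deny' | 'bypass';

// Hypothetical settings shape: agent name -> whether bypass is enabled.
type BypassSettings = Record<string, boolean>;

// True when a request from this agent can be approved without showing a card.
function shouldAutoApprove(agentName: string, settings: BypassSettings): boolean {
  return settings[agentName] === true;
}

// Applies the user's choice from an approval card. Returns whether the tool
// may run, plus the settings to persist (a new object if bypass was chosen).
function applyApprovalChoice(
  agentName: string,
  choice: ApprovalChoice,
  settings: BypassSettings
): { allowed: boolean; settings: BypassSettings } {
  if (choice === 'deny') {
    return { allowed: false, settings };
  }
  if (choice === 'bypass') {
    return { allowed: true, settings: { ...settings, [agentName]: true } };
  }
  return { allowed: true, settings };
}
```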
Session Updates
The client receives sessionUpdate notifications from the agent. Current support:
- agent_message_chunk - Display streaming text in chat UI
- tool_call - Logged to console (not displayed in UI)
- tool_call_update - Logged to console (not displayed in UI)
- Execution plans - Not implemented
- Thinking steps - Not implemented
- Custom update types - Not implemented
Gap: Tool calls are logged but not visually displayed. Users don't see which tools are being executed or their progress.
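The current handling described above amounts to a dispatch on the update type. The sketch below uses deliberately simplified, assumed payload shapes (`SessionUpdateSketch`) and a hypothetical `UpdateSink`; real ACP payloads carry more fields and different property names.

```typescript
// Simplified, assumed shapes for illustration only.
type SessionUpdateSketch =
  | { kind: 'agent_message_chunk'; text: string }
  | { kind: 'tool_call'; toolCallId: string; title: string }
  | { kind: 'tool_call_update'; toolCallId: string; status: string }
  | { kind: 'unsupported'; name: string }; // plans, thinking steps, custom types

// Hypothetical destination for updates: the chat UI and a log.
interface UpdateSink {
  appendToChat(text: string): void;
  log(line: string): void;
}

function handleSessionUpdate(update: SessionUpdateSketch, sink: UpdateSink): void {
  switch (update.kind) {
    case 'agent_message_chunk':
      // Streamed text goes straight to the chat UI.
      sink.appendToChat(update.text);
      break;
    case 'tool_call':
      // Logged only; not yet shown to the user.
      sink.log(`tool_call ${update.toolCallId}: ${update.title}`);
      break;
    case 'tool_call_update':
      sink.log(`tool_call_update ${update.toolCallId}: ${update.status}`);
      break;
    case 'unsupported':
      sink.log(`unhandled update type: ${update.name}`);
      break;
  }
}
```

Closing the gap noted above would mean replacing the two `sink.log` calls for tool calls with something that renders a card in the chat UI.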
File System Capabilities
- readTextFile - Stub implemented (throws "not yet implemented")
- writeTextFile - Stub implemented (throws "not yet implemented")
Current state: We advertise fs.readTextFile: false and fs.writeTextFile: false in capabilities, so agents know we don't support file operations.
Why not implemented: Requires VSCode workspace API integration and security considerations (which files can be accessed, path validation, etc.).
Terminal Capabilities
- createTerminal - Not implemented
- Terminal output streaming - Not implemented
- Terminal lifecycle (kill, release) - Not implemented
Why not implemented: Requires integrating with VSCode's terminal API and managing terminal lifecycle. Also involves security considerations around command execution.
Extension Points
- Extension methods (extMethod) - Not implemented
- Extension notifications (extNotification) - Not implemented
These allow protocol extensions beyond the ACP specification. Not currently needed but could be useful for custom features.
State Management
- Webview state persistence within session
- Chat history persistence across hide/show cycles
- Draft text persistence (FIXME: partially typed prompts are lost on hide/show)
- Session restoration after VSCode restart
- Workspace-specific state persistence
- Tab history and conversation export
MynahUI GUI Capabilities Guide
Overview
MynahUI is a data and event-driven chat interface library for browsers and webviews. This guide focuses on the interactive GUI capabilities relevant for building tool permission and approval workflows.
Core Concepts
Chat Items
Chat items are the fundamental building blocks of the conversation UI. Each chat item is a "card" that can contain various interactive elements.
Basic Structure:
interface ChatItem {
type: ChatItemType; // Determines positioning and styling
messageId?: string; // Unique identifier for updates
body?: string; // Markdown content
buttons?: ChatItemButton[]; // Action buttons
formItems?: ChatItemFormItem[]; // Form inputs
fileList?: FileList; // File tree display
followUp?: FollowUpOptions; // Quick action pills
// ... many more options
}
Chat Item Types:
- ANSWER / ANSWER_STREAM / CODE_RESULT → Left-aligned (AI responses)
- PROMPT / SYSTEM_PROMPT → Right-aligned (user messages)
- DIRECTIVE → Transparent, no background
Interactive Components
1. Buttons (ChatItemButton)
Buttons are the primary action mechanism for user approval/denial workflows.
Interface:
interface ChatItemButton {
id: string; // Unique identifier for the button
text?: string; // Button label
icon?: MynahIcons; // Optional icon
status?: 'main' | 'primary' | 'clear' | 'dimmed-clear' | 'info' | 'success' | 'warning' | 'error';
keepCardAfterClick?: boolean; // If false, removes card after click
waitMandatoryFormItems?: boolean; // Disables until mandatory form items are filled
disabled?: boolean;
description?: string; // Tooltip text
}
Status Colors:
- main - Primary brand color
- primary - Accent color
- success - Green (for approval actions)
- error - Red (for denial/rejection actions)
- warning - Yellow/orange
- info - Blue
- clear - Transparent background
Event Handler:
onInBodyButtonClicked: (tabId: string, messageId: string, action: {
id: string;
text?: string;
// ... other button properties
}) => void
Example - Approval Buttons:
{
type: ChatItemType.ANSWER,
messageId: 'tool-approval-123',
body: 'Tool execution request...',
buttons: [
{
id: 'approve-once',
text: 'Approve',
status: 'primary',
icon: MynahIcons.OK
},
{
id: 'approve-session',
text: 'Approve for Session',
status: 'success',
icon: MynahIcons.OK_CIRCLED
},
{
id: 'deny',
text: 'Deny',
status: 'error',
icon: MynahIcons.CANCEL,
keepCardAfterClick: false // Card disappears on denial
}
]
}
2. Form Items (ChatItemFormItem)
Form items allow collecting structured user input alongside button actions.
Available Form Types:
- textinput / textarea / numericinput / email
- select (dropdown)
- radiogroup / toggle
- checkbox / switch
- stars (rating)
- list (dynamic list of items)
- pillbox (tag/pill input)
Common Properties:
interface BaseFormItem {
id: string; // Unique identifier
type: string; // Form type
mandatory?: boolean; // Required field
title?: string; // Label
description?: string; // Help text
tooltip?: string; // Tooltip
value?: string; // Initial/current value
disabled?: boolean;
}
Example - Checkbox for "Remember Choice":
formItems: [
{
type: 'checkbox',
id: 'remember-approval',
label: 'Remember this choice for similar requests',
value: 'false',
tooltip: 'If checked, future requests for this tool will be automatically approved'
}
]
Example - Toggle for Options:
formItems: [
{
type: 'toggle',
id: 'approval-scope',
title: 'Approval Scope',
value: 'once',
options: [
{ value: 'once', label: 'Once', icon: MynahIcons.CHECK },
{ value: 'session', label: 'Session', icon: MynahIcons.STACK },
{ value: 'always', label: 'Always', icon: MynahIcons.OK_CIRCLED }
]
}
]
Event Handlers:
onFormChange: (tabId: string, messageId: string, item: ChatItemFormItem, value: any) => void
3. Content Display Options
Markdown Body
The body field supports full markdown including:
- Headings (#, ##, ###)
- Code blocks with syntax highlighting
- Inline code
- Links
- Lists (ordered/unordered)
- Blockquotes
- Tables
Example - Displaying Tool Parameters:
body: `### Tool Execution Request
**Tool:** \`read_file\`
**Parameters:**
\`\`\`json
{
"file_path": "/Users/niko/src/config.ts",
"offset": 0,
"limit": 100
}
\`\`\`
Do you want to allow this tool to execute?`
Custom Renderer
For complex layouts beyond markdown, use customRenderer with HTML markup:
customRenderer: `
<div>
<h4>Tool: <code>read_file</code></h4>
<table>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
<tr>
<td>file_path</td>
<td><code>/Users/niko/src/config.ts</code></td>
</tr>
<tr>
<td>offset</td>
<td><code>0</code></td>
</tr>
</table>
</div>
`
Information Cards
For hierarchical content with status indicators:
informationCard: {
title: 'Security Notice',
status: {
status: 'warning',
icon: MynahIcons.WARNING,
body: 'This tool will access filesystem resources'
},
description: 'Review the parameters carefully',
content: {
body: '... detailed information ...'
}
}
4. File Lists
Display file paths with actions and metadata:
fileList: {
fileTreeTitle: 'Files to be accessed',
filePaths: ['/src/config.ts', '/src/main.ts'],
details: {
'/src/config.ts': {
icon: MynahIcons.FILE,
description: 'Configuration file',
clickable: true
}
},
actions: {
'/src/config.ts': [
{
name: 'view-details',
icon: MynahIcons.EYE,
description: 'View file details'
}
]
}
}
Event Handler:
onFileActionClick: (tabId: string, messageId: string, filePath: string, actionName: string) => void
5. Follow-Up Pills
Quick action buttons displayed as pills:
followUp: {
text: 'Quick actions',
options: [
{
pillText: 'Approve All',
icon: MynahIcons.OK,
status: 'success',
prompt: 'approve-all' // Can trigger automatic actions
},
{
pillText: 'Deny All',
icon: MynahIcons.CANCEL,
status: 'error',
prompt: 'deny-all'
}
]
}
Event Handler:
onFollowUpClicked: (tabId: string, messageId: string, followUp: ChatItemAction) => void
Card Behavior Options
Visual States
{
status?: 'info' | 'success' | 'warning' | 'error'; // Colors the card border/icon
shimmer?: boolean; // Loading animation
canBeVoted?: boolean; // Show thumbs up/down
canBeDismissed?: boolean; // Show dismiss button
snapToTop?: boolean; // Pin to top of chat
border?: boolean; // Show border
hoverEffect?: boolean; // Highlight on hover
}
Layout Options
{
fullWidth?: boolean; // Stretch to container width
padding?: boolean; // Internal padding
contentHorizontalAlignment?: 'default' | 'center';
}
Card Lifecycle
{
keepCardAfterClick?: boolean; // Set on buttons: if false, the card is removed after the click
autoCollapse?: boolean; // Auto-collapse long content
}
Updating Chat Items
Chat items can be updated after creation:
// Add new chat item
mynahUI.addChatItem(tabId, chatItem);
// Update by message ID
mynahUI.updateChatAnswerWithMessageId(tabId, messageId, updatedChatItem);
// Update last streaming answer
mynahUI.updateLastChatAnswer(tabId, partialChatItem);
Complete Example: Tool Approval Workflow
// 1. Show tool approval request
mynahUI.addChatItem('main-tab', {
type: ChatItemType.ANSWER,
messageId: 'tool-approval-read-file-001',
status: 'warning',
icon: MynahIcons.LOCK,
body: `### Tool Execution Request
**Tool:** \`read_file\`
**Description:** Read file contents from the filesystem
**Parameters:**
\`\`\`json
{
"file_path": "/Users/nikomat/dev/mynah-ui/src/config.ts",
"offset": 0,
"limit": 2000
}
\`\`\`
**Security:** This tool will access local filesystem resources.`,
formItems: [
{
type: 'checkbox',
id: 'remember-read-file',
label: 'Trust this tool for the remainder of the session',
value: 'false'
}
],
buttons: [
{
id: 'approve',
text: 'Approve',
status: 'success',
icon: MynahIcons.OK,
keepCardAfterClick: false
},
{
id: 'deny',
text: 'Deny',
status: 'error',
icon: MynahIcons.CANCEL,
keepCardAfterClick: false
},
{
id: 'details',
text: 'More Details',
status: 'clear',
icon: MynahIcons.INFO
}
]
});
// 2. Handle button clicks
mynahUI.onInBodyButtonClicked = (tabId, messageId, action) => {
if (messageId === 'tool-approval-read-file-001') {
const formState = mynahUI.getFormState(tabId, messageId);
const rememberChoice = formState['remember-read-file'] === 'true';
switch (action.id) {
case 'approve':
// Execute tool
// If rememberChoice, add to session whitelist
break;
case 'deny':
// Cancel tool execution
break;
case 'details':
// Show additional information
mynahUI.updateChatAnswerWithMessageId(tabId, messageId, {
informationCard: {
title: 'Tool Details',
content: {
body: 'Detailed tool documentation...'
}
}
});
break;
}
}
};
Progressive Updates
For multi-step approval flows, you can progressively update the same card:
// Initial request
mynahUI.addChatItem(tabId, {
messageId: 'approval-001',
type: ChatItemType.ANSWER,
body: 'Waiting for approval...',
shimmer: true
});
// User approves
mynahUI.updateChatAnswerWithMessageId(tabId, 'approval-001', {
body: 'Approved! Executing tool...',
shimmer: true,
buttons: [] // Remove buttons
});
// Execution complete
mynahUI.updateChatAnswerWithMessageId(tabId, 'approval-001', {
body: 'Tool execution complete!',
shimmer: false,
status: 'success',
icon: MynahIcons.OK_CIRCLED
});
Sticky Cards
For persistent approval requests that stay above the prompt:
mynahUI.updateStore(tabId, {
promptInputStickyCard: {
messageId: 'persistent-approval',
body: 'Multiple tools are waiting for approval',
status: 'warning',
icon: MynahIcons.WARNING,
buttons: [
{
id: 'review-pending',
text: 'Review Pending',
status: 'info'
}
]
}
});
// Clear sticky card
mynahUI.updateStore(tabId, {
promptInputStickyCard: null
});
Best Practices for Tool Approval UI
- Clear Tool Identity: Always show tool name prominently
- Parameter Visibility: Display all parameters the tool will receive
- Security Context: Indicate security implications (file access, network, etc.)
- Action Clarity: Use clear "Approve" vs "Deny" with appropriate status colors
- Scope Options: Provide "once", "session", "always" choices when appropriate
- Non-blocking: Use keepCardAfterClick: false to auto-dismiss after approval
- Progressive Disclosure: Start simple, show details on demand
- Feedback: Update card state to show execution progress after approval
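Several of these practices (tool identity, parameter visibility, security context, action clarity, auto-dismiss) can be captured in one builder so every approval card looks the same. The sketch below defines local stand-in types so it runs on its own; `buildApprovalCard` is a hypothetical helper, and the `'answer'` literal is assumed to correspond to ChatItemType.ANSWER. Real code would use the ChatItem and ChatItemButton types from mynah-ui instead.

```typescript
// Local stand-ins for the mynah-ui types (assumptions for this sketch).
interface ApprovalButtonSketch {
  id: string;
  text: string;
  status: 'success' | 'error';
  keepCardAfterClick: boolean;
}

interface ApprovalCardSketch {
  type: 'answer'; // assumed equivalent of ChatItemType.ANSWER
  messageId: string;
  status: 'warning';
  body: string;
  buttons: ApprovalButtonSketch[];
}

function buildApprovalCard(
  messageId: string,
  toolName: string,
  params: Record<string, unknown>,
  securityNote: string
): ApprovalCardSketch {
  const fence = '`'.repeat(3); // markdown code fence
  const body = [
    '### Tool Execution Request',
    '**Tool:** `' + toolName + '`',
    '**Parameters:**',
    fence + 'json',
    JSON.stringify(params, null, 2), // show every parameter the tool will receive
    fence,
    '**Security:** ' + securityNote
  ].join('\n');

  return {
    type: 'answer',
    messageId,
    status: 'warning',
    body,
    buttons: [
      // keepCardAfterClick: false dismisses the card once the user decides
      { id: 'approve', text: 'Approve', status: 'success', keepCardAfterClick: false },
      { id: 'deny', text: 'Deny', status: 'error', keepCardAfterClick: false }
    ]
  };
}
```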
Key Event Handlers
interface MynahUIProps {
onInBodyButtonClicked?: (tabId: string, messageId: string, action: ChatItemButton) => void;
onFollowUpClicked?: (tabId: string, messageId: string, followUp: ChatItemAction) => void;
onFormChange?: (tabId: string, messageId: string, item: ChatItemFormItem, value: any) => void;
onFileActionClick?: (tabId: string, messageId: string, filePath: string, actionName: string) => void;
// ... many more
}
Reference
- Full documentation: mynah-ui/docs/DATAMODEL.md
- Type definitions: mynah-ui/src/static.ts
- Examples: mynah-ui/example/src/samples/
VSCode Webview State Preservation: Complete Guide for Chat Interfaces
Your mynah-ui chat extension can preserve draft text automatically using VSCode's built-in APIs. The key insight: there's no "last chance" event before destruction, so you must save continuously. The official VSCode documentation shows setState() being called every 100ms without performance concerns, and popular extensions use debounced saves at 300-500ms intervals.
VSCode webview lifecycle: No beforeunload safety net
VSCode webviews do not expose a beforeunload or similar "last chance" event through the extension API. This is the most critical finding for your implementation. You have exactly two lifecycle events to work with:
onDidChangeViewState fires when the webview's visibility changes or moves to a different editor column. It provides access to webviewPanel.visible and webviewPanel.viewColumn properties. Critically, this event does NOT fire when the webview is disposed—only when it becomes hidden or changes position. The browser's beforeunload event exists within the webview iframe itself but cannot communicate asynchronously back to your extension, making it effectively useless for state preservation.
onDidDispose fires after the webview is already destroyed—too late for state saving. Use it only for cleanup operations like canceling timers or removing subscriptions. By the time this event fires, your webview context is gone and any unsaved state is lost.
The recommended pattern is to save state continuously rather than trying to intercept disposal. VSCode's official documentation explicitly shows this approach, with their example calling setState() every 100ms in a setInterval without any warnings about performance impact.
setState performance: Call it freely with light debouncing
The performance cost of vscode.setState() is remarkably low. Microsoft's official documentation states that "getState and setState are the preferred way to persist state, as they have much lower performance overhead than retainContextWhenHidden." The API appears to be synchronous, accepts JSON-serializable objects, and has no documented size limits or throttling mechanisms.
The official VSCode webview sample demonstrates calling setState() 10 times per second (every 100ms) without any performance warnings or caveats. This suggests the operation is highly optimized and suitable for frequent updates. Real-world extension analysis shows a community consensus around 300-500ms debounce intervals for text input, which balances responsiveness with minimal overhead.
Is it acceptable to call on every keystroke? Technically yes, but practically you should debounce. Here's why: while setState itself is lightweight, debouncing serves UX purposes more than performance. A 300-500ms debounce provides a better user experience by avoiding excessive state churn while ensuring draft preservation happens quickly enough that users rarely lose more than half a second of typing if they close the sidebar mid-sentence.
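A debounced saver along these lines is one way to implement the pattern. It is a sketch under stated assumptions: `WebviewStateApi` is a minimal local view of the object that acquireVsCodeApi() returns inside a webview, while `createDraftSaver` and `DraftState` are hypothetical names introduced here.

```typescript
// Minimal view of the webview API object (only the state methods).
interface WebviewStateApi<S> {
  getState(): S | undefined;
  setState(state: S): void;
}

interface DraftState {
  draft: string;
}

function createDraftSaver(api: WebviewStateApi<DraftState>, delayMs: number = 300) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pending: string | undefined;

  // Writes the latest pending draft, if any, and cancels the scheduled write.
  const flush = (): void => {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
    if (pending !== undefined) {
      api.setState({ draft: pending });
      pending = undefined;
    }
  };

  return {
    // Call on every input event; only the latest text is written after the pause.
    update(text: string): void {
      pending = text;
      if (timer !== undefined) {
        clearTimeout(timer);
      }
      timer = setTimeout(flush, delayMs);
    },
    // Write immediately, for example when the active tab changes.
    flush
  };
}
```

On startup, the webview would read `api.getState()?.draft` and put that text back into the input field.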
Popular extension patterns: The REST Client extension saves request history to globalState immediately on submission. The GistPad extension uses a 1500ms debounce for search input updates. The Continue AI extension relies on message passing between webview and extension for complex state management rather than setState alone. Most extensions combine approaches—using setState for immediate UI state and globalState for data that must survive webview disposal.
mynah-ui API: Event-driven architecture with limited draft access
mynah-ui does not expose a direct API to retrieve current draft text from input fields in its public documentation. The library follows a strictly event-driven pattern where user input is captured through the onChatPrompt callback, which fires when users submit messages—not during typing.
The getAllTabs() method is not explicitly documented as including unsent draft messages. Based on the library's architecture, tabs contain conversation history and submitted messages, not draft state. You'll need to implement your own draft tracking by monitoring the underlying DOM input elements or maintaining draft state in your extension code.
Events you can hook into:
- onChatPrompt: Fires when users submit a message (your primary input capture point)
- onTabChange: Fires when switching between tabs (good opportunity to save current draft)
- onTabAdd/onTabRemove: Tab lifecycle events
mynah-ui uses a centralized reactive data store where updates automatically trigger re-renders of subscribed components. The library prioritizes declarative state management over imperative queries, which is why draft access methods aren't prominent. For your use case, you'll likely need to access the input DOM elements directly or maintain a parallel draft state structure outside mynah-ui.
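One way to maintain the parallel draft state mentioned above is a small store keyed by tab ID that lives entirely outside mynah-ui. This is a sketch under that assumption; DraftStore and StoredDraft are hypothetical names, and nothing here calls into mynah-ui itself.

```typescript
// Hypothetical draft store kept alongside mynah-ui, keyed by tab ID.
interface StoredDraft {
  text: string;
  timestamp: number;
}

class DraftStore {
  private readonly drafts = new Map<string, StoredDraft>();

  // Record the latest text for a tab; an empty string removes the draft.
  update(tabId: string, text: string, now: number = Date.now()): void {
    if (text === "") {
      this.drafts.delete(tabId);
    } else {
      this.drafts.set(tabId, { text, timestamp: now });
    }
  }

  // Returns the saved text for a tab, or "" when there is none.
  textFor(tabId: string): string {
    const draft = this.drafts.get(tabId);
    return draft === undefined ? "" : draft.text;
  }

  clear(tabId: string): void {
    this.drafts.delete(tabId);
  }

  tabsWithDrafts(): string[] {
    return Array.from(this.drafts.keys());
  }

  // Plain JSON-serializable copy, suitable for passing to vscode.setState().
  snapshot(): Record<string, StoredDraft> {
    const out: Record<string, StoredDraft> = {};
    this.drafts.forEach((draft, tabId) => {
      out[tabId] = { text: draft.text, timestamp: draft.timestamp };
    });
    return out;
  }
}
```

Input listeners call update(), the send handler calls clear(), and tabsWithDrafts() can drive a draft indicator on tabs.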
User expectations: Auto-save is non-negotiable
Users expect automatic draft preservation based on industry-standard chat applications. Research into Slack, Teams, Discord, and even recent iOS updates reveals consistent patterns:
Automatic per-conversation drafts are table stakes. Slack saves drafts automatically per channel, Teams maintains drafts per conversation, and Discord preserves drafts across app restarts. All provide visual indicators (bold channel names, "[Draft]" labels, or draft count badges) showing where unsent messages exist.
VSCode users are already frustrated by draft loss in existing extensions. GitHub issues show significant pain points: users lose hours of work when chat history disappears during workspace switches, and Claude Code extension users report losing conversation context due to inadequate state preservation. One user complaint: "Lost chats today and am here to express how insane it is that this is even possible."
Expected behavior for your sidebar: When users close the sidebar while typing, they expect that text to reappear when they reopen it—period. This expectation comes from every major communication platform they use daily. Losing draft text is not acceptable. Your implementation must preserve this state automatically, invisibly, and reliably.
VSCode's built-in GitHub Copilot Chat demonstrates the acceptable standard: chat sessions persist within a workspace, history is accessible via "Show Chats...", and sessions can be exported. However, even Copilot Chat has limitations—history loss when switching workspaces causes major user frustration, proving that inadequate persistence is a critical UX failure.
Recommended implementation: Hybrid approach with debounced auto-save
The optimal pattern combines immediate setState() for UI state with debounced saves for draft content, backed by globalState for persistence beyond webview lifecycle. Here's the complete implementation strategy:
Pattern 1: Continuous state preservation in webview
// Inside your webview script
const vscode = acquireVsCodeApi();

// Restore previous state immediately
const previousState = vscode.getState() || {
  drafts: {}, // keyed by tab/conversation ID
  activeTab: null
};

// Debounced save function (500ms is the sweet spot)
let saveTimeout;
function saveDraftDebounced(tabId, draftText) {
  clearTimeout(saveTimeout);
  saveTimeout = setTimeout(() => {
    const currentState = vscode.getState() || { drafts: {} };
    currentState.drafts[tabId] = {
      text: draftText,
      timestamp: Date.now()
    };
    vscode.setState(currentState);
    // Also notify extension for globalState backup
    vscode.postMessage({
      command: 'saveDraft',
      tabId: tabId,
      text: draftText
    });
  }, 500);
}

// Hook into mynah-ui or direct DOM events
// Since mynah-ui doesn't expose input change events, access the DOM
const chatInput = document.querySelector('[data-mynah-chat-input]'); // adjust selector
if (chatInput) {
  chatInput.addEventListener('input', (e) => {
    const currentTab = getCurrentTabId(); // your function to get active tab
    saveDraftDebounced(currentTab, e.target.value);
  });
}

// Immediate save on tab switch (use mynah-ui's onTabChange)
const mynahUI = new MynahUI({
  onTabChange: (tabId) => {
    // Save current draft immediately before switching.
    // Assumes getCurrentTabId() and getCurrentDraftText() still refer to the tab
    // being left; if mynah-ui has already switched, track the previous tab ID yourself.
    clearTimeout(saveTimeout);
    const currentDraft = getCurrentDraftText();
    if (currentDraft) {
      const state = vscode.getState() || { drafts: {} };
      state.drafts[getCurrentTabId()] = {
        text: currentDraft,
        timestamp: Date.now()
      };
      vscode.setState(state);
    }
    // Restore draft for new tab
    const newState = vscode.getState();
    if (newState?.drafts?.[tabId]) {
      restoreDraftToInput(newState.drafts[tabId].text);
    }
  },
  onChatPrompt: (tabId, prompt) => {
    // Cancel any pending debounced save so it cannot re-create the draft after send
    clearTimeout(saveTimeout);
    // Clear draft after successful send
    const state = vscode.getState() || { drafts: {} };
    delete state.drafts[tabId];
    vscode.setState(state);
    vscode.postMessage({
      command: 'clearDraft',
      tabId: tabId
    });
  }
});

// Restore drafts on load
window.addEventListener('load', () => {
  const state = vscode.getState();
  const activeTab = getCurrentTabId();
  if (state?.drafts?.[activeTab]?.text) {
    restoreDraftToInput(state.drafts[activeTab].text);
  }
});
Pattern 2: Extension-side backup with globalState
// In your extension code (extension.ts)
import * as vscode from 'vscode';

type DraftMap = Record<string, { text: string; timestamp: number; workspace: string }>;

export function activate(context: vscode.ExtensionContext) {
  // Assumes webviewPanel was created earlier (e.g. with vscode.window.createWebviewPanel)
  // Handle messages from webview
  webviewPanel.webview.onDidReceiveMessage(
    message => {
      switch (message.command) {
        case 'saveDraft': {
          // Save to globalState as backup
          const drafts = context.globalState.get<DraftMap>('chatDrafts', {});
          drafts[message.tabId] = {
            text: message.text,
            timestamp: Date.now(),
            workspace: vscode.workspace.name || 'default'
          };
          context.globalState.update('chatDrafts', drafts);
          break;
        }
        case 'clearDraft': {
          const currentDrafts = context.globalState.get<DraftMap>('chatDrafts', {});
          delete currentDrafts[message.tabId];
          context.globalState.update('chatDrafts', currentDrafts);
          break;
        }
        case 'getDrafts': {
          // Send stored drafts back to webview for restoration
          const storedDrafts = context.globalState.get<DraftMap>('chatDrafts', {});
          webviewPanel.webview.postMessage({
            command: 'restoreDrafts',
            drafts: storedDrafts
          });
          break;
        }
      }
    },
    undefined,
    context.subscriptions
  );

  // Implement WebviewPanelSerializer for cross-restart persistence.
  // Serializers apply to WebviewPanel; a sidebar WebviewView is restored through
  // its provider's resolveWebviewView instead.
  vscode.window.registerWebviewPanelSerializer('yourViewType', {
    async deserializeWebviewPanel(webviewPanel: vscode.WebviewPanel, state: any) {
      // Restore webview with saved state (getWebviewContent is your own HTML builder)
      webviewPanel.webview.html = getWebviewContent();
      // Send drafts from globalState
      const drafts = context.globalState.get<DraftMap>('chatDrafts', {});
      webviewPanel.webview.postMessage({
        command: 'restoreDrafts',
        drafts: drafts
      });
    }
  });
}
Pattern 3: Flush on critical visibility changes
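// In extension code: listen to visibility changes.
// onDidChangeViewState is a WebviewPanel event; a sidebar WebviewView exposes
// onDidChangeVisibility instead. Treat this as best-effort: messages are only
// delivered while the webview is live, so do not rely on it as the only save path.
webviewPanel.onDidChangeViewState(
  e => {
    if (!e.webviewPanel.visible) {
      // Webview is becoming hidden - request final state save
      webviewPanel.webview.postMessage({
        command: 'flushState'
      });
    }
  },
  null,
  context.subscriptions
);

// In webview: handle flush command
window.addEventListener('message', event => {
  const message = event.data;
  if (message.command === 'flushState') {
    // Immediately save current state without debouncing
    const currentDraft = getCurrentDraftText();
    if (currentDraft) {
      // Merge into the existing state so drafts for other tabs are not overwritten
      const state = vscode.getState() || { drafts: {} };
      state.drafts[getCurrentTabId()] = {
        text: currentDraft,
        timestamp: Date.now()
      };
      vscode.setState(state);
      vscode.postMessage({
        command: 'saveDraft',
        tabId: getCurrentTabId(),
        text: currentDraft
      });
    }
  }
});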
Trade-offs and performance considerations
Debounce intervals tested in the wild:
- 100ms (VSCode official example): No debounce, continuous updates, perfect for demos but potentially excessive
- 300-500ms (community standard): Optimal balance between responsiveness and efficiency—recommended for most chat interfaces
- 1500ms (GistPad search): Too long for draft preservation, risks losing 1.5 seconds of typing
- Immediate (on send/tab switch): Essential for critical actions where data loss is unacceptable
The undo/redo conflict: Custom text editors that debounce updates face a specific problem—hitting undo before the debounce fires causes undo to jump back to a previous state instead of the last edit. For chat interfaces this is less critical since most chat inputs don't implement complex undo stacks, but be aware if you're building rich text editing features.
Memory and storage considerations: setState() stores data in memory until the webview is disposed. globalState persists to disk and survives VSCode restarts but should be used judiciously for data that truly needs long-term persistence. For your chat extension, draft text is lightweight (typically under 10KB per draft) and appropriate for globalState backup.
retainContextWhenHidden alternative: You could set retainContextWhenHidden: true in your webview options to keep the entire webview context alive while it is hidden. That removes the need to restore state on hide and show, although it does not help once the webview is disposed or VSCode restarts, and Microsoft explicitly warns about "much higher performance overhead." Only use this for complex UIs that cannot be quickly serialized and restored. For a chat interface with text drafts, setState/getState is the better fit.
Specific recommendations for your mynah-ui extension
Your implementation checklist:
- Implement debounced auto-save at 500ms intervals for draft text as users type
- Save immediately on tab switches using mynah-ui's onTabChange event
- Clear drafts after successful message submission in the onChatPrompt handler
- Back up drafts to globalState via message passing to your extension for persistence beyond webview lifecycle
- Restore drafts on webview load by checking both vscode.getState() and requesting globalState from your extension
- Use onDidChangeViewState to trigger immediate flush when the webview becomes hidden
- Implement WebviewPanelSerializer if you want drafts to survive VSCode restarts (optional but recommended)
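When drafts are restored from both vscode.getState() and the globalState backup, the two copies can disagree. One reasonable policy, sketched below as an assumption rather than something the APIs prescribe, is to keep whichever copy has the newer timestamp for each tab. mergeDrafts and the type names are hypothetical.

```typescript
// Hypothetical merge of two draft maps: newest timestamp wins per tab.
interface TimedDraft {
  text: string;
  timestamp: number;
}
type DraftsByTab = Record<string, TimedDraft>;

function mergeDrafts(
  webviewDrafts: DraftsByTab | undefined,
  backupDrafts: DraftsByTab | undefined,
): DraftsByTab {
  const merged: DraftsByTab = {};
  // The backup is processed first so the webview copy wins on equal timestamps.
  for (const source of [backupDrafts, webviewDrafts]) {
    if (source === undefined) continue;
    for (const tabId of Object.keys(source)) {
      const candidate = source[tabId];
      const existing = merged[tabId];
      if (existing === undefined || candidate.timestamp >= existing.timestamp) {
        merged[tabId] = candidate;
      }
    }
  }
  return merged;
}
```

The webview would call this once it has received the restoreDrafts message, then write the merged result back with setState().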
Accessing mynah-ui input fields: Since mynah-ui doesn't expose a direct draft text API, you'll need to either:
- Query the DOM directly for the input element (look for textarea or input fields within mynah-ui's rendered structure)
- Maintain a parallel state object that tracks input as users type by monitoring DOM events
- Wrap mynah-ui's initialization and hook into its input element references after construction
Visual indicators to add: Following industry standards, consider adding:
- "[Draft]" label next to tabs with unsaved text
- Badge count showing number of tabs with drafts
- Timestamp showing when draft was last saved
- Warning dialog if user attempts to close VSCode with unsaved drafts (though VSCode doesn't provide a beforeunload hook, you could show a modal when dispose is called)
Testing your implementation:
- Type draft text and close the sidebar—text should reappear on reopen
- Type draft in one tab, switch tabs, return—draft should persist
- Reload the webview (Developer: Reload Webview command)—draft should restore
- Restart VSCode—draft should restore if using WebviewPanelSerializer
- Type draft, wait only 200ms, close sidebar—draft should still save (test your debounce timing)
Code you can ship today
Here's a minimal implementation you can adapt to your existing code. Treat the DOM selector and the mynah-ui hooks as assumptions to verify against the version you use:
// Add to your webview script
class DraftManager {
  constructor(vscode, mynahUI) {
    this.vscode = vscode;
    this.mynahUI = mynahUI;
    this.saveTimeout = null;
    this.DEBOUNCE_MS = 500;
    this.init();
  }

  init() {
    // Restore drafts on load
    this.restoreAllDrafts();
    // Hook into input changes
    this.monitorInput();
    // Best-effort flush on unload; webviews do not guarantee this event fires
    window.addEventListener('beforeunload', () => this.flushAll());
  }

  monitorInput() {
    // Find mynah-ui input element (adjust selector as needed)
    const attach = () => {
      const input = document.querySelector('textarea[data-mynah-input]');
      if (input && !input.dataset.draftHandlerAttached) {
        input.dataset.draftHandlerAttached = 'true';
        input.addEventListener('input', (e) => {
          this.saveDraft(this.getCurrentTabId(), e.target.value);
        });
      }
    };
    attach(); // the input may already be in the DOM
    const inputObserver = new MutationObserver(attach);
    inputObserver.observe(document.body, {
      childList: true,
      subtree: true
    });
  }

  saveDraft(tabId, text) {
    clearTimeout(this.saveTimeout);
    this.saveTimeout = setTimeout(() => {
      const state = this.vscode.getState() || { drafts: {} };
      state.drafts[tabId] = { text, timestamp: Date.now() };
      this.vscode.setState(state);
      // Backup to extension
      this.vscode.postMessage({
        command: 'saveDraft',
        tabId,
        text
      });
    }, this.DEBOUNCE_MS);
  }

  flushAll() {
    clearTimeout(this.saveTimeout);
    const tabId = this.getCurrentTabId();
    const text = this.getCurrentDraftText();
    if (text) {
      const state = this.vscode.getState() || { drafts: {} };
      state.drafts[tabId] = { text, timestamp: Date.now() };
      this.vscode.setState(state);
      // Keep the extension-side backup in sync with the flushed draft
      this.vscode.postMessage({ command: 'saveDraft', tabId, text });
    }
  }

  clearDraft(tabId) {
    // Cancel any pending debounced save so it cannot re-create the draft
    clearTimeout(this.saveTimeout);
    const state = this.vscode.getState() || { drafts: {} };
    delete state.drafts[tabId];
    this.vscode.setState(state);
    this.vscode.postMessage({ command: 'clearDraft', tabId });
  }

  restoreAllDrafts() {
    const state = this.vscode.getState();
    if (state?.drafts) {
      const currentTab = this.getCurrentTabId();
      const draft = state.drafts[currentTab];
      if (draft?.text) {
        this.setInputText(draft.text);
      }
    }
  }

  getCurrentTabId() {
    // Your logic to get active tab ID
    return this.mynahUI.getSelectedTabId?.() || 'default';
  }

  getCurrentDraftText() {
    const input = document.querySelector('textarea[data-mynah-input]');
    return input?.value || '';
  }

  setInputText(text) {
    const input = document.querySelector('textarea[data-mynah-input]');
    if (input) {
      input.value = text;
      input.dispatchEvent(new Event('input', { bubbles: true }));
    }
  }
}

// Initialize
const vscode = acquireVsCodeApi();
const draftManager = new DraftManager(vscode, mynahUI);

// Integrate with mynah-ui events.
// Assumption: handlers can be attached after construction. If your mynah-ui version
// only reads them from the constructor props, pass these functions there instead.
mynahUI.onTabChange = (tabId) => {
  draftManager.flushAll(); // Save current before switching
  draftManager.restoreAllDrafts(); // Restore for new tab
};
mynahUI.onChatPrompt = (tabId, prompt) => {
  // Clear draft after send, in both webview state and the extension backup
  draftManager.clearDraft(tabId);
};
This implementation provides automatic draft preservation with minimal overhead, follows VSCode best practices, and aligns with industry-standard user expectations. Users should rarely lose more than the last 500ms of typing when closing the sidebar, and the debounce keeps saves cheap even during rapid typing.
Key documentation references
VSCode Official:
- Webview API Guide: https://code.visualstudio.com/api/extension-guides/webview
- Webview UX Guidelines: https://code.visualstudio.com/api/ux-guidelines/webviews
- Extension Samples (webview-sample): https://github.com/microsoft/vscode-extension-samples
mynah-ui:
- GitHub Repository: https://github.com/aws/mynah-ui
- Documentation files: STARTUP.md, CONFIG.md, DATAMODEL.md, USAGE.md
Open Source Extension Examples:
- Continue (AI chat): https://github.com/continuedev/continue
- REST Client: https://github.com/Huachao/vscode-restclient
- Jupyter: https://github.com/microsoft/vscode-jupyter
Performance and UX Research:
- VSCode GitHub Issues #66939, #109521, #127006 (lifecycle events)
- Community Discussion #68362 (draft loss frustration)
- Issue #251340 (chat history preservation requests)
Language Server Protocol (LSP) - Comprehensive Overview
Executive Summary
The Language Server Protocol (LSP) defines the protocol used between an editor or IDE and a language server that provides language features like auto complete, go to definition, find all references etc. The goal of the Language Server Index Format (LSIF, pronounced like "else if") is to support rich code navigation in development tools or a Web UI without needing a local copy of the source code.
The idea behind the Language Server Protocol (LSP) is to standardize the protocol for how tools and servers communicate, so a single Language Server can be re-used in multiple development tools, and tools can support languages with minimal effort.
Key Benefits:
- Reduces M×N complexity to M+N (one server per language instead of one implementation per editor per language)
- Enables language providers to focus on a single high-quality implementation
- Allows editors to support multiple languages with minimal effort
- Standardized JSON-RPC based communication
Table of Contents
- Architecture & Core Concepts
- Base Protocol
- Message Types
- Capabilities System
- Lifecycle Management
- Document Synchronization
- Language Features
- Workspace Features
- Window Features
- Implementation Considerations
- Version History
Architecture & Core Concepts
Problem Statement
Prior to the design and implementation of the Language Server Protocol for the development of Visual Studio Code, most language services were generally tied to a given IDE or other editor. In the absence of the Language Server Protocol, language services are typically implemented by using a tool-specific extension API.
This created a classic M×N complexity problem where:
- M = Number of editors/IDEs
- N = Number of programming languages
- Total implementations needed = M × N
LSP Solution
The idea behind a Language Server is to provide the language-specific smarts inside a server that can communicate with development tooling over a protocol that enables inter-process communication.
Architecture Components:
- Language Client: The editor/IDE that requests language services
- Language Server: A separate process providing language intelligence
- LSP: The standardized communication protocol between them
Communication Model:
- JSON-RPC 2.0 based messaging
- A language server runs as a separate process and development tools communicate with the server using the language protocol over JSON-RPC.
- Bi-directional communication (client ↔ server)
- Support for requests (which expect a response) and notifications (which do not)
Supported Languages & Environments
LSP is not restricted to programming languages. It can be used for any kind of text-based language, like specifications or domain-specific languages (DSL).
Transport Options:
- stdio (standard input/output)
- Named pipes (Windows) / Unix domain sockets
- TCP sockets
- Node.js IPC
This comprehensive overview provides the foundation for understanding and implementing Language Server Protocol solutions. Each section can be expanded into detailed implementation guides as needed.
Base Protocol
Message Structure
The base protocol consists of a header and a content part (comparable to HTTP). The header and content part are separated by a '\r\n'.
Header Format
Content-Length: <number>\r\n
Content-Type: application/vscode-jsonrpc; charset=utf-8\r\n
\r\n
Header Fields:
- Content-Length: Length of the content part in bytes (required)
- Content-Type: MIME type of the content part (optional, defaults to application/vscode-jsonrpc; charset=utf-8)
Content Format
Contains the actual content of the message. The content part of a message uses JSON-RPC to describe requests, responses and notifications.
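Example Message (the body is pretty-printed here for readability; Content-Length is the byte length of the body as actually sent, which is 156 for the compact serialization without whitespace):
Content-Length: 156\r\n
\r\n
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "textDocument/completion",
  "params": {
    "textDocument": { "uri": "file:///path/to/file.js" },
    "position": { "line": 5, "character": 10 }
  }
}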
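Because Content-Length counts bytes rather than characters, framing code has to measure the UTF-8 size of the serialized body. The sketch below shows one way to do that; frameMessage and utf8ByteLength are hypothetical helper names, and a real server would write the result to its transport as bytes rather than keep it as a string.

```typescript
// Counts the UTF-8 bytes needed to encode a JavaScript (UTF-16) string.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) {
      bytes += 1;
    } else if (c < 0x800) {
      bytes += 2;
    } else if (
      c >= 0xd800 && c <= 0xdbff && i + 1 < s.length &&
      s.charCodeAt(i + 1) >= 0xdc00 && s.charCodeAt(i + 1) <= 0xdfff
    ) {
      bytes += 4; // surrogate pair: one code point outside the BMP
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Serializes a message and prefixes it with the base-protocol header.
function frameMessage(message: object): string {
  const body = JSON.stringify(message);
  return "Content-Length: " + utf8ByteLength(body) + "\r\n\r\n" + body;
}
```

Using the string's length property instead would under-count any body that contains non-ASCII text, and the receiver would then read a truncated message.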
JSON-RPC Structure
Base Message
interface Message {
  jsonrpc: string; // Always "2.0"
}
Request Message
interface RequestMessage extends Message {
  id: integer | string;
  method: string;
  params?: array | object;
}
Response Message
interface ResponseMessage extends Message {
  id: integer | string | null;
  result?: any;
  error?: ResponseError;
}
Notification Message
interface NotificationMessage extends Message {
  method: string;
  params?: array | object;
}
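The three message shapes can be told apart by which fields are present: requests carry both id and method, notifications carry method only, and responses carry id plus a result or an error. A minimal sketch of that dispatch step follows; classifyMessage and the IncomingMessage type are hypothetical names introduced for illustration.

```typescript
// Loosely typed view of a decoded JSON-RPC message.
interface IncomingMessage {
  jsonrpc: string;
  id?: number | string | null;
  method?: string;
  result?: unknown;
  error?: unknown;
}

type MessageKind = "request" | "notification" | "response" | "invalid";

function classifyMessage(msg: IncomingMessage): MessageKind {
  if (msg.jsonrpc !== "2.0") return "invalid";
  if (typeof msg.method === "string") {
    if (msg.id === undefined) return "notification";
    // A null id is only allowed on responses, never on requests.
    return msg.id === null ? "invalid" : "request";
  }
  // Responses have no method; a result of null is still a valid result.
  const hasOutcome = msg.result !== undefined || msg.error !== undefined;
  return msg.id !== undefined && hasOutcome ? "response" : "invalid";
}
```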
Error Handling
Standard Error Codes:
- -32700: Parse error
- -32600: Invalid Request
- -32601: Method not found
- -32602: Invalid params
- -32603: Internal error
LSP-Specific Error Codes:
- -32803: RequestFailed
- -32802: ServerCancelled
- -32801: ContentModified
- -32800: RequestCancelled
Language Features
Language Features provide the actual smarts in the language server protocol. They are usually executed on a [text document, position] tuple. The main language feature categories are:
- Code comprehension features like Hover or Goto Definition
- Coding features like diagnostics, code complete or code actions
Navigation Features
Go to Definition
textDocument/definition: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null
Go to Declaration
textDocument/declaration: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null
Go to Type Definition
textDocument/typeDefinition: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null
Go to Implementation
textDocument/implementation: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null
Find References
textDocument/references: ReferenceParams → Location[] | null
interface ReferenceParams extends TextDocumentPositionParams {
  context: { includeDeclaration: boolean; }
}
Information Features
Hover
textDocument/hover: TextDocumentPositionParams → Hover | null
interface Hover {
  contents: MarkedString | MarkedString[] | MarkupContent;
  range?: Range;
}
Signature Help
textDocument/signatureHelp: SignatureHelpParams → SignatureHelp | null
interface SignatureHelp {
  signatures: SignatureInformation[];
  activeSignature?: uinteger;
  activeParameter?: uinteger;
}
Document Symbols
textDocument/documentSymbol: DocumentSymbolParams → DocumentSymbol[] | SymbolInformation[] | null
Workspace Symbols
workspace/symbol: WorkspaceSymbolParams → SymbolInformation[] | WorkspaceSymbol[] | null
Code Intelligence Features
Code Completion
textDocument/completion: CompletionParams → CompletionItem[] | CompletionList | null
interface CompletionList {
  isIncomplete: boolean;
  items: CompletionItem[];
}
interface CompletionItem {
  label: string;
  kind?: CompletionItemKind;
  detail?: string;
  documentation?: string | MarkupContent;
  sortText?: string;
  filterText?: string;
  insertText?: string;
  textEdit?: TextEdit;
  additionalTextEdits?: TextEdit[];
}
Completion Triggers:
- User invoked (Ctrl+Space)
- Trigger characters (., ->, etc.)
- Incomplete completion re-trigger
Code Actions
textDocument/codeAction: CodeActionParams → (Command | CodeAction)[] | null
interface CodeAction {
  title: string;
  kind?: CodeActionKind;
  diagnostics?: Diagnostic[];
  isPreferred?: boolean;
  disabled?: { reason: string; };
  edit?: WorkspaceEdit;
  command?: Command;
}
Code Action Kinds:
- quickfix - Fix problems
- refactor - Refactoring operations
- source - Source code actions (organize imports, etc.)
Code Lens
textDocument/codeLens: CodeLensParams → CodeLens[] | null
interface CodeLens {
  range: Range;
  command?: Command;
  data?: any; // For resolve support
}
Formatting Features
Document Formatting
textDocument/formatting: DocumentFormattingParams → TextEdit[] | null
Range Formatting
textDocument/rangeFormatting: DocumentRangeFormattingParams → TextEdit[] | null
On-Type Formatting
textDocument/onTypeFormatting: DocumentOnTypeFormattingParams → TextEdit[] | null
Semantic Features
Semantic Tokens
Since version 3.16.0. The request is sent from the client to the server to resolve semantic tokens for a given file. Semantic tokens are used to add additional color information to a file that depends on language specific symbol information.
textDocument/semanticTokens/full: SemanticTokensParams → SemanticTokens | null
textDocument/semanticTokens/range: SemanticTokensRangeParams → SemanticTokens | null
textDocument/semanticTokens/full/delta: SemanticTokensDeltaParams → SemanticTokens | SemanticTokensDelta | null
Token Encoding:
- 5 integers per token: [deltaLine, deltaStart, length, tokenType, tokenModifiers]
- Relative positioning for efficiency
- Bit flags for modifiers
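To make the relative encoding concrete, the sketch below decodes a token array back into absolute positions. deltaLine is relative to the previous token's line, and deltaStart is relative to the previous token's start only when both tokens share a line. The function and type names are hypothetical, and tokenType and tokenModifiers are left as raw numbers because their meaning depends on the legend the server announced.

```typescript
interface AbsoluteToken {
  line: number;
  startChar: number;
  length: number;
  tokenType: number;      // index into the server's token type legend
  tokenModifiers: number; // bit set over the server's modifier legend
}

function decodeSemanticTokens(data: number[]): AbsoluteToken[] {
  if (data.length % 5 !== 0) {
    throw new Error("semantic token data must contain 5 integers per token");
  }
  const tokens: AbsoluteToken[] = [];
  let line = 0;
  let startChar = 0;
  for (let i = 0; i < data.length; i += 5) {
    const deltaLine = data[i];
    const deltaStart = data[i + 1];
    line += deltaLine;
    // On a new line the start is absolute; on the same line it is relative.
    startChar = deltaLine === 0 ? startChar + deltaStart : deltaStart;
    tokens.push({
      line,
      startChar,
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return tokens;
}

// True when the modifier at bitIndex in the legend is set on the token.
function hasModifier(token: AbsoluteToken, bitIndex: number): boolean {
  return (token.tokenModifiers & (1 << bitIndex)) !== 0;
}
```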
Inlay Hints
textDocument/inlayHint: InlayHintParams → InlayHint[] | null
interface InlayHint {
  position: Position;
  label: string | InlayHintLabelPart[];
  kind?: InlayHintKind; // Type | Parameter
  tooltip?: string | MarkupContent;
  paddingLeft?: boolean;
  paddingRight?: boolean;
}
Diagnostics
Push Model (Traditional)
textDocument/publishDiagnostics: PublishDiagnosticsParams
interface PublishDiagnosticsParams {
  uri: DocumentUri;
  version?: integer;
  diagnostics: Diagnostic[];
}
Pull Model (Since 3.17)
textDocument/diagnostic: DocumentDiagnosticParams → DocumentDiagnosticReport
workspace/diagnostic: WorkspaceDiagnosticParams → WorkspaceDiagnosticReport
Diagnostic Structure:
interface Diagnostic {
  range: Range;
  severity?: DiagnosticSeverity; // Error | Warning | Information | Hint
  code?: integer | string;
  source?: string; // e.g., "typescript"
  message: string;
  tags?: DiagnosticTag[]; // Unnecessary | Deprecated
  relatedInformation?: DiagnosticRelatedInformation[];
}
Implementation Guide
Performance Guidelines
Message Ordering: Responses to requests should be sent in roughly the same order as the requests appear on the server or client side.
State Management:
- Servers should handle partial/incomplete requests gracefully
- Use ContentModified error for outdated results
- Implement proper cancellation support
Resource Management:
- Language servers run in separate processes
- Avoid memory leaks in long-running servers
- Implement proper cleanup on shutdown
Error Handling
Client Responsibilities:
- Restart crashed servers (with exponential backoff)
- Handle ContentModified errors gracefully
- Validate server responses
Server Responsibilities:
- Return appropriate error codes
- Handle malformed/outdated requests
- Monitor client process health
Transport Considerations
Command Line Arguments:
language-server --stdio # Use stdio
language-server --pipe=<n> # Use named pipe/socket
language-server --socket --port=<port> # Use TCP socket
language-server --node-ipc # Use Node.js IPC
language-server --clientProcessId=<pid> # Monitor client process
Testing Strategies
Unit Testing:
- Mock LSP message exchange
- Test individual feature implementations
- Validate message serialization/deserialization
Integration Testing:
- End-to-end editor integration
- Multi-document scenarios
- Error condition handling
Performance Testing:
- Large file handling
- Memory usage patterns
- Response time benchmarks
Advanced Topics
Custom Extensions
Experimental Capabilities:
interface ClientCapabilities {
  experimental?: {
    customFeature?: boolean;
    vendorSpecificExtension?: any;
  };
}
Custom Methods:
- Use vendor prefixes: mycompany/customFeature
- Document custom protocol extensions
- Ensure graceful degradation
Security Considerations
Process Isolation:
- Language servers run in separate processes
- Limit file system access appropriately
- Validate all input from untrusted sources
Content Validation:
- Sanitize file paths and URIs
- Validate document versions
- Implement proper input validation
Multi-Language Support
Language Identification:
interface TextDocumentItem {
  uri: DocumentUri;
  languageId: string; // "typescript", "python", etc.
  version: integer;
  text: string;
}
Document Selectors:
type DocumentSelector = DocumentFilter[];
interface DocumentFilter {
  language?: string; // "typescript"
  scheme?: string; // "file", "untitled"
  pattern?: string; // "**/*.{ts,js}"
}
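A document matches a selector when at least one of its filters matches, and a filter matches when every property it sets agrees with the document. The sketch below illustrates this for language and scheme only. Glob matching of pattern is deliberately left out, so a filter's pattern is ignored here. That is a simplification of this example, not protocol behavior. All names are hypothetical.

```typescript
interface SelectorFilter {
  language?: string;
  scheme?: string;
  pattern?: string; // ignored by this simplified sketch
}

interface OpenDocument {
  uri: string;
  languageId: string;
}

// Extracts the scheme ("file", "untitled", ...) from a URI string.
function uriScheme(uri: string): string {
  const colon = uri.indexOf(":");
  return colon > 0 ? uri.slice(0, colon) : "";
}

function matchesSelector(selector: SelectorFilter[], doc: OpenDocument): boolean {
  return selector.some((filter) => {
    const languageOk = filter.language === undefined || filter.language === doc.languageId;
    const schemeOk = filter.scheme === undefined || filter.scheme === uriScheme(doc.uri);
    return languageOk && schemeOk;
  });
}
```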
Message Reference
Message Types
Request/Response Pattern
Client-to-Server Requests:
- initialize - Server initialization
- textDocument/hover - Get hover information
- textDocument/completion - Get code completions
- textDocument/definition - Go to definition
Server-to-Client Requests:
- client/registerCapability - Register new capabilities
- workspace/configuration - Get configuration settings
- window/showMessageRequest - Show message with actions
Notification Pattern
Client-to-Server Notifications:
- initialized - Initialization complete
- textDocument/didOpen - Document opened
- textDocument/didChange - Document changed
- textDocument/didSave - Document saved
- textDocument/didClose - Document closed
Server-to-Client Notifications:
- textDocument/publishDiagnostics - Send diagnostics
- window/showMessage - Display message
- telemetry/event - Send telemetry data
Special Messages
Dollar Prefixed Messages: Notifications and requests whose methods start with '$/' are messages which are protocol implementation dependent and might not be implementable in all clients or servers.
Examples:
- $/cancelRequest - Cancel ongoing request
- $/progress - Progress reporting
- $/setTrace - Set trace level
Capabilities System
Not every language server can support all features defined by the protocol. LSP therefore provides 'capabilities'. A capability groups a set of language features.
Capability Exchange
During Initialization:
- Client announces capabilities in initialize request
- Server announces capabilities in initialize response
- Both sides adapt behavior based on announced capabilities
Client Capabilities Structure
interface ClientCapabilities {
  workspace?: WorkspaceClientCapabilities;
  textDocument?: TextDocumentClientCapabilities;
  window?: WindowClientCapabilities;
  general?: GeneralClientCapabilities;
  experimental?: any;
}
Key Client Capabilities:
- textDocument.hover.dynamicRegistration - Support dynamic hover registration
- textDocument.completion.contextSupport - Support completion context
- workspace.workspaceFolders - Multi-root workspace support
- window.workDoneProgress - Progress reporting support
Server Capabilities Structure
interface ServerCapabilities {
  textDocumentSync?: TextDocumentSyncKind | TextDocumentSyncOptions;
  completionProvider?: CompletionOptions;
  hoverProvider?: boolean | HoverOptions;
  definitionProvider?: boolean | DefinitionOptions;
  referencesProvider?: boolean | ReferenceOptions;
  documentSymbolProvider?: boolean | DocumentSymbolOptions;
  workspaceSymbolProvider?: boolean | WorkspaceSymbolOptions;
  codeActionProvider?: boolean | CodeActionOptions;
  // ... many more
}
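A client adapts to these announcements by checking them before it sends a request. Provider fields that accept either a boolean or an options object count as supported when they are true or an object. The sketch below shows that check for hover and for completion trigger characters; the helper names and the trimmed-down AnnouncedCapabilities type are hypothetical.

```typescript
// Only the two fields this example inspects; the real structure has many more.
interface AnnouncedCapabilities {
  hoverProvider?: boolean | object;
  completionProvider?: { triggerCharacters?: string[] };
}

// Supported when the field is true or an options object; absent or false means no.
function supportsHover(caps: AnnouncedCapabilities): boolean {
  return caps.hoverProvider !== undefined && caps.hoverProvider !== false;
}

// Whether typing this character should trigger a completion request.
function isCompletionTrigger(caps: AnnouncedCapabilities, typed: string): boolean {
  const triggers = caps.completionProvider?.triggerCharacters ?? [];
  return triggers.indexOf(typed) !== -1;
}
```

A client that skips this check risks sending requests the server answers with a "Method not found" error.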
Dynamic Registration
Servers can register/unregister capabilities after initialization:
// Register new capability
client/registerCapability: {
  registrations: [{
    id: "uuid",
    method: "textDocument/willSaveWaitUntil",
    registerOptions: { documentSelector: [{ language: "javascript" }] }
  }]
}
// Unregister capability
// (the property name "unregisterations" is spelled this way in the specification)
client/unregisterCapability: {
  unregisterations: [{ id: "uuid", method: "textDocument/willSaveWaitUntil" }]
}
Lifecycle Management
Initialization Sequence
- Client → Server: initialize request
  interface InitializeParams {
    processId: integer | null;
    clientInfo?: { name: string; version?: string; };
    rootUri: DocumentUri | null;
    initializationOptions?: any;
    capabilities: ClientCapabilities;
    workspaceFolders?: WorkspaceFolder[] | null;
  }
- Server → Client: initialize response
  interface InitializeResult {
    capabilities: ServerCapabilities;
    serverInfo?: { name: string; version?: string; };
  }
- Client → Server: initialized notification
  - Signals completion of initialization
  - Server can now send requests to client
Shutdown Sequence
- Client → Server: shutdown request
  - Server must not accept new requests; the only message it should still act on is the exit notification
  - Server should finish processing ongoing requests
- Client → Server: exit notification
  - Server should exit immediately
  - Exit code: 0 if shutdown was called, 1 otherwise
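The shutdown rules amount to a very small state machine on the server side. The sketch below tracks whether shutdown has been requested, rejects later requests with the Invalid Request code (-32600), and derives the exit code. ServerLifecycle and its method names are hypothetical.

```typescript
const INVALID_REQUEST = -32600;

class ServerLifecycle {
  private shutdownRequested = false;

  // Returns undefined when the request may proceed, or an error code to reply with.
  admitRequest(method: string): number | undefined {
    if (this.shutdownRequested) return INVALID_REQUEST;
    if (method === "shutdown") this.shutdownRequested = true;
    return undefined;
  }

  // Exit code to use when the exit notification arrives.
  exitCode(): number {
    return this.shutdownRequested ? 0 : 1;
  }
}
```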
Process Monitoring
Client Process Monitoring:
- Server can monitor client process via processId from initialize
- Server should exit if client process dies
Server Crash Handling:
- Client should restart crashed servers
- Implement exponential backoff to prevent restart loops
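Exponential backoff here means doubling the wait before each successive restart, up to a ceiling. The protocol does not prescribe concrete values, so the one-second base and one-minute cap below are illustrative assumptions, and restartDelayMs is a hypothetical helper name.

```typescript
// Delay before restart number `attempt` (0-based), doubling each time up to maxMs.
function restartDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  const exponent = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * Math.pow(2, exponent));
}
```

A client would usually also reset the attempt counter once a server has stayed up for a while, and give up after some number of consecutive crashes.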
Document Synchronization
Client support for textDocument/didOpen, textDocument/didChange and textDocument/didClose notifications is mandatory in the protocol and clients can not opt out supporting them.
Text Document Sync Modes
enum TextDocumentSyncKind {
  None = 0, // No synchronization
  Full = 1, // Full document sync on every change
  Incremental = 2 // Incremental sync (deltas only)
}
Document Lifecycle
Document Open
textDocument/didOpen: {
  textDocument: {
    uri: "file:///path/to/file.js",
    languageId: "javascript",
    version: 1,
    text: "console.log('hello');"
  }
}
Document Change
textDocument/didChange: {
  textDocument: { uri: "file:///path/to/file.js", version: 2 },
  contentChanges: [{
    // hello occupies characters 13 to 18 (end exclusive) on line 0
    range: { start: { line: 0, character: 13 }, end: { line: 0, character: 18 } },
    text: "world"
  }]
}
Change Event Types:
- Full text: Replace entire document
- Incremental: Specify range and replacement text
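On the receiving side, a server keeps its copy of the document current by applying each content change in order, with every change interpreted against the result of the previous one. The sketch below assumes "\n" line endings and the default UTF-16 position encoding, which matches JavaScript string indexing; applyChanges and offsetAt are hypothetical helper names.

```typescript
interface Pos {
  line: number;
  character: number;
}

interface ContentChange {
  range?: { start: Pos; end: Pos }; // omitted for a full-text replacement
  text: string;
}

// Converts a line/character position to a string offset, clamping to valid bounds.
function offsetAt(text: string, pos: Pos): number {
  let offset = 0;
  for (let line = 0; line < pos.line; line++) {
    const newline = text.indexOf("\n", offset);
    if (newline === -1) return text.length; // position lies past the last line
    offset = newline + 1;
  }
  let lineEnd = text.indexOf("\n", offset);
  if (lineEnd === -1) lineEnd = text.length;
  return Math.min(offset + pos.character, lineEnd);
}

function applyChanges(text: string, changes: ContentChange[]): string {
  let result = text;
  for (const change of changes) {
    if (change.range === undefined) {
      result = change.text; // full text: replace the entire document
      continue;
    }
    const start = offsetAt(result, change.range.start);
    const end = offsetAt(result, change.range.end);
    result = result.slice(0, start) + change.text + result.slice(end);
  }
  return result;
}
```

An insertion is a change whose start and end are equal, and a deletion is a change whose text is empty.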
Document Save
// Optional: Before save
textDocument/willSave: {
textDocument: { uri: "file:///path/to/file.js" },
reason: TextDocumentSaveReason.Manual
}
// Optional: Before save with text edits
textDocument/willSaveWaitUntil → TextEdit[]
// After save
textDocument/didSave: {
textDocument: { uri: "file:///path/to/file.js" },
text?: "optional full text"
}
Document Close
textDocument/didClose: {
textDocument: { uri: "file:///path/to/file.js" }
}
Position Encoding
Prior to 3.17 the offsets were always based on a UTF-16 string representation. Since 3.17 clients and servers can agree on a different string encoding representation (e.g. UTF-8).
Supported Encodings:
- utf-16 (default, mandatory)
- utf-8
- utf-32
Position Structure:
interface Position {
line: uinteger; // Zero-based line number
character: uinteger; // Zero-based character offset
}
interface Range {
start: Position;
end: Position;
}
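The practical effect of the encoding choice is that the same position can have different `character` values. The sketch below converts a UTF-16 offset within a line to the corresponding UTF-8 byte offset; the function name `utf16ToUtf8Offset` is hypothetical, and an offset that falls in the middle of a surrogate pair is rounded up to the end of that pair.

```typescript
// Convert a UTF-16 code-unit offset within a single line to a UTF-8 byte offset.
// Hypothetical helper for illustration; not part of any LSP library.
function utf16ToUtf8Offset(line: string, utf16Offset: number): number {
  const limit = Math.min(utf16Offset, line.length);
  let bytes = 0;
  let i = 0;
  while (i < limit) {
    const unit = line.charCodeAt(i);
    const next = i + 1 < line.length ? line.charCodeAt(i + 1) : 0;
    if (unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      // Surrogate pair: one code point outside the BMP, encoded as 4 UTF-8 bytes.
      bytes += 4;
      i += 2;
    } else {
      // Single code unit: 1, 2, or 3 UTF-8 bytes depending on its value.
      bytes += unit < 0x80 ? 1 : unit < 0x800 ? 2 : 3;
      i += 1;
    }
  }
  return bytes;
}
```

For ASCII-only lines the two offsets are identical, which is why encoding mismatches often go unnoticed until a line contains accented characters or emoji.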
Workspace Features
Multi-Root Workspaces
workspace/workspaceFolders → WorkspaceFolder[] | null
interface WorkspaceFolder {
uri: URI;
name: string;
}
// Notification when folders change
workspace/didChangeWorkspaceFolders: DidChangeWorkspaceFoldersParams
Configuration Management
// Server requests configuration from client
workspace/configuration: ConfigurationParams → any[]
interface ConfigurationItem {
scopeUri?: URI; // Scope (file/folder) for the setting
section?: string; // Setting name (e.g., "typescript.preferences")
}
// Client notifies server of configuration changes
workspace/didChangeConfiguration: DidChangeConfigurationParams
File Operations
File Watching
workspace/didChangeWatchedFiles: DidChangeWatchedFilesParams
interface FileEvent {
uri: DocumentUri;
type: FileChangeType; // Created | Changed | Deleted
}
File System Operations
// Before operations (can return WorkspaceEdit)
workspace/willCreateFiles: CreateFilesParams → WorkspaceEdit | null
workspace/willRenameFiles: RenameFilesParams → WorkspaceEdit | null
workspace/willDeleteFiles: DeleteFilesParams → WorkspaceEdit | null
// After operations (notifications)
workspace/didCreateFiles: CreateFilesParams
workspace/didRenameFiles: RenameFilesParams
workspace/didDeleteFiles: DeleteFilesParams
Command Execution
workspace/executeCommand: ExecuteCommandParams → any
interface ExecuteCommandParams {
command: string; // Command identifier
arguments?: any[]; // Command arguments
}
// Server applies edits to workspace
workspace/applyEdit: ApplyWorkspaceEditParams → ApplyWorkspaceEditResult
Window Features
Message Display
Show Message (Notification)
window/showMessage: ShowMessageParams
interface ShowMessageParams {
type: MessageType; // Error | Warning | Info | Log | Debug
message: string;
}
Show Message Request
window/showMessageRequest: ShowMessageRequestParams → MessageActionItem | null
interface ShowMessageRequestParams {
type: MessageType;
message: string;
actions?: MessageActionItem[]; // Buttons to show
}
Show Document
window/showDocument: ShowDocumentParams → ShowDocumentResult
interface ShowDocumentParams {
uri: URI;
external?: boolean; // Open in external program
takeFocus?: boolean; // Focus the document
selection?: Range; // Select range in document
}
Progress Reporting
Work Done Progress
// Server creates progress token
window/workDoneProgress/create: WorkDoneProgressCreateParams → void
// Report progress using $/progress
$/progress: ProgressParams<WorkDoneProgressBegin | WorkDoneProgressReport | WorkDoneProgressEnd>
// Client can cancel progress
window/workDoneProgress/cancel: WorkDoneProgressCancelParams
Progress Reporting Pattern
// Begin
{ kind: "begin", title: "Indexing", cancellable: true, percentage: 0 }
// Report
{ kind: "report", message: "Processing file.ts", percentage: 25 }
// End
{ kind: "end", message: "Indexing complete" }
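The begin/report/end pattern above can be generated for a batch of work as in the following sketch. The type alias `WorkDoneProgress` and the helper `progressSequence` are hypothetical names; in a real server each payload would be sent as the `value` of a `$/progress` notification together with the progress token.

```typescript
// Simplified payload shapes for illustration; field names follow the pattern shown above.
type WorkDoneProgress =
  | { kind: "begin"; title: string; cancellable?: boolean; percentage?: number }
  | { kind: "report"; message?: string; percentage?: number }
  | { kind: "end"; message?: string };

// Hypothetical helper: build the full payload sequence for processing a list of files.
function progressSequence(title: string, files: string[]): WorkDoneProgress[] {
  const events: WorkDoneProgress[] = [
    { kind: "begin", title: title, cancellable: true, percentage: 0 }
  ];
  files.forEach((file, index) => {
    // Percentage reflects how many files are done once this one has been processed.
    const percentage = Math.round(((index + 1) / files.length) * 100);
    events.push({ kind: "report", message: "Processing " + file, percentage: percentage });
  });
  events.push({ kind: "end", message: title + " complete" });
  return events;
}
```

Calling `progressSequence("Indexing", files)` with four files produces one begin payload, four report payloads at 25, 50, 75 and 100 percent, and one end payload.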
Logging & Telemetry
window/logMessage: LogMessageParams // Development logs
telemetry/event: any // Usage analytics
Version History
LSP 3.17 (Current)
Major new features are: type hierarchy, inline values, inlay hints, notebook document support and a meta model that describes the 3.17 LSP version.
Key Features:
- Type hierarchy support
- Inline value provider
- Inlay hints
- Notebook document synchronization
- Diagnostic pull model
- Position encoding negotiation
LSP 3.16
Key Features:
- Semantic tokens
- Call hierarchy
- Moniker support
- File operation events
- Linked editing ranges
- Code action resolve
LSP 3.15
Key Features:
- Progress reporting
- Selection ranges
- Signature help context
LSP 3.0
Breaking Changes:
- Client capabilities system
- Dynamic registration
- Workspace folders
- Document link support