
Introduction

(Symposium logo)
AI the Rust Way

What we want to achieve

You fire up your agent of choice through Symposium. It has a more collaborative style and remembers the way you like to work. It knows about your dependencies and incorporates advice supplied by the crate authors on how best to use them. You can install extensions that transform the agent — new skills, new MCP servers, or more advanced capabilities like custom GUI interfaces and new ways of working.

AI the Rust Way

Symposium brings Rust’s design philosophy to AI-assisted development.

Leverage the wisdom of the crowd: crates.io

Rust embraces a small stdlib and a rich crate ecosystem. Symposium brings that philosophy to AI: your dependencies can teach your agent how to use them. Add a crate, and your agent learns its idioms, patterns, and best practices.

Beyond crate knowledge, we want to make it easy to publish agent extensions that others can try out and adopt just by adding a line to their configuration — the same way you’d add a dependency to Cargo.toml.

Stability without stagnation

Rust evolves quickly and agents’ training data goes stale. Symposium helps your agent take advantage of the latest Rust features and learn how to use new or private crates — things not found in its training data.

We provide guides and context that keep models up-to-date, helping them write idiomatic Rust rather than JavaScript-in-disguise.

Open, portable, and vendor neutral

Open source tools that everyone can improve. Build extensions once, use them with any ACP-compatible agent. No vendor lock-in.

How to install

Install in your favorite editor

Installing from source

Clone the repository and use the setup tool:

git clone https://github.com/symposium-dev/symposium.git
cd symposium
cargo setup --all

Setup options

| Option | Description |
|---|---|
| --all | Install everything (ACP binaries, VSCode extension, Zed config) |
| --acp | Install ACP binaries only |
| --vscode | Build and install VSCode extension |
| --zed | Configure Zed editor |
| --dry-run | Show what would be done without making changes |

Options can be combined:

cargo setup --acp --zed    # Install ACP binaries and configure Zed

For editors other than VSCode and Zed, you need to manually configure your editor to run symposium-acp-agent run.
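The exact settings schema varies by editor, but an agent-server entry typically looks roughly like the following sketch (key names here are illustrative; consult your editor's ACP documentation for the real schema):

```jsonc
{
  // Illustrative only: register Symposium as an ACP agent server.
  "agent_servers": {
    "Symposium": {
      "command": "symposium-acp-agent",
      "args": ["run"]
    }
  }
}
```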

VSCode and VSCode-based editors

Step 1: Install the extension

Install the Symposium extension from:

(Screenshot: installing the Symposium extension)

Step 2: Activate the panel and start chatting

(Screenshot: Symposium panel in VSCode)
  1. Activity bar icon — Click to open the Symposium panel
  2. New tab — Start a new conversation with the current settings
  3. Settings — Expand to configure agent and extensions
  4. Agent selector — Choose which agent to use (Claude Code, Gemini CLI, etc.)
  5. Extensions — Enable MCP servers that add capabilities to your agent
  6. Add extension — Add custom extensions
  7. Edit all settings — Access full settings

Custom Agents

The agent selector shows agents from the Symposium registry. To add a custom agent not in the registry, use VS Code settings.

Open Settings (Cmd/Ctrl+,) and search for symposium.agents. Add your custom agent to the JSON array:

"symposium.agents": [
  {
    "id": "my-custom-agent",
    "name": "My Custom Agent",
    "distribution": {
      "npx": { "package": "@myorg/my-agent" }
    }
  }
]

Distribution Types

| Type | Example | Notes |
|---|---|---|
| npx | { "npx": { "package": "@org/pkg" } } | Requires npm/npx installed |
| pipx | { "pipx": { "package": "my-agent" } } | Requires pipx installed |
| executable | { "executable": { "path": "/usr/local/bin/my-agent" } } | Local binary |

Your custom agent will appear in the agent selector alongside registry agents.

Other Editors

Symposium works with any editor that supports ACP. See the editors on ACP page for a list of supported editors and how to install ACP support.

Installation

  1. Install ACP support in your editor of choice
  2. Install the Symposium agent binary:
    cargo binstall symposium-acp-agent
    
    or from source:
    cargo install symposium-acp-agent
    
  3. Configure your editor to run:
    ~/.cargo/bin/symposium-acp-agent run
    

Instructions for configuring ACP support in common editors can be found here:

Configuring Symposium

On first run, Symposium will ask you a few questions to create your configuration file at ~/.symposium/config.jsonc:

Welcome to Symposium!

No configuration found. Let's set up your AI agent.

Which agent would you like to use?

  1. Claude Code
  2. Gemini CLI
  3. Codex
  4. Kiro CLI

Type a number (1-4) to select:

After selecting an agent, Symposium creates the config file and you can restart your editor to start using it.

Manual Configuration

You can edit ~/.symposium/config.jsonc directly for more control. The format is:

{
  "agent": "npx -y @zed-industries/claude-code-acp",
  "proxies": [
    { "name": "sparkle", "enabled": true },
    { "name": "ferris", "enabled": true },
    { "name": "cargo", "enabled": true }
  ]
}

Fields:

  • agent: The command to run your downstream AI agent. This is passed to the shell, so you can use any command that works in your terminal.

  • proxies: List of Symposium extensions to enable. Each entry has:

    • name: The extension name
    • enabled: Set to true or false to enable/disable
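Putting those fields together, a config that swaps in a different agent and disables one extension might look like this sketch (the agent command mirrors the Gemini CLI example used elsewhere in these docs; adjust to taste):

```jsonc
{
  // Any shell command that starts an ACP agent works here.
  "agent": "npx -y @google/gemini-cli --experimental-acp",
  "proxies": [
    { "name": "sparkle", "enabled": false },
    { "name": "ferris", "enabled": true },
    { "name": "cargo", "enabled": true }
  ]
}
```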

Built-in Extensions

| Name | Description |
|---|---|
| sparkle | AI collaboration identity and memory |
| ferris | Rust crate source fetching |
| cargo | Cargo build/test/check commands |

Using Symposium

Symposium focuses on creating the best environment for Rust coding through Agent Extensions - MCP servers that add specialized tools and context to your agent.

Symposium is built on the Agent Client Protocol (ACP), which means the core functionality is portable across editors and environments. VSCode is the showcase environment with experimental GUI support, but the basic functionality can be configured in any ACP-supporting editor.

The instructions below use the VSCode extension as the basis for explanation.

Selecting an Agent

To select an agent, click on it in the agent picker. Symposium will download and install the agent binary automatically.

Some agents may require additional tools to be available on your system:

  • npx - for agents distributed via npm
  • uvx - for agents distributed via Python
  • cargo - for agents distributed via crates.io (uses cargo binstall if available, falls back to cargo install)

Symposium checks for updates and installs new versions automatically as they become available.

For adding custom agents not in the registry, see VSCode Installation - Custom Agents.

Managing Extensions

Extensions add capabilities to your agent. Open the Settings panel to manage them.

In the Extensions section you can:

  • Enable/disable extensions via the checkbox
  • Reorder extensions by dragging the handle
  • Add extensions via the “+ Add extension” link
  • Delete extensions from the list

Order matters - extensions are applied in the order listed. The first extension is closest to the editor, and the last is closest to the agent.

When adding extensions, you can choose from:

  • Built-in extensions (Sparkle, Ferris, Cargo)
  • Registry extensions from the shared catalog
  • Custom extensions via executable, npx, pipx, cargo, or URL

Built-in Extensions

Symposium ships with three built-in extensions:

  • Sparkle - AI collaboration framework that learns your working patterns
  • Ferris - Rust crate source inspection
  • Cargo - Compressed cargo command output

Built-in Extensions

Symposium ships with three built-in extensions that enhance your agent’s capabilities for Rust development.

  • Sparkle - AI collaboration framework that learns your working patterns
  • Ferris - Rust crate source inspection
  • Cargo - Compressed cargo command output

See Using Symposium for how to enable, disable, and reorder extensions.

Sparkle

Sparkle is an AI collaboration framework that transforms your agent from a helpful assistant into a thinking partner. It learns your working patterns over time and maintains context across sessions.

Quick Reference

| What | How |
|---|---|
| Activate | Automatic when extension is enabled |
| Teach a pattern | Say “meta moment” during a session |
| Save session | Use /checkpoint before ending |
| Local state | .sparkle-space/ (add to .gitignore) |
| Persistent learnings | ~/.sparkle/ |

How It Works

Automatic activation - When the Sparkle extension is enabled, it activates automatically when you create a new thread. No manual setup required.

Local workspace state - Sparkle creates a .sparkle-space/ directory in your workspace to store working memory and session checkpoints. Add this to your .gitignore.

Persistent learnings - Pattern anchors and collaboration insights are stored in ~/.sparkle/ and carry across all your workspaces.

Pattern anchors - These are exact phrases that recreate collaborative patterns. Sparkle learns these over time as you work together, capturing what works well in your collaboration style.

Teaching patterns - During a session, say “meta moment” to pause and examine what’s working. Sparkle will capture the insight as a pattern anchor or collaboration evolution that future sessions can build on.

Closing out - Use /checkpoint to save session learnings before ending. This preserves your progress and creates continuity for the next session.

Learn More

For full documentation on Sparkle’s collaboration patterns and identity framework, see the Sparkle documentation.

Ferris

Ferris provides tools for inspecting Rust crate source code, helping your agent understand actual implementations rather than guessing at APIs.

Quick Reference

| What | How |
|---|---|
| Fetch crate sources | Agent uses crate_sources tool |
| Check workspace version | Automatic - defaults to version in your Cargo.toml |
| Specify version | Agent can request specific versions or semver ranges |

How It Works

When your agent needs to understand how a crate works, Ferris can fetch the source code directly from crates.io. This is useful when:

  • Working with an unfamiliar crate
  • Checking exact API signatures
  • Understanding internal implementation details
  • Finding usage examples in the crate’s own code

Tips

Encourage source checking - If your agent seems uncertain about a crate’s API or is making incorrect assumptions, prompt it to “check the sources” for that crate. This often leads to more accurate code.

Version awareness - Ferris automatically uses the crate version from your workspace’s Cargo.toml. If you need a different version, you can ask for a specific version or semver range.

Future Plans

Ferris is a work in progress. Future versions will include guidance on strong Rust coding patterns to help your agent write more idiomatic Rust.

Cargo

Cargo provides tools for running common cargo commands with compressed output, helping your agent save context and focus on what matters.

Quick Reference

| What | How |
|---|---|
| Build | Agent uses cargo build tool |
| Run | Agent uses cargo run tool |
| Test | Agent uses cargo test tool |

How It Works

Instead of running raw cargo commands through bash, your agent can use Cargo’s specialized tools. These tools:

  • Compress output - Filter and summarize cargo’s verbose output to highlight errors, warnings, and key information
  • Save context - Reduce token usage by removing noise, leaving more room for actual problem-solving
  • Focus attention - Present the most important output first so the agent can quickly identify issues

Why Not Just Bash?

Raw cargo build output can be verbose, especially with many dependencies or detailed error messages. The Cargo extension processes this output to extract what the agent actually needs to see, making it more efficient at diagnosing and fixing issues.

Creating Agent Extensions

A Symposium agent extension is an ACP (Agent Client Protocol) proxy that sits between the client and the agent. Proxies can intercept and transform messages, inject context, provide MCP tools, and coordinate agent behavior. Agent extensions are typically distributed as Rust crates on crates.io.

Basic Structure

Your extension crate should:

  1. Implement an ACP proxy using the sacp crate
  2. Produce a binary that speaks ACP over stdio
  3. Include Symposium metadata in Cargo.toml

See the sacp cookbook on building proxies for implementation details and examples.

Cargo.toml Metadata

Add metadata to tell Symposium how to run your extension:

[package]
name = "my-extension"
version = "0.1.0"
description = "Help agents work with MyLibrary"

[package.metadata.symposium]
# Optional: specify which binary if your crate has multiple
binary = "my-extension"

# Optional: arguments to pass when spawning
args = []

# Optional: environment variables
env = { MY_CONFIG = "value" }

The name, description, and version come from the standard [package] section.

Testing Your Extension

Before publishing:

  1. Install locally: cargo install --path .
  2. Test with Symposium: add to your local config and verify it loads correctly
  3. Check ACP compliance: ensure your proxy handles proxy/initialize correctly

Recommending Agent Extensions

There are two ways to recommend agent extensions to users:

  1. Central recommendations - submit to the Symposium recommendations registry
  2. Crate metadata - add recommendations directly in your crate’s Cargo.toml

Extension Sources

Extensions are referenced using a source field:

| Source | Syntax | Description |
|---|---|---|
| crates.io | source.crate = "name" | Rust crate installed via cargo |
| ACP Registry | source.acp = "id" | Extension from the ACP registry |
| Direct URL | source.url = "https://..." | Direct link to extension.json |

Central Recommendations

Submit a PR to symposium-dev/recommendations adding an entry:

[[recommendation]]
source.crate = "my-extension"
when-using-crate = "my-library"

# Or for multiple trigger crates:
[[recommendation]]
source.crate = "my-extension"
when-using-crates = ["my-library", "my-library-derive"]

This tells Symposium: “If a project depends on my-library, suggest my-extension.”

Users can also add their own local recommendation files for internal/proprietary extensions.

Crate Metadata Recommendations

If you maintain a library, you can recommend extensions directly in your Cargo.toml. Users of your crate will see these suggestions in Symposium.

Shorthand Syntax

For crates.io extensions:

[package.metadata.symposium]
recommended = ["some-extension", "another-extension"]

Full Syntax

For extensions from other sources:

# Recommend a crates.io extension
[[package.metadata.symposium.recommended]]
source.crate = "some-extension"

# Recommend an extension from the ACP registry
[[package.metadata.symposium.recommended]]
source.acp = "some-acp-extension"

# Recommend an extension from a direct URL
[[package.metadata.symposium.recommended]]
source.url = "https://example.com/extension.json"

Example

If you maintain tokio, you might add:

[package]
name = "tokio"
version = "1.0.0"

[package.metadata.symposium]
recommended = ["symposium-tokio"]

Users who depend on tokio will see “Tokio Support” suggested in their Symposium settings.

Publishing Agent Extensions

Publishing to crates.io

The simplest way to distribute an agent extension is to publish it to crates.io. Symposium can install extensions directly from crates.io using cargo binstall (for pre-built binaries) or cargo install (building from source).

To make your agent extension installable:

  1. Publish your crate to crates.io as usual
  2. Include a binary target that speaks ACP over stdio
  3. Optionally add [package.metadata.symposium] for configuration (see Creating Extensions)

That’s it. Users can reference your extension by crate name, and crate authors can recommend it in their Cargo.toml (see Recommending Extensions).

Publishing to the ACP Registry (optional)

The ACP Registry is a curated catalog of extensions with broad applicability. Publishing here is appropriate for:

  • General-purpose extensions like Sparkle (AI collaboration identity) that help across all projects
  • Language/framework extensions that benefit many projects
  • Tool integrations that aren’t tied to a specific library

For crate-specific extensions (e.g., an extension that helps with a particular library), crates.io distribution with Cargo.toml recommendations is more appropriate. Users of that library will discover the extension through the recommendation system.

Submitting to the Registry

  1. Fork the registry repository
  2. Create a directory for your extension: my-extension/
  3. Add extension.json:
{
  "id": "my-extension",
  "name": "My Extension",
  "version": "0.1.0",
  "description": "General-purpose extension for X",
  "repository": "https://github.com/you/my-extension",
  "license": "MIT",
  "distribution": {
    "cargo": {
      "crate": "my-extension"
    }
  }
}
  4. Submit a pull request

Distribution Types

Extensions in the registry can specify different distribution methods:

| Type | Example | Description |
|---|---|---|
| cargo | { "crate": "my-ext" } | Rust crate from crates.io |
| npx | { "package": "@org/ext" } | npm package |
| binary | Platform-specific archives | Pre-built binaries |
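As a sketch, a binary distribution lists per-platform archives keyed by platform, following the archive/cmd fields of the distribution schema; the URL and names below are illustrative:

```json
{
  "id": "my-extension",
  "name": "My Extension",
  "version": "0.1.0",
  "distribution": {
    "binary": {
      "darwin-aarch64": {
        "archive": "https://github.com/you/my-extension/releases/download/v0.1.0/my-extension-darwin-arm64.tar.gz",
        "cmd": "my-extension"
      }
    }
  }
}
```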

For Rust crates, cargo distribution is recommended - it leverages the existing crates.io infrastructure.

How to contribute

Symposium is an open-source project in active development. We’re iterating heavily and welcome collaborators who want to shape where this goes.

Come chat with us

The best way to get involved is to join us on Zulip. We use it for design discussions, coordination, and general conversation about AI-assisted development.

Drop in, say hello, and tell us what you’re interested in working on.

The codebase

The code lives at github.com/symposium-dev/symposium.

We maintain a code of conduct and operate as an independent community exploring what AI has to offer for software development.

Expectations

Given the exploratory nature of Symposium, expect frequent changes. APIs are unstable, and we’re still figuring out the right abstractions. This is a good time to contribute if you want to influence the direction — but be prepared for things to shift as we learn.

Implementation Overview

Symposium uses a conductor to orchestrate a dynamic chain of component proxies that enrich agent capabilities. This architecture adapts to different client capabilities and provides consistent functionality regardless of what the editor or agent natively supports.

Deployment Modes

The symposium-acp-agent binary supports several subcommands:

Run Mode (run)

The primary way to use Symposium. Reads configuration from ~/.symposium/config.jsonc:

symposium-acp-agent run

If no configuration exists, runs an interactive setup wizard. See Run Mode for details.

Run-With Mode (run-with)

For programmatic use by editor extensions. Takes explicit agent and proxy configuration:

flowchart LR
    Editor --> Agent[symposium-acp-agent] --> DownstreamAgent[claude-code, etc.]

Example with agent (wraps downstream agent):

symposium-acp-agent run-with --proxy defaults --agent '{"name":"...","command":"npx",...}'

Example without agent (proxy mode, sits between editor and existing agent):

symposium-acp-agent run-with --proxy sparkle --proxy ferris

Proxy Configuration

Use --proxy <name> to specify which extensions to include. Order matters - proxies are chained in the order specified.

Known proxies: sparkle, ferris, cargo

The special value defaults expands to all known proxies:

--proxy defaults           # equivalent to: --proxy sparkle --proxy ferris --proxy cargo
--proxy foo --proxy defaults --proxy bar  # foo, then all defaults, then bar

If no --proxy flags are given, no proxies are included (pure passthrough).

Internal Structure

Both modes use a conductor to orchestrate multiple component proxies:

flowchart LR
    Input[Editor/stdin] --> S[Symposium Conductor]
    S --> C1[Component 1]
    C1 --> A1[Adapter 1]
    A1 --> C2[Component 2]
    C2 --> Output[Agent/stdout]

The conductor dynamically builds this chain based on what capabilities the editor and agent provide.

Component Pattern

Some Symposium features are implemented as component/adapter pairs:

Components

Components provide functionality to agents through MCP tools and other mechanisms. They:

  • Expose high-level capabilities (e.g., Dialect-based IDE operations)
  • May rely on primitive capabilities from upstream (the editor)
  • Are always included in the chain when their functionality is relevant

Adapters

Adapters “shim” missing primitive capabilities by providing fallback implementations. They:

  • Check whether required primitive capabilities exist upstream
  • Provide the capability if it’s missing (e.g., spawn rust-analyzer to provide IDE operations)
  • Pass through transparently if the capability already exists
  • Are conditionally included only when needed

Capability-Driven Assembly

During initialization, Symposium:

  1. Receives capabilities from the editor - examines what the upstream client provides
  2. Queries the agent - discovers what capabilities the downstream agent supports
  3. Builds the proxy chain - spawns components and adapters based on detected gaps and opportunities
  4. Advertises enriched capabilities - tells the editor what the complete chain provides

This approach allows Symposium to work with minimal ACP clients (by providing fallback implementations) while taking advantage of native capabilities when available (by passing through directly).

For detailed information about the initialization sequence and capability negotiation, see Initialization Sequence.

Common Issues

This section documents recurring bugs and pitfalls to check when implementing new features.

VS Code Extension

Configuration Not Affecting New Tabs

Symptom: User changes a setting, but new tabs still use the old value.

Cause: The setting affects how the agent process is spawned, but isn’t included in AgentConfiguration.key(). Tabs with the same key share an agent process, so the new tab reuses the existing (stale) process.

Fix: Include the setting in AgentConfiguration:

  1. Add the setting to the AgentConfiguration constructor
  2. Include it in key() so different values produce different keys
  3. Read it in fromSettings() when creating configurations

Example: The symposium.extensions setting was added but new tabs ignored it until extensions were added to AgentConfiguration.key(). See commit fix: include extensions in AgentConfiguration key.

General principle: If a setting affects process behavior (CLI args, environment, etc.), it must be part of the process identity key.

Distribution

This chapter documents how Symposium is released and distributed across platforms.

Release Orchestration

Releases are triggered by release-plz, which:

  1. Creates a release PR when changes accumulate on main
  2. When merged, publishes to crates.io and creates GitHub releases with tags

The symposium-acp-agent-v* tag triggers the binary release workflow.

Distribution Channels

release-plz creates tag
        ↓
┌───────────────────────────────────────┐
│         GitHub Release                │
│  - Binary archives (all platforms)    │
│  - VSCode .vsix files                 │
│  - Source reference                   │
└───────────────────────────────────────┘
        ↓
┌─────────────┬─────────────┬───────────┐
│  crates.io  │   VSCode    │    Zed    │
│             │ Marketplace │Extensions │
│             │ + Open VSX  │           │
└─────────────┴─────────────┴───────────┘

crates.io

The Rust crates are published directly by release-plz. Users can install via:

cargo install symposium-acp-agent

VSCode Marketplace / Open VSX

Platform-specific extensions are built and published automatically. Each platform gets its own ~7MB extension containing only that platform’s binary.

See VSCode Packaging for details.

Zed Extensions

The Zed extension (zed-extension/) points to GitHub release archives. Publishing requires submitting a PR to the zed-industries/extensions repository.

Direct Download

Binary archives are attached to each GitHub release for direct download:

  • symposium-darwin-arm64.tar.gz
  • symposium-darwin-x64.tar.gz
  • symposium-linux-x64.tar.gz
  • symposium-linux-arm64.tar.gz
  • symposium-linux-x64-musl.tar.gz
  • symposium-windows-x64.zip

Supported Platforms

| Platform | Architecture | Notes |
|---|---|---|
| macOS | arm64 (Apple Silicon) | Primary development platform |
| macOS | x64 (Intel) | |
| Linux | x64 (glibc) | Standard Linux distributions |
| Linux | arm64 | ARM servers, Raspberry Pi |
| Linux | x64 (musl) | Static binary, Alpine Linux |
| Windows | x64 | |

Secrets Required

The release workflow requires these GitHub secrets:

| Secret | Purpose |
|---|---|
| RELEASE_PLZ_TOKEN | GitHub token for release-plz to create releases |
| VSCE_PAT | Azure DevOps PAT for VSCode Marketplace |
| OVSX_PAT | Open VSX access token |

Agent Registry

Symposium supports multiple ACP-compatible agents and extensions. Users can select from built-in defaults or add entries from the ACP Agent Registry.

The registry resolution logic lives in symposium-acp-agent and is shared across all editor integrations.

Agent Configuration

Each agent or extension is represented as an AgentConfig object:

interface AgentConfig {
  // Required fields
  id: string;
  distribution: {
    local?: { command: string; args?: string[]; env?: Record<string, string> };
    symposium?: { subcommand: string; args?: string[] };
    npx?: { package: string; args?: string[] };
    pipx?: { package: string; args?: string[] };
    cargo?: { crate: string; version?: string; binary?: string; args?: string[] };
    binary?: {
      [platform: string]: {    // e.g., "darwin-aarch64", "linux-x86_64"
        archive: string;
        cmd: string;
        args?: string[];
      };
    };
  };

  // Optional fields (populated from registry if imported)
  name?: string;               // display name, defaults to id
  version?: string;
  description?: string;
  // ... other registry fields as needed

  // Source tracking
  _source?: "registry" | "custom";  // defaults to "custom" if omitted
}

Built-in Agents

Three agents ship as defaults with _source: "custom":

[
  {
    "id": "zed-claude-code",
    "name": "Claude Code",
    "distribution": { "npx": { "package": "@zed-industries/claude-code-acp@latest" } }
  },
  {
    "id": "elizacp",
    "name": "ElizACP",
    "description": "Built-in Eliza agent for testing",
    "distribution": { "symposium": { "subcommand": "eliza" } }
  },
  {
    "id": "kiro-cli",
    "name": "Kiro CLI",
    "distribution": { "local": { "command": "kiro-cli-chat", "args": ["acp"] } }
  }
]

Registry-Imported Agents

When a user imports an agent from the registry, the full registry entry is stored with _source: "registry":

{
  "id": "gemini",
  "name": "Gemini CLI",
  "version": "0.22.3",
  "description": "Google's official CLI for Gemini",
  "_source": "registry",
  "distribution": {
    "npx": { "package": "@google/gemini-cli@0.22.3", "args": ["--experimental-acp"] }
  }
}

Custom Agents

Users can manually add agents with minimal configuration:

{
  "id": "my-agent",
  "distribution": { "npx": { "package": "my-agent-package" } }
}

Registry Sync

For agents with _source: "registry", the extension checks for updates and applies them automatically. Agents removed from the registry are left unchanged—the configuration still works, it just won’t receive future updates.

The registry URL:

https://github.com/agentclientprotocol/registry/releases/latest/download/registry.json

Spawning an Agent

At spawn time, the extension resolves the distribution to a command (priority order):

  1. If distribution.local exists → {command} {args...} with optional env vars
  2. Else if distribution.symposium exists → run as symposium subcommand
  3. Else if distribution.npx exists → npx -y {package} {args...}
  4. Else if distribution.pipx exists → pipx run {package} {args...}
  5. Else if distribution.cargo exists → install and run Rust crate (see below)
  6. Else if distribution.binary[currentPlatform] exists:
    • Check ~/.symposium/bin/{id}/{version}/ for cached binary
    • If not present, download and extract from archive
    • Execute {cache-path}/{cmd} {args...}
  7. Else → error (no compatible distribution for this platform)

Cargo Distribution

The cargo distribution installs agents/extensions from crates.io:

{
  "id": "my-rust-extension",
  "distribution": {
    "cargo": {
      "crate": "my-acp-extension",
      "version": "0.1.0"
    }
  }
}

Resolution process:

  1. Version resolution: If no version specified, query crates.io for the latest stable version
  2. Binary discovery: Query crates.io API for the crate’s bin_names field to determine the executable name
  3. Cache check: Look for ~/.symposium/bin/{id}/{version}/bin/{binary}
  4. Installation: If not cached:
    • Try cargo binstall --no-confirm --root {cache-dir} {crate}@{version} (uses prebuilt binaries, fast)
    • If binstall fails or unavailable, fall back to cargo install --root {cache-dir} {crate}@{version} (builds from source)
  5. Cleanup: Delete old versions when installing a new one

The binary field is optional—if omitted, it’s discovered from crates.io. If the crate has multiple binaries, the field is required to disambiguate.

Platform Detection

Map from Node.js to registry platform keys:

| process.platform | process.arch | Registry Key |
|---|---|---|
| darwin | arm64 | darwin-aarch64 |
| darwin | x64 | darwin-x86_64 |
| linux | x64 | linux-x86_64 |
| linux | arm64 | linux-aarch64 |
| win32 | x64 | windows-x86_64 |

CLI Commands

The symposium-acp-agent binary provides registry subcommands:

# List all available agents (built-ins + registry)
symposium-acp-agent registry list

# Resolve an agent ID to an executable command (McpServer JSON)
symposium-acp-agent registry resolve <agent-id>

The registry list output is a JSON array of {id, name, version?, description?} objects.

The registry resolve output is an McpServer JSON object ready for spawning:

{"name":"Agent Name","command":"/path/to/binary","args":["--flag"],"env":[]}

Decisions

  • Binary cleanup: Delete old versions when downloading a new one. No accumulation.
  • Registry caching: Registry is cached in memory during a session and fetched fresh on first access.

Agent Extensions

Agent extensions are proxy components that enrich an agent’s capabilities. They sit between the editor and the agent, adding tools, context, and behaviors.

Built-in Extensions

| ID | Name | Description |
|---|---|---|
| sparkle | Sparkle | AI collaboration identity and embodiment |
| ferris | Ferris | Rust development tools (crate sources, rust researcher) |
| cargo | Cargo | Cargo build and run tools |

Extension Sources

Extensions can come from multiple sources:

  • built-in: Bundled with Symposium (sparkle, ferris, cargo)
  • registry: Installed from the shared agent registry
  • custom: User-defined via executable, npx, pipx, cargo, or URL

Distribution Types

Extensions use the same distribution types as agents (see Agent Registry):

  • local - executable command on the system
  • npx - npm package
  • pipx - Python package
  • cargo - Rust crate from crates.io
  • binary - platform-specific archive download

Configuration

Extensions are passed to symposium-acp-agent via --proxy arguments:

symposium-acp-agent run-with --proxy sparkle --proxy ferris --proxy cargo --agent '...'

Order matters - extensions are applied in the order listed. The first extension is closest to the editor, and the last is closest to the agent.

The special value defaults expands to all known built-in extensions:

--proxy defaults  # equivalent to: --proxy sparkle --proxy ferris --proxy cargo

Registry Format

The shared registry includes both agents and extensions:

{
  "date": "2026-01-07",
  "agents": [...],
  "extensions": [
    {
      "id": "some-extension",
      "name": "Some Extension",
      "version": "1.0.0",
      "description": "Does something useful",
      "distribution": {
        "npx": { "package": "@example/some-extension" }
      }
    }
  ]
}

Architecture

┌─────────────────────────────────────────────────┐
│  Editor Extension (VSCode, Zed, etc.)           │
│  - Manages extension configuration              │
│  - Builds --proxy args for agent spawn          │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│  symposium-acp-agent                            │
│  - Parses --proxy arguments                     │
│  - Resolves extension distributions             │
│  - Builds proxy chain in order                  │
│  - Conductor orchestrates the chain             │
└─────────────────────────────────────────────────┘

Extension Discovery and Recommendations

Symposium can suggest extensions based on a project’s dependencies. This creates a contextual experience where users see relevant extensions for their specific codebase.

Extension Source Naming

Extensions are identified using a source field with multiple options:

source.crate = "foo"           # Rust crate on crates.io
source.acp = "bar"             # Extension ID in ACP registry
source.url = "https://..."     # Direct URL to extension.jsonc

Crate-Defined Recommendations

A crate can recommend extensions to its consumers via Cargo.toml metadata:

[package.metadata.symposium]
# Shorthand for crates.io extensions
recommended = ["foo", "bar"]

# Or explicit with full source specification
[[package.metadata.symposium.recommended]]
source.acp = "some-extension"

When Symposium detects this crate in a user’s dependencies, it surfaces these recommendations.

External Recommendations

Symposium maintains a recommendations file that maps crates to suggested extensions. This allows recommendations without requiring upstream crate changes:

[[recommendation]]
source.crate = "tokio-helper"
when-using-crate = "tokio"

[[recommendation]]
source.crate = "sqlx-helper"
when-using-crates = ["sqlx", "sea-orm"]

Users can add their own recommendation files for custom mappings.

Extension Crate Metadata

When a crate is an extension (not just recommending one), it declares runtime metadata:

[package.metadata.symposium]
binary = "my-extension-bin"        # Optional: if crate has multiple binaries
args = ["--mcp", "--some-flag"]    # Optional: arguments to pass
env = { KEY = "value" }            # Optional: environment variables

Standard package fields (name, description, version) come from [package]. This metadata is used both at runtime and by the GitHub Action that publishes to the ACP registry.

Discovery Flow

  1. Symposium fetches the ACP registry (available extensions and their distributions)
  2. Symposium loads the recommendations file (external mappings)
  3. Symposium scans the user’s Cargo.lock for dependencies
  4. For each dependency, check:
    • Does the recommendations file have an entry with matching when-using-crate(s)?
    • Does the dependency’s Cargo.toml have [package.metadata.symposium.recommended]?
  5. Surface matching extensions in the UI as suggestions
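The matching step (4) amounts to intersecting the dependency set with each recommendation's trigger crates. A minimal sketch, with hypothetical simplified types standing in for the real registry and lockfile structures:

```rust
use std::collections::HashSet;

// Hypothetical simplified form of one entry in a recommendations file.
struct Recommendation {
    extension: &'static str,        // the recommended extension (source id)
    when_using: Vec<&'static str>,  // when-using-crate(s) triggers
}

/// Suggest extensions whose trigger crates appear in the dependency set.
fn suggest(deps: &HashSet<&str>, recs: &[Recommendation]) -> Vec<&'static str> {
    recs.iter()
        .filter(|r| r.when_using.iter().any(|c| deps.contains(c)))
        .map(|r| r.extension)
        .collect()
}

fn main() {
    // Dependencies as scanned from a Cargo.lock.
    let deps: HashSet<&str> = ["tokio", "serde"].into_iter().collect();
    let recs = [
        Recommendation { extension: "tokio-helper", when_using: vec!["tokio"] },
        Recommendation { extension: "sqlx-helper", when_using: vec!["sqlx", "sea-orm"] },
    ];
    assert_eq!(suggest(&deps, &recs), vec!["tokio-helper"]);
}
```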

Data Sources

Source                     Purpose                                Controlled By
ACP Registry               Extension catalog + distribution info  Community
Symposium recommendations  External crate-to-extension mappings   Symposium maintainers
User recommendation files  Custom mappings                        User
Cargo.toml metadata        Crate author recommendations           Crate authors

Future Work

  • Per-extension configuration: Add sub-options for extensions (e.g., which Ferris tools to enable)
  • Extension updates: Check for and apply updates to registry-sourced extensions

Components

Symposium’s functionality is delivered through component proxies that are orchestrated by the internal conductor. Some features use a component/adapter pattern while others are standalone components.

Component Types

Standalone Components

Some components provide functionality that doesn’t depend on upstream capabilities. These components work with any editor and add features purely through the proxy layer.

Example: A component that provides git history analysis through MCP tools doesn’t need special editor support - it can work with the filesystem directly.

Component/Adapter Pairs

Other components rely on primitive capabilities from the upstream editor. For these, Symposium uses a two-layer approach:

Adapter Layer

The adapter sits upstream in the proxy chain and provides primitive capabilities that the component needs.

Responsibilities:

  • Check for required capabilities during initialization
  • Pass requests through if the editor provides the capability
  • Provide fallback implementation if the capability is missing
  • Abstract away editor differences from the component

Example: The IDE Operations adapter checks if the editor supports ide_operations. If not, it can spawn a language server (like rust-analyzer) to provide that capability.

Component Layer

The component sits downstream from its adapter and enriches primitive capabilities into higher-level MCP tools.

Responsibilities:

  • Expose MCP tools to the agent
  • Process tool invocations
  • Send requests upstream through the adapter
  • Return results to the agent

Example: The IDE Operations component exposes an ide_operation MCP tool that accepts Dialect programs and translates them into IDE operation requests sent upstream.

Component Lifecycle

For component/adapter pairs:

  1. Initialization - Adapter receives initialize request from upstream (editor)
  2. Capability Check - Adapter examines editor capabilities
  3. Conditional Spawning - Adapter spawns fallback if capability is missing
  4. Chain Assembly - Conductor wires adapter → component → downstream
  5. Request Flow - Agent calls MCP tool → component → adapter → editor
  6. Response Flow - Results flow back: editor → adapter → component → agent
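Steps 2 and 3 of the lifecycle can be sketched as a capability check that decides whether to pass through or spawn a fallback. The types and names below are illustrative; the real Component trait and initialization types live in the Symposium codebase:

```rust
// Illustrative sketch of an adapter's capability check during initialization.
#[derive(Default)]
struct EditorCapabilities {
    ide_operations: bool, // does the editor provide IDE operations natively?
}

enum Backing {
    Editor,            // pass requests through to the editor
    Fallback(String),  // e.g. a spawned language server command
}

fn resolve_backing(caps: &EditorCapabilities) -> Backing {
    if caps.ide_operations {
        Backing::Editor
    } else {
        // Capability missing: the adapter provides its own implementation.
        Backing::Fallback("rust-analyzer".to_string())
    }
}

fn main() {
    // An editor with no native ide_operations support gets the fallback.
    match resolve_backing(&EditorCapabilities::default()) {
        Backing::Fallback(cmd) => assert_eq!(cmd, "rust-analyzer"),
        Backing::Editor => unreachable!(),
    }
}
```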

Proxy Chain Direction

The proxy chain flows from editor to agent:

Editor → [Adapter] → [Component] → Agent
  • Upstream = toward the editor
  • Downstream = toward the agent

Adapters sit closer to the editor, components sit closer to the agent.

Current Components

Rust Crate Sources

Provides access to published Rust crate source code through an MCP server.

  • Type: Standalone component
  • Implementation: Injects an MCP server that exposes the rust-crate-sources tool
  • Function: Allows agents to fetch and examine source code from crates.io

Sparkle

Provides AI collaboration framework through prompt injection and MCP tooling.

  • Type: Standalone component
  • Implementation: Injects Sparkle MCP server with collaboration tools
  • Function: Enables partnership dynamics, pattern anchors, and meta-collaboration capabilities
  • Documentation: Sparkle docs

Future Components

Additional components can be added following these patterns:

  • IDE Operations - Code navigation and search (likely component/adapter pair)
  • Walkthroughs - Interactive code explanations
  • Git Operations - Repository analysis
  • Build Integration - Compilation and testing workflows

Run Mode

The run subcommand simplifies editor integration by reading agent configuration from a file rather than requiring command-line arguments.

Motivation

Without this mode, editor extensions must either:

  • Hardcode specific agent commands, requiring extension updates to add new agents
  • Expose complex configuration UI for specifying agent commands and proxy options

With run, the extension simply runs:

symposium-acp-agent run

The agent reads its configuration from ~/.symposium/config.jsonc, and if no configuration exists, runs an interactive setup wizard.

Configuration File

Location: ~/.symposium/config.jsonc

The file uses JSONC (JSON with comments) format:

{
  // Downstream agent command (parsed as shell words)
  "agent": "npx -y @zed-industries/claude-code-acp",
  
  // Proxy extensions to enable
  "proxies": [
    { "name": "sparkle", "enabled": true },
    { "name": "ferris", "enabled": true },
    { "name": "cargo", "enabled": true }
  ]
}

Fields

Field    Type    Description
agent    string  Shell command to spawn the downstream agent. Parsed using shell word splitting.
proxies  array   List of proxy extensions with name and enabled fields.

The agent string is parsed as shell words, so commands like npx -y @zed-industries/claude-code-acp work correctly.
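The splitting step looks roughly like the sketch below. Note this uses plain whitespace splitting for illustration; the actual implementation uses the shell-words crate, which additionally handles quoting and escapes:

```rust
// Simplified sketch of turning the `agent` config string into program + args.
// The real code uses shell_words::split, which also handles quoted arguments.
fn split_agent_command(agent: &str) -> Option<(String, Vec<String>)> {
    let mut words = agent.split_whitespace().map(String::from);
    let program = words.next()?; // first word is the executable
    Some((program, words.collect()))
}

fn main() {
    let (program, args) =
        split_agent_command("npx -y @zed-industries/claude-code-acp").unwrap();
    assert_eq!(program, "npx");
    assert_eq!(args, vec!["-y", "@zed-industries/claude-code-acp"]);
}
```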

Runtime Behavior

┌─────────────────────────────────────────┐
│                 run                     │
└─────────────────┬───────────────────────┘
                  │
                  ▼
        ┌─────────────────┐
        │ Config exists?  │
        └────────┬────────┘
                 │
         ┌───────┴───────┐
         │               │
         ▼               ▼
   ┌──────────┐   ┌──────────────┐
   │  Yes     │   │     No       │
   └────┬─────┘   └──────┬───────┘
        │                │
        ▼                ▼
   Load config     Run configuration
   Run agent       agent (setup wizard)

When a configuration file exists, run behaves equivalently to:

symposium-acp-agent run-with \
    --proxy sparkle --proxy ferris --proxy cargo \
    --agent '{"name":"...","command":"npx",...}'

Configuration Agent

When no configuration file exists, Symposium runs a built-in configuration agent instead of a downstream AI agent. This agent:

  1. Presents a numbered list of known agents (Claude Code, Gemini, Codex, Kiro CLI)
  2. Waits for the user to type a number (1-N)
  3. Saves the configuration file with all proxies enabled
  4. Instructs the user to restart their editor

The configuration agent is a simple state machine that expects numeric input. Invalid input causes the prompt to repeat.
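That state machine reduces to a single input-handling function, sketched below (the agent list mirrors the Known Agents table; the function name is illustrative):

```rust
// Sketch of the configuration agent's numeric-input handling.
const KNOWN_AGENTS: &[&str] = &["Claude Code", "Gemini CLI", "Codex", "Kiro CLI"];

/// Returns the chosen agent, or None when the prompt should repeat.
fn handle_input(input: &str) -> Option<&'static str> {
    match input.trim().parse::<usize>() {
        Ok(n) if (1..=KNOWN_AGENTS.len()).contains(&n) => Some(KNOWN_AGENTS[n - 1]),
        _ => None, // not a number, or out of range: repeat the prompt
    }
}

fn main() {
    assert_eq!(handle_input("2"), Some("Gemini CLI"));
    assert_eq!(handle_input("0"), None);       // out of range
    assert_eq!(handle_input("banana"), None);  // not a number
}
```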

Known Agents

The configuration wizard offers these pre-configured agents:

Name         Command
Claude Code  npx -y @zed-industries/claude-code-acp
Gemini CLI   npx -y -- @google/gemini-cli@latest --experimental-acp
Codex        npx -y @zed-industries/codex-acp
Kiro CLI     kiro-cli-chat acp

Users can manually edit ~/.symposium/config.jsonc to use other agents or modify proxy settings.

Implementation

The implementation consists of:

  • Config types: SymposiumUserConfig and ProxyEntry structs in src/symposium-acp-agent/src/config.rs
  • Config loading: load() reads from ~/.symposium/config.jsonc, save() writes it
  • Configuration agent: ConfigurationAgent implements the ACP Component trait
  • CLI integration: Run variant in the Command enum

Dependencies

Crate        Purpose
serde_jsonc  Parse JSON with comments
shell-words  Parse agent command string into arguments
dirs         Cross-platform home directory resolution

Rust Crate Sources Component

The Rust Crate Sources component provides agents with the ability to research published Rust crate source code through a sub-agent architecture.

Architecture Overview

The component uses a sub-agent research pattern: when an agent needs information about a Rust crate, the component spawns a dedicated research session with its own agent to investigate the crate sources and return findings.

Message Flow

sequenceDiagram
    participant Client
    participant Proxy as Crate Sources Proxy
    participant Agent

    Note over Client,Proxy: Initial Session Setup
    Client->>Proxy: NewSessionRequest
    Note right of Proxy: Adds user-facing MCP server<br/>(rust_crate_query tool)
    Proxy->>Agent: NewSessionRequest (with user-facing MCP)
    Agent-->>Proxy: NewSessionResponse(session_id)
    Proxy-->>Client: NewSessionResponse(session_id)

    Note over Agent,Proxy: Research Request
    Agent->>Proxy: ToolRequest(rust_crate_query, crate, prompt)
    Note right of Proxy: Create research session
    Proxy->>Agent: NewSessionRequest (with sub-agent MCP)
    Note right of Proxy: Sub-agent MCP has:<br/>- get_rust_crate_source<br/>- return_response_to_user
    Agent-->>Proxy: NewSessionResponse(research_session_id)
    Proxy->>Agent: PromptRequest(research_session_id, prompt)
    
    Note over Agent: Sub-agent researches crate<br/>Uses get_rust_crate_source<br/>Reads files (auto-approved)
    
    Agent->>Proxy: RequestPermissionRequest(Read)
    Proxy-->>Agent: RequestPermissionResponse(approved)
    
    Agent->>Proxy: ToolRequest(return_response_to_user, findings)
    Proxy-->>Agent: ToolResponse(success)
    Note right of Proxy: Response sent via internal channel
    Proxy-->>Agent: ToolResponse(rust_crate_query result)

Two MCP Servers

The component provides two distinct MCP servers:

  1. User-facing MCP Server - Exposed to the main agent session

    • Tool: rust_crate_query - Initiates crate research
  2. Sub-agent MCP Server - Provided only to research sessions

    • Tool: get_rust_crate_source - Locates crate sources and returns path
    • Tool: return_response_to_user - Returns research findings and ends the session

User-Facing Tool: rust_crate_query

Parameters

{
  crate_name: string,      // Name of the Rust crate
  crate_version?: string,  // Optional semver range (defaults to latest)
  prompt: string           // What to research about the crate
}

Examples

{
  "crate_name": "serde",
  "prompt": "How do I use the derive macro for custom field names?"
}
{
  "crate_name": "tokio",
  "crate_version": "1.0",
  "prompt": "What are the signatures of all methods on tokio::runtime::Runtime?"
}

Behavior

  1. Creates a new research session via NewSessionRequest
  2. Attaches the sub-agent MCP server to that session
  3. Sends the user’s prompt via PromptRequest
  4. Waits for the sub-agent to call return_response_to_user
  5. Returns the sub-agent’s findings as the tool result
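The wait in steps 4-5 is coordinated through a per-session response channel. A minimal sketch, using std::sync::mpsc in place of tokio::sync::oneshot and stubbing out the actual session creation and prompting:

```rust
use std::collections::HashMap;
use std::sync::mpsc;

// Sketch: the rust_crate_query handler registers a response channel for the
// research session, then blocks on it until return_response_to_user fires.
fn rust_crate_query(
    sessions: &mut HashMap<String, mpsc::Sender<String>>,
) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    // Steps 1-2: a NewSessionRequest with the sub-agent MCP would happen here;
    // we register the session's response sender under its session id.
    sessions.insert("research-session-1".to_string(), tx);
    // Step 3: a PromptRequest is sent; step 4: the caller awaits on `rx`.
    rx
}

fn main() {
    let mut sessions = HashMap::new();
    let rx = rust_crate_query(&mut sessions);
    // Step 5: the return_response_to_user handler sends the findings,
    // completing the original rust_crate_query call.
    sessions["research-session-1"].send("findings".to_string()).unwrap();
    assert_eq!(rx.recv().unwrap(), "findings");
}
```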

Sub-Agent Tools

get_rust_crate_source

Locates and extracts the source code for a Rust crate from crates.io.

Parameters:

{
  crate_name: string,
  version?: string  // Semver range
}

Returns:

{
  "crate_name": "serde",
  "version": "1.0.210",
  "checkout_path": "/Users/user/.cargo/registry/src/.../serde-1.0.210",
  "message": "Crate 'serde' version 1.0.210 extracted to ..."
}

The sub-agent can then use Read tool calls (which are auto-approved) to examine the source code.

return_response_to_user

Signals completion of the research and returns findings to the waiting rust_crate_query call.

Parameters:

{
  response: string  // The research findings to return
}

Behavior:

  • Sends the response through an internal channel to the waiting tool handler
  • The original rust_crate_query call completes with this response
  • The research session can then be terminated

Permission Auto-Approval

The component implements a message handler that intercepts RequestPermissionRequest messages from research sessions and automatically approves all permission requests.

Permission Rules

  • Research sessions → All permissions automatically approved
  • Other sessions → Passed through unchanged

Rationale

Research sessions are sandboxed and disposable - they investigate crate sources and return findings. Auto-approving all permissions eliminates the need for dozens of permission prompts while maintaining safety:

  • Research sessions operate on read-only crate sources in the cargo registry cache
  • Sessions are short-lived and focused on a single research task
  • Any side effects are contained within the research session’s scope

Implementation

The handler checks if a permission request comes from a registered research session and automatically selects the first available option (typically “allow”):

if self.state.is_research_session(&req.session_id) {
    // Select first option (typically "allow")
    let response = RequestPermissionResponse {
        outcome: RequestPermissionOutcome::Selected {
            option_id: req.options.first().unwrap().id.clone(),
        },
        meta: None,
    };
    request_cx.respond(response)?;
    return Ok(Handled::Yes);
}
return Ok(Handled::No(message));  // Not our session, propagate unchanged

Session Lifecycle

  1. Agent calls rust_crate_query

    • Handler creates oneshot::channel() for response
    • Registers session in active sessions map
  2. Handler sends NewSessionRequest

    • Includes sub-agent MCP server configuration
    • Receives session_id in response
  3. Handler sends PromptRequest

    • Sends user’s research prompt to the session
    • Awaits response on the oneshot channel
  4. Sub-agent performs research

    • Calls get_rust_crate_source to locate crate
    • Reads source files (auto-approved by permission handler)
    • Analyzes code to answer the prompt
  5. Sub-agent calls return_response_to_user

    • Sends findings through internal channel
    • Original rust_crate_query call receives response
  6. Session cleanup

    • Remove session from active sessions map
    • Session termination (if ACP supports explicit session end)

Shared State

The component uses shared state to coordinate between:

  • The rust_crate_query tool handler (creates sessions, waits for responses)
  • The return_response_to_user tool handler (sends responses)
  • The permission request handler (auto-approves requests from research sessions)

State Structure

struct ResearchSession {
    session_id: SessionId,
    response_tx: oneshot::Sender<String>,
}

// Shared across all handlers
Arc<Mutex<HashMap<SessionId, ResearchSession>>>

Design Decisions

Previous approach: The component exposed get_rust_crate_source with a pattern parameter that performed regex searches across crate sources.

Problems:

  • Agents had to construct exact regex patterns
  • Limited to simple pattern matching
  • No semantic understanding of code structure
  • Single-shot queries couldn’t follow up on findings

Sub-agent approach:

  • The agent describes the information it needs in natural language
  • Sub-agent can perform multiple reads, follow references, understand context
  • Can navigate code structure intelligently
  • Returns synthesized answers, not raw pattern matches

Why Auto-Approve All Permissions?

Research sessions need extensive file access to examine crate sources. Requiring user approval for every operation would create dozens of permission prompts, making the feature unusable.

Safety considerations:

  • Research sessions are sandboxed and disposable
  • Scope is limited to investigating crate sources in cargo registry cache
  • Sessions are short-lived with a focused task
  • Any side effects are contained within the research session

Why Oneshot Channels for Response Coordination?

Each rust_crate_query call creates exactly one research session and expects exactly one response. A oneshot::channel models this perfectly:

  • Type-safe guarantee of single response
  • Clear ownership transfer
  • Automatic cleanup on drop
  • No need to poll or maintain complex state

Integration with Symposium

The component is registered with the conductor in symposium-acp-agent/src/symposium.rs:

proxies.push(DynComponent::new(symposium_ferris::FerrisComponent::new(ferris_config)));

The component implements Component::serve() to:

  1. Register the user-facing MCP server via McpServiceRegistry
  2. Implement message handling for permission requests
  3. Forward all other messages to the successor component

Future Enhancements

  • Session timeouts - Terminate research sessions that take too long
  • Concurrent research - Support multiple research sessions simultaneously
  • Caching - Cache common queries to avoid redundant research
  • Progressive responses - Stream findings as they’re discovered rather than waiting for completion
  • Research history - Allow agents to reference previous research results

VSCode Extension Architecture

The Symposium VSCode extension provides a chat interface for interacting with AI agents. The architecture divides responsibilities across three layers to handle VSCode’s webview constraints while maintaining clean separation of concerns.

Components Overview

mynah-ui: AWS’s open-source chat interface library (github.com/aws/mynah-ui). Provides the chat UI rendering, tab management, and message display. The webview layer uses mynah-ui for all visual presentation.

Agent: Currently a mock implementation (HomerActor) that responds with Homer Simpson quotes. Future implementation will spawn an ACP-compatible agent process (see ACP Integration chapter when available).

Extension activation: VSCode activates the extension when the user first opens the Symposium sidebar or runs a Symposium command. The extension spawns the agent process during activation (or lazily on first use) and keeps it alive for the entire VSCode session.

Three-Layer Model

┌─────────────────────────────────────────────────┐
│  Webview (Browser Context)                      │
│  - mynah-ui rendering                           │
│  - User interaction capture                     │
│  - Tab management                               │
└─────────────────┬───────────────────────────────┘
                  │ VSCode postMessage API
┌─────────────────▼───────────────────────────────┐
│  Extension (Node.js Context)                    │
│  - Message routing                              │
│  - Agent lifecycle                              │
│  - Webview lifecycle                            │
└─────────────────┬───────────────────────────────┘
                  │ Process spawning / stdio
┌─────────────────▼───────────────────────────────┐
│  Agent (Separate Process)                       │
│  - Session management                           │
│  - AI interaction                               │
│  - Streaming responses                          │
└─────────────────────────────────────────────────┘

Why Three Layers?

Webview Isolation

VSCode webviews run in isolated browser contexts without Node.js APIs. This security boundary prevents direct file system access, process spawning, or network operations. The webview can only communicate with the extension through VSCode’s postMessage API.

Design consequence: UI code must be pure browser JavaScript. All privileged operations (spawning agents, workspace access, persistence) happen in the extension layer.

Extension as Coordinator

The extension runs in Node.js with full VSCode API access. It bridges between the isolated webview and external agent processes.

Key responsibilities:

  • Message routing - Translates between webview UI events and agent protocol messages
  • Agent lifecycle - Spawns and manages the agent process
  • Webview lifecycle - Handles visibility changes and ensures messages reach the UI

The extension deliberately avoids understanding message semantics. It routes based on IDs (tab ID, message ID) without interpreting content.

Agent Independence

The agent runs as a separate process communicating via stdio. This isolation provides:

  • Flexibility - Agent can be any executable (Rust, Python, TypeScript)
  • Stability - Agent crashes don’t kill the extension
  • Multiple sessions - Single agent process handles all tabs/conversations

The agent owns all session state and conversation logic. The extension only tracks which tab corresponds to which session.

Communication Boundaries

Webview ↔ Extension

Transport: postMessage API (asynchronous, JSON-serializable messages only)

Direction:

  • Webview → Extension: User actions (new tab, send prompt, close tab)
  • Extension → Webview: Agent responses (response chunks, completion signals)

Why not synchronous? VSCode’s webview API is inherently asynchronous. This forces the UI to be resilient to message delays and webview lifecycle events.

Extension ↔ Agent

Transport: ACP (Agent Client Protocol) over stdio

Direction:

  • Extension → Agent: Session commands (new session, process prompt)
  • Agent → Extension: Streaming responses, session state updates

Why ACP over stdio? ACP provides a standardized protocol for agent communication. Stdio is simple, universal, and works with any language. No need for network sockets or IPC complexity.

Agent Configuration and Sharing

The extension uses AgentConfiguration to determine when agent processes can be shared across tabs. An AgentConfiguration consists of:

  • Agent name (e.g., “ElizACP”, “Claude”)
  • Enabled components (e.g., “symposium-acp”)
  • Workspace folder (the VSCode workspace the agent operates in)

Sharing strategy: Tabs with identical configurations share the same agent actor (process), but each tab gets its own session within that process.

Workspace folder selection:

  • Single workspace: Automatically uses that workspace
  • Multiple workspaces: Prompts user to select which workspace folder to use
  • Each session is created with the workspace folder as its working directory

Rationale:

  • Resource efficiency - Shared actor means one process for multiple tabs with the same config
  • Workspace isolation - Different workspace folders get different actors to maintain proper working directory context
  • Session isolation - Each tab gets its own session ID for conversation independence

Trade-off: Agent must implement multiplexing. Messages include session/tab IDs for routing. Extension maps UI tab IDs to agent session IDs.

Design Principles

Opaque state: Each layer owns its state format. Extension stores but doesn’t parse webview UI state or agent session state.

Graceful degradation: Webview can be hidden/shown at any time. Extension buffers messages when webview is inactive.

UUID-based identity: Tab IDs and message IDs use UUIDs to avoid collisions. Generated at source (webview generates tab IDs, extension generates message IDs) to eliminate coordination overhead.

Minimal coupling: Layers communicate through well-defined message protocols. Webview doesn’t know about agents. Agent doesn’t know about webviews. Extension coordinates without understanding semantics.

End-to-End Flow

Here’s how a complete user interaction flows through the system:

sequenceDiagram
    participant User
    participant VSCode
    participant Extension
    participant Webview
    participant Agent
    
    User->>VSCode: Opens Symposium sidebar
    VSCode->>Extension: activate()
    Extension->>Extension: Generate session ID
    Extension->>Agent: Spawn process
    
    Extension->>Webview: Create webview (inject session ID)
    Webview->>Webview: Load, check session ID vs saved state
    Webview->>Webview: Restore or clear tabs, initialize mynah-ui
    Webview->>Extension: webview-ready (last-seen-index)
    
    User->>Webview: Creates new tab
    Webview->>Webview: Generate tab UUID
    Webview->>Extension: new-tab (tabId)
    Extension->>Agent: new-session
    Agent->>Agent: Initialize session
    Agent->>Extension: session-created (sessionId)
    Extension->>Extension: Store tabId ↔ sessionId mapping
    
    User->>Webview: Sends prompt
    Webview->>Webview: Generate message UUID
    Webview->>Extension: prompt (tabId, messageId, text)
    Extension->>Extension: Lookup sessionId for tabId
    Extension->>Agent: process-prompt (sessionId, text)
    
    loop Streaming response
        Agent->>Extension: response-chunk (sessionId, chunk)
        Extension->>Extension: Lookup tabId for sessionId
        Extension->>Webview: response-chunk (tabId, messageId, chunk)
        Webview->>Webview: Render chunk in mynah-ui
    end
    
    Agent->>Extension: response-complete (sessionId)
    Extension->>Webview: response-complete (tabId, messageId)
    Webview->>Webview: End message stream
    Webview->>Webview: setState() - persist session ID and tabs

The extension maintains tab↔session mappings and handles webview visibility, while the agent maintains session state and generates responses.

See also: Common Issues for recurring bug patterns.

Message Protocol

The extension coordinates message flow between the webview UI and agent process. Messages are identified by UUIDs and routed based on tab/session mappings.

Message Identity

The system uses two separate identification mechanisms:

Message IDs (UUIDs): Identify specific prompt/response conversations. When a user sends a prompt, the webview generates a UUID message ID. All response chunks for that prompt include the same message ID, allowing the UI to associate chunks with the correct prompt and render them in the right place. Message IDs enable multiple concurrent prompts (user sends prompt in tab A while tab B is still streaming a response).

Message indices (numbers): Monotonically increasing integers assigned by the extension per tab, used exclusively for deduplication. When the webview is hidden and shown, the extension may replay messages to ensure nothing was missed. The webview tracks the last index it saw per tab (via lastSeenIndex map) and ignores messages with index <= lastSeenIndex[tabId]. This prevents duplicate response chunks from appearing in the UI.

Why both? Message IDs provide semantic identity (“which conversation is this?”). Message indices provide delivery tracking (“have I seen this before?”). The extension assigns indices sequentially as messages flow through; the webview uses UUIDs for UI routing and indices for deduplication.
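The index-based deduplication can be sketched as a small per-tab map. This is written in Rust for consistency with the rest of the book, though the real webview code is TypeScript, and the type names here are illustrative:

```rust
use std::collections::HashMap;

// Sketch of the webview's per-tab deduplication (the lastSeenIndex map).
struct Dedup {
    last_seen: HashMap<String, u64>, // tabId -> highest index seen
}

impl Dedup {
    /// Returns true if the message is new and should be rendered.
    fn accept(&mut self, tab_id: &str, index: u64) -> bool {
        match self.last_seen.get(tab_id) {
            // Already seen (e.g. replayed after the webview was hidden).
            Some(&last) if index <= last => false,
            _ => {
                self.last_seen.insert(tab_id.to_string(), index);
                true
            }
        }
    }
}

fn main() {
    let mut dedup = Dedup { last_seen: HashMap::new() };
    assert!(dedup.accept("tab-a", 1));
    assert!(dedup.accept("tab-a", 2));
    assert!(!dedup.accept("tab-a", 2)); // replayed chunk is dropped
    assert!(dedup.accept("tab-b", 1));  // indices are tracked per tab
}
```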

Message Flow Patterns

Opening a New Tab

sequenceDiagram
    participant User
    participant Webview
    participant Extension
    participant Agent
    
    User->>Webview: Opens new tab
    Webview->>Webview: Generate tab ID (UUID)
    Webview->>Extension: new-tab (tabId)
    Extension->>Agent: new-session
    Agent->>Agent: Initialize session
    Agent->>Extension: session-created (sessionId)
    Extension->>Extension: Store tabId → sessionId mapping

Why UUID generation in webview? The webview owns tab lifecycle. Generating IDs at the source avoids round-trip coordination with the extension.

Why separate session IDs? The agent owns session identity. Tab IDs are UI concepts; session IDs are agent concepts. The extension maps between them without understanding either.

Sending a Prompt

sequenceDiagram
    participant User
    participant Webview
    participant Extension
    participant Agent
    
    User->>Webview: Types message
    Webview->>Extension: prompt (tabId, messageId, text)
    Extension->>Extension: Lookup sessionId for tabId
    Extension->>Agent: process-prompt (sessionId, text)
    
    loop Streaming response
        Agent->>Extension: response-chunk (sessionId, chunk)
        Extension->>Extension: Lookup tabId for sessionId
        alt Webview visible
            Extension->>Webview: response-chunk (tabId, messageId, chunk)
            Webview->>Webview: Append to message stream
        else Webview hidden
            Extension->>Extension: Buffer message
        end
    end
    
    Agent->>Extension: response-complete (sessionId)
    Extension->>Webview: response-complete (tabId, messageId)
    Webview->>Webview: End message stream

Why streaming? AI responses can take seconds to complete. Streaming provides immediate feedback and allows users to start reading while generation continues.

Why message IDs? Multiple prompts can be in flight simultaneously (user sends prompt in tab A while tab B is still receiving a response). Message IDs ensure response chunks are associated with the correct prompt.

Why buffer when hidden? VSCode can hide webviews at any time (user switches away, collapses sidebar). Buffering ensures the UI sees all messages when it becomes visible again.

Closing a Tab

sequenceDiagram
    participant User
    participant Webview
    participant Extension
    participant Agent
    
    User->>Webview: Closes tab
    Webview->>Extension: close-tab (tabId)
    Extension->>Extension: Lookup sessionId for tabId
    Extension->>Agent: close-session (sessionId)
    Agent->>Agent: Cleanup session state
    Extension->>Extension: Remove tabId → sessionId mapping

Why explicit close messages? Allows agent to clean up resources (free memory, close file handles) rather than leaking session state indefinitely.

Message Identification Strategy

Tab IDs

  • Generated by: Webview (when user creates new tab)
  • Format: UUID v4
  • Scope: UI-only concept
  • Lifetime: From tab creation to tab close

Session IDs

  • Generated by: Agent (in response to new-session)
  • Format: Agent-defined (typically UUID)
  • Scope: Agent-only concept
  • Lifetime: From session creation to session close

Message IDs

  • Generated by: Webview (when user sends prompt)
  • Format: UUID v4
  • Scope: Used by both webview and extension for response routing
  • Lifetime: From prompt send to response complete

Why three separate ID spaces? Each layer owns its identity domain. This avoids coupling and eliminates coordination overhead.

Bidirectional Mapping

The extension maintains two maps:

tabId → sessionId    (for extension → agent messages)
sessionId → tabId    (for agent → extension messages)

Synchronization: Maps are updated atomically when session creation completes. Both directions always stay consistent.

Cleanup: Both mappings are removed when either tab closes or session ends.
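A minimal sketch of this mapping, assuming plain in-memory maps (the class and method names are illustrative):

```typescript
// Hypothetical sketch of the extension's bidirectional tab/session mapping.
class SessionMap {
  private tabToSession = new Map<string, string>();
  private sessionToTab = new Map<string, string>();

  // Both directions are updated together when session creation completes.
  link(tabId: string, sessionId: string): void {
    this.tabToSession.set(tabId, sessionId);
    this.sessionToTab.set(sessionId, tabId);
  }

  sessionFor(tabId: string): string | undefined {
    return this.tabToSession.get(tabId);
  }

  tabFor(sessionId: string): string | undefined {
    return this.sessionToTab.get(sessionId);
  }

  // Removing either side clears both, so the maps never diverge.
  unlinkTab(tabId: string): void {
    const sessionId = this.tabToSession.get(tabId);
    this.tabToSession.delete(tabId);
    if (sessionId !== undefined) this.sessionToTab.delete(sessionId);
  }
}
```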

Message Ordering Guarantees

Within a session: Agent processes prompts sequentially. A second prompt won’t start processing until the first response completes.

Across sessions: No ordering guarantees. Tabs are independent. Multiple sessions can stream responses simultaneously.

Webview messages: Delivered in order sent, but delivery timing depends on webview visibility. Buffered messages are replayed in order when webview becomes visible.

Error Handling

Agent crashes: Extension detects process exit, notifies all active tabs. Tabs display error state. User can trigger agent restart.

Webview disposal: Extension maintains agent sessions. If webview is recreated (VSCode restart), extension can restore tab → session mappings and continue existing sessions.

Message delivery failure: If webview is disposed while messages are buffered, messages are discarded. Agent sessions may continue running. Next webview instantiation can restore session state.

Design Rationale

Why not request/response? Streaming responses require continuous message flow, not single request/reply pairs. The protocol is inherently asynchronous.

Why not share IDs across layers? Each layer has different lifecycle concerns. Decoupling identity spaces allows independent evolution. Extension acts as impedance matcher between UI tab identity and agent session identity.

Why buffer in extension instead of agent? Agent shouldn’t need to know about webview lifecycle. Extension handles VSCode-specific concerns (visibility, disposal) to keep agent implementation portable.

Tool Use Authorization

When agents request permission to execute tools (file operations, terminal commands, etc.), the extension provides a user approval mechanism. This chapter describes how authorization requests flow through the system and how per-agent policies are enforced.

Architecture

The authorization flow bridges three layers:

Agent (ACP requestPermission) → Extension (Promise-based routing) → Webview (MynahUI approval card)

The extension acts as the coordination point:

  • Receives requestPermission callbacks from the ACP agent
  • Checks per-agent bypass settings
  • Routes approval requests to the webview when user input is needed
  • Blocks the agent using promises until the user responds

Authorization Flow

With Bypass Disabled

sequenceDiagram
    participant Agent
    participant Extension
    participant Settings
    participant Webview
    participant User
    
    Agent->>Extension: requestPermission(toolCall, options)
    Extension->>Settings: Check agents[agentName].bypassPermissions
    Settings-->>Extension: false
    Extension->>Extension: Generate approval ID, create pending promise
    Extension->>Webview: approval-request message
    Webview->>User: Display approval card (MynahUI)
    User->>Webview: Click approve/deny/bypass
    Webview->>Extension: approval-response message
    
    alt User selected "Bypass Permissions"
        Extension->>Settings: Set agents[agentName].bypassPermissions = true
    end
    
    Extension->>Extension: Resolve promise with user's choice
    Extension-->>Agent: return RequestPermissionResponse

With Bypass Enabled

sequenceDiagram
    participant Agent
    participant Extension
    participant Settings
    
    Agent->>Extension: requestPermission(toolCall, options)
    Extension->>Settings: Check agents[agentName].bypassPermissions
    Settings-->>Extension: true
    Extension-->>Agent: return allow_once (auto-approved)

Promise-Based Blocking

The ACP SDK’s requestPermission callback must return a Promise<RequestPermissionResponse>, and the agent waits on that promise before executing the tool. The extension creates a promise that resolves only when the user responds:

async requestPermission(params) {
  // Check bypass setting first
  if (agentConfig.bypassPermissions) {
    return { outcome: { outcome: "selected", optionId: allowOptionId } };
  }
  
  // Create promise that will resolve when user responds
  const promise = new Promise((resolve, reject) => {
    pendingApprovals.set(approvalId, { resolve, reject, agentName });
  });
  
  // Send request to webview
  sendToWebview({ type: "approval-request", approvalId, ... });
  
  // Return promise (blocks agent until resolved)
  return promise;
}

When the webview sends approval-response, the extension resolves the promise:

case "approval-response":
  const pending = pendingApprovals.get(message.approvalId);
  if (pending) {
    pending.resolve(message.response);  // Unblocks agent
    pendingApprovals.delete(message.approvalId);  // Clean up pending entry
  }

This allows the agent to block on permission requests without blocking the extension’s event loop.

Per-Agent Settings

Authorization policies are scoped per-agent in symposium.agents configuration:

{
  "symposium.agents": {
    "Claude Code": {
      "command": "npx",
      "args": ["@zed-industries/claude-code-acp"],
      "bypassPermissions": true
    },
    "ElizACP": {
      "command": "elizacp",
      "bypassPermissions": false
    }
  }
}

Why per-agent? Different agents have different trust levels. A user might trust Claude Code with unrestricted file access but want to review every tool call from an experimental agent.

Scope: Settings are stored globally (VSCode user settings), so bypass policies persist across workspaces and sessions.
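A sketch of the bypass check as a pure function, assuming a configuration object shaped like the symposium.agents example above (the real implementation reads VSCode settings; shouldBypass is an illustrative name):

```typescript
// Hypothetical sketch of the per-agent bypass lookup.
interface AgentConfig {
  command: string;
  args?: string[];
  bypassPermissions?: boolean;
}

function shouldBypass(
  agents: Record<string, AgentConfig>,
  agentName: string
): boolean {
  // An unknown agent or an unset flag defaults to prompting the user.
  return agents[agentName]?.bypassPermissions === true;
}
```

Defaulting to false means a freshly configured agent always prompts until the user explicitly grants trust.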

User Approval Options

When bypass is disabled, the webview displays three options:

  • Approve - Allow this single tool call, continue prompting for future tools
  • Deny - Reject this single tool call, continue prompting for future tools
  • Bypass Permissions - Approve this call AND set bypassPermissions = true for this agent permanently

The “Bypass Permissions” option provides a quick path to trusted status without requiring manual settings edits.
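A sketch of how a button click might map onto the approval-response message shape (the Click type and toResponse function are illustrative; option IDs come from the ACP options array):

```typescript
// Hypothetical sketch: translate a clicked button into an approval response.
type Click = "approve" | "deny" | "bypass";

function toResponse(click: Click, allowOptionId: string, denyOptionId: string) {
  // Both "approve" and "bypass" select the allow option for this call.
  const optionId = click === "deny" ? denyOptionId : allowOptionId;
  return {
    response: { outcome: { outcome: "selected" as const, optionId } },
    // bypassAll additionally flips bypassPermissions = true in settings.
    bypassAll: click === "bypass",
  };
}
```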

Webview UI Implementation

The webview uses MynahUI primitives to display approval requests:

  • Chat item - Approval request appears as a chat message in the conversation
  • Buttons - Three buttons (Approve, Deny, Bypass) using MynahUI’s button status colors
  • Tool details - Tool name, parameters (formatted as JSON), and any available metadata
  • Card dismissal - Cards auto-dismiss after the user clicks a button (keepCardAfterClick: false)

The specific MynahUI API usage is documented in the MynahUI GUI reference.

Approval Request Message

Extension → Webview:

{
  type: "approval-request",
  tabId: string,
  approvalId: string,        // UUID for matching response
  agentName: string,          // Which agent is requesting permission
  toolCall: {
    toolCallId: string,       // ACP tool call identifier
    title?: string,           // Human-readable tool name (may be absent)
    kind?: ToolKind,          // "read", "edit", "execute", etc.
    rawInput?: object         // Tool parameters
  },
  options: PermissionOption[] // Available approval options from ACP
}

Approval Response Message

Webview → Extension:

{
  type: "approval-response",
  approvalId: string,         // Matches approval-request
  response: {
    outcome: {
      outcome: "selected",
      optionId: string        // Which option was chosen
    }
  },
  bypassAll: boolean          // True if "Bypass Permissions" clicked
}

Design Decisions

Why block the agent? Tool execution should wait for user consent. Continuing execution while waiting for approval would allow the agent to make progress on non-tool operations, potentially creating race conditions where the user approves a tool call that’s no longer relevant.

Why promise-based? JavaScript promises provide natural blocking semantics. The extension can return immediately (non-blocking event loop) while the agent perceives the call as synchronous (blocking until approval).

Why store in settings? Bypass permissions should persist across sessions. VSCode settings provide durable storage with UI for manual editing if needed.

Why auto-dismiss cards? Once the user responds, the approval card is no longer actionable. Dismissing it keeps the conversation history clean and focused on the actual work.

Future Enhancements

Potential extensions to the authorization system:

  • Per-tool policies - Trust specific tools (e.g., “always allow Read”) while prompting for others
  • Resource-based rules - Auto-approve file reads within certain directories
  • Temporary sessions - “Bypass for this session” option that doesn’t persist
  • Approval history - Log of past approvals for security auditing
  • Batch approvals - Approve multiple pending tool calls at once

Webview State Persistence

The webview must preserve chat history and UI state across hide/show cycles, but clear state when VSCode restarts. This requires distinguishing between temporary hiding and permanent disposal.

The Problem

VSCode webviews face two distinct lifecycle events that look identical from the webview’s perspective:

  1. User collapses sidebar - Webview is hidden but should restore exactly when reopened
  2. VSCode restarts - Webview is disposed and recreated, should start fresh

Both events destroy and recreate the webview DOM. The webview cannot distinguish between them without additional context.

User expectation: Chat history persists within a VSCode session but doesn’t carry over to the next session. Draft text should survive sidebar collapse but not VSCode restart.

Session ID Solution

The extension generates a session ID (UUID) once per VSCode session at activation. This ID is embedded in the webview HTML as a global JavaScript variable (window.SYMPOSIUM_SESSION_ID) in a script tag. The webview reads this variable synchronously on load and compares it against the session ID stored in saved state.

sequenceDiagram
    participant VSCode
    participant Extension
    participant Webview
    
    Note over VSCode: Extension activation
    Extension->>Extension: Generate session ID
    
    Note over VSCode: User opens sidebar
    Extension->>Webview: Create webview with session ID
    Webview->>Webview: Load saved state
    
    alt Session IDs match
        Webview->>Webview: Restore chat history
    else Session IDs don't match (or no saved ID)
        Webview->>Webview: Clear state, start fresh
    end

Why this works:

  • Within a session: Same session ID embedded every time, state restores
  • After restart: New session ID generated, mismatch detected, state cleared

State Structure

The webview maintains three pieces of state:

  1. Session ID - Embedded from extension, used for freshness detection
  2. Last seen index - Message deduplication tracking (see Webview Lifecycle chapter)
  3. Mynah UI tabs - Opaque blob from mynahUI.getAllTabs() containing tab metadata, chat history, and UI configuration for all open tabs

Ownership: Webview owns this state entirely. Extension provides session ID but doesn’t read or interpret webview state. The mynah-ui tabs structure is treated as opaque—the webview saves whatever getAllTabs() returns and restores it via mynah-ui’s initialization config.

Storage: VSCode’s getState()/setState() API. Persists across hide/show cycles and VSCode restarts.

State Lifecycle

Initial Load

  1. Webview reads embedded session ID from window.SYMPOSIUM_SESSION_ID
  2. Webview calls vscode.getState() to load saved state
  3. If savedState.sessionId === window.SYMPOSIUM_SESSION_ID, restore tabs
  4. Otherwise, call vscode.setState(undefined) to clear stale state
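The four steps above can be sketched as a pure function, assuming a SavedState shape mirroring the state structure described earlier (names are illustrative):

```typescript
// Hypothetical sketch of the load-time freshness check.
interface SavedState {
  sessionId: string;                       // from window.SYMPOSIUM_SESSION_ID
  lastSeenIndex: Record<string, number>;   // per-tab dedup tracking
  tabs: unknown;                           // opaque mynah-ui blob
}

// Returns the state to restore, or undefined if it is stale or missing.
function restoreIfFresh(
  saved: SavedState | undefined,
  currentSessionId: string
): SavedState | undefined {
  if (saved === undefined || saved.sessionId !== currentSessionId) {
    return undefined; // stale: caller clears saved state and starts fresh
  }
  return saved;
}
```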

During Use

State is saved after any UI change:

  • User sends a message
  • User opens or closes a tab
  • Agent response is received and rendered

Performance: VSCode’s setState() is optimized for frequent calls. No need to debounce or throttle state saves.

On Restart

  1. Extension activation generates new session ID
  2. Webview loads with new session ID embedded
  3. Session ID mismatch detected (old state has previous session’s ID)
  4. State cleared, webview starts fresh

Message Deduplication

When the webview is hidden and shown, the extension may resend messages to ensure nothing was missed. The webview tracks the last message index seen per tab to avoid duplicates.

Last seen index map: { [tabId: string]: number }

Logic: If incoming message has index <= lastSeenIndex[tabId], ignore it. Otherwise, process and update lastSeenIndex[tabId].

Why needed? Extension buffers messages when webview is hidden (see Webview Lifecycle chapter). Replay strategy is “send everything since last known state” rather than tracking exactly which messages were delivered. Webview deduplicates to avoid showing duplicate response chunks.
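The dedup logic is small enough to sketch in full (the Deduplicator class is an illustrative name, not the actual implementation):

```typescript
// Hypothetical sketch of per-tab deduplication during replay.
type IndexedMessage = { tabId: string; index: number };

class Deduplicator {
  private lastSeen: Record<string, number> = {};

  // Returns true if the message is new and should be processed.
  accept(msg: IndexedMessage): boolean {
    const seen = this.lastSeen[msg.tabId] ?? -1;
    if (msg.index <= seen) return false; // replayed duplicate: ignore
    this.lastSeen[msg.tabId] = msg.index;
    return true;
  }
}
```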

Design Trade-offs

Why not retainContextWhenHidden?

VSCode offers retainContextWhenHidden: true to keep webview alive when hidden. This would eliminate the need for state persistence entirely.

Trade-off: Microsoft documentation warns of “much higher performance overhead.” The webview remains in memory consuming resources even when not visible.

Decision: Use state persistence for lightweight chat interfaces. Reserve retainContextWhenHidden for complex UIs (e.g., embedded IDEs) that cannot be easily serialized.

Why not global state in extension?

Extension could store chat history in globalState instead of webview managing its own state.

Trade-off: Violates state ownership principle. Webview understands mynah-ui structure; extension shouldn’t need to parse or manipulate UI state.

Decision: Webview owns UI state, extension provides coordination (session ID injection). Keeps extension simple and allows mynah-ui to evolve independently.

Why clear on restart instead of persisting?

Chat history could persist across VSCode restarts using globalState or workspace storage.

Trade-off: Users expect fresh sessions on restart. Long-lived history creates stale context and memory accumulation. Workspace-specific persistence could be added later if needed.

Decision: Session-scoped state matches user expectations and reduces complexity. Each VSCode session starts clean.

Migration and Compatibility

Old state without session ID: Treated as stale, cleared on first load. Ensures smooth upgrade path when session ID feature is added.

Future state format changes: Session ID check happens before parsing state structure. Mismatched session ID clears everything, eliminating need for explicit version migration.

Webview Lifecycle Management

VSCode can hide and show webviews at any time based on user actions. The extension must handle visibility changes gracefully to ensure no messages are lost and the UI appears responsive when shown.

Visibility States

A webview has three lifecycle states from the extension’s perspective:

  1. Visible - User can see the webview, messages can be delivered immediately
  2. Hidden - Webview exists but is not visible (sidebar collapsed, tab not focused)
  3. Disposed - Webview destroyed, no communication possible

Key constraint: Hidden webviews cannot receive messages. A postMessage call to a hidden webview does not throw, but the message is silently dropped.

The Hidden Webview Problem

sequenceDiagram
    participant User
    participant Extension
    participant Webview
    participant Agent
    
    User->>Webview: Sends prompt
    Webview->>Extension: prompt message
    Extension->>Agent: Forward prompt
    Agent->>Extension: Start streaming response
    
    Note over User: User collapses sidebar
    Extension->>Extension: Webview hidden (visible = false)
    
    loop Agent still streaming
        Agent->>Extension: response-chunk
        Extension->>Webview: postMessage (silently dropped!)
        Note over Webview: Message lost
    end
    
    Note over User: User reopens sidebar
    Extension->>Extension: Webview visible again
    Note over Webview: Missing chunks, partial response

Without buffering: Messages sent while webview is hidden are lost. When user reopens the sidebar, they see incomplete responses or missing messages entirely.

Message Buffering Strategy

The extension tracks webview visibility and buffers messages when hidden:

sequenceDiagram
    participant Extension
    participant Webview
    participant Agent
    
    Agent->>Extension: response-chunk
    
    alt Webview visible
        Extension->>Webview: Send immediately
    else Webview hidden
        Extension->>Extension: Add to buffer
    end
    
    Note over Extension: Webview becomes visible
    Extension->>Webview: webview-ready request
    Webview->>Extension: last-seen-index
    
    loop For each buffered message
        Extension->>Webview: Send buffered message
        Webview->>Webview: Deduplicate if already seen
    end
    
    Extension->>Extension: Clear buffer

Buffer contents: Any message destined for the webview (response chunks, completion signals, error notifications).

Buffer lifetime: From webview hidden to webview shown. Cleared after replay.

Replay strategy: Send all buffered messages in order. Webview uses last-seen-index tracking (see State Persistence chapter) to ignore duplicates.
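The buffer-or-deliver decision can be sketched as a small class, assuming deliver stands in for webview.postMessage (BufferedChannel is an illustrative name):

```typescript
// Hypothetical sketch of the extension's visibility-aware delivery path.
class BufferedChannel<M> {
  private buffer: M[] = [];
  private visible = false;

  constructor(private deliver: (msg: M) => void) {}

  send(msg: M): void {
    if (this.visible) this.deliver(msg);
    else this.buffer.push(msg); // webview hidden: hold for replay
  }

  // Called on the webview-ready handshake: replay in order, then clear.
  setVisible(visible: boolean): void {
    this.visible = visible;
    if (visible) {
      for (const msg of this.buffer) this.deliver(msg);
      this.buffer = [];
    }
  }
}
```

The webview's own dedup tracking makes blind replay safe, so this class never needs to know which buffered messages were already seen.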

Visibility Detection

The extension monitors visibility using VSCode’s onDidChangeViewState event:

stateDiagram-v2
    [*] --> Created: resolveWebviewView
    Created --> Visible: visible = true
    Visible --> Hidden: visible = false
    Hidden --> Visible: visible = true
    Visible --> Disposed: onDidDispose
    Hidden --> Disposed: onDidDispose
    Disposed --> [*]

Event timing:

  • onDidChangeViewState fires when visible property changes
  • onDidDispose fires after webview is destroyed (too late for cleanup)

Race condition: Messages can arrive between “webview created” and “webview visible.” Extension treats created-but-not-visible as hidden state and buffers messages.

Webview-Ready Handshake

When the webview becomes visible (including initial creation), it announces readiness:

  1. Webview finishes initialization - DOM loads, webview script executes, session ID is checked, state is restored or cleared, mynah-ui is constructed with restored tabs (if any)
  2. Webview sends webview-ready - After mynah-ui initialization completes, webview sends message to extension including current last-seen-index map
  3. Extension replays buffered messages - Extension sends any messages that accumulated while webview was hidden
  4. Extension resumes normal message delivery - New messages are sent immediately as they arrive

Why handshake? Webview needs time to initialize mynah-ui and restore state. Sending messages immediately after visibility change could arrive before UI is ready to process them. The webview signals when it’s actually ready to receive messages rather than the extension guessing based on visibility events.

Why include last-seen-index? Allows extension to avoid resending messages the webview already processed before hiding. Reduces redundant replay.

What triggers webview-ready? The webview sends this message during its initialization script, after the mynah-ui constructor completes and before setting up event handlers. On subsequent hide/show cycles, if mynah-ui remains initialized, the webview can send webview-ready immediately after becoming visible.

Agent Independence

The agent continues running regardless of webview visibility:

  • Prompts sent while webview is hidden are still processed
  • Responses generated while webview is hidden are buffered
  • Sessions remain active across webview hide/show cycles

Why? Agent should not need to know about VSCode-specific concerns. Extension insulates agent from webview lifecycle complexity.

Trade-off: Long-running agent operations may complete while webview is hidden, buffering large amounts of data. If webview remains hidden for extended periods, memory usage grows. Current implementation has no buffer size limit.

Disposal Handling

When the webview is disposed (user closes sidebar permanently, workspace switch), buffered messages are discarded:

  • Buffer is cleared
  • Agent sessions continue running
  • Next webview creation can restore tab → session mappings

Why not save buffered messages? Messages are ephemeral rendering updates. State persistence (see State Persistence chapter) handles durable state. Buffering is purely a delivery mechanism for real-time updates.

Design Rationale

Why buffer in extension instead of agent? Webview lifecycle is VSCode-specific. Agent shouldn’t need VSCode-specific logic. Extension handles UI framework concerns.

Why replay all messages instead of tracking delivered? Simpler implementation. Webview deduplication is cheap (index comparison). Tracking exactly which messages were delivered requires more complex state management.

Why not queue in webview? Webview is destroyed/recreated when hidden in some cases. Can’t rely on webview maintaining queue across lifecycle events. Extension has stable lifecycle tied to VSCode session.

Why immediate send when visible? Minimize latency. Users expect real-time streaming responses. Buffering only when necessary provides best UX.

VSCode Extension Integration Testing Guide

Table of Contents

  1. Overview
  2. Testing Types
  3. Setting Up Integration Tests
  4. Writing Integration Tests
  5. Testing Webviews
  6. Advanced Testing Scenarios
  7. Testing Best Practices
  8. Debugging Tests
  9. Common Patterns
  10. Tools and Libraries

Overview

VSCode extension testing involves multiple layers, with integration tests being crucial for verifying that your extension works correctly with the VSCode API in a real VSCode environment.

Why Integration Tests Matter:

  • Unit tests can’t verify VSCode API interactions
  • Extensions can break due to VSCode API changes
  • Manual testing doesn’t scale as extensions grow
  • Integration tests catch issues that unit tests miss

Key Principle: Follow the test pyramid - most tests should be fast unit tests, with a smaller number of integration tests for critical workflows.


Testing Types

Unit Tests

  • Test pure logic in isolation
  • No VSCode API required
  • Fast and can run in any environment
  • Use standard frameworks (Mocha, Jest, etc.)
  • Good for: utility functions, data transformations, business logic

Integration Tests

  • Run inside a real VSCode instance (Extension Development Host)
  • Have access to full VSCode API
  • Test extension behavior with actual VSCode
  • Slower but more realistic
  • Good for: command execution, UI interactions, API integrations

End-to-End Tests

  • Automate the full VSCode UI using tools like WebdriverIO or Playwright
  • Most complex to set up
  • Test complete user workflows
  • Good for: complex UIs, webviews, full user journeys

Setting Up Integration Tests

Option 1: Using the VSCode Test CLI

The modern approach uses the official VSCode test CLI.

Installation:

npm install --save-dev @vscode/test-cli @vscode/test-electron

package.json configuration:

{
  "scripts": {
    "test": "vscode-test"
  }
}

Create .vscode-test.js or .vscode-test.mjs:

import { defineConfig } from '@vscode/test-cli';

export default defineConfig({
  files: 'out/test/**/*.test.js',
  version: 'stable', // or 'insiders' or specific version like '1.85.0'
  workspaceFolder: './test-workspace',
  mocha: {
    ui: 'tdd',
    timeout: 20000
  }
});

Run tests:

npm test

Option 2: Using @vscode/test-electron Directly

For more control over the test runner.

Installation:

npm install --save-dev @vscode/test-electron mocha

Create src/test/runTest.ts:

import * as path from 'path';
import { runTests } from '@vscode/test-electron';

async function main() {
  try {
    // The folder containing the Extension Manifest package.json
    const extensionDevelopmentPath = path.resolve(__dirname, '../../');
    
    // The path to test runner
    const extensionTestsPath = path.resolve(__dirname, './suite/index');
    
    // Optional: specific workspace to open
    const testWorkspace = path.resolve(__dirname, '../../test-fixtures');
    
    // Download VS Code, unzip it and run the integration test
    await runTests({
      extensionDevelopmentPath,
      extensionTestsPath,
      launchArgs: [
        testWorkspace,
        '--disable-extensions' // Disable other extensions during testing
      ]
    });
  } catch (err) {
    console.error('Failed to run tests', err);
    process.exit(1);
  }
}

main();

Create src/test/suite/index.ts (test runner):

import * as path from 'path';
import * as Mocha from 'mocha';
import { glob } from 'glob';

export function run(): Promise<void> {
  const mocha = new Mocha({
    ui: 'tdd',
    color: true,
    timeout: 20000
  });

  const testsRoot = path.resolve(__dirname, '.');

  return new Promise((resolve, reject) => {
    glob('**/**.test.js', { cwd: testsRoot }).then((files) => {
      // Add files to the test suite
      files.forEach(f => mocha.addFile(path.resolve(testsRoot, f)));

      try {
        // Run the mocha test
        mocha.run(failures => {
          if (failures > 0) {
            reject(new Error(`${failures} tests failed.`));
          } else {
            resolve();
          }
        });
      } catch (err) {
        reject(err);
      }
    }).catch((err) => {
      reject(err);
    });
  });
}

Project Structure

your-extension/
├── src/
│   ├── extension.ts
│   └── test/
│       ├── runTest.ts
│       └── suite/
│           ├── index.ts
│           ├── extension.test.ts
│           └── other.test.ts
├── test-fixtures/          # Optional test workspace
│   └── sample-file.txt
├── .vscode/
│   └── launch.json         # Debug configuration
└── package.json

Writing Integration Tests

Basic Test Structure

import * as assert from 'assert';
import * as vscode from 'vscode';

suite('Extension Test Suite', () => {
  vscode.window.showInformationMessage('Start all tests.');

  test('Sample test', () => {
    assert.strictEqual(-1, [1, 2, 3].indexOf(5));
    assert.strictEqual(-1, [1, 2, 3].indexOf(0));
  });

  test('Extension should be present', () => {
    assert.ok(vscode.extensions.getExtension('your-publisher.your-extension'));
  });

  test('Should register commands', async () => {
    const commands = await vscode.commands.getCommands(true);
    assert.ok(commands.includes('your-extension.yourCommand'));
  });
});

Testing Commands

test('Execute command should work', async () => {
  const result = await vscode.commands.executeCommand('your-extension.yourCommand');
  assert.ok(result);
  assert.strictEqual(result.status, 'success');
});

Testing with Documents and Editors

test('Should modify document', async () => {
  // Create a new document
  const doc = await vscode.workspace.openTextDocument({
    content: 'Hello World',
    language: 'plaintext'
  });

  // Open it in an editor
  const editor = await vscode.window.showTextDocument(doc);

  // Execute your command that modifies the document
  await vscode.commands.executeCommand('your-extension.formatDocument');

  // Assert the document was modified
  assert.strictEqual(doc.getText(), 'HELLO WORLD');

  // Clean up
  await vscode.commands.executeCommand('workbench.action.closeActiveEditor');
});

Asynchronous Operations and Waiting

function waitForCondition(
  condition: () => boolean,
  timeout: number = 5000,
  message?: string
): Promise<void> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();
    const interval = setInterval(() => {
      if (condition()) {
        clearInterval(interval);
        resolve();
      } else if (Date.now() - startTime > timeout) {
        clearInterval(interval);
        reject(new Error(message || 'Timeout waiting for condition'));
      }
    }, 50);
  });
}

test('Wait for extension activation', async () => {
  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  
  if (!extension!.isActive) {
    await extension!.activate();
  }

  await waitForCondition(
    () => extension!.isActive,
    5000,
    'Extension did not activate'
  );

  assert.ok(extension!.isActive);
});

Testing Events

test('Should trigger onDidChangeTextDocument', async () => {
  const doc = await vscode.workspace.openTextDocument({
    content: 'Test',
    language: 'plaintext'
  });

  let eventFired = false;
  const disposable = vscode.workspace.onDidChangeTextDocument(e => {
    if (e.document === doc) {
      eventFired = true;
    }
  });

  const editor = await vscode.window.showTextDocument(doc);
  await editor.edit(edit => {
    edit.insert(new vscode.Position(0, 0), 'Hello ');
  });

  await waitForCondition(() => eventFired, 2000);
  assert.ok(eventFired, 'Event should have fired');

  disposable.dispose();
});

Testing Webviews

Testing webviews is challenging because they run in an isolated context. There are several approaches:

Approach 1: Message-Based Test Hooks

Extension Side - Add Test Hooks:

class ChatPanel {
  private panel: vscode.WebviewPanel;
  private messageHandlers: Map<string, (message: any) => void> = new Map();

  constructor(extensionUri: vscode.Uri) {
    this.panel = vscode.window.createWebviewPanel(
      'chat',
      'Chat',
      vscode.ViewColumn.One,
      {
        enableScripts: true,
        retainContextWhenHidden: true
      }
    );

    this.panel.webview.onDidReceiveMessage(message => {
      // Handle normal messages
      if (message.type === 'userMessage') {
        this.handleUserMessage(message.text);
      }
      
      // Handle test messages (only in test environment)
      if (process.env.VSCODE_TEST_MODE === 'true') {
        if (message.type === 'test:state') {
          const handler = this.messageHandlers.get('state');
          handler?.(message);
        }
      }
    });
  }

  // Public method for tests to get state
  public requestState(): Promise<any> {
    return new Promise((resolve) => {
      this.messageHandlers.set('state', (message) => {
        resolve(message.data);
        this.messageHandlers.delete('state');
      });
      this.panel.webview.postMessage({ type: 'test:getState' });
    });
  }

  // Method to send messages to webview
  public sendMessage(text: string) {
    this.handleUserMessage(text);
  }

  private handleUserMessage(text: string) {
    // Your normal message handling logic
    // ...
    
    // Send to webview
    this.panel.webview.postMessage({
      type: 'agentResponse',
      text: 'Response to: ' + text
    });
  }
}

Webview Side - Add Test Handlers:

// In your webview HTML/JS
const vscode = acquireVsCodeApi();

let messages = [];

// Handle messages from extension
window.addEventListener('message', event => {
  const message = event.data;
  
  if (message.type === 'agentResponse') {
    messages.push(message);
    updateUI();
  }
  
  // Test-specific handlers
  if (message.type === 'test:getState') {
    vscode.postMessage({
      type: 'test:state',
      data: {
        messages: messages,
        // other state...
      }
    });
  }
});

// Handle user input
function sendMessage(text) {
  vscode.postMessage({
    type: 'userMessage',
    text: text
  });
}

Integration Test:

suite('Chat Webview Tests', () => {
  let chatPanel: ChatPanel;

  setup(async () => {
    // Set test mode
    process.env.VSCODE_TEST_MODE = 'true';
    
    // Create chat panel
    chatPanel = new ChatPanel(extensionUri);
  });

  teardown(async () => {
    // Clean up
    await vscode.commands.executeCommand('workbench.action.closeAllEditors');
    process.env.VSCODE_TEST_MODE = 'false';
  });

  test('Chat state persistence', async () => {
    // Send a message
    chatPanel.sendMessage('Hello');
    
    // Wait for response
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Get state before closing
    const stateBefore = await chatPanel.requestState();
    assert.strictEqual(stateBefore.messages.length, 1);
    
    // Close and reopen
    await vscode.commands.executeCommand('workbench.action.closePanel');
    await new Promise(resolve => setTimeout(resolve, 100));
    
    // Reopen chat
    chatPanel = new ChatPanel(extensionUri);
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Verify state persisted
    const stateAfter = await chatPanel.requestState();
    assert.strictEqual(stateAfter.messages.length, 1);
    assert.strictEqual(stateAfter.messages[0].text, 'Response to: Hello');
  });
});

Approach 2: Direct Extension-Side Testing

If your webview logic mostly lives on the extension side, test the handlers directly:

test('Handle user message', async () => {
  const chatPanel = new ChatPanel(extensionUri);
  
  // Simulate message from webview by calling the handler directly
  await chatPanel.handleWebviewMessage({
    type: 'userMessage',
    text: 'Test message'
  });
  
  // Verify the extension's state changed
  const messages = chatPanel.getMessages();
  assert.strictEqual(messages.length, 1);
  assert.strictEqual(messages[0].user, 'Test message');
});

Approach 3: Using WebdriverIO for True E2E Webview Testing

For complex webview UIs where you need to test the actual DOM:

Installation:

npm install --save-dev @wdio/cli @wdio/mocha-framework wdio-vscode-service

wdio.conf.ts:

import path from 'path';

export const config = {
  specs: ['./test/e2e/**/*.test.ts'],
  capabilities: [{
    browserName: 'vscode',
    browserVersion: 'stable',
    'wdio:vscodeOptions': {
      extensionPath: path.join(__dirname, '.'),
      userSettings: {
        'window.dialogStyle': 'custom'
      }
    }
  }],
  services: ['vscode'],
  framework: 'mocha',
  mochaOpts: {
    ui: 'bdd',
    timeout: 60000
  }
};

E2E Test:

describe('Chat Webview E2E', () => {
  it('should allow typing and sending messages', async () => {
    const workbench = await browser.getWorkbench();
    
    // Open your chat panel
    await browser.executeWorkbench((vscode) => {
      vscode.commands.executeCommand('your-extension.openChat');
    });
    
    // Wait for webview to appear
    await browser.pause(1000);
    
    // Switch to webview frame
    const webview = await $('iframe.webview');
    await browser.switchToFrame(webview);
    
    // Interact with webview DOM
    const input = await $('input[type="text"]');
    await input.setValue('Hello from E2E test');
    
    const sendButton = await $('button[type="submit"]');
    await sendButton.click();
    
    // Verify response appears
    const messages = await $$('.message');
    expect(messages).toHaveLength(2); // User message + bot response
  });
});

Advanced Testing Scenarios

Testing with Mock Dependencies

// Create a mock agent for deterministic testing
class MockAgent {
  async sendMessage(text: string): Promise<string> {
    // Return deterministic responses for testing
    if (text.includes('hello')) {
      return 'Hi there!';
    }
    return 'I received: ' + text;
  }
}

// Inject mock in tests
test('Chat with mock agent', async () => {
  const mockAgent = new MockAgent();
  const chatPanel = new ChatPanel(extensionUri, mockAgent);
  
  chatPanel.sendMessage('hello');
  await waitForCondition(() => chatPanel.getMessages().length > 0);
  
  const messages = chatPanel.getMessages();
  assert.strictEqual(messages[0].response, 'Hi there!');
});

Testing State Serialization

test('Serialize and restore webview state', async () => {
  const chatPanel = new ChatPanel(extensionUri);
  
  // Add some state
  chatPanel.sendMessage('First message');
  await new Promise(resolve => setTimeout(resolve, 200));
  
  chatPanel.sendMessage('Second message');
  await new Promise(resolve => setTimeout(resolve, 200));
  
  // Get serialized state
  const state = chatPanel.getSerializedState();
  assert.ok(state);
  assert.ok(state.messages);
  
  // Close panel
  chatPanel.dispose();
  
  // Create new panel with saved state
  const newChatPanel = ChatPanel.restore(extensionUri, state);
  
  // Verify state was restored
  const messages = newChatPanel.getMessages();
  assert.strictEqual(messages.length, 2);
  assert.strictEqual(messages[0].text, 'First message');
});

Testing with File System

import * as fs from 'fs/promises';
import * as path from 'path';
import * as os from 'os';

suite('File Operations', () => {
  let tempDir: string;

  setup(async () => {
    // Create temp directory for test files
    tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'vscode-test-'));
  });

  teardown(async () => {
    // Clean up temp files
    await fs.rm(tempDir, { recursive: true, force: true });
  });

  test('Should read and process files', async () => {
    // Create test file
    const testFile = path.join(tempDir, 'test.txt');
    await fs.writeFile(testFile, 'test content');
    
    // Open file in VSCode
    const doc = await vscode.workspace.openTextDocument(testFile);
    await vscode.window.showTextDocument(doc);
    
    // Execute your command
    await vscode.commands.executeCommand('your-extension.processFile');
    
    // Verify results
    const content = await fs.readFile(testFile, 'utf-8');
    assert.strictEqual(content, 'PROCESSED: test content');
  });
});

Testing Extension Configuration

test('Should respect configuration changes', async () => {
  const config = vscode.workspace.getConfiguration('your-extension');
  
  // Set test configuration
  await config.update('someSetting', 'testValue', 
    vscode.ConfigurationTarget.Global);
  
  // Execute command that uses config
  const result = await vscode.commands.executeCommand('your-extension.useConfig');
  
  assert.strictEqual(result.settingValue, 'testValue');
  
  // Clean up
  await config.update('someSetting', undefined, 
    vscode.ConfigurationTarget.Global);
});

Testing Best Practices

1. Isolation

  • Each test should be independent
  • Clean up resources in teardown()
  • Don’t rely on test execution order
  • Close editors and panels after tests

2. Determinism

  • Use mock agents or services for predictable behavior
  • Avoid timing dependencies where possible
  • Use proper wait conditions instead of arbitrary sleeps
  • Control randomness (use seeds for random data)

3. Speed

  • Keep integration tests focused
  • Don’t test every edge case in integration tests
  • Use unit tests for detailed logic testing
  • Disable unnecessary extensions with --disable-extensions

4. Clarity

  • Use descriptive test names
  • Comment complex setup/teardown logic
  • Group related tests in suites
  • Keep tests readable and maintainable

5. Reliability

  • Handle asynchronous operations properly
  • Use appropriate timeouts
  • Add retry logic for flaky operations
  • Log failures for debugging

Test Helpers

Create reusable test utilities:

// test/helpers.ts
export async function createTestDocument(
  content: string, 
  language: string = 'plaintext'
): Promise<vscode.TextDocument> {
  const doc = await vscode.workspace.openTextDocument({
    content,
    language
  });
  return doc;
}

export async function closeAllEditors(): Promise<void> {
  await vscode.commands.executeCommand('workbench.action.closeAllEditors');
}

export function waitForExtensionActivation(
  extensionId: string
): Promise<void> {
  return new Promise((resolve, reject) => {
    const extension = vscode.extensions.getExtension(extensionId);
    if (!extension) {
      reject(new Error(`Extension ${extensionId} not found`));
      return;
    }
    
    if (extension.isActive) {
      resolve();
      return;
    }
    
    extension.activate()
      .then(() => resolve())
      .catch(reject);
  });
}

export class Deferred<T> {
  promise: Promise<T>;
  resolve!: (value: T) => void;
  reject!: (error: Error) => void;

  constructor() {
    this.promise = new Promise((resolve, reject) => {
      this.resolve = resolve;
      this.reject = reject;
    });
  }
}

Debugging Tests

VSCode Launch Configuration

Add to .vscode/launch.json:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Extension Tests",
      "type": "extensionHost",
      "request": "launch",
      "runtimeExecutable": "${execPath}",
      "args": [
        "--extensionDevelopmentPath=${workspaceFolder}",
        "--extensionTestsPath=${workspaceFolder}/out/test/suite/index",
        "--disable-extensions"
      ],
      "outFiles": [
        "${workspaceFolder}/out/test/**/*.js"
      ],
      "preLaunchTask": "npm: compile"
    }
  ]
}

Debugging Tips

  1. Set breakpoints in your test files
  2. Use Debug Console to inspect variables
  3. Run single tests by using .only():
    test.only('This test will run alone', () => {
      // ...
    });
    
  4. Use console.log for quick debugging
  5. Check Extension Development Host output for extension logs

Running Specific Tests

# Run all tests
npm test

# Run tests matching pattern
npm test -- --grep "specific test name"

# Run with more verbose output
npm test -- --reporter spec

Common Patterns

Pattern: Testing Command Registration

test('Commands should be registered', async () => {
  const commands = await vscode.commands.getCommands(true);
  const expectedCommands = [
    'your-extension.command1',
    'your-extension.command2',
    'your-extension.command3'
  ];
  
  for (const cmd of expectedCommands) {
    assert.ok(
      commands.includes(cmd),
      `Command ${cmd} should be registered`
    );
  }
});

Pattern: Testing Status Bar Items

test('Should show status bar item', async () => {
  // Trigger action that creates status bar item
  await vscode.commands.executeCommand('your-extension.showStatus');
  
  // Status bar items aren't directly testable via API,
  // so test the underlying state
  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  const statusItem = (extension?.exports as any).statusBarItem;
  
  assert.ok(statusItem);
  assert.strictEqual(statusItem.text, '$(check) Ready');
});

Pattern: Testing Tree Views

test('Tree view should show items', async () => {
  // Get your tree data provider
  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  const treeProvider = (extension?.exports as any).treeDataProvider;
  
  // Get root items
  const items = await treeProvider.getChildren();
  
  assert.ok(items.length > 0);
  assert.strictEqual(items[0].label, 'Expected Item');
});

Pattern: Testing Quick Picks

test('Quick pick should show options', async () => {
  // This is tricky - quick picks block execution
  // One approach is to test the logic that generates options
  
  const extension = vscode.extensions.getExtension('your-publisher.your-extension');
  const getQuickPickItems = (extension?.exports as any).getQuickPickItems;
  
  const items = await getQuickPickItems();
  
  assert.strictEqual(items.length, 3);
  assert.strictEqual(items[0].label, 'Option 1');
});

Tools and Libraries

Core Testing Tools

  • @vscode/test-cli: Official CLI for running tests (recommended)
  • @vscode/test-electron: Lower-level test runner for Desktop VSCode
  • @vscode/test-web: Test runner for web extensions
  • Mocha: Test framework used by VSCode (TDD or BDD style)

Additional Testing Tools

  • WebdriverIO + wdio-vscode-service: E2E testing with webview support
  • vscode-extension-tester: Alternative E2E testing tool by Red Hat
  • Sinon: Mocking and stubbing library
  • Chai: Assertion library (alternative to Node’s assert)

Useful Utilities

// Helper to wait for promises with timeout
export function withTimeout<T>(
  promise: Promise<T>, 
  timeoutMs: number
): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) => 
      setTimeout(() => reject(new Error('Timeout')), timeoutMs)
    )
  ]);
}

// Helper to retry flaky operations
export async function retry<T>(
  fn: () => Promise<T>,
  attempts: number = 3,
  delay: number = 100
): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      if (i === attempts - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Retry failed');
}

Example: Complete Test Suite

Here’s a complete example putting it all together:

import * as assert from 'assert';
import * as vscode from 'vscode';
import { ChatPanel } from '../../chatPanel';

suite('Chat Extension Test Suite', () => {
  let extensionUri: vscode.Uri;
  let chatPanel: ChatPanel | undefined;

  suiteSetup(async () => {
    // Run once before all tests
    const extension = vscode.extensions.getExtension('your-publisher.your-extension');
    assert.ok(extension);
    
    if (!extension.isActive) {
      await extension.activate();
    }
    
    extensionUri = extension.extensionUri;
  });

  setup(() => {
    // Run before each test
    process.env.VSCODE_TEST_MODE = 'true';
  });

  teardown(async () => {
    // Run after each test
    if (chatPanel) {
      chatPanel.dispose();
      chatPanel = undefined;
    }
    await vscode.commands.executeCommand('workbench.action.closeAllEditors');
    process.env.VSCODE_TEST_MODE = 'false';
  });

  test('Extension should be present', () => {
    assert.ok(vscode.extensions.getExtension('your-publisher.your-extension'));
  });

  test('Chat command should be registered', async () => {
    const commands = await vscode.commands.getCommands(true);
    assert.ok(commands.includes('your-extension.openChat'));
  });

  test('Should create chat panel', async () => {
    chatPanel = new ChatPanel(extensionUri);
    assert.ok(chatPanel);
  });

  test('Should send and receive messages', async function() {
    this.timeout(5000);
    
    chatPanel = new ChatPanel(extensionUri);
    
    // Send message
    chatPanel.sendMessage('Hello');
    
    // Wait for response
    await new Promise(resolve => setTimeout(resolve, 1000));
    
    const state = await chatPanel.requestState();
    assert.ok(state.messages.length > 0);
  });

  test('Should persist state across panel close/reopen', async function() {
    this.timeout(10000);
    
    // Create panel and send message
    chatPanel = new ChatPanel(extensionUri);
    chatPanel.sendMessage('Test message');
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Get state
    const stateBefore = await chatPanel.requestState();
    const messageCount = stateBefore.messages.length;
    
    // Serialize and dispose
    const serialized = chatPanel.getSerializedState();
    chatPanel.dispose();
    chatPanel = undefined;
    
    // Wait a bit
    await new Promise(resolve => setTimeout(resolve, 200));
    
    // Restore
    chatPanel = ChatPanel.restore(extensionUri, serialized);
    await new Promise(resolve => setTimeout(resolve, 500));
    
    // Verify
    const stateAfter = await chatPanel.requestState();
    assert.strictEqual(stateAfter.messages.length, messageCount);
  });
});

Summary

Integration testing for VSCode extensions requires:

  1. Proper setup using @vscode/test-cli or @vscode/test-electron
  2. Strategic testing - focus on critical workflows, use unit tests for details
  3. Webview testing via message-passing or E2E tools like WebdriverIO
  4. Good practices - isolation, determinism, proper cleanup
  5. Debugging support with launch configurations

Testing webviews specifically requires creative approaches since they run in isolated contexts. The message-passing pattern works well for integration tests, while WebdriverIO is better for true E2E testing of complex UIs.

Remember: integration tests are slower than unit tests, so use them strategically for testing VSCode API interactions and critical user workflows.

Testing Implementation

This chapter documents the testing framework architecture for the VSCode extension, explaining how tests are structured and how to extend the testing system with new capabilities.

Architecture

Test Infrastructure

The test suite uses @vscode/test-cli which downloads and runs a VSCode instance, loads the extension in development mode, and executes Mocha tests in the extension host context.

Configuration in .vscode-test.mjs:

{
  files: "out/test/**/*.test.js",
  version: "stable",
  workspaceFolder: "./test-workspace",
  mocha: { ui: "tdd", timeout: 20000 }
}

Tests run with:

npm test

Testing API Design

Rather than coupling tests to implementation details, the extension exposes a command-based testing API. Tests invoke VSCode commands which delegate to public testing methods on ChatViewProvider.

Pattern:

// In extension.ts - register test command
context.subscriptions.push(
  vscode.commands.registerCommand("symposium.test.commandName", 
    async (arg1, arg2) => {
      return await chatProvider.testingMethod(arg1, arg2);
    }
  )
);

// In test - invoke via command
const result = await vscode.commands.executeCommand(
  "symposium.test.commandName", 
  arg1, 
  arg2
);

Current Testing Commands:

  • symposium.test.simulateNewTab(tabId) - Create a tab
  • symposium.test.getTabs() - Get list of tab IDs
  • symposium.test.sendPrompt(tabId, prompt) - Send prompt to tab
  • symposium.test.startCapturingResponses(tabId) - Begin capturing agent responses
  • symposium.test.getResponse(tabId) - Get accumulated response text
  • symposium.test.stopCapturingResponses(tabId) - Stop capturing

Adding New Test Commands

To test new behavior:

  1. Add public method to ChatViewProvider (or relevant class):
export class ChatViewProvider {
  // Existing test methods...
  
  public async newTestingMethod(param: string): Promise<ResultType> {
    // Implementation that exposes needed behavior
    return result;
  }
}
  2. Register command in extension.ts:
context.subscriptions.push(
  vscode.commands.registerCommand(
    "symposium.test.newCommand",
    async (param: string) => {
      return await chatProvider.newTestingMethod(param);
    }
  )
);
  3. Use in tests:
test("Should test new behavior", async () => {
  const result = await vscode.commands.executeCommand(
    "symposium.test.newCommand",
    "test-param"
  );
  assert.strictEqual(result.expected, true);
});

Structured Logging for Assertions

Tests verify behavior through structured log events rather than console scraping.

Logger Architecture:

export class Logger {
  private outputChannel: vscode.OutputChannel;
  private eventEmitter = new vscode.EventEmitter<LogEvent>();
  
  public get onLog(): vscode.Event<LogEvent> {
    return this.eventEmitter.event;
  }
  
  public info(category: string, message: string, data?: any): void {
    const event: LogEvent = { 
      timestamp: new Date(), 
      level: "info", 
      category, 
      message, 
      data 
    };
    this.eventEmitter.fire(event);
    this.outputChannel.appendLine(/* formatted output */);
  }
}

Dual Purpose:

  • Testing - Event emitter allows tests to capture and assert on events
  • Live Debugging - Output channel shows logs in VSCode Output panel

Usage in Tests:

const logEvents: LogEvent[] = [];
const disposable = logger.onLog((event) => logEvents.push(event));

// ... perform test actions ...

const relevantEvents = logEvents.filter(
  e => e.category === "agent" && e.message === "Session created"
);
assert.strictEqual(relevantEvents.length, 2);

Adding New Log Points

To make behavior testable:

  1. Add log statement in implementation:
logger.info("category", "Descriptive message", {
  relevantData: value,
  moreContext: other
});
  2. Filter and assert in tests:
const events = logEvents.filter(
  e => e.category === "category" && e.message === "Descriptive message"
);
assert.ok(events.length > 0);
assert.strictEqual(events[0].data.relevantData, expectedValue);

Log Categories:

  • webview - Webview lifecycle events
  • agent - Agent spawning, sessions, communication
  • Add new categories as needed for different subsystems

Design Decisions

Command-Based Testing API

Alternative: Direct access to ChatViewProvider internals from tests

Chosen: Command-based testing API

Rationale:

  • Decouples tests from implementation details
  • Tests the same code paths as real usage
  • Allows refactoring without breaking tests
  • Commands document the testing interface

Real Agents vs Mocks

Alternative: Mock agent responses with canned data

Chosen: Real ElizACP over ACP protocol

Rationale:

  • Tests the full protocol stack (JSON-RPC, stdio, conductor)
  • Verifies conductor integration
  • Catches protocol-level bugs
  • Provides realistic timing and behavior

ElizACP is lightweight, deterministic, and fast enough for testing.

Event-Based Logging

Alternative: Console output scraping with regex

Chosen: Event emitter with structured data

Rationale:

  • Enables precise assertions on event counts and data
  • Provides rich context for debugging
  • Output panel visibility for live debugging
  • No brittle string matching
  • Same infrastructure serves testing and development

Test Isolation

Challenge: Tests share VSCode instance, agent processes persist across tests

Strategy: Make tests order-independent:

  • Assert “spawned OR reused” rather than exact counts
  • Focus on test-specific events (e.g., prompts sent, responses received)
  • Capture logs from test start, not globally
  • Don’t assume clean state between tests

This allows the test suite to pass regardless of execution order.
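
The "spawned OR reused" idea can be captured in a small predicate over captured log events. This is a hypothetical sketch — the category and message strings here are illustrative, not the extension's actual log strings:

```typescript
interface LogEvent {
  category: string;
  message: string;
}

// Order-independent check: the agent may have been spawned by this test
// or reused from an earlier one; either outcome is acceptable.
function agentAvailable(events: LogEvent[]): boolean {
  return events.some(
    e =>
      e.category === 'agent' &&
      (e.message === 'Agent spawned' || e.message === 'Agent reused')
  );
}
```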

Writing Tests

Tests live in src/test/*.test.ts and use Mocha’s TDD interface:

suite("Feature Tests", () => {
  test("Should do something", async function() {
    this.timeout(20000); // Extend timeout for async operations
    
    // Setup log capture
    const logEvents: LogEvent[] = [];
    const disposable = logger.onLog((event) => logEvents.push(event));
    
    // Perform test actions via commands
    await vscode.commands.executeCommand("symposium.test.doSomething");
    
    // Wait for async completion
    await new Promise(resolve => setTimeout(resolve, 1000));
    
    // Assert on results
    const events = logEvents.filter(/* ... */);
    assert.ok(events.length > 0);
    
    disposable.dispose();
  });
});

Key Patterns:

  • Use async function() (not arrow functions) to access this.timeout()
  • Extend timeout for operations involving agent spawning
  • Always dispose log listeners
  • Add delays for async operations (agent responses, UI updates)

Extension Packaging

This chapter documents the design decisions for building and distributing the VSCode extension.

Architecture Overview

The extension consists of two parts that must be bundled together:

  1. TypeScript code - The extension logic and webview, bundled via webpack
  2. Native binary - The symposium-acp-agent Rust binary for the target platform

Platform-Specific Extensions

We publish separate extensions for each platform rather than a universal extension containing all binaries.

Rationale:

  • A universal extension would be ~70MB+ (all platform binaries)
  • Platform-specific extensions are ~7MB each
  • VSCode Marketplace natively supports this - users automatically get the right variant
  • Aligns with how other extensions with native dependencies work (rust-analyzer, etc.)

Supported platforms:

Platform       Description
darwin-arm64   macOS Apple Silicon
darwin-x64     macOS Intel
linux-x64      Linux x86_64
linux-arm64    Linux ARM64
win32-x64      Windows x86_64

Binary Resolution

The extension resolves the conductor binary in precedence order:

  1. User override via the symposium.acpAgentPath setting (if set, this path is used verbatim)
  2. Bundled binary in bin/<platform>/ (production)
  3. PATH lookup (development)

This enables local development without packaging - developers can cargo install the binary and the extension finds it in PATH.
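
A simplified TypeScript sketch of this resolution chain. The function name and directory layout are illustrative, and a real implementation would also need to handle the .exe suffix on Windows:

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Resolve the conductor binary: an explicit user setting wins, then the
// bundled platform binary, then fall back to a PATH lookup.
export function resolveAgentBinary(
  userSetting: string | undefined,
  extensionRoot: string
): string {
  if (userSetting) {
    return userSetting; // used verbatim
  }
  const platformDir = `${process.platform}-${process.arch}`;
  const bundled = path.join(
    extensionRoot, 'bin', platformDir, 'symposium-acp-agent'
  );
  if (fs.existsSync(bundled)) {
    return bundled;
  }
  return 'symposium-acp-agent'; // let the OS search PATH
}
```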

Release Flow

Releases are orchestrated through release-plz and GitHub Actions:

release-plz creates tag
        ↓
GitHub Release created
        ↓
Binary build workflow triggered
        ↓
┌───────────────────────────────────────┐
│  Build binaries (parallel)            │
│  - macOS arm64/x64                    │
│  - Linux x64/arm64/musl               │
│  - Windows x64                        │
└───────────────────────────────────────┘
        ↓
Upload archives to GitHub Release
        ↓
┌───────────────────────────────────────┐
│  Build VSCode extensions (parallel)   │
│  - One per platform                   │
│  - Each bundles its platform binary   │
└───────────────────────────────────────┘
        ↓
Upload .vsix files to GitHub Release
        ↓
Publish to marketplaces (TODO)

Why GitHub Releases as the source:

  • Single source of truth for all binaries
  • Enables Zed extension (points to release archives)
  • Enables direct downloads for users not on VSCode
  • Versioned and immutable

Vendored mynah-ui

The extension depends on a fork of mynah-ui (AWS’s chat UI component) located in vendor/mynah-ui. This is managed as a git subtree.

Why vendor:

  • Enables custom features not yet upstream
  • Webpack bundles it into webview.js - only the built output ships in the extension

Local Development

For development without building platform packages:

  1. Install the conductor: cargo install --path src/symposium-acp-agent
  2. Build the extension: cd vscode-extension && npm run compile
  3. Launch via F5 in VSCode

The extension finds the binary in PATH when no bundled binary exists.

Extension UI (VSCode)

This chapter covers VSCode-specific UI for managing extensions. For general extension concepts, see Agent Extensions.

Configuration Storage

Extensions are configured via the symposium.extensions VS Code setting:

"symposium.extensions": [
  { "id": "sparkle", "_enabled": true, "_source": "built-in" },
  { "id": "ferris", "_enabled": true, "_source": "built-in" },
  { "id": "cargo", "_enabled": true, "_source": "built-in" }
]

Custom extensions include their distribution:

{
  "id": "my-extension",
  "_enabled": true,
  "_source": "custom",
  "name": "My Extension",
  "distribution": {
    "npx": { "package": "@myorg/my-extension" }
  }
}

Default behavior - when no setting exists, all built-in extensions are enabled. If the user returns to the default configuration, the key is removed from settings.json entirely.
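
A sketch of that default handling. The helper names are hypothetical; only the behavior described above is taken from the source:

```typescript
const BUILT_IN_IDS = ['sparkle', 'ferris', 'cargo'];

interface ExtensionEntry {
  id: string;
  _enabled: boolean;
  _source: string;
}

// No setting stored: treat every built-in extension as enabled.
function effectiveExtensions(
  setting: ExtensionEntry[] | undefined
): ExtensionEntry[] {
  if (setting === undefined) {
    return BUILT_IN_IDS.map(id => ({
      id, _enabled: true, _source: 'built-in'
    }));
  }
  return setting;
}

// When the stored list matches the default exactly, the key can be
// removed from settings.json instead of persisting a redundant value.
function isDefaultConfiguration(setting: ExtensionEntry[]): boolean {
  return (
    setting.length === BUILT_IN_IDS.length &&
    setting.every(
      (e, i) =>
        e.id === BUILT_IN_IDS[i] && e._enabled && e._source === 'built-in'
    )
  );
}
```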

Settings UI

The Settings panel includes an Extensions section where users can:

  • Enable/disable extensions via checkbox
  • Reorder extensions by dragging the handle
  • Delete extensions from the list
  • Add extensions via the “+ Add extension” link, which opens a QuickPick dialog

Add Extension Dialog

The QuickPick dialog shows three sections:

  1. Built-in - sparkle, ferris, cargo (greyed out if already added)
  2. From Registry - extensions from the shared registry with type: "extension"
  3. Add Custom Extension:
    • From executable on your system (local command/path)
    • From npx package
    • From pipx package
    • From cargo crate
    • From URL to extension.json (GitHub URLs auto-converted to raw)

Spawn Integration

When spawning an agent, the extension builds --proxy arguments from enabled extensions:

symposium-acp-agent run-with --proxy sparkle --proxy ferris --proxy cargo --agent '...'

Only enabled extensions are passed, in their configured order.
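
Building that argument list is a small pure function. A sketch (the helper name is hypothetical):

```typescript
interface ConfiguredExtension {
  id: string;
  _enabled: boolean;
}

// Translate the symposium.extensions setting into --proxy arguments,
// keeping the configured order and dropping disabled entries.
function buildProxyArgs(extensions: ConfiguredExtension[]): string[] {
  return extensions
    .filter(ext => ext._enabled)
    .flatMap(ext => ['--proxy', ext.id]);
}
```

For example, with `ferris` disabled, `buildProxyArgs` on the three built-ins yields `['--proxy', 'sparkle', '--proxy', 'cargo']`.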

Language Model Provider

Experimental: This feature is disabled by default. Set symposium.enableExperimentalLM: true in VS Code settings to enable it.

This chapter describes the architecture for exposing ACP agents as VS Code Language Models via the LanguageModelChatProvider API (introduced in VS Code 1.104). This allows ACP agents to appear in VS Code’s model picker and be used by any extension that consumes the Language Model API.

Current Status

The Language Model Provider is experimental and may not be the right approach for Symposium.

What works:

  • Basic message flow between VS Code LM API and ACP agents
  • Session management with committed/provisional history model
  • Tool bridging architecture (both directions)

Known issues:

  • Tool invocation fails when multiple VS Code-provided tools are bridged to the agent. A single isolated tool works correctly, but when multiple tools are available, the model doesn’t invoke them properly. The root cause is not yet understood.

Open question: VS Code LM consumers (like GitHub Copilot) inject their own context into requests - project details, file contents, editor state, etc. ACP agents like Claude Code also inject their own context. When both layers add context, they may “fight” each other, confusing the model. The LM API may be better suited for raw model access rather than wrapping agents that have their own context management.

Overview

The Language Model Provider bridges VS Code’s stateless Language Model API to ACP’s stateful session model. When users select “Symposium” in the model picker, requests are routed through Symposium to the configured ACP agent.

┌─────────────────────────────────────────────────────────────────┐
│                         VS Code                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │              Language Model Consumer                       │  │
│  │         (Copilot, other extensions, etc.)                 │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
│                            │                                    │
│                            ▼                                    │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │           LanguageModelChatProvider (TypeScript)          │  │
│  │                                                           │  │
│  │  - Thin adapter layer                                     │  │
│  │  - Serializes VS Code API calls to JSON-RPC               │  │
│  │  - Forwards to Rust process                               │  │
│  │  - Deserializes responses, streams back via progress      │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
└────────────────────────────┼────────────────────────────────────┘
                             │ JSON-RPC (stdio)
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│              symposium-acp-agent vscodelm                       │
│                                                                 │
│  - Receives serialized VS Code LM API calls                    │
│  - Manages session state                                        │
│  - Routes to ACP agent (or Eliza for prototype)                │
│  - Streams responses back                                       │
└─────────────────────────────────────────────────────────────────┘

Design Decisions

TypeScript/Rust Split

The TypeScript extension is a thin adapter:

  • Registers as LanguageModelChatProvider
  • Serializes provideLanguageModelChatResponse calls to JSON-RPC
  • Sends to Rust process over stdio
  • Deserializes responses and streams back via progress callback

The Rust process handles all logic:

  • Session management
  • Message history tracking
  • ACP protocol (future)
  • Response streaming

This keeps the interesting logic in Rust where it’s testable and maintainable.

Session Management

VS Code’s Language Model API is stateless: each request includes the full message history. ACP sessions are stateful. The Rust backend bridges this gap using a History Actor that tracks session state.

Architecture

graph LR
    VSCode[VS Code] <--> HA[History Actor]
    HA <--> SA[Session Actor]
    SA <--> Agent[ACP Agent]

  • History Actor: Receives requests from VS Code, tracks message history, identifies new messages
  • Session Actor: Manages the ACP agent connection, handles streaming responses

Committed and Provisional History

The History Actor maintains two pieces of state:

  • Committed: Complete (User, Assistant)* message pairs that VS Code has acknowledged. Always ends with an assistant message (or is empty).
  • Provisional: The current in-flight exchange: one user message U and the assistant response parts A we’ve sent so far (possibly empty).

Commit Flow

When we receive a new request, we compare its history against committed + provisional:

sequenceDiagram
    participant VSCode as VS Code
    participant HA as History Actor
    participant SA as Session Actor

    Note over HA: committed = [], provisional = (U1, [])
    
    SA->>HA: stream parts P1, P2, P3
    Note over HA: provisional = (U1, [P1, P2, P3])
    HA->>VSCode: stream P1, P2, P3
    
    SA->>HA: done streaming
    HA->>VSCode: response complete
    
    VSCode->>HA: new request with history [U1, A1, U2]
    Note over HA: matches committed + provisional + new user msg
    Note over HA: commit: committed = [U1, A1]
    Note over HA: provisional = (U2, [])
    HA->>SA: new_messages = [U2], canceled = false

The new user message U2 confirms that VS Code received and accepted our assistant response A1. We commit the exchange and start fresh with U2.

Cancellation via History Mismatch

If VS Code sends a request that doesn’t include our provisional content, the provisional work was rejected:

sequenceDiagram
    participant VSCode as VS Code
    participant HA as History Actor
    participant SA as Session Actor

    Note over HA: committed = [U1, A1], provisional = (U2, [P1, P2])
    
    VSCode->>HA: new request with history [U1, A1, U3]
    Note over HA: doesn't match committed + provisional
    Note over HA: discard provisional
    Note over HA: provisional = (U3, [])
    HA->>SA: new_messages = [U3], canceled = true
    
    SA->>SA: cancel downstream agent

This happens when:

  • User cancels the chat in VS Code
  • User rejects a tool confirmation
  • User sends a different message while we were responding

The Session Actor receives canceled = true and propagates cancellation to the downstream ACP agent.
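Both the commit and cancellation flows come down to one comparison: does the incoming history extend committed + provisional, or not? As an illustrative sketch (the real implementation lives in the Rust History Actor; all names here are hypothetical), the matching logic might look like:

```typescript
type Role = "user" | "assistant";
interface Msg { role: Role; text: string }

interface MatchResult {
  newMessages: Msg[];
  canceled: boolean;
}

function matchHistory(
  committed: Msg[],
  provisionalUser: Msg | null,
  provisionalParts: string[],
  incoming: Msg[],
): MatchResult {
  // Expected prefix if VS Code accepted our provisional work:
  // committed ++ [U, A], where A joins the parts streamed so far.
  const expected = [...committed];
  if (provisionalUser !== null) {
    expected.push(provisionalUser);
    expected.push({ role: "assistant", text: provisionalParts.join("") });
  }
  const matches =
    expected.length <= incoming.length &&
    expected.every(
      (m, i) => m.role === incoming[i].role && m.text === incoming[i].text,
    );
  if (matches) {
    // Commit: everything past the prefix is new input for the agent.
    return { newMessages: incoming.slice(expected.length), canceled: false };
  }
  // Mismatch: the provisional work was rejected. Discard it and signal
  // cancellation; the messages after committed history are the fresh start.
  return { newMessages: incoming.slice(committed.length), canceled: true };
}
```

This simplified sketch assumes the committed prefix itself always matches, which holds in the scenarios above.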

Agent Configuration

The agent to use is specified per-request via the agent field in the JSON-RPC protocol. This is an AgentDefinition enum:

type AgentDefinition =
  | { eliza: { deterministic?: boolean } }
  | { mcp_server: McpServerStdio };

interface McpServerStdio {
  name: string;
  command: string;
  args: string[];
  env: Array<{ name: string; value: string }>;
}

The TypeScript extension reads the agent configuration from VS Code settings via the agent registry, resolves the distribution to get the actual command, and includes it in each request. The Rust backend dispatches based on the variant:

  • eliza: Uses the in-process Eliza chatbot (useful for testing)
  • mcp_server: Spawns an external ACP agent process and manages sessions
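The dispatch itself happens in the Rust backend, but the shape of it can be sketched in TypeScript against the AgentDefinition type above (the describeAgent helper is purely illustrative):

```typescript
interface McpServerStdio {
  name: string;
  command: string;
  args: string[];
  env: Array<{ name: string; value: string }>;
}

type AgentDefinition =
  | { eliza: { deterministic?: boolean } }
  | { mcp_server: McpServerStdio };

function describeAgent(agent: AgentDefinition): string {
  if ("eliza" in agent) {
    // In-process Eliza chatbot; deterministic mode is useful for tests.
    return `eliza (deterministic=${agent.eliza.deterministic ?? false})`;
  }
  // External ACP agent: spawn the configured command with its args/env.
  return `spawn ${agent.mcp_server.command} ${agent.mcp_server.args.join(" ")}`;
}
```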

JSON-RPC Protocol

The protocol between TypeScript and Rust mirrors the LanguageModelChatProvider interface.

Requests (TypeScript → Rust)

lm/provideLanguageModelChatResponse

Each request includes the agent configuration via the agent field, which is an AgentDefinition enum with two variants:

External ACP agent (mcp_server):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "lm/provideLanguageModelChatResponse",
  "params": {
    "modelId": "symposium",
    "messages": [
      { "role": "user", "content": [{ "type": "text", "value": "Hello" }] }
    ],
    "agent": {
      "mcp_server": {
        "name": "my-agent",
        "command": "/path/to/agent",
        "args": ["--flag"],
        "env": [{ "name": "KEY", "value": "value" }]
      }
    }
  }
}

Built-in Eliza (for testing):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "lm/provideLanguageModelChatResponse",
  "params": {
    "modelId": "symposium-eliza",
    "messages": [
      { "role": "user", "content": [{ "type": "text", "value": "Hello" }] }
    ],
    "agent": {
      "eliza": { "deterministic": true }
    }
  }
}

The Rust backend dispatches on the variant, spawning an external process for mcp_server or using the in-process Eliza for eliza.

Notifications (Rust → TypeScript)

lm/responsePart - Streams response chunks

{
  "jsonrpc": "2.0",
  "method": "lm/responsePart",
  "params": {
    "requestId": 1,
    "part": { "type": "text", "value": "How " }
  }
}

lm/responseComplete - Signals end of response

{
  "jsonrpc": "2.0",
  "method": "lm/responseComplete",
  "params": {
    "requestId": 1
  }
}

Response

After all parts are streamed, the request completes:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {}
}

Implementation Status

  • Rust: vscodelm subcommand in symposium-acp-agent
  • Rust: JSON-RPC message parsing
  • Rust: Eliza integration for testing
  • Rust: Response streaming
  • Rust: Configurable agent backend (McpServer support)
  • Rust: Session actor with ACP session management
  • TypeScript: LanguageModelChatProvider registration
  • TypeScript: JSON-RPC client over stdio
  • TypeScript: Progress callback integration
  • TypeScript: Agent configuration from settings
  • End-to-end test with real ACP agent

Tool Bridging

See Language Model Tool Bridging for the design of how tools flow between VS Code and ACP agents. This covers:

  • VS Code-provided tools (shuttled to agent via synthetic MCP server)
  • Agent-internal tools (permission requests surfaced via symposium-agent-action)
  • Handle state management across requests
  • Cancellation and history matching

Future Work

  • Session caching with message history diffing
  • Token counting heuristics
  • Model metadata from agent capabilities

Language Model Tool Bridging

This chapter describes how Symposium bridges tool calls between VS Code’s Language Model API and ACP agents. For the general session management model (committed/provisional history), see Language Model Provider.

Tool Call Categories

There are two categories of tools:

  1. Agent-internal tools - Tools the ACP agent manages via its own MCP servers (e.g., bash, file editing)
  2. VS Code-provided tools - Tools that VS Code extensions offer to the model

Agent-Internal Tools

ACP agents have their own MCP servers providing tools. The agent can execute these directly, but may request permission first via ACP’s session/request_permission.

When an agent requests permission, Symposium surfaces this to VS Code using a special tool called symposium-agent-action.

How Tool Calls Fit the Session Model

A tool call creates a multi-step exchange within the committed/provisional model:

  1. User sends message U1 → provisional = (U1, [])
  2. Agent streams response, ends with tool call → provisional = (U1, [text..., ToolCall])
  3. VS Code shows confirmation UI, user approves
  4. VS Code sends new request with [U1, A1, ToolResult]
  5. ToolResult is a new user message, so we commit (U1, A1) → committed = [U1, A1], provisional = (ToolResult, [])
  6. Agent continues with the tool result

The key insight: the tool result is just another user message from the session model’s perspective. It triggers a commit of the previous exchange.

Permission Approved Flow

sequenceDiagram
    participant VSCode as VS Code
    participant HA as History Actor
    participant SA as Session Actor
    participant Agent as ACP Agent

    Note over HA: committed = [], provisional = (U1, [])
    
    Agent->>SA: session/request_permission ("run bash")
    SA->>HA: emit ToolCall part for symposium-agent-action
    Note over HA: provisional = (U1, [ToolCall])
    HA->>VSCode: stream ToolCall part
    HA->>VSCode: response complete
    
    Note over VSCode: show confirmation UI
    Note over VSCode: user approves
    
    VSCode->>HA: request with [U1, A1, ToolResult]
    Note over HA: matches committed + provisional + new user msg
    Note over HA: commit: committed = [U1, A1]
    Note over HA: provisional = (ToolResult, [])
    HA->>SA: new_messages = [ToolResult], canceled = false
    
    SA->>Agent: allow-once
    Agent->>Agent: execute tool internally
    Agent->>SA: continue streaming response

Permission Rejected Flow

When the user rejects a tool (or cancels the chat), VS Code sends a request that doesn’t include our tool call:

sequenceDiagram
    participant VSCode as VS Code
    participant HA as History Actor
    participant SA as Session Actor
    participant Agent as ACP Agent

    Note over HA: committed = [], provisional = (U1, [ToolCall])
    
    Note over VSCode: show confirmation UI
    Note over VSCode: user rejects (cancels chat)
    
    VSCode->>HA: request with [U2]
    Note over HA: doesn't include our ToolCall
    Note over HA: discard provisional
    Note over HA: provisional = (U2, [])
    HA->>SA: new_messages = [U2], canceled = true
    
    SA->>Agent: session/cancel
    Note over SA: start fresh with U2

Session Actor Tool Use Handling

The Session Actor uses a peek/consume pattern when waiting for tool results:

sequenceDiagram
    participant HA as History Actor
    participant SA as Session Actor
    participant RR as RequestResponse

    SA->>RR: send_tool_use(call_id, name, input)
    RR->>HA: emit ToolCall part
    RR->>RR: drop prompt_tx (signal complete)
    
    RR->>HA: peek next ModelRequest
    
    alt canceled = false, exactly one ToolResult
        RR->>HA: consume request
        RR->>SA: Ok(SendToolUseResult)
    else canceled = true
        Note over RR: don't consume
        RR->>SA: Err(Canceled)
    else unexpected content (ToolResult + other messages)
        Note over RR: don't consume, treat as canceled
        RR->>SA: Err(Canceled)
    end

When Err(Canceled) is returned:

  1. The outer loop cancels the downstream agent
  2. It loops around and sees the unconsumed ModelRequest
  3. Processes new messages, ignoring orphaned ToolResult parts
  4. Starts a fresh prompt

The “unexpected content” case handles the edge case where VS Code sends both a tool result and additional user content. Rather than trying to handle this complex state, we treat it as a soft cancellation and start fresh.
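The peek/consume pattern can be sketched as follows, using a simple queue with hypothetical names (ModelRequest, RequestQueue, awaitToolResult are illustrative, not the real Rust types):

```typescript
interface ModelRequest {
  canceled: boolean;
  newMessages: Array<{ kind: "tool_result" | "user_text"; value: string }>;
}

class RequestQueue {
  private items: ModelRequest[] = [];
  push(r: ModelRequest) { this.items.push(r); }
  peek(): ModelRequest | undefined { return this.items[0]; }
  consume(): ModelRequest | undefined { return this.items.shift(); }
}

type ToolUseOutcome =
  | { ok: true; result: string }
  | { ok: false; reason: "canceled" };

function awaitToolResult(queue: RequestQueue): ToolUseOutcome {
  const next = queue.peek();
  if (next === undefined) throw new Error("no pending request");
  const onlyToolResult =
    !next.canceled &&
    next.newMessages.length === 1 &&
    next.newMessages[0].kind === "tool_result";
  if (onlyToolResult) {
    queue.consume(); // happy path: take the request off the queue
    return { ok: true, result: next.newMessages[0].value };
  }
  // Canceled, or a tool result mixed with other content: leave the
  // request unconsumed so the outer loop can process it as a fresh prompt.
  return { ok: false, reason: "canceled" };
}
```

The key point the sketch captures: only the happy path consumes the request; every other case leaves it in place for the outer loop.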

VS Code-Provided Tools

VS Code consumers pass tools to the model via options.tools[] in each request. These are tools implemented by VS Code extensions (e.g., “search workspace”, “read file”).

To expose these to an ACP agent, Symposium creates a synthetic MCP server that:

  1. Offers the same tools that VS Code provided
  2. When the agent invokes a tool, emits a ToolCall to VS Code and waits
  3. Returns the result from VS Code to the agent
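The bridging idea can be sketched with a pending-promise map: the synthetic server parks each agent-side invocation on a promise until the matching ToolResult arrives in a later VS Code request. All names below (VsCodeToolBridge, EmitToolCall) are illustrative assumptions, not the real implementation:

```typescript
type EmitToolCall = (callId: string, name: string, input: unknown) => void;

class VsCodeToolBridge {
  private pending = new Map<string, (result: string) => void>();

  constructor(private emit: EmitToolCall) {}

  // Called when the ACP agent invokes a bridged tool: emit a ToolCall
  // part toward VS Code and wait for the result to come back.
  invoke(callId: string, name: string, input: unknown): Promise<string> {
    this.emit(callId, name, input);
    return new Promise((resolve) => this.pending.set(callId, resolve));
  }

  // Called when VS Code's next request carries the ToolResult.
  deliverResult(callId: string, result: string): void {
    const resolve = this.pending.get(callId);
    if (resolve !== undefined) {
      this.pending.delete(callId);
      resolve(result);
    }
  }
}
```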

VS Code Tool Flow

sequenceDiagram
    participant VSCode as VS Code
    participant HA as History Actor
    participant SA as Session Actor
    participant Agent as ACP Agent
    participant MCP as Synthetic MCP

    VSCode->>HA: request with options.tools[]
    HA->>SA: forward tools list
    SA->>MCP: update available tools
    
    Agent->>MCP: invoke tool "search_workspace"
    MCP->>SA: tool invocation
    SA->>HA: emit ToolCall part
    HA->>VSCode: stream ToolCall, complete
    
    Note over VSCode: invoke tool, get result
    
    VSCode->>HA: request with ToolResult
    HA->>SA: new_messages = [ToolResult], canceled = false
    SA->>MCP: return result
    MCP->>Agent: tool result

Implementation Status

Agent-Internal Tools (Implemented)

The permission flow for agent-internal tools is implemented:

  • TypeScript: symposium-agent-action tool in agentActionTool.ts
  • Rust: Session actor handles session/request_permission, emits ToolCall parts
  • History matching: History actor tracks committed/provisional, detects approval/rejection

VS Code-Provided Tools (Implemented)

The synthetic MCP server for bridging VS Code-provided tools is implemented:

  • Rust: VscodeToolsMcpServer in vscodelm/vscode_tools_mcp.rs implements rmcp::ServerHandler
  • Integration: Session actor creates one MCP server per session, injects it via with_mcp_server()
  • Tool list: Updated on each VS Code request via VscodeToolsHandle
  • Tool invocation: Session actor handles invocations from the MCP server using tokio::select!, emits ToolCall to VS Code, waits for ToolResult

Limitations

VS Code Tool Rejection Cancels Entire Chat

When a user rejects a tool in VS Code’s confirmation UI, the entire chat is cancelled. This is a VS Code limitation (GitHub #241039). Symposium handles this by detecting the cancellation via history mismatch.

No Per-Tool Rejection Signaling

VS Code doesn’t tell the model that a tool was rejected: the cancelled turn simply doesn’t appear in history. The model has no memory of what it tried.

Tool Approval Levels Managed by VS Code

VS Code manages approval persistence (single use, session, workspace, always). Symposium just receives the result.

Implementation Status

This chapter tracks what’s been implemented, what’s in progress, and what’s planned for the VSCode extension.

Core Architecture

  • Three-layer architecture (webview/extension/agent)
  • Message routing with UUID-based identification
  • HomerActor mock agent with session support
  • Webview state persistence with session ID checking
  • Message buffering when webview is hidden
  • Message deduplication via last-seen-index tracking

Error Handling

  • Agent crash detection (partially implemented - detection works, UI error display incomplete)
  • Complete error recovery UX (restart agent button, error notifications)
  • Agent health monitoring and automatic restart

Agent Lifecycle

  • Agent spawn on extension activation (partially implemented - spawn/restart works, graceful shutdown incomplete)
  • Graceful agent shutdown on extension deactivation
  • Agent process supervision and restart on crash

ACP Protocol Support

Connection & Lifecycle

  • Client-side connection (ClientSideConnection)
  • Protocol initialization and capability negotiation
  • Session creation (newSession)
  • Prompt sending (prompt)
  • Streaming response handling (sessionUpdate)
  • Session cancellation (session/cancel)
  • Session mode switching (session/set_mode)
  • Model selection (session/set_model)
  • Authentication flow

Tool Permissions

  • Permission request callback (requestPermission)
  • MynahUI approval cards with approve/deny/bypass options
  • Per-agent bypass permissions in settings
  • Settings UI for managing bypass permissions
  • Automatic approval when bypass enabled

Session Updates

The client receives sessionUpdate notifications from the agent. Current support:

  • agent_message_chunk - Display streaming text in chat UI
  • tool_call - Logged to console (not displayed in UI)
  • tool_call_update - Logged to console (not displayed in UI)
  • Execution plans - Not implemented
  • Thinking steps - Not implemented
  • Custom update types - Not implemented

Gap: Tool calls are logged but not visually displayed. Users don’t see which tools are being executed or their progress.

File System Capabilities

  • readTextFile - Stub implemented (throws “not yet implemented”)
  • writeTextFile - Stub implemented (throws “not yet implemented”)

Current state: We advertise fs.readTextFile: false and fs.writeTextFile: false in capabilities, so agents know we don’t support file operations.

Why not implemented: Requires VSCode workspace API integration and security considerations (which files can be accessed, path validation, etc.).

Terminal Capabilities

  • createTerminal - Not implemented
  • Terminal output streaming - Not implemented
  • Terminal lifecycle (kill, release) - Not implemented

Why not implemented: Requires integrating with VSCode’s terminal API and managing terminal lifecycle. Also involves security considerations around command execution.

Extension Points

  • Extension methods (extMethod) - Not implemented
  • Extension notifications (extNotification) - Not implemented

These allow protocol extensions beyond the ACP specification. Not currently needed but could be useful for custom features.

State Management

  • Webview state persistence within session
  • Chat history persistence across hide/show cycles
  • Draft text persistence (FIXME: partially typed prompts are lost on hide/show)
  • Session restoration after VSCode restart
  • Workspace-specific state persistence
  • Tab history and conversation export

Agent Extensions

Agent extensions are proxy components that enrich the agent’s capabilities. See Agent Extensions for details.

  • CLI support (--proxy argument for symposium-acp-agent)
  • VS Code setting (symposium.extensions array)
  • Settings UI with enable/disable checkboxes
  • Drag-to-reorder in Settings UI
  • Delete and add extensions back
  • Registry extensions (install from agent registry with type = 'extension')
  • Per-extension configuration (e.g., which Ferris tools to enable)

Language Model Provider (Experimental)

Set symposium.enableExperimentalLM: true in VS Code settings to enable.

This feature exposes ACP agents via VS Code’s LanguageModelChatProvider API, allowing them to appear in the model picker for use by Copilot and other extensions.

Status: Experimental, disabled by default. May not be the right approach.

  • TypeScript: LanguageModelChatProvider registration
  • TypeScript: JSON-RPC client over stdio
  • TypeScript: Progress callback integration
  • Rust: vscodelm subcommand
  • Rust: Session actor with history management
  • Rust: Tool bridging (symposium-agent-action for permissions)
  • Rust: VS Code tools via synthetic MCP server
  • Feature flag gating (symposium.enableExperimentalLM)
  • Fix: Multiple MCP tools cause invocation failures

Known issue: Tool invocation works with a single isolated tool but fails when multiple VS Code-provided tools are bridged. Root cause unknown.

Open question: VS Code LM consumers inject their own context (project details, editor state, etc.) into requests. ACP agents like Claude Code also inject context. These competing context layers may confuse the model, making the LM API better suited for raw model access than wrapping full agents.

See Language Model Provider and Tool Bridging for architecture details.

Reference Material

Detailed research reports covering specific API details and implementation guides. These are generally fed to AI agents but can be useful for humans too!

MynahUI GUI Capabilities Guide

Overview

MynahUI is a data- and event-driven chat interface library for browsers and webviews. This guide focuses on the interactive GUI capabilities relevant for building tool permission and approval workflows.

Core Concepts

Chat Items

Chat items are the fundamental building blocks of the conversation UI. Each chat item is a “card” that can contain various interactive elements.

Basic Structure:

interface ChatItem {
  type: ChatItemType;           // Determines positioning and styling
  messageId?: string;            // Unique identifier for updates
  body?: string;                 // Markdown content
  buttons?: ChatItemButton[];    // Action buttons
  formItems?: ChatItemFormItem[]; // Form inputs
  fileList?: FileList;           // File tree display
  followUp?: FollowUpOptions;    // Quick action pills
  // ... many more options
}

Chat Item Types:

  • ANSWER / ANSWER_STREAM / CODE_RESULT → Left-aligned (AI responses)
  • PROMPT / SYSTEM_PROMPT → Right-aligned (user messages)
  • DIRECTIVE → Transparent, no background

Interactive Components

1. Buttons (ChatItemButton)

Buttons are the primary action mechanism for user approval/denial workflows.

Interface:

interface ChatItemButton {
  id: string;                    // Unique identifier for the button
  text?: string;                 // Button label
  icon?: MynahIcons;             // Optional icon
  status?: 'main' | 'primary' | 'clear' | 'dimmed-clear' | 'info' | 'success' | 'warning' | 'error';
  keepCardAfterClick?: boolean;  // If false, removes card after click
  waitMandatoryFormItems?: boolean; // Disables until mandatory form items are filled
  disabled?: boolean;
  description?: string;          // Tooltip text
}

Status Colors:

  • main - Primary brand color
  • primary - Accent color
  • success - Green (for approval actions)
  • error - Red (for denial/rejection actions)
  • warning - Yellow/orange
  • info - Blue
  • clear - Transparent background

Event Handler:

onInBodyButtonClicked: (tabId: string, messageId: string, action: {
  id: string;
  text?: string;
  // ... other button properties
}) => void

Example - Approval Buttons:

{
  type: ChatItemType.ANSWER,
  messageId: 'tool-approval-123',
  body: 'Tool execution request...',
  buttons: [
    {
      id: 'approve-once',
      text: 'Approve',
      status: 'primary',
      icon: MynahIcons.OK
    },
    {
      id: 'approve-session',
      text: 'Approve for Session',
      status: 'success',
      icon: MynahIcons.OK_CIRCLED
    },
    {
      id: 'deny',
      text: 'Deny',
      status: 'error',
      icon: MynahIcons.CANCEL,
      keepCardAfterClick: false  // Card disappears on denial
    }
  ]
}

2. Form Items (ChatItemFormItem)

Form items allow collecting structured user input alongside button actions.

Available Form Types:

  • textinput / textarea / numericinput / email
  • select (dropdown)
  • radiogroup / toggle
  • checkbox / switch
  • stars (rating)
  • list (dynamic list of items)
  • pillbox (tag/pill input)

Common Properties:

interface BaseFormItem {
  id: string;                // Unique identifier
  type: string;              // Form type
  mandatory?: boolean;       // Required field
  title?: string;            // Label
  description?: string;      // Help text
  tooltip?: string;          // Tooltip
  value?: string;            // Initial/current value
  disabled?: boolean;
}

Example - Checkbox for “Remember Choice”:

formItems: [
  {
    type: 'checkbox',
    id: 'remember-approval',
    label: 'Remember this choice for similar requests',
    value: 'false',
    tooltip: 'If checked, future requests for this tool will be automatically approved'
  }
]

Example - Toggle for Options:

formItems: [
  {
    type: 'toggle',
    id: 'approval-scope',
    title: 'Approval Scope',
    value: 'once',
    options: [
      { value: 'once', label: 'Once', icon: MynahIcons.CHECK },
      { value: 'session', label: 'Session', icon: MynahIcons.STACK },
      { value: 'always', label: 'Always', icon: MynahIcons.OK_CIRCLED }
    ]
  }
]

Event Handlers:

onFormChange: (tabId: string, messageId: string, item: ChatItemFormItem, value: any) => void

3. Content Display Options

Markdown Body

The body field supports full markdown including:

  • Headings (#, ##, ###)
  • Code blocks with syntax highlighting
  • Inline code
  • Links
  • Lists (ordered/unordered)
  • Blockquotes
  • Tables

Example - Displaying Tool Parameters:

body: `### Tool Execution Request

**Tool:** \`read_file\`

**Parameters:**
\`\`\`json
{
  "file_path": "/Users/niko/src/config.ts",
  "offset": 0,
  "limit": 100
}
\`\`\`

Do you want to allow this tool to execute?`

Custom Renderer

For complex layouts beyond markdown, use customRenderer with HTML markup:

customRenderer: `
<div>
  <h4>Tool: <code>read_file</code></h4>
  <table>
    <tr>
      <th>Parameter</th>
      <th>Value</th>
    </tr>
    <tr>
      <td>file_path</td>
      <td><code>/Users/niko/src/config.ts</code></td>
    </tr>
    <tr>
      <td>offset</td>
      <td><code>0</code></td>
    </tr>
  </table>
</div>
`

Information Cards

For hierarchical content with status indicators:

informationCard: {
  title: 'Security Notice',
  status: {
    status: 'warning',
    icon: MynahIcons.WARNING,
    body: 'This tool will access filesystem resources'
  },
  description: 'Review the parameters carefully',
  content: {
    body: '... detailed information ...'
  }
}

4. File Lists

Display file paths with actions and metadata:

fileList: {
  fileTreeTitle: 'Files to be accessed',
  filePaths: ['/src/config.ts', '/src/main.ts'],
  details: {
    '/src/config.ts': {
      icon: MynahIcons.FILE,
      description: 'Configuration file',
      clickable: true
    }
  },
  actions: {
    '/src/config.ts': [
      {
        name: 'view-details',
        icon: MynahIcons.EYE,
        description: 'View file details'
      }
    ]
  }
}

Event Handler:

onFileActionClick: (tabId: string, messageId: string, filePath: string, actionName: string) => void

5. Follow-Up Pills

Quick action buttons displayed as pills:

followUp: {
  text: 'Quick actions',
  options: [
    {
      pillText: 'Approve All',
      icon: MynahIcons.OK,
      status: 'success',
      prompt: 'approve-all'  // Can trigger automatic actions
    },
    {
      pillText: 'Deny All',
      icon: MynahIcons.CANCEL,
      status: 'error',
      prompt: 'deny-all'
    }
  ]
}

Event Handler:

onFollowUpClicked: (tabId: string, messageId: string, followUp: ChatItemAction) => void

Card Behavior Options

Visual States

{
  status?: 'info' | 'success' | 'warning' | 'error';  // Colors the card border/icon
  shimmer?: boolean;         // Loading animation
  canBeVoted?: boolean;      // Show thumbs up/down
  canBeDismissed?: boolean;  // Show dismiss button
  snapToTop?: boolean;       // Pin to top of chat
  border?: boolean;          // Show border
  hoverEffect?: boolean;     // Highlight on hover
}

Layout Options

{
  fullWidth?: boolean;               // Stretch to container width
  padding?: boolean;                 // Internal padding
  contentHorizontalAlignment?: 'default' | 'center';
}

Card Lifecycle

{
  keepCardAfterClick?: boolean;      // On buttons - remove card after click
  autoCollapse?: boolean;            // Auto-collapse long content
}

Updating Chat Items

Chat items can be updated after creation:

// Add new chat item
mynahUI.addChatItem(tabId, chatItem);

// Update by message ID
mynahUI.updateChatAnswerWithMessageId(tabId, messageId, updatedChatItem);

// Update last streaming answer
mynahUI.updateLastChatAnswer(tabId, partialChatItem);

Complete Example: Tool Approval Workflow

// 1. Show tool approval request
mynahUI.addChatItem('main-tab', {
  type: ChatItemType.ANSWER,
  messageId: 'tool-approval-read-file-001',
  status: 'warning',
  icon: MynahIcons.LOCK,
  body: `### Tool Execution Request

**Tool:** \`read_file\`

**Description:** Read file contents from the filesystem

**Parameters:**
\`\`\`json
{
  "file_path": "/Users/nikomat/dev/mynah-ui/src/config.ts",
  "offset": 0,
  "limit": 2000
}
\`\`\`

**Security:** This tool will access local filesystem resources.`,
  
  formItems: [
    {
      type: 'checkbox',
      id: 'remember-read-file',
      label: 'Trust this tool for the remainder of the session',
      value: 'false'
    }
  ],
  
  buttons: [
    {
      id: 'approve',
      text: 'Approve',
      status: 'success',
      icon: MynahIcons.OK,
      keepCardAfterClick: false
    },
    {
      id: 'deny',
      text: 'Deny',
      status: 'error',
      icon: MynahIcons.CANCEL,
      keepCardAfterClick: false
    },
    {
      id: 'details',
      text: 'More Details',
      status: 'clear',
      icon: MynahIcons.INFO
    }
  ]
});

// 2. Handle button clicks
mynahUI.onInBodyButtonClicked = (tabId, messageId, action) => {
  if (messageId === 'tool-approval-read-file-001') {
    const formState = mynahUI.getFormState(tabId, messageId);
    const rememberChoice = formState['remember-read-file'] === 'true';
    
    switch (action.id) {
      case 'approve':
        // Execute tool
        // If rememberChoice, add to session whitelist
        break;
      case 'deny':
        // Cancel tool execution
        break;
      case 'details':
        // Show additional information
        mynahUI.updateChatAnswerWithMessageId(tabId, messageId, {
          informationCard: {
            title: 'Tool Details',
            content: {
              body: 'Detailed tool documentation...'
            }
          }
        });
        break;
    }
  }
};

Progressive Updates

For multi-step approval flows, you can progressively update the same card:

// Initial request
mynahUI.addChatItem(tabId, {
  messageId: 'approval-001',
  type: ChatItemType.ANSWER,
  body: 'Waiting for approval...',
  shimmer: true
});

// User approves
mynahUI.updateChatAnswerWithMessageId(tabId, 'approval-001', {
  body: 'Approved! Executing tool...',
  shimmer: true,
  buttons: []  // Remove buttons
});

// Execution complete
mynahUI.updateChatAnswerWithMessageId(tabId, 'approval-001', {
  body: 'Tool execution complete!',
  shimmer: false,
  status: 'success',
  icon: MynahIcons.OK_CIRCLED
});

Sticky Cards

For persistent approval requests that stay above the prompt:

mynahUI.updateStore(tabId, {
  promptInputStickyCard: {
    messageId: 'persistent-approval',
    body: 'Multiple tools are waiting for approval',
    status: 'warning',
    icon: MynahIcons.WARNING,
    buttons: [
      {
        id: 'review-pending',
        text: 'Review Pending',
        status: 'info'
      }
    ]
  }
});

// Clear sticky card
mynahUI.updateStore(tabId, {
  promptInputStickyCard: null
});

Best Practices for Tool Approval UI

  1. Clear Tool Identity: Always show tool name prominently
  2. Parameter Visibility: Display all parameters the tool will receive
  3. Security Context: Indicate security implications (file access, network, etc.)
  4. Action Clarity: Use clear “Approve” vs “Deny” with appropriate status colors
  5. Scope Options: Provide “once”, “session”, “always” choices when appropriate
  6. Non-blocking: Use keepCardAfterClick: false to auto-dismiss after approval
  7. Progressive Disclosure: Start simple, show details on demand
  8. Feedback: Update card state to show execution progress after approval

Key Event Handlers

interface MynahUIProps {
  onInBodyButtonClicked?: (tabId: string, messageId: string, action: ChatItemButton) => void;
  onFollowUpClicked?: (tabId: string, messageId: string, followUp: ChatItemAction) => void;
  onFormChange?: (tabId: string, messageId: string, item: ChatItemFormItem, value: any) => void;
  onFileActionClick?: (tabId: string, messageId: string, filePath: string, actionName: string) => void;
  // ... many more
}

Reference

  • Full documentation: mynah-ui/docs/DATAMODEL.md
  • Type definitions: mynah-ui/src/static.ts
  • Examples: mynah-ui/example/src/samples/

VSCode Webview State Preservation: Complete Guide for Chat Interfaces

Your mynah-ui chat extension can preserve draft text automatically using VSCode’s built-in APIs. The key insight: there’s no “last chance” event before destruction, so you must save continuously. The official VSCode documentation shows setState() being called every 100ms without performance concerns, and popular extensions use debounced saves at 300-500ms intervals.

VSCode webview lifecycle: No beforeunload safety net

VSCode webviews do not expose a beforeunload or similar “last chance” event through the extension API. This is the most critical finding for your implementation. You have exactly two lifecycle events to work with:

onDidChangeViewState fires when the webview’s visibility changes or moves to a different editor column. It provides access to webviewPanel.visible and webviewPanel.viewColumn properties. Critically, this event does NOT fire when the webview is disposed—only when it becomes hidden or changes position. The browser’s beforeunload event exists within the webview iframe itself but cannot communicate asynchronously back to your extension, making it effectively useless for state preservation.

onDidDispose fires after the webview is already destroyed—too late for state saving. Use it only for cleanup operations like canceling timers or removing subscriptions. By the time this event fires, your webview context is gone and any unsaved state is lost.

The recommended pattern is to save state continuously rather than trying to intercept disposal. VSCode’s official documentation explicitly shows this approach, with their example calling setState() every 100ms in a setInterval without any warnings about performance impact.

setState performance: Call it freely with light debouncing

The performance cost of vscode.setState() is remarkably low. Microsoft’s official documentation states that “getState and setState are the preferred way to persist state, as they have much lower performance overhead than retainContextWhenHidden.” The API appears to be synchronous, accepts JSON-serializable objects, and has no documented size limits or throttling mechanisms.

The official VSCode webview sample demonstrates calling setState() 10 times per second (every 100ms) without any performance warnings or caveats. This suggests the operation is highly optimized and suitable for frequent updates. Real-world extension analysis shows a community consensus around 300-500ms debounce intervals for text input, which balances responsiveness with minimal overhead.

Is it acceptable to call on every keystroke? Technically yes, but practically you should debounce. Here’s why: while setState itself is lightweight, debouncing serves UX purposes more than performance. A 300-500ms debounce provides a better user experience by avoiding excessive state churn while ensuring draft preservation happens quickly enough that users rarely lose more than half a second of typing if they close the sidebar mid-sentence.
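The debounce wrapper itself is a few lines; this sketch is generic and not tied to any particular API:

```typescript
// Generic trailing-edge debounce: only the last call in a quiet window fires.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Args) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// With a 500ms window, three rapid keystrokes produce a single save.
const saved: string[] = [];
const saveDraft = debounce((text: string) => saved.push(text), 500);
saveDraft("h");
saveDraft("he");
saveDraft("hello");
// ~500ms after the last call, saved holds only "hello"
```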

Popular extension patterns: The REST Client extension saves request history to globalState immediately on submission. The GistPad extension uses a 1500ms debounce for search input updates. The Continue AI extension relies on message passing between webview and extension for complex state management rather than setState alone. Most extensions combine approaches—using setState for immediate UI state and globalState for data that must survive webview disposal.

mynah-ui API: Event-driven architecture with limited draft access

mynah-ui does not expose a direct API to retrieve current draft text from input fields in its public documentation. The library follows a strictly event-driven pattern where user input is captured through the onChatPrompt callback, which fires when users submit messages—not during typing.

The getAllTabs() method is not explicitly documented as including unsent draft messages. Based on the library’s architecture, tabs contain conversation history and submitted messages, not draft state. You’ll need to implement your own draft tracking by monitoring the underlying DOM input elements or maintaining draft state in your extension code.

Events you can hook into:

  • onChatPrompt: Fires when users submit a message (your primary input capture point)
  • onTabChange: Fires when switching between tabs (good opportunity to save current draft)
  • onTabAdd/onTabRemove: Tab lifecycle events

mynah-ui uses a centralized reactive data store where updates automatically trigger re-renders of subscribed components. The library prioritizes declarative state management over imperative queries, which is why draft access methods aren’t prominent. For your use case, you’ll likely need to access the input DOM elements directly or maintain a parallel draft state structure outside mynah-ui.
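A parallel draft structure can be as simple as a map keyed by tab ID. This is a sketch of the "maintain draft state outside mynah-ui" approach; all names here are illustrative, not part of mynah-ui:

```typescript
// Draft state kept outside mynah-ui, keyed by tab ID (names are illustrative).
interface Draft {
  text: string;
  timestamp: number;
}

class DraftStore {
  private drafts = new Map<string, Draft>();

  set(tabId: string, text: string): void {
    this.drafts.set(tabId, { text, timestamp: Date.now() });
  }

  get(tabId: string): string {
    return this.drafts.get(tabId)?.text ?? "";
  }

  clear(tabId: string): void {
    this.drafts.delete(tabId);
  }

  // Plain-object form suitable for vscode.setState() or a globalState backup.
  toJSON(): Record<string, Draft> {
    return Object.fromEntries(this.drafts);
  }
}

// Wire set() to DOM input events and clear() to the onChatPrompt handler.
const store = new DraftStore();
store.set("tab-1", "unsent message");
store.clear("tab-1"); // after a successful send
```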

User expectations: Auto-save is non-negotiable

Users expect automatic draft preservation based on industry-standard chat applications. Research into Slack, Teams, Discord, and even recent iOS updates reveals consistent patterns:

Automatic per-conversation drafts are table stakes. Slack saves drafts automatically per channel, Teams maintains drafts per conversation, and Discord preserves drafts across app restarts. All provide visual indicators (bold channel names, “[Draft]” labels, or draft count badges) showing where unsent messages exist.

VSCode users are already frustrated by draft loss in existing extensions. GitHub issues show significant pain points: users lose hours of work when chat history disappears during workspace switches, and Claude Code extension users report losing conversation context due to inadequate state preservation. One user complaint: “Lost chats today and am here to express how insane it is that this is even possible.”

Expected behavior for your sidebar: When users close the sidebar while typing, they expect that text to reappear when they reopen it—period. This expectation comes from every major communication platform they use daily. Losing draft text is not acceptable. Your implementation must preserve this state automatically, invisibly, and reliably.

VSCode’s built-in GitHub Copilot Chat demonstrates the acceptable standard: chat sessions persist within a workspace, history is accessible via “Show Chats…”, and sessions can be exported. However, even Copilot Chat has limitations—history loss when switching workspaces causes major user frustration, proving that inadequate persistence is a critical UX failure.

The optimal pattern combines immediate setState() for UI state with debounced saves for draft content, backed by globalState for persistence beyond webview lifecycle. Here’s the complete implementation strategy:

Pattern 1: Continuous state preservation in webview

// Inside your webview script
const vscode = acquireVsCodeApi();

// Restore previous state immediately
const previousState = vscode.getState() || { 
  drafts: {},  // keyed by tab/conversation ID
  activeTab: null 
};

// Debounced save function (500ms is the sweet spot)
let saveTimeout;
function saveDraftDebounced(tabId, draftText) {
  clearTimeout(saveTimeout);
  saveTimeout = setTimeout(() => {
    const currentState = vscode.getState() || { drafts: {} };
    currentState.drafts[tabId] = {
      text: draftText,
      timestamp: Date.now()
    };
    vscode.setState(currentState);
    
    // Also notify extension for globalState backup
    vscode.postMessage({
      command: 'saveDraft',
      tabId: tabId,
      text: draftText
    });
  }, 500);
}

// Hook into mynah-ui or direct DOM events
// Since mynah-ui doesn't expose input change events, access the DOM
const chatInput = document.querySelector('[data-mynah-chat-input]'); // adjust selector
if (chatInput) {
  chatInput.addEventListener('input', (e) => {
    const currentTab = getCurrentTabId(); // your function to get active tab
    saveDraftDebounced(currentTab, e.target.value);
  });
}

// Immediate save on tab switch (use mynah-ui's onTabChange)
const mynahUI = new MynahUI({
  onTabChange: (tabId) => {
    // Save current draft immediately before switching
    const currentDraft = getCurrentDraftText();
    if (currentDraft) {
      const state = vscode.getState() || { drafts: {} };
      state.drafts[getCurrentTabId()] = {
        text: currentDraft,
        timestamp: Date.now()
      };
      vscode.setState(state);
    }
    
    // Restore draft for new tab
    const newState = vscode.getState();
    if (newState?.drafts?.[tabId]) {
      restoreDraftToInput(newState.drafts[tabId].text);
    }
  },
  
  onChatPrompt: (tabId, prompt) => {
    // Clear draft after successful send
    const state = vscode.getState() || { drafts: {} };
    delete state.drafts[tabId];
    vscode.setState(state);
    
    vscode.postMessage({
      command: 'clearDraft',
      tabId: tabId
    });
  }
});

// Restore drafts on load
window.addEventListener('load', () => {
  const state = vscode.getState();
  const activeTab = getCurrentTabId();
  if (state?.drafts?.[activeTab]?.text) {
    restoreDraftToInput(state.drafts[activeTab].text);
  }
});

Pattern 2: Extension-side backup with globalState

// In your extension code (extension.ts)
export function activate(context: vscode.ExtensionContext) {
  
  // Handle messages from webview
  webviewPanel.webview.onDidReceiveMessage(
    message => {
      switch (message.command) {
        case 'saveDraft':
          // Save to globalState as backup
          const drafts = context.globalState.get<Record<string, any>>('chatDrafts', {});
          drafts[message.tabId] = {
            text: message.text,
            timestamp: Date.now(),
            workspace: vscode.workspace.name || 'default'
          };
          context.globalState.update('chatDrafts', drafts);
          break;
          
        case 'clearDraft':
          const currentDrafts = context.globalState.get<Record<string, any>>('chatDrafts', {});
          delete currentDrafts[message.tabId];
          context.globalState.update('chatDrafts', currentDrafts);
          break;
          
        case 'getDrafts':
          // Send stored drafts back to webview for restoration
          const storedDrafts = context.globalState.get('chatDrafts', {});
          webviewPanel.webview.postMessage({
            command: 'restoreDrafts',
            drafts: storedDrafts
          });
          break;
      }
    },
    undefined,
    context.subscriptions
  );
  
  // Implement WebviewPanelSerializer for cross-restart persistence
  vscode.window.registerWebviewPanelSerializer('yourViewType', {
    async deserializeWebviewPanel(webviewPanel: vscode.WebviewPanel, state: any) {
      // Restore webview with saved state
      webviewPanel.webview.html = getWebviewContent();
      
      // Send drafts from globalState
      const drafts = context.globalState.get('chatDrafts', {});
      webviewPanel.webview.postMessage({
        command: 'restoreDrafts',
        drafts: drafts
      });
    }
  });
}

Pattern 3: Flush on critical visibility changes

// Listen to visibility changes
webviewPanel.onDidChangeViewState(
  e => {
    if (!e.webviewPanel.visible) {
      // Webview is becoming hidden - request final state save
      webviewPanel.webview.postMessage({
        command: 'flushState'
      });
    }
  },
  null,
  context.subscriptions
);
// In webview: handle flush command
window.addEventListener('message', event => {
  const message = event.data;
  if (message.command === 'flushState') {
    // Immediately save current state without debouncing
    const currentDraft = getCurrentDraftText();
    if (currentDraft) {
      vscode.setState({ 
        drafts: { 
          [getCurrentTabId()]: { 
            text: currentDraft, 
            timestamp: Date.now() 
          } 
        } 
      });
      
      vscode.postMessage({
        command: 'saveDraft',
        tabId: getCurrentTabId(),
        text: currentDraft
      });
    }
  }
});

Trade-offs and performance considerations

Debounce intervals tested in the wild:

  • 100ms (VSCode official example): a fixed setInterval save rather than a true debounce; continuous updates are fine for demos but potentially excessive
  • 300-500ms (community standard): Optimal balance between responsiveness and efficiency—recommended for most chat interfaces
  • 1500ms (GistPad search): Too long for draft preservation, risks losing 1.5 seconds of typing
  • Immediate (on send/tab switch): Essential for critical actions where data loss is unacceptable

The undo/redo conflict: Custom text editors that debounce updates face a specific problem—hitting undo before the debounce fires causes undo to jump back to a previous state instead of the last edit. For chat interfaces this is less critical since most chat inputs don’t implement complex undo stacks, but be aware if you’re building rich text editing features.

Memory and storage considerations: setState() stores data in memory until the webview is disposed. globalState persists to disk and survives VSCode restarts but should be used judiciously for data that truly needs long-term persistence. For your chat extension, draft text is lightweight (typically under 10KB per draft) and appropriate for globalState backup.

retainContextWhenHidden alternative: You could set retainContextWhenHidden: true in your webview options to keep the entire webview context alive when hidden. This would eliminate the need for state persistence entirely, but Microsoft explicitly warns about “much higher performance overhead.” Only use this for complex UIs that cannot be quickly serialized and restored. For a chat interface with text drafts, setState/getState is definitively the right choice.

Specific recommendations for your mynah-ui extension

Your implementation checklist:

  1. Implement debounced auto-save at 500ms intervals for draft text as users type
  2. Save immediately on tab switches using mynah-ui’s onTabChange event
  3. Clear drafts after successful message submission in the onChatPrompt handler
  4. Back up drafts to globalState via message passing to your extension for persistence beyond webview lifecycle
  5. Restore drafts on webview load by checking both vscode.getState() and requesting globalState from your extension
  6. Use onDidChangeViewState to trigger immediate flush when the webview becomes hidden
  7. Implement WebviewPanelSerializer if you want drafts to survive VSCode restarts (optional but recommended)

Accessing mynah-ui input fields: Since mynah-ui doesn’t expose a direct draft text API, you’ll need to either:

  • Query the DOM directly for the input element (look for textarea or input fields within mynah-ui’s rendered structure)
  • Maintain a parallel state object that tracks input as users type by monitoring DOM events
  • Wrap mynah-ui’s initialization and hook into its input element references after construction

Visual indicators to add: Following industry standards, consider adding:

  • “[Draft]” label next to tabs with unsaved text
  • Badge count showing number of tabs with drafts
  • Timestamp showing when draft was last saved
  • Warning dialog if the user attempts to close VSCode with unsaved drafts (VSCode provides no beforeunload hook and onDidDispose fires too late to block closure, so any such warning can only be best-effort)

Testing your implementation:

  1. Type draft text and close the sidebar—text should reappear on reopen
  2. Type draft in one tab, switch tabs, return—draft should persist
  3. Reload the webview (Developer: Reload Webview command)—draft should restore
  4. Restart VSCode—draft should restore if using WebviewPanelSerializer
  5. Type draft, wait only 200ms, close sidebar—draft should still save (test your debounce timing)

Code you can ship today

Here’s a minimal, production-ready implementation you can add to your existing code:

// Add to your webview script
class DraftManager {
  constructor(vscode, mynahUI) {
    this.vscode = vscode;
    this.mynahUI = mynahUI;
    this.saveTimeout = null;
    this.DEBOUNCE_MS = 500;
    
    this.init();
  }
  
  init() {
    // Restore drafts on load
    this.restoreAllDrafts();
    
    // Hook into input changes
    this.monitorInput();
    
    // Best-effort flush on unload: setState() is synchronous and may land,
    // but an asynchronous postMessage() to the extension will not
    window.addEventListener('beforeunload', () => this.flushAll());
  }
  
  monitorInput() {
    // Find mynah-ui input element (adjust selector as needed)
    const inputObserver = new MutationObserver(() => {
      const input = document.querySelector('textarea[data-mynah-input]');
      if (input && !input.dataset.draftHandlerAttached) {
        input.dataset.draftHandlerAttached = 'true';
        input.addEventListener('input', (e) => {
          this.saveDraft(this.getCurrentTabId(), e.target.value);
        });
      }
    });
    
    inputObserver.observe(document.body, { 
      childList: true, 
      subtree: true 
    });
  }
  
  saveDraft(tabId, text) {
    clearTimeout(this.saveTimeout);
    this.saveTimeout = setTimeout(() => {
      const state = this.vscode.getState() || { drafts: {} };
      state.drafts[tabId] = { text, timestamp: Date.now() };
      this.vscode.setState(state);
      
      // Backup to extension
      this.vscode.postMessage({
        command: 'saveDraft',
        tabId,
        text
      });
    }, this.DEBOUNCE_MS);
  }
  
  flushAll() {
    clearTimeout(this.saveTimeout);
    const tabId = this.getCurrentTabId();
    const text = this.getCurrentDraftText();
    if (text) {
      const state = this.vscode.getState() || { drafts: {} };
      state.drafts[tabId] = { text, timestamp: Date.now() };
      this.vscode.setState(state);
    }
  }
  
  restoreAllDrafts() {
    const state = this.vscode.getState();
    if (state?.drafts) {
      const currentTab = this.getCurrentTabId();
      const draft = state.drafts[currentTab];
      if (draft?.text) {
        this.setInputText(draft.text);
      }
    }
  }
  
  getCurrentTabId() {
    // Your logic to get active tab ID
    return this.mynahUI.getSelectedTabId?.() || 'default';
  }
  
  getCurrentDraftText() {
    const input = document.querySelector('textarea[data-mynah-input]');
    return input?.value || '';
  }
  
  setInputText(text) {
    const input = document.querySelector('textarea[data-mynah-input]');
    if (input) {
      input.value = text;
      input.dispatchEvent(new Event('input', { bubbles: true }));
    }
  }
}

// Initialize
const vscode = acquireVsCodeApi();
const draftManager = new DraftManager(vscode, mynahUI);

// Integrate with mynah-ui events. Note: mynah-ui expects these handlers as
// constructor props, so in practice pass them to new MynahUI({...}); they are
// shown as assignments here for brevity. This also assumes getCurrentTabId()
// still returns the outgoing tab when flushAll() runs.
mynahUI.onTabChange = (tabId) => {
  draftManager.flushAll(); // Save the outgoing tab's draft before switching
  draftManager.restoreAllDrafts(); // Restore the draft for the new tab
};

mynahUI.onChatPrompt = (tabId, prompt) => {
  // Clear draft after send
  const state = vscode.getState() || { drafts: {} };
  delete state.drafts[tabId];
  vscode.setState(state);
};

This implementation provides automatic draft preservation with minimal overhead, follows VSCode best practices, and aligns with industry-standard user expectations. Your users will never lose draft text when closing the sidebar, and the 500ms debounce ensures efficient performance even during rapid typing.

Key documentation references

VSCode Official:

  • Webview API Guide: https://code.visualstudio.com/api/extension-guides/webview
  • Webview UX Guidelines: https://code.visualstudio.com/api/ux-guidelines/webviews
  • Extension Samples (webview-sample): https://github.com/microsoft/vscode-extension-samples

mynah-ui:

  • GitHub Repository: https://github.com/aws/mynah-ui
  • Documentation files: STARTUP.md, CONFIG.md, DATAMODEL.md, USAGE.md

Open Source Extension Examples:

  • Continue (AI chat): https://github.com/continuedev/continue
  • REST Client: https://github.com/Huachao/vscode-restclient
  • Jupyter: https://github.com/microsoft/vscode-jupyter

Performance and UX Research:

  • VSCode GitHub Issues #66939, #109521, #127006 (lifecycle events)
  • Community Discussion #68362 (draft loss frustration)
  • Issue #251340 (chat history preservation requests)

VS Code Language Model Tool API

This reference documents VS Code’s Language Model Tool API (1.104+), which enables extensions to contribute callable tools that LLMs can invoke during chat interactions.

Tool Registration

Tools require dual registration: static declaration in package.json and runtime registration via vscode.lm.registerTool().

package.json Declaration

{
  "contributes": {
    "languageModelTools": [{
      "name": "myext_searchFiles",
      "displayName": "Search Files",
      "toolReferenceName": "searchFiles",
      "canBeReferencedInPrompt": true,
      "modelDescription": "Searches workspace files matching a pattern",
      "userDescription": "Search for files in the workspace",
      "icon": "$(search)",
      "inputSchema": {
        "type": "object",
        "properties": {
          "pattern": { "type": "string", "description": "Glob pattern to match" },
          "maxResults": { "type": "number", "default": 10 }
        },
        "required": ["pattern"]
      },
      "when": "workspaceFolderCount > 0"
    }]
  }
}

Runtime Registration

interface LanguageModelTool<T> {
  invoke(
    options: LanguageModelToolInvocationOptions<T>,
    token: CancellationToken
  ): ProviderResult<LanguageModelToolResult>;

  prepareInvocation?(
    options: LanguageModelToolInvocationPrepareOptions<T>,
    token: CancellationToken
  ): ProviderResult<PreparedToolInvocation>;
}

// Registration in activate()
export function activate(context: vscode.ExtensionContext) {
  context.subscriptions.push(
    vscode.lm.registerTool('myext_searchFiles', new SearchFilesTool())
  );
}

Tool Call Flow

The model provider streams LanguageModelToolCallPart objects, and the consumer handles invocation and result feeding.

Sequence

  1. Model receives prompt and tool definitions
  2. Model generates LanguageModelToolCallPart objects with parameters
  3. VS Code presents confirmation UI
  4. Consumer invokes vscode.lm.invokeTool()
  5. Results wrap in LanguageModelToolResultPart
  6. New request includes results for model’s next response

Key Types

class LanguageModelToolCallPart {
  callId: string;   // Unique ID to match with results
  name: string;     // Tool name to invoke
  input: object;    // LLM-generated parameters
}

class LanguageModelToolResultPart {
  callId: string;  // Must match LanguageModelToolCallPart.callId
  content: Array<LanguageModelTextPart | LanguageModelPromptTsxPart | unknown>;
}

Consumer Tool Loop

async function handleWithTools(
  model: vscode.LanguageModelChat,
  messages: vscode.LanguageModelChatMessage[],
  token: vscode.CancellationToken
) {
  const options: vscode.LanguageModelChatRequestOptions = {
    tools: vscode.lm.tools.map(t => ({
      name: t.name,
      description: t.description,
      inputSchema: t.inputSchema ?? {}
    })),
    toolMode: vscode.LanguageModelChatToolMode.Auto
  };

  while (true) {
    const response = await model.sendRequest(messages, options, token);
    const toolCalls: vscode.LanguageModelToolCallPart[] = [];
    let text = '';

    for await (const part of response.stream) {
      if (part instanceof vscode.LanguageModelTextPart) {
        text += part.value;
      } else if (part instanceof vscode.LanguageModelToolCallPart) {
        toolCalls.push(part);
      }
    }

    if (toolCalls.length === 0) break;

    const results: vscode.LanguageModelToolResultPart[] = [];
    for (const call of toolCalls) {
      const result = await vscode.lm.invokeTool(call.name, {
        input: call.input,
        toolInvocationToken: undefined
      }, token);
      results.push(new vscode.LanguageModelToolResultPart(call.callId, result.content));
    }

    messages.push(vscode.LanguageModelChatMessage.Assistant([
      new vscode.LanguageModelTextPart(text),
      ...toolCalls
    ]));
    messages.push(vscode.LanguageModelChatMessage.User(results));
  }
}

Tool Mode

enum LanguageModelChatToolMode {
  Auto = 1,      // Model chooses whether to use tools
  Required = 2   // Model must use a tool
}

Confirmation UI

Every tool invocation triggers a confirmation dialog. Extensions customize via prepareInvocation().

interface PreparedToolInvocation {
  invocationMessage?: string;  // Shown during execution
  confirmationMessages?: {
    title: string;
    message: string | MarkdownString;
  };
}

class SearchFilesTool implements vscode.LanguageModelTool<{pattern: string}> {
  async prepareInvocation(
    options: vscode.LanguageModelToolInvocationPrepareOptions<{pattern: string}>,
    _token: vscode.CancellationToken
  ): Promise<vscode.PreparedToolInvocation> {
    return {
      invocationMessage: `Searching for files matching "${options.input.pattern}"...`,
      confirmationMessages: {
        title: 'Search Workspace Files',
        message: new vscode.MarkdownString(
          `Search for files matching pattern **\`${options.input.pattern}\`**?`
        )
      }
    };
  }
}

Approval Levels

  • Single use
  • Current session
  • Current workspace
  • Always allow

Users reset approvals via Chat: Reset Tool Confirmations command.

Configuration

  • chat.tools.eligibleForAutoApproval: Require manual approval for specific tools
  • chat.tools.global.autoApprove: Allow all tools without prompting
  • chat.tools.urls.autoApprove: Auto-approve URL patterns

Tool Visibility

When Clauses

{
  "contributes": {
    "languageModelTools": [{
      "name": "debug_getCallStack",
      "when": "debugState == 'running'"
    }]
  }
}

Private Tools

Skip vscode.lm.registerTool() to keep tools extension-only.

Filtering

const options: vscode.LanguageModelChatRequestOptions = {
  tools: vscode.lm.tools
    .filter(tool => tool.tags.includes('vscode_editing'))
    .map(tool => ({
      name: tool.name,
      description: tool.description,
      inputSchema: tool.inputSchema ?? {}
    }))
};

Tool Information

interface LanguageModelToolInformation {
  readonly name: string;
  readonly description: string;
  readonly inputSchema: object | undefined;
  readonly tags: readonly string[];
}

const allTools = vscode.lm.tools;  // readonly LanguageModelToolInformation[]

Provider-Side Tool Handling

For LanguageModelChatProvider implementations:

interface LanguageModelChatProvider<T extends LanguageModelChatInformation> {
  provideLanguageModelChatResponse(
    model: T,
    messages: readonly LanguageModelChatRequestMessage[],
    options: LanguageModelChatRequestOptions,  // Contains tools array
    progress: Progress<LanguageModelResponsePart>,
    token: CancellationToken
  ): Thenable<any>;
}

interface LanguageModelChatInformation {
  readonly id: string;
  readonly name: string;
  readonly family: string;
  readonly version: string;
  readonly maxInputTokens: number;
  readonly maxOutputTokens: number;
  readonly capabilities: {
    readonly toolCalling?: boolean | number;
  };
}

Providers stream tool calls via progress.report() using LanguageModelToolCallPart.

Limits

  • 128 tool limit per request
  • Use tool picker to deselect unneeded tools
  • Enable virtual tools via github.copilot.chat.virtualTools.threshold

VS Code Language Model Tool Rejection Handling

This reference documents how VS Code handles tool rejection in the Language Model API.

Tool Call Timing: Why Providers Can’t Detect Rejection

Tool calls are processed after providers return, not during streaming. When a LanguageModelChatProvider emits a LanguageModelToolCallPart via progress.report(), VS Code does not process it immediately. Instead:

// VS Code's internal consumption pattern
const toolCalls: LanguageModelToolCallPart[] = [];

for await (const part of response.stream) {
    if (part instanceof LanguageModelTextPart) {
        stream.markdown(part.value);  // Text streams immediately to UI
    } else if (part instanceof LanguageModelToolCallPart) {
        toolCalls.push(part);  // Tool calls are BUFFERED, not processed
    }
}

// Only AFTER stream completes: process collected tool calls
if (toolCalls.length > 0) {
    await processToolCalls(toolCalls);  // Confirmation UI appears HERE
}

The temporal sequence:

  1. Provider emits ToolCallPart via progress.report()
  2. Provider continues executing or returns
  3. Only then: VS Code shows confirmation UI
  4. User accepts or rejects
  5. If rejected: the chat session cancels entirely

This means:

  • You cannot block inside provideLanguageModelChatResponse waiting for the tool decision
  • The CancellationToken cannot detect tool rejection during execution, because rejection happens after your method returns
  • You must use history matching to detect approval on subsequent requests

Detecting Approval via History

On approval, the next provideLanguageModelChatResponse call includes:

  1. An Assistant message with your ToolCallPart
  2. A User message with a ToolResultPart containing the matching callId
for (const msg of messages) {
    if (msg.role === 'user') {
        for (const part of msg.content) {
            if (part instanceof LanguageModelToolResultPart) {
                if (part.callId === yourPreviousToolCallId) {
                    // User approved - tool was invoked
                }
            }
        }
    }
}

On rejection, the chat session cancels - you won’t receive a follow-up request at all.

Consumer Perspective: invokeTool() on Rejection

It throws an exception. When the user clicks “Cancel” on the confirmation dialog, invokeTool() rejects with a CancellationError:

try {
  const result = await vscode.lm.invokeTool(call.name, {
    input: call.input,
    toolInvocationToken: request.toolInvocationToken
  }, token);
} catch (err) {
  if (err instanceof vscode.CancellationError) {
    // User declined the tool confirmation
  }
}

There is no special “rejected” result object - rejection is purely via exception.

Critical Limitation: Rejection Cancels Entire Chat

When a user hits “Cancel” on a tool confirmation, the whole chat gets cancelled - not just that individual tool invocation. This is a documented behavioral issue (GitHub Issue #241039).

The expected behavior would be that a cancelled tool call responds to the LLM with an error message for that specific tool, allowing the LLM to reason based on received results. Currently, this doesn’t happen.

Provider Perspective

If you’re a LanguageModelChatProvider that emitted a LanguageModelToolCallPart:

  • You don’t receive a signal in the next request’s message history
  • The entire request chain is terminated via cancellation
  • There’s no opportunity to continue with partial results

Cancellation vs. Rejection: No Distinction

Both user rejection (clicking “Cancel” on confirmation) and user cancellation (stopping the entire chat) surface identically as CancellationError. The API provides no way to distinguish between:

  • User rejected this specific tool but wants to continue the chat
  • User cancelled the entire request

What Happens After Cancellation

History After Rejection

The cancelled turn does NOT appear in history:

  • ChatResponseTurn entries only exist for completed responses
  • If the handler threw/rejected (due to cancellation), there’s no ChatResult
  • The user’s prompt (ChatRequestTurn) does appear, but with no corresponding response

So the history looks like:

Turn 1: User prompt → "Help me edit this file"
Turn 1: Assistant response → [MISSING - cancelled]
Turn 2: User prompt → "Try a different approach"

What the Model Sees on Follow-up

When the user sends a follow-up after rejection:

What the model sees:

  • The original user request
  • NO assistant response for that turn (it was cancelled)
  • The new user message

What the model does NOT see:

  • The tool call it attempted
  • Any partial text streamed before the tool call
  • The fact that there was a rejection at all

This means the tool call effectively “never happened” from the model’s perspective.

Summary

Scenario                 | API Behavior                          | Chat continues? | In history?
User approves tool       | invokeTool() resolves with result     | Yes             | Yes
User rejects tool        | invokeTool() throws CancellationError | No              | No
User cancels entire chat | CancellationToken triggered           | No              | No

Key Takeaways

  1. No partial execution: Cannot reject some tools while accepting others
  2. No rejection signaling: Model doesn’t know a tool was rejected
  3. Clean slate on retry: The cancelled turn disappears from history
  4. Exception-based flow: All rejections surface as CancellationError


GitHub Actions Reusable Workflow for Cross-Platform Rust Releases

Research on building a reusable GitHub workflow for cross-platform Rust binary releases.

Architecture Validation

Reusable workflows fully support multiple jobs with different runners, internal matrix strategies, and coordinated uploads to the same release. Each job can independently specify runs-on: ubuntu-latest, macos-14, or windows-latest.

For release uploads, two coordination patterns are proven in production:

  • Pattern 1 (Recommended): Create release first in a dedicated job, then fan out build jobs with needs: create-release. Each build job uploads to the upload_url output.
  • Pattern 2: Use softprops/action-gh-release which handles concurrent uploads atomically.

Cargo.toml Parsing

Use cargo metadata --format-version=1 --no-deps | jq:

Approach            | Reliability | Custom Metadata | Cross-Platform
cargo metadata + jq | ⭐⭐⭐⭐⭐      | Full access     | All platforms
dasel               | ⭐⭐⭐⭐⭐      | Full access     | All platforms
toml-cli            | ⭐⭐⭐⭐       | Full access     | Build required
grep/sed/awk        | ⭐⭐         | Unreliable      | BSD/GNU issues

Custom [package.metadata.symposium] sections appear in the JSON output under packages[0].metadata.symposium:

- name: Extract package metadata
  id: meta
  shell: bash
  run: |
    METADATA=$(cargo metadata --format-version=1 --no-deps)
    echo "name=$(echo "$METADATA" | jq -r '.packages[0].name')" >> $GITHUB_OUTPUT
    echo "version=$(echo "$METADATA" | jq -r '.packages[0].version')" >> $GITHUB_OUTPUT
    echo "binary=$(echo "$METADATA" | jq -r '.packages[0].metadata.symposium.binary // ""')" >> $GITHUB_OUTPUT
    echo "args=$(echo "$METADATA" | jq -c '.packages[0].metadata.symposium.args // []')" >> $GITHUB_OUTPUT

Both cargo and jq are pre-installed on all GitHub-hosted runners.

Why Not cargo-dist or cross-rs

cargo-dist generates complete, self-contained workflows rather than providing reusable components. It's incompatible with the reusable workflow pattern, where callers reference the workflow via uses: org/repo/.github/workflows/build.yml@v1 in their own jobs.

cross-rs cannot practically build macOS from Linux (requires SDK extraction, custom Docker images, legal gray areas). Every major Rust project uses native macOS runners for Darwin targets.

For Linux ARM targets, native ARM runners (ubuntu-24.04-arm) are now free for public repos, so native builds are simpler than cross-rs.

Critical Implementation Details

Permissions and Secrets

The reusable workflow cannot request contents: write - callers must set it:

jobs:
  release:
    permissions:
      contents: write  # Required - cannot be set by called workflow
    uses: symposium-dev/package-agent-extension/.github/workflows/build.yml@v1
    secrets: inherit

secrets: inherit only works within the same organization. For cross-org callers, secrets must be explicitly declared.

Parallel Upload Coordination

Multiple jobs uploading to the same release can cause 409 Conflict errors. Use the two-phase pattern:

jobs:
  create-release:
    runs-on: ubuntu-latest
    outputs:
      upload_url: ${{ steps.create.outputs.upload_url }}
    steps:
      - uses: softprops/action-gh-release@v2
        id: create
        with:
          draft: true
          files: ""

  build:
    needs: create-release
    strategy:
      fail-fast: false
      matrix:
        include:
          - target: x86_64-unknown-linux-musl
            os: ubuntu-latest
          - target: aarch64-apple-darwin
            os: macos-14
    runs-on: ${{ matrix.os }}

Windows MAX_PATH Limits

Enable long paths for Windows builds:

- name: Enable long paths (Windows)
  if: runner.os == 'Windows'
  run: git config --system core.longpaths true

musl Allocator Performance

musl’s memory allocator is 7-20x slower than glibc’s under multi-threaded workloads. For performance-sensitive binaries, override with jemalloc:

#[cfg(target_env = "musl")]
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

Patterns from Production Projects

Analysis of ripgrep, bat, fd, delta, nushell, and hyperfine:

  • Two-phase release structure is universal: create-release → build-release → optional publish-release
  • Naming convention: {binary}-{target}-{version}.{ext} with .tar.gz for Unix and .zip for Windows
  • Workflow versioning: Use major tags (v1, v2) as floating tags pointing to latest patch
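The naming convention can be sketched as a small helper; `artifactName` is a hypothetical function, not part of any of these projects:

```typescript
// Hypothetical helper sketching the {binary}-{target}-{version}.{ext}
// convention: .tar.gz for Unix targets, .zip for Windows.
function artifactName(binary: string, target: string, version: string): string {
  const ext = target.includes("windows") ? "zip" : "tar.gz";
  return `${binary}-${target}-${version}.${ext}`;
}
```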

Reusable Workflow Template

name: Build and Release Extension

on:
  workflow_call:
    inputs:
      manifest:
        description: 'Path to Cargo.toml'
        type: string
        default: './Cargo.toml'
      musl:
        description: 'Use musl for Linux builds (true) or glibc (false)'
        type: boolean
        required: true

jobs:
  metadata:
    runs-on: ubuntu-latest
    outputs:
      name: ${{ steps.meta.outputs.name }}
      version: ${{ steps.meta.outputs.version }}
      binary: ${{ steps.meta.outputs.binary }}
    steps:
      - uses: actions/checkout@v4
      - name: Extract metadata
        id: meta
        run: |
          METADATA=$(cargo metadata --format-version=1 --no-deps --manifest-path ${{ inputs.manifest }})
          echo "name=$(echo "$METADATA" | jq -r '.packages[0].name')" >> $GITHUB_OUTPUT
          echo "version=$(echo "$METADATA" | jq -r '.packages[0].version')" >> $GITHUB_OUTPUT
          echo "binary=$(echo "$METADATA" | jq -r '.packages[0].metadata.symposium.binary // .packages[0].name')" >> $GITHUB_OUTPUT

  build:
    needs: metadata
    strategy:
      fail-fast: false
      matrix:
        include:
          - target: x86_64-unknown-linux-${{ inputs.musl && 'musl' || 'gnu' }}
            os: ubuntu-latest
          - target: aarch64-unknown-linux-${{ inputs.musl && 'musl' || 'gnu' }}
            os: ubuntu-24.04-arm
          - target: x86_64-apple-darwin
            os: macos-13
          - target: aarch64-apple-darwin
            os: macos-14
          - target: x86_64-pc-windows-msvc
            os: windows-latest
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - uses: Swatinem/rust-cache@v2
      - name: Build
        run: cargo build --release --target ${{ matrix.target }}
      - name: Package
        # Create {binary}-{os}-{arch}-{version}.zip (packaging commands elided)
      - uses: softprops/action-gh-release@v2
        with:
          files: '*.zip'

Caller Template

# .github/workflows/release.yml
on:
  release:
    types: [published]

jobs:
  release:
    permissions:
      contents: write
    uses: symposium-dev/package-agent-extension/.github/workflows/build.yml@v1
    with:
      musl: true
    secrets: inherit

Language Server Protocol (LSP) - Comprehensive Overview

Executive Summary

The Language Server Protocol (LSP) defines the protocol used between an editor or IDE and a language server that provides language features like auto-complete, go to definition, and find all references. A related effort, the Language Server Index Format (LSIF, pronounced like “else if”), aims to support rich code navigation in development tools or a web UI without needing a local copy of the source code.

The idea behind the Language Server Protocol (LSP) is to standardize the protocol for how tools and servers communicate, so a single Language Server can be re-used in multiple development tools, and tools can support languages with minimal effort.

Key Benefits:

  • Reduces M×N complexity to M+N (one server per language instead of one implementation per editor per language)
  • Enables language providers to focus on a single high-quality implementation
  • Allows editors to support multiple languages with minimal effort
  • Standardized JSON-RPC based communication

Table of Contents

  1. Architecture & Core Concepts
  2. Base Protocol
  3. Message Types
  4. Capabilities System
  5. Lifecycle Management
  6. Document Synchronization
  7. Language Features
  8. Workspace Features
  9. Window Features
  10. Implementation Considerations
  11. Version History

Architecture & Core Concepts

Problem Statement

Prior to the design and implementation of the Language Server Protocol for the development of Visual Studio Code, most language services were generally tied to a given IDE or other editor. In the absence of the Language Server Protocol, language services are typically implemented by using a tool-specific extension API.

This created a classic M×N complexity problem where:

  • M = Number of editors/IDEs
  • N = Number of programming languages
  • Total implementations needed = M × N

LSP Solution

The idea behind a Language Server is to provide the language-specific smarts inside a server that can communicate with development tooling over a protocol that enables inter-process communication.

Architecture Components:

  1. Language Client: The editor/IDE that requests language services
  2. Language Server: A separate process providing language intelligence
  3. LSP: The standardized communication protocol between them

Communication Model:

  • JSON-RPC 2.0 based messaging
  • A language server runs as a separate process and development tools communicate with the server using the language protocol over JSON-RPC.
  • Bi-directional communication (client ↔ server)
  • Support for synchronous requests and asynchronous notifications

Supported Languages & Environments

LSP is not restricted to programming languages. It can be used for any kind of text-based language, like specifications or domain-specific languages (DSL).

Transport Options:

  • stdio (standard input/output)
  • Named pipes (Windows) / Unix domain sockets
  • TCP sockets
  • Node.js IPC


Base Protocol

Message Structure

The base protocol consists of a header and a content part (comparable to HTTP). The header and content part are separated by a ‘\r\n’.

Header Format

Content-Length: <number>\r\n
Content-Type: application/vscode-jsonrpc; charset=utf-8\r\n
\r\n

Required Headers:

  • Content-Length: Length of content in bytes (mandatory)
  • Content-Type: MIME type (optional, defaults to application/vscode-jsonrpc; charset=utf-8)

Content Format

Contains the actual content of the message. The content part of a message uses JSON-RPC to describe requests, responses and notifications.

Example Message:

Content-Length: 126\r\n
\r\n
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "textDocument/completion",
  "params": {
    "textDocument": { "uri": "file:///path/to/file.js" },
    "position": { "line": 5, "character": 10 }
  }
}
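The framing above can be sketched in a few lines; the key detail is that Content-Length counts bytes of the UTF-8 encoded content, not characters:

```typescript
// Sketch: frame and parse a base-protocol message (header + JSON content).
function frame(message: object): string {
  const content = JSON.stringify(message);
  const byteLength = new TextEncoder().encode(content).length;
  return `Content-Length: ${byteLength}\r\n\r\n${content}`;
}

function parseFrame(raw: string): { headers: Record<string, string>; body: any } {
  const sep = raw.indexOf("\r\n\r\n");
  const headers: Record<string, string> = {};
  for (const line of raw.slice(0, sep).split("\r\n")) {
    const i = line.indexOf(": ");
    headers[line.slice(0, i)] = line.slice(i + 2);
  }
  return { headers, body: JSON.parse(raw.slice(sep + 4)) };
}
```

A real transport reads exactly Content-Length bytes from the stream before parsing; this string-based sketch skips that buffering concern.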

JSON-RPC Structure

Base Message

interface Message {
  jsonrpc: string; // Always "2.0"
}

Request Message

interface RequestMessage extends Message {
  id: integer | string;
  method: string;
  params?: array | object;
}

Response Message

interface ResponseMessage extends Message {
  id: integer | string | null;
  result?: any;
  error?: ResponseError;
}

Notification Message

interface NotificationMessage extends Message {
  method: string;
  params?: array | object;
}

Error Handling

Standard Error Codes:

  • -32700: Parse error
  • -32600: Invalid Request
  • -32601: Method not found
  • -32602: Invalid params
  • -32603: Internal error

LSP-Specific Error Codes:

  • -32803: RequestFailed
  • -32802: ServerCancelled
  • -32801: ContentModified
  • -32800: RequestCancelled

Language Features

Language features provide the actual smarts in the Language Server Protocol. They are usually executed on a [text document, position] tuple. The main categories are code comprehension features, like Hover or Goto Definition, and coding features, like diagnostics, code completion, or code actions.

Go to Definition

textDocument/definition: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Go to Declaration

textDocument/declaration: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Go to Type Definition

textDocument/typeDefinition: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Go to Implementation

textDocument/implementation: TextDocumentPositionParams → Location | Location[] | LocationLink[] | null

Find References

textDocument/references: ReferenceParams → Location[] | null

interface ReferenceParams extends TextDocumentPositionParams {
  context: { includeDeclaration: boolean; }
}

Information Features

Hover

textDocument/hover: TextDocumentPositionParams → Hover | null

interface Hover {
  contents: MarkedString | MarkedString[] | MarkupContent;
  range?: Range;
}

Signature Help

textDocument/signatureHelp: SignatureHelpParams → SignatureHelp | null

interface SignatureHelp {
  signatures: SignatureInformation[];
  activeSignature?: uinteger;
  activeParameter?: uinteger;
}

Document Symbols

textDocument/documentSymbol: DocumentSymbolParams → DocumentSymbol[] | SymbolInformation[] | null

Workspace Symbols

workspace/symbol: WorkspaceSymbolParams → SymbolInformation[] | WorkspaceSymbol[] | null

Code Intelligence Features

Code Completion

textDocument/completion: CompletionParams → CompletionItem[] | CompletionList | null

interface CompletionList {
  isIncomplete: boolean;
  items: CompletionItem[];
}

interface CompletionItem {
  label: string;
  kind?: CompletionItemKind;
  detail?: string;
  documentation?: string | MarkupContent;
  sortText?: string;
  filterText?: string;
  insertText?: string;
  textEdit?: TextEdit;
  additionalTextEdits?: TextEdit[];
}

Completion Triggers:

  • User invoked (Ctrl+Space)
  • Trigger characters (., ->, etc.)
  • Incomplete completion re-trigger

Code Actions

textDocument/codeAction: CodeActionParams → (Command | CodeAction)[] | null

interface CodeAction {
  title: string;
  kind?: CodeActionKind;
  diagnostics?: Diagnostic[];
  isPreferred?: boolean;
  disabled?: { reason: string; };
  edit?: WorkspaceEdit;
  command?: Command;
}

Code Action Kinds:

  • quickfix - Fix problems
  • refactor - Refactoring operations
  • source - Source code actions (organize imports, etc.)

Code Lens

textDocument/codeLens: CodeLensParams → CodeLens[] | null

interface CodeLens {
  range: Range;
  command?: Command;
  data?: any; // For resolve support
}

Formatting Features

Document Formatting

textDocument/formatting: DocumentFormattingParams → TextEdit[] | null

Range Formatting

textDocument/rangeFormatting: DocumentRangeFormattingParams → TextEdit[] | null

On-Type Formatting

textDocument/onTypeFormatting: DocumentOnTypeFormattingParams → TextEdit[] | null

Semantic Features

Semantic Tokens

Available since version 3.16.0, the request is sent from the client to the server to resolve semantic tokens for a given file. Semantic tokens add color information to a file that depends on language-specific symbol information.

textDocument/semanticTokens/full: SemanticTokensParams → SemanticTokens | null
textDocument/semanticTokens/range: SemanticTokensRangeParams → SemanticTokens | null
textDocument/semanticTokens/full/delta: SemanticTokensDeltaParams → SemanticTokens | SemanticTokensDelta | null

Token Encoding:

  • 5 integers per token: [deltaLine, deltaStart, length, tokenType, tokenModifiers]
  • Relative positioning for efficiency
  • Bit flags for modifiers
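Decoding follows directly from that encoding: accumulate line deltas, and reset the start column whenever the line changes. A sketch (the `DecodedToken` shape is our naming, not the spec's):

```typescript
// Sketch: expand the flat 5-integers-per-token array into absolute positions.
interface DecodedToken {
  line: number;
  start: number;
  length: number;
  tokenType: number;
  tokenModifiers: number;
}

function decodeSemanticTokens(data: number[]): DecodedToken[] {
  const out: DecodedToken[] = [];
  let line = 0;
  let start = 0;
  for (let i = 0; i < data.length; i += 5) {
    const [dLine, dStart, length, tokenType, tokenModifiers] = data.slice(i, i + 5);
    line += dLine;
    // deltaStart is relative to the previous token only on the same line.
    start = dLine === 0 ? start + dStart : dStart;
    out.push({ line, start, length, tokenType, tokenModifiers });
  }
  return out;
}
```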

Inlay Hints

textDocument/inlayHint: InlayHintParams → InlayHint[] | null

interface InlayHint {
  position: Position;
  label: string | InlayHintLabelPart[];
  kind?: InlayHintKind; // Type | Parameter
  tooltip?: string | MarkupContent;
  paddingLeft?: boolean;
  paddingRight?: boolean;
}

Diagnostics

Push Model (Traditional)

textDocument/publishDiagnostics: PublishDiagnosticsParams

interface PublishDiagnosticsParams {
  uri: DocumentUri;
  version?: integer;
  diagnostics: Diagnostic[];
}

Pull Model (Since 3.17)

textDocument/diagnostic: DocumentDiagnosticParams → DocumentDiagnosticReport
workspace/diagnostic: WorkspaceDiagnosticParams → WorkspaceDiagnosticReport

Diagnostic Structure:

interface Diagnostic {
  range: Range;
  severity?: DiagnosticSeverity; // Error | Warning | Information | Hint
  code?: integer | string;
  source?: string; // e.g., "typescript"
  message: string;
  tags?: DiagnosticTag[]; // Unnecessary | Deprecated
  relatedInformation?: DiagnosticRelatedInformation[];
}

Implementation Guide

Performance Guidelines

Message Ordering: Responses to requests should be sent in roughly the same order as the requests appear on the server or client side.

State Management:

  • Servers should handle partial/incomplete requests gracefully
  • Use ContentModified error for outdated results
  • Implement proper cancellation support

Resource Management:

  • Language servers run in separate processes
  • Avoid memory leaks in long-running servers
  • Implement proper cleanup on shutdown

Error Handling

Client Responsibilities:

  • Restart crashed servers (with exponential backoff)
  • Handle ContentModified errors gracefully
  • Validate server responses

Server Responsibilities:

  • Return appropriate error codes
  • Handle malformed/outdated requests
  • Monitor client process health

Transport Considerations

Command Line Arguments:

language-server --stdio                    # Use stdio
language-server --pipe=<name>              # Use named pipe/socket
language-server --socket --port=<port>     # Use TCP socket
language-server --node-ipc                 # Use Node.js IPC
language-server --clientProcessId=<pid>    # Monitor client process

Testing Strategies

Unit Testing:

  • Mock LSP message exchange
  • Test individual feature implementations
  • Validate message serialization/deserialization

Integration Testing:

  • End-to-end editor integration
  • Multi-document scenarios
  • Error condition handling

Performance Testing:

  • Large file handling
  • Memory usage patterns
  • Response time benchmarks

Advanced Topics

Custom Extensions

Experimental Capabilities:

interface ClientCapabilities {
  experimental?: {
    customFeature?: boolean;
    vendorSpecificExtension?: any;
  };
}

Custom Methods:

  • Use vendor prefixes: mycompany/customFeature
  • Document custom protocol extensions
  • Ensure graceful degradation

Security Considerations

Process Isolation:

  • Language servers run in separate processes
  • Limit file system access appropriately
  • Validate all input from untrusted sources

Content Validation:

  • Sanitize file paths and URIs
  • Validate document versions
  • Implement proper input validation

Multi-Language Support

Language Identification:

interface TextDocumentItem {
  uri: DocumentUri;
  languageId: string; // "typescript", "python", etc.
  version: integer;
  text: string;
}

Document Selectors:

type DocumentSelector = DocumentFilter[];

interface DocumentFilter {
  language?: string;    // "typescript"
  scheme?: string;      // "file", "untitled"  
  pattern?: string;     // "**/*.{ts,js}"
}
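Filter matching can be sketched simply; this intentionally skips glob `pattern` evaluation, which real clients delegate to a glob library:

```typescript
// Sketch: match a document against a DocumentFilter (pattern omitted).
interface Filter {
  language?: string;
  scheme?: string;
  pattern?: string;
}

function matchesFilter(doc: { languageId: string; uri: string }, filter: Filter): boolean {
  if (filter.language && filter.language !== doc.languageId) return false;
  if (filter.scheme && !doc.uri.startsWith(filter.scheme + ":")) return false;
  return true; // `pattern` intentionally not evaluated in this sketch
}
```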

Message Reference

Message Types

Request/Response Pattern

Client-to-Server Requests:

  • initialize - Server initialization
  • textDocument/hover - Get hover information
  • textDocument/completion - Get code completions
  • textDocument/definition - Go to definition

Server-to-Client Requests:

  • client/registerCapability - Register new capabilities
  • workspace/configuration - Get configuration settings
  • window/showMessageRequest - Show message with actions

Notification Pattern

Client-to-Server Notifications:

  • initialized - Initialization complete
  • textDocument/didOpen - Document opened
  • textDocument/didChange - Document changed
  • textDocument/didSave - Document saved
  • textDocument/didClose - Document closed

Server-to-Client Notifications:

  • textDocument/publishDiagnostics - Send diagnostics
  • window/showMessage - Display message
  • telemetry/event - Send telemetry data

Special Messages

Dollar Prefixed Messages: Notifications and requests whose methods start with ‘$/’ are messages which are protocol implementation dependent and might not be implementable in all clients or servers.

Examples:

  • $/cancelRequest - Cancel ongoing request
  • $/progress - Progress reporting
  • $/setTrace - Set trace level

Capabilities System

Not every language server can support all features defined by the protocol. LSP therefore provides ‘capabilities’. A capability groups a set of language features.

Capability Exchange

During Initialization:

  1. Client announces capabilities in initialize request
  2. Server announces capabilities in initialize response
  3. Both sides adapt behavior based on announced capabilities

Client Capabilities Structure

interface ClientCapabilities {
  workspace?: WorkspaceClientCapabilities;
  textDocument?: TextDocumentClientCapabilities;
  window?: WindowClientCapabilities;
  general?: GeneralClientCapabilities;
  experimental?: any;
}

Key Client Capabilities:

  • textDocument.hover.dynamicRegistration - Support dynamic hover registration
  • textDocument.completion.contextSupport - Support completion context
  • workspace.workspaceFolders - Multi-root workspace support
  • window.workDoneProgress - Progress reporting support

Server Capabilities Structure

interface ServerCapabilities {
  textDocumentSync?: TextDocumentSyncKind | TextDocumentSyncOptions;
  completionProvider?: CompletionOptions;
  hoverProvider?: boolean | HoverOptions;
  definitionProvider?: boolean | DefinitionOptions;
  referencesProvider?: boolean | ReferenceOptions;
  documentSymbolProvider?: boolean | DocumentSymbolOptions;
  workspaceSymbolProvider?: boolean | WorkspaceSymbolOptions;
  codeActionProvider?: boolean | CodeActionOptions;
  // ... many more
}
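Because many provider capabilities are `boolean | Options`, clients typically normalize before checking. A sketch (`providerEnabled` is our helper name):

```typescript
// Sketch: a provider capability may be a boolean or an options object;
// anything other than false/undefined means the feature is provided.
function providerEnabled(capability: boolean | object | undefined): boolean {
  return capability !== undefined && capability !== false;
}
```

For example, a client would call `providerEnabled(serverCapabilities.hoverProvider)` before sending textDocument/hover.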

Dynamic Registration

Servers can register/unregister capabilities after initialization:

// Register new capability
client/registerCapability: {
  registrations: [{
    id: "uuid",
    method: "textDocument/willSaveWaitUntil",
    registerOptions: { documentSelector: [{ language: "javascript" }] }
  }]
}

// Unregister capability
client/unregisterCapability: {
  unregisterations: [{ id: "uuid", method: "textDocument/willSaveWaitUntil" }]
}

Lifecycle Management

Initialization Sequence

  1. Client → Server: initialize request

    interface InitializeParams {
      processId: integer | null;
      clientInfo?: { name: string; version?: string; };
      rootUri: DocumentUri | null;
      initializationOptions?: any;
      capabilities: ClientCapabilities;
      workspaceFolders?: WorkspaceFolder[] | null;
    }
    
  2. Server → Client: initialize response

    interface InitializeResult {
      capabilities: ServerCapabilities;
      serverInfo?: { name: string; version?: string; };
    }
    
  3. Client → Server: initialized notification

    • Signals completion of initialization
    • Server can now send requests to client

Shutdown Sequence

  1. Client → Server: shutdown request

    • Server must not accept new requests (except exit)
    • Server should finish processing ongoing requests
  2. Client → Server: exit notification

    • Server should exit immediately
    • Exit code: 0 if shutdown was called, 1 otherwise

Process Monitoring

Client Process Monitoring:

  • Server can monitor client process via processId from initialize
  • Server should exit if client process dies

Server Crash Handling:

  • Client should restart crashed servers
  • Implement exponential backoff to prevent restart loops
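The backoff schedule can be sketched as a pure function; the base and cap values here are illustrative, not prescribed by the protocol:

```typescript
// Sketch: exponential backoff delay for restarting a crashed server,
// doubling from a base delay up to a cap.
function restartDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```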

Document Synchronization

Client support for the textDocument/didOpen, textDocument/didChange, and textDocument/didClose notifications is mandatory in the protocol; clients cannot opt out of supporting them.

Text Document Sync Modes

enum TextDocumentSyncKind {
  None = 0,        // No synchronization
  Full = 1,        // Full document sync on every change
  Incremental = 2  // Incremental sync (deltas only)
}

Document Lifecycle

Document Open

textDocument/didOpen: {
  textDocument: {
    uri: "file:///path/to/file.js",
    languageId: "javascript", 
    version: 1,
    text: "console.log('hello');"
  }
}

Document Change

textDocument/didChange: {
  textDocument: { uri: "file:///path/to/file.js", version: 2 },
  contentChanges: [{
    range: { start: { line: 0, character: 12 }, end: { line: 0, character: 17 } },
    text: "world"
  }]
}

Change Event Types:

  • Full text: Replace entire document
  • Incremental: Specify range and replacement text
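Applying an incremental change means converting line/character positions to string offsets and splicing. A sketch, assuming "\n" line endings; character offsets are UTF-16 code units, which conveniently matches JavaScript string indexing:

```typescript
// Sketch: apply one incremental contentChange to an in-memory document.
interface Pos { line: number; character: number; }
interface Rng { start: Pos; end: Pos; }

function positionToOffset(text: string, pos: Pos): number {
  const lines = text.split("\n");
  let offset = 0;
  for (let i = 0; i < pos.line; i++) {
    offset += lines[i].length + 1; // +1 for the "\n" separator
  }
  return offset + pos.character;
}

function applyChange(text: string, range: Rng, newText: string): string {
  const start = positionToOffset(text, range.start);
  const end = positionToOffset(text, range.end);
  return text.slice(0, start) + newText + text.slice(end);
}
```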

Document Save

// Optional: Before save
textDocument/willSave: {
  textDocument: { uri: "file:///path/to/file.js" },
  reason: TextDocumentSaveReason.Manual
}

// Optional: Before save with text edits
textDocument/willSaveWaitUntil → TextEdit[]

// After save
textDocument/didSave: {
  textDocument: { uri: "file:///path/to/file.js" },
  text?: "optional full text"
}

Document Close

textDocument/didClose: {
  textDocument: { uri: "file:///path/to/file.js" }
}

Position Encoding

Prior to 3.17 the offsets were always based on a UTF-16 string representation. Since 3.17 clients and servers can agree on a different string encoding representation (e.g. UTF-8).

Supported Encodings:

  • utf-16 (default, mandatory)
  • utf-8
  • utf-32

Position Structure:

interface Position {
  line: uinteger;     // Zero-based line number
  character: uinteger; // Zero-based character offset
}

interface Range {
  start: Position;
  end: Position;
}
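When a server negotiates `utf-8` positions, character offsets must be translated between encodings. A sketch for one line of text (`utf16ToUtf8Offset` is our helper name):

```typescript
// Sketch: translate a UTF-16 character offset (the LSP default) into a
// UTF-8 byte offset within a single line of text.
function utf16ToUtf8Offset(lineText: string, utf16Character: number): number {
  // slice() counts UTF-16 code units; TextEncoder produces UTF-8 bytes.
  return new TextEncoder().encode(lineText.slice(0, utf16Character)).length;
}
```

The two offsets differ whenever the line contains non-ASCII text: an emoji like 😀 occupies 2 UTF-16 code units but 4 UTF-8 bytes.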

Workspace Features

Multi-Root Workspaces

workspace/workspaceFolders → WorkspaceFolder[] | null

interface WorkspaceFolder {
  uri: URI;
  name: string;
}

// Notification when folders change
workspace/didChangeWorkspaceFolders: DidChangeWorkspaceFoldersParams

Configuration Management

// Server requests configuration from client
workspace/configuration: ConfigurationParams → any[]

interface ConfigurationItem {
  scopeUri?: URI;     // Scope (file/folder) for the setting
  section?: string;   // Setting name (e.g., "typescript.preferences")
}

// Client notifies server of configuration changes
workspace/didChangeConfiguration: DidChangeConfigurationParams

File Operations

File Watching

workspace/didChangeWatchedFiles: DidChangeWatchedFilesParams

interface FileEvent {
  uri: DocumentUri;
  type: FileChangeType; // Created | Changed | Deleted
}

File System Operations

// Before operations (can return WorkspaceEdit)
workspace/willCreateFiles: CreateFilesParams → WorkspaceEdit | null
workspace/willRenameFiles: RenameFilesParams → WorkspaceEdit | null  
workspace/willDeleteFiles: DeleteFilesParams → WorkspaceEdit | null

// After operations (notifications)
workspace/didCreateFiles: CreateFilesParams
workspace/didRenameFiles: RenameFilesParams
workspace/didDeleteFiles: DeleteFilesParams

Command Execution

workspace/executeCommand: ExecuteCommandParams → any

interface ExecuteCommandParams {
  command: string;           // Command identifier
  arguments?: any[];         // Command arguments
}

// Server applies edits to workspace
workspace/applyEdit: ApplyWorkspaceEditParams → ApplyWorkspaceEditResult

Window Features

Message Display

Show Message (Notification)

window/showMessage: ShowMessageParams

interface ShowMessageParams {
  type: MessageType; // Error | Warning | Info | Log | Debug
  message: string;
}

Show Message Request

window/showMessageRequest: ShowMessageRequestParams → MessageActionItem | null

interface ShowMessageRequestParams {
  type: MessageType;
  message: string;
  actions?: MessageActionItem[]; // Buttons to show
}

Show Document

window/showDocument: ShowDocumentParams → ShowDocumentResult

interface ShowDocumentParams {
  uri: URI;
  external?: boolean;    // Open in external program
  takeFocus?: boolean;   // Focus the document
  selection?: Range;     // Select range in document
}

Progress Reporting

Work Done Progress

// Server creates progress token
window/workDoneProgress/create: WorkDoneProgressCreateParams → void

// Report progress using $/progress
$/progress: ProgressParams<WorkDoneProgressBegin | WorkDoneProgressReport | WorkDoneProgressEnd>

// Client can cancel progress
window/workDoneProgress/cancel: WorkDoneProgressCancelParams

Progress Reporting Pattern

// Begin
{ kind: "begin", title: "Indexing", cancellable: true, percentage: 0 }

// Report
{ kind: "report", message: "Processing file.ts", percentage: 25 }

// End  
{ kind: "end", message: "Indexing complete" }

Logging & Telemetry

window/logMessage: LogMessageParams     // Development logs
telemetry/event: any                   // Usage analytics

Version History

LSP 3.17 (Current)

Major new features are type hierarchy, inline values, inlay hints, notebook document support, and a meta model that describes the 3.17 version of LSP.

Key Features:

  • Type hierarchy support
  • Inline value provider
  • Inlay hints
  • Notebook document synchronization
  • Diagnostic pull model
  • Position encoding negotiation

LSP 3.16

Key Features:

  • Semantic tokens
  • Call hierarchy
  • Moniker support
  • File operation events
  • Linked editing ranges
  • Code action resolve

LSP 3.15

Key Features:

  • Progress reporting
  • Selection ranges
  • Signature help context

LSP 3.0

Breaking Changes:

  • Client capabilities system
  • Dynamic registration
  • Workspace folders
  • Document link support