4 changes: 4 additions & 0 deletions Guide/src/SUMMARY.md
@@ -131,6 +131,10 @@
- [IGVM](./reference/architecture/openhcl/igvm.md)
- [Device Architecture](./reference/architecture/devices.md)
- [Storage Pipeline](./reference/architecture/devices/storage.md)
- [Core Concepts]()
- [Virtualized Processors](./reference/architecture/concepts/procs.md)
- [VMBus Architecture]()
- [Channels](./reference/architecture/vmbus/channels.md)

---

53 changes: 53 additions & 0 deletions Guide/src/reference/architecture/concepts/procs.md
@@ -0,0 +1,53 @@
# Processors in the VMM
This page describes how virtual and physical processor identifiers are mapped.

## VP index, CPU number, and APIC ID

Much code in the OpenVMM repo relies on a numeric identifier for a virtual
processor (VP): the VP index, the hypervisor-level identifier assigned to each
virtual processor, starting at 0. Several related identifiers are often
confused:

| Identifier | What it is | Numbering |
|-----------|-----------|-----------|
| **VP index** | Hypervisor-assigned processor number | 0, 1, 2, ... contiguous |
| **Linux CPU number** | The kernel's `cpu` in OpenHCL | Currently equals VP index (see below) |
| **APIC ID** (x86) | Hardware interrupt target | May differ — depends on topology |
| **MPIDR** (aarch64) | ARM processor affinity register | Not the VP index — topology-dependent |

Each platform has its own architectural way of describing CPUs: APIC IDs on
x86 and MPIDR values on AArch64. These values cannot be assumed to map
directly to the VP index, because the physical or virtual topology of the
system determines them.

These identifiers can also differ from the **VTL0 guest's** perspective: the
guest may have its own CPU numbering, which may or may not match the VP index.
Guests must translate their own CPU numbers into hypervisor VP indexes before
passing them to the VMM. For example, the VMBus protocol allows guest drivers
to specify a VP index for a channel.

```text
VTL0 guest sees: Host / VTL2 sees:
┌──────────────┐ ┌──────────────┐
│ CPU 0 ───────┼────────► │ VP index 0 │
│ CPU 1 ───────┼────────► │ VP index 1 │
│ CPU 2 ───────┼────────► │ VP index 2 │
│ ... │ │ ... │
└──────────────┘ └──────────────┘
Guest CPU N maps to VP index N = Linux
VP index N (typical) CPU N (OpenHCL today)
```

In OpenHCL today, the VMM assumes that its view of the VP index matches the
CPU number in the OpenHCL Linux kernel. This is a simplifying assumption, not
an architectural guarantee: it holds because OpenHCL's boot shim validates
that device-tree CPU ordering matches VP index ordering and controls the CPU
online sequence to preserve that mapping. A general-purpose guest provides no
such guarantee.
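
The invariant the boot shim maintains can be sketched as a simple check. This
is a hypothetical illustration only, assuming a flat list of VP indexes read
from the device tree in CPU order; the real boot shim's types and logic differ.

```rust
/// Hypothetical sketch: verify that device-tree CPU entries, in order,
/// correspond 1:1 to contiguous VP indexes starting at 0. The real boot
/// shim performs this validation with its own types; this only
/// illustrates the invariant being checked.
fn cpu_order_matches_vp_index(device_tree_vp_indexes: &[u32]) -> bool {
    device_tree_vp_indexes
        .iter()
        .enumerate()
        .all(|(cpu_number, &vp_index)| cpu_number as u32 == vp_index)
}
```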

The APIC ID is a separate concept. On x86, the APIC ID may not match the VP
index, especially with complex topologies (multiple sockets, SMT). The
hypervisor provides a [`GetVpIndexFromApicId`
hypercall](https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/hypercalls/hvcallgetvpindexfromapicid)
for translation. On aarch64, the device tree `reg` property for each CPU is the
MPIDR, which is also not the VP index.
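
Because the APIC ID need not equal the VP index, a VMM typically keeps a
topology-derived lookup table. The sketch below is an assumption-laden
illustration (the function name and table shape are invented), showing the kind
of mapping the `GetVpIndexFromApicId` hypercall resolves.

```rust
use std::collections::HashMap;

/// Hypothetical topology table mapping APIC IDs to VP indexes. A real VMM
/// would derive this from the virtual topology it presented to the guest,
/// or resolve it via the hypervisor's GetVpIndexFromApicId hypercall.
fn build_apic_to_vp_map(apic_ids_in_vp_order: &[u32]) -> HashMap<u32, u32> {
    apic_ids_in_vp_order
        .iter()
        .enumerate()
        .map(|(vp_index, &apic_id)| (apic_id, vp_index as u32))
        .collect()
}
```

For example, a topology that spaces APIC IDs by core (say, `[0, 2, 4, 6]`)
still yields contiguous VP indexes `0..=3`.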
147 changes: 147 additions & 0 deletions Guide/src/reference/architecture/vmbus/channels.md
@@ -0,0 +1,147 @@
# VMBus Channels

VMBus is the synthetic bus that connects guest drivers to host-side device
backends. Every VMBus device communicates through one or more **channels** —
bidirectional ring-buffer pairs backed by guest memory.

## What is a channel?

A VMBus channel is:

- A **ring buffer pair** — one incoming (guest → host), one outgoing (host →
guest) — backed by a single guest-allocated GPADL (Guest Physical Address
Descriptor List) — a guest-provided description of guest-physical pages
shared with the host. "Incoming" and "outgoing" are always relative to the
local endpoint: each side's incoming ring is the other side's outgoing ring.
In OpenVMM (the host), the incoming ring carries data from the guest and
the outgoing ring carries data to the guest.
- An **interrupt/event signal** for each direction.
- A **target VP** — the guest vCPU targeted for channel notifications. In
OpenVMM's current implementation, this value also selects the host-side
executor used for processing that channel.

Each channel is identified by a unique `channel_id` assigned by the VMBus server
at offer time. The channel's lifecycle is: **offered → opened → closed** (or
**rescinded** by the host). If the host rescinds an offer, the channel is torn
down regardless of guest state.

```text
┌──────────────────────────────────────────────────┐
│ VMBus Channel │
│ │
│ ┌───────────────────┐ ┌───────────────────┐ │
│ │ Incoming Ring │ │ Outgoing Ring │ │
│ │ (guest → host) │ │ (host → guest) │ │
│ └─────────┬─────────┘ └─────────┬─────────┘ │
│ │ │ │
│ ┌─────────┴──────────────────────┴─────────┐ │
│ │ GPADL-backed memory (guest-allocated) │ │
│ └──────────────────────────────────────────┘ │
│ │
│ Signal: guest → host Signal: host → guest │
│ Target VP: set at open time │
└──────────────────────────────────────────────────┘
```
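
The **offered → opened → closed** (or **rescinded**) lifecycle above can be
sketched as a small state machine. This is a simplified illustration with
invented names; the real server-side state machine tracks more states (GPADL
setup, protocol negotiation, and so on).

```rust
/// Hypothetical sketch of the channel lifecycle described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChannelState {
    Offered,
    Open,
    Closed,
    Rescinded,
}

fn next_state(state: ChannelState, event: &str) -> ChannelState {
    use ChannelState::*;
    match (state, event) {
        (Offered, "open") => Open,
        (Open, "close") => Closed,
        // A host rescind tears the channel down regardless of guest state.
        (_, "rescind") => Rescinded,
        // Any other event leaves the state unchanged.
        (s, _) => s,
    }
}
```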

## Subchannels

A **subchannel** is a full additional VMBus channel offer for the same device
instance. It is not a side-queue or a sub-object of the primary channel — it has
its own ring buffer GPADL, its own open/close lifecycle, its own channel ID, and
its own target VP.

The identity of a channel within a device is the tuple `(interface_id,
instance_id, subchannel_index)`:

| Field | Meaning |
|-------|---------|
| `interface_id` | Device type GUID (e.g., SCSI controller) |
| `instance_id` | Specific device instance |
| `subchannel_index` | `0` for the primary channel, `1..n` for subchannels |
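
The identity tuple can be modeled as a key type. The real type is `OfferKey`
in `vmbus_channel`; the struct below is a hypothetical sketch whose field
names and GUID representation are assumptions, not the crate's actual API.

```rust
/// Sketch of a channel identity key, modeled on the tuple above. The real
/// `OfferKey` in `vmbus_channel` may differ in field names and GUID types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ChannelKey {
    interface_id: u128,    // device type GUID, as a raw 128-bit value
    instance_id: u128,     // specific device instance GUID
    subchannel_index: u16, // 0 = primary, 1..n = subchannels
}

impl ChannelKey {
    fn is_primary(&self) -> bool {
        self.subchannel_index == 0
    }
}
```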

### Primary and subchannel relationship

- The **primary channel** (`subchannel_index == 0`) is always offered first and
handles protocol negotiation.
- **Subchannels** are offered only after the primary is open, when the device
explicitly enables them.
- A subchannel **cannot exist without its primary channel**. If the primary
channel closes, all subchannels are automatically revoked and closed.
- Subchannels are opened and closed independently; closing one subchannel does
not inherently require closing the primary or other subchannels.

```mermaid
stateDiagram-v2
[*] --> PrimaryOffered: VMBus server offers device
PrimaryOffered --> PrimaryOpen: Guest opens primary (subchannel_index=0)
PrimaryOpen --> SubchannelsOffered: Device backend requests N subchannels
SubchannelsOffered --> AllOpen: Guest opens subchannels 1..n
AllOpen --> PrimaryOpen: Guest closes subchannels
PrimaryOpen --> [*]: Guest closes primary → all subchannels revoked
```

### Why subchannels exist

Subchannels enable **I/O parallelism with CPU locality**. Each channel has its
own ring buffer and target VP, so:

- Multiple VPs can issue I/O concurrently without contending on a single ring
buffer.
- Each channel's host-side worker runs on the target VP's thread, keeping cache
lines warm and avoiding cross-VP interrupts.

Without subchannels, all I/O for a device funnels through one ring and one
worker, which can bottleneck on multi-VP VMs.
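
One simple way to realize this locality, sketched here as a hypothetical
policy (not OpenVMM's actual assignment logic), is to spread channel target
VPs round-robin across the guest's VPs so each ring gets its own VP:

```rust
/// Hypothetical sketch: assign a target VP to the primary channel plus its
/// subchannels round-robin across the guest's VPs, so each ring buffer is
/// serviced on a distinct VP where possible.
fn assign_target_vps(channel_count: u16, vp_count: u32) -> Vec<u32> {
    assert!(vp_count > 0, "a guest always has at least one VP");
    (0..u32::from(channel_count)).map(|i| i % vp_count).collect()
}
```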
## Target VP

When a guest opens a channel, it specifies a `target_vp` — the guest vCPU that
will receive channel interrupts and events. In OpenVMM's current implementation,
the VMBus server also uses this value to select the executor that runs the device
worker for that channel.

The guest can change the target VP at runtime via the `ModifyChannel` VMBus
message. This is used when VPs come online/offline (e.g., CPU hot-remove) and
the guest needs to rebalance channel assignments.
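
A device that keeps per-channel state might handle this retargeting as
follows. This is a minimal sketch with invented names; a real device would
also retarget its task driver (see `VmTaskDriver::retarget_vp`).

```rust
/// Hypothetical per-channel bookkeeping for ModifyChannel handling.
struct ChannelInfo {
    target_vp: u32,
}

/// Update the recorded target VP for `channel_idx`; returns false for an
/// unknown channel index (real code would log or fail the message).
fn handle_modify_channel(
    channels: &mut [ChannelInfo],
    channel_idx: usize,
    new_vp: u32,
) -> bool {
    match channels.get_mut(channel_idx) {
        Some(ch) => {
            ch.target_vp = new_vp;
            true
        }
        None => false,
    }
}
```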

For more on how the VMM and guest agree on a `target_vp`, see the
[Processors](../concepts/procs.md) page.

## Ring buffer model

Each ring is a fixed-size circular buffer. The size is determined at channel
open time and cannot change while the channel is open. Key properties:

- **No overflow** — if the ring is full, the sender must wait. The full ring
itself is the only backpressure mechanism; there is no explicit flow-control
protocol.
- **Batched reads** — the host reads packets in batches via
[`poll_read_batch()`](https://openvmm.dev/rustdoc/linux/vmbus_async/queue/struct.ReadHalf.html#method.poll_read_batch)
(interrupt-driven) or
[`try_read_batch()`](https://openvmm.dev/rustdoc/linux/vmbus_async/queue/struct.ReadHalf.html#method.try_read_batch)
(poll mode, no interrupt).
- **Paired** — rings always come in pairs (incoming + outgoing). A channel
without both rings is not usable.

Since ring buffers reside in guest-allocated memory, the host must treat all ring
contents as untrusted input.
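
For instance, the ring's read/write offsets live in guest-controlled memory,
so the host must range-check them before computing how much data is readable.
The sketch below is a hypothetical simplification; the real `vmbus_ring`
implementation is considerably more involved.

```rust
/// Hypothetical sketch: compute bytes available to read from a ring whose
/// `in_offset` (writer) and `out_offset` (reader) control fields come from
/// guest memory and must be validated, not trusted.
fn readable_bytes(ring_size: u32, in_offset: u32, out_offset: u32) -> Option<u32> {
    // Reject out-of-range offsets rather than trusting guest memory.
    // (This also rejects a zero-sized ring, avoiding a divide-by-zero below.)
    if in_offset >= ring_size || out_offset >= ring_size {
        return None;
    }
    // Wrap-aware distance from the reader to the writer.
    Some(in_offset.wrapping_sub(out_offset) % ring_size)
}
```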

For the ring buffer implementation, see the [`vmbus_ring`
rustdoc](https://openvmm.dev/rustdoc/linux/vmbus_ring/index.html).

## Key types

The following Rust types are the primary building blocks in OpenVMM's VMBus
implementation; device backends typically interact with `VmbusDevice`,
`ChannelControl`, and `Queue`.

| Type | Crate | Role |
|------|-------|------|
| `OfferKey` | `vmbus_channel` | Channel identity tuple |
| `OfferParams` | `vmbus_channel` | Full offer metadata |
| `OpenData` | `vmbus_channel` | Guest-provided open parameters (target VP, ring GPADL) |
| `ChannelControl` | `vmbus_channel` | Device-side handle to enable subchannels |
| `VmbusDevice` | `vmbus_channel` | Trait for VMBus device implementations |
| `RawAsyncChannel` | `vmbus_channel` | Async wrapper around a ring buffer pair |
| `IncomingRing` / `OutgoingRing` | `vmbus_ring` | Low-level ring buffer types |
| `Queue` | `vmbus_async` | High-level async packet read/write over a channel |
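
To show how these pieces fit together, here is a deliberately simplified,
hypothetical mirror of the `VmbusDevice` shape: the real trait in
`vmbus_channel` is async and takes richer resource types, so treat this only
as a sketch of the open/close contract.

```rust
/// Hypothetical, simplified stand-in for the `VmbusDevice` trait described
/// above; not the actual `vmbus_channel` API.
trait SimpleVmbusDevice {
    /// Mirrors the real trait's default: no subchannels unless overridden.
    fn max_subchannels(&self) -> u16 {
        0
    }
    fn open(&mut self, channel_idx: u16, target_vp: u32);
    fn close(&mut self, channel_idx: u16);
}

/// Toy device that only records which channels are open, and on which VP.
struct NullDevice {
    open_channels: Vec<(u16, u32)>,
}

impl SimpleVmbusDevice for NullDevice {
    fn open(&mut self, channel_idx: u16, target_vp: u32) {
        self.open_channels.push((channel_idx, target_vp));
    }
    fn close(&mut self, channel_idx: u16) {
        self.open_channels.retain(|&(idx, _)| idx != channel_idx);
    }
}
```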
32 changes: 27 additions & 5 deletions vm/devices/vmbus/vmbus_channel/src/channel.rs
@@ -52,7 +52,14 @@ pub trait VmbusDevice: Send + Any + InspectMut {
/// The offer parameters.
fn offer(&self) -> OfferParams;

/// The maximum number of subchannels supported by this device.
/// The maximum number of subchannels this device will accept.
///
/// This is the device's upper bound — the guest may request fewer (or
/// none). The VMBus framework uses this to allocate resources and to
/// reject [`ChannelControl::enable_subchannels`] calls that exceed
/// this limit.
///
/// Returns 0 by default (no subchannels — primary channel only).
fn max_subchannels(&self) -> u16 {
0
}
@@ -70,7 +77,14 @@ pub trait VmbusDevice: Send + Any + InspectMut {
/// Closes the channel number `channel_idx`.
async fn close(&mut self, channel_idx: u16);

/// Notifies the device that interrupts for channel will now target `target_vp`.
/// Notifies the device that the guest has retargeted interrupts for
/// `channel_idx` to `target_vp`.
///
/// This is called when the guest sends a `ModifyChannel` message to
/// change the VP that handles interrupts and ring processing for a
/// channel. Devices that create VP-targeted workers (e.g., StorVSP)
/// should forward this to their task driver via
/// [`VmTaskDriver::retarget_vp`](vmcore::vm_task::VmTaskDriver::retarget_vp).
async fn retarget_vp(&mut self, channel_idx: u16, target_vp: u32);

/// Start processing of all channels.
@@ -124,6 +138,11 @@ pub struct ChannelResources {
}

/// Control object for enabling subchannels.
///
/// Obtained from [`DeviceResources`] after the device is installed. The
/// device calls [`enable_subchannels`](Self::enable_subchannels) from its
/// protocol handler when the guest requests subchannels — for example,
/// StorVSP calls this when the guest sends `CREATE_SUB_CHANNELS`.
#[derive(Debug, Default, Clone)]
pub struct ChannelControl {
send: Option<mesh::Sender<u16>>,
@@ -138,10 +157,13 @@ pub struct TooManySubchannels;
impl ChannelControl {
/// Enables the first `count` subchannels.
///
/// If more than `count` subchannels are already enabled, this does nothing.
/// If `count` or more subchannels are already enabled, this does
/// nothing (the count only grows, never shrinks).
///
/// Fails if `count` is bigger than the requested maximum returned by
/// [`VmbusDevice::max_subchannels`].
/// Fails with [`TooManySubchannels`] if `count` exceeds the maximum
/// returned by [`VmbusDevice::max_subchannels`]. Callers should map
/// this error to an appropriate protocol response — for example,
/// StorVSP returns `INVALID_PARAMETER` to the guest.
pub fn enable_subchannels(&self, count: u16) -> Result<(), TooManySubchannels> {
if count > self.max {
return Err(TooManySubchannels);