Document the storage stack: disk backends and the frontend-to-backend pipeline
Summary
PR #2932 adds documentation for the VTL2 storage translation and settings model — how OpenHCL maps backing devices onto guest-visible controllers. That covers the outer shell: what the guest sees, what OpenHCL is offered, and how the configuration surface connects them.
What it does not cover is the inside of that shell: the disk backend abstraction, the concrete disk backends, the layered disk model, and the path from a storage frontend (NVMe, StorVSP, IDE) through the SCSI adapter down to a disk backend. That is the scope of this issue.
This should be written so it is useful for both OpenVMM and OpenHCL contexts, since the same DiskIo trait, the same backends, and the same frontend implementations are shared.
What should be documented
The DiskIo trait and the Disk wrapper
The central abstraction is the DiskIo trait in vm/devices/storage/disk_backend/src/lib.rs. Every disk backend implements it. The key operations are:
read_vectored / write_vectored (async, scatter-gather)
sync_cache (flush)
unmap (TRIM / deallocate)

The Disk struct wraps Arc<dyn DynDisk> for cheap concurrent cloning. This is what frontends hold.

The doc should explain the trait, the wrapper, and the design choices (async, scatter-gather, FUA, sector-aligned I/O).
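The doc could anchor this with a trimmed-down illustration of the trait's shape. The sketch below is not the real DiskIo trait (names, signatures, and error types are simplified for illustration); it just reflects the operations listed above plus the optional hooks (wait_resize, eject) that come up later in this issue.

```rust
// Illustrative analogue of the disk backend trait, NOT the real DiskIo
// trait (which lives in vm/devices/storage/disk_backend). Signatures and
// error types are simplified; the real trait is consumed through the
// Disk wrapper (Arc<dyn DynDisk>).
trait ToyDiskIo: Send + Sync {
    /// Disk size in sectors.
    fn sector_count(&self) -> u64;

    /// Async, vectored (scatter-gather) read starting at `sector`.
    async fn read_vectored(&self, bufs: &mut [&mut [u8]], sector: u64) -> Result<(), String>;

    /// Async, vectored write; `fua` asks for write-through to stable media.
    async fn write_vectored(&self, bufs: &[&[u8]], sector: u64, fua: bool) -> Result<(), String>;

    /// Flush any volatile write cache.
    async fn sync_cache(&self) -> Result<(), String>;

    /// TRIM / deallocate a sector range; backends may treat it as a no-op.
    async fn unmap(&self, sector: u64, count: u64) -> Result<(), String>;

    /// Completes when the disk's capacity changes. The default never
    /// completes, which is why most backends don't support online resize.
    async fn wait_resize(&self) -> u64 {
        std::future::pending::<u64>().await
    }

    /// Optical-only operation; regular disks report it as unsupported.
    async fn eject(&self) -> Result<(), String> {
        Err("eject not supported".to_string())
    }
}
```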
Storage frontends
How each frontend consumes a Disk:
nvme/
storvsp/
ide/

The doc should cover the data flow from guest I/O to a DiskIo method call. The SCSI path is the most interesting because it goes through two layers: the frontend protocol handling in storvsp/, and the SCSI emulation layer (scsidisk/), which parses CDB opcodes and translates them to DiskIo calls. NVMe is simpler: the NVMe controller's namespace directly holds a Disk and calls into it.

Concrete disk backends
All the backends that implement DiskIo:
disk_file/
disk_vhd1/
disk_vhdmp/
disk_blob/
disk_blockdevice/
disk_nvme/
disk_striped/

Wrapping backends (decorators)
Backends that wrap another Disk and transform I/O:
disk_crypt/
disk_delay/
disk_prwrap/

The wrapping pattern is important to document because it is how features compose without modifying backends.

The layered disk model
disk_layered/ is its own subsystem. A layered disk stacks multiple layers with read-through and optional write-through semantics:
LayerIo (similar to DiskIo but tracks sector presence via a bitmap)
per-layer options (read_cache, write_through)

The two concrete layer implementations today are:
RAM layer (disklayer_ram/) — ephemeral, fast
SQLite layer (disklayer_sqlite/) — persistent, portable

This is what powers the memdiff: disk configuration in the CLI. It deserves its own section because the bitmap-based presence tracking and the layer configuration model are not obvious from reading a single file.
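A tiny worked model may help make the bitmap and fall-through behavior concrete before readers dive into the real code. The sketch below is deliberately naive and is not the real LayerIo / disk_layered implementation: it models only per-sector presence and read-through, with writes landing in the top layer as in the memdiff case, and it ignores write-through, caching, and async I/O.

```rust
// Toy model of layered-disk read-through. The real implementation
// (disk_layered + LayerIo) is async, vectored, and far more careful;
// this only illustrates the presence-bitmap + fall-through idea.

const SECTOR: usize = 512;

struct ToyLayer {
    data: Vec<u8>,      // one contiguous buffer, sector-addressed
    present: Vec<bool>, // stand-in for the per-sector presence bitmap
}

impl ToyLayer {
    fn new(sectors: usize) -> Self {
        Self { data: vec![0; sectors * SECTOR], present: vec![false; sectors] }
    }

    fn write(&mut self, sector: usize, buf: &[u8; SECTOR]) {
        self.data[sector * SECTOR..][..SECTOR].copy_from_slice(buf);
        self.present[sector] = true;
    }
}

struct ToyLayeredDisk {
    /// layers[0] is the top (e.g. the RAM diff layer); the last is the base.
    layers: Vec<ToyLayer>,
}

impl ToyLayeredDisk {
    /// Writes go to the top layer only (the memdiff behavior).
    fn write(&mut self, sector: usize, buf: &[u8; SECTOR]) {
        self.layers[0].write(sector, buf);
    }

    /// Reads walk the stack until a layer has the sector; otherwise zeros.
    fn read(&self, sector: usize) -> [u8; SECTOR] {
        for layer in &self.layers {
            if layer.present[sector] {
                let mut out = [0u8; SECTOR];
                out.copy_from_slice(&layer.data[sector * SECTOR..][..SECTOR]);
                return out;
            }
        }
        [0u8; SECTOR]
    }
}

fn main() {
    // Two layers: an empty RAM diff on top of a "base" with sector 1 populated.
    let mut base = ToyLayer::new(4);
    base.write(1, &[0xAA; SECTOR]);
    let mut disk = ToyLayeredDisk { layers: vec![ToyLayer::new(4), base] };

    // An unwritten sector falls through to the base layer.
    assert_eq!(disk.read(1)[0], 0xAA);

    // A write lands in the top layer and shadows the base from then on.
    disk.write(1, &[0xBB; SECTOR]);
    assert_eq!(disk.read(1)[0], 0xBB);
    println!("layered read-through behaves as expected");
}
```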
Resolver integration
The doc should show how the resolver pattern connects configuration to concrete backends. The storage resolver chain is the best example of recursive resolution in the codebase:
NVMe controller → resolves each namespace's disk
Layered disk → resolves each layer in parallel
Each layer or backend → resolves to a concrete DiskIo implementation
This ties back to the resolver documentation (separate issue) but deserves a storage-specific section showing the concrete resolution flow.
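For illustration, the recursion could be shown in miniature. The sketch below is not the real resolver machinery (the actual flow goes through typed resource handles and registered resolvers, and resolves layers concurrently rather than sequentially); it only shows the shape of the recursion from a disk config down to concrete backends.

```rust
// Toy illustration of recursive resolution in the storage stack.
// The real flow goes through typed resource handles and registered
// resolvers; here a config enum is resolved directly, but the shape of
// the recursion (controller -> disk -> layers -> concrete backend) is the same.

enum DiskConfig {
    /// A file-backed disk (stand-in for a concrete backend like disk_file).
    File { path: String },
    /// A layered disk whose layers are themselves configs to resolve.
    Layered { layers: Vec<DiskConfig> },
}

/// Stand-in for a resolved, openable disk (the real result is a Disk
/// wrapping a concrete DiskIo implementation).
#[derive(Debug)]
enum ResolvedDisk {
    File(String),
    Layered(Vec<ResolvedDisk>),
}

fn resolve(config: DiskConfig) -> ResolvedDisk {
    match config {
        DiskConfig::File { path } => ResolvedDisk::File(path),
        // The real layered-disk resolver resolves its layers in parallel;
        // this sketch just recurses sequentially.
        DiskConfig::Layered { layers } => {
            ResolvedDisk::Layered(layers.into_iter().map(resolve).collect())
        }
    }
}

fn main() {
    // Roughly how a memdiff-style config decomposes: a diff layer over a
    // backing file (both modeled as files here for brevity).
    let config = DiskConfig::Layered {
        layers: vec![
            DiskConfig::File { path: "diff-layer (illustrative)".into() },
            DiskConfig::File { path: "base.img".into() },
        ],
    };
    println!("{:?}", resolve(config));
}
```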
Online disk resize
Online disk resize is an interesting cross-cutting concern because the behavior differs by frontend, backend, and OpenHCL vs standalone context.
Frontend notification mechanisms:
| Frontend | Resize notification | How it works |
| --- | --- | --- |
| NVMe | AEN (Async Event Notification) | Background task calls disk.wait_resize() per namespace; on change, completes a queued AER command with CHANGED_NAMESPACE_LIST |
| StorVSP/SCSI | UNIT_ATTENTION sense key | On the next SCSI command after a resize, SimpleScsiDisk detects the capacity change and returns UNIT_ATTENTION; the guest retries and re-reads capacity |
| IDE | Not supported | IDE has no standardized capacity-change notification |
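The frontend side of this table is essentially a background loop: wait for the backend to report a resize, then raise the transport-specific notification. Below is a minimal sketch with the async machinery replaced by a plain channel so it stays dependency-free; in the real NVMe frontend this is an async task that awaits disk.wait_resize() per namespace and completes a queued AER.

```rust
// Sketch of the frontend-side pattern from the table above, with the
// async plumbing replaced by a std channel so the example has no
// dependencies.

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    // Stand-in for wait_resize(): the "backend" sends the new sector count
    // when it observes a capacity change (e.g. a uevent in disk_blockdevice).
    let (resize_tx, resize_rx) = mpsc::channel::<u64>();

    // Stand-in for the frontend's background notification task.
    let notifier = thread::spawn(move || {
        while let Ok(new_sectors) = resize_rx.recv() {
            // NVMe: complete a pending AER with CHANGED_NAMESPACE_LIST.
            // SCSI:  arm a UNIT_ATTENTION for the next command instead.
            println!("capacity changed to {new_sectors} sectors; notifying guest");
        }
    });

    // Simulate a host-side resize a little later.
    thread::sleep(Duration::from_millis(50));
    resize_tx.send(4 * 1024 * 1024).unwrap();
    drop(resize_tx); // closing the channel ends the notifier task
    notifier.join().unwrap();
}
```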
Backend wait_resize support:
The DiskIo trait has a wait_resize method that defaults to pending() (never completes). Only backends that can detect runtime capacity changes override it:
| Backend | wait_resize | How |
| --- | --- | --- |
| disk_blockdevice | ✅ Event-driven | Linux uevent listener for block device resize events |
| disk_nvme | ✅ Event-driven | NVMe driver monitors AENs from the physical controller; rescans the namespace to detect capacity changes |
| disk_file | ❌ Default (pending) | No file-change monitoring |
| disk_vhd1 | ❌ Default (pending) | Fixed-size format |
| disk_blob | ❌ Default (pending) | Remote blob, no resize |
| disk_layered | ✅ Delegates | Delegates to the bottom-most layer |
| Wrappers (crypt, delay, prwrap) | ✅ Delegates | Forward to the inner disk |
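The "Delegates" rows deserve a line of illustration: the layered disk and the wrappers do not detect resizes themselves, they forward the call inward, so resize support is whatever the innermost backend provides. A toy sketch (again, not the real trait):

```rust
// Toy sketch of the "Delegates" behavior, not the real DiskIo trait.

trait ToyResize {
    /// Default: never completes, i.e. no online resize support
    /// (the disk_file / disk_vhd1 / disk_blob rows take this).
    async fn wait_resize(&self) -> u64 {
        std::future::pending::<u64>().await
    }
}

/// A fixed-size backend just takes the default.
struct FixedBackend;
impl ToyResize for FixedBackend {}

/// A wrapper (disk_crypt / disk_delay / disk_prwrap-style) forwards to the
/// disk it wraps; it neither adds nor removes resize support.
struct Wrapper<D> {
    inner: D,
}

impl<D: ToyResize> ToyResize for Wrapper<D> {
    async fn wait_resize(&self) -> u64 {
        self.inner.wait_resize().await
    }
}
```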
OpenHCL vs standalone:
In OpenHCL, the resize path is the same as standalone: disk_blockdevice detects the uevent from the host, wait_resize completes, and the NVMe or SCSI frontend notifies the VTL0 guest through the standard mechanism. There is no special paravisor-level resize interception.
The doc should explain this end-to-end flow and make clear which backends actually support it, since a contributor attaching a disk_file backend and expecting runtime resize will be confused when nothing happens.
RAM disk (mem:<len>)
The CLI supports standalone RAM disks via mem:<len> (e.g., --disk mem:1G). This is distinct from memdiff:<disk>, which stacks a RAM layer on top of a backing disk.
Under the hood, mem:<len> creates a RamDiskLayerHandle { len: Some(len) } wrapped in a single-layer LayeredDiskHandle. So even a "standalone" RAM disk is actually the layered disk machinery with one layer.
memdiff:<disk> creates a RamDiskLayerHandle { len: None } (sized from the backing disk) stacked on top of the inner disk. Writes go to the RAM layer; reads fall through to the backing disk for sectors not yet written.
The doc should explain this because the CLI surface (mem: vs memdiff:) hides the underlying layered disk model, and contributors reading the code will see RamDiskLayerHandle in both cases and wonder what the difference is.
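A short illustration of how the two spellings decompose might help here. In the sketch below, RamDiskLayerHandle's len field matches the description above, but the surrounding LayeredDiskHandle shape (a vector of layers over an optional backing disk) and its field names are simplified stand-ins rather than the real configuration types.

```rust
// Illustration of how the two CLI spellings decompose into layered-disk
// configuration. Only RamDiskLayerHandle's len field follows the real
// code described above; the rest is simplified for illustration.

/// RAM layer config: Some(len) = standalone size, None = size from below.
struct RamDiskLayerHandle {
    len: Option<u64>,
}

/// Simplified layered-disk config: layers on top of an optional backing disk.
struct LayeredDiskHandle {
    layers: Vec<RamDiskLayerHandle>,
    backing: Option<String>, // stand-in for the inner disk resource
}

fn main() {
    // --disk mem:1G  -> a single fixed-size RAM layer, no backing disk.
    let standalone_ram = LayeredDiskHandle {
        layers: vec![RamDiskLayerHandle { len: Some(1 << 30) }],
        backing: None,
    };

    // --disk memdiff:<disk> -> a RAM layer sized from the backing disk,
    // stacked on top of it; writes stay in RAM, reads fall through.
    let memdiff = LayeredDiskHandle {
        layers: vec![RamDiskLayerHandle { len: None }],
        backing: Some("file:base.img".into()),
    };

    println!(
        "mem: len={:?}; memdiff backing={:?}",
        standalone_ram.layers[0].len,
        memdiff.backing
    );
}
```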
Virtual optical / DVD
The storage stack supports virtual DVD/CD-ROM drives, which have a different model from disk devices.
How it works:
SimpleScsiDvd (in scsidisk/src/scsidvd/) implements AsyncScsiDisk and handles optical-specific SCSI commands: GET_EVENT_STATUS_NOTIFICATION, GET_CONFIGURATION, START_STOP_UNIT (eject), media change events, and the standard read path.
The GuestMedia enum (in ide_resources/) distinguishes GuestMedia::Dvd from GuestMedia::Disk. DVD wraps a SimpleScsiDvdHandle which holds a Resource<ScsiDeviceHandleKind>, while Disk wraps a Resource<DiskHandleKind>.
Eject is supported: the DiskIo trait has an eject() method (defaults to UnsupportedEject), and SimpleScsiDvd handles the SCSI START_STOP_UNIT command with the load/eject flag. Once ejected, the media is permanently removed for the lifetime of the VM.
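The media split could be illustrated with the rough shape of the enum. In the sketch below, Resource and the handle-kind types are placeholders for the real typed-resource machinery; only the two-variant split mirrors the actual GuestMedia described above.

```rust
// Shape of the guest-media split described above. Resource and the
// handle-kind types are stand-ins; the real definitions live in
// ide_resources/ and the shared resource machinery.

use std::marker::PhantomData;

/// Stand-in for the typed resource wrapper used by the real code.
struct Resource<Kind>(PhantomData<Kind>);

struct DiskHandleKind;
struct ScsiDeviceHandleKind;

/// A guest-visible drive is either a plain disk or an optical drive.
enum GuestMedia {
    /// Plain disk: resolves straight to a DiskIo backend.
    Disk(Resource<DiskHandleKind>),
    /// DVD: a SCSI device resource (typically a SimpleScsiDvdHandle),
    /// served via SimpleScsiDvd on StorVSP and via ATAPI on IDE.
    Dvd(Resource<ScsiDeviceHandleKind>),
}

fn main() {
    let _disk = GuestMedia::Disk(Resource(PhantomData));
    let _dvd = GuestMedia::Dvd(Resource(PhantomData));
}
```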
Frontend support:
| Frontend | DVD support |
| --- | --- |
| StorVSP/SCSI | ✅ Via SimpleScsiDvd |
| IDE | ✅ Via AtapiDrive wrapping SimpleScsiDvd through ATAPI |
| NVMe | ❌ Explicitly rejected ("dvd not supported with nvme") |
CLI surface:
DVD is specified with the dvd flag on --disk or --ide:
--disk file:my.iso,dvd → SCSI optical drive
--ide file:my.iso,dvd → IDE optical drive (ATAPI)
The dvd flag implicitly sets read_only = true.
The doc should cover the DVD model because it is a common source of confusion: the guest media enum, the SCSI-vs-ATAPI layering, why NVMe rejects DVD, and how eject works.
Where this belongs
This is architecture reference content. I think it belongs as a new page or set of pages under the architecture section, near the existing OpenVMM and OpenHCL architecture pages. It should cross-link to:
Possible locations:
Guide/src/reference/architecture/openvmm/storage.md — for the shared storage pipeline
Guide/src/reference/devices/ if we want it closer to device docs

I lean toward the architecture section since this is about the internal pipeline, not about a single device.

What should be rustdoc vs Guide

| Content | Where |
| --- | --- |
| DiskIo trait semantics, method contracts, scatter-gather model | disk_backend |
| wait_resize method contract and default behavior | disk_backend |
| LayerIo trait, bitmap semantics, layer configuration | disk_layered |
| GuestMedia model, eject, frontend support matrix | |

Goals
Non-goals
Rough implementation plan
DiskIo trait, frontends, the SCSI adapter, backends, wrappers, layered disks
disk_backend and disk_layered crate-level docs