hvisor physical memory overlaps with root zone RAM regions (memory stomping) #310

@Inquisitor-201

Description

hvisor's own physical memory range [skernel, __hv_end) has repeatedly ended up inside the MEM_TYPE_RAM regions defined in ROOT_ZONE_MEMORY_REGIONS across various board configs. When that happens, the root zone's Linux kernel page allocator treats hvisor's pages as free RAM, hands them out to kernel or user code, and writes to them, silently corrupting hvisor's page tables and heap.

The corruption eventually manifests as seemingly unrelated crashes: "unhandled MMIO fault" panics, stage-2 translation faults, or hangs. The root cause (memory stomping) is very hard to diagnose from the symptoms.

History

This problem has occurred at least 10 times across different boards and refactoring cycles, often after:

  • Adding PCIe support (which significantly increased binary size)
  • Adjusting HV_MEM_POOL_SIZE
  • Porting to new boards where BASE_ADDRESS wasn't carefully checked against the RAM layout

Each time it was "fixed" by manually moving BASE_ADDRESS in the linker script, but the same root cause reappears because nothing enforces the invariant.

Current Mitigations (PR #XXX)

  1. Compile-time overlap check (tools/check_hv_mem_overlap.py): Post-link script that reads skernel/__hv_end from the ELF, parses ROOT_ZONE_MEMORY_REGIONS from board.rs, and fails the build if any MEM_TYPE_RAM region overlaps hvisor's range.

  2. Runtime diagnostic (check_fault_in_hvisor_mem() in trap.rs): When handle_dabt gets an MMIO fault whose address falls within hvisor's memory range, it prints a specific diagnostic ("FAULT ADDRESS is within hvisor's physical memory range") before panicking, instead of the generic "mmio_handle_access" error.
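The core of the compile-time check in item 1 is a range-intersection test. The following is a minimal sketch of that test only, not the actual contents of tools/check_hv_mem_overlap.py: it assumes the ELF symbols and the board regions have already been extracted into plain integers, and the function name `find_overlaps`, the `(start, size)` tuple layout, and all addresses are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (physical_start, size) of one MEM_TYPE_RAM region.
Region = Tuple[int, int]


def find_overlaps(hv_start: int, hv_end: int, ram_regions: List[Region]) -> List[Region]:
    """Return the RAM regions that intersect the half-open range [hv_start, hv_end)."""
    overlapping = []
    for start, size in ram_regions:
        end = start + size
        # Two half-open ranges intersect iff each one starts before the other ends.
        if start < hv_end and hv_start < end:
            overlapping.append((start, size))
    return overlapping


if __name__ == "__main__":
    # Made-up layout for illustration: hvisor sits inside the first RAM region.
    hv_start, hv_end = 0x8000_0000, 0x8040_0000
    root_ram = [(0x5000_0000, 0x4000_0000), (0x9000_0000, 0x1000_0000)]
    bad = find_overlaps(hv_start, hv_end, root_ram)
    for start, size in bad:
        print(f"overlap: RAM [{start:#x}, {start + size:#x}) vs hvisor [{hv_start:#x}, {hv_end:#x})")
    # A post-link check would fail the build here when any overlap is found.
    raise SystemExit(1 if bad else 0)
```

Treating both ranges as half-open means a RAM region that starts exactly at __hv_end is not reported, which matches the [skernel, __hv_end) notation used above.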

What a Real Fix Would Look Like

These mitigations only detect the overlap; they do not prevent it. A real fix needs an architectural change, for example one of:

  • Dynamically reserve hvisor's physical pages from the root zone's memory map at boot
  • Or, always place hvisor in a dedicated physical address range outside of any board's RAM layout
  • Or, punch a hole in the root zone RAM regions for hvisor's range automatically during zone creation
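The third option, punching a hole, amounts to splitting each overlapping RAM region around hvisor's range. The sketch below shows that logic under stated assumptions: regions are modeled as hypothetical `(start, size)` tuples (real entries carry more fields, such as the memory type), the hole is widened to 4 KiB page boundaries, and `punch_hole` and `PAGE_SIZE` are illustrative names, not existing hvisor APIs. An actual implementation would live in hvisor's zone-creation code rather than in Python.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # hypothetical (physical_start, size) of a RAM region
PAGE_SIZE = 0x1000        # assumed 4 KiB page granularity


def align_down(addr: int) -> int:
    return addr & ~(PAGE_SIZE - 1)


def align_up(addr: int) -> int:
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def punch_hole(ram_regions: List[Region], hv_start: int, hv_end: int) -> List[Region]:
    """Remove [hv_start, hv_end), widened to page boundaries, from every region."""
    hole_start = align_down(hv_start)
    hole_end = align_up(hv_end)
    result = []
    for start, size in ram_regions:
        end = start + size
        if end <= hole_start or hole_end <= start:
            # No intersection with the hole: keep the region unchanged.
            result.append((start, size))
            continue
        if start < hole_start:
            # Keep the part of the region below the hole.
            result.append((start, hole_start - start))
        if hole_end < end:
            # Keep the part of the region above the hole.
            result.append((hole_end, end - hole_end))
    return result
```

For example, with a made-up region `(0x5000_0000, 0x4000_0000)` and hvisor at `[0x8000_0000, 0x8040_0800)`, the result is two regions, one ending at `0x8000_0000` and one starting at `0x8040_1000`. A region that lies entirely inside the hole is dropped. The root zone's device tree would need to describe the same reduced layout, otherwise Linux would still believe the removed pages are usable.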
