Skip to content

WIP: harm runtime#50

Draft
monoid wants to merge 2 commits intomasterfrom
harm/relocations-runtime
Draft

WIP: harm runtime#50
monoid wants to merge 2 commits intomasterfrom
harm/relocations-runtime

Conversation

@monoid
Copy link
Owner

@monoid monoid commented Feb 2, 2026

Memory buffers for collecting instructions, applying relocation and execution.

Ref #31.

Summary by CodeRabbit

  • New Features

    • Added instruction assembly capabilities with label management and relocation tracking.
    • Introduced memory abstraction layer supporting alignment, growth, and conversion to executable code.
    • Optional memory-mapped memory support for flexible memory handling.
  • Dependencies

    • Added error handling dependency.
    • Made memory-mapping dependency optional with feature configuration.

@monoid monoid self-assigned this Feb 2, 2026
@monoid monoid added the enhancement New feature or request label Feb 2, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 2, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

  • ✅ Full review completed - (🔄 Check again to review again)
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch harm/relocations-runtime

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@monoid monoid added the runtime The `harm`'s runtime: memory mappings, applying relocations, etc. label Feb 2, 2026
@monoid monoid force-pushed the harm/relocations-runtime branch from ee98019 to c0daa2c Compare February 2, 2026 11:51
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@harm-runtime/src/labels.rs`:
- Around line 49-83: The current label APIs (define_label, define_named_label,
name_label) panic or silently overwrite; change them to return Result<LabelId,
LabelRegistryError> (or Result<(), ...> for define_label/name_label) and
introduce a LabelRegistryError enum with variants Unregistered(LabelId),
AlreadyDefined(LabelId), and NameConflict(String). Implement define_label to
check self.labels: return Err(Unregistered) if the id isn't registered,
Err(AlreadyDefined) if it's already an Offset, otherwise set LabelInfo::Offset
and return Ok. Change define_named_label to refuse to overwrite an existing
Offset or rebind an existing name: if name exists map to id and that id is
AlreadyDefined return Err(AlreadyDefined); if name absent but id exists check
and insert only if not already defined, else return Err(Unregistered) or
NameConflict accordingly; return Ok(id) on success. Change name_label to return
Result and only insert name if id is registered and the name isn't already bound
(return NameConflict if it is), otherwise return Unregistered; update all
Assembler call sites to propagate or handle these Results.

In `@harm-runtime/src/memory.rs`:
- Around line 30-37: The align method can panic when alignment == 0 due to pos %
alignment; modify align (the fn align(&mut self, alignment: usize) -> Result<(),
Self::ExtendError>) to explicitly guard against alignment == 0 (and optionally
treat alignment <= 1 as a no-op) before computing pos % alignment; if alignment
<= 1 simply return Ok(()) (or return an appropriate error) and otherwise proceed
to compute remn from self.pos() and call self.try_extend(... ) as before to
perform padding.

In `@harm-runtime/src/memory/memmap2.rs`:
- Around line 132-166: The into_fixed_memory implementations for Vec<u8> and for
&mut Vec<u8> must explicitly reject zero-length vectors to avoid
platform-specific failures from memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
- Around line 175-190: The AArch64 test_mmap writes JIT code into memory (via
Mmap2Buffer::allocate -> into_fixed_memory -> into_executable_memory) but
doesn't flush the I-cache, so exec.as_ptr() may point to stale instructions; add
an I-cache flush before transmuting exec.as_ptr() to a function pointer by
adding the clear-cache crate to Cargo.toml and calling
clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len())) (or, if
you prefer FFI, declare and call libc __clear_cache with the same begin/end)
immediately before the unsafe transmute/func() call in test_mmap.

In `@harm-runtime/src/runtime.rs`:
- Around line 26-28: The public build() currently panics via todo!(); change its
API to avoid panics by returning a Result and introduce a BuildError enum (e.g.,
BuildError::Unimplemented) so callers receive a structured error until
implementation is added; update the function signature pub fn build(self) ->
Result<(), BuildError> (or gate behind a feature flag) and replace todo!() with
an Err(BuildError::Unimplemented) so no downstream caller is crashed by a panic.

Comment on lines +49 to +83
pub fn define_label(&mut self, label_id: LabelId, offset: Offset) {
if let Some(info) = self.labels.get_mut(&label_id) {
match info {
LabelInfo::Forward => {
*info = LabelInfo::Offset(offset);
}
LabelInfo::Offset(_) => {
todo!("Label {label_id:?} is already defined");
}
}
} else {
panic!("Label {label_id:?} is not registered");
}
}

#[inline]
pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId {
if let Some(id) = self.named_labels.get(name).copied() {
self.labels.insert(id, LabelInfo::Offset(offset));
id
} else {
let id = self.next_label();
self.named_labels.insert(name.to_string(), id);
self.labels.insert(id, LabelInfo::Offset(offset));
id
}
}

pub fn name_label(&mut self, id: LabelId, name: &str) {
if self.labels.contains_key(&id) {
self.named_labels.insert(name.to_string(), id);
} else {
panic!("Label {id:?} is not registered");
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid panics and silent redefinitions in label definition APIs.

define_label uses todo!/panic!, define_named_label overwrites existing definitions, and name_label can rebind names. For a public API this should return Result with explicit errors to prevent accidental label corruption and runtime panics. This will also require updating Assembler call sites.

Suggested fix (return Result and guard redefinitions)
-    pub fn define_label(&mut self, label_id: LabelId, offset: Offset) {
-        if let Some(info) = self.labels.get_mut(&label_id) {
-            match info {
-                LabelInfo::Forward => {
-                    *info = LabelInfo::Offset(offset);
-                }
-                LabelInfo::Offset(_) => {
-                    todo!("Label {label_id:?} is already defined");
-                }
-            }
-        } else {
-            panic!("Label {label_id:?} is not registered");
-        }
-    }
+    pub fn define_label(
+        &mut self,
+        label_id: LabelId,
+        offset: Offset,
+    ) -> Result<(), LabelRegistryError> {
+        if let Some(info) = self.labels.get_mut(&label_id) {
+            match info {
+                LabelInfo::Forward => {
+                    *info = LabelInfo::Offset(offset);
+                    Ok(())
+                }
+                LabelInfo::Offset(_) => Err(LabelRegistryError::AlreadyDefined(label_id)),
+            }
+        } else {
+            Err(LabelRegistryError::Unregistered(label_id))
+        }
+    }

-    pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId {
+    pub fn define_named_label(
+        &mut self,
+        name: &str,
+        offset: Offset,
+    ) -> Result<LabelId, LabelRegistryError> {
         if let Some(id) = self.named_labels.get(name).copied() {
-            self.labels.insert(id, LabelInfo::Offset(offset));
-            id
+            if matches!(self.labels.get(&id), Some(LabelInfo::Offset(_))) {
+                return Err(LabelRegistryError::AlreadyDefined(id));
+            }
+            self.labels.insert(id, LabelInfo::Offset(offset));
+            Ok(id)
         } else {
             let id = self.next_label();
             self.named_labels.insert(name.to_string(), id);
             self.labels.insert(id, LabelInfo::Offset(offset));
-            id
+            Ok(id)
         }
     }

-    pub fn name_label(&mut self, id: LabelId, name: &str) {
-        if self.labels.contains_key(&id) {
-            self.named_labels.insert(name.to_string(), id);
-        } else {
-            panic!("Label {id:?} is not registered");
-        }
-    }
+    pub fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> {
+        if !self.labels.contains_key(&id) {
+            return Err(LabelRegistryError::Unregistered(id));
+        }
+        if let Some(existing) = self.named_labels.get(name) {
+            if *existing != id {
+                return Err(LabelRegistryError::NameConflict(name.to_string()));
+            }
+        }
+        self.named_labels.insert(name.to_string(), id);
+        Ok(())
+    }
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LabelRegistryError {
    Unregistered(LabelId),
    AlreadyDefined(LabelId),
    NameConflict(String),
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn define_label(&mut self, label_id: LabelId, offset: Offset) {
if let Some(info) = self.labels.get_mut(&label_id) {
match info {
LabelInfo::Forward => {
*info = LabelInfo::Offset(offset);
}
LabelInfo::Offset(_) => {
todo!("Label {label_id:?} is already defined");
}
}
} else {
panic!("Label {label_id:?} is not registered");
}
}
#[inline]
pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId {
if let Some(id) = self.named_labels.get(name).copied() {
self.labels.insert(id, LabelInfo::Offset(offset));
id
} else {
let id = self.next_label();
self.named_labels.insert(name.to_string(), id);
self.labels.insert(id, LabelInfo::Offset(offset));
id
}
}
pub fn name_label(&mut self, id: LabelId, name: &str) {
if self.labels.contains_key(&id) {
self.named_labels.insert(name.to_string(), id);
} else {
panic!("Label {id:?} is not registered");
}
}
pub fn define_label(
&mut self,
label_id: LabelId,
offset: Offset,
) -> Result<(), LabelRegistryError> {
if let Some(info) = self.labels.get_mut(&label_id) {
match info {
LabelInfo::Forward => {
*info = LabelInfo::Offset(offset);
Ok(())
}
LabelInfo::Offset(_) => Err(LabelRegistryError::AlreadyDefined(label_id)),
}
} else {
Err(LabelRegistryError::Unregistered(label_id))
}
}
#[inline]
pub fn define_named_label(
&mut self,
name: &str,
offset: Offset,
) -> Result<LabelId, LabelRegistryError> {
if let Some(id) = self.named_labels.get(name).copied() {
if matches!(self.labels.get(&id), Some(LabelInfo::Offset(_))) {
return Err(LabelRegistryError::AlreadyDefined(id));
}
self.labels.insert(id, LabelInfo::Offset(offset));
Ok(id)
} else {
let id = self.next_label();
self.named_labels.insert(name.to_string(), id);
self.labels.insert(id, LabelInfo::Offset(offset));
Ok(id)
}
}
pub fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> {
if !self.labels.contains_key(&id) {
return Err(LabelRegistryError::Unregistered(id));
}
if let Some(existing) = self.named_labels.get(name) {
if *existing != id {
return Err(LabelRegistryError::NameConflict(name.to_string()));
}
}
self.named_labels.insert(name.to_string(), id);
Ok(())
}
🤖 Prompt for AI Agents
In `@harm-runtime/src/labels.rs` around lines 49 - 83, The current label APIs
(define_label, define_named_label, name_label) panic or silently overwrite;
change them to return Result<LabelId, LabelRegistryError> (or Result<(), ...>
for define_label/name_label) and introduce a LabelRegistryError enum with
variants Unregistered(LabelId), AlreadyDefined(LabelId), and
NameConflict(String). Implement define_label to check self.labels: return
Err(Unregistered) if the id isn't registered, Err(AlreadyDefined) if it's
already an Offset, otherwise set LabelInfo::Offset and return Ok. Change
define_named_label to refuse to overwrite an existing Offset or rebind an
existing name: if name exists map to id and that id is AlreadyDefined return
Err(AlreadyDefined); if name absent but id exists check and insert only if not
already defined, else return Err(Unregistered) or NameConflict accordingly;
return Ok(id) on success. Change name_label to return Result and only insert
name if id is registered and the name isn't already bound (return NameConflict
if it is), otherwise return Unregistered; update all Assembler call sites to
propagate or handle these Results.

Comment on lines +30 to +37
/// Align position.
fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {
let pos = self.pos();
let remn = pos % alignment;
if remn != 0 {
self.try_extend(core::iter::repeat(0).take(alignment - remn))?;
}
Ok(())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard against alignment == 0 to avoid panic.

pos % alignment will panic when alignment is 0. Consider treating alignment ≤ 1 as a no-op (or returning an error) to keep the API safe.

Suggested fix
     fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {
-        let pos = self.pos();
-        let remn = pos % alignment;
+        if alignment <= 1 {
+            return Ok(());
+        }
+        let pos = self.pos();
+        let remn = pos % alignment;
         if remn != 0 {
             self.try_extend(core::iter::repeat(0).take(alignment - remn))?;
         }
         Ok(())
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Align position.
fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {
let pos = self.pos();
let remn = pos % alignment;
if remn != 0 {
self.try_extend(core::iter::repeat(0).take(alignment - remn))?;
}
Ok(())
/// Align position.
fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {
if alignment <= 1 {
return Ok(());
}
let pos = self.pos();
let remn = pos % alignment;
if remn != 0 {
self.try_extend(core::iter::repeat(0).take(alignment - remn))?;
}
Ok(())
}
🤖 Prompt for AI Agents
In `@harm-runtime/src/memory.rs` around lines 30 - 37, The align method can panic
when alignment == 0 due to pos % alignment; modify align (the fn align(&mut
self, alignment: usize) -> Result<(), Self::ExtendError>) to explicitly guard
against alignment == 0 (and optionally treat alignment <= 1 as a no-op) before
computing pos % alignment; if alignment <= 1 simply return Ok(()) (or return an
appropriate error) and otherwise proceed to compute remn from self.pos() and
call self.try_extend(... ) as before to perform padding.

Comment on lines +132 to +166
#[inline]
fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
let mut mem = Mmap2FixedMemory::allocate(self.len())?;
mem.as_mut().copy_from_slice(&self);
Ok(mem)
}
}

impl Memory<Mmap2FixedMemory> for &mut Vec<u8> {
type ExtendError = Infallible;
type FixedMemoryError = std::io::Error;

#[inline]
fn pos(&self) -> usize {
self.len()
}

#[inline]
fn capacity(&self) -> Option<usize> {
None
}

#[inline]
fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError> {
self.extend(bytes);
Ok(())
}

#[inline]
fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
let mut mem = Mmap2FixedMemory::allocate(self.len())?;
// The memmap2 spec doesn't say that the length can be different...
mem.as_mut().copy_from_slice(self);
Ok(mem)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*.rs" | head -20

Repository: monoid/harm

Length of output: 1026


🏁 Script executed:

rg "Mmap2FixedMemory" -t rust -A 5

Repository: monoid/harm

Length of output: 4285


🌐 Web query:

memmap2 map_anon(0) behavior zero-length platform-dependent

💡 Result:

memmap2::MmapMut::map_anon(0) (or MmapOptions::new().len(0).map_anon()) has no stable, cross-platform meaning: memmap2 does not guarantee that a zero-length anonymous mapping is supported; it just forwards the request to the OS, so the result is platform/OS-kernel dependent.

  • memmap2 explicitly notes that if you don’t set a length, you’ll create a zero-length anonymous mapping attempt. Whether that succeeds is then up to the underlying system call. [1]
  • On Linux/POSIX-style mmap, the length must be greater than 0; Linux documents this requirement, and mmap(…, length=0, …) fails (EINVAL on modern Linux). [2]

Practical takeaway: if you need “empty buffer” semantics, special-case len == 0 and use an empty Vec<u8> / slice instead of trying to create a 0-byte memmap.

Sources: [1] [2]


🏁 Script executed:

rg "fn allocate" harm-runtime/src/memory/memmap2.rs -A 10

Repository: monoid/harm

Length of output: 616


🏁 Script executed:

cat -n harm-runtime/src/memory/memmap2.rs | head -100

Repository: monoid/harm

Length of output: 2897


Reject zero-length vector conversions explicitly.

memmap2::MmapMut::map_anon(0) behavior is platform-dependent and fails on Linux/POSIX with EINVAL. Explicitly checking for zero-length in both Vec<u8> and &mut Vec<u8> conversions avoids silent platform-specific failures.

Suggested fix
     fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
+        if self.is_empty() {
+            return Err(std::io::Error::new(
+                std::io::ErrorKind::InvalidInput,
+                "cannot convert empty vector to fixed memory",
+            ));
+        }
         let mut mem = Mmap2FixedMemory::allocate(self.len())?;
         mem.as_mut().copy_from_slice(&self);
         Ok(mem)
     }

Apply the same check to the &mut Vec<u8> implementation.

🤖 Prompt for AI Agents
In `@harm-runtime/src/memory/memmap2.rs` around lines 132 - 166, The
into_fixed_memory implementations for Vec<u8> and for &mut Vec<u8> must
explicitly reject zero-length vectors to avoid platform-specific failures from
memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.

Comment on lines +175 to +190
#[test]
#[cfg(target_arch = "aarch64")]
fn test_mmap() {
let mut buf = Mmap2Buffer::allocate(1024).expect("mmap failed, system problem");
buf.try_extend(harm::instructions::control::ret().bytes())
.unwrap();

let mem = buf.into_fixed_memory().unwrap();
// Doing relocations...

let exec = mem.into_executable_memory().unwrap();

unsafe {
let func: unsafe extern "C" fn() = std::mem::transmute(exec.as_ptr());
func();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the file structure and find the specific file
find . -name "memmap2.rs" -type f

Repository: monoid/harm

Length of output: 91


🏁 Script executed:

# Check the actual code at the specified lines
cat -n harm-runtime/src/memory/memmap2.rs | sed -n '160,200p'

Repository: monoid/harm

Length of output: 1511


🏁 Script executed:

# Search for any existing __clear_cache or cache invalidation calls
rg "__clear_cache|clear_cache|cache.*inval" --type rust

Repository: monoid/harm

Length of output: 37


🌐 Web query:

AArch64 I-cache invalidation requirement for JIT code execution

💡 Result:

On AArch64, you must explicitly make your newly generated (JIT) instructions coherent with instruction fetch, because the architecture does not guarantee I-cache coherency with data writes. (manuals.plus)

Required maintenance sequence (per Arm ARM)

After writing the code bytes:

  1. Clean the data cache to the Point of Unification (PoU) for the written addresses: dc cvau, <va> (typically per cache line over the region)
  2. Barrier to ensure the clean is visible: dsb ish
  3. Invalidate the instruction cache to PoU for the same addresses: ic ivau, <va> (per cache line over the region)
  4. Barrier to ensure the invalidation completes: dsb ish
  5. Context sync before executing: isb (manuals.plus)

This is the canonical “self-modifying/JIT code” sequence for a PE to safely execute code it has written. (manuals.plus)

Multicore note

IC IVAU is broadcast within the Inner Shareable domain, but each core that will execute the new code must execute an ISB to ensure its fetched instruction stream is synchronized. (manuals.plus)

When you can omit parts (implementation-dependent)

Some CPUs report that parts of this are unnecessary via CTR_EL0:

  • If CTR_EL0.IDC is set, an explicit dc cvau is not required (but you still need the barrier to order stores).
  • If CTR_EL0.DIC is set, an explicit ic ivau is not required. (developer.arm.com)

Practical guidance (Linux / toolchains)

Use the platform-provided helper (e.g., GCC/Clang’s __builtin___clear_cache / libc __clear_cache) which implements the right loop-over-lines plus barriers for AArch64 user space where permitted. (developer.arm.com)

Citations:


🌐 Web query:

Rust core::arch::aarch64::__clear_cache availability and signature

💡 Result:

core::arch::aarch64::__clear_cache is not a provided Rust API (neither stable nor nightly) in core::arch::aarch64 as of current Rust docs (e.g. Rust 1.93.0 docs). Rust’s standard libraries intentionally do not expose LLVM intrinsics like llvm.clear_cache directly, and there is no corresponding core::intrinsics::clear_cache either. [1]

If you’re looking for “clear instruction cache for a range”:

  • LLVM IR intrinsic (what Clang/GCC builtins often lower to): declare void @llvm.clear_cache(ptr, ptr) (historically i8*, i8*). [2]
  • C runtime symbol commonly used by LLVM as a fallback:
    void __clear_cache(void *begin, void *end); [2]

In Rust, if your target’s runtime actually provides __clear_cache (e.g., via libgcc/compiler-rt), you’d typically bind it yourself, e.g.:

extern "C" {
    fn __clear_cache(begin: *mut u8, end: *mut u8);
}

…but its presence is platform/toolchain-dependent, not guaranteed by Rust.

For a portable Rust solution, use a dedicated crate that implements the behavior per-platform (e.g. clear-cache). [3]

Sources:
[1] Rust Internals discussion on not exposing LLVM intrinsics directly. (internals.rust-lang.org)
[2] LLVM LangRef llvm.clear_cache (and its lowering to __clear_cache). (releases.llvm.org)
[3] clear-cache crate description and purpose. (docs.rs)

Citations:


🏁 Script executed:

# Check if __clear_cache is used or mentioned anywhere in the project
rg "clear_cache|core::arch" --type rust

Repository: monoid/harm

Length of output: 37


🏁 Script executed:

# Check the Rust version and dependencies to understand what APIs are available
cat -n Cargo.toml | head -30

Repository: monoid/harm

Length of output: 1041


🌐 Web query:

How to call __clear_cache in Rust for AArch64 JIT code

💡 Result:

On AArch64 you must invalidate the I-cache for the modified code range before executing it (AArch64 I/D caches aren’t guaranteed coherent for JIT/self-modifying code) [2]. The “C way” is GCC/Clang’s __builtin___clear_cache(begin, end), which typically lowers to a runtime call named __clear_cache(begin, end) [1].

Recommended in Rust (portable, no reliance on __clear_cache being available)

Use a Rust implementation equivalent to __builtin___clear_cache:

# Cargo.toml
[dependencies]
clear-cache = "0.1"
unsafe fn jit_publish(code_ptr: *mut u8, code_len: usize) {
    // after writing code into [code_ptr, code_ptr+code_len)
    clear_cache::clear_cache(code_ptr, code_ptr.add(code_len));
}

This crate is explicitly a native implementation of __builtin___clear_cache (based on LLVM’s implementation) [3].

If you’re in a multi-threaded JIT, also perform the additional pipeline synchronization step used by JITs (Wasmtime documents calling clear_cache and then a pipeline flush for MT) [2]. (You can copy that pattern or use their crate, but it’s marked internal-only.)

Directly calling __clear_cache (FFI; not always link-safe)

If you specifically want __clear_cache, you can declare it like:

use core::ffi::c_void;

extern "C" {
    fn __clear_cache(begin: *mut c_void, end: *mut c_void);
}

unsafe fn jit_publish(code_ptr: *mut u8, code_len: usize) {
    __clear_cache(code_ptr as *mut c_void, code_ptr.add(code_len) as *mut c_void);
}

GCC documents __builtin___clear_cache(void *begin, void *end) for this purpose [1], and libgcc’s __clear_cache is a void*(begin,end)-style API (historically documented as char*, but corrected to void*) [4]. The catch is that, depending on target/toolchain/linking, the __clear_cache symbol may not be provided automatically—hence the “use a Rust crate” recommendation above.

Sources: GCC __builtin___clear_cache docs [1], Wasmtime JIT icache coherence notes (AArch64 non-coherent + MT pipeline flush) [2], clear-cache crate docs [3], GCC/libgcc signature correction for __clear_cache [4].


🏁 Script executed:

# Check harm-runtime's Cargo.toml for dependencies
cat -n harm-runtime/Cargo.toml

Repository: monoid/harm

Length of output: 738


🏁 Script executed:

# Search for any memory-related or cache-related code patterns
rg "mmap|memory|cache" harm-runtime/src/memory/memmap2.rs -i --context=3

Repository: monoid/harm

Length of output: 4605


Add I-cache coherency before executing JIT code on AArch64.

AArch64 does not guarantee I-cache coherency with data writes. The test will fail or execute stale code without flushing the cache. However, core::arch::aarch64::__clear_cache does not exist in Rust's standard library—you must either:

  1. Add the clear-cache crate (recommended):

    • Add clear-cache = "0.1" to Cargo.toml
    • Call clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len()))
  2. Or manually FFI-bind __clear_cache:

    extern "C" {
        fn __clear_cache(begin: *mut u8, end: *mut u8);
    }

    Then call it before the transmute. Note: availability is toolchain-dependent.

🤖 Prompt for AI Agents
In `@harm-runtime/src/memory/memmap2.rs` around lines 175 - 190, The AArch64
test_mmap writes JIT code into memory (via Mmap2Buffer::allocate ->
into_fixed_memory -> into_executable_memory) but doesn't flush the I-cache, so
exec.as_ptr() may point to stale instructions; add an I-cache flush before
transmuting exec.as_ptr() to a function pointer by adding the clear-cache crate
to Cargo.toml and calling clear_cache::clear_cache(exec.as_ptr(),
exec.as_ptr().add(exec.len())) (or, if you prefer FFI, declare and call libc
__clear_cache with the same begin/end) immediately before the unsafe
transmute/func() call in test_mmap.

Comment on lines +26 to +28
pub fn build(self) {
todo!()
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Public build() should not panic.

todo!() will crash downstream callers. Consider returning a structured error (or gating the API) until build is implemented.

Suggested fix (return explicit error)
-    pub fn build(self) {
-        todo!()
-    }
+    pub fn build(self) -> Result<(), BuildError> {
+        Err(BuildError::Unimplemented)
+    }
#[derive(Debug)]
pub enum BuildError {
    Unimplemented,
}
🤖 Prompt for AI Agents
In `@harm-runtime/src/runtime.rs` around lines 26 - 28, The public build()
currently panics via todo!(); change its API to avoid panics by returning a
Result and introduce a BuildError enum (e.g., BuildError::Unimplemented) so
callers receive a structured error until implementation is added; update the
function signature pub fn build(self) -> Result<(), BuildError> (or gate behind
a feature flag) and replace todo!() with an Err(BuildError::Unimplemented) so no
downstream caller is crashed by a panic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request runtime The `harm`'s runtime: memory mappings, applying relocations, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant