WIP: harm runtime by monoid · Pull Request #50 · monoid/harm

monoid · 2026-02-02T11:49:51Z

Memory buffers for collecting instructions, applying relocation and execution.

Ref #31.

Summary by CodeRabbit

New Features
- Added instruction assembly capabilities with label management and relocation tracking.
- Introduced memory abstraction layer supporting alignment, growth, and conversion to executable code.
- Optional memory-mapped memory support for flexible memory handling.
Dependencies
- Added error handling dependency.
- Made memory-mapping dependency optional with feature configuration.

coderabbitai · 2026-02-02T11:50:00Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✅ Full review completed - (🔄 Check again to review again)

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch harm/relocations-runtime

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Fix all issues with AI agents

In `@harm-runtime/src/labels.rs`:
- Around line 49-83: The current label APIs (define_label, define_named_label,
name_label) panic or silently overwrite; change them to return Result<LabelId,
LabelRegistryError> (or Result<(), ...> for define_label/name_label) and
introduce a LabelRegistryError enum with variants Unregistered(LabelId),
AlreadyDefined(LabelId), and NameConflict(String). Implement define_label to
check self.labels: return Err(Unregistered) if the id isn't registered,
Err(AlreadyDefined) if it's already an Offset, otherwise set LabelInfo::Offset
and return Ok. Change define_named_label to refuse to overwrite an existing
Offset or rebind an existing name: if name exists map to id and that id is
AlreadyDefined return Err(AlreadyDefined); if name absent but id exists check
and insert only if not already defined, else return Err(Unregistered) or
NameConflict accordingly; return Ok(id) on success. Change name_label to return
Result and only insert name if id is registered and the name isn't already bound
(return NameConflict if it is), otherwise return Unregistered; update all
Assembler call sites to propagate or handle these Results.

In `@harm-runtime/src/memory.rs`:
- Around line 30-37: The align method can panic when alignment == 0 due to pos %
alignment; modify align (the fn align(&mut self, alignment: usize) -> Result<(),
Self::ExtendError>) to explicitly guard against alignment == 0 (and optionally
treat alignment <= 1 as a no-op) before computing pos % alignment; if alignment
<= 1 simply return Ok(()) (or return an appropriate error) and otherwise proceed
to compute remn from self.pos() and call self.try_extend(... ) as before to
perform padding.

In `@harm-runtime/src/memory/memmap2.rs`:
- Around line 132-166: The into_fixed_memory implementations for Vec<u8> and for
&mut Vec<u8> must explicitly reject zero-length vectors to avoid
platform-specific failures from memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
- Around line 175-190: The AArch64 test_mmap writes JIT code into memory (via
Mmap2Buffer::allocate -> into_fixed_memory -> into_executable_memory) but
doesn't flush the I-cache, so exec.as_ptr() may point to stale instructions; add
an I-cache flush before transmuting exec.as_ptr() to a function pointer by
adding the clear-cache crate to Cargo.toml and calling
clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len())) (or, if
you prefer FFI, declare and call libc __clear_cache with the same begin/end)
immediately before the unsafe transmute/func() call in test_mmap.

In `@harm-runtime/src/runtime.rs`:
- Around line 26-28: The public build() currently panics via todo!(); change its
API to avoid panics by returning a Result and introduce a BuildError enum (e.g.,
BuildError::Unimplemented) so callers receive a structured error until
implementation is added; update the function signature pub fn build(self) ->
Result<(), BuildError> (or gate behind a feature flag) and replace todo!() with
an Err(BuildError::Unimplemented) so no downstream caller is crashed by a panic.

coderabbitai · 2026-02-02T12:07:39Z

harm-runtime/src/labels.rs

+    pub fn define_label(&mut self, label_id: LabelId, offset: Offset) {
+        if let Some(info) = self.labels.get_mut(&label_id) {
+            match info {
+                LabelInfo::Forward => {
+                    *info = LabelInfo::Offset(offset);
+                }
+                LabelInfo::Offset(_) => {
+                    todo!("Label {label_id:?} is already defined");
+                }
+            }
+        } else {
+            panic!("Label {label_id:?} is not registered");
+        }
+    }
+
+    #[inline]
+    pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId {
+        if let Some(id) = self.named_labels.get(name).copied() {
+            self.labels.insert(id, LabelInfo::Offset(offset));
+            id
+        } else {
+            let id = self.next_label();
+            self.named_labels.insert(name.to_string(), id);
+            self.labels.insert(id, LabelInfo::Offset(offset));
+            id
+        }
+    }
+
+    pub fn name_label(&mut self, id: LabelId, name: &str) {
+        if self.labels.contains_key(&id) {
+            self.named_labels.insert(name.to_string(), id);
+        } else {
+            panic!("Label {id:?} is not registered");
+        }
+    }


⚠️ Potential issue | 🟠 Major

Avoid panics and silent redefinitions in label definition APIs.

define_label uses todo!/panic!, define_named_label overwrites existing definitions, and name_label can rebind names. For a public API this should return Result with explicit errors to prevent accidental label corruption and runtime panics. This will also require updating Assembler call sites.

Suggested fix (return Result and guard redefinitions)

- pub fn define_label(&mut self, label_id: LabelId, offset: Offset) { - if let Some(info) = self.labels.get_mut(&label_id) { - match info { - LabelInfo::Forward => { - *info = LabelInfo::Offset(offset); - } - LabelInfo::Offset(_) => { - todo!("Label {label_id:?} is already defined"); - } - } - } else { - panic!("Label {label_id:?} is not registered"); - } - } + pub fn define_label( + &mut self, + label_id: LabelId, + offset: Offset, + ) -> Result<(), LabelRegistryError> { + if let Some(info) = self.labels.get_mut(&label_id) { + match info { + LabelInfo::Forward => { + *info = LabelInfo::Offset(offset); + Ok(()) + } + LabelInfo::Offset(_) => Err(LabelRegistryError::AlreadyDefined(label_id)), + } + } else { + Err(LabelRegistryError::Unregistered(label_id)) + } + } - pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId { + pub fn define_named_label( + &mut self, + name: &str, + offset: Offset, + ) -> Result<LabelId, LabelRegistryError> { if let Some(id) = self.named_labels.get(name).copied() { - self.labels.insert(id, LabelInfo::Offset(offset)); - id + if matches!(self.labels.get(&id), Some(LabelInfo::Offset(_))) { + return Err(LabelRegistryError::AlreadyDefined(id)); + } + self.labels.insert(id, LabelInfo::Offset(offset)); + Ok(id) } else { let id = self.next_label(); self.named_labels.insert(name.to_string(), id); self.labels.insert(id, LabelInfo::Offset(offset)); - id + Ok(id) } } - pub fn name_label(&mut self, id: LabelId, name: &str) { - if self.labels.contains_key(&id) { - self.named_labels.insert(name.to_string(), id); - } else { - panic!("Label {id:?} is not registered"); - } - } + pub fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> { + if !self.labels.contains_key(&id) { + return Err(LabelRegistryError::Unregistered(id)); + } + if let Some(existing) = self.named_labels.get(name) { + if *existing != id { + return Err(LabelRegistryError::NameConflict(name.to_string())); + } + } + self.named_labels.insert(name.to_string(), id); + Ok(()) + }

#[derive(Debug, Clone, PartialEq, Eq)] pub enum LabelRegistryError { Unregistered(LabelId), AlreadyDefined(LabelId), NameConflict(String), }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

pub fn define_label(&mut self, label_id: LabelId, offset: Offset) {

if let Some(info) = self.labels.get_mut(&label_id) {

match info {

LabelInfo::Forward => {

*info = LabelInfo::Offset(offset);

}

LabelInfo::Offset(_) => {

todo!("Label {label_id:?} is already defined");

}

}

} else {

panic!("Label {label_id:?} is not registered");

}

}

#[inline]

pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId {

if let Some(id) = self.named_labels.get(name).copied() {

self.labels.insert(id, LabelInfo::Offset(offset));

id

} else {

let id = self.next_label();

self.named_labels.insert(name.to_string(), id);

self.labels.insert(id, LabelInfo::Offset(offset));

id

}

}

pub fn name_label(&mut self, id: LabelId, name: &str) {

if self.labels.contains_key(&id) {

self.named_labels.insert(name.to_string(), id);

} else {

panic!("Label {id:?} is not registered");

}

}

pub fn define_label(

&mut self,

label_id: LabelId,

offset: Offset,

) -> Result<(), LabelRegistryError> {

if let Some(info) = self.labels.get_mut(&label_id) {

match info {

LabelInfo::Forward => {

*info = LabelInfo::Offset(offset);

Ok(())

}

LabelInfo::Offset(_) => Err(LabelRegistryError::AlreadyDefined(label_id)),

}

} else {

Err(LabelRegistryError::Unregistered(label_id))

}

}

#[inline]

pub fn define_named_label(

&mut self,

name: &str,

offset: Offset,

) -> Result<LabelId, LabelRegistryError> {

if let Some(id) = self.named_labels.get(name).copied() {

if matches!(self.labels.get(&id), Some(LabelInfo::Offset(_))) {

return Err(LabelRegistryError::AlreadyDefined(id));

}

self.labels.insert(id, LabelInfo::Offset(offset));

Ok(id)

} else {

let id = self.next_label();

self.named_labels.insert(name.to_string(), id);

self.labels.insert(id, LabelInfo::Offset(offset));

Ok(id)

}

}

pub fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> {

if !self.labels.contains_key(&id) {

return Err(LabelRegistryError::Unregistered(id));

}

if let Some(existing) = self.named_labels.get(name) {

if *existing != id {

return Err(LabelRegistryError::NameConflict(name.to_string()));

}

}

self.named_labels.insert(name.to_string(), id);

Ok(())

}

🤖 Prompt for AI Agents

In `@harm-runtime/src/labels.rs` around lines 49 - 83, The current label APIs (define_label, define_named_label, name_label) panic or silently overwrite; change them to return Result<LabelId, LabelRegistryError> (or Result<(), ...> for define_label/name_label) and introduce a LabelRegistryError enum with variants Unregistered(LabelId), AlreadyDefined(LabelId), and NameConflict(String). Implement define_label to check self.labels: return Err(Unregistered) if the id isn't registered, Err(AlreadyDefined) if it's already an Offset, otherwise set LabelInfo::Offset and return Ok. Change define_named_label to refuse to overwrite an existing Offset or rebind an existing name: if name exists map to id and that id is AlreadyDefined return Err(AlreadyDefined); if name absent but id exists check and insert only if not already defined, else return Err(Unregistered) or NameConflict accordingly; return Ok(id) on success. Change name_label to return Result and only insert name if id is registered and the name isn't already bound (return NameConflict if it is), otherwise return Unregistered; update all Assembler call sites to propagate or handle these Results.

coderabbitai · 2026-02-02T12:07:39Z

harm-runtime/src/memory.rs

+    /// Align position.
+    fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {
+        let pos = self.pos();
+        let remn = pos % alignment;
+        if remn != 0 {
+            self.try_extend(core::iter::repeat(0).take(alignment - remn))?;
+        }
+        Ok(())


⚠️ Potential issue | 🟠 Major

Guard against alignment == 0 to avoid panic.

pos % alignment will panic when alignment is 0. Consider treating alignment ≤ 1 as a no-op (or returning an error) to keep the API safe.

Suggested fix

fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> { - let pos = self.pos(); - let remn = pos % alignment; + if alignment <= 1 { + return Ok(()); + } + let pos = self.pos(); + let remn = pos % alignment; if remn != 0 { self.try_extend(core::iter::repeat(0).take(alignment - remn))?; } Ok(()) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/// Align position.

fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {

let pos = self.pos();

let remn = pos % alignment;

if remn != 0 {

self.try_extend(core::iter::repeat(0).take(alignment - remn))?;

}

Ok(())

/// Align position.

fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {

if alignment <= 1 {

return Ok(());

}

let pos = self.pos();

let remn = pos % alignment;

if remn != 0 {

self.try_extend(core::iter::repeat(0).take(alignment - remn))?;

}

Ok(())

}

🤖 Prompt for AI Agents

In `@harm-runtime/src/memory.rs` around lines 30 - 37, The align method can panic when alignment == 0 due to pos % alignment; modify align (the fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError>) to explicitly guard against alignment == 0 (and optionally treat alignment <= 1 as a no-op) before computing pos % alignment; if alignment <= 1 simply return Ok(()) (or return an appropriate error) and otherwise proceed to compute remn from self.pos() and call self.try_extend(... ) as before to perform padding.

coderabbitai · 2026-02-02T12:07:39Z

harm-runtime/src/memory/memmap2.rs

+    #[inline]
+    fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
+        let mut mem = Mmap2FixedMemory::allocate(self.len())?;
+        mem.as_mut().copy_from_slice(&self);
+        Ok(mem)
+    }
+}
+
+impl Memory<Mmap2FixedMemory> for &mut Vec<u8> {
+    type ExtendError = Infallible;
+    type FixedMemoryError = std::io::Error;
+
+    #[inline]
+    fn pos(&self) -> usize {
+        self.len()
+    }
+
+    #[inline]
+    fn capacity(&self) -> Option<usize> {
+        None
+    }
+
+    #[inline]
+    fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError> {
+        self.extend(bytes);
+        Ok(())
+    }
+
+    #[inline]
+    fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
+        let mut mem = Mmap2FixedMemory::allocate(self.len())?;
+        // The memmap2 spec doesn't say that the length can be different...
+        mem.as_mut().copy_from_slice(self);
+        Ok(mem)
+    }


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*.rs" | head -20

Repository: monoid/harm

Length of output: 1026

🏁 Script executed:

rg "Mmap2FixedMemory" -t rust -A 5

Repository: monoid/harm

Length of output: 4285

🌐 Web query:

memmap2 map_anon(0) behavior zero-length platform-dependent

💡 Result:

memmap2::MmapMut::map_anon(0) (or MmapOptions::new().len(0).map_anon()) has no stable, cross-platform meaning: memmap2 does not guarantee that a zero-length anonymous mapping is supported; it just forwards the request to the OS, so the result is platform/OS-kernel dependent.

memmap2 explicitly notes that if you don’t set a length, you’ll create a zero-length anonymous mapping attempt. Whether that succeeds is then up to the underlying system call. [1]

On Linux/POSIX-style mmap, the length must be greater than 0; Linux documents this requirement, and mmap(…, length=0, …) fails (EINVAL on modern Linux). [2]

Practical takeaway: if you need “empty buffer” semantics, special-case len == 0 and use an empty Vec<u8> / slice instead of trying to create a 0-byte memmap.

Sources: [1] [2]

🏁 Script executed:

rg "fn allocate" harm-runtime/src/memory/memmap2.rs -A 10

Repository: monoid/harm

Length of output: 616

🏁 Script executed:

cat -n harm-runtime/src/memory/memmap2.rs | head -100

Repository: monoid/harm

Length of output: 2897

Reject zero-length vector conversions explicitly.

memmap2::MmapMut::map_anon(0) behavior is platform-dependent and fails on Linux/POSIX with EINVAL. Explicitly checking for zero-length in both Vec<u8> and &mut Vec<u8> conversions avoids silent platform-specific failures.

Suggested fix

fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> { + if self.is_empty() { + return Err(std::io::Error::new( + std::io::ErrorKind::InvalidInput, + "cannot convert empty vector to fixed memory", + )); + } let mut mem = Mmap2FixedMemory::allocate(self.len())?; mem.as_mut().copy_from_slice(&self); Ok(mem) }

Apply the same check to the &mut Vec<u8> implementation.

🤖 Prompt for AI Agents

In `@harm-runtime/src/memory/memmap2.rs` around lines 132 - 166, The into_fixed_memory implementations for Vec<u8> and for &mut Vec<u8> must explicitly reject zero-length vectors to avoid platform-specific failures from memmap2::MmapMut::map_anon(0); update both Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if self.len() == 0 and return an Err(std::io::Error) (e.g., std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting allocation/copy when length is zero.

coderabbitai · 2026-02-02T12:07:39Z

harm-runtime/src/memory/memmap2.rs

+    #[test]
+    #[cfg(target_arch = "aarch64")]
+    fn test_mmap() {
+        let mut buf = Mmap2Buffer::allocate(1024).expect("mmap failed, system problem");
+        buf.try_extend(harm::instructions::control::ret().bytes())
+            .unwrap();
+
+        let mem = buf.into_fixed_memory().unwrap();
+        // Doing relocations...
+
+        let exec = mem.into_executable_memory().unwrap();
+
+        unsafe {
+            let func: unsafe extern "C" fn() = std::mem::transmute(exec.as_ptr());
+            func();
+        }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the file structure and find the specific file find . -name "memmap2.rs" -type f

Repository: monoid/harm

Length of output: 91

🏁 Script executed:

# Check the actual code at the specified lines cat -n harm-runtime/src/memory/memmap2.rs | sed -n '160,200p'

Repository: monoid/harm

Length of output: 1511

🏁 Script executed:

# Search for any existing __clear_cache or cache invalidation calls rg "__clear_cache|clear_cache|cache.*inval" --type rust

Repository: monoid/harm

Length of output: 37

🌐 Web query:

AArch64 I-cache invalidation requirement for JIT code execution

💡 Result:

On AArch64, you must explicitly make your newly generated (JIT) instructions coherent with instruction fetch, because the architecture does not guarantee I-cache coherency with data writes. (manuals.plus)

Required maintenance sequence (per Arm ARM)

After writing the code bytes:

Clean the data cache to the Point of Unification (PoU) for the written addresses: dc cvau, <va> (typically per cache line over the region)

Barrier to ensure the clean is visible: dsb ish

Invalidate the instruction cache to PoU for the same addresses: ic ivau, <va> (per cache line over the region)

Barrier to ensure the invalidation completes: dsb ish

Context sync before executing: isb (manuals.plus)

This is the canonical “self-modifying/JIT code” sequence for a PE to safely execute code it has written. (manuals.plus)

Multicore note

IC IVAU is broadcast within the Inner Shareable domain, but each core that will execute the new code must execute an ISB to ensure its fetched instruction stream is synchronized. (manuals.plus)

When you can omit parts (implementation-dependent)

Some CPUs report that parts of this are unnecessary via CTR_EL0:

If CTR_EL0.IDC is set, an explicit dc cvau is not required (but you still need the barrier to order stores).

If CTR_EL0.DIC is set, an explicit ic ivau is not required. (developer.arm.com)

Practical guidance (Linux / toolchains)

Use the platform-provided helper (e.g., GCC/Clang’s __builtin___clear_cache / libc __clear_cache) which implements the right loop-over-lines plus barriers for AArch64 user space where permitted. (developer.arm.com)

Citations:

1: https://manuals.plus/m/c112d67a71a1b94ad89d51a4d7121422015c0103a0da13807e5272030d0b382f?utm_source=openai

2: https://manuals.plus/m/69d2d35bfa2a6d5333a412ab5ea7313bf07aa6a6a67bef56ed5e15c455599ba4?utm_source=openai

3: https://manuals.plus/m/69d2d35bfa2a6d5333a412ab5ea7313bf07aa6a6a67bef56ed5e15c455599ba4?utm_source=openai

4: https://manuals.plus/m/69d2d35bfa2a6d5333a412ab5ea7313bf07aa6a6a67bef56ed5e15c455599ba4?utm_source=openai

5: https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-implementing-clear-cache

6: https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-implementing-clear-cache

🌐 Web query:

Rust core::arch::aarch64::__clear_cache availability and signature

💡 Result:

core::arch::aarch64::__clear_cache is not a provided Rust API (neither stable nor nightly) in core::arch::aarch64 as of current Rust docs (e.g. Rust 1.93.0 docs). Rust’s standard libraries intentionally do not expose LLVM intrinsics like llvm.clear_cache directly, and there is no corresponding core::intrinsics::clear_cache either. [1]

If you’re looking for “clear instruction cache for a range”:

LLVM IR intrinsic (what Clang/GCC builtins often lower to): declare void @llvm.clear_cache(ptr, ptr) (historically i8*, i8*). [2]

C runtime symbol commonly used by LLVM as a fallback:
void __clear_cache(void *begin, void *end); [2]

In Rust, if your target’s runtime actually provides __clear_cache (e.g., via libgcc/compiler-rt), you’d typically bind it yourself, e.g.:

extern "C" { fn __clear_cache(begin: *mut u8, end: *mut u8); }

…but its presence is platform/toolchain-dependent, not guaranteed by Rust.

For a portable Rust solution, use a dedicated crate that implements the behavior per-platform (e.g. clear-cache). [3]

Sources:
[1] Rust Internals discussion on not exposing LLVM intrinsics directly. (internals.rust-lang.org)
[2] LLVM LangRef llvm.clear_cache (and its lowering to __clear_cache). (releases.llvm.org)
[3] clear-cache crate description and purpose. (docs.rs)

Citations:

1: https://internals.rust-lang.org/t/whats-rust-policy-towards-llvm-intrinsics/21328?utm_source=openai

2: https://releases.llvm.org/20.1.0/docs/LangRef.html?utm_source=openai

3: https://docs.rs/clear-cache?utm_source=openai

🏁 Script executed:

# Check if __clear_cache is used or mentioned anywhere in the project rg "clear_cache|core::arch" --type rust

Repository: monoid/harm

Length of output: 37

🏁 Script executed:

# Check the Rust version and dependencies to understand what APIs are available cat -n Cargo.toml | head -30

Repository: monoid/harm

Length of output: 1041

🌐 Web query:

How to call __clear_cache in Rust for AArch64 JIT code

💡 Result:

On AArch64 you must invalidate the I-cache for the modified code range before executing it (AArch64 I/D caches aren’t guaranteed coherent for JIT/self-modifying code) [2]. The “C way” is GCC/Clang’s __builtin___clear_cache(begin, end), which typically lowers to a runtime call named __clear_cache(begin, end) [1].

Recommended in Rust (portable, no reliance on __clear_cache being available)

Use a Rust implementation equivalent to __builtin___clear_cache:

# Cargo.toml [dependencies] clear-cache = "0.1"

unsafe fn jit_publish(code_ptr: *mut u8, code_len: usize) { // after writing code into [code_ptr, code_ptr+code_len) clear_cache::clear_cache(code_ptr, code_ptr.add(code_len)); }

This crate is explicitly a native implementation of __builtin___clear_cache (based on LLVM’s implementation) [3].

If you’re in a multi-threaded JIT, also perform the additional pipeline synchronization step used by JITs (Wasmtime documents calling clear_cache and then a pipeline flush for MT) [2]. (You can copy that pattern or use their crate, but it’s marked internal-only.)

Directly calling __clear_cache (FFI; not always link-safe)

If you specifically want __clear_cache, you can declare it like:

use core::ffi::c_void; extern "C" { fn __clear_cache(begin: *mut c_void, end: *mut c_void); } unsafe fn jit_publish(code_ptr: *mut u8, code_len: usize) { __clear_cache(code_ptr as *mut c_void, code_ptr.add(code_len) as *mut c_void); }

GCC documents __builtin___clear_cache(void *begin, void *end) for this purpose [1], and libgcc’s __clear_cache is a void*(begin,end)-style API (historically documented as char*, but corrected to void*) [4]. The catch is that, depending on target/toolchain/linking, the __clear_cache symbol may not be provided automatically—hence the “use a Rust crate” recommendation above.

Sources: GCC __builtin___clear_cache docs [1], Wasmtime JIT icache coherence notes (AArch64 non-coherent + MT pipeline flush) [2], clear-cache crate docs [3], GCC/libgcc signature correction for __clear_cache [4].

🏁 Script executed:

# Check harm-runtime's Cargo.toml for dependencies cat -n harm-runtime/Cargo.toml

Repository: monoid/harm

Length of output: 738

🏁 Script executed:

# Search for any memory-related or cache-related code patterns rg "mmap|memory|cache" harm-runtime/src/memory/memmap2.rs -i --context=3

Repository: monoid/harm

Length of output: 4605

Add I-cache coherency before executing JIT code on AArch64.

AArch64 does not guarantee I-cache coherency with data writes. The test will fail or execute stale code without flushing the cache. However, core::arch::aarch64::__clear_cache does not exist in Rust's standard library—you must either:

Add the clear-cache crate (recommended):

Add clear-cache = "0.1" to Cargo.toml

Call clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len()))

Or manually FFI-bind __clear_cache:

extern "C" { fn __clear_cache(begin: *mut u8, end: *mut u8); }

Then call it before the transmute. Note: availability is toolchain-dependent.

🤖 Prompt for AI Agents

In `@harm-runtime/src/memory/memmap2.rs` around lines 175 - 190, The AArch64 test_mmap writes JIT code into memory (via Mmap2Buffer::allocate -> into_fixed_memory -> into_executable_memory) but doesn't flush the I-cache, so exec.as_ptr() may point to stale instructions; add an I-cache flush before transmuting exec.as_ptr() to a function pointer by adding the clear-cache crate to Cargo.toml and calling clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len())) (or, if you prefer FFI, declare and call libc __clear_cache with the same begin/end) immediately before the unsafe transmute/func() call in test_mmap.

coderabbitai · 2026-02-02T12:07:39Z

harm-runtime/src/runtime.rs

+    pub fn build(self) {
+        todo!()
+    }


⚠️ Potential issue | 🟠 Major

Public build() should not panic.

todo!() will crash downstream callers. Consider returning a structured error (or gating the API) until build is implemented.

Suggested fix (return explicit error)

- pub fn build(self) { - todo!() - } + pub fn build(self) -> Result<(), BuildError> { + Err(BuildError::Unimplemented) + }

#[derive(Debug)] pub enum BuildError { Unimplemented, }

🤖 Prompt for AI Agents

In `@harm-runtime/src/runtime.rs` around lines 26 - 28, The public build() currently panics via todo!(); change its API to avoid panics by returning a Result and introduce a BuildError enum (e.g., BuildError::Unimplemented) so callers receive a structured error until implementation is added; update the function signature pub fn build(self) -> Result<(), BuildError> (or gate behind a feature flag) and replace todo!() with an Err(BuildError::Unimplemented) so no downstream caller is crashed by a panic.

runtime: should be renamed to translator or dynasm

8d22663

monoid self-assigned this Feb 2, 2026

monoid added the enhancement New feature or request label Feb 2, 2026

monoid added the runtime The `harm`'s runtime: memory mappings, applying relocations, etc. label Feb 2, 2026

runtime: memory abstraction

c0daa2c

monoid force-pushed the harm/relocations-runtime branch from ee98019 to c0daa2c Compare February 2, 2026 11:51

coderabbitai bot reviewed Feb 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: harm runtime#50

WIP: harm runtime#50
monoid wants to merge 2 commits intomasterfrom
harm/relocations-runtime

monoid commented Feb 2, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 2, 2026 •

edited

Loading

Review skipped

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 2, 2026

Uh oh!

coderabbitai bot Feb 2, 2026

Uh oh!

coderabbitai bot Feb 2, 2026

Uh oh!

coderabbitai bot Feb 2, 2026

Uh oh!

coderabbitai bot Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

monoid commented Feb 2, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 2, 2026

Choose a reason for hiding this comment

Required maintenance sequence (per Arm ARM)

Multicore note

When you can omit parts (implementation-dependent)

Practical guidance (Linux / toolchains)

Recommended in Rust (portable, no reliance on __clear_cache being available)

Directly calling __clear_cache (FFI; not always link-safe)

Uh oh!

coderabbitai bot Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

monoid commented Feb 2, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 2, 2026 •

edited

Loading

Recommended in Rust (portable, no reliance on `__clear_cache` being available)

Directly calling `__clear_cache` (FFI; not always link-safe)