Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
ee98019 to
c0daa2c
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@harm-runtime/src/labels.rs`:
- Around line 49-83: The current label APIs (define_label, define_named_label,
name_label) panic or silently overwrite; change them to return Result<LabelId,
LabelRegistryError> (or Result<(), ...> for define_label/name_label) and
introduce a LabelRegistryError enum with variants Unregistered(LabelId),
AlreadyDefined(LabelId), and NameConflict(String). Implement define_label to
check self.labels: return Err(Unregistered) if the id isn't registered,
Err(AlreadyDefined) if it's already an Offset, otherwise set LabelInfo::Offset
and return Ok. Change define_named_label to refuse to overwrite an existing
Offset or rebind an existing name: if name exists map to id and that id is
AlreadyDefined return Err(AlreadyDefined); if name absent but id exists check
and insert only if not already defined, else return Err(Unregistered) or
NameConflict accordingly; return Ok(id) on success. Change name_label to return
Result and only insert name if id is registered and the name isn't already bound
(return NameConflict if it is), otherwise return Unregistered; update all
Assembler call sites to propagate or handle these Results.
In `@harm-runtime/src/memory.rs`:
- Around line 30-37: The align method can panic when alignment == 0 due to pos %
alignment; modify align (the fn align(&mut self, alignment: usize) -> Result<(),
Self::ExtendError>) to explicitly guard against alignment == 0 (and optionally
treat alignment <= 1 as a no-op) before computing pos % alignment; if alignment
<= 1 simply return Ok(()) (or return an appropriate error) and otherwise proceed
to compute remn from self.pos() and call self.try_extend(... ) as before to
perform padding.
In `@harm-runtime/src/memory/memmap2.rs`:
- Around line 132-166: The into_fixed_memory implementations for Vec<u8> and for
&mut Vec<u8> must explicitly reject zero-length vectors to avoid
platform-specific failures from memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
- Around line 175-190: The AArch64 test_mmap writes JIT code into memory (via
Mmap2Buffer::allocate -> into_fixed_memory -> into_executable_memory) but
doesn't flush the I-cache, so exec.as_ptr() may point to stale instructions; add
an I-cache flush before transmuting exec.as_ptr() to a function pointer by
adding the clear-cache crate to Cargo.toml and calling
clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len())) (or, if
you prefer FFI, declare and call libc __clear_cache with the same begin/end)
immediately before the unsafe transmute/func() call in test_mmap.
In `@harm-runtime/src/runtime.rs`:
- Around line 26-28: The public build() currently panics via todo!(); change its
API to avoid panics by returning a Result and introduce a BuildError enum (e.g.,
BuildError::Unimplemented) so callers receive a structured error until
implementation is added; update the function signature pub fn build(self) ->
Result<(), BuildError> (or gate behind a feature flag) and replace todo!() with
an Err(BuildError::Unimplemented) so no downstream caller is crashed by a panic.
| pub fn define_label(&mut self, label_id: LabelId, offset: Offset) { | ||
| if let Some(info) = self.labels.get_mut(&label_id) { | ||
| match info { | ||
| LabelInfo::Forward => { | ||
| *info = LabelInfo::Offset(offset); | ||
| } | ||
| LabelInfo::Offset(_) => { | ||
| todo!("Label {label_id:?} is already defined"); | ||
| } | ||
| } | ||
| } else { | ||
| panic!("Label {label_id:?} is not registered"); | ||
| } | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId { | ||
| if let Some(id) = self.named_labels.get(name).copied() { | ||
| self.labels.insert(id, LabelInfo::Offset(offset)); | ||
| id | ||
| } else { | ||
| let id = self.next_label(); | ||
| self.named_labels.insert(name.to_string(), id); | ||
| self.labels.insert(id, LabelInfo::Offset(offset)); | ||
| id | ||
| } | ||
| } | ||
|
|
||
| pub fn name_label(&mut self, id: LabelId, name: &str) { | ||
| if self.labels.contains_key(&id) { | ||
| self.named_labels.insert(name.to_string(), id); | ||
| } else { | ||
| panic!("Label {id:?} is not registered"); | ||
| } | ||
| } |
There was a problem hiding this comment.
Avoid panics and silent redefinitions in label definition APIs.
define_label uses todo!/panic!, define_named_label overwrites existing definitions, and name_label can rebind names. For a public API this should return Result with explicit errors to prevent accidental label corruption and runtime panics. This will also require updating Assembler call sites.
Suggested fix (return Result and guard redefinitions)
- pub fn define_label(&mut self, label_id: LabelId, offset: Offset) {
- if let Some(info) = self.labels.get_mut(&label_id) {
- match info {
- LabelInfo::Forward => {
- *info = LabelInfo::Offset(offset);
- }
- LabelInfo::Offset(_) => {
- todo!("Label {label_id:?} is already defined");
- }
- }
- } else {
- panic!("Label {label_id:?} is not registered");
- }
- }
+ pub fn define_label(
+ &mut self,
+ label_id: LabelId,
+ offset: Offset,
+ ) -> Result<(), LabelRegistryError> {
+ if let Some(info) = self.labels.get_mut(&label_id) {
+ match info {
+ LabelInfo::Forward => {
+ *info = LabelInfo::Offset(offset);
+ Ok(())
+ }
+ LabelInfo::Offset(_) => Err(LabelRegistryError::AlreadyDefined(label_id)),
+ }
+ } else {
+ Err(LabelRegistryError::Unregistered(label_id))
+ }
+ }
- pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId {
+ pub fn define_named_label(
+ &mut self,
+ name: &str,
+ offset: Offset,
+ ) -> Result<LabelId, LabelRegistryError> {
if let Some(id) = self.named_labels.get(name).copied() {
- self.labels.insert(id, LabelInfo::Offset(offset));
- id
+ if matches!(self.labels.get(&id), Some(LabelInfo::Offset(_))) {
+ return Err(LabelRegistryError::AlreadyDefined(id));
+ }
+ self.labels.insert(id, LabelInfo::Offset(offset));
+ Ok(id)
} else {
let id = self.next_label();
self.named_labels.insert(name.to_string(), id);
self.labels.insert(id, LabelInfo::Offset(offset));
- id
+ Ok(id)
}
}
- pub fn name_label(&mut self, id: LabelId, name: &str) {
- if self.labels.contains_key(&id) {
- self.named_labels.insert(name.to_string(), id);
- } else {
- panic!("Label {id:?} is not registered");
- }
- }
+ pub fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> {
+ if !self.labels.contains_key(&id) {
+ return Err(LabelRegistryError::Unregistered(id));
+ }
+ if let Some(existing) = self.named_labels.get(name) {
+ if *existing != id {
+ return Err(LabelRegistryError::NameConflict(name.to_string()));
+ }
+ }
+ self.named_labels.insert(name.to_string(), id);
+ Ok(())
+ }#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LabelRegistryError {
Unregistered(LabelId),
AlreadyDefined(LabelId),
NameConflict(String),
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub fn define_label(&mut self, label_id: LabelId, offset: Offset) { | |
| if let Some(info) = self.labels.get_mut(&label_id) { | |
| match info { | |
| LabelInfo::Forward => { | |
| *info = LabelInfo::Offset(offset); | |
| } | |
| LabelInfo::Offset(_) => { | |
| todo!("Label {label_id:?} is already defined"); | |
| } | |
| } | |
| } else { | |
| panic!("Label {label_id:?} is not registered"); | |
| } | |
| } | |
| #[inline] | |
| pub fn define_named_label(&mut self, name: &str, offset: Offset) -> LabelId { | |
| if let Some(id) = self.named_labels.get(name).copied() { | |
| self.labels.insert(id, LabelInfo::Offset(offset)); | |
| id | |
| } else { | |
| let id = self.next_label(); | |
| self.named_labels.insert(name.to_string(), id); | |
| self.labels.insert(id, LabelInfo::Offset(offset)); | |
| id | |
| } | |
| } | |
| pub fn name_label(&mut self, id: LabelId, name: &str) { | |
| if self.labels.contains_key(&id) { | |
| self.named_labels.insert(name.to_string(), id); | |
| } else { | |
| panic!("Label {id:?} is not registered"); | |
| } | |
| } | |
| pub fn define_label( | |
| &mut self, | |
| label_id: LabelId, | |
| offset: Offset, | |
| ) -> Result<(), LabelRegistryError> { | |
| if let Some(info) = self.labels.get_mut(&label_id) { | |
| match info { | |
| LabelInfo::Forward => { | |
| *info = LabelInfo::Offset(offset); | |
| Ok(()) | |
| } | |
| LabelInfo::Offset(_) => Err(LabelRegistryError::AlreadyDefined(label_id)), | |
| } | |
| } else { | |
| Err(LabelRegistryError::Unregistered(label_id)) | |
| } | |
| } | |
| #[inline] | |
| pub fn define_named_label( | |
| &mut self, | |
| name: &str, | |
| offset: Offset, | |
| ) -> Result<LabelId, LabelRegistryError> { | |
| if let Some(id) = self.named_labels.get(name).copied() { | |
| if matches!(self.labels.get(&id), Some(LabelInfo::Offset(_))) { | |
| return Err(LabelRegistryError::AlreadyDefined(id)); | |
| } | |
| self.labels.insert(id, LabelInfo::Offset(offset)); | |
| Ok(id) | |
| } else { | |
| let id = self.next_label(); | |
| self.named_labels.insert(name.to_string(), id); | |
| self.labels.insert(id, LabelInfo::Offset(offset)); | |
| Ok(id) | |
| } | |
| } | |
| pub fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> { | |
| if !self.labels.contains_key(&id) { | |
| return Err(LabelRegistryError::Unregistered(id)); | |
| } | |
| if let Some(existing) = self.named_labels.get(name) { | |
| if *existing != id { | |
| return Err(LabelRegistryError::NameConflict(name.to_string())); | |
| } | |
| } | |
| self.named_labels.insert(name.to_string(), id); | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
In `@harm-runtime/src/labels.rs` around lines 49 - 83, The current label APIs
(define_label, define_named_label, name_label) panic or silently overwrite;
change them to return Result<LabelId, LabelRegistryError> (or Result<(), ...>
for define_label/name_label) and introduce a LabelRegistryError enum with
variants Unregistered(LabelId), AlreadyDefined(LabelId), and
NameConflict(String). Implement define_label to check self.labels: return
Err(Unregistered) if the id isn't registered, Err(AlreadyDefined) if it's
already an Offset, otherwise set LabelInfo::Offset and return Ok. Change
define_named_label to refuse to overwrite an existing Offset or rebind an
existing name: if name exists map to id and that id is AlreadyDefined return
Err(AlreadyDefined); if name absent but id exists check and insert only if not
already defined, else return Err(Unregistered) or NameConflict accordingly;
return Ok(id) on success. Change name_label to return Result and only insert
name if id is registered and the name isn't already bound (return NameConflict
if it is), otherwise return Unregistered; update all Assembler call sites to
propagate or handle these Results.
| /// Align position. | ||
| fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> { | ||
| let pos = self.pos(); | ||
| let remn = pos % alignment; | ||
| if remn != 0 { | ||
| self.try_extend(core::iter::repeat(0).take(alignment - remn))?; | ||
| } | ||
| Ok(()) |
There was a problem hiding this comment.
Guard against alignment == 0 to avoid panic.
pos % alignment will panic when alignment is 0. Consider treating alignment ≤ 1 as a no-op (or returning an error) to keep the API safe.
Suggested fix
fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> {
- let pos = self.pos();
- let remn = pos % alignment;
+ if alignment <= 1 {
+ return Ok(());
+ }
+ let pos = self.pos();
+ let remn = pos % alignment;
if remn != 0 {
self.try_extend(core::iter::repeat(0).take(alignment - remn))?;
}
Ok(())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Align position. | |
| fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> { | |
| let pos = self.pos(); | |
| let remn = pos % alignment; | |
| if remn != 0 { | |
| self.try_extend(core::iter::repeat(0).take(alignment - remn))?; | |
| } | |
| Ok(()) | |
| /// Align position. | |
| fn align(&mut self, alignment: usize) -> Result<(), Self::ExtendError> { | |
| if alignment <= 1 { | |
| return Ok(()); | |
| } | |
| let pos = self.pos(); | |
| let remn = pos % alignment; | |
| if remn != 0 { | |
| self.try_extend(core::iter::repeat(0).take(alignment - remn))?; | |
| } | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
In `@harm-runtime/src/memory.rs` around lines 30 - 37, The align method can panic
when alignment == 0 due to pos % alignment; modify align (the fn align(&mut
self, alignment: usize) -> Result<(), Self::ExtendError>) to explicitly guard
against alignment == 0 (and optionally treat alignment <= 1 as a no-op) before
computing pos % alignment; if alignment <= 1 simply return Ok(()) (or return an
appropriate error) and otherwise proceed to compute remn from self.pos() and
call self.try_extend(... ) as before to perform padding.
| #[inline] | ||
| fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> { | ||
| let mut mem = Mmap2FixedMemory::allocate(self.len())?; | ||
| mem.as_mut().copy_from_slice(&self); | ||
| Ok(mem) | ||
| } | ||
| } | ||
|
|
||
| impl Memory<Mmap2FixedMemory> for &mut Vec<u8> { | ||
| type ExtendError = Infallible; | ||
| type FixedMemoryError = std::io::Error; | ||
|
|
||
| #[inline] | ||
| fn pos(&self) -> usize { | ||
| self.len() | ||
| } | ||
|
|
||
| #[inline] | ||
| fn capacity(&self) -> Option<usize> { | ||
| None | ||
| } | ||
|
|
||
| #[inline] | ||
| fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError> { | ||
| self.extend(bytes); | ||
| Ok(()) | ||
| } | ||
|
|
||
| #[inline] | ||
| fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> { | ||
| let mut mem = Mmap2FixedMemory::allocate(self.len())?; | ||
| // The memmap2 spec doesn't say that the length can be different... | ||
| mem.as_mut().copy_from_slice(self); | ||
| Ok(mem) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "*.rs" | head -20Repository: monoid/harm
Length of output: 1026
🏁 Script executed:
rg "Mmap2FixedMemory" -t rust -A 5Repository: monoid/harm
Length of output: 4285
🌐 Web query:
memmap2 map_anon(0) behavior zero-length platform-dependent
💡 Result:
memmap2::MmapMut::map_anon(0) (or MmapOptions::new().len(0).map_anon()) has no stable, cross-platform meaning: memmap2 does not guarantee that a zero-length anonymous mapping is supported; it just forwards the request to the OS, so the result is platform/OS-kernel dependent.
memmap2explicitly notes that if you don’t set a length, you’ll create a zero-length anonymous mapping attempt. Whether that succeeds is then up to the underlying system call. [1]- On Linux/POSIX-style
mmap, the length must be greater than 0; Linux documents this requirement, andmmap(…, length=0, …)fails (EINVAL on modern Linux). [2]
Practical takeaway: if you need “empty buffer” semantics, special-case len == 0 and use an empty Vec<u8> / slice instead of trying to create a 0-byte memmap.
Sources: [1] [2]
🏁 Script executed:
rg "fn allocate" harm-runtime/src/memory/memmap2.rs -A 10Repository: monoid/harm
Length of output: 616
🏁 Script executed:
cat -n harm-runtime/src/memory/memmap2.rs | head -100Repository: monoid/harm
Length of output: 2897
Reject zero-length vector conversions explicitly.
memmap2::MmapMut::map_anon(0) behavior is platform-dependent and fails on Linux/POSIX with EINVAL. Explicitly checking for zero-length in both Vec<u8> and &mut Vec<u8> conversions avoids silent platform-specific failures.
Suggested fix
fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
+ if self.is_empty() {
+ return Err(std::io::Error::new(
+ std::io::ErrorKind::InvalidInput,
+ "cannot convert empty vector to fixed memory",
+ ));
+ }
let mut mem = Mmap2FixedMemory::allocate(self.len())?;
mem.as_mut().copy_from_slice(&self);
Ok(mem)
}Apply the same check to the &mut Vec<u8> implementation.
🤖 Prompt for AI Agents
In `@harm-runtime/src/memory/memmap2.rs` around lines 132 - 166, The
into_fixed_memory implementations for Vec<u8> and for &mut Vec<u8> must
explicitly reject zero-length vectors to avoid platform-specific failures from
memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
| #[test] | ||
| #[cfg(target_arch = "aarch64")] | ||
| fn test_mmap() { | ||
| let mut buf = Mmap2Buffer::allocate(1024).expect("mmap failed, system problem"); | ||
| buf.try_extend(harm::instructions::control::ret().bytes()) | ||
| .unwrap(); | ||
|
|
||
| let mem = buf.into_fixed_memory().unwrap(); | ||
| // Doing relocations... | ||
|
|
||
| let exec = mem.into_executable_memory().unwrap(); | ||
|
|
||
| unsafe { | ||
| let func: unsafe extern "C" fn() = std::mem::transmute(exec.as_ptr()); | ||
| func(); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file structure and find the specific file
find . -name "memmap2.rs" -type fRepository: monoid/harm
Length of output: 91
🏁 Script executed:
# Check the actual code at the specified lines
cat -n harm-runtime/src/memory/memmap2.rs | sed -n '160,200p'Repository: monoid/harm
Length of output: 1511
🏁 Script executed:
# Search for any existing __clear_cache or cache invalidation calls
rg "__clear_cache|clear_cache|cache.*inval" --type rustRepository: monoid/harm
Length of output: 37
🌐 Web query:
AArch64 I-cache invalidation requirement for JIT code execution
💡 Result:
On AArch64, you must explicitly make your newly generated (JIT) instructions coherent with instruction fetch, because the architecture does not guarantee I-cache coherency with data writes. (manuals.plus)
Required maintenance sequence (per Arm ARM)
After writing the code bytes:
- Clean the data cache to the Point of Unification (PoU) for the written addresses:
dc cvau, <va>(typically per cache line over the region) - Barrier to ensure the clean is visible:
dsb ish - Invalidate the instruction cache to PoU for the same addresses:
ic ivau, <va>(per cache line over the region) - Barrier to ensure the invalidation completes:
dsb ish - Context sync before executing:
isb(manuals.plus)
This is the canonical “self-modifying/JIT code” sequence for a PE to safely execute code it has written. (manuals.plus)
Multicore note
IC IVAU is broadcast within the Inner Shareable domain, but each core that will execute the new code must execute an ISB to ensure its fetched instruction stream is synchronized. (manuals.plus)
When you can omit parts (implementation-dependent)
Some CPUs report that parts of this are unnecessary via CTR_EL0:
- If
CTR_EL0.IDCis set, an explicitdc cvauis not required (but you still need the barrier to order stores). - If
CTR_EL0.DICis set, an explicitic ivauis not required. (developer.arm.com)
Practical guidance (Linux / toolchains)
Use the platform-provided helper (e.g., GCC/Clang’s __builtin___clear_cache / libc __clear_cache) which implements the right loop-over-lines plus barriers for AArch64 user space where permitted. (developer.arm.com)
Citations:
- 1: https://manuals.plus/m/c112d67a71a1b94ad89d51a4d7121422015c0103a0da13807e5272030d0b382f?utm_source=openai
- 2: https://manuals.plus/m/69d2d35bfa2a6d5333a412ab5ea7313bf07aa6a6a67bef56ed5e15c455599ba4?utm_source=openai
- 3: https://manuals.plus/m/69d2d35bfa2a6d5333a412ab5ea7313bf07aa6a6a67bef56ed5e15c455599ba4?utm_source=openai
- 4: https://manuals.plus/m/69d2d35bfa2a6d5333a412ab5ea7313bf07aa6a6a67bef56ed5e15c455599ba4?utm_source=openai
- 5: https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-implementing-clear-cache
- 6: https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-self-modifying-code-implementing-clear-cache
🌐 Web query:
Rust core::arch::aarch64::__clear_cache availability and signature
💡 Result:
core::arch::aarch64::__clear_cache is not a provided Rust API (neither stable nor nightly) in core::arch::aarch64 as of current Rust docs (e.g. Rust 1.93.0 docs). Rust’s standard libraries intentionally do not expose LLVM intrinsics like llvm.clear_cache directly, and there is no corresponding core::intrinsics::clear_cache either. [1]
If you’re looking for “clear instruction cache for a range”:
- LLVM IR intrinsic (what Clang/GCC builtins often lower to):
declare void@llvm.clear_cache(ptr, ptr)(historicallyi8*, i8*). [2] - C runtime symbol commonly used by LLVM as a fallback:
void __clear_cache(void *begin, void *end);[2]
In Rust, if your target’s runtime actually provides __clear_cache (e.g., via libgcc/compiler-rt), you’d typically bind it yourself, e.g.:
extern "C" {
fn __clear_cache(begin: *mut u8, end: *mut u8);
}…but its presence is platform/toolchain-dependent, not guaranteed by Rust.
For a portable Rust solution, use a dedicated crate that implements the behavior per-platform (e.g. clear-cache). [3]
Sources:
[1] Rust Internals discussion on not exposing LLVM intrinsics directly. (internals.rust-lang.org)
[2] LLVM LangRef llvm.clear_cache (and its lowering to __clear_cache). (releases.llvm.org)
[3] clear-cache crate description and purpose. (docs.rs)
Citations:
- 1: https://internals.rust-lang.org/t/whats-rust-policy-towards-llvm-intrinsics/21328?utm_source=openai
- 2: https://releases.llvm.org/20.1.0/docs/LangRef.html?utm_source=openai
- 3: https://docs.rs/clear-cache?utm_source=openai
🏁 Script executed:
# Check if __clear_cache is used or mentioned anywhere in the project
rg "clear_cache|core::arch" --type rustRepository: monoid/harm
Length of output: 37
🏁 Script executed:
# Check the Rust version and dependencies to understand what APIs are available
cat -n Cargo.toml | head -30Repository: monoid/harm
Length of output: 1041
🌐 Web query:
How to call __clear_cache in Rust for AArch64 JIT code
💡 Result:
On AArch64 you must invalidate the I-cache for the modified code range before executing it (AArch64 I/D caches aren’t guaranteed coherent for JIT/self-modifying code) [2]. The “C way” is GCC/Clang’s __builtin___clear_cache(begin, end), which typically lowers to a runtime call named __clear_cache(begin, end) [1].
Recommended in Rust (portable, no reliance on __clear_cache being available)
Use a Rust implementation equivalent to __builtin___clear_cache:
# Cargo.toml
[dependencies]
clear-cache = "0.1"unsafe fn jit_publish(code_ptr: *mut u8, code_len: usize) {
// after writing code into [code_ptr, code_ptr+code_len)
clear_cache::clear_cache(code_ptr, code_ptr.add(code_len));
}This crate is explicitly a native implementation of __builtin___clear_cache (based on LLVM’s implementation) [3].
If you’re in a multi-threaded JIT, also perform the additional pipeline synchronization step used by JITs (Wasmtime documents calling clear_cache and then a pipeline flush for MT) [2]. (You can copy that pattern or use their crate, but it’s marked internal-only.)
Directly calling __clear_cache (FFI; not always link-safe)
If you specifically want __clear_cache, you can declare it like:
use core::ffi::c_void;
extern "C" {
fn __clear_cache(begin: *mut c_void, end: *mut c_void);
}
unsafe fn jit_publish(code_ptr: *mut u8, code_len: usize) {
__clear_cache(code_ptr as *mut c_void, code_ptr.add(code_len) as *mut c_void);
}GCC documents __builtin___clear_cache(void *begin, void *end) for this purpose [1], and libgcc’s __clear_cache is a void*(begin,end)-style API (historically documented as char*, but corrected to void*) [4]. The catch is that, depending on target/toolchain/linking, the __clear_cache symbol may not be provided automatically—hence the “use a Rust crate” recommendation above.
Sources: GCC __builtin___clear_cache docs [1], Wasmtime JIT icache coherence notes (AArch64 non-coherent + MT pipeline flush) [2], clear-cache crate docs [3], GCC/libgcc signature correction for __clear_cache [4].
🏁 Script executed:
# Check harm-runtime's Cargo.toml for dependencies
cat -n harm-runtime/Cargo.tomlRepository: monoid/harm
Length of output: 738
🏁 Script executed:
# Search for any memory-related or cache-related code patterns
rg "mmap|memory|cache" harm-runtime/src/memory/memmap2.rs -i --context=3Repository: monoid/harm
Length of output: 4605
Add I-cache coherency before executing JIT code on AArch64.
AArch64 does not guarantee I-cache coherency with data writes. The test will fail or execute stale code without flushing the cache. However, core::arch::aarch64::__clear_cache does not exist in Rust's standard library—you must either:
-
Add the
clear-cachecrate (recommended):- Add
clear-cache = "0.1"toCargo.toml - Call
clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len()))
- Add
-
Or manually FFI-bind
__clear_cache:extern "C" { fn __clear_cache(begin: *mut u8, end: *mut u8); }
Then call it before the transmute. Note: availability is toolchain-dependent.
🤖 Prompt for AI Agents
In `@harm-runtime/src/memory/memmap2.rs` around lines 175 - 190, The AArch64
test_mmap writes JIT code into memory (via Mmap2Buffer::allocate ->
into_fixed_memory -> into_executable_memory) but doesn't flush the I-cache, so
exec.as_ptr() may point to stale instructions; add an I-cache flush before
transmuting exec.as_ptr() to a function pointer by adding the clear-cache crate
to Cargo.toml and calling clear_cache::clear_cache(exec.as_ptr(),
exec.as_ptr().add(exec.len())) (or, if you prefer FFI, declare and call libc
__clear_cache with the same begin/end) immediately before the unsafe
transmute/func() call in test_mmap.
| pub fn build(self) { | ||
| todo!() | ||
| } |
There was a problem hiding this comment.
Public build() should not panic.
todo!() will crash downstream callers. Consider returning a structured error (or gating the API) until build is implemented.
Suggested fix (return explicit error)
- pub fn build(self) {
- todo!()
- }
+ pub fn build(self) -> Result<(), BuildError> {
+ Err(BuildError::Unimplemented)
+ }#[derive(Debug)]
pub enum BuildError {
Unimplemented,
}🤖 Prompt for AI Agents
In `@harm-runtime/src/runtime.rs` around lines 26 - 28, The public build()
currently panics via todo!(); change its API to avoid panics by returning a
Result and introduce a BuildError enum (e.g., BuildError::Unimplemented) so
callers receive a structured error until implementation is added; update the
function signature pub fn build(self) -> Result<(), BuildError> (or gate behind
a feature flag) and replace todo!() with an Err(BuildError::Unimplemented) so no
downstream caller is crashed by a panic.
Memory buffers for collecting instructions, applying relocation and execution.
Ref #31.
Summary by CodeRabbit
New Features
Dependencies