Skip to content

fix: literal arrays are memcpy-ed instead of stored, non constant arrays are unrolled#1633

Merged
ghaith merged 12 commits intomasterfrom
endless_loop
Mar 23, 2026
Merged

fix: literal arrays are memcpy-ed instead of stored, non constant arrays are unrolled#1633
ghaith merged 12 commits intomasterfrom
endless_loop

Conversation

@ghaith
Copy link
Copy Markdown
Collaborator

@ghaith ghaith commented Mar 13, 2026

Literal array values are now memcpy-ed
Arrays with non-constant values are converted into single assignments (or for loop assignments for multipication assignments)

For Reviewers: I mostly vibed this (with some steering/guidance), pay extra attention to the review

@ghaith ghaith changed the title fix: optimize array initialization and memcpy handling fix: literal arrays are memcpy-ed instead of stored, non constant arrays are unrolled Mar 13, 2026
@ghaith ghaith requested review from Copilot and mhasel March 16, 2026 08:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the compiler’s handling of array initializers by emitting memcpy from a materialized global for constant aggregate literals, and introducing a lowering pass that rewrites non-constant array literals into indexed assignments / FOR loops (including external POU constructor wiring behavior).

Changes:

  • Materialize constant array/struct literals as anonymous .const_init globals and initialize via llvm.memcpy to avoid large inline aggregate stores.
  • Add array_lowering pass to lower non-constant array literal initializers into element assignments / loops, integrated into the driver pipeline.
  • Add extensive lit + IR + Rust snapshot tests covering constant vs non-constant arrays, multiplied segments, REF/ADR cases, and external ctor scenarios.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/lit/single/init/array_variable_elements.st New lit test for non-constant variable element array lowering behavior.
tests/lit/single/init/array_ref_elements.st New lit test for arrays initialized with REF() calls.
tests/lit/single/init/array_multiplied_variable_override.st New lit test for overriding type-initialized arrays with multiplied literals.
tests/lit/single/init/array_multiplied_type_init.st New lit test for type-level multiplied array initializer.
tests/lit/single/init/array_multiplied_struct_elements.st New lit test for multiplied array init with struct literals.
tests/lit/single/init/array_multiplied_partial_init.st New lit test for partial multiplied init and defaulting remainder.
tests/lit/single/init/array_multiplied_nonzero_offset.st New lit test for non-zero-based array bounds with multiplied init.
tests/lit/single/init/array_multiplied_mixed_init.st New lit test for mixed multiplied segments + explicit tail values.
tests/lit/single/init/array_multiplied_large.st New lit test stressing large constant array init.
tests/lit/single/init/array_mixed_ref_segments.st New lit test for mixed multiplied segments with REF() calls.
tests/lit/single/init/array_external_type_init.st New lit test for type-level init correctness in external/global blocks.
tests/lit/multi/extern_fb_array_ctor_wiring/run.test New split-compilation lit harness for external FB ctor + const array init.
tests/lit/multi/extern_fb_array_ctor_wiring/main.st Consumer side of split-compilation test verifying ctor wiring.
tests/lit/multi/extern_fb_array_ctor_wiring/lit.local.cfg Custom lit config to compile/link C FB body + ctor-only object.
tests/lit/multi/extern_fb_array_ctor_wiring/header.pli External FB declaration with constant array initializer.
tests/lit/multi/extern_fb_array_ctor_wiring/fb_body.c C implementation of external FB body for split-compilation scenario.
tests/lit/ir_tests/external_fb_array_no_init_globals.st IR test: external FB without external-ctor generation should not emit init globals.
tests/lit/ir_tests/external_fb_array_const_init.st IR test: with external ctor generation, should emit .const_init + memcpy in ctor.
src/tests/adr/arrays_adr.rs Update expected IR to use .const_init + llvm.memcpy instead of inline aggregate stores.
src/codegen/tests/snapshots/rusty__codegen__tests__function_tests__function_call_with_array_access.snap Snapshot update for memcpy-based constant aggregate initialization.
src/codegen/tests/snapshots/rusty__codegen__tests__code_gen_tests__array_of_struct_initialization_in_body.snap Snapshot update for struct array init via .const_init + memcpy.
src/codegen/tests/oop_tests/super_tests.rs Snapshot/expectation updates due to memcpy-based init for aggregates.
src/codegen/tests/initialization_test/complex_initializers.rs Snapshot update for aggregate string-array init using .const_init + memcpy.
src/codegen/tests/fnptr.rs Snapshot update for aggregate return/init path using .const_init + memcpy.
src/codegen/tests/debug_tests.rs Debug IR snapshot updates for aggregate init now using memcpy.
src/codegen/generators/pou_generator.rs Skip generating __init globals for external/included POUs.
src/codegen/generators/llvm.rs Add helper to materialize aggregate constants as private unnamed-addr globals.
src/codegen/generators/expression_generator.rs Use memcpy from materialized globals for aggregate literals to reduce IR size.
compiler/plc_lowering/src/tests/array_lowering_tests.rs New unit tests validating array lowering decisions (unroll vs loop, stripping init).
compiler/plc_lowering/src/tests.rs Register new lowering test module.
compiler/plc_lowering/src/lib.rs Export new array_lowering module.
compiler/plc_lowering/src/array_lowering.rs New lowering pass implementation for non-constant array literal initializers.
compiler/plc_driver/src/pipelines/participant.rs Add pipeline participant integrating array lowering before annotation.
compiler/plc_driver/src/pipelines.rs Wire the new ArrayLowerer into the default mutable pipeline participants.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread compiler/plc_lowering/src/tests/array_lowering_tests.rs Outdated
Comment on lines +241 to +245
for block in &mut pou.variable_blocks {
for var in &mut block.variables {
if has_array_literal_initializer(var) && is_lowered_type(var, lowered_type_names) {
var.initializer = None;
}
Comment thread src/codegen/generators/llvm.rs Outdated
// For aggregate constant values (array/struct literals), avoid emitting a giant
// inline `store` instruction. Instead, materialize the constant as an anonymous
// global and memcpy from it — this is O(1) in IR size regardless of array length.
if (expression.is_array_value() || expression.is_struct_value()) && left_type.is_aggregate() {
Comment thread tests/lit/multi/extern_fb_array_ctor_wiring/lit.local.cfg Outdated
@ghaith ghaith added the 0.3 To be merged before we tag 0.3 label Mar 19, 2026
Copy link
Copy Markdown
Member

@mhasel mhasel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! One bug, the rest of the comments are minor.

Comment thread src/codegen/generators/expression_generator.rs Outdated
Comment thread src/codegen/generators/expression_generator.rs Outdated
Comment thread compiler/plc_lowering/src/array_lowering.rs Outdated
global.set_unnamed_addr(true);
global.set_linkage(Linkage::Private);
// Prevent Module::drop from disposing the module we don't own.
std::mem::forget(module);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While creating a new module that thinks it owns the pointer and forgetting it later looks to be fine now, there is a risk if inkwell ever adds panic!s or Results to the called functions.

A panic would result in the rust unwinder dropping both the original module and the copy here, resulting in a double free.

A result with an early return would call drop on the local value, invalidating the original module's llvm pointer in the process. I think it is worth to add this to the SAFETY comment above

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is the one thing i don't like about this change, i'm thinking of moving this into the llvm wrapper and implementing it through C api calls there and getting a reference to the module. Will look into it

array_info: &ArrayInfo,
id_provider: &mut IdProvider,
) -> LoweredResult {
let use_for_loop = count >= FOR_LOOP_THRESHOLD;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The threshold is checked per MultipliedStatement segment, not against the total element count. An initializer split into multiple sub-threshold segments - e.g. [10(v), 10(v), 10(v), 10(v)] totalling 40 elements - is fully unrolled into 40 individual assignments instead of emitting a FOR loop.

This is not blocking for me, as this is an edge case that is unlikely to happen in real code. I've added a test (multi_segment_above_threshold_is_unrolled_per_segment) to document this behaviour anyway.

@ghaith ghaith requested a review from mhasel March 23, 2026 08:59
@ghaith ghaith merged commit 4d493ee into master Mar 23, 2026
33 checks passed
@ghaith ghaith deleted the endless_loop branch March 23, 2026 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

0.3 To be merged before we tag 0.3

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants