
Fix mv read loop hardcoding 4 instead of nnode #1190

Draft
sbryngelson wants to merge 9 commits into MFlowCode:master from sbryngelson:fix/mv-loop-nnode

Conversation

@sbryngelson
Member

@sbryngelson sbryngelson commented Feb 21, 2026

User description

Summary

Severity: HIGH — mv data file loop uses wrong bounds and offsets when nnode != 4.

File: src/pre_process/m_start_up.fpp, lines 479-482

The pb (pressure) read loop just above correctly uses nnode for both the loop bound and file number offset calculation. The mv (mass/velocity) read loop immediately below hardcodes 4 instead.

Before

! pb loop (correct)
do r = 1, nnode
    write (file_num, '(I0)') sys_size + r + (i - 1)*nnode
    file_loc = ... '/pb' ...

! mv loop (wrong)
do r = 1, 4                                              ! should be nnode
    write (file_num, '(I0)') sys_size + r + (i - 1)*4    ! should be nnode
    file_loc = ... '/mv' ...

After

do r = 1, nnode
    write (file_num, '(I0)') sys_size + r + (i - 1)*nnode

Why this went undetected

nnode is 4 in the most common configuration, so the hardcoded value happens to be correct.
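
The silent-failure mechanism is plain index arithmetic. The sketch below is a hedged Python model of the Fortran file-number computation (sys_size and nb values are hypothetical, not taken from MFC); it shows that the hardcoded bound and stride only diverge from the nnode-based ones once nnode != 4:

```python
def file_numbers(sys_size, nb, stride, nnode):
    # Models: do i = 1, nb ; do r = 1, nnode
    #             write (file_num, '(I0)') sys_size + r + (i - 1)*stride
    return [sys_size + r + (i - 1) * stride
            for i in range(1, nb + 1)
            for r in range(1, nnode + 1)]

sys_size, nb = 10, 2  # hypothetical sizes

# Correct loop with nnode = 9: bound and stride both follow nnode
correct = file_numbers(sys_size, nb, stride=9, nnode=9)
# Buggy loop: bound and stride both hardcode 4
buggy = file_numbers(sys_size, nb, stride=4, nnode=4)

assert correct == list(range(11, 29))  # 18 contiguous file numbers, all read
assert buggy == list(range(11, 19))    # only 8 files touched; 10 never opened
# When nnode = 4 the two computations coincide, so the bug stayed hidden
```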

Test plan

  • Run bubble restart with non-default nnode value
  • Verify all mv data files are read correctly

🤖 Generated with Claude Code

Fixes #1209

Summary by CodeRabbit

  • Bug Fixes
    • Corrected MPI communication and buffer packing/unpacking so bubble-related data scale with the actual number of MPI nodes (nnode) instead of a fixed factor.
    • Fixed parallel I/O allocation and subarray sizing so memory views and transfers size to nnode.
    • Updated serial file indexing and buffer addressing so host/device and GPU data transfers align with nnode.

CodeAnt-AI Description

Fix QBMM node-count assumptions so I/O and communication use configured nnode

What Changed

  • mv (mass/velocity) restart file reads use the configured node count (nnode) instead of a hardcoded 4, so mv files are located and read correctly when nnode ≠ 4
  • MPI I/O view sizing and allocations now account for nnode for QBMM variables, ensuring correct file offsets and memory layout during parallel reads/writes
  • Communication buffer packing/unpacking and send/receive ranges use nnode for PB/MV variables, so halo exchanges and GPU pack loops handle the correct number of quadrature nodes

Impact

✅ Correct mv restart for non-default nnode
✅ Accurate parallel I/O offsets for QBMM data
✅ Reliable buffer exchanges and halo transfers when nnode != 4


Copilot AI review requested due to automatic review settings February 21, 2026 03:23
@codeant-ai codeant-ai bot added the size:XS This PR changes 0-9 lines, ignoring generated files label Feb 21, 2026
@codeant-ai
Contributor

codeant-ai bot commented Feb 21, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Parallel I/O inconsistency
    The serial mv loop was updated to use nnode, but the parallel ic-reading loop still uses a hardcoded 4 in the bound sys_size + 2*nb*4. This is an inconsistency: parallel I/O ranges should be updated to use nnode as well (or otherwise validated) so serialized and parallel read logic match.

  • File-number string width
    file_num is allocated with length based only on sys_size digits. When computing filenames with indices like sys_size + r + (i-1)*nnode (and the pb/mv ranges), the printed numeric value may require more digits than computed. This can lead to truncated filenames or incorrect file names when nnode, nb, or sys_size grows. Ensure file_num can hold the maximum index used.

  • Serial mv loop update
    The serial mv file-read loop was changed to iterate r=1..nnode and to compute the file number using nnode (sys_size + r + (i-1)*nnode). Verify this matches the intended file-numbering scheme for all nb and nnode combinations and is consistent with the pb loop. Validate with non-default nnode to ensure expected files are opened and read.
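
The file-number-width concern in the second bullet is digit arithmetic, and can be checked outside Fortran. The sketch below is a Python model of the Fortran length expression with hypothetical sizes (sys_size, nb, and nnode values are illustrative only):

```python
from math import floor, log10

def digits(n):
    # Mirrors the Fortran sizing expression int(floor(log10(real(n)))) + 1
    return floor(log10(n)) + 1

sys_size, nb, nnode = 95, 3, 4         # hypothetical configuration
max_index = sys_size + 2 * nb * nnode  # largest value written into file_num

assert digits(sys_size) == 2   # the buffer is sized for two characters
assert digits(max_index) == 3  # but 119 needs three and cannot fit the record
```

Sizing the buffer from the maximum index actually written, rather than from sys_size alone, removes the mismatch.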

Contributor

Copilot AI left a comment


Pull request overview

This PR fixes a bug in the bubble restart file reading logic where the mv data file loop was hardcoded to iterate 4 times instead of using the nnode variable, which could cause incorrect file reads when nnode != 4.

Changes:

  • Updated the mv data file loop to use nnode instead of the hardcoded value 4
  • Corrected the file number offset calculation to use nnode for proper indexing


@codecov

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 11.11111% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.04%. Comparing base (df28255) to head (7cc2bde).
⚠️ Report is 29 commits behind head on master.

Files with missing lines Patch % Lines
src/common/m_mpi_common.fpp 7.14% 13 Missing ⚠️
src/post_process/m_global_parameters.fpp 14.28% 4 Missing and 2 partials ⚠️
src/pre_process/m_start_up.fpp 0.00% 3 Missing ⚠️
src/pre_process/m_global_parameters.fpp 33.33% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1190      +/-   ##
==========================================
- Coverage   44.05%   44.04%   -0.02%     
==========================================
  Files          70       70              
  Lines       20496    20503       +7     
  Branches     1989     1991       +2     
==========================================
+ Hits         9030     9031       +1     
- Misses      10328    10332       +4     
- Partials     1138     1140       +2     

☔ View full report in Codecov by Sentry.

@codeant-ai codeant-ai bot added size:M This PR changes 30-99 lines, ignoring generated files and removed size:XS This PR changes 0-9 lines, ignoring generated files labels Feb 22, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pre_process/m_start_up.fpp (1)

417-420: ⚠️ Potential issue | 🟡 Minor

file_num buffer may be too short for sys_size + nb*nnode when nnode is large.

The buffer is sized for the digit count of sys_size alone, but numbers written during pb/mv reads go up to sys_size + nb*nnode. For the current default (nnode = 4) this is benign because sys_size already incorporates nb*nmom (which exceeds nb*4). However, for larger nnode values — which this PR explicitly enables — the number of digits can increase, causing Fortran to fill file_num with '*' characters and silently failing to find the data file.

🐛 Proposed fix
-character(LEN= &
-          int(floor(log10(real(sys_size, wp)))) + 1) :: file_num
+character(LEN= &
+          int(floor(log10(real(sys_size + 2*nb*nnode, wp)))) + 1) :: file_num
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pre_process/m_start_up.fpp` around lines 417 - 420, The declaration for
file_num is too short because it only uses digits of sys_size; change the sizing
to compute the maximum possible index (e.g. max_index = sys_size + nb*nnode) and
use int(floor(log10(real(max_index, wp))) + 1) for the character(LEN=...) length
so file_num can hold numbers up to sys_size + nb*nnode; update any related
initializations that assume the old length and keep the symbol names file_num,
sys_size, nb, nnode, and wp to locate and modify the declaration and any
dependent code.

@codeant-ai codeant-ai bot added size:M This PR changes 30-99 lines, ignoring generated files and removed size:M This PR changes 30-99 lines, ignoring generated files labels Feb 23, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/common/m_mpi_common.fpp (1)

742-757: collapse=4 with a 5-loop body — consider bumping to collapse=5 for GPU parallelism consistency.

The direction-1 qbmm_comm pb-pack loop has 5 levels (l, k, j, i, q) but uses collapse=4, leaving q sequential inside each thread. Every other qbmm direction pack/unpack loop (including the adjacent mv-pack on Line 759 and the direction-1 pb-unpack on Line 951) uses collapse=5. There is no data dependency preventing collapse=5 here; writes to buff_send(r) are unique per (l, k, j, i, q) combination.

♻️ Suggested fix
-                    $:GPU_PARALLEL_LOOP(collapse=4,private='[r]')
+                    $:GPU_PARALLEL_LOOP(collapse=5,private='[r]')
                     do l = 0, p
                         do k = 0, n
                             do j = 0, buff_size - 1
                                 do i = nVar + 1, nVar + nnode
                                     do q = 1, nb

Based on learnings: "When modifying MPI pack/unpack logic in one sweep direction, update all directions" — the same parallelism level should be consistent across the symmetric pack/unpack loops for correctness and maintainability.
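
The parallelism difference between the two collapse levels is iteration-space arithmetic. A hedged Python model (loop extents below are hypothetical, chosen only to illustrate the counting) shows how many iterations the GPU can schedule independently in each case:

```python
# Hypothetical extents for the five nested loops l, k, j, i, q
p, n, buff_size, nnode, nb = 31, 31, 3, 4, 2

l_iters = p + 1          # do l = 0, p
k_iters = n + 1          # do k = 0, n
j_iters = buff_size      # do j = 0, buff_size - 1
i_iters = nnode          # do i = nVar + 1, nVar + nnode
q_iters = nb             # do q = 1, nb

collapse4 = l_iters * k_iters * j_iters * i_iters  # q stays sequential per thread
collapse5 = collapse4 * q_iters                    # q joins the parallel space

assert collapse4 == 12288
assert collapse5 == nb * collapse4  # collapse=5 exposes nb-times more parallel work
```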

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/common/m_mpi_common.fpp` around lines 742 - 757, The GPU parallel loop
for the qbmm_comm pb-pack currently uses
$:GPU_PARALLEL_LOOP(collapse=4,private='[r]') but the loop body has five nested
loops (l,k,j,i,q), so change collapse=4 to collapse=5 to allow all five loops to
be collapsed for GPU parallelism; locate the qbmm_comm pack block that writes
buff_send(r) (the nested loops with r = ... and buff_send(r) = real(pb_in(...),
kind=wp)) and update the pragma to collapse=5 to match the adjacent mv-pack and
pb-unpack loops for consistency.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a63b88 and c4aa202.

📒 Files selected for processing (4)
  • src/common/m_mpi_common.fpp
  • src/pre_process/m_global_parameters.fpp
  • src/pre_process/m_start_up.fpp
  • src/simulation/m_global_parameters.fpp
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/pre_process/m_global_parameters.fpp
  • src/simulation/m_global_parameters.fpp
  • src/pre_process/m_start_up.fpp

@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 23, 2026
@sbryngelson sbryngelson force-pushed the fix/mv-loop-nnode branch 2 times, most recently from cfca1b2 to edd3145 Compare February 24, 2026 16:03
@codeant-ai codeant-ai bot added size:M This PR changes 30-99 lines, ignoring generated files and removed size:M This PR changes 30-99 lines, ignoring generated files labels Feb 24, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/simulation/m_global_parameters.fpp (1)

1217-1236: Keep finalization nullification in sync with the expanded QBMM range.

MPI_IO_DATA%view/var now allocate up to sys_size + 2*nb*nnode. To avoid leaving associated pointers for the QBMM extra slots, consider mirroring that upper bound in the finalization nullify loop.

🔧 Suggested adjustment in finalization
-            if (bubbles_lagrange) then
-                do i = 1, sys_size + 1
-                    MPI_IO_DATA%var(i)%sf => null()
-                end do
-            else
-                do i = 1, sys_size
-                    MPI_IO_DATA%var(i)%sf => null()
-                end do
-            end if
+            if (bubbles_lagrange) then
+              do i = 1, sys_size + 1
+                MPI_IO_DATA%var(i)%sf => null()
+              end do
+            elseif (bubbles_euler .and. qbmm .and. .not. polytropic) then
+              do i = 1, sys_size + 2*nb*nnode
+                MPI_IO_DATA%var(i)%sf => null()
+              end do
+            else
+              do i = 1, sys_size
+                MPI_IO_DATA%var(i)%sf => null()
+              end do
+            end if
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulation/m_global_parameters.fpp` around lines 1217 - 1236, The
finalization/nullify loop does not cover the extra QBMM slots allocated to
MPI_IO_DATA%view/var (allocated up to sys_size + 2*nb*nnode when bubbles_euler
.and. qbmm .and. .not. polytropic); update the corresponding cleanup loop that
nullifies MPI_IO_DATA%var(i)%sf (and any pointer fields) to iterate through i =
1, sys_size + 2*nb*nnode under the same condition (bubbles_euler .and. qbmm
.and. .not. polytropic) so the extra slots are nullified just like the base
range. Ensure the loop bounds and condition match the allocation logic for
MPI_IO_DATA%view/var and apply the same change wherever MPI_IO_DATA%var
finalization occurs.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c4aa202 and edd3145.

📒 Files selected for processing (4)
  • src/common/m_mpi_common.fpp
  • src/pre_process/m_global_parameters.fpp
  • src/pre_process/m_start_up.fpp
  • src/simulation/m_global_parameters.fpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/pre_process/m_global_parameters.fpp

sbryngelson and others added 3 commits February 24, 2026 11:49
The pb loop just above correctly uses nnode for both the loop bound
and file number offset. The mv loop should match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ommon

Fix all remaining instances of the literal integer 4 representing the
QBMM quadrature node count (nnode) across simulation and common modules,
completing the nnode parameterization started in the PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix remaining instances of the literal 4 representing QBMM quadrature
node count in pre_process/m_global_parameters.fpp and m_start_up.fpp,
consistent with the fixes already applied to common and simulation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codeant-ai codeant-ai bot removed the size:M This PR changes 30-99 lines, ignoring generated files label Feb 25, 2026
@codeant-ai codeant-ai bot added the size:M This PR changes 30-99 lines, ignoring generated files label Feb 25, 2026
Contributor

@wilfonba wilfonba left a comment


Looks good to me

@sbryngelson sbryngelson marked this pull request as draft February 26, 2026 00:08
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
@MFlowCode MFlowCode deleted a comment from coderabbitai bot Feb 26, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MFlowCode MFlowCode deleted a comment from codeant-ai bot Feb 26, 2026
post_process was allocating MPI_IO_DATA to sys_size without the
2*nb*nnode extension needed for QBMM non-polytropic bubble data,
matching the fix already applied to pre_process and simulation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Claude Code Review

Head SHA: 7cc2bde
Files changed: 5

  • src/common/m_mpi_common.fpp
  • src/post_process/m_global_parameters.fpp
  • src/pre_process/m_global_parameters.fpp
  • src/pre_process/m_start_up.fpp
  • src/simulation/m_global_parameters.fpp

Summary

  • Core fix is correct and complete. Every instance of hardcoded 4 used as nnode in QBMM restart I/O and MPI buffer pack/unpack has been replaced with the actual nnode variable, covering serial loops, GPU parallel loops, MPI type subarray creation, and allocation sizing across all three executables.
  • Scope is appropriately broad. Beyond the headline mv read-loop fix in m_start_up.fpp, the PR also corrects the buffer size formula (v_size = sys_size + 2*nb*nnode), MPI type creation loop, all six pack/unpack path variants in m_mpi_common.fpp, and allocations in pre_process/simulation/post_process m_global_parameters.fpp.
  • Consistency of pack/unpack offsets is maintained. The pb offset is (i-1) + (q-1)*nnode and mv offset adds nb*nnode; pack and unpack mirror each other correctly after the substitution.
  • Post_process gets new conditional QBMM allocation (src/post_process/m_global_parameters.fpp) that was missing before. This is a genuine correctness fix for that target.
  • Two cosmetic typo fixes (imersed → immersed) are harmless.
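
The offset-mirroring claim in the third bullet can be spot-checked with a small Python model of the slot indices (nb and nnode values are hypothetical; the index formulas are taken from the bullet above):

```python
nb, nnode = 2, 4  # hypothetical; nnode = 4 is the QBMM default

# pb slot for quadrature node i (1-based) of bubble bin q, per the summary:
pb = [(i - 1) + (q - 1) * nnode for q in range(1, nb + 1)
                                for i in range(1, nnode + 1)]
# mv slots add a block offset of nb*nnode on top of the pb layout
mv = [s + nb * nnode for s in pb]

assert pb == list(range(nb * nnode))                  # pb fills 0 .. nb*nnode-1
assert mv == list(range(nb * nnode, 2 * nb * nnode))  # mv fills the next block
assert not set(pb) & set(mv)                          # the two ranges never overlap
```

Because both layouts are contiguous and disjoint for any nb and nnode, a pack loop and its mirrored unpack loop read and write the same slots after the 4 → nnode substitution.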

Findings

1. src/post_process/m_global_parameters.fpp: MPI_IO_DATA%view entries uninitialized for QBMM indices (low severity)

The new allocation:

allocate (MPI_IO_DATA%view(1:sys_size + 2*nb*nnode))

allocates entries sys_size+1 … sys_size+2*nb*nnode, but the loop in m_mpi_common.fpp that calls MPI_TYPE_CREATE_SUBARRAY/MPI_TYPE_COMMIT for these indices is guarded:

#ifndef MFC_POST_PROCESS
if (qbmm .and. .not. polytropic) then
    do i = sys_size + 1, sys_size + 2*nb*nnode
        call MPI_TYPE_CREATE_SUBARRAY(...)
        call MPI_TYPE_COMMIT(MPI_IO_DATA%view(i), ierr)
    end do
end if

Those view handles are therefore uninitialized (contain garbage) in the post_process binary. If any post_process code path iterates over the full 1:sys_size+2*nb*nnode view range and calls MPI_File_set_view with one of the uninitialized handles, it will produce undefined MPI behaviour. Please confirm (a) that post_process never accesses MPI_IO_DATA%view(i) for i > sys_size, or (b) add corresponding view initialization for post_process.

2. src/post_process/m_global_parameters.fpp ~line 878 — allocate-then-null pattern propagated

do i = sys_size + 1, sys_size + 2*nb*nnode
    allocate (MPI_IO_DATA%var(i)%sf(0:m, 0:n, 0:p))
    MPI_IO_DATA%var(i)%sf => null()
end do

This pattern (allocate + immediately nullify the pointer) is already present for the 1:sys_size loop and is thus a pre-existing quirk, not introduced here. Worth a follow-up cleanup to avoid the guaranteed memory leak, but not a blocker for this PR.

3. Test coverage gap — non-default nnode path still untested

The PR description explicitly lists [ ] Run bubble restart with non-default nnode value as unchecked. Without a regression test, this fix cannot be verified and could silently regress. Please add a case to toolchain/mfc/test/cases.py that exercises QBMM restart with nnode != 4 (e.g., nnode = 9), or document why such a test is infeasible in the PR body.


Verdict

The logic of all 4 → nnode substitutions is correct and the change addresses a real bug that silently corrupts data for non-default QBMM node counts. The post_process view initialization gap (Finding 1) should be verified before merge; the other two findings are low priority.

@MFlowCode MFlowCode deleted a comment from github-actions bot Feb 26, 2026
@anandrdbz
Contributor

Nnode can never not be 4, but I guess no harm in adding this


Labels

size:M This PR changes 30-99 lines, ignoring generated files

Development

Successfully merging this pull request may close these issues.

mv read loop hardcodes 4 instead of nnode

4 participants