Skip to content

storage: fix stale chunks handling on limits#11632

Open
edsiper wants to merge 4 commits intomasterfrom
storage-fix-stale-chunk
Open

storage: fix stale chunks handling on limits#11632
edsiper wants to merge 4 commits intomasterfrom
storage-fix-stale-chunk

Conversation

@edsiper
Copy link
Copy Markdown
Member

@edsiper edsiper commented Mar 30, 2026


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Improved space projection and release logic to better estimate append growth and avoid unnecessary removals; introduced a two-pass local deletion strategy to meet required space sooner.
    • Adjusted behavior to prefer removing actually-deletable chunk files before shared/unlinked ones under storage pressure.
  • Tests

    • Added an automated test that verifies storage-management prefers deletable chunks when multiple outputs share limits.

edsiper added 2 commits March 30, 2026 13:06
…ction

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
…egressions

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

Adds helpers and projected write-size estimation, implements a two-pass local space-release strategy for input chunks, updates release/placement APIs to use projected sizes, and adds a test verifying deletable files are preferred when storage limits are hit.

Changes

Cohort / File(s) Summary
Chunk space management
src/flb_input_chunk.c
Introduced local mirror of chunkio FS view and helpers: flb_input_chunk_has_other_routes, flb_input_chunk_prefers_physical_delete, flb_input_chunk_get_projected_write_size. Refactored flb_input_chunk_release_space to support two-pass local deletion and early-exit when requirement met. Changed flb_input_chunk_release_space_compound signature to add size_t required_space. Propagated projected placement_size into placement and release flows.
Tests — deletion preference
tests/internal/input_chunk.c
Added count_chunk_files() helper and new test flb_test_input_chunk_prefers_deletable_files_on_limit() (registered in TEST_LIST) that sets up storage+outputs, appends chunks, adjusts limits, and asserts that physically deletable chunks are preferred when limits are enforced.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Input as Input logic
participant Router as Router/Outputs
participant FS as Filesystem (storage)
Input->>Input: compute projected write size
Input->>Router: inspect routes_mask / other routes
Router-->>Input: route membership info
Input->>FS: request/release required_space
alt local scope, pass 0
Input->>Input: decide prefers physical delete?
opt skip physical deletion
Input-->>FS: skip physical delete (unlink deferred)
end
else pass 1 or global
Input-->>FS: delete/unlink candidate chunk(s)
end
FS-->>Input: released_space
Input->>FS: place new chunk using projected placement_size
FS-->>Input: confirm write/placement

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

backport to v4.2.x

Poem

🐰 I nibble bytes beneath the moon,

Two-pass I hop, decide too soon? No—soon.
I drop the chunks that can be shed,
Keep shared friends safe in their bed,
Storage tidy, I hum a tune.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'storage: fix stale chunks handling on limits' directly relates to the main changes in the PR, which focus on improving how chunks are handled when storage limits are enforced, including better space projection and deletion logic.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch storage-fix-stale-chunk

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/flb_input_chunk.c (1)

694-733: Document the magic number 24 for chunk header overhead.

The value 24 corresponds to the chunk file header structure (2 header bytes + 4 CRC32 + 16 padding + 2 metadata length bytes), as documented in flb_input_chunk_get_real_size (lines 522-526). Consider either referencing that constant or adding a similar comment here for maintainability.

    meta_size = (size_t) cio_meta_size(ic->chunk);
    /* See https://github.com/edsiper/chunkio#file-layout for header overhead:
     * 2 (header) + 4 (CRC32) + 16 (padding) + 2 (metadata length) = 24 bytes */
    logical_size = (size_t) flb_input_chunk_get_size(ic) + meta_size + 24;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input_chunk.c` around lines 694 - 733, The code in
flb_input_chunk_get_projected_write_size uses the magic number 24 for chunk
header overhead without explanation; update the function to document or
reference that constant by adding a brief comment explaining the 24 bytes
breakdown (2 header bytes + 4 CRC32 + 16 padding + 2 metadata length bytes) or
use the same named constant used by flb_input_chunk_get_real_size if one exists;
reference flb_input_chunk_get_projected_write_size and
flb_input_chunk_get_real_size so reviewers can locate and align the
header-overhead handling and ensure maintainability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/flb_input_chunk.c`:
- Around line 694-733: The code in flb_input_chunk_get_projected_write_size uses
the magic number 24 for chunk header overhead without explanation; update the
function to document or reference that constant by adding a brief comment
explaining the 24 bytes breakdown (2 header bytes + 4 CRC32 + 16 padding + 2
metadata length bytes) or use the same named constant used by
flb_input_chunk_get_real_size if one exists; reference
flb_input_chunk_get_projected_write_size and flb_input_chunk_get_real_size so
reviewers can locate and align the header-overhead handling and ensure
maintainability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c3e81c91-1a73-4a0f-8e0b-9685008fa15a

📥 Commits

Reviewing files that changed from the base of the PR and between 8647f02 and 0723c30.

📒 Files selected for processing (2)
  • src/flb_input_chunk.c
  • tests/internal/input_chunk.c

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0723c308b7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…iction

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/flb_input_chunk.c (1)

75-91: Consider directly including <chunkio/cio_file.h> instead of mirroring the struct.

The struct flb_input_chunk_cio_file_view mirrors fields from cio_file, which is defined in the public header lib/chunkio/include/chunkio/cio_file.h. This header is not private—it's part of chunkio's public API and is directly included in chunkio's own tests and source files (e.g., lib/chunkio/tests/fs.c, lib/chunkio/tests/fs_windows.c). Fluent-bit could simplify this by including <chunkio/cio_file.h> and using struct cio_file directly, as chunkio itself does, rather than maintaining a duplicate struct definition. If there is a specific reason to avoid this include (e.g., build isolation), document it in the comment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input_chunk.c` around lines 75 - 91, Replace the mirrored struct
definition flb_input_chunk_cio_file_view with a direct use of the public struct
cio_file by including the public header <chunkio/cio_file.h> and update any
casts/usages that referenced flb_input_chunk_cio_file_view to use struct
cio_file instead; if including the header is intentionally avoided for
build/isolation reasons, add a clear comment above the mirrored struct
explaining why and referencing that decision so the duplicate layout is
documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/flb_input_chunk.c`:
- Around line 756-764: The loop that increments projected_size by realloc_size
can overflow for very large append_size; replace the iterative loop in
flb_input_chunk.c with a safe calculation: first check for overflow when
computing target = logical_size + append_size, then compute needed = (target >
projected_size) ? target - projected_size : 0; compute increments = (needed +
realloc_size - 1) / realloc_size and add projected_size += increments *
realloc_size while guarding the multiplication/addition for overflow; finally
round projected_size up to page_size with an overflow check — reference
variables projected_size, alloc_size, realloc_size, logical_size, append_size,
and page_size to locate and update the logic.

---

Nitpick comments:
In `@src/flb_input_chunk.c`:
- Around line 75-91: Replace the mirrored struct definition
flb_input_chunk_cio_file_view with a direct use of the public struct cio_file by
including the public header <chunkio/cio_file.h> and update any casts/usages
that referenced flb_input_chunk_cio_file_view to use struct cio_file instead; if
including the header is intentionally avoided for build/isolation reasons, add a
clear comment above the mirrored struct explaining why and referencing that
decision so the duplicate layout is documented.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 208da22d-b623-48b0-a6b5-d661bb3f37b5

📥 Commits

Reviewing files that changed from the base of the PR and between 0723c30 and ccd4de1.

📒 Files selected for processing (1)
  • src/flb_input_chunk.c

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/flb_input_chunk.c (1)

75-94: Fragile struct mirroring pattern - ensure synchronization with chunkio.

The mirror struct must exactly match the leading layout of struct cio_file in lib/chunkio/include/chunkio/cio_file.h. Currently, the field order appears correct based on the chunkio header. However, this pattern is inherently fragile:

  1. Any reordering or insertion of fields in the original struct will silently corrupt reads
  2. Platform-specific padding differences could cause misalignment

Consider adding a static assertion or compile-time check to validate the struct size/offsets match expectations, or document a version dependency on chunkio.

#!/bin/bash
# Verify the mirror struct matches the original cio_file struct layout
# by checking the field order in the chunkio header

echo "=== Checking cio_file struct definition in chunkio ==="
fd -t f 'cio_file.h' --exec cat {} \; 2>/dev/null | head -60

echo ""
echo "=== Checking if there are any other references to cio_file fields ==="
rg -n 'struct cio_file' --type c -A 20 | head -80
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input_chunk.c` around lines 75 - 94, The mirrored struct
flb_input_chunk_cio_file_view must be validated against chunkio's real struct
cio_file to avoid silent corruption; add compile-time assertions that sizeof(fl
b_input_chunk_cio_file_view) equals sizeof(struct cio_file) and that offsetof
for key fields (fd, flags, synced, allocate_strategy, fs_size, data_size,
page_size, alloc_size, realloc_size) match the corresponding offsetof(struct
cio_file, ...); implement these checks using static_assert/_Static_assert or
equivalent macros (and guard them with `#ifdef` that includes chunkio/cio_file.h
when available), or document a strict chunkio version requirement if the header
cannot be included. Ensure the checks are named/placed with references to
flb_input_chunk_cio_file_view and cio_file so failures point to the mismatch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/flb_input_chunk.c`:
- Around line 798-805: flb_input_chunk_get_real_size can return -1 on error but
its result is cast to size_t into current_size causing unsigned wrap; in the
block that sets current_size when ic->fs_counted == FLB_TRUE, call
flb_input_chunk_get_real_size(ic) into a signed variable (e.g. ssize_t real =
flb_input_chunk_get_real_size(ic)), check for real == -1 and handle the error
path (either treat current_size as 0, log/propagate the error, or return an
error code) before casting to size_t; update the calculation using that
validated value so projected_size - current_size cannot underflow unexpectedly.
- Around line 389-400: flb_input_chunk_get_real_size() can return -1 on error;
before using its result in the FLB_INPUT_CHUNK_RELEASE_SCOPE_LOCAL branch,
validate chunk_size != -1 and handle the error path: when chunk_size == -1, log
or handle the failure and skip subtracting from output_plugin->fs_chunks_size
and skip adding to released_space (or treat chunk_size as 0) to avoid
underflow/incorrect accounting. Update the block that references chunk_size,
old_input_chunk, output_plugin->fs_chunks_size, and released_space to perform
this check and safe handling.

---

Nitpick comments:
In `@src/flb_input_chunk.c`:
- Around line 75-94: The mirrored struct flb_input_chunk_cio_file_view must be
validated against chunkio's real struct cio_file to avoid silent corruption; add
compile-time assertions that sizeof(fl b_input_chunk_cio_file_view) equals
sizeof(struct cio_file) and that offsetof for key fields (fd, flags, synced,
allocate_strategy, fs_size, data_size, page_size, alloc_size, realloc_size)
match the corresponding offsetof(struct cio_file, ...); implement these checks
using static_assert/_Static_assert or equivalent macros (and guard them with
`#ifdef` that includes chunkio/cio_file.h when available), or document a strict
chunkio version requirement if the header cannot be included. Ensure the checks
are named/placed with references to flb_input_chunk_cio_file_view and cio_file
so failures point to the mismatch.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5a9a3464-e543-4c7c-9406-8605d69a6f75

📥 Commits

Reviewing files that changed from the base of the PR and between ccd4de1 and 732996f.

📒 Files selected for processing (1)
  • src/flb_input_chunk.c

Comment on lines +389 to 400
chunk_size = flb_input_chunk_get_real_size(old_input_chunk);
chunk_released = FLB_FALSE;
chunk_destroy_flag = FLB_FALSE;

chunk_destroy_flag = flb_routes_mask_is_empty(
old_input_chunk->routes_mask,
input_plugin->config->router);
if (release_scope == FLB_INPUT_CHUNK_RELEASE_SCOPE_LOCAL) {
flb_routes_mask_clear_bit(old_input_chunk->routes_mask,
output_plugin->id,
input_plugin->config->router);

chunk_released = FLB_TRUE;
}
else if (release_scope == FLB_INPUT_CHUNK_RELEASE_SCOPE_GLOBAL) {
chunk_destroy_flag = FLB_TRUE;
}
FS_CHUNK_SIZE_DEBUG_MOD(output_plugin, old_input_chunk, chunk_size);
output_plugin->fs_chunks_size -= chunk_size;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing validation for negative chunk_size return value.

flb_input_chunk_get_real_size() can return -1 on error (as seen in its implementation at lines 522-550). If it returns -1, the subsequent subtraction at line 399 would cause an underflow of fs_chunks_size, and the addition at line 480 would incorrectly adjust released_space.

🛡️ Proposed fix to validate chunk_size
             chunk_size = flb_input_chunk_get_real_size(old_input_chunk);
+            if (chunk_size < 0) {
+                continue;
+            }
             chunk_released = FLB_FALSE;
             chunk_destroy_flag = FLB_FALSE;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input_chunk.c` around lines 389 - 400,
flb_input_chunk_get_real_size() can return -1 on error; before using its result
in the FLB_INPUT_CHUNK_RELEASE_SCOPE_LOCAL branch, validate chunk_size != -1 and
handle the error path: when chunk_size == -1, log or handle the failure and skip
subtracting from output_plugin->fs_chunks_size and skip adding to released_space
(or treat chunk_size as 0) to avoid underflow/incorrect accounting. Update the
block that references chunk_size, old_input_chunk,
output_plugin->fs_chunks_size, and released_space to perform this check and safe
handling.

Comment on lines +798 to +805
current_size = 0;
if (ic->fs_counted == FLB_TRUE) {
current_size = (size_t) flb_input_chunk_get_real_size(ic);
}

if (projected_size > current_size) {
return projected_size - current_size;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential issue with negative return from flb_input_chunk_get_real_size.

At line 800, flb_input_chunk_get_real_size(ic) can return -1 on error. Casting this to size_t would produce a very large value (due to unsigned conversion), causing the subtraction at line 804 to potentially return 0 incorrectly or cause unexpected behavior.

🛡️ Proposed fix to handle negative return
     current_size = 0;
     if (ic->fs_counted == FLB_TRUE) {
-        current_size = (size_t) flb_input_chunk_get_real_size(ic);
+        ssize_t real_size = flb_input_chunk_get_real_size(ic);
+        if (real_size > 0) {
+            current_size = (size_t) real_size;
+        }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input_chunk.c` around lines 798 - 805, flb_input_chunk_get_real_size
can return -1 on error but its result is cast to size_t into current_size
causing unsigned wrap; in the block that sets current_size when ic->fs_counted
== FLB_TRUE, call flb_input_chunk_get_real_size(ic) into a signed variable (e.g.
ssize_t real = flb_input_chunk_get_real_size(ic)), check for real == -1 and
handle the error path (either treat current_size as 0, log/propagate the error,
or return an error code) before casting to size_t; update the calculation using
that validated value so projected_size - current_size cannot underflow
unexpectedly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant