Skip to content

Fix off-by-two padding bug in set_string and empty string edge case#586

Open
gijzelaerr wants to merge 1 commit intomasterfrom
fix/479-set-string-padding
Open

Fix off-by-two padding bug in set_string and empty string edge case#586
gijzelaerr wants to merge 1 commit intomasterfrom
fix/479-set-string-padding

Conversation

@gijzelaerr
Copy link
Owner

Summary

Fixes #479

set_string() left the last 2 character positions unpadded when writing a shorter string over a longer one, leaving stale data in the buffer. Additionally, both set_string() and set_fstring() failed to clear the first character position when writing an empty string.

Root cause analysis

Bug 1: Off-by-two in set_string padding (line 247)

# BEFORE (buggy)
for r in range(i + 1, bytearray_[byte_index] - 2):
    bytearray_[byte_index + 2 + r] = ord(" ")

The - 2 was an incorrect attempt to account for the 2-byte string header (max_size byte + length byte). However, the header offset is already handled by the byte_index + 2 + r indexing expression. The variable r is a relative index into the data area (positions 0 through max_size-1), so the range should go up to max_size, not max_size - 2.

Impact: When writing "ab" into a buffer with max_size=10, positions 8 and 9 (relative to the data area) were never cleared:

Expected: [10, 2, 'a', 'b', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
Actual:   [10, 2, 'a', 'b', ' ', ' ', ' ', ' ', ' ', ' ',  ?,  ?]
                                                             ^^  ^^
                                                       stale data!

Bug 2: Empty string edge case (both set_string and set_fstring)

i = 0
for i, c in enumerate(value):  # doesn't execute for empty string
    ...
for r in range(i + 1, ...):    # starts at 1, skipping position 0
    ...

When value is empty, the enumerate loop doesn't execute, so i stays at 0. The padding loop then starts from range(0 + 1, ...), skipping position 0 entirely.

For set_string, this bug was masked by get_string() which reads the length byte (0 for empty) and returns "" regardless. But for set_fstring, the stale character at position 0 would be visible through get_fstring().

Fix

  1. Changed range(i + 1, bytearray_[byte_index] - 2) to range(len(value), bytearray_[byte_index]) in set_string
  2. Changed range(i + 1, max_length) to range(len(value), max_length) in set_fstring

Using len(value) instead of i + 1 correctly handles both the normal case and the empty string edge case, and removing - 2 fixes the off-by-two.

Test plan

  • All 326 existing tests pass
  • Manual verification that padding is correct for "ab" with max_size=10
  • Manual verification that empty string clears all positions
  • Add unit tests that verify buffer contents after set_string (not just round-trip)

🤖 Generated with Claude Code

set_string had an off-by-two error in the padding loop: the range
used `bytearray_[byte_index] - 2` instead of `bytearray_[byte_index]`.
The -2 subtraction was incorrect because the 2-byte header offset is
already handled by the `byte_index + 2 + r` indexing expression.
This left the last 2 character positions unpadded with stale data when
writing a shorter string over a longer one.

Also fix an empty string edge case in both set_string and set_fstring
where the first character position would not be cleared because the
enumerate loop wouldn't execute and `i` would remain at its initial
value of 0. Now use `len(value)` as the range start instead.

Fixes #479

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error in util.Set_String and util.Get_String

1 participant