fix: resolve serial debugger spurious values from concurrent polling bugs#618
Conversation
…bugs Three interrelated concurrency bugs caused phantom TRUE values on BOOL outputs in the serial (Modbus RTU) debugger. Serial is most affected because its small batch size (20) and slow transport make poll cycles frequently exceed the 200ms interval. 1. Replace setInterval with recursive setTimeout so the next poll only schedules after the current one completes, making concurrent polls structurally impossible. 2. Send a single batch per poll cycle instead of all batches back-to-back. A batchOffsetRef tracks position across cycles so all variables still get polled. Values from previous batches persist because newValues starts as a copy of the current store. 3. Add promise-chain mutex to both RTU and TCP Modbus clients so that a setVariable (force/release) IPC call arriving during a poll's getVariablesList cannot cause concurrent serial/socket access. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WalkthroughThese changes introduce request serialization mechanisms in the Modbus communication layer and refactor debugger variable polling from interval-based to batch-based, timeout-scheduled processing with memory management and composite-key support. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/renderer/screens/workspace-screen.tsx (1)
1495-1518:⚠️ Potential issue | 🔴 CriticalZombie polling chain when
isDebuggerVisibletoggles while a poll is in-flight.When the effect re-runs (debugger toggled off/on), cleanup clears the pending
setTimeoutbut cannot abort an in-flightpollVariables()promise. Its.finally()still invokesschedulePoll(), and sinceisMountedRef.currentistrue(component is still mounted), a new timeout is scheduled. If the debugger is then toggled back on, a second independent chain starts — re-introducing the concurrent-poll bug this PR aims to fix.Use a closure-scoped cancellation flag set in the cleanup function:
Proposed fix
+ let cancelled = false + const schedulePoll = () => { - if (!isMountedRef.current) return + if (cancelled || !isMountedRef.current) return pollingIntervalRef.current = setTimeout(() => { void pollVariables().finally(() => schedulePoll()) }, DEBUGGER_POLL_INTERVAL_MS) } // Fire first poll immediately, then chain - void pollVariables().finally(() => schedulePoll()) + void pollVariables().finally(() => { + if (!cancelled) schedulePoll() + }) return () => { + cancelled = true if (pollingIntervalRef.current) { clearTimeout(pollingIntervalRef.current) pollingIntervalRef.current = null }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/screens/workspace-screen.tsx` around lines 1495 - 1518, The polling chain can continue after effect cleanup because an in-flight pollVariables() finally() still calls schedulePoll(); fix by adding a closure-scoped cancelled flag (e.g. let cancelled = false) inside the effect, set cancelled = true in the cleanup, and check this flag before scheduling new timeouts: update schedulePoll() to return early if cancelled or !isMountedRef.current, and in the pollVariables().finally(() => ...) handler only call schedulePoll() when cancelled is false; keep clearing pollingIntervalRef.current as before. Reference symbols: schedulePoll, pollVariables, pollingIntervalRef, isMountedRef, DEBUGGER_POLL_INTERVAL_MS.src/main/modules/modbus/modbus-rtu-client.ts (1)
163-165:⚠️ Potential issue | 🟡 MinorTimeout handler omits listener cleanup, unlike the TCP counterpart
The outer
timeoutHandlecallsrejectbut never removesonData/onErrorfromserialPort. ComparesendTcpRequestImplinmodbus-client.ts(lines 77–81), which explicitly callsremoveListenerfor both before rejecting. After an RTU timeout, the stale listeners persist until the next successful response'sframeCompleteTimeoutfires and removes them. With repeated consecutive timeouts, stale listener pairs accumulate;flushInputBufferon the next request discards buffered OS data but cannot remove them.🔧 Proposed fix
const timeoutHandle = setTimeout(() => { + if (frameCompleteTimeout) { + clearTimeout(frameCompleteTimeout) + } + this.serialPort?.removeListener('data', onData) + this.serialPort?.removeListener('error', onError) reject(new Error('Request timeout')) }, this.timeout)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/modules/modbus/modbus-rtu-client.ts` around lines 163 - 165, The RTU timeout handler only rejects the promise and doesn't remove the serialPort listeners, causing stale onData/onError handlers to accumulate; update the timeout callback (the setTimeout using timeoutHandle in modbus-rtu-client.ts) to mirror the TCP implementation by removing the serialPort listeners (removeListener for onData and onError), clearing any frameCompleteTimeout and the timeoutHandle, and then rejecting with the timeout Error so no dangling listeners remain.
🧹 Nitpick comments (2)
src/renderer/screens/workspace-screen.tsx (1)
174-174:batchOffsetRefpersists across debugger sessions.When the debugger is toggled off and back on,
batchOffsetRefretains its previous value. The clamping at line 1393 prevents out-of-bounds access, but polling resumes mid-sequence rather than from the start. If a fresh start is preferred per session, consider resetting it alongsidevariableInfoMapRef.current = nullin the early-return paths (lines 209, 225).Optional: reset offset when polling stops
variableInfoMapRef.current = null + batchOffsetRef.current = 0 returnApply at both early-return sites (lines 209-210 and lines 225-226).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/screens/workspace-screen.tsx` at line 174, Reset batchOffsetRef to 0 when stopping the debugger so polling restarts fresh: in the early-return branches where you already set variableInfoMapRef.current = null (the two polling-stop sites), add batchOffsetRef.current = 0 alongside that reset; this applies to the same early-return paths that currently clear variableInfoMapRef and prevent further polling so batchOffsetRef won't persist across sessions.src/main/modules/modbus/modbus-rtu-client.ts (1)
144-153:sendRequestis redundantly declaredasyncThe function only wraps
new Promisewithout anyawait; theasynckeyword adds nothing here. The mutex pattern itself is correct — bothonFulfilled/onRejectedalways callsendRequestImpl, so the chain can never reject and subsequent callers always proceed.🔧 Proposed fix
- private async sendRequest(request: Buffer): Promise<Buffer> { + private sendRequest(request: Buffer): Promise<Buffer> { return new Promise<Buffer>((resolve, reject) => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/modules/modbus/modbus-rtu-client.ts` around lines 144 - 153, The sendRequest function is unnecessarily marked async while it simply constructs and returns a new Promise; remove the async modifier and return the promise directly or refactor to return the existing promise chain. Update the function signature to "private sendRequest(request: Buffer): Promise<Buffer>" (remove async) and keep the body that assigns this.sendRequestMutex = this.sendRequestMutex.then(...). Ensure it still resolves/rejects via sendRequestImpl so sendRequestMutex and sendRequestImpl remain the coordination points.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/main/modules/modbus/modbus-client.ts`:
- Around line 133-157: The TCP branch of getMd5Hash currently calls
sendTcpRequest once and fails immediately; update getMd5Hash to apply the same
retry policy used by the RTU path: wrap the call to sendTcpRequest in a retry
loop using MD5_REQUEST_MAX_RETRIES and MD5_REQUEST_RETRY_DELAY_MS, retry on
transient errors (network timeouts, TCP resets, transactionId mismatch from
responseTransactionId !== transactionId, or other errors thrown by
sendTcpRequest), and pause between attempts with the same delay; after
exhausting retries rethrow the last error. Ensure you reference getMd5Hash,
sendTcpRequest, MD5_REQUEST_MAX_RETRIES, and MD5_REQUEST_RETRY_DELAY_MS so the
retry behavior matches the RTU sendRequest implementation.
- Around line 83-88: The TCP read handler currently uses socket.once('data',
onData) in the getVariablesList flow which drops subsequent TCP segments;
replace this with a cumulative on('data', onData) approach that appends incoming
chunks into a buffer and only resolves when the full MBAP frame is received (use
the Length field at offset 4 to compute expected total bytes) or when frame idle
timeout (FRAME_COMPLETE_TIMEOUT_MS) elapses; ensure onError and timeoutHandle
are still removed/cleared and socket.removeListener is called when resolving or
rejecting to avoid leaks.
In `@src/main/modules/modbus/modbus-rtu-client.ts`:
- Around line 194-196: Replace the silent comment in the CRC check with a
warning log: inside the if (receivedCrc !== calculatedCrc) block emit a warning
(using the module's existing logger, e.g., this.logger.warn or logger.warn) that
includes receivedCrc and calculatedCrc (preferably formatted as hex), plus any
available context such as the slave/unit id and the raw frame buffer, then
continue to process the frame as before; reference the receivedCrc and
calculatedCrc variables when building the log message so the mismatch is
recorded for diagnostics.
---
Outside diff comments:
In `@src/main/modules/modbus/modbus-rtu-client.ts`:
- Around line 163-165: The RTU timeout handler only rejects the promise and
doesn't remove the serialPort listeners, causing stale onData/onError handlers
to accumulate; update the timeout callback (the setTimeout using timeoutHandle
in modbus-rtu-client.ts) to mirror the TCP implementation by removing the
serialPort listeners (removeListener for onData and onError), clearing any
frameCompleteTimeout and the timeoutHandle, and then rejecting with the timeout
Error so no dangling listeners remain.
In `@src/renderer/screens/workspace-screen.tsx`:
- Around line 1495-1518: The polling chain can continue after effect cleanup
because an in-flight pollVariables() finally() still calls schedulePoll(); fix
by adding a closure-scoped cancelled flag (e.g. let cancelled = false) inside
the effect, set cancelled = true in the cleanup, and check this flag before
scheduling new timeouts: update schedulePoll() to return early if cancelled or
!isMountedRef.current, and in the pollVariables().finally(() => ...) handler
only call schedulePoll() when cancelled is false; keep clearing
pollingIntervalRef.current as before. Reference symbols: schedulePoll,
pollVariables, pollingIntervalRef, isMountedRef, DEBUGGER_POLL_INTERVAL_MS.
---
Nitpick comments:
In `@src/main/modules/modbus/modbus-rtu-client.ts`:
- Around line 144-153: The sendRequest function is unnecessarily marked async
while it simply constructs and returns a new Promise; remove the async modifier
and return the promise directly or refactor to return the existing promise
chain. Update the function signature to "private sendRequest(request: Buffer):
Promise<Buffer>" (remove async) and keep the body that assigns
this.sendRequestMutex = this.sendRequestMutex.then(...). Ensure it still
resolves/rejects via sendRequestImpl so sendRequestMutex and sendRequestImpl
remain the coordination points.
In `@src/renderer/screens/workspace-screen.tsx`:
- Line 174: Reset batchOffsetRef to 0 when stopping the debugger so polling
restarts fresh: in the early-return branches where you already set
variableInfoMapRef.current = null (the two polling-stop sites), add
batchOffsetRef.current = 0 alongside that reset; this applies to the same
early-return paths that currently clear variableInfoMapRef and prevent further
polling so batchOffsetRef won't persist across sessions.
| const onData = (data: Buffer) => { | ||
| clearTimeout(timeoutHandle) | ||
| this.socket?.removeListener('data', onData) | ||
| this.socket?.removeListener('error', onError) | ||
| resolve(data) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/main/modules/modbus/modbus-client.ts | sed -n '70,101p'Repository: Autonomy-Logic/openplc-editor
Length of output: 1310
🏁 Script executed:
git log --oneline --follow -- src/main/modules/modbus/modbus-client.ts | head -10Repository: Autonomy-Logic/openplc-editor
Length of output: 155
🏁 Script executed:
git diff HEAD~1 HEAD -- src/main/modules/modbus/modbus-client.ts | grep -A5 -B5 "once.*data"Repository: Autonomy-Logic/openplc-editor
Length of output: 101
🏁 Script executed:
# Search for RTU implementation to compare the pattern
find . -name "*rtu*" -o -name "*modbus*" | grep -i rtuRepository: Autonomy-Logic/openplc-editor
Length of output: 221
🏁 Script executed:
cat -n src/main/modules/modbus/modbus-rtu-client.ts | head -150Repository: Autonomy-Logic/openplc-editor
Length of output: 7811
🏁 Script executed:
# Search for the TCP request pattern - specifically looking for on('data') and frameCompleteTimeout
rg "frameCompleteTimeout|on\('data'" src/main/modules/modbus/Repository: Autonomy-Logic/openplc-editor
Length of output: 702
🏁 Script executed:
# Search for MBAP protocol details and getVariablesList
rg "MBAP|getVariablesList|Length.*offset" src/main/modules/modbus/Repository: Autonomy-Logic/openplc-editor
Length of output: 284
🏁 Script executed:
cat -n src/main/modules/modbus/modbus-rtu-client.ts | sed -n '146,200p'Repository: Autonomy-Logic/openplc-editor
Length of output: 2516
🏁 Script executed:
# Look for the actual data accumulation pattern in RTU
cat -n src/main/modules/modbus/modbus-rtu-client.ts | sed -n '150,250p'Repository: Autonomy-Logic/openplc-editor
Length of output: 4228
🏁 Script executed:
# Check the modbus-client.ts for getVariablesList and response size handling
rg "getVariablesList" src/main/modules/modbus/modbus-client.ts -A 30 | head -50Repository: Autonomy-Logic/openplc-editor
Length of output: 995
🏁 Script executed:
# Search for MBAP or response parsing logic
rg "readUInt|offset.*4|MBAP|responseSize" src/main/modules/modbus/modbus-client.tsRepository: Autonomy-Logic/openplc-editor
Length of output: 869
socket.once('data', onData) drops subsequent TCP segments for large responses
TCP is a stream protocol where responses can arrive in multiple chunks. Using once('data') captures only the first segment and removes the listener, discarding any subsequent data. For getVariablesList with many debug variables, the response (17-byte header + responseSize bytes at offset 15) can exceed typical MTU (~1460 bytes), guaranteeing fragmentation. The incomplete response will fail the validation check data.length < 17 + responseSize and likely timeout.
The RTU client implements the correct pattern: on('data', onData) with frameCompleteTimeout (50ms idle timeout) that accumulates all chunks before resolving. The TCP implementation should match this approach—either by accumulating chunks until FRAME_COMPLETE_TIMEOUT_MS elapses or by using the MBAP Length field (at offset 4) to determine the expected response size and resolve only when the full response is received.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/modules/modbus/modbus-client.ts` around lines 83 - 88, The TCP read
handler currently uses socket.once('data', onData) in the getVariablesList flow
which drops subsequent TCP segments; replace this with a cumulative on('data',
onData) approach that appends incoming chunks into a buffer and only resolves
when the full MBAP frame is received (use the Length field at offset 4 to
compute expected total bytes) or when frame idle timeout
(FRAME_COMPLETE_TIMEOUT_MS) elapses; ensure onError and timeoutHandle are still
removed/cleared and socket.removeListener is called when resolving or rejecting
to avoid leaks.
| const data = await this.sendTcpRequest(request) | ||
|
|
||
| const onData = (data: Buffer) => { | ||
| clearTimeout(timeoutHandle) | ||
| this.socket?.removeListener('data', onData) | ||
| this.socket?.removeListener('error', onError) | ||
| if (data.length < 9) { | ||
| throw new Error('Invalid response: too short') | ||
| } | ||
|
|
||
| try { | ||
| if (data.length < 9) { | ||
| reject(new Error('Invalid response: too short')) | ||
| return | ||
| } | ||
|
|
||
| const responseTransactionId = data.readUInt16BE(0) | ||
| const responseFunctionCode = data.readUInt8(7) | ||
| const statusCode = data.readUInt8(8) | ||
|
|
||
| if (responseTransactionId !== transactionId) { | ||
| reject(new Error('Transaction ID mismatch')) | ||
| return | ||
| } | ||
|
|
||
| if (responseFunctionCode !== (ModbusFunctionCode.DEBUG_GET_MD5 as number)) { | ||
| reject(new Error('Function code mismatch')) | ||
| return | ||
| } | ||
|
|
||
| if (statusCode !== (ModbusDebugResponse.SUCCESS as number)) { | ||
| reject(new Error(`Target returned error code: 0x${statusCode.toString(16)}`)) | ||
| return | ||
| } | ||
|
|
||
| const md5String = data.slice(9).toString('utf-8').trim() | ||
| resolve(md5String) | ||
| } catch (error) { | ||
| reject(error instanceof Error ? error : new Error(String(error))) | ||
| } | ||
| } | ||
| const responseTransactionId = data.readUInt16BE(0) | ||
| const responseFunctionCode = data.readUInt8(7) | ||
| const statusCode = data.readUInt8(8) | ||
|
|
||
| const onError = (error: Error) => { | ||
| clearTimeout(timeoutHandle) | ||
| this.socket?.removeListener('data', onData) | ||
| this.socket?.removeListener('error', onError) | ||
| reject(error) | ||
| } | ||
| if (responseTransactionId !== transactionId) { | ||
| throw new Error('Transaction ID mismatch') | ||
| } | ||
|
|
||
| this.socket!.once('data', onData) | ||
| this.socket!.once('error', onError) | ||
| this.socket!.write(request as unknown as Uint8Array) | ||
| }) | ||
| if (responseFunctionCode !== (ModbusFunctionCode.DEBUG_GET_MD5 as number)) { | ||
| throw new Error('Function code mismatch') | ||
| } | ||
|
|
||
| if (statusCode !== (ModbusDebugResponse.SUCCESS as number)) { | ||
| throw new Error(`Target returned error code: 0x${statusCode.toString(16)}`) | ||
| } | ||
|
|
||
| const md5String = data.slice(9).toString('utf-8').trim() | ||
| return md5String | ||
| } |
There was a problem hiding this comment.
getMd5Hash (TCP) no longer retries; RTU retries up to 3 times
The RTU getMd5Hash wraps sendRequest in a retry loop (MD5_REQUEST_MAX_RETRIES = 3, MD5_REQUEST_RETRY_DELAY_MS = 500). The refactored TCP path calls sendTcpRequest once and propagates any error immediately. A transient timeout, TCP RST, or transactionId mismatch that would have been retried over RTU now causes an immediate connection failure on TCP. Consider whether the same retry policy should apply, or whether the existing callers already handle reconnection.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/modules/modbus/modbus-client.ts` around lines 133 - 157, The TCP
branch of getMd5Hash currently calls sendTcpRequest once and fails immediately;
update getMd5Hash to apply the same retry policy used by the RTU path: wrap the
call to sendTcpRequest in a retry loop using MD5_REQUEST_MAX_RETRIES and
MD5_REQUEST_RETRY_DELAY_MS, retry on transient errors (network timeouts, TCP
resets, transactionId mismatch from responseTransactionId !== transactionId, or
other errors thrown by sendTcpRequest), and pause between attempts with the same
delay; after exhausting retries rethrow the last error. Ensure you reference
getMd5Hash, sendTcpRequest, MD5_REQUEST_MAX_RETRIES, and
MD5_REQUEST_RETRY_DELAY_MS so the retry behavior matches the RTU sendRequest
implementation.
| if (receivedCrc !== calculatedCrc) { | ||
| // OpenPLC debugger ignores CRC errors | ||
| //reject(new Error('CRC validation failed')) | ||
| //return | ||
| console.warn('Warning: CRC validation failed, but continuing anyway') | ||
| // OpenPLC debugger ignores CRC errors — mismatch is non-fatal | ||
| } |
There was a problem hiding this comment.
Silent CRC discard emits no warning log
A CRC mismatch means the frame is corrupted on the wire, yet the code proceeds with the corrupted payload and emits nothing to the log. The AI-generated summary described this change as "logs a warning and continues," but the code only contains a comment. Silent discards make it very hard to diagnose intermittent line-noise issues in the field.
🔧 Proposed fix
if (receivedCrc !== calculatedCrc) {
- // OpenPLC debugger ignores CRC errors — mismatch is non-fatal
+ console.warn('Modbus RTU: CRC mismatch (received 0x' + receivedCrc.toString(16) + ', expected 0x' + calculatedCrc.toString(16) + ') — continuing')
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (receivedCrc !== calculatedCrc) { | |
| // OpenPLC debugger ignores CRC errors | |
| //reject(new Error('CRC validation failed')) | |
| //return | |
| console.warn('Warning: CRC validation failed, but continuing anyway') | |
| // OpenPLC debugger ignores CRC errors — mismatch is non-fatal | |
| } | |
| if (receivedCrc !== calculatedCrc) { | |
| console.warn( | |
| `Modbus RTU: CRC mismatch (received 0x${receivedCrc.toString(16)}, expected 0x${calculatedCrc.toString(16)}) — continuing`, | |
| ) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/modules/modbus/modbus-rtu-client.ts` around lines 194 - 196, Replace
the silent comment in the CRC check with a warning log: inside the if
(receivedCrc !== calculatedCrc) block emit a warning (using the module's
existing logger, e.g., this.logger.warn or logger.warn) that includes
receivedCrc and calculatedCrc (preferably formatted as hex), plus any available
context such as the slave/unit id and the raw frame buffer, then continue to
process the frame as before; reference the receivedCrc and calculatedCrc
variables when building the log message so the mismatch is recorded for
diagnostics.
Reduce poll interval from 200ms to 50ms with skip-if-busy guard so polling adapts to target speed. Lower RTU frame-complete timeout from 50ms to 10ms to avoid unnecessary waiting after each serial response. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
setIntervalwith recursivesetTimeoutso the next poll only schedules after the current one completes, making concurrent polls structurally impossible on slow serial linksbatchOffsetReftracks position across cycles so all variables still get polled, while values from previous batches persist via store copysetVariable(force/release) calls arriving during a poll'sgetVariablesListcannot cause concurrent serial/socket access and corrupt the data streamTest plan
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Performance