Deadlock when Lua callback calls :set() during event handling #794

@a322655

Problem Description

SketchyBar occasionally stops refreshing. The process remains running but becomes unresponsive. This is a deadlock, not a crash.

Environment

Root Cause Analysis

After investigating with the macOS sample command, I found that both SketchyBar and the Lua process are blocked in mach_msg2_trap:

SketchyBar stack:

bar_item_update
  mach_send_message
    mach_msg → mach_msg2_trap  (blocked)

Lua (sketchybar.so) stack:

callback_function
  sketchybar
    mach_send_message
      mach_msg → mach_msg2_trap  (blocked)

The deadlock occurs because:

  1. SketchyBar calls bar_item_update() which sends a Mach message to Lua
  2. The send blocks waiting for the port queue to have space
  3. Lua’s callback is invoked, which calls :set()
  4. :set() sends a message back to SketchyBar
  5. This send also blocks because SketchyBar’s port queue is full (SketchyBar is waiting, not consuming)
  6. Deadlock: Both sides are blocked on send, neither can process incoming messages
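The sequence above can be sketched as a toy state machine. This is a simplified model, not the actual Mach code: peer_t, step(), give_up_send(), and the QUEUE_LIMIT value are hypothetical names chosen for illustration. It assumes each side has a bounded inbox and can only drain that inbox while it is not blocked inside a send.

```c
#include <assert.h>
#include <stdbool.h>

enum { QUEUE_LIMIT = 5 };  /* hypothetical inbox capacity, chosen for illustration */

/* One side of the connection: SketchyBar or the Lua process. */
typedef struct {
    int  inbox;          /* messages queued on this side's receive port */
    bool stuck_in_send;  /* true while blocked inside a send to the peer */
} peer_t;

/* Enqueue one message into dst's inbox if it has room. */
bool try_send(peer_t *dst) {
    assert(dst->inbox <= QUEUE_LIMIT);
    if (dst->inbox == QUEUE_LIMIT) return false;
    dst->inbox++;
    return true;
}

/* One scheduling round. Returns true if either side made progress. */
bool step(peer_t *a, peer_t *b) {
    bool progress = false;
    /* A blocked sender completes its send once the peer's inbox has room. */
    if (a->stuck_in_send && try_send(b)) { a->stuck_in_send = false; progress = true; }
    if (b->stuck_in_send && try_send(a)) { b->stuck_in_send = false; progress = true; }
    /* Only a side that is not stuck in a send can consume from its own inbox. */
    if (!a->stuck_in_send && a->inbox > 0) { a->inbox--; progress = true; }
    if (!b->stuck_in_send && b->inbox > 0) { b->inbox--; progress = true; }
    return progress;
}

/* Models a send timeout: the pending message is dropped and the sender resumes. */
void give_up_send(peer_t *p) { p->stuck_in_send = false; }
```

Starting with both inboxes full and both sides stuck in a send, step() never reports progress, which is the deadlock in step 6. Calling give_up_send() on either side lets that side drain its inbox, which in turn unblocks the other side's send.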

The root issue is that mach_send_message() in both src/mach.c (SketchyBar) and src/mach.h (SbarLua) passes MACH_MSG_TIMEOUT_NONE, so a send blocks indefinitely when the destination port queue is full.

Proposed Workaround

Add the MACH_SEND_TIMEOUT flag with a 100ms timeout to prevent infinite blocking:

// Before
mach_msg(&msg.header,
         MACH_SEND_MSG,
         ...
         MACH_MSG_TIMEOUT_NONE,
         ...);

// After
mach_msg_return_t ret = mach_msg(&msg.header,
         MACH_SEND_MSG | MACH_SEND_TIMEOUT,
         ...
         100,  // 100ms timeout
         ...);
if (ret != MACH_MSG_SUCCESS) {
    // Clean up response_port if allocated
    return NULL;
}

The 100ms value was chosen to match the existing receive timeout in mach_receive_message() (SketchyBar side). It's long enough to avoid spurious timeouts under normal load, yet short enough to recover quickly when contention occurs.
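To show the "wait a bounded time for queue space, then give up" pattern in a form that can be run outside SketchyBar, here is a portable analog using a POSIX pipe instead of a Mach port. This is a swapped-in technique, not the Mach API: the pipe stands in for the bounded port queue, poll() stands in for the send timeout, and send_result_t, send_with_timeout(), fill_pipe(), drain_pipe(), and CHUNK are hypothetical names. In the Mach version, a timed-out send is reported through the mach_msg return code MACH_SEND_TIMED_OUT.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of any Mach API. */
typedef enum { SEND_OK, SEND_TIMED_OUT, SEND_ERROR } send_result_t;

enum { CHUNK = 512 };  /* POSIX guarantees PIPE_BUF >= 512, so such writes are all-or-nothing */

/* Switch a descriptor to non-blocking mode; returns 0 on success. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Write CHUNK-sized messages until the pipe refuses more; returns how many fit. */
int fill_pipe(int wfd) {
    char msg[CHUNK] = {0};
    int sent = 0;
    while (write(wfd, msg, sizeof msg) == (ssize_t)sizeof msg) sent++;
    return sent;
}

/* Read and discard everything currently queued; returns the bytes drained. */
long drain_pipe(int rfd) {
    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(rfd, buf, sizeof buf)) > 0) total += n;
    return total;
}

/* Wait at most timeout_ms for room, then send one message or give up. */
send_result_t send_with_timeout(int wfd, const char *msg, size_t len, int timeout_ms) {
    assert(len <= (size_t)CHUNK);  /* keep each message within one atomic pipe write */
    struct pollfd pfd = { .fd = wfd, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0) return SEND_TIMED_OUT;  /* the queue stayed full for the whole wait */
    if (ready < 0)  return SEND_ERROR;
    ssize_t n = write(wfd, msg, len);
    if (n == (ssize_t)len) return SEND_OK;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) return SEND_TIMED_OUT;
    return SEND_ERROR;
}
```

With both pipe ends set non-blocking, calling fill_pipe() and then send_with_timeout(wfd, msg, CHUNK, 100) returns SEND_TIMED_OUT after roughly 100ms instead of hanging. Once the reader calls drain_pipe(), the same send succeeds. The caller decides what to do with the message that timed out, which is where the dropped-message trade-off comes from.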

Caveat

This workaround may violate the original design intent: it changes the messaging semantics from guaranteed delivery to best effort, so messages can be dropped under load.

A more comprehensive fix (e.g., asynchronous callbacks or a redesigned message queue) would require architectural changes across both repositories. This patch is a minimal, localized workaround.

Testing

I have tested this workaround locally:

  • Normal usage works correctly
  • High-frequency events (200ms intervals) work without freezing
  • Extreme stress test (50ms intervals) causes temporary slowdown but recovers after stress ends (vs permanent deadlock before)

Questions

  1. Is this deadlock scenario something you’d like to address?
  2. If so, would you prefer this minimal workaround, or a different approach?
  3. Since this requires changes to both repositories (SketchyBar and SbarLua), should I submit two separate PRs?

Related files:

  • SketchyBar: src/mach.c - mach_send_message() function
  • SbarLua: src/mach.h - mach_send_message() function
