[kernel-1116] browser logging: Event Schema & Pipeline#184
[kernel-1116] browser logging: Event Schema & Pipeline#184archandatta wants to merge 5 commits intomainfrom
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| mu sync.RWMutex | ||
| buf []BrowserEvent | ||
| head int // next write position (mod cap) | ||
| count int // items currently stored (0..cap) |
There was a problem hiding this comment.
RingBuffer count field is maintained but never read
Low Severity
The count field in RingBuffer is declared, incremented in Publish, but its value is never consumed by any reader-facing logic. The oldestSeq function — the only place that needs to know buffer occupancy — derives the answer from written and len(rb.buf) instead. This makes count dead state that adds confusion about the ring buffer's invariants.
Additional Locations (1)
09ed5ed to
29f2bbf
Compare
rgarcia
left a comment
There was a problem hiding this comment.
nice direction on the pipeline/schema. one thing i'd consider before cementing BrowserEvent v1: if this is eventually going to drive both capture controls (for example: "turn on cdp console only" or "turn on network request/response capture at headers-only detail") and subscriptions, it might be worth making those selector dimensions first-class in the envelope instead of encoding too much into a single type string.
concretely, i think i'd lean toward:
- keeping the primary event identity semantic, e.g.
console.log,network.request,input.click - adding explicit provenance fields like
source_kind(cdp,kernel_api,extension,local_process) plussource_name/source_event(for exampleRuntime.consoleAPICalled) - adding an explicit
detail_level(minimal,default,verbose,raw) - possibly making
categoryfirst-class too instead of deriving it from the type prefix
i probably would not use raw Runtime.* / Network.* as the primary type, since that makes future non-cdp producers feel awkward/second-class. i think the semantic-type + provenance split ages better if we later want to emit events from things like:
- third-party extensions running in the browser and talking to localhost
- vm-local helper processes/programs running alongside the browser
- server/api-driven tool actions like screenshot/input/recording events
that shape also gives the system a much more natural control surface for both capture config and subscriptions, since selectors can operate directly on stable fields like category, topic, source_kind, and detail_level instead of needing to parse overloaded event names.
Sayan-
left a comment
There was a problem hiding this comment.
focused review on the pipeline since raf had some feedback to the event schema!
| // Writers never block regardless of reader count or speed. | ||
| // Readers track their position by seq value (not ring index) and receive an | ||
| // events_dropped synthetic BrowserEvent when they fall behind the oldest retained event. | ||
| type RingBuffer struct { |
There was a problem hiding this comment.
makes sense to handroll. the combination of non-blocking writes with eviction, multi-reader fan-out with independent positions, synthetic drop notification, and context-aware blocking reads is pretty niche, no standard Go library covers it without significant wrapping.
| // FileWriter is a per-category JSONL appender. It opens each log file lazily on | ||
| // first write (O_APPEND|O_CREATE|O_WRONLY) and serialises concurrent writes | ||
| // within a category with a single mutex. | ||
| type FileWriter struct { |
There was a problem hiding this comment.
doc comment says "serialises concurrent writes within a category" but the mutex is global across all categories. worth either making the comment accurate or switching to per-category mutexes.
| return fmt.Errorf("filewriter: marshal: %w", err) | ||
| } | ||
|
|
||
| var buf bytes.Buffer |
There was a problem hiding this comment.
nit: the bytes.Buffer alloc per write is unnecessary since we're under the mutex. can just f.Write(data) + f.Write([]byte{'\n'}) directly.
|
|
||
| // TestConcurrentReaders: 3 readers subscribe before publish; publish 5 events; | ||
| // each reader independently reads all 5; no reader affects another. | ||
| func TestConcurrentReaders(t *testing.T) { |
There was a problem hiding this comment.
would be good to add a test where publishing and reading happen in parallel to exercise the locking/contention paths under go test -race.


Note
Medium Risk
Adds a new concurrent event ingestion pipeline with file I/O and truncation behavior; risks are mainly around correctness under load (dropped-event signaling, write ordering, and silent file-write failures).
Overview
Introduces a new
server/lib/eventspackage implementing the browser logging pipeline end-to-end: a canonicalBrowserEventschema with category routing and size-based truncation, a non-blocking fan-outRingBufferthat can emit a syntheticevents_droppedevent when readers fall behind, and a per-category JSONLFileWriterthat lazily opens and appends to*.logfiles.Adds a
Pipelinewrapper that stampsCaptureSessionID, monotonicSeq, andTs, applies truncation before both sinks, writes durably to disk before publishing in-memory, and provides reader subscriptions; comprehensive tests cover JSON encoding, routing, concurrency, overflow/drop behavior, and truncation limits.Written by Cursor Bugbot for commit 29f2bbf. This will update automatically on new commits. Configure here.