Background
The story feature currently sends the LLM a list of the top N conversations (sorted by bytes) plus a protocol/category breakdown. The conversation cap (STORY_MAX_CONVERSATIONS=20) was questioned as potentially deceptive — the LLM only sees a slice of traffic and has no sense of what it's missing.
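The selection logic described above can be sketched as follows. This is a minimal illustration, not the actual implementation; the `Conversation` dataclass and its fields are assumptions, while `STORY_MAX_CONVERSATIONS` is the cap named in the text.

```python
from dataclasses import dataclass

STORY_MAX_CONVERSATIONS = 20  # the cap questioned as potentially deceptive

@dataclass
class Conversation:
    src: str
    dst: str
    protocol: str
    bytes: int

def select_for_prompt(conversations):
    """Top N conversations by bytes; everything below the cap is silently dropped."""
    ranked = sorted(conversations, key=lambda c: c.bytes, reverse=True)
    return ranked[:STORY_MAX_CONVERSATIONS]
```

The key property is that the LLM receives no signal about how much traffic falls outside the returned slice.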
A packet-budget approach was considered but rejected: the LLM never sees raw packets anyway (it sees one summary line per conversation), so a packet count is a poor proxy for prompt quality. More importantly, LLMs don't process long repetitive lists well — attention degrades over distance and the model anchors on the first and last entries, treating the middle as noise. Feeding 50 nearly-identical conversation lines may actually produce worse narratives than a well-structured list of 20.
Core question
What prompt structure actually helps an LLM reason well about network traffic?
The hypothesis is that the LLM needs the shape of the traffic — not an enumeration of flows. Specifically:
- What protocols/apps dominate (covered)
- What the outliers are — risky, unusual, or anomalous flows (partially covered)
- Aggregate structure: top destinations, unique external hosts, protocol × risk distribution
- A sense of coverage — what fraction of traffic is represented
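The aggregates listed above could be computed in one pass over the conversation set. This is a hedged sketch of what such a "shape of the traffic" summary might look like; the dict field names (`dst`, `protocol`, `bytes`, `risk`, `external`) are illustrative assumptions, not an existing schema.

```python
from collections import Counter

def traffic_shape(conversations, included):
    """Summarize structure: top destinations, unique external hosts,
    protocol x risk distribution, and the byte-coverage of the included slice."""
    total = sum(c["bytes"] for c in conversations) or 1
    shown = sum(c["bytes"] for c in included)
    top_dsts = Counter()
    proto_risk = Counter()
    external = set()
    for c in conversations:
        top_dsts[c["dst"]] += c["bytes"]
        proto_risk[(c["protocol"], c.get("risk", "normal"))] += 1
        if c.get("external"):
            external.add(c["dst"])
    return {
        "top_destinations": top_dsts.most_common(5),
        "unique_external_hosts": len(external),
        "protocol_risk_distribution": dict(proto_risk),
        "coverage": shown / total,  # fraction of bytes the prompt actually represents
    }
```

A summary like this stays a few lines long regardless of flow count, so it sidesteps the long-list attention problem while directly answering "what is the model missing".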
Research tasks
Goal
Produce a concrete recommendation for how to restructure the story prompt — what to add, what to remove or compress, and what the ideal conversation-list size is given typical LLM attention behaviour.