fix: AI assistant not showing logs when run selected mid-session #4475
midigofrank merged 5 commits into main
follow_run_id was only stored in session meta during channel JOIN. When a user created a session without a run selected and later picked one, the backend silently dropped follow_run_id from new_message params, so maybe_add_run_logs never found it and context.log was null. Now follow_run_id is stored per-message on message meta (same pattern as unsaved_job), and the Oban worker propagates it to the in-memory session before enrichment.
Add tests to prevent regression of the bug where logs weren't sent to Apollo when users selected a run mid-session. Tests verify that follow_run_id is properly stored in message.meta and propagated during message processing for log enrichment.
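
The two commit messages above describe storing `follow_run_id` per message and copying it onto the in-memory session before enrichment. A minimal sketch of that flow, using plain maps rather than the real Ecto schemas: the module name `FollowRunSketch`, the function `propagate_to_session/2`, and the string-keyed meta are assumptions for illustration, not the code in this PR.

```elixir
defmodule FollowRunSketch do
  # Hypothetical stand-in for the channel helper: copy follow_run_id from the
  # incoming new_message params onto the message meta, if one was sent.
  def maybe_put_follow_run_id_in_meta(meta, %{"follow_run_id" => run_id})
      when is_binary(run_id) and run_id != "" do
    Map.put(meta, "follow_run_id", run_id)
  end

  # No run selected: leave the message meta untouched.
  def maybe_put_follow_run_id_in_meta(meta, _params), do: meta

  # Hypothetical stand-in for the worker step: before enrichment, copy the
  # per-message follow_run_id onto the in-memory session meta.
  def propagate_to_session(session, %{meta: %{"follow_run_id" => run_id}}) do
    %{session | meta: Map.put(session.meta || %{}, "follow_run_id", run_id)}
  end

  # Message carries no follow_run_id: the session is returned unchanged.
  def propagate_to_session(session, _message), do: session
end
```

With this shape, a session that was joined without a run still ends up with a `follow_run_id` in its meta as soon as a later message carries one, which is what log enrichment needs in order to look up the run.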
This all seems very plausible @lmac-1 but I am still unable to reproduce this against main locally. I consistently see the logs being uploaded. I've tried following your steps several times - I must be doing something subtly wrong? I presume you've managed to repro? I for one would be happy to merge it and see. We'd have to track it closely in prod.

Loom here to see it in action before and after the fix. The steps are a bit fiddly. https://www.loom.com/share/a569fbb2e34f4fb3bb006aea028ec50d

Ah fantastic @lmac-1 thank you - I'll take a look in the morning

That's amazing @lmac-1 I can repro!!!
josephjclark left a comment
Cannot comment on the technical solution but I've tested this and verified
What makes this hard is a litany of bugs around logging (from being able to tick Send Logs when none are available in the first place, to the assistant happily lying about the logs it receives). And the use-case reproduced feels a bit niche - is this the problem that's being reported in production?
But @lmac-1 has 100% identified a bug and 100% fixed it as far as I'm concerned, so it's a Yes from me, Jim.
@lmac-1 looks like you have one failing test.
The regression test added in 5a22a97 had three setup issues:
- Missing workflow and project in test context
- Missing required dataclip for run creation
- Message lookup found first user message instead of the specific message with follow_run_id

Fix by using workflow/project from context, creating a dataclip, and finding the message by content to get the correct one.
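
The third issue in the commit message above, picking the first user message rather than the intended one, can be sketched as follows. The message maps and their contents are invented for illustration and do not come from the actual test file.

```elixir
# Hypothetical session history: an earlier user message without a run,
# then a later one sent after a run was selected.
messages = [
  %{role: :user, content: "first question", meta: %{}},
  %{role: :assistant, content: "an answer", meta: %{}},
  %{role: :user, content: "why did this run fail?", meta: %{"follow_run_id" => "run-123"}}
]

# Fragile lookup: returns the first user message, which has no follow_run_id.
first_user = Enum.find(messages, &(&1.role == :user))

# Specific lookup: match on the content the test itself sent.
target = Enum.find(messages, &(&1.content == "why did this run fail?"))
```

Matching on content ties the assertion to the message under test, so unrelated setup messages in the same session cannot satisfy or break it.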
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | main | #4475 | +/- |
|---|---|---|---|
| Coverage | 89.41% | 89.42% | |
| Files | 425 | 425 | |
| Lines | 20245 | 20253 | +8 |
| Hits | 18102 | 18111 | +9 |
| Misses | 2143 | 2142 | -1 |
@midigofrank tests are now fixed. Thanks!
* feat: add SearchParams and list_channel_requests/3 to Channels context

  Introduces Lightning.Channels.SearchParams, an embedded schema that parses and validates channel request filter params from URL query strings. Adds list_channel_requests/3 to the Channels context, returning a paginated Scrivener.Page of ChannelRequest structs with :channel and filtered :channel_events preloaded (source_received and error types only).

* feat: add Channel Requests log page

  Adds a paginated Channel Requests page at /projects/:id/channels/requests with a changeset-backed channel filter dropdown. The page uses SearchParams and list_channel_requests/3 from the Channels context to display request ID, path, channel name, started-at timestamp, status badge, and error message. Also adds a "Channel Requests" nav item in the sidebar (experimental features only) and converts the Requests count and Last Activity cells on the Channels CRUD page into links that navigate to the requests page pre-filtered to that channel.

* refactor: use ExMachina factories in list_channel_requests test helpers

* improve channel request logs page styling and fix pagination path bug

* fix failing tests

* update changelog with 2.15.15

  closes #4477

* allow URL_HOST, URL_PORT, URL_SCHEME env vars to work in dev/test

  Previously these env vars only took effect in production. In dev/test the endpoint URL was hardcoded by config/*.exs, making it impossible to configure via env vars in dev containers (e.g. dev-elixir.local:4000). Move the unified URL config (host, port, scheme) before the prod block so it applies in all environments, with sensible per-env defaults.

* fix: return 401 consistently for channel source auth failures

  Previously, invalid credentials returned 404 while missing credentials returned 401. This was inconsistent and leaked resource existence via the 401 anyway. Simplify authenticate_source into a single function.

* fix: display request path on channel requests page

  The events preload and helper were filtering for :source_received events which are never created; the handler only writes :sink_response or :error events. Also remove em dash fallbacks for empty values.

* Support batched logs from the worker (#4174)

  * support batched logs
  * Introduce another event run:batch_logs for batch logs
  * update changelog
  * refactor to make credo happy

  Co-authored-by: Frank Midigo <midigofrank@gmail.com>

* fix: AI assistant not showing logs when run selected mid-session (#4475)

  * fix: store follow_run_id on message meta so logs reach Apollo (#4380)

    follow_run_id was only stored in session meta during channel JOIN. When a user created a session without a run selected and later picked one, the backend silently dropped follow_run_id from new_message params, so maybe_add_run_logs never found it and context.log was null. Now follow_run_id is stored per-message on message meta (same pattern as unsaved_job), and the Oban worker propagates it to the in-memory session before enrichment.

  * test: add regression tests for follow_run_id propagation (#4380)

    Add tests to prevent regression of the bug where logs weren't sent to Apollo when users selected a run mid-session. Tests verify that follow_run_id is properly stored in message.meta and propagated during message processing for log enrichment.

  * updates changelog

  * test: fix failing follow_run_id regression test (#4380)

    The regression test added in 5a22a97 had three setup issues:
    - Missing workflow and project in test context
    - Missing required dataclip for run creation
    - Message lookup found first user message instead of the specific message with follow_run_id

    Fix by using workflow/project from context, creating a dataclip, and finding the message by content to get the correct one.

  Co-authored-by: Elias W. BA <eliaswalyba@gmail.com>

* fix: move Plug.Telemetry before ChannelProxyPlug so proxied requests get instrumented

* feat: add LOG_QUEUE_QUERIES config to control queue claim log noise

  The run claim polling generates heavy Ecto debug output in development. Add a LOG_QUEUE_QUERIES env var (default: false) that controls whether queue claim queries are logged, threading the value through all SQL calls in Runs.Queue.

* fix: update channel request index tests after template changes

  - source_event_path now queries :sink_response events (not :source_received)
  - Dash placeholder test updated since template renders nil instead of em-dashes

* fix: add @moduletag :capture_log to all AI assistant test files

  AI assistant tests were leaking error-level log messages into unrelated test output during concurrent async execution. The MessageProcessor Oban worker and AiAssistant module log errors when AI queries fail (e.g. "AI query failed for session", "[MessageProcessor] Failed to process message"), and with Oban configured as `testing: :inline`, these jobs execute synchronously within the test process.

  While many individual tests had `@tag :capture_log`, this only captures logs from the tagged test's own process. Logs from spawned processes or PubSub callbacks could escape capture and appear in the output of completely unrelated tests running concurrently, such as channel request LiveView tests that have nothing to do with AI.

  Adding `@moduletag :capture_log` at the module level ensures all logs from every test in these files are captured, preventing cross-test log contamination. Only ai_assistant_test.exs already had this tag.

* feat: extract pill_tabs component for reusable tab bar

  Move the inline tab switcher styling into a new TabBar component (LightningWeb.Components.TabBar) with a pill_tabs/1 function component. Matches the React Tabs.tsx visual style with slate background and indigo active state. Includes unit tests for rendering, active state styling, patch paths, and multi-tab support.

* fix: change channel_snapshots FK to cascade delete on channel removal

  channel_snapshots.channel_id used on_delete: :restrict which blocked project purging when channels had snapshots. Changed to :delete_all to match the workflow_snapshots pattern.

* feat: add channel logs tab to History page

  Move channel request logs from a standalone page to a tab within the History page. Adds ChannelLogsComponent as a live_component with filtering and pagination, a new /history/channels route, and updates links in the channel index to point to the new location.

* refactor: remove standalone channel requests page

  Remove the separate ChannelRequestLive page and its nav menu item, now that channel request logs are accessible via the History page tab.

* fix: resolve credo warnings and update cascade delete test

  - Add @moduledoc to TabBar component
  - Reorder import before alias in ChannelLive modules
  - Remove obsolete foreign_key_constraint on channel_snapshots
  - Replace snapshot deletion error test with cascade delete assertion

* fix: correct channel event types and remove unused enum values

  The list_channel_requests query correctly filters to :sink_response and :error, but the docstring and test referenced :source_received. Since :source_received and :sink_request are never emitted by production code, remove them from the migration enum, Ecto schema, and test factory.

* fix: use parameterized set_config() instead of string-interpolated SET LOCAL

  Replace string interpolation in SET LOCAL work_mem with PostgreSQL's set_config() function and a bind parameter. SET LOCAL doesn't support $1 placeholders, but set_config('work_mem', $1, true) is equivalent and eliminates the SQL injection surface.

* perf: skip re-querying in ChannelLogsComponent when params unchanged

  Guard update/2 to only re-fetch channel requests when params have actually changed. Use assign_new for the channels list so it's only fetched once per component lifecycle.

* refactor: extract base_pill and add channel_state_pill component

  Split state_pill into a shared base_pill (outer styling) and domain-specific components. Add channel_state_pill with correct mappings for channel request states, notably :pending displays as "In Progress" (blue) instead of "Enqueued" (gray).

* fix: replace catch-all handle_info with explicit event clause

  Replace the blanket handle_info(_event, socket) with a clause that explicitly matches the four known PubSub events (RunCreated, RunUpdated, WorkOrderCreated, WorkOrderUpdated) on the channel_logs tab, preventing silent swallowing of unexpected messages.

* fix: move CORSPlug before Plug.Telemetry in endpoint pipeline

  Place CORSPlug before Plug.Telemetry so CORS preflight (OPTIONS) requests are halted before the telemetry span opens, keeping them out of endpoint duration metrics. Channel proxy and webhook requests remain fully captured within the span.

Co-authored-by: Stuart Corbishley <corbish@gmail.com>
Co-authored-by: Joe Clark <josephjclark@gmail.com>
Co-authored-by: Lucy Macartney <64803272+lmac-1@users.noreply.github.com>
Co-authored-by: Elias W. BA <eliaswalyba@gmail.com>
Description
Users reported that logs weren't appearing in the AI assistant even when the "Send logs" checkbox was checked. Sentry showed `context.log = null` errors.

The bug occurred when a user created a session without a run selected and later picked one mid-session.

Root cause: `follow_run_id` was only stored during session creation. When users selected a run mid-session, the frontend sent `follow_run_id` in params, but the backend wasn't updating `session.meta` with the new run context, so logs couldn't be fetched.

Closes #4380

Solution
Store and propagate `follow_run_id` during message processing:
- `maybe_put_follow_run_id_in_meta/2` helper in `AiAssistantChannel` stores `follow_run_id` from params in `message.meta`
- `MessageProcessor.update_session_with_job_context/2` propagates `follow_run_id` from `message.meta` to in-memory `session.meta`
- `maybe_add_run_logs` can find the run_id and fetch logs during enrichment

Regression tests:
- `follow_run_id` is stored in `message.meta`

Validation steps
Additional notes for the reviewer
n/a
AI Usage
Please disclose whether you've used AI anywhere in this PR (it's cool, we just want to know!):
You can read more details in our Responsible AI Policy.