
feat(logs)!: device-local file logger; remove cloud shipper#85

Open
okdistribute wants to merge 9 commits into main from feat/device-local-logs

@okdistribute
Contributor

Summary

Replaces the cloud log shipper with a device-local rolling file appender. Adds iroh_services::logs::file_layer(config) that writes JSON records to a tracing-appender rolling file, and refactors LogCollector so the cloud-controlled EnvFilter reload handle applies to the file layer. The PutLogs RPC, LogLine / SpanInfo / FieldValue wire types, and the buffer-and-flush plumbing are gone.

The SetLogLevel path is unchanged — the cloud still pushes EnvFilter directives to running clients; they just adjust what reaches the local file instead of what gets buffered for shipment.

Breaking — bumps 0.14 → 0.15. Existing consumers calling logs::install() / logs::layer() / ClientBuilder::with_log_collection(...) need to migrate (see updated examples/logs.rs).

Why

Storing log content in the cloud was expensive (per-project disk caps, trimming, etc.), and the customer's logs already live on their machine, so there is no reason to ship them off-device just so the dashboard can show them. Operators already have tail, journalctl, log aggregators, and similar tools.

Test plan

  • cargo test --lib logs:: covers (a) an unfiltered file_layer writing through to disk and (b) the cloud filter end-to-end: it starts at off, set_filter("info") lets records through, and dropping the WorkerGuard flushes them to the file.
  • cargo run --example logs writes records to ./logs/iroh-services.* once the cloud pushes a SetLogLevel.
  • Downstream PR in n0des (feat/device-local-logs) wires this up end-to-end.
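
The filter behaviour that test (b) exercises can be modelled without any tracing dependencies. The sketch below is a simplified stand-in, not the crate's code: `FilterHandle`, `FilteredSink`, and the single-level directive parsing are hypothetical names and simplifications (a real EnvFilter accepts per-target directives), and the sink is an in-memory `Vec` rather than a rolling file.

```rust
use std::sync::{Arc, RwLock};

/// Severity levels, ordered from least to most verbose.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Hypothetical stand-in for a reload handle: a shared, swappable
/// maximum level. `None` means "off" (nothing passes).
#[derive(Clone)]
struct FilterHandle(Arc<RwLock<Option<Level>>>);

impl FilterHandle {
    /// Starts with everything filtered out, like the test's initial state.
    fn off() -> Self {
        FilterHandle(Arc::new(RwLock::new(None)))
    }

    /// Swaps the active filter. Only bare level names are understood here.
    fn set_filter(&self, directive: &str) -> Result<(), String> {
        let parsed = match directive.to_ascii_lowercase().as_str() {
            "off" => None,
            "error" => Some(Level::Error),
            "warn" => Some(Level::Warn),
            "info" => Some(Level::Info),
            "debug" => Some(Level::Debug),
            "trace" => Some(Level::Trace),
            other => return Err(format!("unsupported directive: {other}")),
        };
        *self.0.write().unwrap() = parsed;
        Ok(())
    }

    /// A record passes when its level is at or below the current maximum.
    fn enabled(&self, level: Level) -> bool {
        let max = *self.0.read().unwrap();
        max.map_or(false, |m| level <= m)
    }
}

/// In-memory sink standing in for the file layer.
struct FilteredSink {
    handle: FilterHandle,
    lines: Vec<String>,
}

impl FilteredSink {
    fn new(handle: FilterHandle) -> Self {
        FilteredSink { handle, lines: Vec::new() }
    }

    /// Appends the record only if the shared filter currently allows it.
    fn log(&mut self, level: Level, msg: &str) {
        if self.handle.enabled(level) {
            self.lines.push(format!("{:?} {}", level, msg));
        }
    }
}
```

Because the handle is cloned into the sink, a later `set_filter` call on the original handle changes what the sink accepts without rebuilding it, which is the property the reload handle provides to the file layer.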

🤖 Generated with Claude Code

okdistribute and others added 6 commits April 14, 2026 16:53
New opt-in API: iroh_services::logs::file_layer(config) returns a
tracing Layer that writes JSON records to a rolling file plus a
WorkerGuard the caller must hold for the lifetime of the process. The
FileLoggerConfig builder takes the destination directory and tunes
rotation, file name prefix, and retention.

The existing buffer + shipper API is untouched and still works. This
is the foundation for moving log persistence from cloud to device:
PR 2 will switch the shipper off in favour of this layer.

Adds tracing-appender 0.2 and tempfile (dev) deps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

BREAKING: removes the buffer-and-ship path. Logs are now written to
local rolling files on the device; the cloud no longer stores log
content. The directive-override path (SetLogLevel) is unchanged and now
controls what reaches the local file.

logs.rs: drop LogCollector::buffered/drain, BufferLayer, RingBuffer,
FieldVisitor, DEFAULT_BUFFER_CAPACITY, DEFAULT_RATE_PER_SECOND. The
collector now wraps the reload handle only and applies the cloud-
controlled EnvFilter to the file_layer. install(config) and
layer(config) take FileLoggerConfig and return a WorkerGuard.

client.rs: drop with_log_collection, log_flush_interval, log_max_batch
builder methods; drop log_collector/log_flush_interval/log_max_batch
fields; drop _log_flush_task; drop ClientActorMessage::PutLogs and
ClientActor::put_logs; drop run_log_flush.

protocol.rs: drop LogLine, SpanInfo, FieldValue, PutLogs and the
IrohServicesProtocol::PutLogs variant.

Examples updated; tests cover both the unfiltered file_layer and the
cloud-filtered layer end-to-end.

Bumps version 0.14 -> 0.15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@github-actions

github-actions Bot commented May 14, 2026

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh-services/pr/85/docs/iroh_services/

Last updated: 2026-05-14T10:38:30Z

okdistribute and others added 3 commits May 14, 2026 10:48
Adds a new cloud-to-endpoint streaming RPC on ClientHostProtocol:
the cloud asks for the contents of the endpoint's currently-active
rolling log file, the SDK reads it in 64 KiB chunks and streams the
bytes back over an mpsc channel. Capped by an optional max_bytes on
the request.
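
A minimal sketch of the chunked read described above, assuming a synchronous `std::sync::mpsc` channel in place of whatever channel the SDK uses. `stream_file_chunks` and `CHUNK_SIZE` are hypothetical names; only the 64 KiB chunk size and the optional `max_bytes` cap come from this commit message.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use std::sync::mpsc::Sender;

/// Chunk size named in the commit message (64 KiB).
const CHUNK_SIZE: usize = 64 * 1024;

/// Reads `path` in chunks of at most `CHUNK_SIZE` bytes and sends each
/// chunk on `tx`. Stops at end of file, once `max_bytes` bytes have been
/// sent, or when the receiver has been dropped. Returns the bytes sent.
fn stream_file_chunks(
    path: &Path,
    max_bytes: Option<u64>,
    tx: &Sender<Vec<u8>>,
) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut sent: u64 = 0;
    loop {
        // Shrink the read to whatever budget remains under the cap.
        let want = match max_bytes {
            Some(cap) => (cap - sent).min(CHUNK_SIZE as u64) as usize,
            None => CHUNK_SIZE,
        };
        if want == 0 {
            break; // cap reached
        }
        let n = file.read(&mut buf[..want])?;
        if n == 0 {
            break; // end of file
        }
        if tx.send(buf[..n].to_vec()).is_err() {
            break; // receiver went away
        }
        sent += n as u64;
    }
    Ok(sent)
}
```

Capping the size of each read, rather than truncating after the fact, means the function never reads more of the file than the request allows.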

New LogsCap::Fetch capability gates the call; missing caps surface as
a terminal MissingCapability item on the stream. With no log_collector
configured the endpoint returns an AuthError chunk.

LogCollector now carries its rolling appender's log_dir and
file_name_prefix so client_host can locate the newest file (mtime-
ordered, prefix match) without extra config.
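
The lookup can be sketched with the standard library alone. `newest_log_file` is a hypothetical name for illustration; the only details taken from this commit are the prefix match and the mtime ordering.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Returns the most recently modified regular file in `log_dir` whose
/// name starts with `file_name_prefix`, or `None` if nothing matches.
fn newest_log_file(log_dir: &Path, file_name_prefix: &str) -> io::Result<Option<PathBuf>> {
    let mut newest: Option<(SystemTime, PathBuf)> = None;
    for entry in fs::read_dir(log_dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        let name = entry.file_name();
        // Skip directories and files that do not carry the prefix.
        if !meta.is_file() || !name.to_string_lossy().starts_with(file_name_prefix) {
            continue;
        }
        let modified = meta.modified()?;
        let is_newer = match &newest {
            Some((best, _)) => modified > *best,
            None => true,
        };
        if is_newer {
            newest = Some((modified, entry.path()));
        }
    }
    Ok(newest.map(|(_, path)| path))
}
```

Ordering by mtime rather than by file name avoids depending on the rotation suffix format, at the cost of trusting filesystem timestamps.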

Two integration tests exercise (a) end-to-end streaming via QUIC over
the actual ClientHostProtocol, and (b) the missing-cap rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the cloud's push-with-sleep dance for restoring per-endpoint
log directives on reconnect. The cloud's spawn_initial_set_log_level
existed because the dial-back over CLIENT_HOST_ALPN couldn't run until
the client had landed its GrantCap(LogsCap::SetLevel), so the cloud
slept 2s and hoped the grant beat it. Race-prone and slow.

Inverted: the client now pulls. Right after Auth succeeds in
ClientActor, it RPCs the cloud with GetLogLevel and applies the result
to its LogCollector. No sleeps, no race, no LogsCap::SetLevel needed
for initial state — the call is the client reading its own setting.

Re-introduces ClientBuilder::with_log_collector(collector) so consumers
opt into the pull behaviour. Updates the logs example.

The dashboard-triggered live override path (SetLogLevel pushed via
ClientHost) is unchanged. It still requires LogsCap::SetLevel, but by
the time the operator clicks Apply the connection is long up and the
grant is on file, so there is no race to avoid.

New protocol types:

- IrohServicesProtocol::GetLogLevel — client-to-cloud RPC.
- LogLevelSettings { directives, expires_in_secs?, revert_to? } — same
  shape as SetLogLevel so the client applies through the same
  set_filter code path.
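
As a rough illustration of the shape listed above, the sketch below mirrors the three fields and shows one plausible way a client could decide which directives are in effect. The field types, the `effective_directives` helper, and the expiry semantics (fall back to `revert_to`, else to a caller-supplied default, once `expires_in_secs` has elapsed) are all assumptions; the commit message only names the fields and marks the last two as optional.

```rust
/// Mirror of the settings shape named in the commit message.
/// Field types are assumed, not taken from the crate.
#[derive(Clone, Debug, PartialEq)]
struct LogLevelSettings {
    /// EnvFilter-style directive string, e.g. "info".
    directives: String,
    /// Optional lifetime of the override, in seconds.
    expires_in_secs: Option<u64>,
    /// Optional directives to fall back to after expiry.
    revert_to: Option<String>,
}

/// Hypothetical helper: returns the directives that should be active
/// `elapsed_secs` after the settings were applied.
fn effective_directives(
    settings: &LogLevelSettings,
    elapsed_secs: u64,
    fallback: &str,
) -> String {
    match settings.expires_in_secs {
        // Assumed semantics: once expired, prefer `revert_to`, else `fallback`.
        Some(ttl) if elapsed_secs >= ttl => settings
            .revert_to
            .clone()
            .unwrap_or_else(|| fallback.to_string()),
        // Not expired, or no expiry set: the override stays in effect.
        _ => settings.directives.clone(),
    }
}
```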

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…face

A pass of review cleanups:

- LogLevelSettings was a structural copy of SetLogLevel (same three
  fields, same client-side application via set_filter). Dropped the
  duplicate; GetLogLevel now returns Option<SetLogLevel> and the
  doc-comment on SetLogLevel notes it carries both directions.

- LogCollector::log_dir() / file_name_prefix() were public but only
  used internally to compute current_log_file(). Narrowed
  current_log_file() to pub(crate) (only the ClientHost handler reads
  it) and inlined the two getters into its body.

- Removed the dead InstallError::InvalidDirectives variant left over
  from the deleted shipper.

- Fixed a stale comment in the install doctest that still described
  the pre-pull push semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>