Support centralised logging and metrics in cardano-testnet via cardano-tracer #6490

@carbolymer

Summary

Currently, enabling the Prometheus/PrometheusSimple backend in cardano-testnet is not practical for multi-node testnets because all nodes share the same configuration. This causes port collisions since every node attempts to listen on the same PrometheusSimple endpoint (hardcoded to 0.0.0.0:12798). More broadly, there is no centralised logging — each node logs independently to its own stdout/file.

The relevant code is commented out in cardano-testnet/src/Testnet/Defaults.hs (line 306) with a note explaining the limitation and suggesting cardano-tracer as the proper solution.

Problem

  • All testnet nodes share a single config, so there is no way to assign unique Prometheus ports per node — only single-node testnets can enable Prometheus without collisions.
  • Each node logs independently (Katip file/stdout scribes). There is no unified view of traces across a multi-node testnet.
  • Developers debugging or monitoring multi-node testnets lack an easy built-in metrics endpoint and centralised log aggregation.

Proposed solution: integrate cardano-tracer

cardano-tracer is purpose-built for this. It acts as a centralised aggregator that connects to multiple nodes and provides both unified logging and a single Prometheus endpoint with per-node sub-routes.

Architecture

testnet nodes (N)                    cardano-tracer (1 process)
+------------+                       +---------------------------+
| node-spo1  |--\                    |  Accepts on local socket  |
| node-spo2  |----->  forwarder.sock |                           |
| node-spo3  |--/                    |  Exposes:                 |
+------------+                       |   - Prometheus :3200      |
                                     |   - Per-node log dirs     |
                                     +---------------------------+
                                     /tracer-logs
                                       /node-spo1/node.json
                                       /node-spo2/node.json
                                       /node-spo3/node.json
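
To make the diagram concrete, here is a minimal sketch of the values the testnet would have to choose (socket, log root, Prometheus port) and how they could be rendered into a tracer config. The `TracerConfig` record and all function names are hypothetical. The JSON key names (`networkMagic`, `network`, `logging`, `hasPrometheus`, `epHost`, `epPort`) reflect my recollection of the cardano-tracer configuration format and should be checked against its documentation. Binding to `127.0.0.1` is also an assumption.

```haskell
import Data.List (intercalate)

-- | Hypothetical record of the values the testnet would pick at startup.
data TracerConfig = TracerConfig
  { tcNetworkMagic :: Int       -- ^ testnet magic shared with the nodes
  , tcSocketPath   :: FilePath  -- ^ local socket the tracer accepts on
  , tcLogRoot      :: FilePath  -- ^ root of the per-node log directories
  , tcPromPort     :: Int       -- ^ port of the single Prometheus endpoint
  } deriving (Eq, Show)

-- | Quote a string as a JSON literal. Only backslashes and double quotes
-- are escaped, which is enough for ordinary file paths.
jsonString :: String -> String
jsonString s = "\"" ++ concatMap escape s ++ "\""
  where
    escape '"'  = "\\\""
    escape '\\' = "\\\\"
    escape c    = [c]

-- | Render the config as JSON text.
-- ASSUMPTION: key names follow the cardano-tracer docs as I remember them.
renderTracerConfig :: TracerConfig -> String
renderTracerConfig cfg = unlines
  [ "{"
  , "  \"networkMagic\": " ++ show (tcNetworkMagic cfg) ++ ","
  , "  \"network\": {\"tag\": \"AcceptAt\", \"contents\": "
      ++ jsonString (tcSocketPath cfg) ++ "},"
  , "  \"logging\": [{\"logRoot\": " ++ jsonString (tcLogRoot cfg)
      ++ ", \"logMode\": \"FileMode\", \"logFormat\": \"ForMachine\"}],"
  , "  \"hasPrometheus\": {\"epHost\": \"127.0.0.1\", \"epPort\": "
      ++ show (tcPromPort cfg) ++ "}"
  , "}"
  ]

-- | Where one node's log is expected to land, following the layout in the
-- diagram above: <log root>/<node name>/node.json.
nodeLogPath :: TracerConfig -> String -> FilePath
nodeLogPath cfg nodeName =
  intercalate "/" [tcLogRoot cfg, nodeName, "node.json"]
```

Loaded in GHCi, `putStr (renderTracerConfig (TracerConfig 42 "forwarder.sock" "tracer-logs" 3200))` prints the config, and `nodeLogPath` gives tests a single place to look up a node's log file. A real implementation would more likely build an Aeson `Value`, as the rest of `Defaults.hs` does for node configuration.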

Implementation outline

  1. New CLI flag — add an --enable-tracer option to CardanoTestnetOptions, following the existing --enable-grpc / RpcSupport pattern in Testnet/Start/Types.hs.

  2. Spawn cardano-tracer as an auxiliary process — following the existing SubmitApi pattern in Testnet/SubmitApi.hs:

    • Generate a tracer config (AcceptAt on a local socket, logging to a per-testnet directory, Prometheus on a free port).
    • Spawn cardano-tracer --config <path> before starting nodes.
    • Register it for cleanup with the existing MonadResource / SIGINT handler infrastructure.
  3. Add tracer socket arg to each node — in Testnet/Start/Cardano.hs, append --tracer-socket-path-connect <socket> to each node's CLI args. All nodes connect to the same socket, so no per-node config divergence is needed.

  4. Clean up commented-out code — the PrometheusSimple workaround in Defaults.hs:306-329 becomes obsolete and can be removed.
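
Steps 1 to 3 could be sketched roughly as follows. This is an illustration under assumptions, not the actual cardano-testnet code: `TracerSupport`, `parseTracerSupport`, `tracerCommand` and `nodeTracerArgs` are hypothetical names, and the real flag parsing would go through the existing option parser rather than a plain list lookup. Only the `--enable-tracer` flag name, the `cardano-tracer --config <path>` invocation and the `--tracer-socket-path-connect` argument come from the outline above.

```haskell
-- | Hypothetical on/off switch, modelled on the RpcSupport pattern.
data TracerSupport = TracerDisabled | TracerEnabled
  deriving (Eq, Show)

-- | Stand-in for the real CLI parser: the tracer is enabled when the
-- proposed --enable-tracer flag is present.
parseTracerSupport :: [String] -> TracerSupport
parseTracerSupport args
  | "--enable-tracer" `elem` args = TracerEnabled
  | otherwise                     = TracerDisabled

-- | Executable and arguments for the auxiliary tracer process, which has
-- to be started before any node.
tracerCommand :: FilePath -> (FilePath, [String])
tracerCommand configPath = ("cardano-tracer", ["--config", configPath])

-- | Extra CLI arguments appended to every node. All nodes receive the
-- same socket path, so the shared node config stays unchanged.
nodeTracerArgs :: TracerSupport -> FilePath -> [String]
nodeTracerArgs TracerDisabled _          = []
nodeTracerArgs TracerEnabled  socketPath =
  ["--tracer-socket-path-connect", socketPath]
```

Keeping `nodeTracerArgs` total over `TracerSupport` means the node start-up code can append its result unconditionally, and a testnet started without the flag behaves exactly as it does today.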

What this enables

  • Centralised logs: Per-node subdirectories under a single root, with rotation — one place to look at all testnet traces.
  • Single Prometheus endpoint: Lists all connected nodes at the root, each with its own metrics sub-route.
  • Prometheus service discovery: GET /targets for dynamic scraping configurations.
  • No port collisions: Only the tracer binds network ports, not individual nodes.
  • Scales to any node count: AcceptAt mode requires zero tracer config changes when adding nodes.

Files likely affected

  • Testnet/Start/Types.hs — new option type
  • Parsers/Cardano.hs — CLI flag parsing
  • Testnet/Start/Cardano.hs — spawn tracer process, add --tracer-socket-path-connect to node args
  • Testnet/Defaults.hs — tracer config generation, remove commented-out PrometheusSimple code
