Skip to content

fix: change keystore pool monitor to OTLP metric#220

Open
minh-nghia wants to merge 1 commit intomainfrom
fix/keystore-pool-as-otel
Open

fix: change keystore pool monitor to OTLP metric#220
minh-nghia wants to merge 1 commit intomainfrom
fix/keystore-pool-as-otel

Conversation

@minh-nghia
Copy link
Copy Markdown
Contributor

@minh-nghia minh-nghia commented Mar 24, 2026

Summary by CodeRabbit

  • Refactor

    • Keystore pool monitoring migrated to OpenTelemetry metrics (replacing the previous Prometheus approach).
    • Monitoring is now implemented as a background daemon and starts unconditionally at service startup, improving observability.
    • Monitoring uses interval-based caching to reduce database load and stabilize reported values.
  • Tests

    • Added integration tests that validate metric reporting and caching/refresh behavior over time.

@push-tags-from-workflow push-tags-from-workflow bot added bug Something isn't working tests labels Mar 24, 2026
@minh-nghia
Copy link
Copy Markdown
Contributor Author

@coderabbitai summary

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

✅ Actions performed

Summary regeneration triggered.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

Walkthrough

Moves keystore pool monitoring from the API server into a new internal daemon using OpenTelemetry metrics and a cached refresh loop; adds daemon implementation and tests, and removes the prior Prometheus polling and its test/export helper.

Changes

Cohort / File(s) Summary
API server entrypoint
cmd/api-server/main.go, cmd/api-server/main_test.go, cmd/api-server/export_test.go
Removed in-file Prometheus gauge and polling; deleted the test TestMonitorKeystorePoolSize; removed exported MonitorKeystorePoolSize test variable and simplified Run export; now calls daemon.MonitorKeystorePoolSize(ctx, cfg) unconditionally.
Daemon monitor implementation
internal/daemon/keystore_pool_monitor.go
Added KeystorePoolMonitor, NewKeystorePoolMonitorFromCfg, exported ErrInvalidKeystorePoolInterval, and MonitorKeystorePoolSize(ctx, cfg); sets up OTel Int64 gauge keystore_pool_available, lazy DB/repo initialization, cached-count + interval refresh loop, and logging/error handling.
Daemon monitor tests
internal/daemon/keystore_pool_monitor_test.go
New tests validating metric emissions and caching using a temporary SQL DB and an OTel metric.ManualReader; includes helpers to insert keystores and to extract the keystore_pool_available gauge value.

Sequence Diagram

sequenceDiagram
    participant API as API Server
    participant Daemon as internal/daemon
    participant OTel as OpenTelemetry Meter
    participant DB as Database

    API->>Daemon: MonitorKeystorePoolSize(ctx, cfg)
    activate Daemon
    Daemon->>DB: Start DB connection & build SQL repo/pool
    Daemon->>OTel: Create meter & register gauge callback (keystore_pool_available)
    deactivate Daemon

    loop Collection / Ticker cycle
        OTel->>Daemon: invoke callback(ctx)
        activate Daemon
        alt cached value missing or stale
            Daemon->>DB: pool.Count(ctx)
            DB-->>Daemon: count
            Daemon->>Daemon: update cachedCount, lastRead
        end
        Daemon->>OTel: Observe(cachedCount)
        deactivate Daemon
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

dev-ops

Suggested reviewers

  • MarkMcAleese
  • doanhluongsap

Poem

🐰 I hopped from main into daemon's den,

I count the keys and nap again,
A metric here, a tick, a hum,
Cached numbers waiting for the drum,
🥕 Quiet monitoring, done by a friend.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request description is completely empty, missing all required template sections including 'What this PR does', 'Special notes for reviewer', and 'Release note'. Add a complete description following the template: explain why keystore pool monitoring was refactored to use OTLP metrics, note any behavioral changes or migration considerations, and specify the release note.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: converting keystore pool monitoring to use OTLP metrics instead of the previous implementation.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/keystore-pool-as-otel

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/api-server/main.go`:
- Line 128: The call to daemon.MonitorKeystorePoolSize(ctx, cfg) starts the
keystore monitor (which via NewKeystorePoolMonitorFromCfg opens its own DB
connection) even when metrics are disabled; move this invocation behind the
service metrics-enabled gate in startStatusServer or make
MonitorKeystorePoolSize lazy so it does not call
NewKeystorePoolMonitorFromCfg/allocate a DB pool until the gauge is actually
collected. Specifically: check the service metrics flag in startStatusServer
before calling daemon.MonitorKeystorePoolSize(ctx, cfg), or modify
daemon.MonitorKeystorePoolSize / NewKeystorePoolMonitorFromCfg to defer DB
initialization until the first metrics scrape (e.g., lazy-init inside the
monitor/gauge creation), ensuring no extra DB pool is opened when metrics are
off.

In `@internal/daemon/keystore_pool_monitor_test.go`:
- Around line 72-75: This test mutates global OTel state by calling
otel.SetMeterProvider(provider); save the current global with prev :=
otel.GetMeterProvider() before setting the test provider and register a
t.Cleanup that restores it via otel.SetMeterProvider(prev) (and if needed shuts
down or closes the test provider/reader created by
metric.NewMeterProvider/metric.NewManualReader) so the global MeterProvider and
registered callbacks are not leaked into other tests; update the setup around
metric.NewManualReader, metric.NewMeterProvider, and otel.SetMeterProvider to
perform this save-and-restore in t.Cleanup.

In `@internal/daemon/keystore_pool_monitor.go`:
- Around line 24-36: The monitor currently drops the DB connection created in
NewKeystorePoolMonitorFromCfg and the registration handle returned by
meter.RegisterCallback() in MonitorKeystorePoolSize; add fields to
KeystorePoolMonitor (e.g., dbConn of the DB connection type and reg
metric.Registration or callback handle) to retain both resources, update
NewKeystorePoolMonitorFromCfg to assign the created DB connection to
monitor.dbConn before passing it to manager.Pool, and change
MonitorKeystorePoolSize to store the return value of meter.RegisterCallback() on
monitor.reg so it can be unregistered; also add a Close/Shutdown method on
KeystorePoolMonitor that closes monitor.dbConn and calls
monitor.reg.Unregister() (or equivalent) to clean up.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 85869ccd-db05-493b-9770-329935e88940

📥 Commits

Reviewing files that changed from the base of the PR and between 38ba08a and 25e0a4b.

📒 Files selected for processing (5)
  • cmd/api-server/export_test.go
  • cmd/api-server/main.go
  • cmd/api-server/main_test.go
  • internal/daemon/keystore_pool_monitor.go
  • internal/daemon/keystore_pool_monitor_test.go
💤 Files with no reviewable changes (1)
  • cmd/api-server/main_test.go

Comment on lines +72 to +75
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
otel.SetMeterProvider(provider)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's read the test file to see the full context around lines 72-75
cd /tmp && git ls-files | grep -i keystore_pool_monitor_test && head -1

Repository: openkcm/cmk

Length of output: 123


🏁 Script executed:

# Find the test file
fd keystore_pool_monitor_test

Repository: openkcm/cmk

Length of output: 100


🏁 Script executed:

# Read the test file with context around the flagged lines
cat -n internal/daemon/keystore_pool_monitor_test.go | sed -n '60,90p'

Repository: openkcm/cmk

Length of output: 958


🏁 Script executed:

# Check go.mod for Go version requirement
cat go.mod | head -20

Repository: openkcm/cmk

Length of output: 696


🏁 Script executed:

# Look for other tests that might register the same keystore_pool_available instrument
rg "keystore_pool_available" -A 3 -B 3 internal/daemon/

Repository: openkcm/cmk

Length of output: 3324


🏁 Script executed:

# Read the full test function
cat -n internal/daemon/keystore_pool_monitor_test.go | sed -n '66,150p'

Repository: openkcm/cmk

Length of output: 2811


🏁 Script executed:

# Check for any cleanup patterns in the test file
rg -i "cleanup|shutdown|setmeterprovider" internal/daemon/keystore_pool_monitor_test.go -B 2 -A 2

Repository: openkcm/cmk

Length of output: 249


🏁 Script executed:

# Check if there are other test files that might also use OTel
fd -e go internal/daemon/ | xargs rg -l "SetMeterProvider|otel\." | head -10

Repository: openkcm/cmk

Length of output: 777


Restore the global MeterProvider in t.Cleanup.

otel.SetMeterProvider(provider) mutates process-global OTel state. Leaving it installed after this test can leak the registered callback into later tests and produce duplicate keystore_pool_available observations once another test also registers the same instrument.

Suggested cleanup
 import (
+	"context"
 	"encoding/json"
 	"testing"
 	"time"
@@
 	reader := metric.NewManualReader()
 	provider := metric.NewMeterProvider(metric.WithReader(reader))
+	prevProvider := otel.GetMeterProvider()
 	otel.SetMeterProvider(provider)
+	t.Cleanup(func() {
+		_ = provider.Shutdown(context.Background())
+		otel.SetMeterProvider(prevProvider)
+	})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
otel.SetMeterProvider(provider)
import (
"context"
"encoding/json"
"testing"
"time"
...
)
...
reader := metric.NewManualReader()
provider := metric.NewMeterProvider(metric.WithReader(reader))
prevProvider := otel.GetMeterProvider()
otel.SetMeterProvider(provider)
t.Cleanup(func() {
_ = provider.Shutdown(context.Background())
otel.SetMeterProvider(prevProvider)
})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/daemon/keystore_pool_monitor_test.go` around lines 72 - 75, This
test mutates global OTel state by calling otel.SetMeterProvider(provider); save
the current global with prev := otel.GetMeterProvider() before setting the test
provider and register a t.Cleanup that restores it via
otel.SetMeterProvider(prev) (and if needed shuts down or closes the test
provider/reader created by metric.NewMeterProvider/metric.NewManualReader) so
the global MeterProvider and registered callbacks are not leaked into other
tests; update the setup around metric.NewManualReader, metric.NewMeterProvider,
and otel.SetMeterProvider to perform this save-and-restore in t.Cleanup.

now := time.Now()

m.mu.Lock()
shouldRefresh := m.cachedCount == nil || now.Sub(*m.lastRead) >= m.interval
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache feels odd here. Maybe the period of the metrics push to the Otel-collector should rather be configurable in common-sdk.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree it doesn't look good.

This caching was to make it work with the global periodic reader callback. I'm not sure we can configure the interval by each metric.

I've changed it to the previous approach to keep its own ticker and push the record upon each interval, specific to this metric only.


log.Debug(ctx, "Registering keystore pool size gauge metric")

_, err = monitor.meter.RegisterCallback(monitor.callback, monitor.gauge)
Copy link
Copy Markdown
Contributor

@bkieling bkieling Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks encapsulation. Couldn't this be done inside of the constructor method (as parameter to meter.Int64ObservableGauge)?

@minh-nghia minh-nghia force-pushed the fix/keystore-pool-as-otel branch 2 times, most recently from 0bc7b1e to 3893dbd Compare March 25, 2026 10:33
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
internal/daemon/keystore_pool_monitor.go (1)

52-57: ⚠️ Potential issue | 🟠 Major

Don't create a standalone DB pool in monitor init.

NewKeystorePoolMonitorFromCfg is called from status-server startup, before the main application DB is initialized, so opening dbCon here creates a second pool dedicated to this metric path. Because the monitor is started unconditionally, that extra pool is allocated even when no metrics reader/exporter is active. Please reuse the application DB/pool or defer this initialization until the metric is actually collected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/daemon/keystore_pool_monitor.go` around lines 52 - 57, The monitor
currently opens its own DB connection via db.StartDBConnection and creates a
separate pool with manager.NewPool inside NewKeystorePoolMonitorFromCfg; change
this so the function does not create a standalone DB/pool at init — instead
accept the application's repository/pool (e.g., pass an existing sql.Repository
or manager.Pool into NewKeystorePoolMonitorFromCfg) or store cfg and defer
calling db.StartDBConnection/manager.NewPool until the metric collection runs;
update NewKeystorePoolMonitorFromCfg signature and internal use of dbCon/NewPool
(and the caller in status-server startup) to reuse the main app DB/pool or
lazily initialize only when metrics are enabled.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/daemon/keystore_pool_monitor.go`:
- Around line 82-85: The current error path in the monitor around
m.pool.Count(ctx) logs context cancellation as a failure; update the error
handling in the keystore pool monitor so that if err is context.Canceled or
context.DeadlineExceeded (use errors.Is(err, context.Canceled) || errors.Is(err,
context.DeadlineExceeded)) you return without logging, otherwise keep the
existing log.Error(ctx, "failed to get keystore pool size", err) and return;
this change should be applied around the m.pool.Count(ctx) call in the monitor
loop.
- Around line 67-76: Start currently waits for the first ticker tick so the
keystore_pool_available gauge can be missing after startup; call m.record(ctx)
once before entering the for/select loop in KeystorePoolMonitor.Start to prime
the metric (i.e., invoke record immediately, then create/start the ticker and
continue with the existing loop), referencing the Start method and the
m.record(ctx) helper to locate where to add the initial call.

---

Duplicate comments:
In `@internal/daemon/keystore_pool_monitor.go`:
- Around line 52-57: The monitor currently opens its own DB connection via
db.StartDBConnection and creates a separate pool with manager.NewPool inside
NewKeystorePoolMonitorFromCfg; change this so the function does not create a
standalone DB/pool at init — instead accept the application's repository/pool
(e.g., pass an existing sql.Repository or manager.Pool into
NewKeystorePoolMonitorFromCfg) or store cfg and defer calling
db.StartDBConnection/manager.NewPool until the metric collection runs; update
NewKeystorePoolMonitorFromCfg signature and internal use of dbCon/NewPool (and
the caller in status-server startup) to reuse the main app DB/pool or lazily
initialize only when metrics are enabled.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4c3ba778-faa2-4cf2-b129-16e88d91f9e3

📥 Commits

Reviewing files that changed from the base of the PR and between 25e0a4b and 3893dbd.

📒 Files selected for processing (5)
  • cmd/api-server/export_test.go
  • cmd/api-server/main.go
  • cmd/api-server/main_test.go
  • internal/daemon/keystore_pool_monitor.go
  • internal/daemon/keystore_pool_monitor_test.go
💤 Files with no reviewable changes (1)
  • cmd/api-server/main_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • cmd/api-server/export_test.go
  • internal/daemon/keystore_pool_monitor_test.go

@minh-nghia minh-nghia force-pushed the fix/keystore-pool-as-otel branch from 3893dbd to 4325008 Compare March 25, 2026 12:54
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
internal/daemon/keystore_pool_monitor.go (1)

78-90: ⚠️ Potential issue | 🟠 Major

DB connection created in getPool is not retained for cleanup.

The DB connection created at line 83 is passed to manager.NewPool but not stored on the monitor struct. When the application shuts down or the monitor is stopped, there's no way to close this connection. Consider storing dbCon on the monitor and adding a Close() method to properly clean up resources.

,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/daemon/keystore_pool_monitor.go` around lines 78 - 90, The DB
connection returned by db.StartDBConnection in KeystorePoolMonitor.getPool is
not retained for later cleanup; modify the KeystorePoolMonitor struct to add a
field (e.g., dbCon *sql.DB or appropriate connection type) and assign m.dbCon =
dbCon before creating the pool (manager.NewPool), then implement a Close()
method on KeystorePoolMonitor that closes m.dbCon (and any pool cleanup) and
ensure callers invoke Close() when stopping the monitor or shutting down; update
references to m.pool and manager.NewPool accordingly so the connection is owned
by the monitor and can be closed.
internal/daemon/keystore_pool_monitor_test.go (1)

72-75: ⚠️ Potential issue | 🟡 Minor

Restore global MeterProvider in t.Cleanup.

otel.SetMeterProvider(provider) mutates process-global state. Without cleanup, this can leak the registered callback into subsequent tests. Save the previous provider and restore it in t.Cleanup.

,

💡 Suggested fix
 	reader := metric.NewManualReader()
 	provider := metric.NewMeterProvider(metric.WithReader(reader))
+	prevProvider := otel.GetMeterProvider()
 	otel.SetMeterProvider(provider)
+	t.Cleanup(func() {
+		_ = provider.Shutdown(t.Context())
+		otel.SetMeterProvider(prevProvider)
+	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/daemon/keystore_pool_monitor_test.go` around lines 72 - 75, Save the
existing global MeterProvider before calling otel.SetMeterProvider and restore
it in t.Cleanup to avoid leaking state between tests: capture prev :=
otel.GetMeterProvider() before creating/setting the test provider (the block
with metric.NewManualReader, metric.NewMeterProvider and otel.SetMeterProvider)
and add t.Cleanup(func() { otel.SetMeterProvider(prev) }) so the original
provider is reinstated after the test.
🧹 Nitpick comments (2)
internal/daemon/keystore_pool_monitor.go (1)

113-122: Consider returning a shutdown signal for graceful termination.

MonitorKeystorePoolSize spawns a goroutine but provides no way for the caller to wait for it to finish. If graceful shutdown requires ensuring the monitor has stopped before proceeding, consider returning a done channel or accepting a sync.WaitGroup.

💡 Example signature
func MonitorKeystorePoolSize(ctx context.Context, cfg *config.Config) <-chan struct{} {
    // ...
    done := make(chan struct{})
    go func() {
        defer close(done)
        monitor.Start(ctx)
    }()
    return done
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/daemon/keystore_pool_monitor.go` around lines 113 - 122, Change
MonitorKeystorePoolSize to return a shutdown/done signal so callers can wait for
the goroutine to finish: modify the signature (e.g., return <-chan struct{}),
create a done channel after NewKeystorePoolMonitorFromCfg succeeds, spawn the
goroutine to call monitor.Start(ctx) and defer close(done) so the channel closes
when Start returns, and return the done channel to the caller; keep the existing
error handling when NewKeystorePoolMonitorFromCfg fails (return a closed or nil
channel as appropriate).
internal/daemon/keystore_pool_monitor_test.go (1)

86-129: Consider shorter intervals to reduce test duration.

The test uses a 5-second interval with multiple sleeps, totaling ~8 seconds of wall-clock time. For faster CI feedback, consider reducing the interval (e.g., 100ms) and adjusting sleep durations proportionally.

💡 Example with shorter intervals
 		KeystorePool: config.KeystorePool{
-			Interval: 5 * time.Second,
+			Interval: 100 * time.Millisecond,
 		},
 	})

 	t.Run("initial read", func(t *testing.T) {
-		time.Sleep(1 * time.Second)
+		time.Sleep(50 * time.Millisecond)
 		// ...
 	})

 	t.Run("previous value is returned within interval", func(t *testing.T) {
-		time.Sleep(2 * time.Second)
+		time.Sleep(40 * time.Millisecond)
 		// ...
 	})

 	t.Run("value is refreshed after interval", func(t *testing.T) {
-		time.Sleep(5 * time.Second)
+		time.Sleep(100 * time.Millisecond)
 		// ...
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/daemon/keystore_pool_monitor_test.go` around lines 86 - 129, Reduce
the test timing to speed CI by shrinking the KeystorePool interval and
corresponding sleeps: set config.KeystorePool.Interval from 5*s to something
like 100*time.Millisecond, change the initial sleep to ~50-100ms to allow commit
before monitoring, in the "previous value" case sleep for less than the interval
(e.g., 50ms) before calling reader.Collect and addToKeystorePool, and in the
"value is refreshed" case sleep for slightly more than the interval (e.g.,
150-200ms) before collecting; update references in the test to use the
KeystorePool.Interval change and keep assertions using getKeystorePoolMetric and
reader unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@internal/daemon/keystore_pool_monitor_test.go`:
- Around line 72-75: Save the existing global MeterProvider before calling
otel.SetMeterProvider and restore it in t.Cleanup to avoid leaking state between
tests: capture prev := otel.GetMeterProvider() before creating/setting the test
provider (the block with metric.NewManualReader, metric.NewMeterProvider and
otel.SetMeterProvider) and add t.Cleanup(func() { otel.SetMeterProvider(prev) })
so the original provider is reinstated after the test.

In `@internal/daemon/keystore_pool_monitor.go`:
- Around line 78-90: The DB connection returned by db.StartDBConnection in
KeystorePoolMonitor.getPool is not retained for later cleanup; modify the
KeystorePoolMonitor struct to add a field (e.g., dbCon *sql.DB or appropriate
connection type) and assign m.dbCon = dbCon before creating the pool
(manager.NewPool), then implement a Close() method on KeystorePoolMonitor that
closes m.dbCon (and any pool cleanup) and ensure callers invoke Close() when
stopping the monitor or shutting down; update references to m.pool and
manager.NewPool accordingly so the connection is owned by the monitor and can be
closed.

---

Nitpick comments:
In `@internal/daemon/keystore_pool_monitor_test.go`:
- Around line 86-129: Reduce the test timing to speed CI by shrinking the
KeystorePool interval and corresponding sleeps: set config.KeystorePool.Interval
from 5*s to something like 100*time.Millisecond, change the initial sleep to
~50-100ms to allow commit before monitoring, in the "previous value" case sleep
for less than the interval (e.g., 50ms) before calling reader.Collect and
addToKeystorePool, and in the "value is refreshed" case sleep for slightly more
than the interval (e.g., 150-200ms) before collecting; update references in the
test to use the KeystorePool.Interval change and keep assertions using
getKeystorePoolMetric and reader unchanged.

In `@internal/daemon/keystore_pool_monitor.go`:
- Around line 113-122: Change MonitorKeystorePoolSize to return a shutdown/done
signal so callers can wait for the goroutine to finish: modify the signature
(e.g., return <-chan struct{}), create a done channel after
NewKeystorePoolMonitorFromCfg succeeds, spawn the goroutine to call
monitor.Start(ctx) and defer close(done) so the channel closes when Start
returns, and return the done channel to the caller; keep the existing error
handling when NewKeystorePoolMonitorFromCfg fails (return a closed or nil
channel as appropriate).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e27ec83d-c312-4079-9dde-c02536b9a384

📥 Commits

Reviewing files that changed from the base of the PR and between 3893dbd and 4325008.

📒 Files selected for processing (5)
  • cmd/api-server/export_test.go
  • cmd/api-server/main.go
  • cmd/api-server/main_test.go
  • internal/daemon/keystore_pool_monitor.go
  • internal/daemon/keystore_pool_monitor_test.go
💤 Files with no reviewable changes (1)
  • cmd/api-server/main_test.go
✅ Files skipped from review due to trivial changes (1)
  • cmd/api-server/main.go

@minh-nghia minh-nghia force-pushed the fix/keystore-pool-as-otel branch from 4325008 to d7f5a05 Compare March 25, 2026 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants