Phase 6: Integration tests for OTEL observability #459

@yairfalse

Phase 6: Integration Tests & Validation

Goal: Verify that all OTEL changes work end-to-end against a real collector

Background

Phases 1-5 of the OTEL observability redesign are complete:

  • ✅ Phase 1: Semantic conventions
  • ✅ Phase 2: Enhanced metrics with attributes
  • ✅ Phase 3: Rewritten OTELEmitter (metrics instead of fake spans)
  • ✅ Phase 4: W3C trace context propagation
  • ✅ Phase 5: OTEL SDK initialization in BaseObserver

What's Needed

6.1 Integration Test Setup

Create test/integration/otel_test.go with:

  • Test OTLP collector (in-memory or docker)
  • End-to-end metric export verification
  • Context propagation validation
  • Resource attribute verification

6.2 Test Coverage

  • Verify metrics exported with correct attributes
  • Verify trace context propagation across observers
  • Verify resource attributes (cluster/namespace/node)
  • Verify graceful shutdown flushes telemetry
  • Test with a real observer (deployment/status)

6.3 Example Test Structure

import (
    "context"
    "testing"
    "time"

    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/require"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
    otelTrace "go.opentelemetry.io/otel/trace"

    // Also import this repository's base (internal/base) and domain packages.
)

func TestOTELMetricsExport(t *testing.T) {
    // Start the test OTEL collector (helper still to be written, see 6.4)
    collector := startTestCollector(t)
    defer collector.Stop()

    // Create an observer that exports telemetry to the test collector
    config := base.LoadTelemetryConfigFromEnv()
    config.OTLPEndpoint = collector.Endpoint()

    observer, err := base.NewBaseObserverWithTelemetry("test-observer", config)
    require.NoError(t, err)

    // Emit one event
    ctx := context.Background()
    event := &domain.ObserverEvent{
        ID:     "test-1",
        Type:   "tcp_connect",
        Source: "test-observer",
    }
    observer.RecordEvent(ctx, event)

    // Poll until the metric arrives instead of sleeping for a fixed interval
    require.EventuallyWithT(t, func(c *assert.CollectT) {
        assert.Contains(c, collector.GetMetrics(), "observer_events_processed_total")
    }, 5*time.Second, 100*time.Millisecond)

    // Verify attributes
    attrs := collector.GetAttributes("observer_events_processed_total")
    assert.Equal(t, "test-observer", attrs["observer.name"])
    assert.Equal(t, "tcp_connect", attrs["event.type"])
}

func TestContextPropagation(t *testing.T) {
    // The default global tracer provider is a no-op and yields invalid span
    // contexts, so use an SDK provider to get real trace and span IDs.
    tp := sdktrace.NewTracerProvider()
    defer func() { _ = tp.Shutdown(context.Background()) }()

    // Create parent span
    ctx, span := tp.Tracer("test").Start(context.Background(), "parent")
    defer span.End()

    // Extract trace context into event
    event := &domain.ObserverEvent{ID: "test-1", Type: "test"}
    base.ExtractTraceContext(ctx, event)

    // Verify trace context
    assert.NotEmpty(t, event.TraceID)
    assert.NotEmpty(t, event.SpanID)

    // Inject back into a new context and read the span context from it
    newCtx := base.InjectTraceContext(event)
    spanCtx := otelTrace.SpanContextFromContext(newCtx)

    assert.True(t, spanCtx.IsValid())
    assert.Equal(t, event.TraceID, spanCtx.TraceID().String())
}

6.4 Dependencies Needed

  • OTLP collector (testcontainers or in-memory)
  • Helper to verify exported metrics/traces
  • Docker setup for integration tests in CI

Acceptance Criteria

  • Integration tests pass with real OTLP collector
  • All metrics include correct semantic convention attributes
  • Trace context propagates correctly across observer boundaries
  • Resource attributes match expected cluster/namespace/node
  • Tests run in CI with docker-compose
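The resource-attribute criterion could be checked with a small comparison helper that reports every missing or mismatched key at once, rather than failing on the first. This is a minimal sketch: `diffAttrs` is a hypothetical helper, and the keys `k8s.cluster.name`, `k8s.namespace.name`, and `k8s.node.name` are the OpenTelemetry semantic-convention names, assumed here rather than confirmed to be what `internal/base/semconv.go` emits.

```go
package main

import (
	"fmt"
	"sort"
)

// diffAttrs returns a sorted list of problems: keys in want that are
// missing from got, or present with a different value. Extra keys in
// got are ignored.
func diffAttrs(want, got map[string]string) []string {
	var problems []string
	for k, w := range want {
		g, ok := got[k]
		switch {
		case !ok:
			problems = append(problems, fmt.Sprintf("missing %s", k))
		case g != w:
			problems = append(problems, fmt.Sprintf("%s: want %q, got %q", k, w, g))
		}
	}
	sort.Strings(problems) // map iteration order is random; sort for stable output
	return problems
}

func main() {
	// Assumed attribute keys and example values, for illustration only.
	want := map[string]string{
		"k8s.cluster.name":   "test-cluster",
		"k8s.namespace.name": "default",
		"k8s.node.name":      "node-1",
	}
	got := map[string]string{
		"k8s.cluster.name":   "test-cluster",
		"k8s.namespace.name": "kube-system",
	}
	for _, p := range diffAttrs(want, got) {
		fmt.Println(p)
	}
}
```

In a test, an empty result from `diffAttrs` would mean the exported resource matches the expected cluster, namespace, and node.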

Related Files

  • internal/base/telemetry.go - OTEL SDK initialization
  • internal/base/trace.go - Context propagation helpers
  • internal/base/emitter.go - OTELEmitter with metrics
  • internal/base/semconv.go - Semantic conventions

Implementation Notes

The core library work is done. This phase is about validation with real infrastructure.

Consider using:

  • go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp for easier testing
  • testcontainers-go for spinning up a collector
  • In-memory exporter for unit-style integration tests

Metadata

Labels: enhancement