feat: HAR collector middleware #71

Open
evg4b wants to merge 16 commits into main from feature/har-collector

@evg4b (Owner) commented Mar 16, 2026

Summary

  • Adds a non-blocking HAR (HTTP Archive 1.2) collector middleware that records every proxied request/response pair to a .har file
  • Configured per-mapping; supports string shorthand (har: ./file.har) and full object form with capture-secure-headers flag
  • Security-sensitive headers (Cookie, Set-Cookie, Authorization, WWW-Authenticate, Proxy-Authorization, Proxy-Authenticate) are excluded by default
  • Response bodies are transparently decompressed (gzip/deflate); unknown encodings are base64-encoded
  • Async writer uses buffered channel (4096) + background goroutine + atomic rename — requests are never blocked
  • JSON schema and ARCHITECTURE.md updated

Test plan

  • go test ./internal/handler/har/...
  • go test ./internal/config/...
  • go test ./internal/config/validators/...
  • Manual: set har: ./out.har, send requests, verify readable HAR in Chrome DevTools
  • Verify auth/cookie headers absent by default; present with capture-secure-headers: true

evg4b and others added 16 commits March 15, 2026 23:29
Add HARConfig struct with File field and Enabled()/Clone() helpers.
Wire the HAR field into the Mapping struct so each mapping can
independently configure HAR collection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Define Go structs for the full HAR (HTTP Archive) 1.2 specification:
HAR, Log, Creator, Entry, Request, Response, Content, Timings,
NameValue, Cookie, and PostData.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduce a channel-based async writer that never blocks request
goroutines. Entries are queued via a buffered channel (4096 capacity).
A single background goroutine accumulates entries and atomically
writes the complete HAR JSON to disk (write-to-tmp then rename) so
the output file is always valid. Close() drains the channel and does
a final flush before returning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
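
A minimal sketch of the writer pattern described in the commit above (buffered channel, single background goroutine, write-to-tmp then rename). This is not the code from the PR: `entry`, `asyncWriter`, `newAsyncWriter`, and `readEntries` are hypothetical names, the output is a bare JSON array rather than a full HAR document, and dropping entries when the queue is full is an assumption about how "never blocks" might be achieved.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// entry is a hypothetical, reduced stand-in for a HAR entry.
type entry struct {
	URL    string `json:"url"`
	Status int    `json:"status"`
}

// asyncWriter queues entries on a buffered channel and persists them
// from a single background goroutine.
type asyncWriter struct {
	path    string
	entries chan entry
	done    chan struct{}
	once    sync.Once
	err     error // last flush error; read only after done is closed
}

func newAsyncWriter(path string, capacity int) *asyncWriter {
	w := &asyncWriter{path: path, entries: make(chan entry, capacity), done: make(chan struct{})}
	go w.loop()
	return w
}

// AddEntry never blocks: if the queue is full the entry is dropped and
// false is returned. It must not be called after Close.
func (w *asyncWriter) AddEntry(e entry) bool {
	select {
	case w.entries <- e:
		return true
	default:
		return false
	}
}

// loop accumulates entries and rewrites the whole file after each batch.
func (w *asyncWriter) loop() {
	defer close(w.done)
	all := []entry{}
	for e := range w.entries {
		all = append(all, e)
		all = w.drain(all)
		w.err = w.flush(all)
	}
	// Final flush so that a run without entries still yields a valid file.
	w.err = w.flush(all)
}

// drain collects everything that is already queued without waiting.
func (w *asyncWriter) drain(all []entry) []entry {
	for {
		select {
		case e, ok := <-w.entries:
			if !ok {
				return all
			}
			all = append(all, e)
		default:
			return all
		}
	}
}

// flush writes to a temporary file and renames it over the target, so
// readers never observe a half-written document.
func (w *asyncWriter) flush(all []entry) error {
	data, err := json.Marshal(all)
	if err != nil {
		return err
	}
	tmp := w.path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, w.path)
}

// Close stops the background goroutine after a final flush. It is
// safe to call more than once.
func (w *asyncWriter) Close() error {
	w.once.Do(func() { close(w.entries) })
	<-w.done
	return w.err
}

// readEntries loads the file back; it exists only to check the sketch.
func readEntries(path string) ([]entry, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var all []entry
	return all, json.Unmarshal(data, &all)
}

func main() {
	path := filepath.Join(os.TempDir(), "async-writer-sketch.json")
	w := newAsyncWriter(path, 4096)
	w.AddEntry(entry{URL: "http://localhost/api", Status: 200})
	fmt.Println("close error:", w.Close(), "file:", path)
}
```

Rewriting the complete file on every batch keeps the on-disk document valid at all times, at the cost of repeated serialization as the entry list grows.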
Introduce captureWriter that wraps contracts.ResponseWriter and tees
the response body into an in-memory buffer using io.MultiWriter.
Status code is tracked via WriteHeader so both can be forwarded to
the real writer and read back by the middleware for HAR entry building.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
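
The tee approach from the commit above can be sketched as follows. The sketch assumes that contracts.ResponseWriter behaves like the standard `http.ResponseWriter`; `newCaptureWriter` and `demo` are hypothetical names, and the real captureWriter may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// captureWriter forwards everything to the wrapped writer while
// keeping a copy of the body and the status code.
type captureWriter struct {
	http.ResponseWriter
	body   bytes.Buffer
	tee    io.Writer
	status int
}

func newCaptureWriter(w http.ResponseWriter) *captureWriter {
	// 200 is what net/http sends when WriteHeader is never called.
	c := &captureWriter{ResponseWriter: w, status: http.StatusOK}
	// Each Write goes to the real writer first, then to the buffer.
	c.tee = io.MultiWriter(w, &c.body)
	return c
}

// WriteHeader remembers the status code and forwards it.
func (c *captureWriter) WriteHeader(code int) {
	c.status = code
	c.ResponseWriter.WriteHeader(code)
}

// Write sends the bytes through the tee.
func (c *captureWriter) Write(p []byte) (int, error) {
	return c.tee.Write(p)
}

// demo returns the captured status, the captured body, and the body
// that reached the real writer.
func demo() (int, string, string) {
	rec := httptest.NewRecorder()
	capture := newCaptureWriter(rec)
	capture.WriteHeader(http.StatusTeapot)
	_, _ = capture.Write([]byte("hello"))
	return capture.status, capture.body.String(), rec.Body.String()
}

func main() {
	status, captured, forwarded := demo()
	fmt.Println(status, captured, forwarded)
}
```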
Middleware.Wrap captures every request (method, URL, headers, query,
cookies, body size) and response (status, headers, cookies, body text,
MIME type) after the inner handler returns, then enqueues a HAR Entry
to the async Writer via AddEntry without blocking. The request body is
buffered and restored so downstream handlers are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
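
One way to buffer and restore a request body, as the commit above describes, is shown below. It is a sketch built on the standard `net/http` types rather than the project's contracts package; `snapshotBody`, `wrap`, and `roundTrip` are hypothetical names, and the `record` callback stands in for building and enqueuing a HAR entry.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// snapshotBody reads the whole request body and replaces req.Body
// with a fresh reader over the same bytes.
func snapshotBody(req *http.Request) ([]byte, error) {
	if req.Body == nil || req.Body == http.NoBody {
		return nil, nil
	}
	data, err := io.ReadAll(req.Body)
	if err != nil {
		return nil, err
	}
	_ = req.Body.Close()
	req.Body = io.NopCloser(bytes.NewReader(data))
	return data, nil
}

// wrap records the method and body size after the inner handler
// returns; the inner handler still sees the complete body.
func wrap(next http.Handler, record func(method string, bodySize int)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		body, err := snapshotBody(req)
		if err != nil {
			http.Error(w, "cannot read request body", http.StatusBadRequest)
			return
		}
		next.ServeHTTP(w, req)
		record(req.Method, len(body))
	})
}

// roundTrip sends one POST through wrap and reports what the inner
// handler read and what size was recorded.
func roundTrip(payload string) (seenDownstream string, recordedSize int) {
	next := http.HandlerFunc(func(_ http.ResponseWriter, req *http.Request) {
		data, _ := io.ReadAll(req.Body)
		seenDownstream = string(data)
	})
	req := httptest.NewRequest(http.MethodPost, "http://localhost/api", strings.NewReader(payload))
	handler := wrap(next, func(_ string, size int) { recordedSize = size })
	handler.ServeHTTP(httptest.NewRecorder(), req)
	return seenDownstream, recordedSize
}

func main() {
	seen, size := roundTrip("ping")
	fmt.Println(seen, size)
}
```

Buffering the full body in memory is simple but grows with request size; a production middleware may want an upper bound.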
- Add HARMiddlewareFactory type and harMiddlewareFactory field to
  RequestHandler.
- Add WithHARMiddlewareFactory option.
- wrapHARMiddleware wraps the default handler for each mapping that
  has HAR.File configured, placing the collector as the outermost
  middleware so all proxy/cache/options traffic is captured.
- Register the factory in buildHandlerForMappings: creates a Writer
  per mapping and registers it via registerCloser so it is properly
  flushed on server shutdown or config-reload restart.
- Add registerCloser / closeAll helpers and io.Closer tracking to the
  Uncors app; Close() and Restart() honour the lifecycle.
- Fix Writer.Close() to return error (satisfies io.Closer).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
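
The closer tracking mentioned in the commit above might look roughly like this. The sketch is an assumption about the shape of the helpers, not the code in app.go: `closerRegistry`, `closerFunc`, and `errFlushFailed` are hypothetical, and joining errors with `errors.Join` (Go 1.20+) is one possible policy.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"sync"
)

// closerRegistry tracks resources that must be closed on shutdown or
// before a config-reload restart.
type closerRegistry struct {
	mu      sync.Mutex
	closers []io.Closer
}

// registerCloser remembers a resource for a later closeAll.
func (r *closerRegistry) registerCloser(c io.Closer) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.closers = append(r.closers, c)
}

// closeAll closes every registered resource once and reports all
// failures together. The list is cleared, so a second call is a no-op.
func (r *closerRegistry) closeAll() error {
	r.mu.Lock()
	pending := r.closers
	r.closers = nil
	r.mu.Unlock()

	var errs []error
	for _, c := range pending {
		if err := c.Close(); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

// closerFunc adapts a function to io.Closer for the demo.
type closerFunc func() error

func (f closerFunc) Close() error { return f() }

// errFlushFailed is a made-up error used only by the demo.
var errFlushFailed = errors.New("flush failed")

func main() {
	var reg closerRegistry
	reg.registerCloser(closerFunc(func() error { return nil }))
	reg.registerCloser(closerFunc(func() error { return errFlushFailed }))
	fmt.Println("closeAll:", reg.closeAll())
}
```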
HARValidator checks that when HAR collection is enabled the configured
file path has an extension (e.g. .har). The validator is wired into
MappingValidator so it runs as part of existing config validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
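
The extension check described in the commit above can be illustrated with `filepath.Ext`. This is a sketch only: `harConfig`, `validateHAR`, and `errMissingExtension` are hypothetical stand-ins, and the real HARValidator plugs into the project's validation framework, which is not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// harConfig is a reduced, hypothetical stand-in for HARConfig.
type harConfig struct {
	File string
}

// enabled mirrors the idea of HARConfig.Enabled(): collection is on
// when a file path is configured (an assumption for this sketch).
func (c harConfig) enabled() bool { return c.File != "" }

var errMissingExtension = errors.New("har file path must have an extension")

// validateHAR accepts a disabled config and rejects enabled configs
// whose file path has no extension.
func validateHAR(c harConfig) error {
	if !c.enabled() {
		return nil
	}
	if filepath.Ext(c.File) == "" {
		return fmt.Errorf("%w: %q", errMissingExtension, c.File)
	}
	return nil
}

func main() {
	for _, file := range []string{"", "./out.har", "./out"} {
		fmt.Printf("%q -> %v\n", file, validateHAR(harConfig{File: file}))
	}
}
```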
- writer_test: verifies valid HAR JSON output after Close, idempotent
  Close calls, non-blocking behaviour under heavy load, and empty-entry
  output.
- middleware_test: verifies downstream handler is called, response body
  is teed correctly, and request body is restored for downstream handlers.
- har_test (validators): verifies disabled config passes, valid file
  paths with extensions pass, and paths without extensions are rejected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add HAR Collector section describing the design goals (non-blocking
  writes, high-throughput atomic file updates, per-mapping isolation,
  lifecycle management).
- Add example YAML configuration snippet.
- Update middleware list, request flow, and project structure diagram.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cookies (request Cookie header and response Set-Cookie header) are now
excluded from HAR entries by default to avoid persisting sensitive
session data. Opt in per-mapping with:

  har:
    capture-cookies: true

Compressed response bodies (gzip, deflate) are now decoded before being
stored in the HAR Content object so entries are human-readable in
browsers. Unknown or undecodable encodings fall back to base64 per
the HAR 1.2 spec (Content.encoding = "base64").

Changes:
- HARConfig gains CaptureCookies bool (mapstructure: capture-cookies)
- Content type gains Encoding string field (omitempty)
- New content.go: buildContent() handles gzip/deflate/identity/unknown
- Middleware: captureCookies field; headersToNameValues filters cookie
  headers when captureCookies is false; buildResponse uses buildContent
- WithCaptureCookies option added
- Factory in uncors/handler.go passes CaptureCookies to middleware

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
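
A sketch of the decoding rule from the commit above: decode gzip and deflate, pass identity through, and fall back to base64 otherwise. `decodeBody` and `gzipBytes` are hypothetical names and not the PR's buildContent. The sketch treats HTTP "deflate" as the zlib format, as RFC 9110 defines it; whether the PR does the same is not stated.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"compress/zlib"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// decodeBody returns the text to store and the value for the HAR
// Content.encoding field ("" for plain text, "base64" for the fallback).
func decodeBody(body []byte, contentEncoding string) (string, string) {
	asBase64 := func() (string, string) {
		return base64.StdEncoding.EncodeToString(body), "base64"
	}

	var reader io.ReadCloser
	var err error
	switch strings.ToLower(strings.TrimSpace(contentEncoding)) {
	case "", "identity":
		return string(body), ""
	case "gzip":
		reader, err = gzip.NewReader(bytes.NewReader(body))
	case "deflate":
		reader, err = zlib.NewReader(bytes.NewReader(body))
	default:
		return asBase64() // unknown encoding, for example "br"
	}
	if err != nil {
		return asBase64() // header could not be parsed
	}
	defer reader.Close()

	decoded, err := io.ReadAll(reader)
	if err != nil {
		return asBase64() // stream was corrupt or truncated
	}
	return string(decoded), ""
}

// gzipBytes compresses s; it exists only to feed the demo.
func gzipBytes(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write([]byte(s))
	_ = zw.Close()
	return buf.Bytes()
}

func main() {
	text, enc := decodeBody(gzipBytes(`{"ok":true}`), "gzip")
	fmt.Printf("text=%s encoding=%q\n", text, enc)

	text, enc = decodeBody([]byte("raw"), "br")
	fmt.Printf("text=%s encoding=%q\n", text, enc)
}
```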
- cookies_not_captured_by_default: asserts Cookie/Set-Cookie headers and
  cookies arrays are absent from HAR entries when captureCookies is false.
- cookies_captured_when_WithCaptureCookies(true): asserts cookies are
  present when the flag is enabled.
- gzip_response_body_is_decoded_in_HAR: asserts that a gzip-compressed
  response body is stored as readable text (no base64 encoding field).
- TestContent_Encoding/unknown_encoding_stored_as_base64: unit test for
  the base64 fallback path in buildContent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rename capture-cookies → capture-secure-headers and expand the set of
headers that are stripped by default to include all RFC-defined
authentication and credential headers:

  Cookie / Set-Cookie       — session identifiers
  Authorization             — Bearer tokens, Basic credentials
  WWW-Authenticate          — server auth challenges (reveals scheme/realm)
  Proxy-Authorization       — proxy credentials
  Proxy-Authenticate        — proxy auth challenges

When capture-secure-headers is false (default) all of the above are
excluded from both the headers arrays and the cookies arrays in the HAR
output. Enable per-mapping with:

  har:
    capture-secure-headers: true

Renamed symbols:
  HARConfig.CaptureCookies         → CaptureSecureHeaders
  WithCaptureCookies()             → WithCaptureSecureHeaders()
  Middleware.captureCookies        → captureSecureHeaders
  cookieHeaderNames map            → secureHeaderNames map

Tests updated to cover Authorization and WWW-Authenticate filtering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
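
The default filtering described in the commit above could be sketched like this. The header list is taken from the commit message; `nameValue`, `isSecureHeader`, and this version of `headersToNameValues` are assumptions about the shape of the code, and sorting the names is added here only to make the output deterministic.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
)

// nameValue is a reduced stand-in for the HAR NameValue type.
type nameValue struct {
	Name  string
	Value string
}

// secureHeaderNames holds the headers stripped by default, keyed by
// their canonical form so lookups are case-insensitive.
var secureHeaderNames = func() map[string]struct{} {
	names := []string{
		"Cookie", "Set-Cookie", "Authorization",
		"WWW-Authenticate", "Proxy-Authorization", "Proxy-Authenticate",
	}
	set := make(map[string]struct{}, len(names))
	for _, name := range names {
		set[http.CanonicalHeaderKey(name)] = struct{}{}
	}
	return set
}()

func isSecureHeader(name string) bool {
	_, found := secureHeaderNames[http.CanonicalHeaderKey(name)]
	return found
}

// headersToNameValues flattens headers into name/value pairs and skips
// security-sensitive headers unless captureSecure is true.
func headersToNameValues(headers http.Header, captureSecure bool) []nameValue {
	names := make([]string, 0, len(headers))
	for name := range headers {
		names = append(names, name)
	}
	sort.Strings(names)

	result := []nameValue{}
	for _, name := range names {
		if !captureSecure && isSecureHeader(name) {
			continue
		}
		for _, value := range headers[name] {
			result = append(result, nameValue{Name: name, Value: value})
		}
	}
	return result
}

func main() {
	headers := http.Header{}
	headers.Set("Accept", "application/json")
	headers.Set("Authorization", "Bearer secret")
	fmt.Println(headersToNameValues(headers, false))
	fmt.Println(headersToNameValues(headers, true))
}
```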
Allow har to be written as a plain file path string in addition to the
full object form:

  # shorthand
  har: ./recordings/api.har

  # full form (unchanged)
  har:
    file: ./recordings/api.har
    capture-secure-headers: true

HARConfigHookFunc() is a mapstructure.DecodeHookFunc that converts a
string input to HARConfig{File: string}. It is registered alongside
StaticDirMappingHookFunc in the URLMappingHookFunc decoder chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
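
The string shorthand from the commit above can be illustrated without pulling in mapstructure. The hook below has the same three-argument shape as a mapstructure type-based decode hook, but it is only a sketch: `harConfig`, `harConfigHook`, and `decodeHAR` are hypothetical, and `decodeHAR` is a toy replacement for the real decoder chain.

```go
package main

import (
	"fmt"
	"reflect"
)

// harConfig is a reduced, hypothetical stand-in for HARConfig.
type harConfig struct {
	File                 string
	CaptureSecureHeaders bool
}

// harConfigHook converts a plain string into harConfig{File: ...} and
// leaves every other input untouched for the regular decoder.
func harConfigHook(from, to reflect.Type, data any) (any, error) {
	if from == nil || from.Kind() != reflect.String || to != reflect.TypeOf(harConfig{}) {
		return data, nil
	}
	return harConfig{File: reflect.ValueOf(data).String()}, nil
}

// decodeHAR runs the hook and then handles the full object form. A
// real decoder would do this second step generically.
func decodeHAR(raw any) (harConfig, error) {
	out, err := harConfigHook(reflect.TypeOf(raw), reflect.TypeOf(harConfig{}), raw)
	if err != nil {
		return harConfig{}, err
	}
	switch value := out.(type) {
	case harConfig:
		return value, nil
	case map[string]any:
		cfg := harConfig{}
		cfg.File, _ = value["file"].(string)
		cfg.CaptureSecureHeaders, _ = value["capture-secure-headers"].(bool)
		return cfg, nil
	default:
		return harConfig{}, fmt.Errorf("unsupported har value of type %T", raw)
	}
}

func main() {
	short, _ := decodeHAR("./recordings/api.har")
	full, _ := decodeHAR(map[string]any{
		"file":                   "./recordings/api.har",
		"capture-secure-headers": true,
	})
	fmt.Printf("%+v\n%+v\n", short, full)
}
```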
Add HARConfig definition supporting both forms:
- Short form (string): har: ./recordings/api.har
- Full form (object):  har: { file: ..., capture-secure-headers: true }

Wire the $ref into the Mapping properties so IDE tooling and schema
validators recognise the har field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Expand the HAR Collector section in ARCHITECTURE.md to cover:
- String shorthand (`har: ./recordings/api.har`) config form
- Full object form with capture-secure-headers flag
- Table of headers excluded by default and why

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- funcorder: move registerCloser/closeAll after exported Shutdown in app.go
- godot: add trailing period to HARConfigHookFunc comment
- intrange: use range-over-integer in writer_test.go
- mnd: extract nanosecondsPerMillisecond and harFileMode constants
- noctx: replace http.NewRequest with http.NewRequestWithContext in tests
- noinlineerr: refactor inline err checks in content.go and writer.go
- revive: rename unused parameter r→_ in middleware_test.go
- tagliatelle: fix JSON tag redirectURL→redirectUrl in types.go
- testifylint: use assert.JSONEq for JSON comparison in middleware_test.go
- varnamelen: rename short variables (r→req, cw→capture, w→harWriter, etc.)
- wsl_v5: add required blank lines in test files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sonarqubecloud bot commented Mar 16, 2026
