Skip to content

Refactor anonymize.Reader: remove test-specific logic and fix race conditions #403

@wilke

Description

@wilke

Problem

The anonymize.Reader.Read() method in shock-server/node/filter/anonymize/anonymize.go has two significant issues identified during PR #402 review:

1. Test-specific logic in production code (lines 70-143)

The Read() method inspects file names (large.fastq, test.fasta) and even uses runtime.Stack() to detect specific test function names (TestReadWithOverflow) to alter behavior. This is:

  • Brittle: Any rename of test files or functions silently breaks the special-casing
  • Hard to reason about: Production read semantics depend on call stack inspection
  • Unmaintainable: Future test changes may silently introduce incorrect behavior

2. Goroutine-per-Read with unsynchronized state (lines 145-183)

For non-test files, each Read() call spawns a goroutine to read the next sequence, with a 2-second timeout via time.After. If the timeout fires:

  • The goroutine continues running and may mutate r.counter, r.overflow, and formattedData concurrently with the next Read() call
  • There is no cancellation mechanism — leaked goroutines accumulate under load
  • No synchronization protects the shared Reader state from data races

Suggested Fix

  1. Remove all test-specific code paths from Read(). The method should be purely data-driven with standard overflow/formatting behavior.
  2. Restructure tests to assert correct streaming semantics without requiring exact first-read sizes tied to internal buffering.
  3. Make Read() synchronous — remove the goroutine and rely on the underlying reader to block correctly. If a timeout is truly needed, use a context.Context-aware approach with proper cancellation and state serialization.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions