Race condition in database status updates causes spurious warnings during crate processing #71

@rrrodzilla

Problem

When processing crates via the HTTP endpoint, the system logs warnings indicating database validation failures even though the crate processing completes successfully:

WARN: Cannot update acton-reactive@5.0.0 to 'chunked' status: No doc_chunks found in database (expected 18)
WARN: Cannot update acton-reactive@5.0.0 to 'vectorized' status: No embeddings found in database (expected 18)
WARN: Cannot update acton-reactive@5.0.0 to 'complete' status: Crate record not found in database

To the user these warnings look like errors, although nothing has actually failed: the status updates are simply attempted too early.

Root Cause

Race Condition in Message Ordering

The DatabaseActor subscribes to DocumentationChunked and DocumentationVectorized messages and immediately attempts to validate that chunks/embeddings exist in the database and update the crate status. However, these validation queries run BEFORE the actual PersistDocChunk/PersistEmbedding messages have been processed.

Current Broken Flow:

ProcessorActor finishes chunking
    ↓ IMMEDIATELY broadcasts
DocumentationChunked → DatabaseActor tries to validate chunks → FINDS 0 ❌
    ↓ ALSO broadcasts (async, processed later)
PersistDocChunk (x18) → DatabaseActor processes these → Inserts chunks ✓

The CrateCoordinatorActor correctly tracks DocChunkPersisted and EmbeddingPersisted messages to know when persistence is complete, but the DatabaseActor doesn't wait for this confirmation before attempting status updates.
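
The ordering problem can be reproduced without any actor framework. The following is a minimal sketch, not the project's code: it assumes the DatabaseActor drains a single FIFO mailbox, and `Msg`, `broken_mailbox`, and `run_database_actor` are hypothetical names invented for illustration. Only the message names come from this issue.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-ins for two of the messages named in this issue.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Msg {
    /// Announces that chunking finished and how many chunks to expect.
    DocumentationChunked { expected: usize },
    /// Asks the database actor to insert one chunk.
    PersistDocChunk,
}

/// Builds a mailbox in the broken order: the announcement first,
/// then one persist request per chunk.
fn broken_mailbox(expected: usize) -> VecDeque<Msg> {
    let mut mailbox = VecDeque::new();
    mailbox.push_back(Msg::DocumentationChunked { expected });
    for _ in 0..expected {
        mailbox.push_back(Msg::PersistDocChunk);
    }
    mailbox
}

/// Drains the mailbox in FIFO order (an assumption of this sketch).
/// Returns the number of chunks stored and any warnings produced.
fn run_database_actor(mut mailbox: VecDeque<Msg>) -> (usize, Vec<String>) {
    let mut stored = 0usize;
    let mut warnings = Vec::new();
    while let Some(msg) = mailbox.pop_front() {
        match msg {
            Msg::DocumentationChunked { expected } => {
                // Validation runs before any PersistDocChunk was handled.
                if stored < expected {
                    warnings.push(format!(
                        "Cannot update to 'chunked' status: found {} doc_chunks (expected {})",
                        stored, expected
                    ));
                }
            }
            Msg::PersistDocChunk => stored += 1,
        }
    }
    (stored, warnings)
}

fn main() {
    let (stored, warnings) = run_database_actor(broken_mailbox(18));
    println!("stored = {}", stored);
    for w in &warnings {
        println!("WARN: {}", w);
    }
}
```

In this model all 18 chunks end up stored, yet one warning is still emitted, which matches the symptom described above: the data is fine, only the timing of the check is wrong.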

Proposed Solution

Remove premature validation from DatabaseActor and add coordination messages:

  1. DatabaseActor should STOP subscribing to DocumentationChunked and DocumentationVectorized
  2. CrateCoordinatorActor should broadcast NEW messages after confirming persistence:
    • ChunksPersistenceComplete - after all DocChunkPersisted received
    • EmbeddingsPersistenceComplete - after all EmbeddingPersisted received
  3. DatabaseActor should subscribe to these new messages and update status then

Corrected Flow:

ProcessorActor broadcasts DocumentationChunked + PersistDocChunk(x18)
    ↓
DatabaseActor processes PersistDocChunk → broadcasts DocChunkPersisted
    ↓
CrateCoordinatorActor tracks DocChunkPersisted → when count==expected
    ↓ broadcasts
ChunksPersistenceComplete → DatabaseActor updates status to "chunked" ✓
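
The coordinator side of the corrected flow could look roughly like the sketch below. It is written in plain Rust and deliberately avoids the actor framework's API. `ChunkTracker`, its fields, and the fields of `ChunksPersistenceComplete` are assumptions made for illustration; the issue only names the message and the `persisted_chunks.len() == expected_chunks` condition. Returning `Option` stands in for "broadcast the message".

```rust
use std::collections::HashSet;

/// Hypothetical shape of the new completion message.
#[derive(Debug, PartialEq)]
struct ChunksPersistenceComplete {
    crate_name: String,
    version: String,
}

/// Hypothetical slice of coordinator state for one crate.
struct ChunkTracker {
    crate_name: String,
    version: String,
    expected_chunks: usize,
    persisted_chunks: HashSet<usize>,
    completion_sent: bool,
}

impl ChunkTracker {
    fn new(crate_name: &str, version: &str, expected_chunks: usize) -> Self {
        ChunkTracker {
            crate_name: crate_name.to_string(),
            version: version.to_string(),
            expected_chunks,
            persisted_chunks: HashSet::new(),
            completion_sent: false,
        }
    }

    /// Records one DocChunkPersisted confirmation. Returns the completion
    /// message exactly once, when every expected chunk has been confirmed.
    fn on_doc_chunk_persisted(&mut self, chunk_index: usize) -> Option<ChunksPersistenceComplete> {
        // A set makes duplicate confirmations harmless.
        self.persisted_chunks.insert(chunk_index);
        let all_persisted = self.persisted_chunks.len() == self.expected_chunks;
        if all_persisted && !self.completion_sent {
            self.completion_sent = true;
            Some(ChunksPersistenceComplete {
                crate_name: self.crate_name.clone(),
                version: self.version.clone(),
            })
        } else {
            None
        }
    }
}

fn main() {
    let mut tracker = ChunkTracker::new("acton-reactive", "5.0.0", 18);
    for i in 0..18 {
        if let Some(done) = tracker.on_doc_chunk_persisted(i) {
            println!("{}@{}: all chunks persisted", done.crate_name, done.version);
        }
    }
}
```

The `completion_sent` flag is a design choice of this sketch: it guards against broadcasting the completion twice if a confirmation is redelivered.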

Implementation Tasks

  • Create new message types:
    • ChunksPersistenceComplete in src/messages/chunks_persistence_complete.rs
    • EmbeddingsPersistenceComplete in src/messages/embeddings_persistence_complete.rs
  • Update src/messages/mod.rs to export new messages
  • Update CrateCoordinatorActor to broadcast completion messages:
    • When persisted_chunks.len() == expected_chunks, broadcast ChunksPersistenceComplete
    • When persisted_vectors.len() == expected_vectors, broadcast EmbeddingsPersistenceComplete
  • Update DatabaseActor:
    • Remove DocumentationChunked and DocumentationVectorized handlers
    • Add ChunksPersistenceComplete handler to update status to "chunked"
    • Add EmbeddingsPersistenceComplete handler to update status to "vectorized"
    • Validation queries can now be removed since persistence is confirmed by coordinator
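
For the DatabaseActor tasks, the replacement handlers reduce to unconditional status updates. The sketch below illustrates that idea with an in-memory map in place of the real database. `StatusTable`, `CrateStatus`, and the message fields are hypothetical; the actual status storage and handler signatures in src/actors/database.rs may differ.

```rust
use std::collections::HashMap;

/// Hypothetical status values; the issue mentions "chunked" and "vectorized".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CrateStatus {
    Pending,
    Chunked,
    Vectorized,
}

/// Hypothetical shapes for the two new messages.
struct ChunksPersistenceComplete {
    crate_name: String,
    version: String,
}

struct EmbeddingsPersistenceComplete {
    crate_name: String,
    version: String,
}

/// In-memory stand-in for the crate status table.
#[derive(Default)]
struct StatusTable {
    statuses: HashMap<String, CrateStatus>,
}

impl StatusTable {
    fn key(crate_name: &str, version: &str) -> String {
        format!("{}@{}", crate_name, version)
    }

    fn register(&mut self, crate_name: &str, version: &str) {
        self.statuses.insert(Self::key(crate_name, version), CrateStatus::Pending);
    }

    /// No validation query: the coordinator has already confirmed persistence.
    fn on_chunks_complete(&mut self, msg: &ChunksPersistenceComplete) {
        self.statuses
            .insert(Self::key(&msg.crate_name, &msg.version), CrateStatus::Chunked);
    }

    fn on_embeddings_complete(&mut self, msg: &EmbeddingsPersistenceComplete) {
        self.statuses
            .insert(Self::key(&msg.crate_name, &msg.version), CrateStatus::Vectorized);
    }

    fn status(&self, crate_name: &str, version: &str) -> Option<CrateStatus> {
        self.statuses.get(&Self::key(crate_name, version)).copied()
    }
}

fn main() {
    let mut table = StatusTable::default();
    table.register("acton-reactive", "5.0.0");
    table.on_chunks_complete(&ChunksPersistenceComplete {
        crate_name: "acton-reactive".to_string(),
        version: "5.0.0".to_string(),
    });
    table.on_embeddings_complete(&EmbeddingsPersistenceComplete {
        crate_name: "acton-reactive".to_string(),
        version: "5.0.0".to_string(),
    });
    println!("{:?}", table.status("acton-reactive", "5.0.0"));
}
```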

Testing

Test with the following curl command to verify warnings no longer appear:

curl -X POST http://localhost:3333/crate \
    -H "Content-Type: application/json" \
    -d '{"name": "acton-reactive", "version": "5.0.0", "features": ["instrument"]}'

Expected: No warnings in logs, crate processes successfully to completion.
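
If the check should be automated rather than done by eye, captured log output can be scanned for the warning pattern quoted in the Problem section. This is only a sketch: `spurious_warnings` is a hypothetical helper, and it assumes log lines keep the `WARN: Cannot update ...` shape shown above.

```rust
/// Returns every log line that looks like one of the premature
/// status-update warnings quoted in this issue.
fn spurious_warnings(log: &str) -> Vec<&str> {
    log.lines()
        .map(str::trim)
        .filter(|line| line.starts_with("WARN:") && line.contains("Cannot update"))
        .collect()
}

fn main() {
    // Sample taken from the warnings reported in this issue.
    let log = "\
WARN: Cannot update acton-reactive@5.0.0 to 'chunked' status: No doc_chunks found in database (expected 18)
WARN: Cannot update acton-reactive@5.0.0 to 'vectorized' status: No embeddings found in database (expected 18)";
    let hits = spurious_warnings(log);
    println!("{} spurious warning(s) found", hits.len());
}
```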

Files Affected

  • src/actors/crate_coordinator_actor.rs - Add broadcasts after persistence tracking
  • src/actors/database.rs - Replace handlers and remove validation queries
  • src/messages/chunks_persistence_complete.rs - New message type
  • src/messages/embeddings_persistence_complete.rs - New message type
  • src/messages/mod.rs - Export new messages

Labels: bug, database
