Use separate in-memory types #1913

Open
jmpesp wants to merge 1 commit into oxidecomputer:main from jmpesp:different_in_memory_types

jmpesp commented Mar 23, 2026

Serializing and deserializing raw JSON to and from the Crucible Agent's data file creates a potential versioning problem, one that would not be caught by the normal validation performed to make sure that our API surface doesn't change.

Consider the following:

  • a normal update occurs that changes the Crucible Agent's version
    • that newer version's serialized data file is not compatible with the older version
  • some Bad Situation occurs and an operator decides to MUPdate back to the previous version

In this scenario, the older Crucible Agent will not be able to deserialize what the newer Agent serialized, adding to the existing Bad Situation. In a future where Nexus pays attention to Crucible health, Agents that cannot start could leave Upstairs stuck, and that may cascade to interrupt other operations while Nexus waits for repairs or reconciliations.

This commit separates the in-memory types used in the `DataFile` struct from the types committed to disk (and adds a comment that warns against changing the on-disk types), and further distinguishes between Region states and RunningSnapshot states.
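To illustrate the idea for reviewers, here is a minimal sketch of the pattern, not the code in this commit. All type and variant names below (`OnDiskState`, `RegionState`, `RunningSnapshotState`, and especially the `Cloning` variant and its mapping to `Requested`) are assumptions made up for the example. The point it tries to show is that the in-memory enums can grow new variants while the on-disk enum stays frozen, because every write goes through an explicit conversion.

```rust
/// HYPOTHETICAL on-disk state. In this sketch it is the only type that would
/// ever be serialized, so it must not change between Agent versions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OnDiskState {
    Requested,
    Created,
    Failed,
}

/// HYPOTHETICAL in-memory state for a region. It is free to gain variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegionState {
    Requested,
    /// In-memory only; invented for this sketch.
    Cloning,
    Created,
    Failed,
}

/// HYPOTHETICAL in-memory state for a running snapshot, kept distinct from
/// `RegionState` so the two cannot be mixed up by accident.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunningSnapshotState {
    Requested,
    Created,
    Failed,
}

impl From<RegionState> for OnDiskState {
    fn from(s: RegionState) -> Self {
        match s {
            // Assumption: an in-memory-only variant is flattened to a state
            // that an older Agent already understands.
            RegionState::Requested | RegionState::Cloning => OnDiskState::Requested,
            RegionState::Created => OnDiskState::Created,
            RegionState::Failed => OnDiskState::Failed,
        }
    }
}

impl From<OnDiskState> for RegionState {
    fn from(s: OnDiskState) -> Self {
        match s {
            OnDiskState::Requested => RegionState::Requested,
            OnDiskState::Created => RegionState::Created,
            OnDiskState::Failed => RegionState::Failed,
        }
    }
}

impl From<RunningSnapshotState> for OnDiskState {
    fn from(s: RunningSnapshotState) -> Self {
        match s {
            RunningSnapshotState::Requested => OnDiskState::Requested,
            RunningSnapshotState::Created => OnDiskState::Created,
            RunningSnapshotState::Failed => OnDiskState::Failed,
        }
    }
}

impl From<OnDiskState> for RunningSnapshotState {
    fn from(s: OnDiskState) -> Self {
        match s {
            OnDiskState::Requested => RunningSnapshotState::Requested,
            OnDiskState::Created => RunningSnapshotState::Created,
            OnDiskState::Failed => RunningSnapshotState::Failed,
        }
    }
}
```

Because the `match` arms are exhaustive, adding an in-memory variant in this sketch forces a compile-time decision about how it is represented on disk, rather than silently changing the serialized format.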

This work is a prerequisite to getting the Agent to work on multiple read-only region clones concurrently, as that work adds a new State. Internal discussion of the Bad Situation scenario described above led to the idea of using separate in-memory types.

Additionally, fix a bug where the `key_pem` field stored on disk would be included in the error message that results from requesting a region that matches an existing one except for that field. Luckily, no production system uses Crucible's support for X509 yet, and this endpoint is not exposed to users.
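One common way to avoid this class of leak is sketched below. It is not necessarily how this commit fixes it. The `RegionRequest` struct, its `id` and `cert_pem` fields, and the `mismatch_error` helper are hypothetical stand-ins; only the `key_pem` field name comes from the description above. The sketch uses a hand-written `Debug` impl so the key material never reaches a formatted message.

```rust
use std::fmt;

/// HYPOTHETICAL request type; only the `key_pem` name is taken from the PR text.
pub struct RegionRequest {
    pub id: String,
    pub cert_pem: Option<String>,
    pub key_pem: Option<String>,
}

impl fmt::Debug for RegionRequest {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("RegionRequest")
            .field("id", &self.id)
            .field("cert_pem", &self.cert_pem)
            // Show whether a key is present, but never its contents.
            .field("key_pem", &self.key_pem.as_ref().map(|_| "<redacted>"))
            .finish()
    }
}

/// HYPOTHETICAL helper that builds the mismatch message using `Debug` output.
pub fn mismatch_error(existing: &RegionRequest, requested: &RegionRequest) -> String {
    format!(
        "requested region {:?} does not match existing region {:?}",
        requested, existing
    )
}
```

With the manual `Debug` impl, any error path that formats the struct with `{:?}` gets the redacted form by default, so a new call site cannot reintroduce the leak without going out of its way.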

jmpesp requested a review from leftwo on March 23, 2026 15:11