feat(ntx-builder): deactivate accounts which crash repeatedly#1712

Open
SantiagoPittella wants to merge 9 commits into santiagopittella-ntx-builder-actor-deactivation from santiagopittella-ntx-builder-account-blacklisting

Conversation

@SantiagoPittella
Collaborator

Implements account blacklisting for the NTX Builder (4th task from #1694).

Track crash counts per account in the coordinator. When an actor shuts down due to a DbError, its crash count is incremented. Once it reaches a configurable threshold, the account is blacklisted and spawn_actor skips it.

Only DbError shutdowns count as crashes because other shutdown reasons (Cancelled, IdleTimeout, SemaphoreFailed) are either intentional or system-wide and not indicative of a per-account bug.
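
As a rough illustration of the counting rule described above, here is a minimal sketch. The four shutdown reasons are the ones named in this description; everything else (`CrashTracker`, `record_shutdown`, `is_deactivated`, the `max_crashes` field, and `AccountId` as a plain `u64`) is a hypothetical stand-in and not the code in this PR.

```rust
use std::collections::HashMap;

/// Stand-in for the real account identifier type (assumption).
type AccountId = u64;

/// Shutdown reasons named in the PR description; any payloads are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShutdownReason {
    Cancelled,
    IdleTimeout,
    SemaphoreFailed,
    DbError,
}

/// Hypothetical per-account crash bookkeeping held by the coordinator.
struct CrashTracker {
    /// Configurable threshold at which an account is deactivated.
    max_crashes: u32,
    crash_counts: HashMap<AccountId, u32>,
}

impl CrashTracker {
    fn new(max_crashes: u32) -> Self {
        Self { max_crashes, crash_counts: HashMap::new() }
    }

    /// Records a shutdown; only `DbError` increments the crash count.
    fn record_shutdown(&mut self, account_id: AccountId, reason: ShutdownReason) {
        if reason == ShutdownReason::DbError {
            *self.crash_counts.entry(account_id).or_insert(0) += 1;
        }
    }

    /// True once the account's crash count has reached the threshold.
    fn is_deactivated(&self, account_id: AccountId) -> bool {
        self.crash_counts
            .get(&account_id)
            .map_or(false, |&count| count >= self.max_crashes)
    }
}
```

In this sketch the intentional or system-wide reasons never touch the counter, so an account that only ever idles out or gets cancelled can never be deactivated.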

tracing::warn!(
%account_id,
crash_count = count,
"Account blacklisted due to repeated crashes, skipping actor spawn"
Contributor

Q: should we make this more explicit? E.g. use tracing::error!, since it signifies a bug in our impl, or deliberately add "BUG" to the message?

Collaborator

We should already be logging the crash itself, so this is probably okay at warn since it will repeat often for each account.

return;
}
}

Contributor

Q: should we first check for running actors and terminate them?

Collaborator Author

I don't think so. By the time the crash threshold is reached, the actor has already shut itself down. DbError shutdowns are handled in next(), which removes the actor from the registry before any re-spawn attempt. So when spawn_actor is called for a deactivated account, there's no running actor to terminate.
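
The ordering argument in this reply can be sketched as follows. This is an assumption-laden toy model, not the coordinator in this PR: `Coordinator`, its fields, and `on_db_error_shutdown` (standing in for the DbError branch of `next()`) are hypothetical names, and `AccountId` is a plain `u64`.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for the real account identifier type (assumption).
type AccountId = u64;

/// Hypothetical coordinator state; field names are illustrative only.
struct Coordinator {
    max_crashes: u32,
    crash_counts: HashMap<AccountId, u32>,
    /// Accounts that currently have a running actor.
    registry: HashSet<AccountId>,
}

impl Coordinator {
    fn new(max_crashes: u32) -> Self {
        Self { max_crashes, crash_counts: HashMap::new(), registry: HashSet::new() }
    }

    /// Stand-in for the DbError branch of `next()`: the actor has already
    /// stopped, so it is removed from the registry before the crash is counted.
    fn on_db_error_shutdown(&mut self, account_id: AccountId) {
        self.registry.remove(&account_id);
        *self.crash_counts.entry(account_id).or_insert(0) += 1;
    }

    /// Returns false if the account was skipped because it is deactivated.
    fn spawn_actor(&mut self, account_id: AccountId) -> bool {
        let crashes = self.crash_counts.get(&account_id).copied().unwrap_or(0);
        if crashes >= self.max_crashes {
            // The count only grows in `on_db_error_shutdown`, which removes the
            // actor first, so no live actor can exist for this account here.
            debug_assert!(!self.registry.contains(&account_id));
            return false;
        }
        self.registry.insert(account_id);
        true
    }
}
```

Under these assumptions the crash count can only increase on a path that has already dropped the actor from the registry, which is why the skip branch has nothing to terminate.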

assert_eq!(inactive_targets[0], inactive_id);
}

// BLACKLIST TESTS
Contributor

nit: we might want to find a different term for "blacklist", e.g. repeated_failure_block_list, to avoid any confusion / FUD

Collaborator

I'd prefer something much shorter -- we're already overly verbose and stuttery throughout the codebase imo. Maybe deactivated_accounts

Collaborator

@Mirko-von-Leipzig left a comment

Just naming nits :) I guess blacklist was the wrong term to use -- let's try deactivated instead 🙃

@Mirko-von-Leipzig changed the title from "feat(ntx-builder): blacklist accounts whose actors crash repeatedly" to "feat(ntx-builder): deactivate accounts which crash repeatedly" on Mar 9, 2026
@SantiagoPittella force-pushed the santiagopittella-ntx-builder-actor-deactivation branch from ebf9d91 to 870292b on March 11, 2026 17:59
@SantiagoPittella force-pushed the santiagopittella-ntx-builder-account-blacklisting branch from b993661 to e681868 on March 11, 2026 18:33
@SantiagoPittella force-pushed the santiagopittella-ntx-builder-account-blacklisting branch from c5f91d8 to d0c58c2 on March 11, 2026 19:22