Skip to content

Memoize safety evaluator#2812

Open
xanderbailey wants to merge 5 commits intodevelopfrom
xb/memoize_safety_eval
Open

Memoize safety evaluator#2812
xanderbailey wants to merge 5 commits intodevelopfrom
xb/memoize_safety_eval

Conversation

@xanderbailey
Copy link

Memoize SafetyEvaluator type traversals

Summary

  • SafetyEvaluator now caches computed safety results per type, avoiding redundant recursive traversals of the type graph
  • Results are only cached when no cycle was encountered during evaluation, preserving correctness for mutually-recursive type definitions

Motivation

JFR profiling of compileConjureObjects on a large IR (~4600 types) showed that 74% of CPU time was spent in SafetyEvaluator, with the top frames being TypeDefinitionSafetyVisitor.with() and SafetyEvaluator.combine(). Each call to evaluate() created a fresh visitor and re-traversed the entire reachable type graph from scratch. Since many types reference the same shared types, this resulted in massive redundant work — effectively O(n * m) where n is the number of types and m is the average reachable subgraph size.

What changed

A HashMap<TypeName, Optional<LogSafety>> cache is shared across all evaluations on a SafetyEvaluator instance. Before traversing a type's subgraph, with() checks the cache and returns immediately on a hit. After evaluation, the result is cached — but only if no cycle-breaking occurred in the subtree, since the cycle-breaking heuristic (assume SAFE) produces results that depend on evaluation order and should not be persisted.

Impact

Tested against a production IR with ~4600 types: compileConjureObjects dropped from 42s to 20s in Gradle.

Test plan

  • All existing conjure-java-core tests pass
  • Generated output verified identical to baseline against a production IR with ~4600 types

After this PR

==COMMIT_MSG==
Memoize safety evaluator
==COMMIT_MSG==

Possible downsides?

@changelog-app
Copy link

changelog-app bot commented Feb 23, 2026

Generate changelog in changelog/@unreleased

Type (Select exactly one)

  • Feature (Adding new functionality)
  • Improvement (Improving existing functionality)
  • Fix (Fixing an issue with existing functionality)
  • Break (Creating a new major version by breaking public APIs)
  • Deprecation (Removing functionality in a non-breaking way)
  • Migration (Automatically moving data/functionality to a new system)

Description

Memoize safety evaluator

Check the box to generate changelog(s)

  • Generate changelog entry

@changelog-app
Copy link

changelog-app bot commented Feb 23, 2026

Successfully generated changelog entry!

Need to regenerate?

Simply interact with the changelog bot comment again to regenerate these entries.


📋Changelog Preview

💡 Improvements

  • Memoize safety evaluator (#2812)

@xanderbailey
Copy link
Author

Did a dry run to check the size of the cache for this large IR and got:

SafetyEvaluator cache: 3437 entries, estimated 268 KB

Comment on lines +371 to +386
// Foo has a field referencing Bar, Bar has a field referencing Foo.
// Both fields are marked safe, so both types should evaluate as safe.
TypeDefinition foo = TypeDefinition.object(ObjectDefinition.builder()
.typeName(FOO)
.fields(FieldDefinition.builder()
.fieldName(FieldName.of("bar"))
.type(Type.reference(BAR))
.build())
.build());
TypeDefinition bar = TypeDefinition.object(ObjectDefinition.builder()
.typeName(BAR)
.fields(FieldDefinition.builder()
.fieldName(FieldName.of("foo"))
.type(Type.reference(FOO))
.build())
.build());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fields are not marked as safe. Shouldn't we be returning unknown/empty in this case? 🤔 (I don't think your change changes this, but this seems weird to me 🤔 )

Comment on lines +389 to +392
SafetyEvaluator evaluator = new SafetyEvaluator(conjureDef);
// Evaluation order should not matter for the result
assertThat(evaluator.evaluate(foo)).hasValue(LogSafety.SAFE);
assertThat(evaluator.evaluate(bar)).hasValue(LogSafety.SAFE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not checking the evaluation order here. Since you're already doing it in a dedicated test, might as well remove the comment?

Suggested change
SafetyEvaluator evaluator = new SafetyEvaluator(conjureDef);
// Evaluation order should not matter for the result
assertThat(evaluator.evaluate(foo)).hasValue(LogSafety.SAFE);
assertThat(evaluator.evaluate(bar)).hasValue(LogSafety.SAFE);
SafetyEvaluator evaluator = new SafetyEvaluator(conjureDef);
assertThat(evaluator.evaluate(foo)).hasValue(LogSafety.SAFE);
assertThat(evaluator.evaluate(bar)).hasValue(LogSafety.SAFE);

Comment on lines +201 to +209
// Save and reset cycle state so we can detect cycles within this type's subtree only.
boolean previousCycleState = encounteredCycle;
encounteredCycle = false;

Optional<LogSafety> result = task.get();

boolean subtreeHadCycle = encounteredCycle;
// Propagate cycle detection upward: if this subtree had a cycle, callers should know.
encounteredCycle = previousCycleState || subtreeHadCycle;
Copy link
Contributor

@aldexis aldexis Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's slightly confusing (to me) that we store the cyclic state in a field, unset it before calling the method, then reset it afterwards.

How would you feel about returning a pair from this method e.g. record(Optional<LogSafety> safety, boolean cycleDetected) and then use that instead here? (and maybe make a second method that extracts the safety result for usage in the rest of the code)

Edit: We'd want to check the memory impact of doing these allocations though

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I wanted to keep this lean. Want me to try that way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a bit of time to try this out, I'd be curious about the memory impact yeah (especially around the total allocations). If it's low enough, I feel like this would make the cognitive overhead of this piece of code a bit lower, and thus more maintainable.

If it's not, I think this should be fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants