Conversation
Generate changelog in
|
✅ Successfully generated changelog entry!Need to regenerate?Simply interact with the changelog bot comment again to regenerate these entries. 📋Changelog Preview💡 Improvements
|
|
Did a dry run to check the size of the cache for this large IR and got: |
| // Foo has a field referencing Bar, Bar has a field referencing Foo. | ||
| // Both fields are marked safe, so both types should evaluate as safe. | ||
| TypeDefinition foo = TypeDefinition.object(ObjectDefinition.builder() | ||
| .typeName(FOO) | ||
| .fields(FieldDefinition.builder() | ||
| .fieldName(FieldName.of("bar")) | ||
| .type(Type.reference(BAR)) | ||
| .build()) | ||
| .build()); | ||
| TypeDefinition bar = TypeDefinition.object(ObjectDefinition.builder() | ||
| .typeName(BAR) | ||
| .fields(FieldDefinition.builder() | ||
| .fieldName(FieldName.of("foo")) | ||
| .type(Type.reference(FOO)) | ||
| .build()) | ||
| .build()); |
There was a problem hiding this comment.
The fields are not marked as safe. Shouldn't we be returning unknown/empty in this case? 🤔 (I don't think your change changes this, but this seems weird to me 🤔 )
| SafetyEvaluator evaluator = new SafetyEvaluator(conjureDef); | ||
| // Evaluation order should not matter for the result | ||
| assertThat(evaluator.evaluate(foo)).hasValue(LogSafety.SAFE); | ||
| assertThat(evaluator.evaluate(bar)).hasValue(LogSafety.SAFE); |
There was a problem hiding this comment.
You're not checking the evaluation order here. Since you're already doing it in a dedicated test, might as well remove the comment?
| SafetyEvaluator evaluator = new SafetyEvaluator(conjureDef); | |
| // Evaluation order should not matter for the result | |
| assertThat(evaluator.evaluate(foo)).hasValue(LogSafety.SAFE); | |
| assertThat(evaluator.evaluate(bar)).hasValue(LogSafety.SAFE); | |
| SafetyEvaluator evaluator = new SafetyEvaluator(conjureDef); | |
| assertThat(evaluator.evaluate(foo)).hasValue(LogSafety.SAFE); | |
| assertThat(evaluator.evaluate(bar)).hasValue(LogSafety.SAFE); |
| // Save and reset cycle state so we can detect cycles within this type's subtree only. | ||
| boolean previousCycleState = encounteredCycle; | ||
| encounteredCycle = false; | ||
|
|
||
| Optional<LogSafety> result = task.get(); | ||
|
|
||
| boolean subtreeHadCycle = encounteredCycle; | ||
| // Propagate cycle detection upward: if this subtree had a cycle, callers should know. | ||
| encounteredCycle = previousCycleState || subtreeHadCycle; |
There was a problem hiding this comment.
It's slightly confusing (to me) that we store the cyclic state in a field, unset it before calling the method, then reset it afterwards.
How would you feel about returning a pair from this method e.g. record(Optional<LogSafety> safety, boolean cycleDetected) and then use that instead here? (and maybe make a second method that extracts the safety result for usage in the rest of the code)
Edit: We'd want to check the memory impact of doing these allocations though
There was a problem hiding this comment.
Yeah I wanted to keep this lean. Want me to try that way?
There was a problem hiding this comment.
If you have a bit of time to try this out, I'd be curious about the memory impact yeah (especially around the total allocations). If it's low enough, I feel like this would make the cognitive overhead of this piece of code a bit lower, and thus more maintainable.
If it's not, I think this should be fine
Memoize SafetyEvaluator type traversals
Summary
SafetyEvaluatornow caches computed safety results per type, avoiding redundant recursive traversals of the type graphMotivation
JFR profiling of
compileConjureObjectson a large IR (~4600 types) showed that 74% of CPU time was spent inSafetyEvaluator, with the top frames beingTypeDefinitionSafetyVisitor.with()andSafetyEvaluator.combine(). Each call toevaluate()created a fresh visitor and re-traversed the entire reachable type graph from scratch. Since many types reference the same shared types, this resulted in massive redundant work — effectively O(n * m) where n is the number of types and m is the average reachable subgraph size.What changed
A
HashMap<TypeName, Optional<LogSafety>>cache is shared across all evaluations on aSafetyEvaluatorinstance. Before traversing a type's subgraph,with()checks the cache and returns immediately on a hit. After evaluation, the result is cached — but only if no cycle-breaking occurred in the subtree, since the cycle-breaking heuristic (assumeSAFE) produces results that depend on evaluation order and should not be persisted.Impact
Tested against a production IR with ~4600 types:
compileConjureObjectsdropped from 42s to 20s in Gradle.Test plan
After this PR
==COMMIT_MSG==
Memoize safety evaluator
==COMMIT_MSG==
Possible downsides?