Multi-SWE-Bench: Evaluation Changes #47
Draft
Vladislav0Art wants to merge 67 commits into
…unused import optimization)
Remaining optimizations:
1. Import of a class from the same package.
2. Unused import.
Both imports above are removed when references are updated in a file that uses a renamed class/method.
Qodana Community for JVM
19 new problems were found
💡 Qodana analysis was run in the pull request mode: only the changed files were checked
View the detailed Qodana report
To be able to view the detailed Qodana report, you can either: …
  - name: 'Qodana Scan'
    uses: JetBrains/qodana-action@v2025.1.1
    with:
      upload-result: true
Contact us at qodana-support@jetbrains.com
…aming transformations
… `java.lang.Object` (don't filter)
…ilter overrides before grouping
…MethodTransformation`
…ethodTransformation`
…lassTransformation`
… in `RenameMethodTransformation`
… in `RenameClassTransformation`
…leFilters` method in `RenameVariableTransformation`
…nsformation` and update documentation
…hods in reverse-alphabetical order
…er` and related components
…ion and persistent memory components
…and CLI support via `RewriteProblemStatementStarter`
… and integrate with CLI starters
- Introduced `BenchmarkInstanceIO` for JSON parsing and transformation of benchmark records.
- Updated `RewriteProblemStatementStarter` and `TransformTextsStarter` to process `{title, body}` pairs and `resolved_issues`.
- Added `TextBlock` data class to support coherent updates across textual fields.
- Enhanced Gradle tasks `rewriteProblemStatement` and `transformMetamorphicTexts` with improved input/output handling.
…n-deterministic behavior during project close and renaming operations
… prevent inconsistent state during subsequent operations
… inconsistent behavior across transformations
…eordering
- Skip non-physical, compiled, and anonymous classes to avoid unintended modifications.
- Fix method filtering to exclude non-physical and compiled methods.
- Pre-validate method copyability to prevent partial class modifications during reordering.
- Add error handling for unexpected exceptions to improve reliability.
In multi-module projects with same-simple-name classes (e.g. fastjson v1-compat `com.alibaba.fastjson.JSON` alongside v2 `com.alibaba.fastjson2.JSON`), `MethodReferencesSearch`'s strict signature search drops call sites whose overload PSI cannot unambiguously bind. `RenameProcessor.findUsages()` never sees them, so they survive the rename with the old method name and break compilation (e.g. fastjson2 PR-82: `JSONReader.java:922` and `TypeUtils.java:187` left as `JsonMapper.toJSONString(...)` after the family rename).

Add a post-rename safety net inside `tryRenameMethodFamily`: walk every Java file in project scope and patch call sites whose `referenceName` matches the old name AND that either resolve into the family or that PSI failed to resolve while their qualifier still points at the family's containing class. Sites PSI resolves to a different method are left alone.

Logged via `Post-rename safety net: patched N missed call site(s)` so healthy runs report N=0.
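The patch rule described above can be sketched without any IntelliJ API. This is only an illustration under stated assumptions: `SafetyNetSketch`, `Resolution`, `Decision`, and `decide` are hypothetical names, and a call site is reduced to its reference name, its resolution outcome, and the FQN its qualifier points at.

```java
/** Hypothetical, PSI-free model of the post-rename patch rule; not the PR's actual code. */
final class SafetyNetSketch {
    /** What name resolution reported for a call site. */
    enum Resolution { FAMILY_MEMBER, OTHER_METHOD, UNRESOLVED }

    /** Outcome for one call site. */
    enum Decision { PATCH_RESOLVED_TO_FAMILY, PATCH_QUALIFIER_FALLBACK, LEAVE_ALONE }

    static Decision decide(String referenceName,
                           Resolution resolution,
                           String qualifierClassFqn, // may be null for unqualified calls
                           String oldName,
                           String familyClassFqn) {
        // Only sites that still carry the old method name are candidates.
        if (!oldName.equals(referenceName)) {
            return Decision.LEAVE_ALONE;
        }
        switch (resolution) {
            case FAMILY_MEMBER:
                // The site resolves into the renamed family: patch it.
                return Decision.PATCH_RESOLVED_TO_FAMILY;
            case UNRESOLVED:
                // Resolution failed; patch only if the qualifier still points
                // at the family's containing class.
                return familyClassFqn.equals(qualifierClassFqn)
                        ? Decision.PATCH_QUALIFIER_FALLBACK
                        : Decision.LEAVE_ALONE;
            default:
                // Resolved to a different method: never touch it.
                return Decision.LEAVE_ALONE;
        }
    }
}
```

Counting the two `PATCH_*` outcomes separately would give one number per patch reason, with 0 in both on a healthy run.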
Issue 1 (diagnostics): the post-rename safety net's reported patch counts were surprisingly large (e.g. 681 for `parseObject`, 529 for `toJSONString`). The numbers are real misses by `MethodReferencesSearch` strict-signature search, not over-matching, but the log made it hard to verify. Now log the count split into `resolved-to-family` vs `qualifier-fallback` buckets and print up to 10 sample `path:line (branch)` sites so the user can spot-check.

Issue 2 (wildcard import stripping): IntelliJ's `RenameProcessor` invokes `JavaCodeStyleManager.shortenClassReferences()` on every file whose references it rewrites, which can also strip unrelated `import static X.*;` lines on those files (observed: rename of `JSON` → `JsonCodec` removed `import static junit.framework.TestCase.*;` from test files, breaking `assertNull(...)` and failing test compile). There is no documented IntelliJ toggle (`IDEABKL-3561`) and the existing code-style settings only prevent CREATION of new wildcards.

Add `WildcardImportExpander`: a one-shot project-wide pass that runs before any transformation. For each Java file in project scope, replace `import static X.*;` and `import pkg.*;` with explicit single imports for the symbols actually used in the file. Each remaining import then points at a name PSI sees as referenced, so the optimizer cannot drop it. Wired into `TransformationService.applyTransformations` at the top.
Prevent accidental modifications by adding a check to exclude method reference identifiers located within annotation arguments during post-rename patching.
The first version used `(resolved as PsiMember).containingClass == targetClass` to attribute references to wildcards. Wrong because `import static X.*;` inherits: `TestCase` exposes `assertNull` declared on its super `Assert`, and PSI's resolver returns the declaring class.

Two failure modes on fastjson2 PR-82:
- Empty replacements: `Issue1344.java` only uses `assertNull(String)`. The resolver returned a member on `junit.framework.Assert`, equality rejected it, the wildcard was deleted with no single-import replacement.
- Multi-wildcard drops: files with both `org.junit.jupiter.api.Assertions.*` and `junit.framework.TestCase.*` lost names like `assertEquals(int,int)` from one of the two expansions because the resolver picked one origin.

Replace the equality check with a positive query against the target class's visible (inherited) members via `findMethodsByName(checkBases = true)` / `findFieldByName(checkBases = true)` / `findInnerClassByName(checkBases = true)`. Walk the file once, split unqualified refs into resolved / unresolved name sets, then per wildcard intersect with what the target class exposes. The same name can land in multiple wildcards' expansions; this is correct, since both originals exposed it.

Conservative keep: if a wildcard's `usedNames` is empty AND any unresolved reference in the file matches a name the target class would expose, leave the wildcard untouched. Better to retain a working wildcard than delete a load-bearing one. Stats now report `expanded N; kept M as conservative`.
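A minimal sketch of the per-wildcard decision described above, assuming the set of names the target class exposes (inherited members included) has already been computed. `WildcardExpansionSketch`, `Plan`, and `plan` are hypothetical names; the real code queries PSI rather than plain string sets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, PSI-free model of one static wildcard's expansion; not the PR's actual code. */
final class WildcardExpansionSketch {
    /** Either keep the wildcard untouched or replace it with these single imports. */
    record Plan(boolean keepWildcard, List<String> singleImports) {}

    static Plan plan(String targetFqn,
                     Set<String> exposedNames,    // names visible on the target, bases included
                     Set<String> resolvedRefs,    // unqualified names in the file that resolved
                     Set<String> unresolvedRefs)  // unqualified names that failed to resolve
    {
        // Intersect what the file uses with what the target class exposes.
        TreeSet<String> usedNames = new TreeSet<>(resolvedRefs);
        usedNames.retainAll(exposedNames);

        // Conservative keep: nothing attributable, yet an unresolved name might need it.
        boolean unresolvedMayNeedIt = unresolvedRefs.stream().anyMatch(exposedNames::contains);
        if (usedNames.isEmpty() && unresolvedMayNeedIt) {
            return new Plan(true, List.of());
        }

        // Otherwise expand into one explicit import per used name (sorted for stable output).
        List<String> imports = new ArrayList<>();
        for (String name : usedNames) {
            imports.add("import static " + targetFqn + "." + name + ";");
        }
        return new Plan(false, imports);
    }
}
```

Because membership is tested per wildcard, a name such as `assertEquals` used in a file with two assertion wildcards ends up in both expansions, which matches the behaviour the commit describes.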
The previous version captured `List<PsiImportStaticStatement>` once, then
iterated and rewrote each in its own `WriteCommandAction`. After the first
WriteCommandAction (`importList.add` × N + `wildcard.delete()`), the OTHER
captured `PsiImportStaticStatement` siblings became invalidated. The next
iteration's `wildcard.resolveTargetClass()` then threw
`PsiInvalidElementAccessException: containing file is null` and the entire
transformation pipeline aborted before even creating the memory file —
observed on fastjson2 PR-82 right after the
"[TransformationService] Pre-expanding wildcard imports project-wide..."
log line.
Structural fix — never hold PSI element references across mutations:
- The captured plan now stores `staticTargetFqns: List<String>` and
`regularPackageFqns: List<String>` instead of element references.
- Inside each rewrite's `WriteCommandAction` the wildcard is re-located by
scanning the live `importList.importStaticStatements` /
`importStatements` and matching by `importReference.qualifiedName` +
`isOnDemand` + `isValid`. If the wildcard is gone, we silently skip.
- The target `PsiClass` is re-resolved fresh inside the WriteCommandAction
via `JavaPsiFacade.findClass(fqn, allScope)`.
Defense in depth — the expander is best-effort and never aborts the
pipeline:
- New `safeReadAction(fallback) { ... }` helper wraps every read action,
rethrowing only `ProcessCanceledException` / `InterruptedException`.
- `expandAll` and `expandInFile` wrap per-file / per-wildcard work in
`try/catch (Throwable)`, logging WARN with file path + FQN and bumping
`filesFailed` / `wildcardsFailed` counters.
- The visitor blocks in `summarizeStaticRefs` and `collectClassUses` now
swallow per-node throwables.
- `Stats` extended with `filesFailed` and `wildcardsFailed`; final log
line: "Pre-processed N files; expanded M; kept K conservative;
failed F file(s) / W wildcard(s)".
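The structural fix above (capture stable keys, re-locate the live element inside each mutation) can be illustrated with a toy model. Everything here is an assumption made for illustration: `ImportNode` and `ImportList` are hypothetical stand-ins for PSI elements, and the model deliberately invalidates every node on each rewrite to exaggerate the hazard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model; it only mimics PSI invalidation and is not the PR's code. */
final class RelocateByKeySketch {
    /** Stand-in for an import element that becomes invalid once the tree is rewritten. */
    static final class ImportNode {
        final String qualifiedName;
        final boolean onDemand;
        boolean valid = true;
        ImportNode(String qualifiedName, boolean onDemand) {
            this.qualifiedName = qualifiedName;
            this.onDemand = onDemand;
        }
        boolean isOnDemand() { return onDemand; }
    }

    /** Stand-in for a live import list whose nodes are rebuilt on every mutation. */
    static final class ImportList {
        private List<ImportNode> nodes = new ArrayList<>();
        void add(String fqn, boolean onDemand) { nodes.add(new ImportNode(fqn, onDemand)); }
        List<ImportNode> live() { return nodes; }
        void replace(ImportNode wildcard, List<String> singles) {
            List<ImportNode> next = new ArrayList<>();
            for (ImportNode n : nodes) {
                n.valid = false; // every previously handed-out node goes stale
                if (n != wildcard) next.add(new ImportNode(n.qualifiedName, n.onDemand));
            }
            for (String s : singles) next.add(new ImportNode(s, false));
            nodes = next;
        }
    }

    /** Expands every wildcard; returns how many were rewritten. */
    static int expandAll(ImportList list, Map<String, List<String>> singlesByTargetFqn) {
        // The plan holds strings only, never element references.
        List<String> plan = new ArrayList<>();
        for (ImportNode n : list.live()) {
            if (n.onDemand) plan.add(n.qualifiedName);
        }
        int expanded = 0;
        for (String fqn : plan) {
            // Re-locate the wildcard in the live list right before mutating.
            ImportNode wildcard = null;
            for (ImportNode n : list.live()) {
                if (n.valid && n.onDemand && n.qualifiedName.equals(fqn)) { wildcard = n; break; }
            }
            if (wildcard == null) continue; // already gone: skip silently
            list.replace(wildcard, singlesByTargetFqn.getOrDefault(fqn, List.of()));
            expanded++;
        }
        return expanded;
    }
}
```

Had the plan stored `ImportNode` references instead of strings, the second iteration would have operated on a stale node, which is the analogue of the `PsiInvalidElementAccessException` reported above.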
- Introduced `TimeoutException` handling on move operations with a 3-minute limit.
- Added logging to warn about timed-out suggestions before proceeding to the next.
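One way to bound each operation and move on after a timeout is `Future.get` with a deadline. The sketch below is an assumption about the general shape, not the PR's code: `MoveTimeoutSketch` and `applyWithTimeout` are hypothetical names, and the caller would pass `Duration.ofMinutes(3)` to match the limit mentioned above.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Hypothetical sketch: apply each suggestion with a time limit, warn and continue on timeout. */
final class MoveTimeoutSketch {
    /** Returns how many suggestions completed within the limit. */
    static <T> int applyWithTimeout(List<T> suggestions, Consumer<T> move, Duration limit) {
        // A cached pool keeps a stuck task from blocking the next suggestion.
        ExecutorService executor = Executors.newCachedThreadPool();
        int applied = 0;
        try {
            for (T suggestion : suggestions) {
                Future<?> future = executor.submit(() -> move.accept(suggestion));
                try {
                    future.get(limit.toMillis(), TimeUnit.MILLISECONDS);
                    applied++;
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the worker, then go to the next suggestion
                    System.err.println("WARN: suggestion timed out after " + limit + ": " + suggestion);
                } catch (ExecutionException e) {
                    System.err.println("WARN: suggestion failed: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                    break;
                }
            }
        } finally {
            executor.shutdownNow();
        }
        return applied;
    }
}
```

Cancellation here is cooperative: a task that ignores interruption keeps running in the background until it finishes on its own.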
…mpts in `FixImportHunks` agent
…` transformations in `FixImportHunks`
…leTransformation` to avoid additional modifications in base+test_patch run
…ansformation` with exponential backoff and robustness enhancements
…by explicitly attaching transitive overriders to the rename processor

`RenameProcessor`'s implicit override expansion via `RenameJavaMethodProcessor.prepareRenaming` + `OverridingMethodsSearch` silently dropped sibling overloads' overriders when multiple same-name overloads were renamed to the same target name through a single processor: only the seed overload's overrider got renamed, the others kept the old name. The post-rename safety net `verifyAndPatchMissedCallSites` only inspected `PsiMethodCallExpression` nodes, never `PsiMethod` declarations, so missed override definitions were invisible to it.

Reproducer: an abstract base `A` with `write(JsonValue)` + `write(String)` and `B extends A` overriding both. After renaming `A.write` to `A.writeTo`, `B.write(JsonValue)` followed but `B.write(String)` was left dangling with the old name.

In `tryRenameMethodFamily`, after the existing overload-sibling `addElement` loop, enumerate transitive overriders via `OverridingMethodsSearch.search(method, checkDeep = true)` for every family method and attach each (skipping family members already added and overriders in libraries) as a first-class rename target. Adding them explicitly forces the same rename path that already works for the seed; when implicit expansion would have caught them anyway, it is a no-op via `myAllRenames` dedup, so the change is backward compatible.
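The enumeration step can be sketched on a plain class-hierarchy model, using the `A`/`B` reproducer above. This is a simplification under stated assumptions: `OverriderAttachmentSketch`, `Method`, and `renameTargets` are hypothetical, and "overrides" is approximated by equal name and parameter list, which ignores visibility, statics, and generics.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, PSI-free model of collecting transitive overriders as rename targets. */
final class OverriderAttachmentSketch {
    record Method(String owner, String name, String params, boolean inLibrary) {}

    static Set<Method> renameTargets(List<Method> family,
                                     Map<String, List<String>> directSubclasses,
                                     Map<String, List<Method>> declaredMethods) {
        // Start with the overload family itself; the set deduplicates later additions.
        Set<Method> targets = new LinkedHashSet<>(family);
        for (Method base : family) {
            Deque<String> work = new ArrayDeque<>(directSubclasses.getOrDefault(base.owner(), List.of()));
            Set<String> seen = new HashSet<>();
            while (!work.isEmpty()) {
                String cls = work.pop();
                if (!seen.add(cls)) continue;
                for (Method candidate : declaredMethods.getOrDefault(cls, List.of())) {
                    boolean overrides = candidate.name().equals(base.name())
                            && candidate.params().equals(base.params());
                    // Library overriders cannot be renamed, so they are skipped.
                    if (overrides && !candidate.inLibrary()) targets.add(candidate);
                }
                // Descend further: overriders may sit several levels down.
                work.addAll(directSubclasses.getOrDefault(cls, List.of()));
            }
        }
        return targets;
    }
}
```

For the reproducer, the result holds all four methods, so `B.write(String)` is renamed explicitly instead of relying on implicit expansion.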
…sformation` with exponential backoff

Mirror the robust LLM call path already used in `RenameVariableTransformation`: each file now issues a single batched LLM request listing every overload family (chunked at LLM_BATCH_SIZE=20 to keep prompt size manageable since each entry embeds a method body), with up to LLM_MAX_ATTEMPTS=3 retries per batch and exponential backoff (4s → 8s, capped at 12s) on transient failures. `ProcessCanceledException` is rethrown per IntelliJ contract; permanent failures return an empty list so affected families are skipped (no rename, no memory write) without failing the whole transformation, and other batches and other files keep progressing.

Wire format: replace `MethodNameSuggestions(suggestions)` with `MethodFamilyRenaming(familyKey, suggestions)` + `MethodRenameSuggestions(renamings)`. Each overload family carries a precomputed `familyKey` of the form `<classFqn>#<methodName>[<static|instance>]` so the model can echo it back per entry; the `[static]/[instance]` tag prevents same-name static/instance siblings in the same class from colliding in the batch. `generateRenamesForFamilies` now matches results back via `familyKey == family.familyKey` and pipes them through the unchanged `buildSuggestionList` helper.

`extractRenamesFromMemoryForFamilies`, `tryRenameMethodFamily`, the overrider-attachment / verify-and-patch / rich-logging code, and the memory key format are untouched.
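The batching, retry, and key scheme described above might look roughly like this. The constants and the `familyKey` format come from the commit message; everything else (`BatchedRetrySketch`, `Sleeper`, `callWithRetry`, and the use of `CancellationException` as a stand-in for `ProcessCanceledException`) is a hypothetical simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.function.Function;

/** Hypothetical sketch of chunked LLM calls with capped exponential backoff. */
final class BatchedRetrySketch {
    static final int LLM_BATCH_SIZE = 20;
    static final int LLM_MAX_ATTEMPTS = 3;

    /** Injectable sleep so the retry loop can be exercised without real delays. */
    interface Sleeper { void sleep(long millis) throws InterruptedException; }

    /** Key of the form <classFqn>#<methodName>[<static|instance>]. */
    static String familyKey(String classFqn, String methodName, boolean isStatic) {
        return classFqn + "#" + methodName + "[" + (isStatic ? "static" : "instance") + "]";
    }

    /** Splits items into consecutive batches of at most `size` elements. */
    static <T> List<List<T>> chunk(List<T> items, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            batches.add(new ArrayList<>(items.subList(i, Math.min(i + size, items.size()))));
        }
        return batches;
    }

    /** Delay before retry number `retry` (1-based): 4s, 8s, then capped at 12s. */
    static long backoffMillis(int retry) {
        return Math.min(4000L << (retry - 1), 12000L);
    }

    /** Returns the call's result, or an empty list once all attempts have failed. */
    static <T, R> List<R> callWithRetry(List<T> batch, Function<List<T>, List<R>> llmCall, Sleeper sleeper) {
        for (int attempt = 1; attempt <= LLM_MAX_ATTEMPTS; attempt++) {
            try {
                return llmCall.apply(batch);
            } catch (CancellationException e) {
                throw e; // cancellation is never swallowed or retried
            } catch (RuntimeException e) {
                if (attempt == LLM_MAX_ATTEMPTS) break;
                try {
                    sleeper.sleep(backoffMillis(attempt));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return List.of(); // permanent failure: the caller skips these families
    }
}
```

With three attempts, only the 4s and 8s delays are ever used; the 12s cap would matter only if the attempt limit were raised.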
…oting discovery reads to smart-mode
`RenameProcessor.run()` + `commitAllDocuments()` + `saveAllDocuments()`
drops the IDE back into dumb mode while the stub index is recomputed,
so the next file's transformation can land mid-reindex. Any read action
that resolves a super-class hierarchy then hits the stub index and
throws. Reproduced on Multi-SWE-Bench:
- `RenameClassTransformation` — `psiClass.allFields` walks `MemberCache`
→ `getSupers()` → `findSpecialSuperClass()` → `JavaFullClassNameIndex`.
- `RenameMethodTransformation` — `method.findSuperMethods()` builds a
hierarchical signature → `getSuperTypes()` →
`findClass(java.lang.Object)`.
Per the `IndexNotReadyException` Javadoc, promote the topmost read
action to smart mode. Add a `withSmartReadAction(project) { ... }`
companion helper on `IntelliJAwareTransformation` (mirrors
`withReadAction` but uses suspending `smartReadAction(project)`, so the
block waits for index readiness before running) and use it at the
discovery sites:
- `RenameClassTransformation.apply` — the `findAllValidClasses(...)`
wrap (also covers the per-class `ReferencesSearch.search(cls)` and
`fileIndex.isInTestSourceContent(...)` calls inside).
- `RenameClassTransformation.generateNewClassNames` — the PSI-context
extraction (covers `psiClass.allFields` — the exact line that threw).
- `RenameMethodTransformation.apply` — the
`findAllValidMethodFamilies(...)` wrap (covers `findSuperMethods()`,
`psiClass.supers`, and `ReferencesSearch.search(method)` inside the
override-filter walk).
`tryRenameClassAndUsages` and `tryRenameMethodFamily` keep using plain
`withReadAction { ... }` — they run after the discovery walk (the
natural index-settling point) and adding `runBlocking` inside their
existing `invokeAndWait` envelopes risks deadlocks. Cancellation
contract preserved: `smartReadAction` honours `ProcessCanceledException`
propagation.
No description provided.