feat: frame control, input simulation, invoke enhancements, and screenshot improvements #4

Draft
shinomailer wants to merge 6 commits into master from shino/3-gameplay-and-frame-control-tools

@shinomailer
Contributor

shinomailer commented Apr 2, 2026

Motivation

When an LLM/agent drives a Unity game via MCP, three fundamental gaps block effective gameplay:

  1. Frame pacing — The game runs at 60 FPS while the LLM processes at ~1 req/s. The agent needs to pause, step, and advance frames on its own schedule.
  2. Input simulation — The agent has no way to press keys or click the mouse. invoke can call game methods directly, but many game systems (movement, building placement, weapon firing) are driven by input polling in Update().
  3. Observation — Screenshots captured the wrong panel, came out flipped, or missed UI overlays. invoke couldn't target specific component instances or access DontDestroyOnLoad objects.

This PR addresses all three, turning the MCP bridge into a complete gameplay automation toolkit.


Changes

1. Frame Control Tools (issue #3)

Commits: d69e79e, f325c7b

Adds three new actions to the unity_editor meta-tool:

| Action | Description |
| --- | --- |
| step_frame | Advance exactly one frame via EditorApplication.Step() |
| set_update_frequency | Get/set Time.timeScale and Time.captureFramerate. No-arg call acts as getter. captureFramerate is the recommended knob for deterministic LLM-driven simulation |
| play_for_frames(N) | Async advance N frames then pause, with domain reload recovery via SessionState + AssemblyReloadEvents + bridge replay |
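
To make the pacing arithmetic concrete, here is a minimal sketch of how much simulated time a frame batch covers. It relies only on the relation quoted in this PR's commit message (deltaTime = timeScale/captureFramerate). The `FramePacing` class and its method names are hypothetical and are not part of the bridge.

```csharp
using System;

// Hypothetical helper, not part of the PR: plain arithmetic, no Unity dependency.
public static class FramePacing
{
    // Simulated seconds per frame, using the relation quoted in the commit message:
    // deltaTime = timeScale / captureFramerate (captureFramerate must be non-zero).
    public static double DeltaTime(double timeScale, int captureFramerate)
    {
        if (captureFramerate <= 0)
            throw new ArgumentOutOfRangeException(nameof(captureFramerate), "must be positive for fixed-step pacing");
        return timeScale / captureFramerate;
    }

    // Simulated game time covered by advancing a given number of frames.
    public static double SimulatedSeconds(int frames, double timeScale, int captureFramerate)
    {
        if (frames < 0)
            throw new ArgumentOutOfRangeException(nameof(frames));
        return frames * DeltaTime(timeScale, captureFramerate);
    }
}
```

Under these assumptions, advancing 30 frames with a time scale of 0.2 and a capture framerate of 60 covers roughly 0.1 simulated seconds, however long the agent takes between calls.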

2. Invoke Enhancements

Commit: e71fffa

Addresses gaps found during MCP gameplay testing:

  • Instance targeting: instance_id and game_object params let you target a specific component instance instead of always getting the first FindObjectOfType match
  • DontDestroyOnLoad visibility: Fallback search in DDOL scene for ResolveInstance; expose DDOL roots in SceneGetHierarchy during Play Mode
  • Serialization fixes: Use UnityJsonSerializer.Settings so Vector3/Quaternion/Color serialize as {x,y,z} instead of including computed properties. Expand CoerceArg to handle Vector4, Quaternion, Color, Rect, Bounds
  • Member enumeration: Wildcard resolve_method queries (Type.*) to list all public members, enabling LLM self-discovery of available APIs
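
The member enumeration idea can be illustrated with plain .NET reflection. This is only a sketch of the general technique, not the PR's resolve_method code: `MemberLister` and `DemoSpawner` are hypothetical names, and the real tool may filter or format members differently.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical sketch: list the public members a type declares itself.
public static class MemberLister
{
    public static string[] ListPublicMembers(Type type)
    {
        if (type == null) throw new ArgumentNullException(nameof(type));

        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance |
                                   BindingFlags.Static | BindingFlags.DeclaredOnly;

        return type.GetMembers(flags)
            // Hide compiler-generated accessors such as get_X / set_X.
            .Where(m => !(m is MethodInfo mi && mi.IsSpecialName))
            // Constructors are rarely useful for discovery.
            .Where(m => m.MemberType != MemberTypes.Constructor)
            .Select(m => $"{m.MemberType}: {m.Name}")
            .OrderBy(s => s, StringComparer.Ordinal)
            .ToArray();
    }
}

// Stand-in type for illustration only.
public class DemoSpawner
{
    public int CurrentWave { get; set; }
    public void StartWave(int wave) { CurrentWave = wave; }
}
```

Calling `MemberLister.ListPublicMembers(typeof(DemoSpawner))` yields two entries, one for the `StartWave` method and one for the `CurrentWave` property. That is the kind of listing an agent can use instead of guessing member names.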

3. Input Simulation

Commits: 6ae1b75, 7bc67fe

Adds unity_input tool with full keyboard and mouse support:

  • Actions: key_down, key_up, type_text, mouse_button_down, mouse_button_up, mouse_move, mouse_scroll, click, release_all
  • Native input hook: Unity's NativeInputRuntime overwrites injected device state with real hardware events every frame. We hook onUpdate via reflection (same technique as Unity's InputTestFixture) to let hardware events through, then overwrite keyboard/mouse state with our synthetic values after each update
  • Coordinate conversion: Mouse positions auto-convert from screenshot space (top-left origin) to InputSystem screen space (bottom-left origin) using Camera.pixelHeight, working around the known Unity Editor bug where Screen.height returns the focused window's height instead of the Game View's
  • Focus management: FocusGameView() called on mouse actions so Screen.width/height return correct values for UGUI GraphicRaycaster hit testing
  • editorInputBehaviorInPlayMode: Set to AllDeviceInputAlwaysGoesToGameView so input routes to the game regardless of editor focus
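
The coordinate conversion amounts to flipping the Y axis against the height of the rendered view. The sketch below shows that idea in isolation. `ScreenshotCoords.ToScreenSpace` is a hypothetical name, and the PR does not say whether it uses `height - y` or a pixel-index form such as `height - 1 - y`, so treat the exact formula as an assumption.

```csharp
using System;

// Hypothetical helper: convert a point from screenshot space (origin top-left,
// Y grows downward) to screen space (origin bottom-left, Y grows upward).
public static class ScreenshotCoords
{
    public static (float x, float y) ToScreenSpace(float x, float yFromTop, float viewHeight)
    {
        if (viewHeight <= 0f)
            throw new ArgumentOutOfRangeException(nameof(viewHeight), "view height must be positive");

        // Assumed continuous-coordinate flip; X is unchanged.
        return (x, viewHeight - yFromTop);
    }
}
```

With a 1080-pixel-high view, a point 100 pixels below the top edge maps to a Y of 980 measured from the bottom. Passing the camera's pixel height, as the PR describes, keeps the result independent of which editor window is focused.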

4. Screenshot Improvements

Commit: 6ae1b75

  • Multi-mode capture: mode parameter with auto (default), game_view (coroutine-based ScreenCapture), window (Game View m_RenderTexture via reflection), camera (direct Camera.Render)
  • Fallback chain: auto tries coroutine → window → camera, ensuring a screenshot always succeeds
  • Y-flip fix: ReadPixels from RenderTexture returns Y-inverted data; FlipTextureVertically corrects this in the window capture path
  • UI overlay support: window and game_view modes include Screen Space - Overlay UI (health bars, building cards, etc.), unlike camera mode which only renders the camera's view
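
The fallback chain follows a common pattern: try each capture strategy in order and return the first usable result. The sketch below shows that pattern with placeholder delegates. `CaptureFallback` is a hypothetical name, and the real capture paths (coroutine, Game View render texture, camera render) are not reproduced here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an ordered fallback over named capture strategies.
public static class CaptureFallback
{
    public static (string mode, byte[] data) Capture(
        IEnumerable<KeyValuePair<string, Func<byte[]>>> strategies)
    {
        if (strategies == null) throw new ArgumentNullException(nameof(strategies));

        var errors = new List<string>();
        foreach (var strategy in strategies)
        {
            try
            {
                byte[] data = strategy.Value();
                if (data != null && data.Length > 0)
                    return (strategy.Key, data); // first usable result wins
                errors.Add($"{strategy.Key}: empty result");
            }
            catch (Exception ex)
            {
                // Record why this mode failed, then fall through to the next one.
                errors.Add($"{strategy.Key}: {ex.Message}");
            }
        }
        throw new InvalidOperationException("All capture modes failed: " + string.Join("; ", errors));
    }
}
```

Returning the mode name alongside the image data lets the caller see which path actually produced the screenshot. That matters here because, per the list above, camera mode omits Screen Space - Overlay UI.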

5. Misc

  • execute_menu_item action added to unity_editor meta-tool
  • GameObjectLookup: Better error messages for unknown search methods
  • NATIVE_MCP_HAS_INPUT_SYSTEM: Version define in asmdef for conditional compilation when InputSystem package is present

6. Refactoring (code quality)

Commit: 7bc67fe

  • Domain reload safety: [InitializeOnLoad] ReloadWatcher resets all static state (held keys, reflection hooks, config flags) on domain reload, preventing stale hooks after recompilation
  • Logging convention: Debug.Log/Warn/Error → McpLog.Info/Warn/Error across all new code
  • Cached reflection: NativeInputInterceptor caches MethodInfo/target/object[] to avoid per-frame reflection and allocation overhead
  • Click Y-flip: Extracted ConvertScreenshotY() helper shared by HandleMouseMove and HandleClick (previously only HandleMouseMove did the conversion)
  • Exception specificity: Bare catch {} narrowed to catch (ArgumentException) in HandleReleaseAll; McpLog.Warn added to SceneScreenshot.FocusGameView
  • Type caching: GameView type reflection lookup cached in static field
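
The cached-reflection point can be illustrated outside Unity. The sketch below resolves a MethodInfo once and reuses a single argument array across calls. `CachedInvoker` and `Counter` are hypothetical stand-ins rather than the PR's NativeInputInterceptor, and boxing of value-type arguments still allocates.

```csharp
using System;
using System.Reflection;

// Hypothetical sketch: look the method up once, reuse the argument array per call.
public sealed class CachedInvoker
{
    private readonly MethodInfo _method;
    private readonly object _target;
    private readonly object[] _args = new object[1]; // reused on every call

    public CachedInvoker(object target, string methodName)
    {
        _target = target ?? throw new ArgumentNullException(nameof(target));
        _method = target.GetType().GetMethod(
                      methodName,
                      BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
                  ?? throw new MissingMethodException(target.GetType().FullName, methodName);
    }

    public object Invoke(object arg)
    {
        _args[0] = arg;
        try { return _method.Invoke(_target, _args); }
        finally { _args[0] = null; } // do not keep the argument alive between calls
    }
}

// Stand-in target with a non-public method, for illustration only.
public class Counter
{
    public int Total;
    private void Add(int n) { Total += n; }
}
```

A caller would construct `new CachedInvoker(counter, "Add")` once and call `Invoke` every frame. The name lookup and array allocation then happen at setup time rather than in the hot path.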

Test plan

  • EditMode unit tests pass (parameter validation, non-play-mode error paths)
  • step_frame / play_for_frames / set_update_frequency work in Play Mode
  • Keyboard input: WASD movement, 1-5 building selection, F placement all verified via screenshot comparison
  • Mouse input: Y-flip coordinate conversion verified — clicks register on UGUI elements
  • Screenshot mode=window: correct orientation, includes overlay UI
  • invoke instance targeting: game_object param targets specific player instances
  • Domain reload: input hook re-installs correctly after recompilation
  • play_for_frames domain reload recovery (full cycle)

Closes #3

🤖 Generated with Claude Code

shinomailer and others added 3 commits April 1, 2026 22:04
Adds three new MCP tools to let LLM/agents control Unity's game loop
at inference speed, addressing the 60 FPS vs LLM throughput mismatch.

New internal tools (exposed via unity_editor meta-tool):
- editor_step_frame: advance exactly one frame via EditorApplication.Step()
- editor_set_update_frequency: get/set Time.timeScale and Time.captureFramerate;
  no-arg call acts as a getter. captureFramerate is the recommended knob for
  deterministic LLM-driven simulation (deltaTime = timeScale/captureFramerate)
- editor_play_for_frames: async advance N frames then pause, with full domain
  reload recovery via SessionState + AssemblyReloadEvents + bridge replay

Also adds EditMode unit tests covering parameter validation and non-play-mode
error paths for all three tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@shinomailer
Contributor Author

MCP Gameplay Session — Difficulty Report

Session: 2026-04-02, Branch: shino/test-tool-and-runner
Game: ApplianceGoneWild (wave-based tower defense, FishNet networking)
MCP Tools: unity-mcp-native via unity-mcp-bridge

Executive Summary

An LLM attempted to play the game entirely through MCP tools, controlling pacing with frame-control tools (step_frame, play_for_frames, set_update_frequency). Singleton-based game state was fully observable and controllable, but individual game entity interaction was severely limited by the inability to target specific object instances and the absence of input simulation.


What Worked Well

| Capability | Example | Notes |
| --- | --- | --- |
| Play/Pause/Stop lifecycle | editor play/pause/stop | Reliable |
| Frame-precise advancement | play_for_frames(30) | Exact frame counts, deterministic with capture_framerate |
| Time scale control | set_update_frequency(time_scale=0.2) | Slow-motion observation worked perfectly |
| Singleton state queries | GameManager.CurrentState, NetworkWaveSpawner.CurrentWaveState | Returned correct enum values |
| Singleton method calls | StartWave(1), DebugSkipCurrentWave(), ForceCompleteLevelForDebug() | Full wave lifecycle controlled |
| Economy manipulation | EconomyManager.AddTeamMaterials(500) | 32 → 532 confirmed |
| Cross-scene object search | find(by_component, "ZombieBase") | Found 166 zombies across DontDestroyOnLoad |
| Console log reading | read_console(types=["error"]) | Captured NavMesh errors, NullRefs |
| Domain reload recovery | Play → MCP unreachable → retry after 5-8s | Server auto-reconnects |

Difficulties Encountered

Critical (Blocks Gameplay)

D1. No Input Simulation

Impact: Cannot move players, place turrets, interact with objects (F key), aim weapons.
Consequence: Players stand still → enemies swarm → GameOver inevitable.
Workaround: Debug methods (DebugSkipCurrentWave, ForceCompleteLevelForDebug).
Suggested Fix: Register DynamicToolRegistry functions:

  • move_player(playerIndex, direction, distance)
  • place_turret(prefabName, x, y, z, rotation)
  • interact(playerIndex) — triggers F-key interaction
  • fire_weapon(playerIndex, targetX, targetY, targetZ)

D2. Cannot Target Specific Instances via invoke

Impact: call_method on NetworkPlayerHealth.CurrentHealth always returns Player2; can't query Player1. ZombieBase.Health always returns the same zombie.
Consequence: Can only observe/affect ONE arbitrary instance per component type. Can't iterate, can't choose.
Suggested Fix: Add target parameter to invoke (instanceID or name), e.g.:

invoke(method="NetworkPlayerHealth.CurrentHealth", target={instanceID: -9512})

D3. unity_edit Cannot Reach DontDestroyOnLoad Objects

Impact: set_property, modify_gameobject fail with "not found" for all network-spawned objects (players, enemies, turrets).
Consequence: The most important runtime objects are invisible to the edit tools. Only find and invoke can see them.
Suggested Fix: Extend GameObject lookup in edit tools to search DontDestroyOnLoad scene.

Major (Severely Limits Observation)

D4. get_tree Cannot See DontDestroyOnLoad

Impact: get_tree(componentFilter="ZombieBase") returns 0 nodes. All spawned entities are invisible.
Workaround: Use find(by_component) which does search cross-scene, but returns only instance IDs — no position, health, or other data.
Suggested Fix: Include DontDestroyOnLoad in get_tree results, or add a sceneIndex value for it.

D5. Screenshots Capture Arbitrary Editor Panels

Impact: Three attempts captured: Scene view (useful), a 2076x20 Game view strip (useless), Inspector panel (useless).
Cause: ScreenCapture.CaptureScreenshotAsTexture captures the focused/Game view, but its size depends on the editor layout.
Suggested Fix: Force Game view focus/resize before capture, or render from a specific camera to a RenderTexture.

D6. invoke Cannot Serialize Complex Return Values

Impact: Transform.position, GameObject.Find(), Resources.Load() all return opaque SuccessResponse — the actual data (Vector3, GameObject reference) is lost.
Consequence: Can't read positions, can't chain operations (load prefab → instantiate).
Suggested Fix: Add serialization for common Unity types: Vector3, Vector2, Quaternion, Color, Bounds. For Unity Object references, return {name, instanceID, type}.

D7. invoke Cannot Chain Operations

Impact: Can't do GameManager.Player1.transform.position — each call is isolated, and complex objects can't be passed between calls.
Consequence: Multi-step operations (load → instantiate → configure → place) are impossible.
Suggested Fix: Support dotted property chains in method parameter, or add a batch_invoke that passes results between steps.

Moderate (Causes Friction)

D8. Namespace Collisions in Type Resolution

Impact: Transform → UnityEngine.Rendering.RadeonRays.Transform (not UnityEngine.Transform). Time → Codice.Client.Common.Time (Plastic SCM).
Workaround: Use full namespace: UnityEngine.Transform.position.
Suggested Fix: Prioritize UnityEngine.* namespace when resolving ambiguous types, or add a namespace parameter.
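
One way the suggested prioritization could work is sketched below. It operates on full type names as strings so it stays self-contained. `TypeNameResolver` is a hypothetical name, and a real resolver would presumably walk loaded assemblies instead of taking a candidate list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: resolve a simple type name, preferring one namespace.
public static class TypeNameResolver
{
    public static string Resolve(
        string simpleName,
        IEnumerable<string> candidateFullNames,
        string preferredNamespace = "UnityEngine")
    {
        var matches = candidateFullNames
            .Where(full => full == simpleName ||
                           full.EndsWith("." + simpleName, StringComparison.Ordinal))
            .ToList();
        if (matches.Count == 0) return null;

        // An exact hit in the preferred namespace always wins.
        string exact = preferredNamespace + "." + simpleName;
        if (matches.Contains(exact)) return exact;

        // Otherwise prefer nested namespaces of the preferred one, then shorter names.
        return matches
            .OrderBy(full => full.StartsWith(preferredNamespace + ".", StringComparison.Ordinal) ? 0 : 1)
            .ThenBy(full => full.Length)
            .ThenBy(full => full, StringComparer.Ordinal)
            .First();
    }
}
```

Given the two collisions reported above, this ordering would pick UnityEngine.Transform over UnityEngine.Rendering.RadeonRays.Transform and UnityEngine.Time over Codice.Client.Common.Time, while fully qualified names would still work as an explicit override.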

D9. resolve_method Cannot List All Members

Impact: Empty query on NetworkWaveSpawner. returns "No members found" — have to guess property names or read source code.
Suggested Fix: Support empty/wildcard queries to enumerate all public members of a type.

D10. Network Prefab Instantiation via create_gameobject Fails at Runtime

Impact: Turret prefab instantiated via create_gameobject wasn't properly FishNet-spawned, causing it to self-destruct.
Workaround: None viable; the prefab needs a server-side FishNet network spawn (ServerManager.Spawn()).
Suggested Fix: Add spawn_network_object(prefabPath, position, rotation) DynamicToolRegistry function, or detect NetworkObject on prefab and auto-spawn.

D11. play_for_frames Timeout at High Frame Counts

Impact: 300 frames at 5x speed with heavy AI load timed out (30s wall-clock).
Workaround: Use smaller batches (100 frames).
Suggested Fix: Allow larger timeout, or add progress callback.
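
The batching workaround can be expressed as a small planning step on the caller's side. This is a sketch under the assumption that each batch is then sent as its own play_for_frames call. `FrameBatcher` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: split a large frame count into batches that each
// fit comfortably inside the per-call timeout.
public static class FrameBatcher
{
    public static List<int> SplitIntoBatches(int totalFrames, int maxBatch)
    {
        if (totalFrames < 0) throw new ArgumentOutOfRangeException(nameof(totalFrames));
        if (maxBatch <= 0) throw new ArgumentOutOfRangeException(nameof(maxBatch));

        var batches = new List<int>();
        for (int remaining = totalFrames; remaining > 0; remaining -= maxBatch)
            batches.Add(Math.Min(remaining, maxBatch));
        return batches;
    }
}
```

For the 300-frame case above, `SplitIntoBatches(300, 100)` plans three calls of 100 frames each. The agent can also take a screenshot or read state between batches.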

D12. High Time Scale Makes UI Unreadable

Impact: At 5x speed, settlement/game-over screens appear and vanish faster than the LLM can observe or react.
Workaround: Use 1x speed and advance in small frame batches.
Suggested Fix: Auto-pause on GameState transitions (e.g., pause when WaveComplete/GameOver triggers).


Recommended DynamicToolRegistry Functions

These game-specific tools would bridge the gap between MCP capabilities and gameplay needs:

// Player control
DynamicToolRegistry.Register("move_player",
    "Move player by index to a world position or by delta",
    (args) => { /* teleport or translate player */ });

DynamicToolRegistry.Register("player_interact",
    "Trigger interaction (F key equivalent) for a player",
    (args) => { /* call interaction system */ });

// Building
DynamicToolRegistry.Register("spawn_and_place_turret",
    "Network-spawn a turret prefab at a position",
    (args) => { /* Resources.Load → Instantiate → NetworkSpawn → Place */ });

// Combat
DynamicToolRegistry.Register("kill_all_enemies",
    "Kill all active zombies",
    (args) => { /* iterate ActiveEnemies, TakeDamage(9999) */ });

DynamicToolRegistry.Register("get_all_enemy_states",
    "Return position, health, type for all living enemies",
    (args) => { /* serialize enemy list as JSON */ });

// Observation
DynamicToolRegistry.Register("get_game_snapshot",
    "Return full game state: players, enemies, buildings, wave, economy",
    (args) => { /* aggregate all key state into one JSON response */ });

Session Timeline

| Time | Action | Result |
| --- | --- | --- |
| Phase 0 | Check scene | GamePlay loaded, Demo art scene |
| Phase 1 | Enter play mode | 2 players spawned, WaitingForNextWave |
| Phase 2 | Set 0.2x slow-mo, advance 30 frames | Frame control works precisely |
| Phase 2 | StartWave(1) | Combat started, zombies spawning |
| Phase 3 | set_property on Player (DontDestroyOnLoad) | FAILED: object not found |
| Phase 3 | Set Transform.position via invoke | FAILED: moved "One shot audio" instead of player |
| Phase 3 | create_gameobject (turret prefab) | Created, but self-destructed (no network spawn) |
| Phase 3 | TakeDamage(9999) on zombie | SUCCESS: killed one zombie |
| Phase 3 | AddTeamMaterials(500) | SUCCESS: 32 → 532 |
| Phase 4 | Advance 300 frames at 2x | Timeout at 299/300 |
| Phase 4 | Advance at 5x | GameOver: settlement screen flashed by |
| Phase 5 | Restart, StartWave → DebugSkipCurrentWave → ForceCompleteLevel | SUCCESS: reached Victory state |

Conclusion

MCP frame-control tools are excellent for pacing, but the current toolset creates an asymmetry: the LLM can observe and manipulate game-level singletons (wave system, economy, level progress) but cannot meaningfully interact with individual game entities (players, enemies, buildings). The three highest-impact improvements would be:

  1. Instance targeting for invoke — specify which object to call methods on
  2. DontDestroyOnLoad support in get_tree and unity_edit
  3. DynamicToolRegistry game-specific functions — custom tools for player movement, turret placement, and state snapshots

@Danielmelody
Contributor

Danielmelody commented Apr 2, 2026

@codex extract the tool-level problems from the ad hoc game-level ones in this report and PR

@Codex

Codex AI commented Apr 2, 2026

@Danielmelody The model is not available for your account. This can happen if the model was disabled by your organization's policy or if your Copilot plan doesn't include access to it.

You can try again without specifying a model (just @copilot) to use the default, or choose a different model from the model picker.

If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: 0b884bc9-d40b-4a89-bb22-ae8d249c377e

LambdaLabsHQ deleted a comment from Codex AI Apr 2, 2026
LambdaLabsHQ deleted a comment from chatgpt-codex-connector Bot Apr 2, 2026
LambdaLabsHQ deleted a comment from Codex AI Apr 2, 2026
@Danielmelody
Contributor

@codex extract the tool-level problems from the ad hoc game-level ones in this report and PR

@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

@Danielmelody
Contributor

@codex extract the tool-level problems from the ad hoc game-level ones in this report and PR

@chatgpt-codex-connector

Summary

  • Added a new extraction document that separates core tool-level/platform issues from ad-hoc game/project integration gaps based on the PR report context, so follow-up work can be scoped cleanly.
  • Captured a prioritized list of core unity-mcp-native issues (instance targeting, DontDestroyOnLoad visibility parity, serialization, chaining, type resolution, reflection discoverability, screenshot determinism, and frame-timeout ergonomics).
  • Isolated game-specific needs (input bridge, networking-aware spawn, game snapshot tooling) into a separate section to avoid mixing platform defects with title-specific implementation work.
  • Added a practical “what to do where” split and explicit tie-back to PR #3 (Control how unity update game) so future PRs can target either core platform changes or game-level tooling intentionally.

Testing

  • git status --short && wc -l TOOL_LEVEL_ISSUES_FROM_GAMEPLAY_REPORT.md
  • git add TOOL_LEVEL_ISSUES_FROM_GAMEPLAY_REPORT.md && git commit -m "docs: extract tool-level issues from gameplay difficulty report"
  • git status --short && git log -1 --oneline

@Danielmelody
Contributor

Danielmelody commented Apr 2, 2026

@codex for tool level control, we could add user input trajectory along with play with frame, but this can be another feature request. Game-level issues are off topic.

@Codex

Codex AI commented Apr 2, 2026

@Danielmelody The model is not available for your account. This can happen if the model was disabled by your organization's policy or if your Copilot plan doesn't include access to it.

You can try again without specifying a model (just @copilot) to use the default, or choose a different model from the model picker.

If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: f0d5e020-ec28-46d2-9a32-e99b0093bbc0

@Danielmelody
Contributor

However, DontDestroyOnLoad should be resolved @shinomailer

@chatgpt-codex-connector

Summary

Testing

  • git status --short && wc -l TOOL_LEVEL_ISSUES_FROM_GAMEPLAY_REPORT.md
  • git add TOOL_LEVEL_ISSUES_FROM_GAMEPLAY_REPORT.md && git commit -m "docs: separate tool-level issues from game-specific report feedback"
  • git log -1 --oneline

Also completed:

  • ✅ Created a follow-up PR message via make_pr with title “docs: extract tool-level backlog from PR #3 gameplay report feedback” and a summary/body aligned to the trigger request.

@Danielmelody
Contributor

Disabled Copilot for github

shinomailer and others added 3 commits April 4, 2026 14:29
…erialization fixes, and member enumeration

Address four key gaps identified in the MCP gameplay report:

- D2: Add instance_id and game_object params to invoke for targeting specific
  component instances instead of always getting the first FindObjectOfType match
- D3/D4: Add DontDestroyOnLoad fallback in ResolveInstance and expose DDOL roots
  in SceneGetHierarchy during play mode
- D6: Use UnityJsonSerializer.Settings in WrapResult so Vector3/Quaternion/Color
  etc. serialize cleanly as {x,y,z} instead of including computed properties;
  expand CoerceArg to handle Vector4, Quaternion, Color, Rect, Bounds
- D9: Support wildcard resolve_method queries (Type.* or Type.) to enumerate all
  public members of a type, enabling LLM self-discovery of available APIs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ecution

- Add keyboard/mouse input simulation via InputSystem low-level events
  - Hook NativeInputRuntime.onUpdate to prevent hardware state overwrite
  - Track held keys/buttons and re-inject after each native update
  - Auto Y-flip mouse coordinates from screenshot space to screen space
  - Focus Game View on mouse actions to fix Screen.width/height for UGUI raycasting
- Fix screenshot Y-flip for window capture mode (ReadPixels from RenderTexture)
- Add multi-mode screenshot capture (auto/game_view/window/camera) with fallback chain
- Add execute_menu_item action to unity_editor tool
- Improve search method error messages in GameObjectLookup
- Add NATIVE_MCP_HAS_INPUT_SYSTEM version define for InputSystem detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add [InitializeOnLoad] ReloadWatcher to reset static state on domain
  reload, preventing stale reflection hooks after recompilation
- Replace Debug.Log/Warn/Error with McpLog to follow project convention
- Extract ConvertScreenshotY() helper and apply Y-flip in HandleClick
  (was only done in HandleMouseMove, causing click coordinate mismatch)
- Cache MethodInfo/target/args array in NativeInputInterceptor to avoid
  per-frame reflection extraction and object[] allocation
- Narrow bare catch to ArgumentException in HandleReleaseAll
- Add McpLog.Warn to SceneScreenshot.FocusGameView bare catch
- Cache GameView type reflection lookup in static field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shinomailer changed the title from "feat: add game update frequency control tools (issue #3)" to "feat: frame control, input simulation, invoke enhancements, and screenshot improvements" Apr 4, 2026
@shinomailer
Contributor Author

Status Update — All Critical Issues Resolved

Responding to the Difficulty Report (D1-D12) and @Danielmelody's feedback. Here's where everything stands after the latest commits:

Resolved in This PR

| Issue | Status | How |
| --- | --- | --- |
| D1. No Input Simulation | ✅ Resolved | New unity_input tool with full keyboard (key_down/up, type_text) and mouse (mouse_move/button_down/up, click, scroll) support. Hooks NativeInputRuntime.onUpdate via reflection to prevent hardware state overwrite, the same technique as Unity's InputTestFixture. |
| D2. Cannot Target Specific Instances | ✅ Resolved | invoke now accepts instance_id and game_object params to target specific component instances. |
| D3. unity_edit Can't Reach DDOL Objects | ✅ Resolved | GameObject lookup falls back to DontDestroyOnLoad scene during Play Mode. |
| D4. get_tree Can't See DDOL | ✅ Resolved | get_hierarchy exposes DDOL roots during Play Mode. |
| D5. Screenshots Capture Wrong Panel | ✅ Resolved | Multi-mode screenshot (auto/game_view/window/camera) with fallback chain. window mode reads Game View's internal m_RenderTexture (includes overlay UI, Y-flip corrected). |
| D6. invoke Can't Serialize Complex Types | ✅ Resolved | Vector3, Vector2, Vector4, Quaternion, Color, Rect, Bounds now serialize cleanly via UnityJsonSerializer.Settings. |
| D8. Namespace Collisions | ✅ Mitigated | Full namespace still required for ambiguous types, but wildcard queries (Type.*) now let the LLM discover available APIs. |
| D9. resolve_method Can't List Members | ✅ Resolved | Wildcard queries (Type.* or Type.) enumerate all public members. |

Not Addressed (Out of Scope for Tool-Level)

| Issue | Status | Notes |
| --- | --- | --- |
| D7. invoke Can't Chain Operations | Deferred | Would require a mini expression evaluator or batch-invoke with result passing. Significant complexity. |
| D10. Network Prefab Instantiation | Out of scope | Game-specific: requires FishNet.ServerManager.Spawn(). Best handled via game-side DynamicToolRegistry functions. |
| D11. play_for_frames Timeout | Partially mitigated | timeout param already exists (default 30s). Users can pass larger values. Auto-scaling could be added later. |
| D12. High Time Scale UI Readability | Out of scope | Game-specific: auto-pause on state transitions would need game code hooks. |

@Danielmelody's Points

> for tool level control, we could add user input trajectory along with play with frame

Agreed this would be a natural extension. Current approach: key_down → play_for_frames(N) → key_up works well for sustained input. A combined play_for_frames_with_input(frames, keys, mouse_trajectory) could bundle these into a single call for efficiency, but the current decomposed approach is functional and flexible.

> DontDestroyOnLoad should be resolved

Done — invoke instance resolution, get_hierarchy, and unity_edit all search DDOL now (commit e71fffa).

Mouse Coordinate Note

Mouse clicks work for UGUI (Canvas + InputSystemUIInputModule) with auto Y-flip conversion from screenshot coordinates. IMGUI (OnGUI) is not affected by InputSystem mouse state — but this only impacts the editor-side debug level picker, not actual game UI. All gameplay UI uses UGUI.

Danielmelody force-pushed the master branch 3 times, most recently from 8bd679f to 474e9dd, April 4, 2026 20:41
@Danielmelody
Contributor

Most features of this PR may need to be rearranged for unity-repl. Maybe there should be another repo for how we operate Unity. We can discuss later.

@Danielmelody
Contributor

I have removed most manipulation tools to keep things clean. That doesn't mean they are unimportant; they just need a better architecture.

@Danielmelody
Contributor

Has this feature been moved to unity-agent-input?

Development

Successfully merging this pull request may close these issues.

Control how unity update game

3 participants