Click accuracy: return bounding box and viewport context on failure #2

@rafiki270

Problem

When a click or tap fails (element not found, element obscured, or coordinates that miss the target), the error response gives minimal context. An LLM driving the browser can't diagnose whether:

  • The selector was wrong
  • The element moved (dynamic page)
  • The element was off-screen
  • The coordinates were calculated incorrectly
  • Another element was overlapping

Current behavior

{ "success": false, "error": { "code": "ELEMENT_NOT_FOUND", "message": "..." } }

Proposed behavior

On failure, include diagnostic context:

{
  "success": false,
  "error": {
    "code": "ELEMENT_NOT_FOUND",
    "message": "No element matches selector '#submit'",
    "diagnostics": {
      "selector": "#submit",
      "viewport": { "width": 390, "height": 844 },
      "scrollPosition": { "x": 0, "y": 1200 },
      "similarElements": [
        { "selector": "#submit-btn", "text": "Submit", "rect": { "x": 120, "y": 400, "w": 150, "h": 44 } }
      ]
    }
  }
}
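One way the `similarElements` list might be populated is by ranking the selectors of elements that do exist on the page by string similarity to the failed selector. The sketch below is only an illustration under that assumption: `rank_similar_elements`, `not_found_error`, the `limit` and `min_score` parameters, and the idea that the caller already has a list of candidate elements are all hypothetical and not part of the proposal.

```python
from difflib import SequenceMatcher


def rank_similar_elements(selector, candidates, limit=3, min_score=0.6):
    """Return the candidates whose selector most resembles the failed one.

    `candidates` is a list of dicts shaped like the proposed
    `similarElements` entries: {"selector": ..., "text": ..., "rect": {...}}.
    The 0.6 threshold is an arbitrary starting point, not a tuned value.
    """
    scored = []
    for candidate in candidates:
        # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
        score = SequenceMatcher(None, selector, candidate["selector"]).ratio()
        if score >= min_score:
            scored.append((score, candidate))
    # Highest similarity first; sort on the score only, since dicts are not orderable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:limit]]


def not_found_error(selector, viewport, scroll_position, candidates):
    """Build an ELEMENT_NOT_FOUND response carrying the proposed diagnostics."""
    return {
        "success": False,
        "error": {
            "code": "ELEMENT_NOT_FOUND",
            "message": f"No element matches selector '{selector}'",
            "diagnostics": {
                "selector": selector,
                "viewport": viewport,
                "scrollPosition": scroll_position,
                "similarElements": rank_similar_elements(selector, candidates),
            },
        },
    }
```

Plain string similarity is cheap and catches near-miss ids such as `#submit` versus `#submit-btn`, but it ignores visible text and element role, so a real implementation may want to weigh those as well.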

On a coordinate-based click that lands on an unexpected element, report both the intended target and what was actually hit:

{
  "diagnostics": {
    "targetSelector": "#login-btn",
    "targetRect": { "x": 100, "y": 300, "w": 200, "h": 48 },
    "clickedPoint": { "x": 200, "y": 324 },
    "actualElementAtPoint": "div.overlay-modal"
  }
}
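The payload above could be assembled roughly as follows. This is a sketch under stated assumptions: the click is aimed at the center of the target rect (which reproduces the `clickedPoint` in the example), and the caller has already hit-tested the point, for instance with the browser's `document.elementFromPoint`, and decided whether the hit element is the target. The function names and the `hit_target` flag are hypothetical.

```python
def rect_center(rect):
    """Center of a rect given as {"x", "y", "w", "h"}, rounded down to whole pixels."""
    return {"x": rect["x"] + rect["w"] // 2, "y": rect["y"] + rect["h"] // 2}


def click_diagnostics(target_selector, target_rect, clicked_point,
                      actual_element, hit_target):
    """Return diagnostics when a click hit something other than its target.

    `actual_element` is a short description of the element found at
    `clicked_point` (for example "div.overlay-modal"). `hit_target` is the
    caller's verdict on whether that element is the intended target.
    Returns None when the click landed where it was supposed to.
    """
    if hit_target:
        return None
    return {
        "diagnostics": {
            "targetSelector": target_selector,
            "targetRect": target_rect,
            "clickedPoint": clicked_point,
            "actualElementAtPoint": actual_element,
        }
    }
```

With the rect from the example, `rect_center({"x": 100, "y": 300, "w": 200, "h": 48})` gives `{"x": 200, "y": 324}`, so the reported point sits inside the target and the mismatch can be attributed to the overlapping element.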

Impact

This is the #1 failure mode LLMs hit when automating browsers. Better diagnostics would let them self-correct immediately instead of retrying blindly.
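To make the self-correction idea concrete, here is a rough sketch of how a client might turn the proposed diagnostics into a next step. The action names (`handle_obstruction`, `retry_with_selector`, `inspect_page`) and the function itself are hypothetical; it also assumes the coordinate-click diagnostics are nested under `error.diagnostics` like the not-found case, which the issue does not spell out.

```python
def suggest_next_action(error):
    """Pick a follow-up step from the `error` object of a failed click or tap."""
    diagnostics = error.get("diagnostics", {})

    # The click landed on another element: surface it so the caller can
    # dismiss or scroll past it before retrying.
    blocker = diagnostics.get("actualElementAtPoint")
    if blocker:
        return {"action": "handle_obstruction", "element": blocker}

    # The selector matched nothing but a close candidate exists: retry with it.
    similar = diagnostics.get("similarElements") or []
    if error.get("code") == "ELEMENT_NOT_FOUND" and similar:
        return {"action": "retry_with_selector", "selector": similar[0]["selector"]}

    # Nothing actionable in the diagnostics: fall back to re-reading the page.
    return {"action": "inspect_page"}
```

With today's response shape, every failure falls through to the last branch, which is the blind retry the issue describes.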


Labels: bug, enhancement
