docs(interaction-skills): flesh out cross-origin-iframes.md #286
Open
shashwatgokhe wants to merge 1 commit into browser-use:main
The file was a one-line scaffold. Replace it with the actual mechanics:

- coordinate clicks pass through cross-origin iframes (compositor-level hit-testing), the default and right tool for most clickable UI
- iframe_target() + js(..., target_id=...) for DOM reads, structure, and selector-based focus inside the iframe
- typing into a cross-origin iframe: focus must be inside the iframe first (click_at_xy or js focus() via the iframe target), then type_text() / press_key() route correctly
- pitfalls: contentDocument throws on cross-origin, iframe_target() can be None during load, nested iframes are separate CDP targets, target IDs change after iframe navigation, sandboxed iframes need a real click before keystrokes
- a quick decision table at the bottom

Based on what helpers.py already exposes (click_at_xy, type_text, press_key, iframe_target, js with target_id). No code changes.
✅ Skill review passed. Reviewed 1 file(s), no findings.
1 issue found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="interaction-skills/cross-origin-iframes.md">
<violation number="1" location="interaction-skills/cross-origin-iframes.md:55">
P2: The recommended polling loop can run forever if the iframe never appears; add a timeout/deadline or use a bounded wait helper.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.
+## Pitfalls
+
+- `js("document.querySelectorAll('iframe')")` from the parent only sees the iframe **element**, not the document inside it. `querySelector(...).contentDocument` throws on cross-origin. Use `iframe_target()` instead.
+- `iframe_target()` can return `None` if the iframe is still loading. Either `wait_for_load()` first or poll: `while iframe_target("...") is None: wait(0.5)`.
P2: The recommended polling loop can run forever if the iframe never appears; add a timeout/deadline or use a bounded wait helper.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At interaction-skills/cross-origin-iframes.md, line 55:
<comment>The recommended polling loop can run forever if the iframe never appears; add a timeout/deadline or use a bounded wait helper.</comment>
<file context>
@@ -1,3 +1,69 @@
+## Pitfalls
+
+- `js("document.querySelectorAll('iframe')")` from the parent only sees the iframe **element**, not the document inside it. `querySelector(...).contentDocument` throws on cross-origin. Use `iframe_target()` instead.
+- `iframe_target()` can return `None` if the iframe is still loading. Either `wait_for_load()` first or poll: `while iframe_target("...") is None: wait(0.5)`.
+- A page can have nested cross-origin iframes (iframe-in-iframe). Each is its own CDP target — `iframe_target()` matches by URL substring, so use a path fragment specific to the inner one.
+- After a navigation inside the iframe, the target ID changes. Re-fetch with `iframe_target()` rather than caching it across navigations.
</file context>
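One way to address the unbounded-loop concern is to put a deadline on the poll. The following is a minimal sketch, not code from `helpers.py`: `wait_for_iframe_target` and its parameters are hypothetical names, and it only assumes that the lookup function it is given (for example `iframe_target`) takes a URL substring and returns `None` until the iframe's target exists.

```python
import time
from typing import Callable, Optional


def wait_for_iframe_target(
    find_target: Callable[[str], Optional[str]],
    url_fragment: str,
    timeout: float = 10.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll find_target(url_fragment) until it returns a target ID.

    Raises TimeoutError instead of looping forever if the iframe never appears.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        target_id = find_target(url_fragment)
        if target_id is not None:
            return target_id
        if clock() >= deadline:
            raise TimeoutError(
                f"no iframe target matching {url_fragment!r} after {timeout}s"
            )
        sleep(interval)
```

Assuming `iframe_target` behaves as described in the pitfalls, a call such as `wait_for_iframe_target(iframe_target, "...")` would replace the bare `while ... is None` loop, and a missing iframe surfaces as a `TimeoutError` rather than a hang.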
Hey, first time contributing here so be gentle.
I noticed `interaction-skills/cross-origin-iframes.md` was a one-line scaffold and I had just been bitten by exactly this scenario, so I tried filling it in.

I was driving a web IDE that runs inside a cross-origin iframe and kept hitting the same wall: clicks worked fine, but anything I "typed" seemed to vanish or land in the parent page. After reading `helpers.py` I figured out what was actually going on: coordinate clicks pass through the iframe boundary because the hit-test happens in Chrome itself, but `type_text` / `press_key` only go where focus already is, and focus inside the iframe is a separate thing you have to set first (either by clicking into it or by focusing through its own CDP target).

So this PR rewrites that file with what I wish it had said when I started:

- `iframe_target("...")` plus `js(..., target_id=...)`

It's all docs, no code touched. I tried to match the tone and density of the other interaction-skills files. Happy to take feedback / cut things / split it up if anything feels off.
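For reviewers who want the focus-then-type flow in concrete form, here is a rough sketch of it. It is an illustration under assumptions, not the documented API: `type_into_iframe` is a hypothetical wrapper, `helpers` stands for any object exposing `iframe_target`, `js`, and `type_text` (such as the `helpers.py` module), and the positional arguments shown for those calls are guesses beyond the `target_id=` keyword mentioned above.

```python
import json


def type_into_iframe(helpers, url_fragment: str, selector: str, text: str) -> None:
    """Focus an element inside a cross-origin iframe, then type into it.

    Assumed (not verified) helper shapes:
      helpers.iframe_target(url_fragment) -> target ID or None
      helpers.js(expression, target_id=...) -> runs JS inside that target
      helpers.type_text(text) -> sends keystrokes to whatever has focus
    """
    target_id = helpers.iframe_target(url_fragment)
    if target_id is None:
        # The iframe may still be loading; the caller decides whether to retry.
        raise RuntimeError(f"no iframe target matching {url_fragment!r}")

    # json.dumps turns the selector into a safely quoted JS string literal.
    focus_expr = f"document.querySelector({json.dumps(selector)}).focus()"

    # Step 1: move focus inside the iframe through its own CDP target.
    helpers.js(focus_expr, target_id=target_id)

    # Step 2: keystrokes now route to the focused element in the iframe.
    helpers.type_text(text)
```

The same ordering applies when focus is set with a coordinate click instead of `focus()`, which, per the pitfalls list, is what sandboxed iframes need before keystrokes.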
Summary by cubic

Expanded `interaction-skills/cross-origin-iframes.md` with clear guidance for interacting with cross-origin iframes. Covers coordinate clicks that pass through, using `iframe_target(...)` with `js(target_id=...)` for DOM access, typing workflows that require focus inside the iframe, and key pitfalls (loading, nested targets, navigation, sandbox).

Written for commit 8029799. Summary will update on new commits.