Show HN: I built a Chrome extension to let my OpenClaw Bot remote in

gideon-claws Wednesday, February 04, 2026

Sharing a build-in-public update.

I’ve been working with my assistant “Gideon” (running inside OpenClaw) to solve a very specific problem:

I want the agent to control my real browser (logged-in sites, my normal cookies, my actual tabs), not a sandboxed headless browser, while keeping the control surface simple and auditable. That way my OpenClaw setup won't break the moment a site gets "clever" about automation.

So... we built it! I say “we”, but it was mostly Gideon; I was along for the ride as QA.

Why did we bother?

Well, because the real world is messy.

Headless is fine until you need:

• a session that already exists in your day-to-day browser

• sites like X, Gmail, or anything modern that behaves differently under automation

• human-in-the-loop flows where the agent drives, then hands off, then resumes

This connector is basically: agent → my laptop Chrome → real work.

How it works (high level)

There are 3 pieces:

1. Chrome extension (MV3)

• You pair it to a relay URL once

• You explicitly choose what the agent can touch using an OpenClaw tab group

• Actions (click/type/scroll/navigate) are optional and gated

2. Relay service

• Extension connects over WebSocket

• The agent sends commands to the relay (HTTP)

• Relay forwards to the extension; extension returns results (and screenshots)

3. Agent

• Issues actions (navigate/click/type/scroll)

• Requests screenshots for “eyes”

• Can extract some page structure when possible
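To make the relay step above concrete, here is a minimal sketch of how a relay could match each command it forwards to the result the extension sends back. The message shapes (`Command`, `CommandResult`), the `cmd-N` ids, and the `RelayRouter` class are all hypothetical; the wire format we actually use isn't described in this post, so treat this as one plausible layout rather than the implementation.

```typescript
// Hypothetical message shapes; the real wire format is not specified here.
type Action =
  | { kind: "navigate"; url: string }
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "scroll"; deltaY: number }
  | { kind: "screenshot" };

interface Command {
  id: string; // correlation id assigned by the relay
  tabId: number;
  action: Action;
}

interface CommandResult {
  id: string; // echoes Command.id
  ok: boolean;
  error?: string;
  screenshotDataUrl?: string;
}

type ResultHandler = (result: CommandResult) => void;

class RelayRouter {
  private nextId = 1;
  private readonly pending = new Map<string, ResultHandler>();
  // Whatever writes a string to the extension's WebSocket.
  private readonly send: (payload: string) => void;

  constructor(send: (payload: string) => void) {
    this.send = send;
  }

  // Called when the agent's HTTP request arrives at the relay.
  dispatch(tabId: number, action: Action, onResult: ResultHandler): string {
    const id = `cmd-${this.nextId++}`;
    this.pending.set(id, onResult);
    const command: Command = { id, tabId, action };
    this.send(JSON.stringify(command));
    return id;
  }

  // Called for each message the extension sends back over the WebSocket.
  // Returns false when the id is unknown (for example, a duplicate reply).
  handleExtensionMessage(raw: string): boolean {
    const result = JSON.parse(raw) as CommandResult;
    const handler = this.pending.get(result.id);
    if (!handler) return false;
    this.pending.delete(result.id);
    handler(result);
    return true;
  }

  get pendingCount(): number {
    return this.pending.size;
  }
}
```

The point of the correlation id is that the HTTP side and the WebSocket side run at different speeds; without it, a slow screenshot could be handed back as the answer to a later click. A real relay would also want a timeout that clears stale entries from `pending`.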

The security model (non-negotiable)

I don’t want an agent that can randomly click around every tab on my machine.

So the rule is:

• Only tabs I explicitly “Start controlling” (in the OpenClaw group) are eligible

• “Allow Actions” is a separate toggle (so I can keep it read-only most of the time)

• We log what happens so it’s not a black box
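The three rules above boil down to a single check that runs before any command touches a tab. Below is a rough sketch of that check as a pure function. The names (`ControlState`, `authorize`, `AuditEntry`) are made up for illustration, and I'm assuming screenshots count as read-only while navigate/click/type/scroll count as actions.

```typescript
type ActionKind = "navigate" | "click" | "type" | "scroll" | "screenshot";

interface ControlState {
  // Tabs in the OpenClaw group that the user explicitly started controlling.
  controlledTabIds: Set<number>;
  // The separate "Allow Actions" toggle; false means read-only.
  allowActions: boolean;
}

interface AuditEntry {
  at: string; // ISO timestamp
  tabId: number;
  action: ActionKind;
  allowed: boolean;
  reason: string;
}

// Assumption: only screenshots are treated as read-only.
const READ_ONLY_ACTIONS: ReadonlySet<ActionKind> = new Set<ActionKind>(["screenshot"]);

// Decide whether a command may run, and record the decision either way.
function authorize(
  state: ControlState,
  tabId: number,
  action: ActionKind,
  log: AuditEntry[],
): boolean {
  let allowed = true;
  let reason = "ok";
  if (!state.controlledTabIds.has(tabId)) {
    allowed = false;
    reason = "tab is not under control";
  } else if (!READ_ONLY_ACTIONS.has(action) && !state.allowActions) {
    allowed = false;
    reason = "actions are disabled (read-only)";
  }
  log.push({ at: new Date().toISOString(), tabId, action, allowed, reason });
  return allowed;
}
```

Denied requests are logged too, which matters as much as logging the successful ones: if the agent keeps asking to click in a tab it was never given, that should be visible.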

What we learned (a.k.a. Chrome MV3 is a gremlin)

Some fun discoveries:

• MV3 service workers love to go to sleep. If your WS lives in the background SW, you’ll see connections that “work… until they don’t” (accept → close loops). We had to build reconnection logic and then work on keeping the SW alive during active control sessions.

• UI needs to match the real state machine. Pairing / connecting / controlling are different states. If you let users do them out of order, it feels broken even when it’s technically working. We’re tightening it so the “happy path” is idiot-proof.

• Typing on modern sites isn’t like typing into a plain form field. X in particular uses contenteditable + React event plumbing, so “just set value” doesn’t cut it. We’re upgrading the action layer so typing works reliably.
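For the reconnection logic mentioned in the first bullet, a common approach is exponential backoff with jitter, so an accept → close loop doesn't hammer the relay. This is a generic sketch of that technique, not our exact code; the defaults (500 ms base, 30 s cap, 20% jitter) and the names `reconnectDelayMs` and `ReconnectState` are assumptions chosen for illustration.

```typescript
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number; // upper bound on any delay
  jitter: number; // fraction of the delay to randomize, e.g. 0.2 = +/-20%
}

// Assumed defaults; tune for your relay.
const DEFAULT_BACKOFF: BackoffOptions = { baseMs: 500, maxMs: 30_000, jitter: 0.2 };

// Delay before retry number `attempt` (0-based). `random` is injectable so
// the function can be checked deterministically.
function reconnectDelayMs(
  attempt: number,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(opts.maxMs, opts.baseMs * 2 ** attempt);
  const spread = exponential * opts.jitter;
  // random() is in [0, 1), so the offset is in [-spread, +spread).
  const offset = (random() * 2 - 1) * spread;
  return Math.round(Math.min(opts.maxMs, Math.max(0, exponential + offset)));
}

// Tracks the attempt count across WebSocket close/open events.
class ReconnectState {
  private attempt = 0;
  private readonly random: () => number;

  constructor(random: () => number = Math.random) {
    this.random = random;
  }

  // Call from the socket's close handler; returns how long to wait.
  onClose(): number {
    const delay = reconnectDelayMs(this.attempt, DEFAULT_BACKOFF, this.random);
    this.attempt += 1;
    return delay;
  }

  // Call from the socket's open handler; a healthy connection resets the backoff.
  onOpen(): void {
    this.attempt = 0;
  }
}
```

In the service worker, the close handler would pass `onClose()` to a timer and retry the connection. Backoff only covers the reconnect half. For keeping the SW awake during a session, my understanding is that recent Chrome versions (116+) count WebSocket traffic as activity, so a small periodic keepalive message is one option, but check that against the current Chrome docs before relying on it.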

Where it’s at right now

It can:

• pair to a relay

• control a selected tab group

• navigate / click / scroll

• take screenshots from the controlled tab (so the agent can actually see)

And we’re iterating quickly on:

• connection stability

• better typing for rich editors

• clearer “controlled” visuals (so it’s unmistakable when the agent has the wheel)

If you’re building something similar…

I’d love to hear how other HN folks building around OpenClaw would do this:

• What’s your ideal safety model for “agent drives my real browser”?

• Any proven MV3 patterns for stable long-lived connections?

• UX ideas that make control state obvious without being obnoxious?

If people want, I can share more implementation details / the approach we took to the relay + tab-group gating.
