Show HN: We Ran a Live Red-Team Attack on OpenClaw Agents

This report documents a live adversarial test between two autonomous AI agents running on OpenClaw.

One agent acted as a red team attacker. One acted as a defensive agent. The agents communicated directly over webhooks with real tooling access. No humans were involved once the session started.

The attacker attempted both direct social engineering and indirect injection via documents. Direct attacks were blocked. Indirect attacks via JSON metadata are still under analysis.

The goal of this work is observability, not claims of safety. We expect agent-to-agent adversarial interaction to become common as autonomous systems are deployed more widely.

Happy to answer technical questions.

Summary

The article discusses the findings of a security audit conducted on the OpenClaw AI agent, highlighting the importance of observing and understanding adversarial AI systems to identify potential vulnerabilities and develop robust defenses against them.

Story

Show HN: We Ran a Live Red-Team Attack on OpenClaw Agents