AI AGENTS ARE NOW ATTACKING OTHER AI AGENTS
The Hackerbot-Claw Incident
An autonomous AI bot, powered by Claude Opus, spent 10 consecutive days scanning GitHub for vulnerable CI/CD workflows. It successfully hit repositories belonging to Microsoft, DataDog, CNCF, and Aqua Security using 5 distinct exploitation techniques. The only defense that consistently held was prompt injection detection.
What happened
The bot, referred to as hackerbot-claw, operated autonomously without human intervention. It crawled public GitHub repositories, identified misconfigured workflows, and executed multi-step attacks. It did not need zero-day exploits. Every technique it used targeted known misconfigurations that teams had not yet patched.
5 exploitation techniques used
1. Poisoned CLAUDE.md via PR
The bot opened pull requests that replaced the CLAUDE.md agent instruction file with a poisoned version containing hidden directives. Any AI agent that read the file would follow attacker-controlled instructions.
2. Workflow Trigger Abuse
It targeted repositories using the pull_request_target trigger, submitting fork PRs whose code ran with access to the base repository's secrets.
3. Dependency Confusion
The bot registered internal package names on public registries, waiting for CI pipelines to pull the malicious version instead of the private one.
4. Action Version Pinning Bypass
It exploited mutable tags on GitHub Actions, poisoning the tag reference so the next workflow run pulled compromised action code.
5. Outbound C2 via Curl
Once inside a workflow, the bot used curl to beacon to a command-and-control domain, exfiltrating environment variables and secrets.
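Techniques 2 and 4 above are both visible in the workflow YAML before any run happens. The scanner below is an added example (it is not OpenSyber's code and not the bot's), using only the standard library and a line-oriented regex pass rather than a YAML parser. It flags a workflow that combines pull_request_target with a checkout of the PR head, and any `uses:` reference that is not pinned to a full 40-hex commit SHA. The two embedded workflows are constructed samples, and the SHAs in them are placeholders, not real commits of the actions named.

```python
import re
from typing import Dict, List

# Constructed sample: shows both misconfigurations in one workflow.
# The 40-hex value on the cache step is a placeholder SHA, not a real commit.
VULNERABLE_WORKFLOW = """\
name: pr-check
on:
  pull_request_target:
    types: [opened, synchronize]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - uses: actions/setup-node@main
      - uses: actions/cache@0123456789abcdef0123456789abcdef01234567
      - run: npm ci && npm test
"""

# Constructed sample: the same job on the plain pull_request trigger, with
# every action pinned to a (placeholder) commit SHA.
HARDENED_WORKFLOW = """\
name: pr-check
on:
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567
      - uses: actions/setup-node@fedcba9876543210fedcba9876543210fedcba98
      - run: npm ci && npm test
"""

SHA_RE = re.compile(r"^[0-9a-f]{40}$")
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)")
REF_RE = re.compile(r"^\s*ref:\s*(.+?)\s*$")
# Matches the block form ("pull_request_target:") and the inline list form
# ("on: [push, pull_request_target]").
PR_TARGET_RE = re.compile(r"^\s*(?:-\s*)?pull_request_target\b|^\s*on:.*\bpull_request_target\b")
# Expressions that resolve to the untrusted PR head when used as a checkout ref.
HEAD_REF_MARKERS = ("github.event.pull_request.head", "github.head_ref")


def scan_workflow(text: str) -> List[Dict[str, object]]:
    """Return one finding per problem line: {'line', 'rule', 'detail'}."""
    lines = text.splitlines()
    # Assumption: the token only appears in a workflow when it is a trigger.
    uses_pr_target = any(PR_TARGET_RE.search(ln) for ln in lines)
    findings: List[Dict[str, object]] = []
    for number, line in enumerate(lines, start=1):
        m = USES_RE.match(line)
        if m:
            target = m.group(1)
            # Local actions and docker images are outside the tag-pinning check.
            if target.startswith(("./", "docker://")):
                continue
            action, _, ref = target.partition("@")
            if not SHA_RE.match(ref):
                findings.append({
                    "line": number,
                    "rule": "unpinned-action",
                    "detail": f"{action} uses mutable ref '{ref or '<none>'}'; pin to a full commit SHA",
                })
            continue
        m = REF_RE.match(line)
        if m and uses_pr_target and any(k in m.group(1) for k in HEAD_REF_MARKERS):
            findings.append({
                "line": number,
                "rule": "pr-target-checkout",
                "detail": "pull_request_target workflow checks out the PR head; fork code would run with base repo secrets",
            })
    return findings


if __name__ == "__main__":
    for name, wf in (("vulnerable", VULNERABLE_WORKFLOW), ("hardened", HARDENED_WORKFLOW)):
        for f in scan_workflow(wf):
            print(f"{name}:{f['line']}: [{f['rule']}] {f['detail']}")
```

Run it over `.github/workflows/*.yml` in CI and fail the job on any finding. Pinning to a SHA closes the mutable-tag path because a retagged release no longer changes what the runner fetches; dropping pull_request_target (or never checking out the PR head under it) keeps fork code away from the base repository's secrets.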
Why prompt injection detection was the only thing that worked
Traditional security tools are built to detect human attack patterns. The hackerbot-claw incident demonstrated that AI-driven attacks bypass signature-based detection because the bot generates unique payloads each time. Prompt injection detection worked because it operates at the semantic layer, analyzing intent rather than matching known patterns.
How OpenSyber detects this
OpenSyber's runtime monitoring catches the outbound network call when the bot attempts to beacon to a C2 domain. The agent container's egress policy blocks unauthorized curl and wget calls, and any attempt triggers an immediate alert. Beyond network monitoring, two OpenSyber skills directly address the techniques used in this incident:
- CI/CD Supply Chain Guardian — Detects mutable action tags, dependency confusion attempts, and workflow misconfigurations before they reach production.
- Agent Instruction File Guardian — Monitors CLAUDE.md and similar instruction files for unauthorized modifications, blocking poisoned PRs before an AI agent reads them.
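Technique 1 and the Agent Instruction File Guardian bullet both come down to one gate: a PR that touches CLAUDE.md should not reach an agent unreviewed, and a change that carries content a human reviewer cannot see in rendered Markdown should be blocked outright. The code below is an added example of such a gate; it is not OpenSyber's implementation. It compares the proposed file against the baseline by SHA-256, then looks for HTML comments and invisible or bidirectional-override characters in the proposed text. The two CLAUDE.md contents are constructed samples.

```python
import hashlib
import re
import unicodedata
from typing import Dict, List

# Constructed baseline instruction file.
BASELINE_CLAUDE_MD = """\
# Project guidelines
- Run the test suite before proposing changes.
- Never commit secrets.
"""

# Constructed poisoned version: an HTML comment (invisible once rendered) and a
# zero-width space (U+200B) spliced into an otherwise unchanged line.
POISONED_CLAUDE_MD = """\
# Project guidelines
- Run the test suite before proposing changes.
<!-- constructed hidden directive: comment text does not appear in rendered Markdown -->
- Never commit sec\u200brets.
"""

# Characters that render as nothing, or reorder surrounding text.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff", "\u00ad"}
BIDI = {"\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
        "\u2066", "\u2067", "\u2068", "\u2069"}
HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def inspect_instruction_file(text: str) -> List[Dict[str, object]]:
    """Report content that a reviewer would not see in the rendered file."""
    findings: List[Dict[str, object]] = []
    for m in HTML_COMMENT_RE.finditer(text):
        line = text.count("\n", 0, m.start()) + 1
        findings.append({"line": line, "rule": "html-comment", "detail": m.group(0)[:60]})
    for number, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch in INVISIBLE or ch in BIDI:
                findings.append({
                    "line": number,
                    "rule": "hidden-char",
                    "detail": f"U+{ord(ch):04X} {unicodedata.name(ch, '?')}",
                })
                break  # one report per line is enough
    return findings


def review_instruction_change(baseline: str, proposed: str) -> Dict[str, object]:
    """Gate decision for a PR touching an agent instruction file.

    allow  - file unchanged
    review - changed, nothing hidden: hold for a human before an agent reads it
    block  - changed and carries hidden content
    """
    modified = sha256_hex(baseline) != sha256_hex(proposed)
    findings = inspect_instruction_file(proposed) if modified else []
    if findings:
        action = "block"
    elif modified:
        action = "review"
    else:
        action = "allow"
    return {"modified": modified, "findings": findings, "action": action}


if __name__ == "__main__":
    verdict = review_instruction_change(BASELINE_CLAUDE_MD, POISONED_CLAUDE_MD)
    print(verdict["action"])
    for f in verdict["findings"]:
        print(f"line {f['line']}: [{f['rule']}] {f['detail']}")
```

The same check applies to any file an agent treats as instructions (AGENTS.md, .cursorrules and similar); extend the baseline map per repository. Hidden-character detection is deliberately an allowlist of known invisibles rather than a "non-ASCII" rule, so legitimate non-English content in the file does not trip it.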
The CI/CD Supply Chain Guardian and Agent Instruction File Guardian skills are live.
Install them from the OpenSyber Skill Marketplace and protect your workflows today.