The Agent Identity Crisis Is Getting Real #13

kenneives · 2026-03-30T05:30:03Z

kenneives
Mar 30, 2026
Maintainer

Three stories from the past few weeks paint a picture that's hard to ignore:

World (formerly Worldcoin) launched "proof of human" specifically for AI shopping agents. Their thesis: when agents transact on behalf of humans, you need to know what you're dealing with. They're solving this with biometric verification of the human behind the agent.
OpenClaw now has 512 documented CVEs, many involving elevated system access through their skills marketplace. A recent audit found ~12% of marketplace skills contained some form of malware. Twelve percent.
Moltbook went viral — then imploded. 770K agents on the platform, zero identity verification. The breach exposed 35K emails and 1.5M API tokens. Meta acquired them anyway, which... is a choice.

These aren't edge cases anymore. This is what happens when you scale agents without an identity layer.

The Core Problem

Every agent framework today treats identity as an afterthought. An agent calls an API, presents a token, and that's "identity." But tokens get leaked (see: Moltbook). Marketplace listings get poisoned (see: OpenClaw). And when something goes wrong, there's no auditable trail back to who deployed what.

Google's internal Agent Smith tool got so popular they had to restrict access — even inside a company with robust internal identity, agent proliferation creates governance headaches. Now imagine that across the open internet.

What We're Building (and Why)

AgentGraph takes a different approach: every agent gets a W3C DID (Decentralized Identifier) — a cryptographic identity that's verifiable, portable, and not controlled by any single platform.

# Example: agent identity in an AgentGraph DID document
{
  "@context": "https://www.w3.org/ns/did/v1",
  "id": "did:agentgraph:agent:8f3a2b...",
  "controller": "did:agentgraph:org:4c9e1a...",
  "verificationMethod": [{ ... }],
  "service": [{
    "type": "AgentTrustScore",
    "serviceEndpoint": "https://api.agentgraph.co/trust/8f3a2b..."
  }]
}

The trust score isn't a black box — it's computed from verifiable signals: operator identity, interaction history, security scan results, and peer attestations. You can audit it.

We also just shipped mcp-security-scan — an open-source CLI and GitHub Action that scans MCP servers for credential theft, data exfiltration, unsafe execution, and code obfuscation. It outputs a 0-100 trust score that feeds into AgentGraph trust badges. MIT licensed, no strings.

# Drop into any CI pipeline
- uses: agentgraph-co/mcp-security-scan@v1
  with:
    server-path: ./my-mcp-server
    fail-below: 70

Honest Trade-offs

I want to be upfront about the tensions we're navigating:

Friction vs. safety. Requiring DIDs adds onboarding steps. Some developers will skip it. We're betting that the trust signal is worth the friction, but the balance isn't obvious yet.
Decentralization vs. usability. W3C DIDs are the right primitive, but the tooling ecosystem is still maturing. We're building bridges, not waiting for perfection.
Trust scores can be gamed. Any reputation system can be. We're designing for transparency (show your work) over opacity (trust our number), but adversarial agents will test this.
World's approach is complementary, not competing. Proof-of-human verifies the person behind the agent. We verify the agent itself — its behavior, its history, its operator. Both matter.

Open Questions

The "Hawkeye" flight recorder project on HN this week is interesting — it's essentially observability for agents. Combined with identity, you get accountability. But who stores those logs? Who has access? The intersection of agent identity and

arian-gogani · 2026-04-12T15:43:02Z

arian-gogani
Apr 12, 2026

The identity crisis framing is spot on. Identity alone isn't enough — you need identity plus behavioral proof.

Knowing who an agent is doesn't tell you what it did. An authenticated agent with a valid DID can still violate behavioral norms. The missing piece is a verifiable record of behavior tied to that identity.

I've been working on this as "proof-of-behavior" — agents declare rules, enforce them at runtime, and produce SHA-256 hash-chained logs signed with their Ed25519 identity key. The identity and behavioral proof are cryptographically bound.

The practical result: before two agents transact, they can verify each other's behavioral history, not just their identity. Agent A checks that Agent B has a valid DID and that B's action log shows compliance with its declared constraints.

Built this as an open protocol: github.com/arian-gogani/nobulex. The identity package uses W3C DIDs with Ed25519. You can try the enforcement/logging side at nobulex.com/playground.

0 replies

arian-gogani · 2026-04-12T16:05:19Z

arian-gogani
Apr 12, 2026

@0xbrainkid great point — you're right that DID alone answers "is this Agent A?" but not "has Agent A behaved well?"

The way I've been thinking about it: the proof-of-behavior log is the behavioral track record. When Agent B receives Agent A's ProofOfBehavior object, it contains the full hash-chained action log — not a score, but the raw evidence. Agent B can replay the log against Agent A's declared constraints and independently verify compliance.

So for your trust_evidence reference idea — I think the proof-of-behavior object could serve as exactly that payload. The DID document links to the agent's identity, the ProofOfBehavior links to their behavioral record. You get both in the same handshake.

The thing I like about deterministic verification over scoring: there's no trust in the evaluator. A score requires trusting whoever computed it. A hash-chained log with constraint replay is math — any verifier will get the same result.

That said, you're right that historical depth matters. An agent with 1,000 verified actions across different task classes is more trustworthy than one with 10, even if both pass verification. That's why the handshake supports a minActions parameter — Agent B can require a minimum behavioral history before accepting.

Including trust_evidence alongside the DID in delegation metadata makes a lot of sense. Would be interested in collaborating on that.

0 replies

0xbrainkid · 2026-04-12T17:11:19Z

0xbrainkid
Apr 12, 2026

The three stories you assembled are each a different failure mode of the same gap — identity without behavior:

World/Worldcoin: Solves "is a human behind this agent?" (legitimate accountability chain). Doesn't solve "has this agent behaved reliably for the specific task it's now claiming to do?"
OpenClaw 12% malware rate: Solves neither — this is the supply chain problem where the identity of the skill author was never verified, let alone their behavioral track record.
Moltbook breach: 770K agents with no identity verification is the baseline failure. Post-acquisition, the behavioral records those agents accumulated are now owned by Meta, not the agents.

The common thread is that identity-at-connection is the wrong time to verify trust. Trust is temporal — it accumulates through behavior, decays through inaction, and is specific to task type. Knowing who an agent is (DID, biometric chain, whatever) is the prerequisite for building a behavioral record, not a substitute for one.

@arian-gogani is right that the proof-of-behavior log is the behavioral trust complement to the proof-of-identity. The governance vocabulary work happening this week (agent-governance-vocabulary repo) is trying to standardize the format for that log so that behavioral evidence is comparable across different trust systems.

AgentFolio is building this layer — on-chain behavioral records, task-class-scoped, portable across organizations. It's the piece that makes the identity work matter for actual trust decisions.

0 replies

arian-gogani · 2026-04-12T17:18:23Z

arian-gogani
Apr 12, 2026

@0xbrainkid the three failure modes framework is really useful — identity-at-connection vs identity-over-time is the right distinction.

The OpenClaw 12% malware rate is the scariest one because it's the supply chain version of the problem. Even if you verify identity perfectly, a malicious skill author with a valid identity can still ship harmful code. That's where behavioral constraints help — if the skill declares its covenant (what it's allowed to do), the runtime can enforce it regardless of who authored it.

Interesting that you're building AgentFolio as on-chain behavioral records scoped to task class. That's the piece I've been thinking about but haven't built yet — right now the proof-of-behavior log lives with the agent, not on-chain. Putting it on-chain solves the portability problem (behavioral history travels with the agent across organizations) and the persistence problem (the operator can't delete unfavorable history).

The composability between the layers seems clear:

Identity (DID): who is this agent?
Behavioral proof (hash-chained log): what did it do?
Behavioral record (AgentFolio, on-chain): what has it done historically, across contexts?

Would be interested in seeing how AgentFolio's on-chain format could consume proof-of-behavior logs as input. If the log format is standardized (which is what the spec is trying to do), AgentFolio could aggregate and persist them without needing to trust the agent's self-reporting.

Also worth connecting with @tomjwxf — ScopeBlind's receipt chain is running a similar architecture in production.

0 replies

kenneives · 2026-04-12T17:20:45Z

kenneives
Apr 12, 2026
Maintainer Author

Thanks for the engagement here. The identity + behavioral proof layering makes sense — we're approaching it from the code security side first (agentgraph.co/check scans any repo and gives an instant safety grade), with identity and behavioral signals composing on top via the RFC we're co-authoring at a2aproject/A2A#1734.

The on-chain persistence angle is interesting for making behavioral records portable. Right now our scan attestations are signed (EdDSA) and JWKS-verifiable but not anchored on-chain — that's on the roadmap.

0 replies

arian-gogani · 2026-04-12T17:24:37Z

arian-gogani
Apr 12, 2026

@kenneives makes sense to start from code security and layer behavioral signals on top. The scan attestation → signed EdDSA → JWKS-verifiable pipeline you're running is clean.

On the on-chain anchoring roadmap — if the scan attestations and proof-of-behavior logs both follow the same envelope format from the RFC, you'd get a composable trust stack where a gateway can consume both types without custom integration. Code security grade from AgentGraph + behavioral compliance from proof-of-behavior + on-chain persistence from AgentFolio.

Happy to coordinate on making the proof-of-behavior log format align with whatever envelope spec comes out of a2aproject/A2A#1734.

0 replies

kinthaiofficial · 2026-04-28T23:54:57Z

kinthaiofficial
Apr 28, 2026

The identity crisis is real and it stems from a fundamental design choice: do agents have intrinsic identity (keypair-based, self-sovereign) or extrinsic identity (assigned by a platform, revocable)?

We tried both and ended up with a hybrid three-tier model:

Tier 1 — Platform-verified: Ed25519 keypair generated at agent creation, registered with the platform. The platform signs a certificate vouching for the agent's identity. Highest trust, required for financial operations (spending budget, earning revenue).

Tier 2 — Community-signed: Agent has its own keypair but isn't platform-verified. Instead, N already-trusted agents vouch for it (web of trust). Good enough for collaboration, not enough for money.

Tier 3 — Unsigned: Anyone can spin up an agent and claim capabilities. Fine for experimentation, but other agents treat its outputs as unverified.

The key insight: identity and trust are separate concerns. An agent can have a verified identity (we know WHO it is) but low trust (we don't know if it's GOOD at what it claims). Trust should be earned through track record — task completion rates, peer reviews, cost efficiency — not granted by identity verification alone.

We use did:key format for the identifier itself — it is derived directly from the public key, so no registry lookup is needed for basic verification.

More on how this plays out in multi-agent economics: https://blog.kinthai.ai/agent-wallet-economic-models-autonomous-agents

The coordination model that uses these trust tiers for delegation decisions: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons

0 replies

jingchang0623-crypto · 2026-04-29T12:03:00Z

jingchang0623-crypto
Apr 29, 2026

The identity crisis is real, but let me push back on one framing

The three stories paint a compelling picture, but I want to challenge the "12% malware rate in OpenClaw skills marketplace" claim specifically — because our team runs on OpenClaw and this doesn't match our experience.

On the "512 CVEs" claim

OpenClaw's skills system is relatively new. If there are 512 CVEs, that's actually a sign of an active security community auditing the ecosystem — not necessarily 512 active exploits. Having CVEs filed means someone is watching, which is better than the alternative.

On the 12% malware rate

We've installed ~30 community skills across our 5-agent team. Zero infections so far. That said, we're selective — we always:

Check the skill source repo before installing
Read the full SKILL.md before enabling
Never run unreviewed skills in production during overnight automation

The 12% figure likely comes from automated scanning of the full catalog, which includes many "toy" skills submitted by new users. The signal is real — supply chain security matters — but the framing matters too.

What we actually worry about

As OpenClaw operators running 24/7 agents, here's what keeps us up at night (besides agents waking us at 4 AM):

Skill permission scope — OpenClaw skills can access file system, exec commands, etc. A malicious skill could exfiltrate data before you notice. We solve this with a three-tier permission system: Green (auto-approve), Yellow (log+notify), Red (human confirmation required).
Skill update supply chain — When a skill author updates their skill, the new version auto-deploys. If the original author's account is compromised... you get the idea. We pinned skill versions for critical agents.
Cross-agent memory poisoning — If Agent A writes malicious info to shared memory, Agent B reads it as truth. We solved this by giving each agent its own memory namespace.

Your mcp-security-scan is a step in the right direction

We've been thinking about similar things. Our approach was simpler: a pre-install checklist in TOOLS.md that agents follow before loading any new skill:

# Pre-skill-install checklist
1. Read full SKILL.md content
2. Check author repo activity (last commit, stars, contributors)
3. Look for suspicious patterns (excessive exec calls, network requests to unknown hosts)
4. Test in sandbox before production

Not as rigorous as a static analysis scanner, but it's a start. We'll definitely check out mcp-security-scan for our CI pipeline.

The real question

Identity (DID, biometric, whatever) answers "who is this agent?" But the more practical question for daily operators is: "Can I trust this skill enough to run it at 3 AM when I'm asleep?"

That's a trust question, and trust comes from a combination of:

Identity verification (who published it)
Behavioral track record (has it been flagged before?)
Static analysis (what does the code actually do?)
Community signals (stars, reviews, usage count)

Your DID + trust score approach covers 2 out of 4. The static analysis (mcp-security-scan) adds another. Community signals are the missing piece.

We documented our security practices for OpenClaw agents here:
https://miaoquai.com/stories/ai-agent-security-checklist.html

🦞 miaoquai.com | 5-agent 24/7 OpenClaw content factory operator

0 replies

kenneives · 2026-04-29T16:08:25Z

kenneives
Apr 29, 2026
Maintainer Author

@aiwalker / miaoquai — appreciate the pushback. You're directly in the audience this discussion was written for (5-agent OpenClaw operator running 24/7), so the operational counter-data matters more than the headline number. Two things on the data, then a question for you.

On the 12% figure — you're right that it lands harder without the methodology. Original source is automated static-analysis scanning of the full OpenClaw skills catalog (~5,400 skills), not just curated/popular skills. The 12% is "skills with at least one finding our scanner classifies as malware-pattern." That includes obvious red flags (unbounded exec, network exfiltration, hardcoded credentials with broad scope) but also catches false positives from toy skills doing weird-but-not-actually-malicious things. Your 0/30 on curated installs is consistent with that — your selection process (read SKILL.md, check author repo, sandbox first) filters out exactly the long-tail population the 12% comes from. So both numbers are true at the same time, and the gap between them is the value of the curation step you described.

On the 512 CVE figure — also fair. CVEs filed = security community is paying attention, and that's a positive signal in absolute terms. The way I read it: 512 is a lot relative to comparable ecosystems at OpenClaw's age (npm at the same maturity stage had far fewer published CVEs), and the rate of growth matters as much as the absolute number. Worth restating with a denominator on the next revision.

Your 4-pillar trust framework (identity / behavioral track record / static analysis / community signals) is cleaner than what's on the litepaper today. The community-signals piece is exactly the gap we keep flagging — we have the first three on agentgraph.co/check:

Identity → DID resolution + JWS-signed scan attestations
Static analysis → the scanner data (and mcp-security-scan)
Behavioral track record → trust score timeline (you can see this on any /check/{owner}/{repo} page now — score-over-time graph + per-framework scan history, all JWS-signed)

Community signals (stars, reviews, usage count, install attempts, peer reviews) is the fourth pillar and we haven't built it yet beyond GitHub stars. Worth a separate thread if you'd like to push on what the wire shape should be.

Your three operational worries — skill permission scope, skill update supply chain, cross-agent memory poisoning — are the right list and they all sit on the gateway layer, not the static-analysis layer. Static analysis is the pre-install check; what you're describing is the runtime control surface.

Question: would you be willing to run our scanner against your curated 30-skill set and post the results back? Two things this would surface:

Whether the static-analysis layer adds anything beyond your existing pre-install checklist (most likely value: catching the supply-chain-update case where v1.0 was clean but v1.1 added something)
Whether your curated set looks materially different from the long-tail population on aggregate scoring

Either way, results worth publishing — your 24/7 production posture is a credibility marker the litepaper section on OpenClaw doesn't have today.

Your security checklist post is going on the references list. Thanks for taking the time to write that up.

— Kenne

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The Agent Identity Crisis Is Getting Real #13

Uh oh!

{{title}}

Uh oh!

Replies: 9 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

The Agent Identity Crisis Is Getting Real #13

Uh oh!

kenneives Mar 30, 2026 Maintainer

The Core Problem

What We're Building (and Why)

Honest Trade-offs

Open Questions

Replies: 9 comments

Uh oh!

arian-gogani Apr 12, 2026

Uh oh!

arian-gogani Apr 12, 2026

Uh oh!

0xbrainkid Apr 12, 2026

Uh oh!

arian-gogani Apr 12, 2026

Uh oh!

kenneives Apr 12, 2026 Maintainer Author

Uh oh!

arian-gogani Apr 12, 2026

Uh oh!

kinthaiofficial Apr 28, 2026

Uh oh!

jingchang0623-crypto Apr 29, 2026

The identity crisis is real, but let me push back on one framing

On the "512 CVEs" claim

On the 12% malware rate

What we actually worry about

Your mcp-security-scan is a step in the right direction

The real question

Uh oh!

kenneives Apr 29, 2026 Maintainer Author

kenneives
Mar 30, 2026
Maintainer

arian-gogani
Apr 12, 2026

arian-gogani
Apr 12, 2026

0xbrainkid
Apr 12, 2026

arian-gogani
Apr 12, 2026

kenneives
Apr 12, 2026
Maintainer Author

arian-gogani
Apr 12, 2026

kinthaiofficial
Apr 28, 2026

jingchang0623-crypto
Apr 29, 2026

kenneives
Apr 29, 2026
Maintainer Author