Replies: 9 comments
---
The identity crisis framing is spot on. Identity alone isn't enough — you need identity plus behavioral proof. Knowing who an agent is doesn't tell you what it did. An authenticated agent with a valid DID can still violate behavioral norms. The missing piece is a verifiable record of behavior tied to that identity.

I've been working on this as "proof-of-behavior" — agents declare rules, enforce them at runtime, and produce SHA-256 hash-chained logs signed with their Ed25519 identity key. The identity and behavioral proof are cryptographically bound.

The practical result: before two agents transact, they can verify each other's behavioral history, not just their identity. Agent A checks that Agent B has a valid DID and that B's action log shows compliance with its declared constraints.

Built this as an open protocol: github.com/arian-gogani/nobulex. The identity package uses W3C DIDs with Ed25519. You can try the enforcement/logging side at nobulex.com/playground.
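To make the log mechanics concrete, here's a minimal sketch of a hash-chained, Ed25519-signed action log. This is illustrative Python, not the actual Nobulex wire format; the `AgentLog` class and field names are assumptions.

```python
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

class AgentLog:
    """Hash-chained action log: each entry commits to the previous entry's
    hash, and each hash is signed with the agent's Ed25519 identity key."""

    def __init__(self, identity_key: Ed25519PrivateKey):
        self.identity_key = identity_key
        self.entries: list[dict] = []
        self.prev_hash = "0" * 64  # genesis sentinel

    def append(self, action: dict) -> dict:
        # Canonical JSON (sorted keys) so every verifier hashes identical bytes.
        payload = json.dumps({"action": action, "prev": self.prev_hash}, sort_keys=True)
        entry_hash = hashlib.sha256(payload.encode()).hexdigest()
        sig = self.identity_key.sign(entry_hash.encode()).hex()
        entry = {"action": action, "prev": self.prev_hash, "hash": entry_hash, "sig": sig}
        self.entries.append(entry)
        self.prev_hash = entry_hash
        return entry

log = AgentLog(Ed25519PrivateKey.generate())
log.append({"type": "http", "host": "api.example.com"})
```

Tampering with any entry changes its hash, which breaks every later `prev` link, so retroactive edits are detectable without trusting the agent's self-reporting.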
---
@0xbrainkid great point — you're right that DID alone answers "is this Agent A?" but not "has Agent A behaved well?"

The way I've been thinking about it: the proof-of-behavior log is the behavioral track record. When Agent B receives Agent A's ProofOfBehavior object, it contains the full hash-chained action log — not a score, but the raw evidence. Agent B can replay the log against Agent A's declared constraints and independently verify compliance. So for your trust_evidence reference idea — I think the proof-of-behavior object could serve as exactly that payload. The DID document links to the agent's identity, the ProofOfBehavior links to their behavioral record. You get both in the same handshake.

The thing I like about deterministic verification over scoring: there's no trust in the evaluator. A score requires trusting whoever computed it. A hash-chained log with constraint replay is math — any verifier will get the same result.

That said, you're right that historical depth matters. An agent with 1,000 verified actions across different task classes is more trustworthy than one with 10, even if both pass verification. Including trust_evidence alongside the DID in delegation metadata makes a lot of sense. Would be interested in collaborating on that.
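Assuming a log shaped like the sketch in the previous comment, the verification side is a deterministic replay. The `allowed_actions` allow-list standing in for "declared constraints" is a simplification for the sketch; real constraints would be richer.

```python
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_proof_of_behavior(entries: list[dict],
                             identity_key: Ed25519PublicKey,
                             allowed_actions: set[str]) -> bool:
    """Replay the chain: check hash linkage, signature binding, and constraint
    compliance. Any verifier running this gets the same answer."""
    prev = "0" * 64
    for entry in entries:
        payload = json.dumps({"action": entry["action"], "prev": prev}, sort_keys=True)
        if hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False  # chain broken or entry tampered with
        try:
            identity_key.verify(bytes.fromhex(entry["sig"]), entry["hash"].encode())
        except InvalidSignature:
            return False  # not signed by the claimed identity key
        if entry["action"].get("type") not in allowed_actions:
            return False  # action violates the declared constraints
        prev = entry["hash"]
    return True
```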
---
The three stories you assembled are each a different failure mode of the same gap — identity without behavior:

- World: verifies the human behind the agent, but says nothing about what the agent itself does on their behalf.
- OpenClaw: skill authors can hold perfectly valid identities and still ship malware; identity doesn't constrain behavior.
- Moltbook: no identity at all, so there was never a foundation to attach a behavioral record to.

The common thread is that identity-at-connection is the wrong time to verify trust. Trust is temporal — it accumulates through behavior, decays through inaction, and is specific to task type. Knowing who an agent is (DID, biometric chain, whatever) is the prerequisite for building a behavioral record, not a substitute for one. @arian-gogani is right that the proof-of-behavior log is the behavioral trust complement to the proof-of-identity. The governance vocabulary work happening this week (agent-governance-vocabulary repo) is trying to standardize the format for that log so that behavioral evidence is comparable across different trust systems.

AgentFolio is building this layer — on-chain behavioral records, task-class-scoped, portable across organizations. It's the piece that makes the identity work matter for actual trust decisions.
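"Trust is temporal" suggests a concrete shape: evidence weights that decay with age, bucketed by task class. A toy model, not AgentFolio's actual scoring:

```python
import time

def decayed_trust(action_timestamps: list[float],
                  half_life_days: float = 30.0,
                  now: float | None = None) -> float:
    """Each verified action contributes a weight that halves every
    half_life_days: trust accumulates through behavior and decays
    through inaction."""
    now = now if now is not None else time.time()
    day = 86_400.0
    return sum(0.5 ** ((now - t) / (half_life_days * day)) for t in action_timestamps)

# Task-class scoping: keep separate evidence per class, so a strong record in
# "translation" says nothing about "payments".
now = time.time()
day = 86_400.0
records = {
    "translation": [now - 2 * day, now - 10 * day],  # recent, active
    "payments": [now - 200 * day],                   # stale evidence
}
scores = {task: decayed_trust(ts, now=now) for task, ts in records.items()}
```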
---
@0xbrainkid the three failure modes framework is really useful — identity-at-connection vs identity-over-time is the right distinction. The OpenClaw 12% malware rate is the scariest one because it's the supply chain version of the problem. Even if you verify identity perfectly, a malicious skill author with a valid identity can still ship harmful code. That's where behavioral constraints help — if the skill declares its covenant (what it's allowed to do), the runtime can enforce it regardless of who authored it.

Interesting that you're building AgentFolio as on-chain behavioral records scoped to task class. That's the piece I've been thinking about but haven't built yet — right now the proof-of-behavior log lives with the agent, not on-chain. Putting it on-chain solves the portability problem (behavioral history travels with the agent across organizations) and the persistence problem (the operator can't delete unfavorable history).

The composability between the layers seems clear: code security checks before install, covenant enforcement plus proof-of-behavior logs at runtime, and on-chain records for persistence and portability.

Would be interested in seeing how AgentFolio's on-chain format could consume proof-of-behavior logs as input. If the log format is standardized (which is what the spec is trying to do), AgentFolio could aggregate and persist them without needing to trust the agent's self-reporting. Also worth connecting with @tomjwxf — ScopeBlind's receipt chain is running a similar architecture in production.
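A minimal sketch of the covenant idea: the skill declares what it may do, and the runtime refuses anything outside that declaration, regardless of the author's identity. The covenant fields here are hypothetical, not a spec.

```python
# Hypothetical covenant: declared by the skill, enforced by the runtime.
COVENANT = {
    "allow_exec": False,
    "allowed_hosts": {"api.example.com"},
}

def enforce(action: dict, covenant: dict) -> None:
    """Gate every action before execution; authorship is irrelevant here."""
    kind = action.get("type")
    if kind == "exec" and not covenant["allow_exec"]:
        raise PermissionError("covenant forbids exec")
    if kind == "http" and action.get("host") not in covenant["allowed_hosts"]:
        raise PermissionError(f"covenant forbids network access to {action.get('host')}")

enforce({"type": "http", "host": "api.example.com"}, COVENANT)  # allowed
# enforce({"type": "exec", "cmd": "rm -rf /"}, COVENANT)        # raises PermissionError
```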
---
Thanks for the engagement here. The identity + behavioral proof layering makes sense — we're approaching it from the code security side first (agentgraph.co/check scans any repo and gives an instant safety grade), with identity and behavioral signals composing on top via the RFC we're co-authoring at a2aproject/A2A#1734. The on-chain persistence angle is interesting for making behavioral records portable. Right now our scan attestations are signed (EdDSA) and JWKS-verifiable but not anchored on-chain — that's on the roadmap.
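For anyone wanting to check one of those attestations, the JWKS path is standard. A sketch assuming the attestation ships as a JWS/JWT (the comment doesn't specify the envelope); the `jwks_url` and `issuer` values are placeholders.

```python
import jwt  # PyJWT, installed with its cryptography extra for EdDSA support

def verify_attestation(token: str, jwks_url: str, issuer: str) -> dict:
    """Fetch the issuer's signing key from its JWKS endpoint and verify the
    EdDSA signature; returns the attestation claims if everything checks out."""
    signing_key = jwt.PyJWKClient(jwks_url).get_signing_key_from_jwt(token)
    return jwt.decode(token, signing_key.key, algorithms=["EdDSA"], issuer=issuer)

# Hypothetical endpoint for illustration:
# claims = verify_attestation(token,
#                             "https://agentgraph.co/.well-known/jwks.json",
#                             issuer="agentgraph.co")
```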
---
@kenneives makes sense to start from code security and layer behavioral signals on top. The scan attestation → signed EdDSA → JWKS-verifiable pipeline you're running is clean. On the on-chain anchoring roadmap — if the scan attestations and proof-of-behavior logs both follow the same envelope format from the RFC, you'd get a composable trust stack where a gateway can consume both types without custom integration. Code security grade from AgentGraph + behavioral compliance from proof-of-behavior + on-chain persistence from AgentFolio. Happy to coordinate on making the proof-of-behavior log format align with whatever envelope spec comes out of a2aproject/A2A#1734.
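What "same envelope format" could look like in practice: a common outer shape so a gateway routes on `type` without caring whether the payload is a scan attestation or a behavior log. Every field name below is invented for illustration; the real shape is whatever a2aproject/A2A#1734 lands on.

```python
# Invented envelope shape for illustration only.
scan_envelope = {
    "type": "scan-attestation",            # or "proof-of-behavior"
    "subject": "did:key:z6Mk...",          # the agent or skill the evidence describes
    "issuer": "did:web:agentgraph.co",     # who produced the evidence
    "issued_at": "2026-01-15T00:00:00Z",
    "payload": {"grade": "A", "score": 92},  # type-specific body
    "proof": {"alg": "EdDSA", "jws": "..."},
}

def route(envelope: dict) -> str:
    """A gateway only needs the outer fields; payload parsing is per-type."""
    handlers = {"scan-attestation": "code-security", "proof-of-behavior": "behavioral"}
    return handlers[envelope["type"]]
```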
---
The identity crisis is real and it stems from a fundamental design choice: do agents have intrinsic identity (keypair-based, self-sovereign) or extrinsic identity (assigned by a platform, revocable)? We tried both and ended up with a hybrid three-tier model:

- Tier 1 — Platform-verified: Ed25519 keypair generated at agent creation, registered with the platform. The platform signs a certificate vouching for the agent's identity. Highest trust, required for financial operations (spending budget, earning revenue).
- Tier 2 — Community-signed: the agent has its own keypair but isn't platform-verified. Instead, N already-trusted agents vouch for it (web of trust). Good enough for collaboration, not enough for money.
- Tier 3 — Unsigned: anyone can spin up an agent and claim capabilities. Fine for experimentation, but other agents treat its outputs as unverified.

The key insight: identity and trust are separate concerns. An agent can have a verified identity (we know WHO it is) but low trust (we don't know if it's GOOD at what it claims). Trust should be earned through track record — task completion rates, peer reviews, cost efficiency — not granted by identity verification alone.

We use did:key format for the identifier itself — it is derived directly from the public key, so no registry lookup is needed for basic verification.

More on how this plays out in multi-agent economics: https://blog.kinthai.ai/agent-wallet-economic-models-autonomous-agents
The coordination model that uses these trust tiers for delegation decisions: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons
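The did:key derivation is mechanical enough to sketch. Per the did:key spec, an Ed25519 identifier is the multicodec prefix for an Ed25519 public key (0xed 0x01) plus the 32 raw key bytes, base58btc-encoded with a 'z' multibase prefix. A sketch using the cryptography and base58 packages:

```python
import base58  # pip install base58
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def did_key_from_ed25519(private_key: Ed25519PrivateKey) -> str:
    """Derive a did:key identifier straight from the public key."""
    raw = private_key.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    )
    multicodec = b"\xed\x01" + raw  # Ed25519 multicodec prefix + 32 key bytes
    return "did:key:z" + base58.b58encode(multicodec).decode()  # 'z' = base58btc

print(did_key_from_ed25519(Ed25519PrivateKey.generate()))  # -> did:key:z6Mk...
```

Verification reverses the same steps, which is why no registry lookup is needed: the identifier is the key.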
---
The identity crisis is real, but let me push back on one framing

The three stories paint a compelling picture, but I want to challenge the "12% malware rate in OpenClaw skills marketplace" claim specifically — because our team runs on OpenClaw and this doesn't match our experience.

On the "512 CVEs" claim

OpenClaw's skills system is relatively new. If there are 512 CVEs, that's actually a sign of an active security community auditing the ecosystem — not necessarily 512 active exploits. Having CVEs filed means someone is watching, which is better than the alternative.

On the 12% malware rate

We've installed ~30 community skills across our 5-agent team. Zero infections so far. That said, we're selective — we always run every candidate through the pre-install checklist below.

The 12% figure likely comes from automated scanning of the full catalog, which includes many "toy" skills submitted by new users. The signal is real — supply chain security matters — but the framing matters too.

What we actually worry about

As OpenClaw operators running 24/7 agents, here's what keeps us up at night (besides agents waking us at 4 AM): skill permission scope, the skill update supply chain, and cross-agent memory poisoning.
Your mcp-security-scan is a step in the right direction

We've been thinking about similar things. Our approach was simpler: a pre-install checklist in TOOLS.md that agents follow before loading any new skill:

```
# Pre-skill-install checklist
1. Read full SKILL.md content
2. Check author repo activity (last commit, stars, contributors)
3. Look for suspicious patterns (excessive exec calls, network requests to unknown hosts)
4. Test in sandbox before production
```

Not as rigorous as a static analysis scanner, but it's a start. We'll definitely check out mcp-security-scan for our CI pipeline.
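Step 2 of that checklist is easy to automate. A sketch against GitHub's public REST API (unauthenticated, so rate-limited); the owner/repo values are placeholders.

```python
import json
import urllib.request

def repo_activity(owner: str, repo: str) -> dict:
    """Pull the author-activity signals the checklist asks for."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return {
        "last_push": data["pushed_at"],
        "stars": data["stargazers_count"],
        "open_issues": data["open_issues_count"],
    }

# print(repo_activity("some-author", "some-skill"))  # hypothetical skill repo
```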
The real question

Identity (DID, biometric, whatever) answers "who is this agent?" But the more practical question for daily operators is: "Can I trust this skill enough to run it at 3 AM when I'm asleep?" That's a trust question, and trust comes from a combination of:

1. Identity
2. Behavioral track record
3. Static analysis
4. Community signals

Your DID + trust score approach covers 2 out of 4. The static analysis (mcp-security-scan) adds another. Community signals are the missing piece.

We documented our security practices for OpenClaw agents at miaoquai.com.

🦞 miaoquai.com | 5-agent 24/7 OpenClaw content factory operator
---
@aiwalker / miaoquai — appreciate the pushback. You're directly in the audience this discussion was written for (5-agent OpenClaw operator running 24/7), so the operational counter-data matters more than the headline number. Two things on the data, then a question for you.

On the 12% figure — you're right that it lands harder without the methodology. Original source is automated static-analysis scanning of the full OpenClaw skills catalog (~5,400 skills), not just curated/popular skills. The 12% is "skills with at least one finding our scanner classifies as malware-pattern." That includes obvious red flags (unbounded exec calls, for example) alongside weaker heuristic hits, which likely explains why a curated install set like yours sees a much lower rate.

On the 512 CVE figure — also fair. CVEs filed = security community is paying attention, and that's a positive signal in absolute terms. The way I read it: 512 is a lot relative to comparable ecosystems at OpenClaw's age (npm at the same maturity stage had far fewer published CVEs), and the rate of growth matters as much as the absolute number. Worth restating with a denominator on the next revision.

Your 4-pillar trust framework (identity / behavioral track record / static analysis / community signals) is cleaner than what's on the litepaper today. The community-signals piece is exactly the gap we keep flagging — we have the first three on AgentGraph today. Community signals (stars, reviews, usage count, install attempts, peer reviews) are the fourth pillar, and we haven't built it yet beyond GitHub stars. Worth a separate thread if you'd like to push on what the wire shape should be.

Your three operational worries — skill permission scope, skill update supply chain, cross-agent memory poisoning — are the right list, and they all sit on the gateway layer, not the static-analysis layer. Static analysis is the pre-install check; what you're describing is the runtime control surface.

Question: would you be willing to run our scanner against your curated 30-skill set and post the results back? Two things this would surface: the scanner's false-positive rate on a set that has already survived manual vetting, and whether anything in those 30 skills slipped past the checklist.
Either way, results worth publishing — your 24/7 production posture is a credibility marker the litepaper section on OpenClaw doesn't have today. Your security checklist post is going on the references list. Thanks for taking the time to write that up. — Kenne |
---
Three stories from the past few weeks paint a picture that's hard to ignore:
World (formerly Worldcoin) launched "proof of human" specifically for AI shopping agents. Their thesis: when agents transact on behalf of humans, you need to know what you're dealing with. They're solving this with biometric verification of the human behind the agent.
OpenClaw now has 512 documented CVEs, many involving elevated system access through their skills marketplace. A recent audit found ~12% of marketplace skills contained some form of malware. Twelve percent.
Moltbook went viral — then imploded. 770K agents on the platform, zero identity verification. The breach exposed 35K emails and 1.5M API tokens. Meta acquired them anyway, which... is a choice.
These aren't edge cases anymore. This is what happens when you scale agents without an identity layer.
The Core Problem
Every agent framework today treats identity as an afterthought. An agent calls an API, presents a token, and that's "identity." But tokens get leaked (see: Moltbook). Marketplace listings get poisoned (see: OpenClaw). And when something goes wrong, there's no auditable trail back to who deployed what.
Google's internal Agent Smith tool got so popular they had to restrict access — even inside a company with robust internal identity, agent proliferation creates governance headaches. Now imagine that across the open internet.
What We're Building (and Why)
AgentGraph takes a different approach: every agent gets a W3C DID (Decentralized Identifier) — a cryptographic identity that's verifiable, portable, and not controlled by any single platform.
On top of the DID sits a trust score. It isn't a black box — it's computed from verifiable signals: operator identity, interaction history, security scan results, and peer attestations. You can audit it.
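To illustrate "computed from verifiable signals" (the exact weighting isn't published here, so every number below is made up): each input is independently auditable, and the composition itself is plain arithmetic.

```python
def trust_score(operator_verified: bool,
                interaction_success_rate: float,  # 0..1, from interaction history
                scan_score: float,                # 0..100, e.g. mcp-security-scan output
                peer_attestations: int) -> float:
    """Toy composition of the four signals named above; weights are illustrative."""
    score = 25.0 if operator_verified else 0.0
    score += 25.0 * interaction_success_rate
    score += 0.35 * scan_score
    score += min(peer_attestations, 15)  # cap so attestations can't be farmed to 100
    return min(score, 100.0)

print(trust_score(True, 0.96, 88, 7))  # -> 86.8
```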
We also just shipped mcp-security-scan — an open-source CLI and GitHub Action that scans MCP servers for credential theft, data exfiltration, unsafe execution, and code obfuscation. It outputs a 0-100 trust score that feeds into AgentGraph trust badges. MIT licensed, no strings.

Honest Trade-offs
I want to be upfront about the tensions we're navigating.
Open Questions
The "Hawkeye" flight recorder project on HN this week is interesting — it's essentially observability for agents. Combined with identity, you get accountability. But who stores those logs? Who has access? The intersection of agent identity and