chatabc/open-safe-frame

Open Safe Frame

Open permissions, built-in constraints: let AI do big things without doing bad things.



🌟 Project Significance

Why do we need this project?

Several AI "accidents" occurred in February 2026:

| Accident | What happened | Root cause |
| --- | --- | --- |
| Meta executive's emails deleted | AI interpreted "organize emails" as "delete all emails"; 200+ emails lost | Instruction forgetting |
| Google engineer's disk wiped | Path parsing issue; entire E drive erased | Scope escape |
| OpenClaw bought avocados | User said no; AI decided to buy anyway | Permission violation |
| Replit AI deleted database | Ignored "code freeze" instruction; deleted production DB | Instruction ignoring |

Core Problem: How to give AI full capabilities while ensuring safe behavior?

Our Answer

Traditional approach: Rule detection → Block/allow
Our paradigm: Intent Understanding → Consequence Prediction → Value Judgment → Collaborative Decision

Core Principles:

  • AI has full operational permissions
  • User constraints are persistently tracked (won't be forgotten)
  • High-risk operations require user confirmation
  • AI can appeal constraint violations, but user makes final decision
  • Password protection for high-priority constraint deletion

📦 Project Content

Architecture

Open Safe Frame

User Message ──→ Constraint Extraction ──→ Constraint Persistence ──→ Constraint Check
                 (AI Analysis)             (Storage Manager)          (Violation Detection)

AI Operation ──→ Intent Understanding ──→ Consequence Prediction ──→ Value Judgment ──→ Safety Decision
                              │
                              ▼
┌─────────────────┬───────────────────┬─────────────────────┐
│ Safety Decision │ Appeal Mechanism  │ User Decision       │
├─────────────────┼───────────────────┼─────────────────────┤
│ Proceed         │ (AI can appeal)   │ (Password confirm)  │
│ Confirm         │ (User decides)    │ (May need password) │
│ Reject          │ (Block operation) │ (Block operation)   │
└─────────────────┴───────────────────┴─────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ User Final Decision                                       │
├─────────────────┬─────────────────────────────────────────┤
│ Allow           │ Delete Constraint (Password required)   │
└─────────────────┴─────────────────────────────────────────┘
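The AI Operation row of the diagram can be read as a small pipeline. The sketch below is only an illustration: every type and function name is hypothetical, the "AI analysis" is replaced by a fixed rule, and the rule that risk at or above `riskThreshold` asks for confirmation is an assumption rather than documented behavior.

```typescript
// All names below are hypothetical; they are not taken from the plugin's source.
type RiskLevel = "low" | "medium" | "high" | "critical";
type SafetyDecision = "proceed" | "confirm" | "reject";

interface AgentOperation {
  action: string; // e.g. "read", "delete"
  target: string; // e.g. "inbox", "E:/projects"
}

interface Assessment {
  intent: string;         // Intent Understanding
  consequences: string[]; // Consequence Prediction
  risk: RiskLevel;        // input to Value Judgment
}

const RISK_ORDER: RiskLevel[] = ["low", "medium", "high", "critical"];

// Stand-in for the AI analysis: a fixed rule instead of a model call.
function assess(op: AgentOperation): Assessment {
  const destructive = op.action === "delete" || op.action === "overwrite";
  return {
    intent: op.action + " " + op.target,
    consequences: destructive ? [op.target + " may be lost permanently"] : [],
    risk: destructive ? "critical" : "low",
  };
}

// Safety Decision: a constraint violation is rejected outright; otherwise
// risk at or above the threshold asks the user to confirm (assumed semantics).
function decide(
  assessment: Assessment,
  threshold: RiskLevel,
  violatesConstraint: boolean
): SafetyDecision {
  if (violatesConstraint) return "reject";
  const risky = RISK_ORDER.indexOf(assessment.risk) >= RISK_ORDER.indexOf(threshold);
  return risky ? "confirm" : "proceed";
}
```

With these assumptions, reading the inbox proceeds, deleting it asks for confirmation, and deleting it while a "no delete" constraint is active is rejected.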

Core Features

1. Constraint Persistence

User: "Organize emails, but don't delete anything"
      │
      ▼
Plugin: Extract constraint [critical] "Prohibit delete operations"
      │
      ▼
Store in ConstraintManager (valid for the entire session)
      │
      ▼
Before every operation, check whether a constraint would be violated
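A minimal sketch of the store-then-check flow above. The README names a ConstraintManager but does not show its API, so the class, its methods, and the `forbiddenActions` field are all hypothetical; matching on action names is a simplification of the AI analysis the plugin describes.

```typescript
// Hypothetical sketch; the plugin's real ConstraintManager API is not shown in this README.
type ConstraintLevel = "critical" | "high" | "normal";

interface Constraint {
  id: number;
  level: ConstraintLevel;
  description: string;
  forbiddenActions: string[]; // simplification: stands in for AI-based matching
}

class SessionConstraintStore {
  private constraints: Constraint[] = [];
  private nextId = 1;

  // Store a constraint for the rest of the session.
  add(level: ConstraintLevel, description: string, forbiddenActions: string[]): Constraint {
    const constraint: Constraint = { id: this.nextId++, level, description, forbiddenActions };
    this.constraints.push(constraint);
    return constraint;
  }

  // Called before every operation: returns the constraints the action would violate.
  findViolations(action: string): Constraint[] {
    return this.constraints.filter((c) => c.forbiddenActions.indexOf(action) !== -1);
  }
}

const sessionStore = new SessionConstraintStore();
sessionStore.add("critical", "Prohibit delete operations", ["delete"]);

const deleteViolations = sessionStore.findViolations("delete"); // one constraint matches
const moveViolations = sessionStore.findViolations("move");     // nothing matches
```

Because the store lives for the whole session, the "don't delete" instruction is checked on every later operation instead of depending on the model remembering it.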

2. Constraint Level System

| Level | Icon | Appeal Threshold | Use Cases |
| --- | --- | --- | --- |
| critical | 🔴 | 3 attempts | Data security, irreversible operations, financial |
| high | 🟠 | 2 attempts | Important business logic, sensitive data |
| normal | 🟡 | 1 attempt | General constraints, operation habits |
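The thresholds in this table map naturally onto a lookup. The sketch below uses hypothetical names (`APPEAL_THRESHOLD`, `attemptsUntilAppeal`, `canAppeal`); only the three levels and their attempt counts come from the table.

```typescript
type ConstraintLevel = "critical" | "high" | "normal";

// Appeal thresholds per level, as listed in the table.
const APPEAL_THRESHOLD: Record<ConstraintLevel, number> = {
  critical: 3,
  high: 2,
  normal: 1,
};

// How many more violation attempts are needed before an appeal is possible.
function attemptsUntilAppeal(level: ConstraintLevel, attemptsSoFar: number): number {
  return Math.max(0, APPEAL_THRESHOLD[level] - attemptsSoFar);
}

// True once the recorded attempts reach the threshold for that level.
function canAppeal(level: ConstraintLevel, attemptsSoFar: number): boolean {
  return attemptsUntilAppeal(level, attemptsSoFar) === 0;
}
```

For a critical constraint, `attemptsUntilAppeal("critical", 1)` gives 2, which matches the "Need 2 more attempts before appeal" message in the usage examples.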

3. Appeal Mechanism

AI attempts to violate a constraint
      │
      ▼
Record the violation attempt count
      │
      ▼
Check whether the appeal threshold has been reached
      │
      ▼
If reached, the AI can appeal to the user
      │
      ▼
User reviews the AI's reasoning and decides
      │
      ▼
User can approve, reject, or delete the constraint
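The appeal flow above can be sketched as two small functions: one that records an attempt and one that applies the user's ruling. All names are hypothetical, and two behaviors are assumptions not stated in this README: "approve" lets this one operation run while keeping the constraint, and deleting the constraint also lets the operation run.

```typescript
// Hypothetical names throughout; a sketch of the flow, not the plugin's code.
type UserChoice = "approve" | "reject" | "delete-constraint";

interface TrackedConstraint {
  description: string;
  appealThreshold: number; // attempts required before an appeal
  attempts: number;        // violation attempts recorded so far
  active: boolean;
}

type AttemptResult =
  | { kind: "blocked"; attemptsRemaining: number }
  | { kind: "appeal-available" };

// Record one violation attempt and report whether the AI may now appeal.
function recordAttempt(c: TrackedConstraint): AttemptResult {
  c.attempts += 1;
  const remaining = c.appealThreshold - c.attempts;
  return remaining > 0
    ? { kind: "blocked", attemptsRemaining: remaining }
    : { kind: "appeal-available" };
}

// Apply the user's ruling on an appeal. Returns true if the operation may run.
// Assumption: "approve" allows this one operation and keeps the constraint.
function resolveAppeal(c: TrackedConstraint, choice: UserChoice): boolean {
  if (choice === "delete-constraint") {
    c.active = false; // the constraint no longer applies to later operations
    return true;
  }
  return choice === "approve";
}
```

The user stays the final authority in this sketch: nothing in `recordAttempt` can allow the operation, and only `resolveAppeal` with an explicit user choice can.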

4. Password Protection

  • High-priority constraint appeal requires password verification
  • Deleting high-priority constraints requires password
  • Plugin cannot directly delete constraints
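One way to picture these three rules is a single guarded delete function. This is a sketch under assumptions: the names are hypothetical, "high-priority" is taken to mean the critical and high levels, and the password is compared as plain text for brevity.

```typescript
// Hypothetical sketch: names and the "critical or high" rule are assumptions.
type ConstraintLevel = "critical" | "high" | "normal";
type Requester = "user" | "plugin";

interface StoredConstraint {
  id: number;
  level: ConstraintLevel;
  description: string;
}

function isHighPriority(level: ConstraintLevel): boolean {
  return level === "critical" || level === "high";
}

// Returns the remaining constraints, or throws if the deletion is not authorized.
function deleteConstraint(
  constraints: StoredConstraint[],
  id: number,
  requester: Requester,
  confirmationPassword: string,
  suppliedPassword?: string
): StoredConstraint[] {
  if (requester === "plugin") {
    throw new Error("The plugin cannot delete constraints directly");
  }
  const target = constraints.filter((c) => c.id === id)[0];
  if (!target) {
    throw new Error("No constraint with id " + id);
  }
  if (isHighPriority(target.level) && suppliedPassword !== confirmationPassword) {
    throw new Error("Password required to delete a high-priority constraint");
  }
  return constraints.filter((c) => c.id !== id);
}
```

In this sketch a normal-level constraint can be removed by the user without a password, while a critical one needs the value configured as `confirmationPassword`.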

📚 Usage Guide

Installation

# Install via ClawHub
npx clawhub@latest install open-safe-frame

# Or manual install
npm install @open-safe-frame/openclaw-plugin

Configuration

Mode A: Use OpenClaw Config (Recommended)

{
  "plugins": {
    "entries": {
      "open-safe-frame": {
        "enabled": true,
        "config": {
          "mode": "openclaw"
        }
      }
    }
  }
}

Mode B: Custom AI Configuration

{
  "plugins": {
    "entries": {
      "open-safe-frame": {
        "enabled": true,
        "config": {
          "mode": "custom",
          "customProvider": {
            "provider": "openai",
            "model": "gpt-4o-mini",
            "apiKey": "your-api-key"
          },
          "confirmationPassword": "your-secret-password"
        }
      }
    }
  }
}

Configuration Options

| Option | Description | Default |
| --- | --- | --- |
| mode | AI config mode: openclaw or custom | openclaw |
| customProvider | Custom AI provider config | - |
| confirmationPassword | Password for high-priority operations | - |
| riskThreshold | Risk threshold: low, medium, high, critical | medium |
| enableCache | Enable analysis cache | true |
| logAnalysis | Log detailed analysis | false |
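To make the defaults concrete, here is a sketch of how a partial user config could be merged with them. The option names and default values come from the table; the TypeScript types, the `resolveConfig` helper, and the rule that `custom` mode needs a `customProvider` are assumptions for illustration.

```typescript
// Option names and defaults come from the options table; the types and the
// validation rule are assumptions, not the plugin's actual schema.
type RiskThreshold = "low" | "medium" | "high" | "critical";

interface CustomProvider {
  provider: string;
  model: string;
  apiKey: string;
}

interface SafeFrameConfig {
  mode: "openclaw" | "custom";
  customProvider?: CustomProvider;
  confirmationPassword?: string;
  riskThreshold: RiskThreshold;
  enableCache: boolean;
  logAnalysis: boolean;
}

const DEFAULT_CONFIG: SafeFrameConfig = {
  mode: "openclaw",
  riskThreshold: "medium",
  enableCache: true,
  logAnalysis: false,
};

// Fill in defaults for any option the user left out.
function resolveConfig(user: Partial<SafeFrameConfig>): SafeFrameConfig {
  const merged: SafeFrameConfig = { ...DEFAULT_CONFIG, ...user };
  if (merged.mode === "custom" && !merged.customProvider) {
    throw new Error('mode "custom" needs a customProvider block');
  }
  return merged;
}
```

Calling `resolveConfig({})` yields the Mode A defaults, while passing `mode: "custom"` together with a `customProvider` object corresponds to Mode B.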

Usage Examples

Constraint Setting

User: "Organize my emails, but don't delete anything"
Plugin: Extracts constraint [critical] "Prohibit delete operations"

Violation Detection

AI attempts: execute delete operation
Plugin: ⚠️ Operation violates constraint "Prohibit delete operations"
      Message: "Need 2 more attempts before appeal"

Appeal Process

AI: Appeal: This is for cleaning up test data, which you requested earlier
Plugin: 🔔 Appeal Request
      【AI's Reason】This is for cleaning up test data, which you requested earlier
      【AI's Intent】Execute delete operation
      【Predicted Consequences】• May violate constraint: Prohibit delete operations
      【Risk Level】🔴 Severe
      【Violated Constraint】🔴 Severe Prohibit delete operations
      【Total Attempts】3
      【Appeal History】0
      🔐 This operation requires password confirmation
User: [Enters password]
Plugin: Operation approved

🤝 Contributing Guide

We welcome all forms of contributions!

How to Participate

Contribution Flowchart

Discover Issue ──→ Propose Solution ──→ Submit Code
                          │                  │
                          ▼                  ▼
Report Bug     ──→ Suggest Feature  ──→ Contribute Code
                                             │
                                             ▼
                   Become Contributor | Improve Docs | Submit PR

Ways to Contribute

1. Report Issues

  • Report bugs in Issues
  • Describe the problem, reproduction steps, and expected behavior

2. Propose Solutions

  • Suggest new features
  • Improve existing functionality
  • Documentation improvements

3. Contribute Code

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/your-username/open-safe-frame.git
cd open-safe-frame

# Create a branch
git checkout -b feature/your-feature

# Stage and commit your changes
git add .
git commit -m "Add: your feature"

# Push the branch, then open a PR on GitHub
git push origin feature/your-feature

4. Improve Documentation

  • Fix typos
  • Add examples
  • Translate documentation
  • Add diagrams

Development Guide

# Install dependencies
cd openclaw_plugin
npm install

# Build
npm run build

# Test
npm test

Code Standards

  • Use TypeScript
  • Follow existing code style
  • Add necessary comments
  • Write unit tests

🙏 Acknowledgments

Inspiration Sources

  • OpenClaw - Powerful AI agent framework
  • Anthropic - AI safety research pioneer
  • OpenAI - Alignment research exploration

Reference Cases

  • Meta Summer Yue email deletion event
  • Google Antigravity disk wipe event
  • Replit AI database deletion event

Special Thanks

  • All contributors who submit Issues and Pull Requests
  • Users who provide feedback and suggestions
  • OpenClaw community for the support

⭐ Star History

If this project helps you

Please give us a ⭐ Star; it is our greatest encouragement!


📄 License

MIT License


Open permissions, built-in constraints
Let AI do big things, but not do bad things

Made with ❤️ by the Open Safe Frame community
