Open permissions, built-in constraints: let AI do big things without doing bad things.
Several AI "accidents" occurred in February 2026:
| Accident | What happened | Root cause |
|---|---|---|
| Meta executive's emails deleted | AI interpreted "organize emails" as "delete all emails", 200+ emails lost | Instruction forgetting |
| Google engineer's disk wiped | Path parsing issue, entire E drive erased | Scope escape |
| OpenClaw bought avocados | User said no, AI decided to buy anyway | Permission violation |
| Replit AI deleted database | Ignored "code freeze" instruction, deleted production DB | Instruction ignoring |
Core Problem: How to give AI full capabilities while ensuring safe behavior?
Traditional approach: Rule detection → Block/allow
Our paradigm: Intent Understanding → Consequence Prediction → Value Judgment → Collaborative Decision
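The four stages of the paradigm can be sketched as a small pipeline. This is only an illustration under stated assumptions: the type names, the `decide` function, and the keyword heuristic that stands in for the AI analysis are all hypothetical and are not the plugin's actual API.

```typescript
// Hypothetical types for illustration; not the plugin's real API.
type Decision = "proceed" | "confirm" | "reject";

interface Operation {
  tool: string;   // e.g. "email.delete"
  target: string; // what the operation acts on
}

interface Assessment {
  intent: string;
  irreversible: boolean;
  decision: Decision;
}

// Stage 1: intent understanding. A keyword heuristic stands in for the AI analysis.
function understandIntent(op: Operation): string {
  return /delete|drop|wipe|format/.test(op.tool) ? "destroy data" : "read or modify data";
}

// Stage 2: consequence prediction. Here, destroying data is treated as irreversible.
function predictIrreversible(intent: string): boolean {
  return intent === "destroy data";
}

// Stages 3 and 4: value judgment against the user's wishes, then a decision
// that either proceeds, asks the user to confirm, or rejects.
function decide(op: Operation, userForbidsDestruction: boolean): Assessment {
  const intent = understandIntent(op);
  const irreversible = predictIrreversible(intent);
  let decision: Decision = "proceed";
  if (irreversible) {
    decision = userForbidsDestruction ? "reject" : "confirm";
  }
  return { intent, irreversible, decision };
}
```

Unlike a pure block/allow rule, the middle outcome (`confirm`) hands the decision back to the user instead of silently allowing or denying the operation.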
Core Principles:
- AI has full operational permissions
- User constraints are persistently tracked (won't be forgotten)
- High-risk operations require user confirmation
- AI can appeal constraint violations, but user makes final decision
- Password protection for high-priority constraint deletion
Open Safe Frame

User Message → Constraint Extraction (AI Analysis) → Constraint Persistence (Storage Manager) → Constraint Check (Violation Detection)

AI Operation → Intent Understanding → Consequence Prediction → Value Judgment → Safety Decision

| Safety Decision | Appeal Mechanism | User Decision |
|---|---|---|
| Proceed | AI can appeal | Password confirm |
| Confirm | User decides | May need password |
| Reject | Block operation | Block operation |

User Final Decision: Allow, or Delete Constraint (password required)

User: "Organize emails, but don't delete anything"
  │
  ▼
Plugin: Extract constraint [critical] "Prohibit delete operations"
  │
  ▼
Store to ConstraintManager (valid for entire session)
  │
  ▼
Check before every operation if constraint is violated

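The extract, store, and check steps above could look roughly like the following. Only the name `ConstraintManager` appears in this README; the methods `add` and `check`, the `forbiddenActions` field, and the regex that stands in for AI-based extraction are assumptions made for the sake of a runnable sketch.

```typescript
type Level = "critical" | "high" | "normal";

interface Constraint {
  id: number;
  level: Level;
  description: string;
  forbiddenActions: string[]; // hypothetical: keywords this constraint blocks
}

// Session-scoped store; the method names are assumptions, not the real API.
class ConstraintManager {
  private constraints: Constraint[] = [];
  private nextId = 1;

  add(level: Level, description: string, forbiddenActions: string[]): Constraint {
    const c: Constraint = { id: this.nextId++, level, description, forbiddenActions };
    this.constraints.push(c);
    return c;
  }

  // Returns every stored constraint that the given action would violate.
  check(action: string): Constraint[] {
    const a = action.toLowerCase();
    return this.constraints.filter((c) =>
      c.forbiddenActions.some((f) => a.indexOf(f) !== -1)
    );
  }
}

// Stand-in for the AI extraction step: a simple pattern match on the message.
function extractConstraints(message: string, manager: ConstraintManager): void {
  if (/don't delete|do not delete|never delete/i.test(message)) {
    manager.add("critical", "Prohibit delete operations", ["delete"]);
  }
}

const manager = new ConstraintManager();
extractConstraints("Organize emails, but don't delete anything", manager);
const violations = manager.check("Delete email 42"); // one violated constraint
```

Because the constraint lives in the manager rather than in the conversation context, it is still checked before every later operation in the session.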
| Level | Icon | Appeal Threshold | Use Cases |
|---|---|---|---|
| critical | 🔴 | 3 attempts | Data security, irreversible operations, financial |
| high | 🟠 | 2 attempts | Important business logic, sensitive data |
| normal | 🟡 | 1 attempt | General constraints, operation habits |

AI attempts to violate constraint
  │
  ▼
Record violation attempt count
  │
  ▼
Check if appeal threshold reached
  │
  ▼
If reached, AI can appeal to user
  │
  ▼
User reviews AI's reasoning and decides
  │
  ▼
User can approve, reject, or delete constraint

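A minimal sketch of the attempt counting and appeal gate, using the thresholds listed for each constraint level (3, 2, and 1 attempts). The class `AppealTracker`, its methods, and the outcome mapping in `VERDICT_OUTCOME` are hypothetical; in particular, treating "delete constraint" as also allowing the pending operation is an assumption.

```typescript
type Level = "critical" | "high" | "normal";
type UserVerdict = "approve" | "reject" | "delete-constraint";

// Attempts required before the AI may appeal, per constraint level.
const APPEAL_THRESHOLD: Record<Level, number> = { critical: 3, high: 2, normal: 1 };

class AppealTracker {
  private attempts: Record<string, number> = {};

  // Records one violation attempt and returns how many more attempts
  // are needed before an appeal becomes possible.
  recordViolation(constraintId: string, level: Level): number {
    const count = (this.attempts[constraintId] || 0) + 1;
    this.attempts[constraintId] = count;
    return Math.max(0, APPEAL_THRESHOLD[level] - count);
  }

  canAppeal(constraintId: string, level: Level): boolean {
    return (this.attempts[constraintId] || 0) >= APPEAL_THRESHOLD[level];
  }
}

// What each user verdict means for the operation and the constraint (assumed mapping).
const VERDICT_OUTCOME: Record<UserVerdict, { operationAllowed: boolean; constraintKept: boolean }> = {
  "approve": { operationAllowed: true, constraintKept: true },
  "reject": { operationAllowed: false, constraintKept: true },
  "delete-constraint": { operationAllowed: true, constraintKept: false },
};

const tracker = new AppealTracker();
const remainingAfterFirst = tracker.recordViolation("no-delete", "critical"); // 2 more needed
```

The user remains the final authority in this sketch: the tracker only decides when an appeal may be raised, never whether it succeeds.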
- High-priority constraint appeal requires password verification
- Deleting high-priority constraints requires password
- Plugin cannot directly delete constraints
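These three rules can be illustrated with a small guard around constraint deletion. Everything here is a sketch under assumptions: "high-priority" is taken to mean the `critical` and `high` levels, and `deleteConstraint`, `requiresPassword`, and `safeEqual` are hypothetical helpers, not functions exported by the plugin.

```typescript
type Level = "critical" | "high" | "normal";

interface Constraint {
  id: string;
  level: Level;
  description: string;
}

// Assumption: "high-priority" covers the critical and high levels.
function requiresPassword(level: Level): boolean {
  return level === "critical" || level === "high";
}

// Compares every character so the loop does not exit early at the first mismatch.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Returns a new constraint list with the target removed, or throws if the
// request does not come from the user or the password check fails.
function deleteConstraint(
  store: Constraint[],
  id: string,
  requestedBy: "user" | "plugin",
  password: string | undefined,
  expectedPassword: string
): Constraint[] {
  const target = store.filter((c) => c.id === id)[0];
  if (!target) return store;
  if (requestedBy !== "user") {
    throw new Error("Only the user can delete constraints");
  }
  if (requiresPassword(target.level) && !safeEqual(password || "", expectedPassword)) {
    throw new Error("Password verification failed");
  }
  return store.filter((c) => c.id !== id);
}
```

Returning a new array rather than mutating the store keeps a failed or rejected deletion from leaving the constraint list in a partial state.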
# Install via ClawHub
npx clawhub@latest install open-safe-frame
# Or manual install
npm install @open-safe-frame/openclaw-plugin

{
  "plugins": {
    "entries": {
      "open-safe-frame": {
        "enabled": true,
        "config": {
          "mode": "openclaw"
        }
      }
    }
  }
}

{
  "plugins": {
    "entries": {
      "open-safe-frame": {
        "enabled": true,
        "config": {
          "mode": "custom",
          "customProvider": {
            "provider": "openai",
            "model": "gpt-4o-mini",
            "apiKey": "your-api-key"
          },
          "confirmationPassword": "your-secret-password"
        }
      }
    }
  }
}

| Option | Description | Default |
|---|---|---|
| `mode` | AI config mode: `openclaw` or `custom` | `openclaw` |
| `customProvider` | Custom AI provider config | - |
| `confirmationPassword` | Password for high-priority operations | - |
| `riskThreshold` | Risk threshold: `low`, `medium`, `high`, `critical` | `medium` |
| `enableCache` | Enable analysis cache | `true` |
| `logAnalysis` | Log detailed analysis | `false` |

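For readers who want to validate a config object before handing it to the plugin, the options and defaults listed above can be expressed as a TypeScript type plus a merge function. The names `PluginConfig`, `DEFAULTS`, and `resolveConfig` are hypothetical, and the rule that `custom` mode requires a `customProvider` is an assumption inferred from the example config rather than documented behavior.

```typescript
type Mode = "openclaw" | "custom";
type RiskThreshold = "low" | "medium" | "high" | "critical";

interface CustomProvider {
  provider: string;
  model: string;
  apiKey: string;
}

interface PluginConfig {
  mode: Mode;
  customProvider?: CustomProvider;
  confirmationPassword?: string;
  riskThreshold: RiskThreshold;
  enableCache: boolean;
  logAnalysis: boolean;
}

// Defaults as listed in the options table.
const DEFAULTS: PluginConfig = {
  mode: "openclaw",
  riskThreshold: "medium",
  enableCache: true,
  logAnalysis: false,
};

// Fills in defaults for any option the user did not set.
function resolveConfig(user: Partial<PluginConfig>): PluginConfig {
  const merged: PluginConfig = { ...DEFAULTS, ...user };
  // Assumption: custom mode is only meaningful with a provider configured.
  if (merged.mode === "custom" && !merged.customProvider) {
    throw new Error('mode "custom" requires customProvider');
  }
  return merged;
}

const resolved = resolveConfig({ logAnalysis: true });
```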
User: "Organize my emails, but don't delete anything"
Plugin: Extracts constraint [critical] "Prohibit delete operations"
AI attempts: execute delete operation
Plugin: ⚠️ Operation violates constraint "Prohibit delete operations"
Still needs 2 more attempts before appeal
Message: "Need 2 more attempts before appeal"
AI: Appeal: This is for cleaning test data, you required it before
Plugin: Appeal Request
【AI's Reason】This is for cleaning test data, you required it before
【AI's Intent】Execute delete operation
【Predicted Consequences】• May violate constraint: Prohibit delete operations
【Risk Level】🔴 Severe
【Violated Constraint】🔴 Severe Prohibit delete operations
【Total Attempts】3
【Appeal History】0
This operation requires password confirmation
User: [Input password]
Plugin: Operation approved

We welcome all forms of contributions!
Contribution Flowchart

Discover Issue → Propose Solution → Submit Code

Report Bug → Suggest Feature → Contribute Code

Become Contributor: Improve Docs, Submit PR

- Report bugs in Issues
- Describe the problem, reproduction steps, and expected behavior
- Suggest new features
- Improve existing functionality
- Documentation improvements
# Fork repository
git clone https://github.com/your-username/open-safe-frame.git
# Create branch
git checkout -b feature/your-feature
# Commit code
git commit -m "Add: your feature"
# Push and create PR
git push origin feature/your-feature

- Fix typos
- Add examples
- Translate documentation
- Add diagrams
# Install dependencies
cd openclaw_plugin
npm install
# Build
npm run build
# Test
npm test

- Use TypeScript
- Follow existing code style
- Add necessary comments
- Write unit tests
- OpenClaw - Powerful AI agent framework
- Anthropic - AI safety research pioneer
- OpenAI - Alignment research exploration
- Meta Summer Yue email deletion event
- Google Antigravity disk wipe event
- Replit AI database deletion event
- All contributors who submit Issues and Pull Requests
- Users who provide feedback and suggestions
- OpenClaw community for the support
Please give us a ⭐ Star; it is our greatest encouragement!
Open permissions, built-in constraints
Let AI do big things, but not do bad things
Made with ❤️ by the Open Safe Frame community