Support H-CoT: Hijacking the Chain-of-Thought to Jailbreak Reasoning Models #897

@LifeHackerBee

Description

Is your feature request related to a problem? Please describe.

I recently learned about "H-CoT: Hijacking the Chain-of-Thought Safety Reasoning Mechanism to Jailbreak Large Reasoning Models", a jailbreak method that has been shown to bypass the safety filters of several large reasoning models, including OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking. The core idea is to inject a fabricated chain-of-thought that mimics the model's own safety reasoning, so the model treats the request as already vetted and proceeds to answer. It would be great to implement this attack.

References:

  1. https://github.com/dukeceicenter/jailbreak-reasoning-openai-o1o3-deepseek-r1
  2. https://maliciouseducator.org/

Describe the solution you'd like

Could you add explicit support for the H-CoT jailbreak method in this repository, e.g. as a dedicated attack/probe module? A rough sketch of the prompt-construction step is included below as a starting point.

Metadata

Labels

enhancement (New feature or request), help wanted (Extra attention is needed)
