Is your feature request related to a problem? Please describe.
I recently learned about “H-CoT: Hijacking the Chain-of-Thought Safety Reasoning Mechanism to Jailbreak Large Reasoning Models”, a jailbreak method that has been shown to bypass safety filters in several large reasoning models, including OpenAI o1/o3, DeepSeek-R1, and Gemini 2.0 Flash Thinking. It would be great to see this method implemented here.
Refer:
- https://github.com/dukeceicenter/jailbreak-reasoning-openai-o1o3-deepseek-r1
- https://maliciouseducator.org/
Describe the solution you'd like
Could you add explicit support for, or integration of, the H-CoT jailbreak method to this repository?