Add 2 PyRIT orchestrators (Crescendo, PAIR) and restructure PyRIT code. #93
marcorosa merged 42 commits into SAP:develop
2. Change the inheritance-based approach (InstrumentedRedTeamingOrchestrator) to a wrapper-class approach for orchestrator-agnostic functionality.
3. Minor code adjustments to incorporate 2 new orchestrator types into PyRIT.
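The move from inheritance to a wrapper class can be sketched as follows. This is a minimal illustration only: `Orchestrator`, `FakeCrescendo`, `FakePair`, and `OrchestratorWrapper` are hypothetical stand-ins, not classes from PyRIT or from this PR, and the real orchestrator interface may look different.

```python
import time
from typing import Any, Protocol


class Orchestrator(Protocol):
    """Hypothetical minimal interface; real PyRIT orchestrators may differ."""

    def run_attack(self, objective: str) -> str: ...


class FakeCrescendo:
    """Stand-in orchestrator used only for this sketch."""

    def run_attack(self, objective: str) -> str:
        return f'crescendo:{objective}'


class FakePair:
    """Second stand-in orchestrator used only for this sketch."""

    def run_attack(self, objective: str) -> str:
        return f'pair:{objective}'


class OrchestratorWrapper:
    """Adds instrumentation around any orchestrator through composition.

    With inheritance, every orchestrator type would need its own
    instrumented subclass; the wrapper works with any of them.
    """

    def __init__(self, orchestrator: Orchestrator) -> None:
        self._orchestrator = orchestrator
        self.log: list[dict[str, Any]] = []

    def run_attack(self, objective: str) -> str:
        started = time.perf_counter()
        result = self._orchestrator.run_attack(objective)
        # Record which orchestrator ran and how long it took.
        self.log.append({
            'orchestrator': type(self._orchestrator).__name__,
            'objective': objective,
            'seconds': time.perf_counter() - started,
        })
        return result
```

Wrapping `FakeCrescendo()` or `FakePair()` gives the same instrumented `run_attack` call without a dedicated subclass per orchestrator type.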
1. start_pyrit_attack_red_teaming()
2. start_pyrit_attack_crescendo()
3. start_pyrit_attack_pair()

Delegate orchestrator-agnostic PyRIT logic to start_pyrit_attack().
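A rough sketch of this delegation pattern is shown below. The four function names come from the description above, but their signatures, the `run` callable, and the returned dictionary are assumptions made for illustration and do not reflect the actual implementation in this PR.

```python
from typing import Callable


def start_pyrit_attack(
    attack_name: str,
    run: Callable[[str], str],
    objective: str,
) -> dict:
    """Shared, orchestrator-agnostic logic (hypothetical signature)."""
    output = run(objective)
    return {'attack': attack_name, 'objective': objective, 'output': output}


def start_pyrit_attack_red_teaming(objective: str) -> dict:
    # Only the orchestrator-specific part lives here (placeholder lambda).
    return start_pyrit_attack(
        'red_teaming', lambda o: f'red teaming ran: {o}', objective)


def start_pyrit_attack_crescendo(objective: str) -> dict:
    return start_pyrit_attack(
        'crescendo', lambda o: f'crescendo ran: {o}', objective)


def start_pyrit_attack_pair(objective: str) -> dict:
    return start_pyrit_attack(
        'pair', lambda o: f'pair ran: {o}', objective)
```

Each entry point stays thin, so result handling and similar shared steps are written once in `start_pyrit_attack()`.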
1. run_pyrit_red_teaming
2. run_pyrit_screscendo
3. run_pyrit_pair
marcorosa
left a comment
Some notes while I do the code review, in no particular order:
- Missing note (in /backend-agent/data/pyrit) to explain to the agent how to use these attacks (the current one is still the old one, and does not explain to the agent that there are 3 sub-attacks in case the users ask for "pyrit").
- Do not mix ' and " in your strings. Be consistent (and better use ' for strings in code). For reference, this random guy on the internet explains it very well: link
- Choose wisely the names of the attack/attack specification. Indeed, consider the name that will be written in the db, and think in advance of any issue it may cause by using a name that is too long or that contains _ or - or spaces. Will users write this correctly?
- Please re-run a Python linter, because I am not fully convinced it worked correctly.
I confirm the linter action did not work. So, you have 82 linter violations to fix 😄
If you fetch the upstream develop branch, I may have fixed the linter setup and it would run automatically.
get latest linter changes
@marcorosa I fixed all your comments and all linter violations locally and fetched the upstream develop branch, but the backend and frontend actions fail to run for some reason.
Done, the GitHub action does not work though.
marcorosa
left a comment
small adjustments still needed, mostly in the note phrasing
This update introduces a significant overhaul of LLM attack functionalities, specifically enhancing the PyRIT attack framework. It refines how attacks are executed by expanding the variety and specificity of attack types available. The changes aim to make the attack process easier to configure and more versatile, thereby improving the user experience for security professionals working with LLM vulnerability assessments.
Walkthrough
Model: gpt-4o | Prompt Tokens: 8804 | Completion Tokens: 191
Here's a collaborative code review enhanced by AI assistance. These insights offer suggestions and observations that may help improve your work, though they're not absolute truths. You remain the expert on your project's needs and goals. Consider these recommendations as supportive guidance while you make the final decisions that align best with your vision and requirements.
Always critique what AI says. Do not let AI replace YOUR I.
Model: anthropic--claude-4-sonnet | Prompt Tokens: 14851 | Completion Tokens: 2533
marcorosa
left a comment
File true_false_system_prompt.yaml is copied in both backend-agent/data and backend-agent/libs/data. It should appear only in the latter.
done
Summary
This PR adds comprehensive PyRIT orchestrator enhancements, including new orchestrator types (Crescendo, PAIR), tools, and CLI adjustments.
Changes Made
Tested by: