Fix run_all attacks configuration#116

Merged
marcorosa merged 2 commits into develop from suite/all on Sep 29, 2025

@marcorosa
Member

  • Define the configuration for the run_all attack
  • Fix pyrit's tools (they were not showing results after completion)

@marcorosa marcorosa requested a review from a team as a code owner September 29, 2025 14:16
@github-actions
Contributor

The recent changes refine the attack handling system: output printing is enabled for more visibility during execution, the default attack configurations in default.json are adjusted to give finer control over individual attacks, and the set of attack types is expanded.

Walkthrough

  • Refactor: Changed print_output from False to True in several attack execution cases for increased visibility during operations.
  • Documentation: Improved logging with detailed attack start messages.
  • Chore: Revised attack model configurations in default.json, optimizing parameters for various attack types, e.g., adding max_turns and max_backtracks.
  • New Feature: Introduced new attack types (artprompt, encoding, goodside, etc.) expanding the system capabilities and versatility.

Model: gpt-4o | Prompt Tokens: 1427 | Completion Tokens: 176

Contributor

@github-actions github-actions bot left a comment

Here's a friendly code review powered by AI assistance. These insights offer suggestions and observations that may help improve your work, though they're not absolute truths. Please take what serves you best and feel free to disregard anything that doesn't fit your approach. You remain the expert on your project—AI simply provides another perspective to consider as you make your development choices.


Always critique what AI says. Do not let AI replace YOUR I.
Model: anthropic--claude-4-sonnet | Prompt Tokens: 2923 | Completion Tokens: 695

Comment thread backend-agent/attack.py
Comment thread backend-agent/attack.py
Comment thread backend-agent/data/all/default.json
Comment on lines +95 to 122
{
    "attack": "encoding",
    "target-model": "<target>"
},
{
    "attack": "goodside",
    "target-model": "<target>"
},
{
    "attack": "latentinjection",
    "target-model": "<target>"
},
{
    "attack": "malwaregen",
    "target-model": "<target>"
},
{
    "attack": "phrasing",
    "target-model": "<target>"
},
{
    "attack": "promptinject",
    "target-model": "<target>"
},
{
    "attack": "suffix",
    "target-model": "<target>"
}
Contributor

Consider adding validation schema or documentation for the configuration file. The current structure mixes different attack types with varying parameter requirements, which could lead to runtime errors. Also, ensure all attack types are properly supported by the codebase before including them in the default configuration.
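
One way to act on this suggestion is a small validation pass when the configuration is loaded. The following is a minimal sketch, not the project's implementation: the function name `validate_attack_config`, the `supported_attacks` parameter, and the assumption that the entries form a list of objects each requiring `attack` and `target-model` are inferred from the fragment above and are hypothetical.

```python
import json

# Assumption: every entry needs at least these two keys, as in the fragment above.
REQUIRED_KEYS = {"attack", "target-model"}


def validate_attack_config(entries, supported_attacks=None):
    """Return a list of problems found; an empty list means the config passed."""
    if not isinstance(entries, list):
        return ["config must be a list of attack objects"]

    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected an object")
            continue

        # Report keys that every entry is assumed to require.
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {index}: missing keys {sorted(missing)}")

        # Optionally check the attack name against what the code supports.
        attack = entry.get("attack")
        if (
            supported_attacks is not None
            and attack is not None
            and attack not in supported_attacks
        ):
            problems.append(f"entry {index}: unsupported attack {attack!r}")
    return problems


# Hypothetical sample: one valid entry, one missing a key, one unknown attack.
sample = json.loads("""
[
    {"attack": "encoding", "target-model": "<target>"},
    {"attack": "goodside"},
    {"attack": "not-a-real-attack", "target-model": "<target>"}
]
""")

problems = validate_attack_config(
    sample, supported_attacks={"encoding", "goodside", "suffix"}
)
for problem in problems:
    print(problem)
```

Returning a list of problems instead of raising on the first one lets a run_all style suite report every misconfigured entry in a single pass. Per-attack parameters such as max_turns or max_backtracks would need their own rules, which this sketch does not attempt.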

@marcorosa marcorosa merged commit 8dc09ca into develop Sep 29, 2025
6 of 7 checks passed
@marcorosa marcorosa deleted the suite/all branch September 29, 2025 14:26