FEAT: Adding SeedAttackTechniqueGroup #1373

Open
rlundeen2 wants to merge 5 commits into Azure:main from rlundeen2:users/rlundeen/atomic_attack_identifier

@rlundeen2

One problem we want to tackle is identifying unique attack techniques. As currently architected, a technique consists of two parts:

  1. An AttackIdentifier: this includes the attack, converters, targets, scorers, etc.
  2. A subset of the datasets, but not all of them. Only certain datasets are general, such as a system prompt jailbreak or a simulated role-play conversation.

These are the factors we want to include when we calculate how successful an attack is. The gap we have today is on the dataset side.

This PR adds a way to distinguish general datasets with the is_general_strategy flag. To start, simulated conversations and jailbreaks will have it set by default; others will not.

In a future PR, we'll introduce an AtomicAttackIdentifier to uniquely identify these.
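
To make the two-part idea above concrete, here is a minimal sketch of how a technique key could combine an attack identifier with only the general seeds. It is not the planned AtomicAttackIdentifier, whose design is not shown in this PR. `SketchSeed`, `technique_key`, and the placeholder attack dictionary are all hypothetical names used for illustration.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSeed:
    """Hypothetical stand-in for a seed; not the PyRIT Seed class."""

    value: str
    is_general_strategy: bool = False


def technique_key(attack_identifier: dict, seeds: list[SketchSeed]) -> str:
    """Hash the attack configuration together with the general seeds only.

    Objective-specific seeds are ignored, so the same technique applied to
    different objectives produces the same key.
    """
    general_values = sorted(s.value for s in seeds if s.is_general_strategy)
    payload = json.dumps(
        {"attack": attack_identifier, "general_seeds": general_values},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder identifier contents; the real AttackIdentifier fields may differ.
attack = {"attack": "example_attack", "converters": ["example_converter"]}
jailbreak = SketchSeed("example jailbreak template", is_general_strategy=True)

key_a = technique_key(attack, [jailbreak, SketchSeed("objective A")])
key_b = technique_key(attack, [jailbreak, SketchSeed("objective B")])
key_c = technique_key(attack, [SketchSeed("objective A")])
```

Under these assumptions, `key_a` and `key_b` match because only the objective differs, while `key_c` differs because the jailbreak seed is absent.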


Copilot AI left a comment


Pull request overview

This pull request adds support for distinguishing general attack techniques from specific objectives by introducing an is_general_strategy property to seed classes. This enables PyRIT to identify and group reusable attack techniques like jailbreaks and simulated conversations that can be applied across multiple objectives.

Changes:

  • Added is_general_strategy boolean property to the base Seed class (defaults to False)
  • Created new SeedAttackTechniqueGroup class to validate and group seeds that are general strategies
  • Updated SeedSimulatedConversation to default is_general_strategy to True
  • Updated SeedObjective to enforce that objectives cannot be general strategies
  • Updated all 150+ jailbreak template YAML files to include is_general_strategy: true
  • Added comprehensive unit tests for the new functionality
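
The behavior summarized in the list above might look roughly like the following sketch. It uses standalone stand-in classes (every `Sketch...` name is hypothetical) and does not reproduce the actual PyRIT implementations or their error messages; it only illustrates the defaults and validation rules described.

```python
from dataclasses import dataclass


@dataclass
class SketchSeed:
    """Stand-in base seed: the flag defaults to False."""

    value: str
    is_general_strategy: bool = False


@dataclass
class SketchSeedObjective(SketchSeed):
    """Stand-in objective: it may never be marked as a general strategy."""

    def __post_init__(self) -> None:
        if self.is_general_strategy:
            raise ValueError("An objective cannot be a general strategy.")


@dataclass
class SketchSeedSimulatedConversation(SketchSeed):
    """Stand-in simulated conversation: the flag defaults to True."""

    is_general_strategy: bool = True


class SketchAttackTechniqueGroup:
    """Stand-in group that accepts only seeds flagged as general strategies."""

    def __init__(self, seeds: list[SketchSeed]) -> None:
        not_general = [s for s in seeds if not s.is_general_strategy]
        if not_general:
            raise ValueError(
                f"{len(not_general)} seed(s) are not general strategies."
            )
        self.seeds = list(seeds)


# A group built from general seeds is accepted.
group = SketchAttackTechniqueGroup(
    [
        SketchSeed("example jailbreak template", is_general_strategy=True),
        SketchSeedSimulatedConversation("example role-play conversation"),
    ]
)
```

Passing a plain `SketchSeed("objective text")` to the group, or constructing `SketchSeedObjective("x", is_general_strategy=True)`, raises `ValueError` in this sketch.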

Reviewed changes

Copilot reviewed 175 out of 175 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| pyrit/models/seeds/seed.py | Added is_general_strategy property to base Seed class |
| pyrit/models/seeds/seed_objective.py | Added validation to prevent objectives from being general strategies |
| pyrit/models/seeds/seed_simulated_conversation.py | Set default is_general_strategy=True for simulated conversations |
| pyrit/models/seeds/seed_attack_technique_group.py | New class to validate all seeds in group are general strategies |
| pyrit/models/seeds/__init__.py | Exported new SeedAttackTechniqueGroup class |
| pyrit/models/__init__.py | Exported new SeedAttackTechniqueGroup class |
| pyrit/datasets/jailbreak/text_jailbreak.py | Set is_general_strategy=True for string templates |
| pyrit/datasets/jailbreak/templates/*.yaml | Added is_general_strategy: true to all jailbreak templates |
| tests/unit/models/test_seed_attack_technique_group.py | Comprehensive tests for new functionality |
| tests/unit/datasets/test_jailbreak_text.py | Test to validate all jailbreak templates have the property set |
| doc/api.rst | Added API documentation reference |
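
The template check mentioned for the jailbreak YAML files could be approximated as follows. This is a line-based sketch using only the standard library, not the test in tests/unit/datasets/test_jailbreak_text.py, which presumably loads the templates through PyRIT. The helper name `templates_missing_flag` and the temporary files are hypothetical.

```python
import tempfile
from pathlib import Path


def templates_missing_flag(template_dir: Path) -> list[str]:
    """Return names of .yaml files lacking a top-level-looking flag line.

    This is a plain text match on 'is_general_strategy: true', not a YAML parse.
    """
    missing = []
    for path in sorted(template_dir.glob("*.yaml")):
        lines = [
            line.strip()
            for line in path.read_text(encoding="utf-8").splitlines()
        ]
        if "is_general_strategy: true" not in lines:
            missing.append(path.name)
    return missing


# Demonstrate on two throwaway files rather than the real template directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "with_flag.yaml").write_text(
        "name: example\nis_general_strategy: true\n", encoding="utf-8"
    )
    (tmp_dir / "without_flag.yaml").write_text(
        "name: example\n", encoding="utf-8"
    )
    missing = templates_missing_flag(tmp_dir)
```

A test built on this idea would assert that the returned list is empty for the real template directory.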

from pyrit.models.seeds.seed_group import SeedGroup


class SeedAttackTechniqueGroup(SeedGroup):

Is the purpose of is_general_strategy to group seeds together in an attack technique group? If so, I'd be in favor of renaming it to is_general_technique, because I think "strategy" conflates the flag with scenarios.
