FEAT: Add modality support detection system for prompt targets #1381

Closed
fitzpr wants to merge 3 commits into Azure:main from fitzpr:feature/modality-support-detection

Contributor

@fitzpr fitzpr commented Feb 19, 2026

@romanlutz This addresses your feedback from PR #1377 about needing modality support detection.

This PR implements the capability detection system you suggested, allowing attacks to determine whether targets support multimodal input (text + image/video) before attempting to send multimodal messages.

Key features:

  • SUPPORTED_INPUT_MODALITIES class attribute for each target
  • Methods to check modality support and detect multimodal capabilities
  • TargetIdentifier fields to track supported modalities
  • Comprehensive tests

This should be merged before updating #1377 to use the new detection system.
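
A minimal sketch of how such a class attribute and check might look. The class names `SketchTarget` and `SketchChatTarget` and the method `supports_modalities` are hypothetical stand-ins, not the code in this PR; only the attribute name `SUPPORTED_INPUT_MODALITIES` and the `"text"` / `"image_path"` values are taken from the PR itself.

```python
from typing import Iterable, Tuple


class SketchTarget:
    """Hypothetical base class; stands in for a prompt target."""

    # Modalities the target accepts as input; text-only default is an assumption.
    SUPPORTED_INPUT_MODALITIES: Tuple[str, ...] = ("text",)

    @classmethod
    def supports_modalities(cls, modalities: Iterable[str]) -> bool:
        """Return True if every requested modality is declared as supported."""
        return set(modalities) <= set(cls.SUPPORTED_INPUT_MODALITIES)


class SketchChatTarget(SketchTarget):
    """Hypothetical multimodal target."""

    SUPPORTED_INPUT_MODALITIES = ("text", "image_path")


# An attack could run this check before building a multimodal message.
can_send_image = SketchChatTarget.supports_modalities(["text", "image_path"])
text_only_can_send_image = SketchTarget.supports_modalities(["text", "image_path"])
```

With this shape, an attack can fall back to a text-only message when the check returns `False` instead of failing at send time.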

@romanlutz
Contributor

@fitzpr looks like now only a test file is left. ping me when the code is back 🙂 Appreciate you taking a stab at this!

@fitzpr force-pushed the feature/modality-support-detection branch from 8a4723c to 8328b60 on February 19, 2026 22:42
target_specific_params: Optional[Dict[str, Any]] = None
"""Additional target-specific parameters."""

supported_input_modalities: Optional[list[str]] = None
Contributor

@bashirpartovi may have thoughts on whether this belongs on the identifier or not

Contributor Author

fitzpr commented Feb 19, 2026

> @fitzpr looks like now only a test file is left. ping me when the code is back 🙂 Appreciate you taking a stab at this!

In my effort to remove some extraneous commits, I removed the files, lol. Should be good now. @romanlutz

@fitzpr force-pushed the feature/modality-support-detection branch from b86c25b to 8328b60 on February 19, 2026 22:55
"""

#: OpenAI Chat targets support both text and image_path modalities
SUPPORTED_INPUT_MODALITIES = ("text", "image_path")
Contributor

we'd probably have to do all the other targets, too

Contributor Author

All major targets are implemented: the OpenAI, TextTarget, and HuggingFace targets now have modality definitions.

Contributor

I mean ALL 🙂 There are quite a few. I can help with that, though, if you want.
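
One way to find the targets that still lack a declaration is to walk the subclass tree and flag every class that does not set the attribute in its own body. This is only a sketch: `SketchTarget`, `DeclaredTarget`, `UndeclaredTarget`, and both helper functions are hypothetical names, and a real audit would start from the actual base class.

```python
from typing import Iterator, List, Type


class SketchTarget:
    """Hypothetical base class standing in for the real target base class."""

    SUPPORTED_INPUT_MODALITIES = ("text",)


class DeclaredTarget(SketchTarget):
    SUPPORTED_INPUT_MODALITIES = ("text", "image_path")


class UndeclaredTarget(SketchTarget):
    """Silently inherits the default; this is what the audit should flag."""


def iter_subclasses(cls: Type) -> Iterator[Type]:
    """Yield every direct and indirect subclass of cls."""
    for sub in cls.__subclasses__():
        yield sub
        yield from iter_subclasses(sub)


def targets_missing_declaration(base: Type) -> List[str]:
    """Names of subclasses that do not set the attribute in their own body."""
    return sorted(
        sub.__name__
        for sub in iter_subclasses(base)
        if "SUPPORTED_INPUT_MODALITIES" not in vars(sub)
    )


missing = targets_missing_declaration(SketchTarget)
```

Note that `__subclasses__()` only sees classes whose modules have already been imported, so a check like this would need to import every target module first.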

@fitzpr force-pushed the feature/modality-support-detection branch from d2a76bd to 149b115 on February 19, 2026 23:20
fitzpr pushed a commit to fitzpr/PyRIT that referenced this pull request Feb 20, 2026
- Address all Roman's feedback from PR Azure#1381
- Remove quoted type annotations, add future imports
- Create generic ModalityDiscovery utility class
- Simplify code with direct returns and set.union
- Remove confusing multimodal helper methods
- Move runtime testing to reusable common module
- Update tests to use new architecture
- Enable any target to use discovery via protocol
@fitzpr force-pushed the feature/modality-support-detection branch from 5f23b8d to 0141366 on February 20, 2026 14:10


def verify_target_capabilities(
    target: Any,
Contributor

This should be of type PromptTarget but then we'd have a circular dependency. Maybe this should live in pyrit.prompt_target to solve that? I don't think there are downsides to this since it requires targets anyway.
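
For comparison, an alternative to relocating the module is a structural type: the function depends on a small `typing.Protocol` rather than on the concrete target class, so no import of the target package is needed. This is a sketch of that alternative, not what the review proposes; `SupportsSend`, `FakeTarget`, `describe_target`, and the `send` method are all hypothetical names.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsSend(Protocol):
    """Hypothetical structural type: anything with an async send method."""

    async def send(self, message: Any) -> Any: ...


class FakeTarget:
    """Stand-in target; note it does not inherit from SupportsSend."""

    async def send(self, message: Any) -> Any:
        return f"echo: {message}"


def describe_target(target: SupportsSend) -> str:
    """Accept any object that structurally matches the protocol."""
    # runtime_checkable only verifies that the method exists, not its signature.
    if not isinstance(target, SupportsSend):
        raise TypeError("target must provide a send method")
    return type(target).__name__


name = describe_target(FakeTarget())
```

The trade-off is weaker checking: a protocol accepts anything with the right method names, whereas annotating with the concrete base class documents the intended type exactly.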

# Try the request
async def _test():
    try:
        await target._async_client.chat.completions.create(
Contributor

So this would only tell us whether it works for chat completions on an OpenAIChatTarget.

I would love to make this generic. For that, we should craft a proper Message for each type, matching the PromptDataType in PyRIT, and use the target's send methods so that it actually works end-to-end. Since that method is standardized for all targets, it should be possible to validate in a straightforward manner. We would need to ship one example image, audio, and video file with PyRIT, but we can make them tiny like you suggested. Then we can load those and use them in this function.

We also need to test combinations of modalities based on what the target says it supports.

Contributor

... and if it turns out that's not supported we need to spit out a modality configuration that can be passed to the target to update itself.
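
A rough sketch of the generic flow described in this thread: send one message per declared modality combination through the target's own send method, and return only the combinations that succeed. Everything here is an assumption for illustration: `FakeTarget`, its `declared` attribute and `send` method, the dictionary message shape, and `build_message` are hypothetical and do not reflect PyRIT's real message or target API.

```python
import asyncio
from typing import Any, Dict, FrozenSet, Set

Combination = FrozenSet[str]


class FakeTarget:
    """Stand-in target that claims image support but rejects images."""

    declared: Set[Combination] = {
        frozenset({"text"}),
        frozenset({"text", "image_path"}),
    }

    async def send(self, message: Dict[str, Any]) -> str:
        if "image_path" in message:
            raise ValueError("image input rejected")
        return "ok"


def build_message(combination: Combination) -> Dict[str, Any]:
    """Hypothetical helper: one tiny sample payload per modality."""
    samples = {"text": "hello", "image_path": "sample.png"}
    return {modality: samples[modality] for modality in combination}


async def verify_declared_modalities(target: Any) -> Set[Combination]:
    """Send one message per declared combination; keep those that succeed."""
    verified: Set[Combination] = set()
    for combination in target.declared:
        try:
            await target.send(build_message(combination))
        except Exception:
            continue  # treat any failure as "not supported"
        verified.add(combination)
    return verified


verified = asyncio.run(verify_declared_modalities(FakeTarget()))
```

The returned set plays the role of the modality configuration mentioned above: it could be handed back to the target so that its declared support matches what was observed.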

Contributor Author

@romanlutz The PR page seems stuck on an old state - it shows "1 commit" but we actually have 11 commits with all the set[frozenset[PromptDataType]] changes you requested.

Can you check the commits tab: https://github.com//pull/1381/commits

Our latest commit (7709e55) has exactly what you suggested! The main PR page seems to have a GitHub caching issue.

@fitzpr fitzpr closed this Feb 20, 2026
@fitzpr force-pushed the feature/modality-support-detection branch from 4358b92 to 2484292 on February 20, 2026 18:52
Robert Fitzpatrick added 3 commits February 20, 2026 18:52
Addresses all Roman's feedback from PR Azure#1377:
- Uses set[frozenset[PromptDataType]] instead of tuples
- Exact frozenset matching prevents ordering issues
- Implemented across all target types (OpenAI, HuggingFace, TextTarget)
- Future-proof pattern matching for new OpenAI models
- Optional verification utility for runtime testing
- Comprehensive test suite with 8 passing tests
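
To illustrate why a set of frozensets avoids ordering issues, here is a small sketch. Plain strings stand in for `PromptDataType` values, and `SUPPORTED` and `is_supported` are hypothetical names rather than the PR's code.

```python
from typing import FrozenSet, Iterable, Set

# Plain strings stand in for PromptDataType values (assumption).
SUPPORTED: Set[FrozenSet[str]] = {
    frozenset({"text"}),
    frozenset({"text", "image_path"}),
}


def is_supported(modalities: Iterable[str]) -> bool:
    """Exact match: the requested combination must be declared as a whole."""
    return frozenset(modalities) in SUPPORTED


# Order does not matter for frozensets, unlike tuples.
same_regardless_of_order = is_supported(["image_path", "text"]) and is_supported(
    ["text", "image_path"]
)
tuples_differ = ("text", "image_path") != ("image_path", "text")

# Exact matching: image alone is not declared, even though text + image is.
image_alone = is_supported(["image_path"])
```

Exact matching is stricter than a subset test: a target that accepts text with an image does not automatically accept an image on its own, so each combination has to be declared explicitly.
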
The PR interface is showing cached old commits instead of our current code.
This commit should trigger GitHub to display the correct current state.