[BREAKING] FEAT add TAP to content harms scenario#1378
[BREAKING] FEAT add TAP to content harms scenario#1378hannahwestra25 wants to merge 15 commits intoAzure:mainfrom
Conversation
…dd_tap_to_content_harms
| __all__ = [ | ||
| "ContentHarms", | ||
| "ContentHarmsStrategy", | ||
| "PsychosocialScenario", |
There was a problem hiding this comment.
If these were released that way it could be breaking.
| name="Content Harms", | ||
| name="ContentHarms", |
There was a problem hiding this comment.
Is this a display name or something else? Do we have an established casing rule? Asking because I want to know, not to criticize 🙂
There was a problem hiding this comment.
Yep for display. I changed it to one word because that's what the red team agent scenario does (which is the only other 2+ word scenario) but it also matches the scenario class name. I'm indifferent about whether its camelcase or spaces between words but want it to be consistent
There was a problem hiding this comment.
If you want it to match class name why don't we use that rather than setting custom strings that could diverge?
| Args: | ||
| strategy (ScenarioCompositeStrategy): The strategy to create the attack from. | ||
| seed_groups (List[SeedAttackGroup]): The seed attack groups associated with the harm dataset. | ||
| strategy (str): The harm strategy name to create attacks for. |
There was a problem hiding this comment.
Having this as list rather than types to the strategy seems worse? "harm strategy" sounds dangerous 😆
There was a problem hiding this comment.
what do you mean "this as a list rather than types to the strategy seems worse"? are you talking about the strategy type or the seed_groups type ? this is just a comment change but strategy is just the name of the harm which then is used for the atomic attack name so maybe I rename the variable to strategy_name at least if that's what you're referring to 😅
There was a problem hiding this comment.
Generally, having it typed is better than not (or just "string" when you actually just want one from a specific set of literals). That's essentially the point.
There was a problem hiding this comment.
yeah makes sense i guess at this point it's really just the strategy enum so a string. imo it doesn't make sense to construct a scenariostrategy object at this point when we're just using the enum
| FIRST_LETTER = ("first_letter", {"single_turn", "ip"}) # Good for copyright extraction | ||
| IMAGE = ("image", {"single_turn", "ip", "sensitive_data"}) | ||
| ROLE_PLAY = ("role_play", {"single_turn", "sensitive_data"}) # Good for system prompt extraction | ||
| FirstLetter = ("first_letter", {"single_turn", "ip"}) # Good for copyright extraction |
There was a problem hiding this comment.
does this break people who are using it in all caps from the last release?
There was a problem hiding this comment.
yes good call!
…dd_tap_to_content_harms
…dd_tap_to_content_harms
Description
Tests and Documentation
added / updated tests