FEAT: Scientific Translation Converter by jbolor21 · Pull Request #1379 · Azure/PyRIT

jbolor21 · 2026-02-19T19:21:54Z

Description

Adding scientific translation converter to translate queries into various "scientific" modes

Tests and Documentation

Added unit tests and added converter into converters notebook for text->text using LLMs

…converter

romanlutz · 2026-02-19T21:05:16Z

pyrit/datasets/prompt_converters/scientific_obfuscation_converter.yaml

@@ -0,0 +1,87 @@
+name: scientific_obfuscation_converter


pr title says "translation" - can we standardize to one of them?

romanlutz · 2026-02-19T21:07:55Z

pyrit/datasets/prompt_converters/scientific_obfuscation_converter.yaml

+
+  ## Mode-specific guidelines:
+
+  {% if mode == "academic" %}


I am confused. This includes only a specific section depending on the mode BUT at the end there's a combined mode. How will it know all the modes if we exclude most of them? Examples below also include all of them.

Also not sure if I understand the combined mode. In the class it's explicitly listed as a mode, but here it's a catchall, so "foobar" would resolve to a combined prompt. I feel like it would be better to just drop "combined" and refer to the default/wildcard as a combined mode

romanlutz · 2026-02-19T21:12:59Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        Raises:
+            ValueError: If an invalid mode is provided.
+        """
+        valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")


is there an easier way to check for this given that it's a literal that's defined above

I suggested one below by just attaching the valid modes to the class itself, but it's a nit so feel free to disregard

I suppose you could have both.

from typing import Literal, get_args ObfuscationMode = Literal[ "academic", "technical", "smiles", "research", "reaction", "combined" ] OBFUSCATION_MODES = set(get_args(ObfuscationMode)) def is_valid_mode(value: str) -> bool: return value in OBFUSCATION_MODES

Copilot

Pull request overview

Adds a new LLM-based prompt converter that rewrites prompts into “scientific/technical” phrasing across multiple modes, along with the seed prompt template, documentation wiring, and unit tests.

Changes:

Introduces ScientificObfuscationConverter (mode-driven) backed by a new YAML seed prompt template.
Exposes the converter via pyrit.prompt_converter exports and API docs.
Adds unit tests and an example usage snippet in the converters documentation notebook.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`pyrit/prompt_converter/scientific_obfuscation_converter.py`	Implements the new LLM-based converter and identifier construction.
`pyrit/datasets/prompt_converters/scientific_obfuscation_converter.yaml`	Adds the mode-parameterized system prompt template used by the converter.
`pyrit/prompt_converter/__init__.py`	Exports the new converter from the prompt_converter package.
`tests/unit/converter/test_scientific_obfuscation_converter.py`	Adds unit tests validating initialization, mode validation, and conversion behavior.
`doc/code/converters/1_text_to_text_converters.py`	Documents example usage of the new converter in the text-to-text converters notebook source.
`doc/code/converters/1_text_to_text_converters.ipynb`	Adds the corresponding notebook cell content for the new converter example.
`doc/api.rst`	Adds the converter to the API reference list.

Comments suppressed due to low confidence (2)

pyrit/prompt_converter/scientific_obfuscation_converter.py:23

The PR title/description refer to a "Scientific Translation Converter", but the implementation and dataset are named "ScientificObfuscationConverter" / "scientific_obfuscation_converter". If this is intended to be a translation-style converter, consider aligning the naming (or update the PR description) to avoid confusion for API consumers and documentation readers.

class ScientificObfuscationConverter(LLMGenericTextConverter):
    """
    Uses an LLM to transform simple or direct prompts into

pyrit/prompt_converter/scientific_obfuscation_converter.py:67

valid_modes duplicates the allowed values already defined in ObfuscationMode. To avoid the tuple and the type alias drifting out of sync, derive the runtime list from the type (e.g., typing.get_args(ObfuscationMode)) or centralize the allowed modes as a single constant reused for both validation and typing.

        valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")
        if mode not in valid_modes:
            raise ValueError(f"Invalid mode '{mode}'. Must be one of: {valid_modes}")

Copilot · 2026-02-19T21:19:33Z

tests/unit/converter/test_scientific_obfuscation_converter.py

+
+    system_arg = mock_target.set_system_prompt.call_args[1]["system_prompt"]
+    assert isinstance(system_arg, str)
+    assert len(system_arg) > 0


The test only asserts the system prompt is non-empty, which won't catch regressions where the wrong mode/template branch is rendered. Consider asserting for a technical-mode specific marker from the template (e.g., the rendered "Technical Mode" heading or another unique phrase).

Suggested change

assert len(system_arg) > 0

assert "technical" in system_arg.lower()

Copilot · 2026-02-19T21:19:34Z

tests/unit/converter/test_scientific_obfuscation_converter.py

+
+    system_arg = mock_target.set_system_prompt.call_args[1]["system_prompt"]
+    assert isinstance(system_arg, str)
+    assert len(system_arg) > 0


The test only asserts the system prompt is non-empty, which won't catch regressions where the converter renders the wrong mode branch. Consider asserting for a combined-mode specific marker from the template (e.g., the rendered "Combined Mode" heading or another unique phrase).

Suggested change

assert len(system_arg) > 0

assert "combined" in system_arg.lower()

Copilot · 2026-02-19T21:19:34Z

tests/unit/converter/test_scientific_obfuscation_converter.py

+
+    system_arg = mock_target.set_system_prompt.call_args[1]["system_prompt"]
+    assert isinstance(system_arg, str)
+    assert "academic" in system_arg.lower() or len(system_arg) > 0


This assertion is effectively only checking that the system prompt is non-empty: "academic" in system_arg.lower() or len(system_arg) > 0 will pass for any non-empty prompt even if the mode-specific rendering is wrong. Consider asserting for an academic-mode specific marker from the template (e.g., the rendered "Academic Mode" section) so the test will fail if the wrong mode is used.

Suggested change

assert "academic" in system_arg.lower() or len(system_arg) > 0

assert "academic mode" in system_arg.lower()

ValbuenaVC · 2026-02-19T22:22:58Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        - ``combined``: Use all techniques together
+
+    """
+


Suggested change

MODES: Tuple[str] = ("academic", "technical", "smiles", "research", "reaction", "combined")

ValbuenaVC · 2026-02-19T22:23:21Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+            ValueError: If an invalid mode is provided.
+        """
+        valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")
+        if mode not in valid_modes:


Suggested change

if mode not in valid_modes:

if mode not in self.MODES:

ValbuenaVC · 2026-02-19T22:25:52Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+
+import logging
+import pathlib
+from typing import Literal, Optional


Suggested change

from typing import Literal, Optional

from typing import Optional, Tuple

ValbuenaVC · 2026-02-19T22:26:17Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+logger = logging.getLogger(__name__)
+
+# Supported obfuscation modes
+ObfuscationMode = Literal["academic", "technical", "smiles", "research", "reaction", "combined"]


Suggested change

ObfuscationMode = Literal["academic", "technical", "smiles", "research", "reaction", "combined"]

ValbuenaVC · 2026-02-19T22:26:36Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        self,
+        *,
+        converter_target: PromptChatTarget = REQUIRED_VALUE,  # type: ignore[assignment]
+        mode: ObfuscationMode = "combined",


Suggested change

mode: ObfuscationMode = "combined",

mode: str = "combined",

ValbuenaVC · 2026-02-19T22:27:00Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        Args:
+            converter_target (PromptChatTarget): The LLM target to perform the conversion.
+                Can be omitted if a default has been configured via PyRIT initialization.
+            mode (ObfuscationMode): The obfuscation mode to use. Options are:


Suggested change

mode (ObfuscationMode): The obfuscation mode to use. Options are:

mode (str): The obfuscation mode to use. Options are:

ValbuenaVC · 2026-02-19T22:27:20Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        Raises:
+            ValueError: If an invalid mode is provided.
+        """
+        valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")


Suggested change

valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")

ValbuenaVC · 2026-02-19T22:27:37Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        """
+        valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")
+        if mode not in valid_modes:
+            raise ValueError(f"Invalid mode '{mode}'. Must be one of: {valid_modes}")


Suggested change

raise ValueError(f"Invalid mode '{mode}'. Must be one of: {valid_modes}")

raise ValueError(f"Invalid mode '{mode}'. Must be one of: {self.MODES}")

ValbuenaVC · 2026-02-19T22:28:24Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+        Raises:
+            ValueError: If an invalid mode is provided.
+        """
+        valid_modes = ("academic", "technical", "smiles", "research", "reaction", "combined")


I suggested one below by just attaching the valid modes to the class itself, but it's a nit so feel free to disregard

ValbuenaVC · 2026-02-19T22:31:12Z

pyrit/datasets/prompt_converters/scientific_obfuscation_converter.yaml

+
+  ## Mode-specific guidelines:
+
+  {% if mode == "academic" %}


Also not sure if I understand the combined mode. In the class it's explicitly listed as a mode, but here it's a catchall, so "foobar" would resolve to a combined prompt. I feel like it would be better to just drop "combined" and refer to the default/wildcard as a combined mode

hannahwestra25 · 2026-02-20T21:59:20Z

pyrit/prompt_converter/scientific_obfuscation_converter.py

+    Supports multiple modes:
+        - ``academic``: Frame as scholarly, homework style questions
+        - ``technical``: Use precise technical jargon and nomenclature
+        - ``smiles``: Use SMILES chemical notation and IUPAC names


do you have links / more information on what SMILES and IUPAC are ? maybe it's more common knowledge than I think but it might be helpful to have this listed

Bolor added 3 commits February 19, 2026 11:19

initial commit adding all changed files new converter

fcb33a8

adding in api

501c1b2

Merge remote-tracking branch 'origin' into users/bjagdagdorj/science_…

30349f1

…converter

jbolor21 changed the title ~~[DRAFT]: FEAT: Scientific Translation Converter~~ FEAT: Scientific Translation Converter Feb 19, 2026

romanlutz reviewed Feb 19, 2026

View reviewed changes

romanlutz requested a review from Copilot February 19, 2026 21:13

Copilot started reviewing on behalf of romanlutz February 19, 2026 21:14 View session

Copilot AI reviewed Feb 19, 2026

View reviewed changes

yaml edit

ef716a9

ValbuenaVC reviewed Feb 19, 2026

View reviewed changes

hannahwestra25 reviewed Feb 20, 2026

View reviewed changes

	assert len(system_arg) > 0
	assert "technical" in system_arg.lower()

	assert len(system_arg) > 0
	assert "combined" in system_arg.lower()

	assert "academic" in system_arg.lower() or len(system_arg) > 0
	assert "academic mode" in system_arg.lower()



	MODES: Tuple[str] = ("academic", "technical", "smiles", "research", "reaction", "combined")

	from typing import Literal, Optional
	from typing import Optional, Tuple

	mode (ObfuscationMode): The obfuscation mode to use. Options are:
	mode (str): The obfuscation mode to use. Options are:

	raise ValueError(f"Invalid mode '{mode}'. Must be one of: {valid_modes}")
	raise ValueError(f"Invalid mode '{mode}'. Must be one of: {self.MODES}")

Comments

Conversation

jbolor21 commented Feb 19, 2026

Description

Tests and Documentation

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants