Fix AutoTP custom patterns: respect use_default_specs #7827
Conversation
```python
# Only use fused-QKV heuristics when no partition_config is provided.
elif self.partition_config is None and require_tp_fused_qkvw(name, self.mp_size):
    # Check and handle fused qkv for TP
    return fused_LinearLayer(module, self.mp_group, fused_module=self.module)
```
Are these fixes exercised by a test? i.e., a model with a conv linear layer or fused QKV weights.
Great catch! I added a test to validate that we use the layers for the new custom patterns when a partition config is given.
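For readers skimming the thread, here is a minimal sketch of the precedence the hunk quoted above establishes. This is an illustrative restatement, not the actual DeepSpeed code: `choose_policy` and its string return values are hypothetical placeholders, and `is_fused_qkv` stands in for the `require_tp_fused_qkvw(name, mp_size)` check.

```python
# Illustrative restatement of the dispatch order in the quoted hunk (not DeepSpeed code):
# a user-supplied partition_config takes precedence, and the fused-QKV heuristic is
# consulted only when no partition_config is provided.
def choose_policy(partition_config, is_fused_qkv: bool) -> str:
    if partition_config is not None:
        return "custom_pattern"    # handled by the user-defined layer specs
    if is_fused_qkv:               # stands in for require_tp_fused_qkvw(name, mp_size)
        return "fused_qkv"         # handled by fused_LinearLayer
    return "default_linear"        # traditional AutoTP injection path
```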
Hi @tohtana, I reviewed this PR. I have one extended question. I saw that the current partition_config has a field called "use_default_specs", which allows the user to specify default behavior (for example, as used in your test). Is this equivalent to defining a default fallback pattern and partition type in layer_specs? I feel this might make the config a more unified format. It might just be personal preference.
@delock Thank you for your review!
The document explains: "You can also set use_default_specs to true to merge your custom patterns on top of the preset (when preset_model is provided)." I think we could clarify this further by explaining when each setting is appropriate. Can you share your thoughts?
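To make the two modes being discussed concrete, here is a hypothetical sketch of a partition_config using only the field names mentioned in this thread (`preset_model`, `use_default_specs`, `layer_specs`). The exact schema, key nesting, and values such as the pattern string and partition type are assumptions for illustration, not the documented format.

```python
# Hypothetical partition_config sketch; field names come from this thread, but the
# exact schema and values are assumptions, not the documented AutoTP format.
partition_config = {
    "preset_model": "llama",        # assumed preset identifier
    "use_default_specs": True,      # merge the custom patterns below on top of the preset
    "layer_specs": [
        {"pattern": r".*my_custom_proj", "partition": "row"},  # hypothetical custom spec
    ],
}
# With "use_default_specs": False, only the layer_specs above would apply and the
# preset/traditional injection path would be skipped (the behavior this PR fixes).
```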
@tohtana Thanks for the explanation! I don't have further comments.
The current code has the following issues:

- `use_default_specs: false` doesn't work
- Injection by the traditional pattern runs even when custom patterns are set
- `mpu` needs to be passed to `deepspeed.initialize` (HF integration doesn't pass mpu)

This PR fixes AutoTP setup to respect `use_default_specs: false` and disable the traditional injection path when custom patterns are enabled. Also, when `mpu` is not passed, we create a TP group in the initialization process.

With these changes, the [related tests](https://github.com/deepspeedai/DeepSpeed/tree/master/tests/unit/model_parallelism) pass and [all AutoTP examples](https://github.com/tohtana/DeepSpeedExamples/tree/tohtana/custom_auto_tp/training/tensor_parallel) in DeepSpeedExamples work now ([PR](deepspeedai/DeepSpeedExamples#998)).

---------

Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
Signed-off-by: Kento Sugama <kentosugama@protonmail.ch>
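For reference, a minimal sketch of the initialization path the description above refers to, assuming an AutoTP-style `tensor_parallel` section in the DeepSpeed config. The key names in `ds_config` are assumptions for illustration; `deepspeed.initialize` and its `mpu` argument are the real API.

```python
import torch
import deepspeed

# Placeholder model; any torch.nn.Module works for this sketch.
model = torch.nn.Linear(1024, 1024)

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "tensor_parallel": {"autotp_size": 2},  # assumed AutoTP config keys
}

# Before this PR, a model-parallel unit had to be supplied explicitly:
#   engine, *_ = deepspeed.initialize(model=model, config=ds_config, mpu=my_mpu)
# After this PR, mpu can be omitted (as in the HF integration) and DeepSpeed
# creates the TP group itself during initialization.
engine, optimizer, _, _ = deepspeed.initialize(model=model, config=ds_config)
```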