Fix count_used_parameters_in_backward crash on PyTorch < 2.3 (#7756) by harshang03 · Pull Request #7849 · deepspeedai/DeepSpeed

harshang03 · 2026-02-12T20:06:53Z

The function asserted the presence of internal PyTorch APIs (_get_grad_fn_or_grad_acc, _current_graph_task_id, _will_engine_execute_node) that only exist in PyTorch >= 2.3. On older builds (e.g. 2.1.2), the assert fired unconditionally inside gradient hooks, crashing training with ZeRO stage 1/2/3.

Changes:

- runtime/utils.py: Replace the hard assert with a graceful fallback that counts all grad-requiring parameters (a conservative upper bound) when the internal APIs are unavailable.
- runtime/engine.py: Enable _support_torch_style_backward for all ZeRO optimizers regardless of PyTorch version, since the fallback counting is safe and correct. Remove an unused import.
- base_optimizer.py: No changes needed (it already handles missing APIs in queue_post_backward_callback).
- tests/: Add a test suite covering fallback behaviour, the native path, edge cases, and API availability checks.
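
To illustrate the fallback described in the list above, here is a minimal sketch of the pattern. It is not the code in this PR: it does not import torch, and `count_used_parameters`, `has_required_apis`, `FakeParam`, and the `namespace` argument are hypothetical names introduced only for this example. The three API names come from the description; the check that `_current_graph_task_id()` returns -1 outside a backward pass is an assumption.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Iterable

# Internal API names taken from the PR description.
REQUIRED_APIS = (
    "_get_grad_fn_or_grad_acc",
    "_current_graph_task_id",
    "_will_engine_execute_node",
)


def has_required_apis(namespace: Any) -> bool:
    """Return True only if every required API is a callable attribute."""
    return all(callable(getattr(namespace, name, None)) for name in REQUIRED_APIS)


def count_used_parameters(parameters: Iterable[Any], namespace: Any) -> int:
    """Count the parameters expected to receive a gradient (hypothetical helper)."""
    params = [p for p in parameters if p.requires_grad]
    if not has_required_apis(namespace):
        # Fallback: conservative upper bound, every grad-requiring parameter.
        return len(params)
    # Assumption: -1 signals that no backward pass is currently running.
    if namespace._current_graph_task_id() == -1:
        raise RuntimeError("must be called during a backward pass")
    # Native path: count only the nodes the engine will actually execute.
    nodes = [namespace._get_grad_fn_or_grad_acc(p) for p in params]
    return sum(1 for node in nodes if namespace._will_engine_execute_node(node))


@dataclass
class FakeParam:
    """Stand-in for a parameter; `used` mimics the engine's answer."""
    requires_grad: bool
    used: bool = True


# Stand-in for a build that exposes all three APIs.
native = SimpleNamespace(
    _get_grad_fn_or_grad_acc=lambda p: p,
    _current_graph_task_id=lambda: 0,
    _will_engine_execute_node=lambda node: node.used,
)
# Stand-in for an older build where the APIs are missing.
legacy = SimpleNamespace()

params = [FakeParam(True, True), FakeParam(True, False), FakeParam(False, True)]
fallback_count = count_used_parameters(params, legacy)
native_count = count_used_parameters(params, native)
```

With these stand-ins the fallback counts both grad-requiring parameters, while the native path counts only the one the engine would execute, so the fallback never reports fewer parameters than the native path.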

Fixes #7756

Signed-off-by: Harshang Akabari <a.harshang@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

harshang03 wants to merge 1 commit into deepspeedai:master from harshang03:fix/7756-count-used-params-fallback
