Fix crash in getcurrent()/greenlet construction during early Py_FinalizeEx #499

Open
nbouvrette wants to merge 9 commits into python-greenlet:master from nbouvrette:fix/safe-getcurrent-during-finalization

@nbouvrette commented Mar 11, 2026

Summary

While regression-testing greenlet 3.2.5 (the backport of PR #495) against Python 3.9.7 under uWSGI, we discovered multiple independent crash paths that were not fixed by PR #495. These crashes affect Python < 3.11.

This PR fixes three root causes of SIGSEGV during Py_FinalizeEx and a latent exception-swallowing bug:

  1. clear_deleteme_list() vector allocation crash: The vector copy used PythonAllocator (PyMem_Malloc), which can SIGSEGV during early Py_FinalizeEx when Python's allocator pools are partially torn down. Replaced with std::swap (zero-allocation, constant-time) and switched deleteme_t to std::allocator (system malloc).

  2. ThreadState memory corruption: ThreadState objects were allocated via PyObject_Malloc, placing them in pymalloc pools that can be disrupted during finalization. Switched to std::malloc/std::free so ThreadState memory remains valid throughout Py_FinalizeEx.

  3. getcurrent() crash on invalidated type objects: _Py_IsFinalizing() is only set after call_py_exitfuncs and _PyGC_CollectIfEnabled complete inside Py_FinalizeEx, so code in atexit handlers or __del__ methods could still call greenlet.getcurrent() when type objects had already been invalidated, crashing in PyType_IsSubtype. An atexit handler is now registered at module init (LIFO = runs first) that sets a shutdown flag checked by getcurrent(), PyGreenlet_GetCurrent(), and clear_deleteme_list().

  4. Exception preservation: clear_deleteme_list() now preserves any pending Python exception around its cleanup loop, fixing a latent bug where an unrelated exception (e.g. one set by throw()) could be swallowed by PyErr_WriteUnraisable/PyErr_Clear.

All new code is guarded by #if !GREENLET_PY311, so there is zero impact on Python 3.11+.
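
The swap-then-drain shape of the clear_deleteme_list() change can be sketched in Python. This is only an analogue: the allocation-free property comes from C++ std::swap on vectors and is not reproduced here, and the class and method names below are hypothetical. The sketch also assumes that one reason to detach the batch before iterating is that a release may queue more work.

```python
class DeferredReleases:
    """Hypothetical Python stand-in for a per-thread deleteme list."""

    def __init__(self):
        self._pending = []
        self.released = []

    def defer(self, item, on_release=None):
        # Queue an item for later release, with an optional callback.
        self._pending.append((item, on_release))

    def drain(self):
        # Detach the current batch first, so anything queued while the
        # callbacks run lands in the fresh list instead of the one being
        # iterated.
        batch, self._pending = self._pending, []
        for item, on_release in batch:
            self.released.append(item)
            if on_release is not None:
                on_release()

    def pending_count(self):
        return len(self._pending)


queue = DeferredReleases()
# Releasing "a" queues "c" while the drain loop is still running.
queue.defer("a", on_release=lambda: queue.defer("c"))
queue.defer("b")
queue.drain()
print(queue.released, queue.pending_count())
```

The item queued during the first drain is left for the next call, which mirrors cleanup being deferred to a later pass rather than extending the loop in progress.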

How this was discovered

After PR #495 was merged and 3.2.5 was released, we ran our uWSGI-based crash reproducer (which simulates production worker recycling with max-requests=1) against Python 3.9.7 with greenlet 3.2.5. The segfaults persisted. Using fprintf debug tracing and objdump disassembly of crash backtraces, we narrowed the crashes to three distinct code paths, each requiring its own fix.

Root Cause Analysis

Why PR #495's guard doesn't help here

PR #495 added _Py_IsFinalizing() guards to prevent crashes during greenlet deallocation (GC-triggered, late in Py_FinalizeEx). At that point, _Py_IsFinalizing() is already true, so the guard works.

These new crashes occur earlier — during call_py_exitfuncs() (atexit handlers) and _PyGC_CollectIfEnabled(), which run before _PyRuntimeState_SetFinalizing():

Py_FinalizeEx()
├── call_py_exitfuncs()              ← crashes happen HERE
├── _PyGC_CollectIfEnabled()         ← and HERE (_Py_IsFinalizing() == false)
├── _PyRuntimeState_SetFinalizing()  ← flag set HERE (too late)
├── finalize_interp_clear()          ← PR #495's guard helps HERE
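
The ordering above can be observed from pure Python. The sketch below is not part of this PR: it runs a child interpreter whose atexit handler prints sys.is_finalizing(), which should report False because exit functions run before the finalizing flag is set. The script text and the run_child helper are illustrative assumptions.

```python
import subprocess
import sys
import textwrap

# Child script: an atexit handler reports whether the interpreter already
# considers itself to be finalizing while exit functions are running.
CHILD_SCRIPT = textwrap.dedent("""
    import atexit
    import sys

    def report():
        print('is_finalizing in atexit handler:', sys.is_finalizing())

    atexit.register(report)
""")


def run_child(script: str) -> str:
    """Run `script` in a fresh interpreter and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


output = run_child(CHILD_SCRIPT)
print(output)
```

This is why a guard based only on the finalizing flag cannot protect code reached from atexit handlers.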

The three crash mechanisms

  1. Allocator disruption: During early Py_FinalizeEx, Python's pymalloc pools can be disrupted. Any C++ object allocated via PyObject_Malloc or PyMem_Malloc (including ThreadState and its deleteme vector's internal buffer) may have its memory corrupted. Fix: use system malloc for both.

  2. Type object invalidation: Python cleanup code (atexit handlers, __del__ methods) can call greenlet.getcurrent(), which returns an OwnedGreenlet smart pointer. The GreenletChecker type validator calls PyType_IsSubtype, which dereferences the greenlet's ob_type — if that type object has been freed or overwritten, SIGSEGV. Fix: an atexit handler registered at module init (last registered = first called in LIFO order) sets a flag before any other cleanup runs.

  3. Exception loss: clear_deleteme_list() runs Py_DECREF + PyErr_WriteUnraisable + PyErr_Clear in a loop, which can clear a pending exception set by an earlier throw() call. Fix: save and restore the exception state around the loop using PyErrPieces.
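
The shutdown-flag idea in mechanism 2 relies on atexit calling handlers in reverse registration order. A minimal Python sketch of that behavior follows; all names are hypothetical stand-ins rather than greenlet internals, and returning None from the guarded accessor is an assumption made for illustration.

```python
import subprocess
import sys
import textwrap

# Child script: two atexit handlers registered in a known order.
# All names here are hypothetical stand-ins, not greenlet's real internals.
CHILD_SCRIPT = textwrap.dedent("""
    import atexit

    _shutting_down = False

    def guarded_getcurrent():
        # Stand-in for getcurrent(): bail out once shutdown has begun.
        return None if _shutting_down else 'current'

    def application_cleanup():
        # Registered first, therefore called last.
        print('cleanup sees:', guarded_getcurrent())

    def mark_shutdown():
        # Registered last, therefore called first.
        global _shutting_down
        _shutting_down = True
        print('shutdown flag set')

    atexit.register(application_cleanup)
    atexit.register(mark_shutdown)
""")

result = subprocess.run(
    [sys.executable, "-c", CHILD_SCRIPT],
    capture_output=True,
    text=True,
    check=True,
)
lines = result.stdout.splitlines()
print(lines)
```

Note that LIFO ordering only places a handler ahead of handlers registered before it. A handler registered after the flag-setting handler would still run earlier, so the order of registration matters.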

Test plan

  • Existing test_interpreter_shutdown.py passes (subprocess-based shutdown tests)
  • Existing test_throw_exception_not_lost passes (exception preservation)
  • Existing test_dealloc_other_thread passes (cross-thread cleanup)
  • Existing test_issue251_killing_cross_thread_leaks_list passes
  • Full CI on all supported Python versions (3.10–3.15)
  • Manual uWSGI stress test: 0 segfaults in 200 worker recycles on Python 3.9.7 (previously 3-4 crashes in 30 recycles)

Backport note

If the maintainer plans a 3.2.x backport (for Python 3.9 support), we recommend a full sync of master into the 3.2.x branch rather than a surgical cherry-pick. All C++ changes between 3.2.4 and 3.3.x are behind #ifdef Py_GIL_DISABLED or #if GREENLET_PY312+ guards — they compile to nothing on Python 3.9, so there is zero runtime risk. Keeping the branches nearly identical (differing only in pyproject.toml and CI config) makes future maintenance significantly easier.

I can create that follow-up PR once this one has been reviewed.

While regression-testing greenlet 3.2.5 (backport of PR python-greenlet#495) against
Python 3.9.7 under uWSGI, we discovered a second, independent crash
path that was NOT fixed by PR python-greenlet#495.

Root cause
----------
On Python < 3.11, Py_FinalizeEx calls call_py_exitfuncs() BEFORE
setting _PyRuntimeState.finalizing (which backs _Py_IsFinalizing()).
If any exit function — or code triggered by one — calls
greenlet.getcurrent(), greenlet.greenlet(...), or the C API
PyGreenlet_GetCurrent(), the non-const get_current()/borrow_current()
methods run clear_deleteme_list().  That helper copies a std::vector
through PythonAllocator (PyMem_Malloc), and during this early
finalization phase the allocator state can be partially torn down,
causing a SIGSEGV.

Fix
---
Add ThreadStateCreator::readonly_state(), which returns a const
ThreadState& and therefore selects the const overloads of
get_current() / borrow_current() — these simply return the current
greenlet pointer without touching the deleteme list.  The deleteme
cleanup is safely deferred to the next greenlet switch or
thread-state teardown.

Because the fix removes a side-effect from simple accessors, it is
applied unconditionally (all Python versions), not just < 3.11.

Changed files:
- TThreadStateCreator.hpp: new readonly_state() method
- TThreadState.hpp: new const borrow_current() overload
- PyModule.cpp: mod_getcurrent uses readonly_state()
- CObjects.cpp: PyGreenlet_GetCurrent uses readonly_state()
- PyGreenlet.cpp: green_new uses readonly_state()
- PyGreenletUnswitchable.cpp: green_unswitchable_new uses readonly_state()
- test_interpreter_shutdown.py: 7 new subprocess-based tests
- CHANGES.rst: release note

Made-with: Cursor

Two fixes to clear_deleteme_list():

1. Use std::swap instead of vector copy to avoid PythonAllocator
   (PyMem_Malloc) allocation that crashes during early Py_FinalizeEx
   on Python < 3.11.

2. Save/restore the pending exception around the cleanup loop so
   that PyErr_WriteUnraisable/PyErr_Clear inside the loop cannot
   swallow an unrelated exception (e.g. one set by throw()).

Revert mod_getcurrent and PyGreenlet_GetCurrent to use the non-const
path so that getcurrent() continues to trigger cross-thread cleanup.
Keep green_new/green_unswitchable_new using readonly_state() since
construction doesn't need to trigger cleanup.

Made-with: Cursor

Greenlet construction (green_new, green_unswitchable_new) must also
trigger clear_deleteme_list because existing code and tests depend
on it for cross-thread cleanup (e.g. test_dealloc_other_thread uses
RawGreenlet() to trigger deleteme processing).

With std::swap + exception preservation in clear_deleteme_list,
the crash and exception-loss bugs are fixed at the source.
The readonly_state / const borrow_current utilities are no longer
needed and are removed.

Made-with: Cursor

@nbouvrette force-pushed the fix/safe-getcurrent-during-finalization branch from 374c98e to 17d17e3 on March 11, 2026 05:35

Three root causes of SIGSEGV during interpreter shutdown on Python < 3.11
were identified through uWSGI worker recycling stress tests and fixed:

1. ThreadState allocated via PyObject_Malloc — pymalloc pools can be
   disrupted during early Py_FinalizeEx, corrupting the C++ object.
   Switched to std::malloc/std::free.

2. deleteme vector used PythonAllocator (PyMem_Malloc) — same pool
   corruption issue.  Switched to std::allocator (system malloc).

3. _Py_IsFinalizing() is only set AFTER call_py_exitfuncs and
   _PyGC_CollectIfEnabled complete, so atexit handlers and __del__
   methods could call greenlet.getcurrent() when type objects were
   already invalidated, crashing in PyType_IsSubtype.  An atexit
   handler registered at module init (LIFO = runs first) now sets
   a shutdown flag checked by getcurrent(), PyGreenlet_GetCurrent(),
   and clear_deleteme_list().

All new code is guarded by #if !GREENLET_PY311 — zero impact on
Python 3.11+.  Verified: 0 segfaults in 200 uWSGI worker recycles
on Python 3.9.7 (previously 3-4 crashes in 30 recycles).

Made-with: Cursor

@nbouvrette force-pushed the fix/safe-getcurrent-during-finalization branch from 17d17e3 to 733a419 on March 11, 2026 05:37

Python 3.13 renamed _Py_IsFinalizing() to Py_IsFinalizing(), requiring
#if/#elif/#else chains at every call site. Centralize the version check
in greenlet_cpython_compat.hpp so call sites use a single macro and
future Python versions need only one update.

Made-with: Cursor

The final USS measurement in _check_untracked_memory_thread can pick
up tens of KB of OS-level noise (working set trimming, page table
updates, thread-local caches) between the loop exit and the assertion.
Add a 1 MB tolerance on Windows only, keeping the strict check on
Linux/macOS where USS is more stable.  Real leaks grow by MBs over
100 iterations, so this tolerance cannot mask genuine issues.
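
A platform-conditional tolerance of this kind might look like the sketch below. The helper names and the exact form of the strict check are assumptions for illustration; the real test's structure is not shown in this message.

```python
import sys

ONE_MB = 1024 * 1024


def allowed_uss_growth(platform: str = sys.platform) -> int:
    """Bytes of USS growth tolerated between loop exit and the final check."""
    # Only Windows gets slack; other platforms keep the strict check.
    return ONE_MB if platform == "win32" else 0


def check_no_leak(uss_before: int, uss_after: int,
                  platform: str = sys.platform) -> None:
    """Raise AssertionError if USS grew by more than the platform tolerance."""
    growth = uss_after - uss_before
    tolerance = allowed_uss_growth(platform)
    assert growth <= tolerance, (
        f"USS grew by {growth} bytes (tolerance {tolerance})"
    )


# 50 KB of noise is accepted on Windows under this sketch.
check_no_leak(10 * ONE_MB, 10 * ONE_MB + 50 * 1024, platform="win32")
```

The same 50 KB of growth would fail with platform="linux", while growth of several MB fails on every platform.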

Made-with: Cursor

Py_IsFinalizing() has been a public CPython API since Python 3.6 and
is available on all Python versions greenlet supports (>=3.9).  The
private _Py_IsFinalizing() name was removed in Python 3.13, but the
public function works on all versions.  Drop the custom macro in favor
of the standard API — when older Python support is eventually dropped,
there is nothing to clean up.

Made-with: Cursor

Py_IsFinalizing() only became a public C API in Python 3.13; older
versions only expose _Py_IsFinalizing().  Add a two-line macro that
maps the public name to the private one on < 3.13, so all call sites
use the standard name.  Remove the macro when < 3.13 is dropped.

Made-with: Cursor