
Enable multi-threaded execution for TableFunction#6

Draft
otegami wants to merge 6 commits into main from refactor/extract-executor-module

Conversation


@otegami otegami commented Apr 3, 2026

Summary

Enable multi-threaded execution for DuckDB::TableFunction on DuckDB >= 1.5.0 by introducing per-worker proxy threads.

DuckDB invokes table function callbacks from its own worker threads, which are not registered with the Ruby VM. Because rb_thread_call_with_gvl crashes when called from such non-Ruby threads, we previously forced single-threaded execution. This PR gives each DuckDB worker thread a dedicated Ruby proxy thread that acquires the GVL on its behalf, making table function callbacks safe under multi-threaded DuckDB execution.

otegami added 6 commits April 3, 2026 17:51
…r.{c,h}

Move the global executor thread implementation (~230 lines) from
scalar_function.c into a new shared module executor.{c,h}. This
makes the executor reusable by other C files (e.g., table_function.c)
that also need to dispatch callbacks from non-Ruby threads.

The executor API is generalized with a callback function pointer
(rbduckdb_callback_fn) instead of the scalar-specific callback_arg.
scalar_function.c adds a thin wrapper (scalar_execute_via_executor)
to adapt the generic signature.

No behavior change: all existing tests pass.

Add per-worker proxy threads to the shared executor module. Each
DuckDB worker thread can be assigned a dedicated Ruby proxy thread
that waits on its own condvar, acquires the GVL independently, and
executes callbacks without going through the global executor queue.

This eliminates the global executor bottleneck by distributing GVL
acquisition across multiple Ruby threads — the Ruby equivalent of
Python's PyGILState_Ensure() approach.

Key components:
- struct worker_proxy with dedicated condvar per proxy
- rbduckdb_worker_proxy_create(): spawns proxy Ruby thread
- rbduckdb_worker_proxy_dispatch(): sends callback, blocks until done
- rbduckdb_worker_proxy_destroy(): safe cleanup from non-Ruby threads
- g_proxy_threads GC protection array

Not yet wired up to any callback; integration follows in the next
commit.

Wire up the per-worker proxy infrastructure in scalar_function.c:

- Add scalar_function_init_local_state() callback that creates a
  per-worker proxy for each non-Ruby DuckDB worker thread via
  duckdb_scalar_function_set_init (DuckDB >= 1.5.0)
- Update Case 3 dispatch to use per-worker proxy when available,
  falling back to global executor otherwise
- Register init_local_state in set_function
- Add version gate to multithread scalar function test

All code is guarded by HAVE_DUCKDB_H_GE_V1_5_0. On older DuckDB
versions, the global executor fallback path is unchanged.

Refactor table_function.c bind/init/execute callbacks to use the
same three-path dispatch pattern as scalar_function.c:

1. Ruby thread WITH GVL    -> call directly
2. Ruby thread WITHOUT GVL -> rb_thread_call_with_gvl
3. Non-Ruby thread         -> dispatch to global executor

Each callback's core logic is extracted into a *_with_gvl() function
(table_bind_with_gvl, table_init_with_gvl, table_execute_with_gvl)
that the dispatch paths invoke. The global executor is started when
the execute callback is registered.

This makes table function callbacks thread-safe when invoked from
non-Ruby threads. The SET threads=1 restriction is not yet removed
(done in a subsequent commit).

Add per-worker proxy support to the table function execute callback
for DuckDB >= 1.5.0:

- Add table_function_local_init_callback() that creates a per-worker
  proxy for each non-Ruby DuckDB worker thread
- Register via duckdb_table_function_set_local_init in set_execute
- Update execute callback Case 3 to use proxy when available,
  falling back to global executor otherwise

All code is guarded by HAVE_DUCKDB_H_GE_V1_5_0. The SET threads=1
restriction is not yet removed (done in the next commit).