Enable multi-threaded execution for TableFunction #6
Draft
…r.{c,h}
Move the global executor thread implementation (~230 lines) from
scalar_function.c into a new shared module executor.{c,h}. This
makes the executor reusable by other C files (e.g., table_function.c)
that also need to dispatch callbacks from non-Ruby threads.
The executor API is generalized with a callback function pointer
(rbduckdb_callback_fn) instead of the scalar-specific callback_arg.
scalar_function.c adds a thin wrapper (scalar_execute_via_executor)
to adapt the generic signature.
No behavior change — all existing tests pass.
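The generalization described above can be illustrated with a minimal sketch. It assumes the generic callback type takes a single opaque pointer; the actual definition of rbduckdb_callback_fn is not shown in this commit message, and every name below (callback_fn, scalar_args_sketch, executor_dispatch_sketch) is hypothetical. The stand-in executor runs the callback inline rather than on another thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of a generic callback: the executor only sees an opaque pointer. */
typedef void (*callback_fn)(void *data);

/* Hypothetical scalar-specific argument bundle. */
struct scalar_args_sketch {
    int input;
    int output;
};

/* Typed, scalar-specific work. */
static void scalar_execute_typed(struct scalar_args_sketch *args) {
    args->output = args->input * 2;
}

/* Thin wrapper that adapts the typed function to the generic signature. */
static void scalar_execute_adapter(void *data) {
    scalar_execute_typed((struct scalar_args_sketch *)data);
}

/* Stand-in for the executor entry point; a real one would hand the
 * callback to another thread instead of calling it inline. */
void executor_dispatch_sketch(callback_fn fn, void *data) {
    assert(fn != NULL);
    fn(data);
}

/* Caller side: package arguments, dispatch, read the result back. */
int run_scalar_via_executor(int input) {
    struct scalar_args_sketch args = { input, 0 };
    executor_dispatch_sketch(scalar_execute_adapter, &args);
    return args.output;
}
```

Because the executor never inspects the argument type, another file could reuse it by supplying its own adapter and argument struct.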
Add per-worker proxy threads to the shared executor module. Each DuckDB worker thread can be assigned a dedicated Ruby proxy thread that waits on its own condvar, acquires the GVL independently, and executes callbacks without going through the global executor queue.

This eliminates the global executor bottleneck by distributing GVL acquisition across multiple Ruby threads, the Ruby equivalent of Python's PyGILState_Ensure() approach.

Key components:
- struct worker_proxy with dedicated condvar per proxy
- rbduckdb_worker_proxy_create(): spawns proxy Ruby thread
- rbduckdb_worker_proxy_dispatch(): sends callback, blocks until done
- rbduckdb_worker_proxy_destroy(): safe cleanup from non-Ruby threads
- g_proxy_threads GC protection array

Not yet wired up to any callback; integration follows in the next commit.
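The create/dispatch/destroy handshake can be sketched with plain pthreads. This is not the extension's code: the real proxy is a Ruby thread that holds the GVL while running the callback, whereas this stand-in only shows the condvar protocol. All names (proxy_sketch, proxy_create, proxy_dispatch, proxy_destroy, proxy_demo) are hypothetical, and the sketch assumes one dispatching worker per proxy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*callback_fn)(void *data);

/* Hypothetical stand-in for struct worker_proxy: one mutex/condvar per proxy. */
struct proxy_sketch {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    callback_fn fn;   /* pending callback, NULL when idle */
    void *data;
    bool done;        /* set by the proxy once the callback has returned */
    bool stop;        /* set by proxy_destroy to end the loop */
};

/* Proxy loop: sleep on the proxy's own condvar until work or stop arrives. */
static void *proxy_main(void *arg) {
    struct proxy_sketch *p = arg;
    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (p->fn == NULL && !p->stop)
            pthread_cond_wait(&p->cond, &p->lock);
        if (p->fn == NULL)
            break; /* stop requested and nothing pending */
        callback_fn fn = p->fn;
        void *data = p->data;
        pthread_mutex_unlock(&p->lock);
        fn(data); /* the real proxy would run this while holding the GVL */
        pthread_mutex_lock(&p->lock);
        p->fn = NULL;
        p->done = true;
        pthread_cond_broadcast(&p->cond);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

int proxy_create(struct proxy_sketch *p) {
    p->fn = NULL;
    p->data = NULL;
    p->done = false;
    p->stop = false;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->cond, NULL);
    return pthread_create(&p->thread, NULL, proxy_main, p);
}

/* Called from the worker thread: hand over one callback, block until done. */
void proxy_dispatch(struct proxy_sketch *p, callback_fn fn, void *data) {
    assert(fn != NULL);
    pthread_mutex_lock(&p->lock);
    p->fn = fn;
    p->data = data;
    p->done = false;
    pthread_cond_broadcast(&p->cond);
    while (!p->done)
        pthread_cond_wait(&p->cond, &p->lock);
    pthread_mutex_unlock(&p->lock);
}

void proxy_destroy(struct proxy_sketch *p) {
    pthread_mutex_lock(&p->lock);
    p->stop = true;
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->lock);
    pthread_join(p->thread, NULL);
    pthread_cond_destroy(&p->cond);
    pthread_mutex_destroy(&p->lock);
}

static void add_one(void *data) { *(int *)data += 1; }

/* Usage: three blocking dispatches, each executed on the proxy thread. */
int proxy_demo(void) {
    struct proxy_sketch p;
    int counter = 0;
    if (proxy_create(&p) != 0)
        return -1;
    for (int i = 0; i < 3; i++)
        proxy_dispatch(&p, add_one, &counter);
    proxy_destroy(&p);
    return counter;
}
```

Since each proxy has its own mutex and condvar, workers in this sketch never contend on a shared queue, which is the property the commit message attributes to the per-worker design.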
Wire up the per-worker proxy infrastructure in scalar_function.c:
- Add scalar_function_init_local_state() callback that creates a per-worker proxy for each non-Ruby DuckDB worker thread via duckdb_scalar_function_set_init (DuckDB >= 1.5.0)
- Update Case 3 dispatch to use per-worker proxy when available, falling back to global executor otherwise
- Register init_local_state in set_function
- Add version gate to multithread scalar function test

All code is guarded by HAVE_DUCKDB_H_GE_V1_5_0. On older DuckDB versions, the global executor fallback path is unchanged.
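A rough sketch of the Case 3 decision with the compile-time guard follows. The local-state struct, the enum, and choose_case3_route are hypothetical names used only for illustration. HAVE_DUCKDB_H_GE_V1_5_0 is taken from the commit message and is defined locally here only so the sketch exercises the proxy branch; in the extension it would presumably come from the build configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Defined here only for the sketch; assumed to come from the build otherwise. */
#ifndef HAVE_DUCKDB_H_GE_V1_5_0
#define HAVE_DUCKDB_H_GE_V1_5_0 1
#endif

enum case3_route { ROUTE_GLOBAL_EXECUTOR, ROUTE_WORKER_PROXY };

/* Hypothetical per-worker local state; proxy stays NULL if none was created. */
struct local_state_sketch {
    void *proxy;
};

/* Prefer the per-worker proxy when one exists, otherwise fall back. */
enum case3_route choose_case3_route(const struct local_state_sketch *ls) {
#ifdef HAVE_DUCKDB_H_GE_V1_5_0
    if (ls != NULL && ls->proxy != NULL)
        return ROUTE_WORKER_PROXY;
#else
    (void)ls; /* older DuckDB: no local state, always use the global executor */
#endif
    return ROUTE_GLOBAL_EXECUTOR;
}
```

When the macro is not defined, the function compiles down to the unconditional global-executor route, matching the unchanged fallback behaviour described for older DuckDB versions.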
Refactor table_function.c bind/init/execute callbacks to use the same three-path dispatch pattern as scalar_function.c:
1. Ruby thread WITH GVL -> call directly
2. Ruby thread WITHOUT GVL -> rb_thread_call_with_gvl
3. Non-Ruby thread -> dispatch to global executor

Each callback's core logic is extracted into a *_with_gvl() function (table_bind_with_gvl, table_init_with_gvl, table_execute_with_gvl) that the dispatch paths invoke. The global executor is started when the execute callback is registered.

This makes table function callbacks thread-safe when invoked from non-Ruby threads. The SET threads=1 restriction is not yet removed (done in a subsequent commit).
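The three-path decision can be sketched as a pure function over the calling thread's state. The struct and enum names are hypothetical. In the real extension the two flags would presumably be obtained from Ruby C API checks such as ruby_native_thread_p() and ruby_thread_has_gvl_p(), but that is an assumption; the commit message does not say how the thread state is detected.

```c
#include <assert.h>
#include <stdbool.h>

enum dispatch_path {
    CALL_DIRECT,       /* 1. Ruby thread WITH GVL */
    CALL_WITH_GVL,     /* 2. Ruby thread WITHOUT GVL -> rb_thread_call_with_gvl */
    CALL_VIA_EXECUTOR  /* 3. Non-Ruby thread -> global executor */
};

/* Hypothetical snapshot of the calling thread's state. */
struct thread_state_sketch {
    bool is_ruby_thread;
    bool has_gvl;
};

/* Pick the dispatch path; the order mirrors the numbered list above. */
enum dispatch_path choose_dispatch_path(struct thread_state_sketch s) {
    if (s.is_ruby_thread && s.has_gvl)
        return CALL_DIRECT;
    if (s.is_ruby_thread)
        return CALL_WITH_GVL;
    return CALL_VIA_EXECUTOR;
}
```

Each path would then invoke the same *_with_gvl() function, so the callback logic exists once and only the way the GVL is obtained differs.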
Add per-worker proxy support to the table function execute callback for DuckDB >= 1.5.0:
- Add table_function_local_init_callback() that creates a per-worker proxy for each non-Ruby DuckDB worker thread
- Register via duckdb_table_function_set_local_init in set_execute
- Update execute callback Case 3 to use proxy when available, falling back to global executor otherwise

All code is guarded by HAVE_DUCKDB_H_GE_V1_5_0. The SET threads=1 restriction is not yet removed (done in the next commit).
530b616 to
8b1d2f0
Summary
Enable multi-threaded execution for `DuckDB::TableFunction` on DuckDB >= 1.5.0 by introducing per-worker proxy threads.

DuckDB invokes table function callbacks from its own worker threads, which are not Ruby threads. Since `rb_thread_call_with_gvl` crashes when called from non-Ruby threads, we previously forced single-threaded execution. This PR gives each DuckDB worker thread a dedicated Ruby proxy thread that acquires the GVL on its behalf, making table function callbacks safe under multi-threaded DuckDB execution.