Closed
Conversation
…callbacks DuckDB calls scalar function callbacks from its own worker threads, which are not Ruby threads. Ruby's GVL cannot be acquired from non-Ruby threads (rb_thread_call_with_gvl crashes with rb_bug), so an executor pattern is required. Previously, scalar_function.c contained a single global executor thread that serialized all callbacks through one condvar. On Windows, this caused deadlocks due to GVL starvation under Minitest parallel execution (Issue suketa#1158). This commit makes two changes: 1. Extract the executor infrastructure into a shared module (executor.c/executor.h) so both scalar_function.c and table_function.c can use it. 2. Add per-worker proxy threads using DuckDB's init_local_state API. Each DuckDB worker thread gets a dedicated Ruby proxy thread created via duckdb_scalar_function_set_init(). The proxy acquires the GVL independently, eliminating the global executor bottleneck. The global executor is retained as a bootstrap mechanism for proxy creation and as a fallback. Refs: suketa#1158 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
table_function.c previously called Ruby C API functions (rb_funcall, rb_class_new_instance, rb_protect) directly from whichever thread DuckDB invoked the callback on. When called from a non-Ruby worker thread, this was undefined behavior — potential crash or deadlock. The workaround was a Ruby-level check_threads guard in connection.rb that raised an error if threads > 1 when registering table functions, forcing all table function queries to run single-threaded. This commit adds the same three-path dispatch pattern used by scalar_function.c to all three table function callbacks: 1. Ruby thread WITH GVL -> call directly 2. Ruby thread WITHOUT GVL -> rb_thread_call_with_gvl 3. Non-Ruby thread -> per-worker proxy (or global executor) The execute callback also registers a local_init that creates a per-worker proxy thread for non-Ruby workers, matching the scalar function approach. With table functions now thread-safe, the check_threads guard and its threads=1 requirement are removed from connection.rb. Refs: suketa#1158 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… table function test With scalar and table function callbacks now thread-safe via per-worker proxy threads, the SET threads=1 workaround is no longer needed in UDF-related tests. - Remove SET threads=1 from table_function_test.rb, data_chunk_test.rb, gc_stress_test.rb, table_function_csv_test.rb, and table_function_integration_test.rb - Add test_table_function_with_multithread that runs a table function with threads=4 to verify thread safety Non-UDF tests (query_progress, pending_result) retain their SET threads=1 as those are unrelated to callback thread safety. Refs: suketa#1158 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
duckdb_scalar_function_set_init, duckdb_scalar_function_get_state, and duckdb_table_function_set_local_init were introduced in DuckDB 1.5.0. CI tests against DuckDB 1.4.4, which lacks these symbols. Wrap per-worker proxy creation and dispatch behind HAVE_DUCKDB_H_GE_V1_5_0 (#ifdef). On DuckDB < 1.5.0, the global executor thread handles all non-Ruby thread callbacks (the previous behavior). The three-path dispatch (Ruby+GVL / Ruby-GVL / non-Ruby) remains active on all versions. Refs: suketa#1158 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per-worker proxy threads require DuckDB >= 1.5.0 APIs (duckdb_scalar_function_set_init, duckdb_scalar_function_get_state, duckdb_table_function_set_local_init). On older versions, these symbols are unavailable and the global executor is the only dispatch path for non-Ruby thread callbacks. Restore backward-compatible behavior: - connection.rb: Restore check_threads for DuckDB < 1.5.0. On >= 1.5.0 the check is skipped (per-worker proxy handles thread safety). On < 1.5.0 the threads=1 restriction remains to prevent the global executor deadlock. - Tests: Keep SET threads=1 in all table function tests for compatibility with DuckDB 1.4.x. Version-gate the new multithread tests behind DuckDB >= 1.5.0. Refs: suketa#1158 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Break long skip lines into unless/skip form - Add rubocop:disable for MethodLength on scalar multithread test - Replace empty block with nil body in table function init Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, rbduckdb_worker_proxy_destroy only signaled the proxy thread to stop but never waited for it to exit or freed the struct. This leaked the worker_proxy struct, its mutex, and condvars for every DuckDB worker thread on each UDF query — unbounded growth in long-lived processes. Add an exit_cond condvar so destroy can block until the proxy thread has fully exited, then safely destroy all OS primitives and free the struct. Switch from xcalloc/xfree to calloc/free since destroy runs on a non-Ruby thread where xfree is unsafe. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename terse field names in executor_request and worker_proxy structs to be more descriptive, following Arrow C GLib naming conventions: - fn/data → callback_func/callback_data - done → request_done - done_lock/done_mutex/done_cond → request_done_lock/request_done_mutex/request_done_cond - stop → stop_requested - exited → thread_exited - exit_cond → thread_exit_cond Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.