Enable multi-threaded execution for TableFunction #6
Draft
otegami wants to merge 2 commits into
530b616 to 8b1d2f0
Add a per-worker proxy: one dedicated Ruby thread per DuckDB worker thread, using the same mutex/condvar hand-off protocol as the global executor but private to a single worker. This lets callbacks from different workers run concurrently instead of serializing through the one global executor queue.

rbduckdb_function_executor_dispatch_via_proxy() routes the non-Ruby thread path (Case 3) through a given proxy when non-NULL, falling back to the global executor when NULL; the existing dispatch() now delegates to it with NULL, so behavior is unchanged. Live proxies are held in a GC-protected array.

The new symbols are unused until table function integration lands, so this commit is behavior-preserving (full suite green).
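The hand-off described in this commit can be modeled in plain Ruby. The following is only a sketch of the idea, not the C implementation in the PR: `WorkerProxy`, `GLOBAL_EXECUTOR`, and the Ruby-level `dispatch_via_proxy` are hypothetical names, and the callers here are ordinary Ruby threads, whereas the real callers are DuckDB worker threads that cannot take the GVL themselves.

```ruby
# Model of a proxy: one dedicated Ruby thread that runs callables handed to it
# through a private mutex/condvar pair. Hypothetical stand-in for the C struct.
class WorkerProxy
  def initialize
    @dispatch_lock = Mutex.new # serializes callers that share one proxy
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @request = nil # callable waiting to be run, or nil
    @done = false  # true once the pending request has finished
    @result = nil
    @error = nil
    @stop = false
    @thread = Thread.new { run_loop }
  end

  # Hand a callable to the proxy thread and block until it has run there.
  def dispatch(callable)
    @dispatch_lock.synchronize do
      @mutex.synchronize do
        @request = callable
        @done = false
        @cond.broadcast
        @cond.wait(@mutex) until @done
        raise @error if @error
        @result
      end
    end
  end

  # Ask the proxy thread to exit and wait for it.
  def destroy
    @mutex.synchronize do
      @stop = true
      @cond.broadcast
    end
    @thread.join
  end

  private

  # Block until a request or a stop signal arrives; nil means "stop".
  def next_request
    @mutex.synchronize do
      @cond.wait(@mutex) until @request || @stop
      return nil if @stop
      request, @request = @request, nil
      request
    end
  end

  def run_loop
    while (callable = next_request)
      begin
        result, error = callable.call, nil
      rescue StandardError => e
        result, error = nil, e
      end
      @mutex.synchronize do
        @result, @error, @done = result, error, true
        @cond.broadcast
      end
    end
  end
end

# Stand-in for the single global executor (an assumption of this sketch).
GLOBAL_EXECUTOR = WorkerProxy.new

# Route through the given proxy when present, else through the global executor.
def dispatch_via_proxy(proxy, callable)
  (proxy || GLOBAL_EXECUTOR).dispatch(callable)
end
```

Passing `nil` reproduces the old path, which is why delegating the existing dispatch to the new function with a NULL proxy leaves behavior unchanged. Callers that share the global executor queue up behind `@dispatch_lock`, while callers with their own proxy do not contend with each other.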
Wire the execute path to per-worker proxy threads on DuckDB >= 1.5.0.

A local_init callback registered via duckdb_table_function_set_local_init runs once per worker thread, creates a proxy (allocating its Ruby thread under the GVL through the global executor, since local_init runs on a non-Ruby thread), and stores it as thread-local init data. The execute callback retrieves that proxy and dispatches through it via rbduckdb_function_executor_dispatch_via_proxy, so callbacks from different workers run concurrently instead of serializing on the single global executor. DuckDB frees each proxy through rbduckdb_worker_proxy_destroy. bind and init stay on the global executor (not on the hot path).

On DuckDB < 1.5.0 the local_init hook is absent and the execute callback keeps using the global executor unchanged.

Verified: with SET threads=4 plus cardinality/max_threads hints, a GVL-releasing callback reaches max_concurrent=4 (vs 2 on the global executor) for a ~2x speedup; results are identical. The added test asserts correctness of the local_init -> proxy -> destroy lifecycle under multi-threaded execution (throughput is checked manually to avoid CI flakiness).
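The local_init -> proxy -> destroy lifecycle and the max_concurrent measurement can be illustrated with a small pure-Ruby simulation. This is a sketch under stated assumptions, not the PR's test: `QueueExecutor`, `ConcurrencyProbe`, and `run_workers` are hypothetical names, the workers are plain Ruby threads standing in for DuckDB workers, and `sleep` stands in for a GVL-releasing callback. The shared executor in this model serializes strictly, so its numbers are not comparable to the figures reported above.

```ruby
# Minimal executor: a dedicated thread draining a queue of [callable, reply] jobs.
# Callables are assumed not to raise in this sketch.
class QueueExecutor
  def initialize
    @jobs = Queue.new
    @thread = Thread.new do
      while (job = @jobs.pop) # a nil sentinel ends the loop
        callable, reply = job
        reply << callable.call
      end
    end
  end

  # Run the callable on the executor thread and wait for its result.
  def dispatch(callable)
    reply = Queue.new
    @jobs << [callable, reply]
    reply.pop
  end

  def destroy
    @jobs << nil
    @thread.join
  end
end

# Records the highest number of callbacks that were in flight at once.
class ConcurrencyProbe
  attr_reader :max_concurrent

  def initialize
    @lock = Mutex.new
    @current = 0
    @max_concurrent = 0
  end

  def track
    @lock.synchronize do
      @current += 1
      @max_concurrent = [@max_concurrent, @current].max
    end
    yield
  ensure
    @lock.synchronize { @current -= 1 }
  end
end

# Each simulated worker either shares one executor or creates its own in a
# "local_init" step, dispatches every chunk through it, then destroys it.
def run_workers(worker_count:, chunks_per_worker:, shared_executor: nil)
  probe = ConcurrencyProbe.new
  destroyed = Queue.new
  threads = Array.new(worker_count) do |worker_id|
    Thread.new do
      executor = shared_executor || QueueExecutor.new # local_init
      rows = Array.new(chunks_per_worker) do |chunk|
        executor.dispatch(lambda {
          probe.track do
            sleep 0.02 # stands in for a GVL-releasing callback
            [worker_id, chunk]
          end
        })
      end
      unless shared_executor
        executor.destroy # end of the per-worker lifecycle
        destroyed << worker_id
      end
      rows
    end
  end
  rows = threads.flat_map(&:value).sort
  { rows: rows, max_concurrent: probe.max_concurrent, destroyed: destroyed.size }
end
```

Calling `run_workers(worker_count: 4, chunks_per_worker: 3)` gives every worker its own executor, while passing `shared_executor:` funnels all workers through one thread. Comparing the two result hashes checks that the rows match and shows how `max_concurrent` differs between the two routing strategies.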
8b1d2f0 to 83684e6
Summary
Enable multi-threaded execution for `DuckDB::TableFunction` on DuckDB >= 1.5.0 by introducing per-worker proxy threads.

DuckDB invokes table function callbacks from its own worker threads, which are not Ruby threads. Since `rb_thread_call_with_gvl` crashes when called from non-Ruby threads, we previously forced single-threaded execution. This PR gives each DuckDB worker thread a dedicated Ruby proxy thread that acquires the GVL on its behalf, making table function callbacks safe under multi-threaded DuckDB execution.