Skip to content

fix(scrapy): async-thread startup race, shutdown lifecycle, and timeout setting#979

Draft
vdusek wants to merge 8 commits into
masterfrom
fix/async-thread-startup-race
Draft

fix(scrapy): async-thread startup race, shutdown lifecycle, and timeout setting#979
vdusek wants to merge 8 commits into
masterfrom
fix/async-thread-startup-race

Conversation

@vdusek

@vdusek vdusek commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Description

Fixes several defects in the Scrapy integration's background event-loop thread (AsyncThread), the scheduler, and the HTTP cache storage.

Fixes

  • run_coro startup race. The guard if not self._eventloop.is_running() fired spuriously during startup. __init__ starts the loop thread and returns immediately, but callers (ApifyScheduler.open, the HTTP cache extension) invoke run_coro right after construction, before the thread reaches run_forever(), so is_running() was False and the run crashed (observed ~122/500 times in scheduler.open()). The guard now checks is_closed(). A coroutine submitted to a created-but-not-yet-running loop is queued by asyncio.run_coroutine_threadsafe and runs once the loop starts; only a genuinely closed loop raises.
  • close() could leak the loop thread when task cancellation failed. If _shutdown_tasks() timed out or raised, close() returned before stopping the loop and joining the thread. The stop, join, and forced-shutdown fallback now run in a finally, so the thread is always torn down. The original error still propagates.
  • close() raised on a second call. A repeated close (e.g. ApifyScheduler.open() closing on failure, then Scrapy closing again) called into the already-closed loop and raised RuntimeError: Event loop is closed. An is_closed() early-return makes a second close a no-op.
  • close() ignored its timeout argument for the task-cancellation step (it used the constructor default). It now passes the caller's timeout to that step too.
  • run_coro timeout left the coroutine running. On timeout it now cancels the future so the coroutine does not outlive the timeout.
  • HTTP cache cleanup could leak the loop thread. In ApifyCacheStorage.close_spider, the expiration sweep ran outside the try, so a failure there skipped AsyncThread.close(). The sweep now runs inside try, with close() in a finally so it always runs.
  • HTTP cache open could leak the loop thread. open_spider started the AsyncThread and then opened the key-value store, but on failure it did not close the thread (and close_spider may never run if open_spider fails). It now closes the thread on failure, matching ApifyScheduler.open.
  • refactor(scrapy): make AsyncThread timeout configurable #955's configurable timeout was unreachable by users. Added the APIFY_ASYNC_THREAD_TIMEOUT_SECS Scrapy setting, wired into the scheduler (via from_crawler) and the cache storage, so the AsyncThread timeout is configurable.

Error logging

run_coro no longer logs-and-raises, so it no longer double-reports its own errors. Each call site keeps a traceback.print_exc(), so the raw traceback is still printed alongside Scrapy's own logging of the propagated exception.

Tests

New tests/unit/scrapy/test_async_thread.py covers the startup race, normal run, run-after-close, timeout cancellation, the no-self-logging behaviour, idempotent close, the caller timeout reaching the shutdown step, and stop/join even when task cancellation fails. Plus additions to the scheduler and HTTP cache test modules: the timeout-setting wiring, closing the thread on open failure, and the cleanup-failure path still closing the thread.

@vdusek vdusek added adhoc Ad-hoc unplanned task added during the sprint. t-tooling Issues with this label are in the ownership of the tooling team. labels Jun 12, 2026
@vdusek vdusek self-assigned this Jun 12, 2026
@github-actions github-actions Bot added this to the 142nd sprint - Tooling team milestone Jun 12, 2026
@github-actions github-actions Bot added the tested Temporary label used only programatically for some analytics. label Jun 12, 2026
@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 81.57895% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.14%. Comparing base (0daca28) to head (a60a55c).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/apify/scrapy/extensions/_httpcache.py 78.18% 12 Missing ⚠️
src/apify/scrapy/_async_thread.py 85.71% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #979      +/-   ##
==========================================
+ Coverage   89.90%   91.14%   +1.24%     
==========================================
  Files          49       49              
  Lines        3091     3118      +27     
==========================================
+ Hits         2779     2842      +63     
+ Misses        312      276      -36     
Flag Coverage Δ
e2e 35.59% <0.00%> (-0.32%) ⬇️
integration 56.38% <0.00%> (-0.50%) ⬇️
unit 80.08% <81.57%> (+1.33%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vdusek vdusek changed the title fix: Resolve AsyncThread.run_coro startup race fix(scrapy): Resolve AsyncThread.run_coro startup race Jun 12, 2026
@vdusek vdusek requested a review from Pijukatel June 12, 2026 17:03
@vdusek vdusek marked this pull request as ready for review June 12, 2026 17:04
@vdusek vdusek changed the title fix(scrapy): Resolve AsyncThread.run_coro startup race fix(scrapy): async-thread startup race, shutdown lifecycle, and timeout setting Jun 13, 2026
@vdusek vdusek marked this pull request as draft June 13, 2026 08:14
@vdusek vdusek removed the request for review from Pijukatel June 13, 2026 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

adhoc Ad-hoc unplanned task added during the sprint. t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants