fix(libev): guard handle_read/handle_write against close() race condition by vponomaryov · Pull Request #889 · scylladb/python-driver

vponomaryov · 2026-05-19T13:15:06Z

When close() is called from one thread, it sets is_closed=True and closes the socket immediately. However, libev watchers are stopped asynchronously in _loop_will_run(), so handle_read()/handle_write() can still fire on the now-closed fd, causing EBADF errors that surface as ConnectionShutdown('Bad file descriptor') and prevent reconnection.
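To illustrate the failure mode in isolation, here is a minimal sketch that does not use the driver or libev. It only shows that, on POSIX systems, a read attempted on an already-closed Python socket fails with EBADF, which is the error a late watcher callback would hit. The helper name `recv_after_close` is hypothetical.

```python
import errno
import socket


def recv_after_close():
    """Return the errno raised when reading from an already-closed socket."""
    a, b = socket.socketpair()
    a.close()  # stands in for close() running on another thread
    try:
        a.recv(1)  # stands in for a late handle_read() on the closed fd
    except OSError as exc:
        return exc.errno
    finally:
        b.close()
    return None


if __name__ == "__main__":
    print(errno.errorcode.get(recv_after_close()))
```

On Linux and macOS this is expected to report EBADF; other platforms may use a different error code.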

The fix applies the following changes:

- Early-return guards at the top of handle_read() and handle_write() that check is_closed/is_defunct before touching the socket
- Secondary is_closed/is_defunct checks in the error handlers, to catch the race when close() happens between watcher dispatch and the syscall
- Peer disconnect detection (EBADF, ECONNRESET, ENOTCONN, etc.) that calls close() cleanly instead of defunct()
- last_error preservation in close() when connected_event is unset, preventing factory() from returning dead connections
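The changes listed above could look roughly like the following sketch. It is not the patch itself: `SketchConnection`, `PEER_DISCONNECT_ERRNOS`, the `received` list, and the placeholder `ConnectionError` are hypothetical stand-ins, and the handling of last_error in close() is only one possible reading of the description. Only handle_read() is shown; handle_write() would follow the same pattern.

```python
import errno
import socket
import threading

# Only the errnos named in the description; the real list may be longer ("etc.").
PEER_DISCONNECT_ERRNOS = frozenset({errno.EBADF, errno.ECONNRESET, errno.ENOTCONN})


class SketchConnection:
    """Toy connection, not the driver's class: illustrates the guard pattern only."""

    def __init__(self, sock):
        self._socket = sock
        self.is_closed = False
        self.is_defunct = False
        self.last_error = None
        self.connected_event = threading.Event()
        self.received = []

    def close(self):
        if self.is_closed:
            return
        self.is_closed = True
        self._socket.close()
        if not self.connected_event.is_set():
            # Assumption: record an error so a caller waiting on
            # connected_event does not treat this connection as usable.
            if self.last_error is None:
                self.last_error = ConnectionError("closed before connect completed")
            self.connected_event.set()

    def defunct(self, exc):
        self.is_defunct = True
        self.last_error = exc
        self.close()

    def handle_read(self):
        # Guard 1: the watcher fired after close(); do not touch the socket.
        if self.is_closed or self.is_defunct:
            return
        try:
            data = self._socket.recv(4096)
        except OSError as exc:
            # Guard 2: close() ran between watcher dispatch and the syscall.
            if self.is_closed or self.is_defunct:
                return
            if exc.errno in PEER_DISCONNECT_ERRNOS:
                self.close()  # clean shutdown instead of defunct()
            else:
                self.defunct(exc)
            return
        if data:
            self.received.append(data)
        else:
            self.close()  # empty read: the peer closed the connection
```

In this sketch a late callback becomes a no-op, and an EBADF raised mid-race ends in a clean close() rather than a defunct connection.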

Fixes: #614

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
I added relevant tests for new features and bug fixes.
All commits compile, pass static checks and pass test.
PR description sums up the changes and reasons why they should be introduced.
I have provided docstrings for the public items that I want to introduce.
I have adjusted the documentation in ./docs/source/.
I added appropriate Fixes: annotations to PR description.

…tion

When close() is called from one thread, it sets is_closed=True and closes the socket immediately. However, libev watchers are stopped asynchronously in _loop_will_run(), so handle_read()/handle_write() can still fire on the now-closed fd, causing EBADF errors that surface as ConnectionShutdown('Bad file descriptor') and prevent reconnection.

So, fix it applying following changes:

- Early-return guards at the top of handle_read() and handle_write() that check is_closed/is_defunct before touching the socket
- Secondary is_closed/is_defunct checks in error handlers to catch the race when close() happens between watcher dispatch and syscall
- Peer disconnect detection (EBADF, ECONNRESET, ENOTCONN, etc.) that calls close() cleanly instead of defunct()
- last_error preservation in close() when connected_event is unset, preventing factory() from returning dead connections

Fixes: scylladb#614

vponomaryov · 2026-05-19T13:17:53Z

Ref: https://scylladb.atlassian.net/browse/SCT-83

vponomaryov wants to merge 1 commit into scylladb:master from vponomaryov:fix-race-issue-614
