Skip to content

Commit c773a8f

Browse files
udev/resource: fix mux device race condition leading to outdated control/disk paths
When a udev event is retrieved from the UdevManager's queue, it is matched against each resource. If a match is found, the resource's device is updated (on "add"/"change"/"move" events) or removed (on "unbind"/"remove" events). Normally the `avail` attribute is set afterwards, but not for the USBSDMuxDevice and USBSDWireDevice: those devices prevent their USBResource super class from setting the `avail` attribute. Instead, the devices are available depending on the availability of their disk and control paths. Then the device's `update()` method is called, which is a no-op for those devices. On `poll()`, their disk and control paths is set to None if their device is None. Otherwise, these paths are set if the device is not already available. In case of a USB reset (or fast replug) of a big USB tree, loads of udev events need to be processed. Cases were observed where `poll()` did not run between processing of a "remove" and "add" event of the same device. This leads to a race condition, following the description from above: The USBSDMuxDevice's or USBSDWireDevice's underlying device is set to None on "remove" and then set to the new device again, on "add". The `avail` attribute is not updated as intended. After that, the `poll()` method runs: the device is set and the `avail` attribute evaluates to True because the disk and control paths are still set from the time before the reset replug. If this situation happens, these paths stay invalid until the next reset/replug or restart of the labgrid exporter. These invalid paths might point to non-existent /dev/sg* and dev/sd* devices, rendering interaction with the Mux and its SD card impossible. Or, even worse, the paths point to valid /dev/sg* and dev/sd* devices of another Mux connected as observed here (note different USB devnums): $ labgrid-client -vv resources rlab/usb-2-p3/NetworkUSBSDMuxDevice Exporter 'rlab': Group 'usb-2-p3' (rlab/usb-2-p3/*): Resource 'USBSDMuxDevice' (rlab/usb-2-p3/NetworkUSBSDMuxDevice[/USBSDMuxDevice]): {'acquired': 'board1', 'avail': True, 'cls': 'NetworkUSBSDMuxDevice', 'params': {'busnum': 2, 'control_path': '/dev/sg6', 'devnum': 42, 'extra': {'proxy': '[...]', 'proxy_required': False}, 'host': 'rlab', 'model_id': 16449, 'path': '/dev/sdg', 'vendor_id': 1060}} $ labgrid-client -vv r rlab/usb-1-p4/NetworkUSBSDMuxDevice Exporter 'rlab': Group 'usb-1-p4' (rlabC-srv/c-usb-1-p4/*): Resource 'USBSDMuxDevice' (rlab/usb-1-p4/NetworkUSBSDMuxDevice[/USBSDMuxDevice]): {'acquired': 'board2', 'avail': True, 'cls': 'NetworkUSBSDMuxDevice', 'params': {'busnum': 2, 'control_path': '/dev/sg6', 'devnum': 43, 'extra': {'proxy': '[...]', 'proxy_required': False}, 'host': 'rlab', 'model_id': 16449, 'path': '/dev/sdg', 'vendor_id': 1060}} Fix this race condition by implementing the disk and control paths reset in the USBSDMuxDevice's or USBSDWireDevice's update() method, in case the underlying device is gone. The update() method is called during the udev event processing, so it does no longer matter when the poll() method is called. Fixes: 27ff02f ("udev: correct availability for USBSDMuxDevice") Fixes: 9324d32 ("Add sdwire driver") Signed-off-by: Bastian Krause <bst@pengutronix.de> [bst: cherry-picked from commit 029e60b]
1 parent fe91aab commit c773a8f

1 file changed

Lines changed: 17 additions & 13 deletions

File tree

labgrid/resource/udev.py

Lines changed: 17 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -467,15 +467,17 @@ def avail(self, prop):
467467
# paths are available.
468468
def poll(self):
469469
super().poll()
470+
if self.device is not None and not self.avail:
471+
for child in self.device.parent.children:
472+
if child.subsystem == 'block' and child.device_type == 'disk':
473+
self.disk_path = child.device_node
474+
self.control_serial = self.device.properties.get('ID_SERIAL_SHORT')
475+
476+
def update(self):
477+
super().update()
470478
if self.device is None:
471479
self.disk_path = None
472480
self.control_serial = None
473-
else:
474-
if not self.avail:
475-
for child in self.device.parent.children:
476-
if child.subsystem == 'block' and child.device_type == 'disk':
477-
self.disk_path = child.device_node
478-
self.control_serial = self.device.properties.get('ID_SERIAL_SHORT')
479481

480482
@property
481483
def path(self):
@@ -510,16 +512,18 @@ def avail(self, prop):
510512
# paths are available.
511513
def poll(self):
512514
super().poll()
515+
if self.device is not None and not self.avail:
516+
for child in self.device.children:
517+
if child.subsystem == 'block' and child.device_type == 'disk':
518+
self.disk_path = child.device_node
519+
elif child.subsystem == 'scsi_generic':
520+
self.control_path = child.device_node
521+
522+
def update(self):
523+
super().update()
513524
if self.device is None:
514525
self.control_path = None
515526
self.disk_path = None
516-
else:
517-
if not self.avail:
518-
for child in self.device.children:
519-
if child.subsystem == 'block' and child.device_type == 'disk':
520-
self.disk_path = child.device_node
521-
elif child.subsystem == 'scsi_generic':
522-
self.control_path = child.device_node
523527

524528
@property
525529
def path(self):

0 commit comments

Comments
 (0)