
Agent restart failed after concurrent update-core runs #7877

@DavidePrincipi

Description

Steps to reproduce

  • On an NS8 cluster worker node, start update-core automatically (scheduler)
  • Within ~1 minute, start a second update-core manually on the same node
  • Wait for update-core to send shutdown to all rootless module agents
  • Observe agent.service restart failures
  • Later, run cluster update-module and observe that it aborts because the module agents are missing

Expected behavior

Rootless module agents restart correctly after core update, and update-core is serialized (no concurrent runs).
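
One way to get the serialization described above is an exclusive, non-blocking file lock taken at the start of the run. The following is only a minimal sketch of that idea, not how update-core is implemented: the lock path, the `exclusive_run` helper and the `AlreadyRunning` exception are hypothetical names chosen for illustration.

```python
import fcntl
import os
from contextlib import contextmanager

# Hypothetical lock path, chosen for this sketch; not taken from NS8 core.
LOCK_PATH = "/tmp/update-core.lock"


class AlreadyRunning(RuntimeError):
    """Raised when another run already holds the lock."""


@contextmanager
def exclusive_run(lock_path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            # LOCK_NB makes the call fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise AlreadyRunning(lock_path) from exc
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)
```

With such a guard, a second run started while the first is still active would fail fast with `AlreadyRunning` instead of racing with it. Whether the second run should abort or wait for the first to finish is a design choice this sketch does not settle.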

Actual behavior

All rootless module agents stop and remain down until manual intervention (systemctl --user daemon-reload, followed by a restart). Cluster update-module then aborts with Client "module/<module>" was not found.
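
The manual intervention can be sketched as a small helper, shown here under stated assumptions: that the unit to restart is agent.service (the name that appears in the log excerpt), and that the commands are run inside the affected module user's session. How to enter that session is not covered here. The `recovery_commands` and `recover` names are hypothetical.

```python
import subprocess

# Assumption: the unit to restart is the one named in the log excerpt.
AGENT_UNIT = "agent.service"


def recovery_commands(unit=AGENT_UNIT):
    """Return the commands, in order, for the manual recovery."""
    return [
        ["systemctl", "--user", "daemon-reload"],
        ["systemctl", "--user", "restart", unit],
    ]


def recover(dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for cmd in recovery_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)
```

Calling `recover()` with the default `dry_run=True` only prints the two commands, which makes it easy to review them before running anything on a node.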

Relevant excerpt:

Feb 13 03:29:40.109771 <node> agent@node[PID]: Signal "user defined signal 1" caught: shutdown started.
...
Feb 13 03:29:40.432107 <node> systemd[USER_PID]: agent.service: Failed to load environment files: No such file or directory
Feb 13 03:29:40.432127 <node> systemd[USER_PID]: agent.service: Failed to run 'start' task: No such file or directory
...
Feb 13 03:29:41.666918 <node> systemd[USER_PID]: agent.service: Start request repeated too quickly.
Feb 13 03:29:41.666937 <node> systemd[USER_PID]: agent.service: Failed with result 'resources'.

Feb 15 08:05:28 <node> agent@cluster[PID]: agent.tasks.exceptions.TaskSubmissionCheckFailed: Client "module/<module>" was not found
Feb 15 08:05:28 <node> agent@cluster[PID]: task/cluster/<task>: action "update-module" status is "aborted" (1) at step 50update
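
To check whether another node shows the same failure, the messages in the excerpt above can be counted in a saved journal dump. This is a minimal sketch: the signature labels and the `classify` function are hypothetical names, and the patterns are copied from the log lines quoted in this report.

```python
import re

# Patterns taken from the excerpt; the dictionary keys are illustrative labels.
SIGNATURES = {
    "missing_environment_file": re.compile(
        r"agent\.service: Failed to load environment files"
    ),
    "start_rate_limited": re.compile(
        r"agent\.service: Start request repeated too quickly"
    ),
    "module_client_missing": re.compile(
        r'TaskSubmissionCheckFailed: Client "module/[^"]*" was not found'
    ),
}


def classify(lines):
    """Count how many lines match each known failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

A non-zero count for all three signatures would suggest the same sequence as reported here: the environment file is missing at restart, systemd gives up after repeated start attempts, and a later update-module cannot find the module agent.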

Components

NS8 core 3.17.1 (from logs: ghcr.io/nethserver/core:3.17.1).
Modules affected: all rootless module agents on the worker (not a single module).

Thanks to @nrauso

Metadata

    Labels

    verified: All test cases were verified successfully

    Type

    Bug

    Projects

    Status

    Verified
