feat(service): emit resume-on-restart wrapper alongside service unit #7
Merged
crisandrews merged 1 commit into crisandrews:main on Apr 16, 2026
/agent:service install now writes a small bash shim at
~/.clawcode/service/<slug>-resume-wrapper.sh and points the unit's
ExecStart (or plist ProgramArguments) at it. The wrapper runs
`claude --continue` so the service rehydrates the prior session on
restart, preserving conversation history instead of starting fresh.
This is especially useful with the new opt-in watchdog (v1.3.0) —
when the watchdog detects a silent failure and restarts the service,
the replacement process resumes the prior conversation instead of
losing all context.
Behavior (all inside the wrapper):
- Runs `claude --continue` by default.
- Falls back to a plain start when there is no prior session jsonl
(first boot — --continue would error).
- Falls back to a plain start when the last session is more than 7
days old. Long-stale resumes can behave oddly and in practice the
user usually wants a fresh start after a week-off anyway.
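The three cases above can be sketched as a pure decision function. This is a minimal illustration, not the wrapper's actual code: the shipped wrapper is a bash shim, and the names `chooseLaunchMode` and `buildArgv`, along with the position of `--continue` in the argument list, are assumptions made for the example.

```typescript
// Sketch only: function names and argument ordering are hypothetical.
type LaunchMode = "continue" | "fresh";

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

/**
 * Decide whether to resume the prior session.
 * lastSessionMtimeMs is the modification time of the newest session jsonl,
 * or null when no prior session exists (first boot).
 */
function chooseLaunchMode(
  lastSessionMtimeMs: number | null,
  nowMs: number,
  maxAgeMs: number = SEVEN_DAYS_MS,
): LaunchMode {
  if (lastSessionMtimeMs === null) return "fresh"; // --continue would error
  if (nowMs - lastSessionMtimeMs > maxAgeMs) return "fresh"; // stale session
  return "continue";
}

/** Build the claude argument list; only --continue differs between modes. */
function buildArgv(mode: LaunchMode, extraArgs: string[]): string[] {
  const base = ["--dangerously-skip-permissions", ...extraArgs];
  return mode === "continue" ? ["--continue", ...base] : base;
}
```

Keeping the decision separate from the argument list mirrors the behavior described here: both fallbacks run the same command and differ from the default only by the absence of `--continue`.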
Opt out via `service_plan({ action: 'install', resumeOnRestart: false })`.
That returns the pre-change plan — unit/plist invokes `claude` directly.
Changes:
- lib/service-generator.ts
  - New ExtraFile type + ServicePlan.extraFiles: lets the plan
    request auxiliary files be written before install commands run.
    Used for the wrapper; general enough for future auxiliary files.
  - generateResumeWrapper(): pure generator for the bash shim.
    Takes claudeBin + workspace + extraArgs; bakes them into a
    self-contained script.
  - resumeWrapperPath(slug): canonical per-slug install path
    under ~/.clawcode/service/.
  - generateSystemdUnit + generatePlist: new skipDefaultArgs flag.
    When the wrapper is in use it already includes
    --dangerously-skip-permissions + extraArgs, so the unit/plist
    point at the wrapper bare.
  - ServiceOptions.resumeOnRestart (default true).
  - uninstall appends a best-effort `rm -f` for the wrapper so
    teardown leaves no stragglers.
- skills/service/SKILL.md: install flow writes each plan.extraFiles
  entry (mkdir parent, Write content, chmod to declared mode) after
  the settings.json pre-check and before writing the unit/plist.
- docs/service.md: new "Resume-on-restart wrapper" section explaining
  the default behavior, fallbacks, and the opt-out.
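The extraFiles step described for skills/service/SKILL.md (mkdir parent, write content, chmod to declared mode) can be pictured in code. This is only a sketch: the PR names `ExtraFile` and `ServicePlan.extraFiles` but does not list their fields, so the `path`, `content`, and `mode` fields, the `FileOps` interface, and `writeExtraFiles` are all hypothetical. In the skill itself these steps are carried out by the install flow rather than by a helper like this.

```typescript
// Sketch only: field names, FileOps, and writeExtraFiles are assumptions.
interface ExtraFile {
  path: string;    // absolute destination path
  content: string; // full file body
  mode: number;    // permission bits, e.g. 0o755 for an executable shim
}

interface ServicePlan {
  commands: string[];
  extraFiles?: ExtraFile[]; // absent on the opt-out path
}

/** Filesystem operations are injected so the ordering can be checked without touching disk. */
interface FileOps {
  mkdirp(dir: string): void;
  write(path: string, content: string): void;
  chmod(path: string, mode: number): void;
}

/** Parent directory of an absolute POSIX path. */
function parentDir(path: string): string {
  const i = path.lastIndexOf("/");
  if (i < 0) return ".";
  return i === 0 ? "/" : path.slice(0, i);
}

/** Write every auxiliary file before any install command runs; returns the paths written. */
function writeExtraFiles(plan: ServicePlan, ops: FileOps): string[] {
  const written: string[] = [];
  for (const file of plan.extraFiles ?? []) {
    ops.mkdirp(parentDir(file.path));
    ops.write(file.path, file.content);
    ops.chmod(file.path, file.mode);
    written.push(file.path);
  }
  return written;
}
```

With Node, the three operations map onto `mkdirSync(dir, { recursive: true })`, `writeFileSync`, and `chmodSync` from `node:fs`. A plan without `extraFiles` writes nothing, which matches the opt-out case.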
Sanity-tested via `bun -e` on the generator directly:
install / linux -> wrapper emitted, unit points at it
install / darwin -> wrapper emitted, plist ProgramArguments points at it
install / opt-out -> stock claude invocation, no extraFiles
uninstall -> wrapper-cleanup command appended
This was referenced Apr 15, 2026
Heads-up: this PR should land together with or after #9. #7 alone emits a resume-wrapper whose …
JD2005L added a commit to JD2005L/ClawCode that referenced this pull request Apr 17, 2026
Follow-up to crisandrews#16 with a clarified rationale for keeping the env var.
The PTY wrap from crisandrews#9 fixes the SessionEnd-hook failure that turns graceful exit into code 1 and causes restart churn. DISABLE_AUTOUPDATER=1 addresses a different problem: Claude Code's in-process auto-updater regenerates files it manages mid-run, including the resume-on-restart wrapper script generated by crisandrews#7. A long-running daemon rewriting its own ExecStart target while live is a file-integrity issue, separate from the crash loop, and the PTY wrap does nothing for it.
On the "don't modify Claude Code internals" principle from crisandrews#16: the principle stands, but DISABLE_AUTOUPDATER is a documented env var Claude Code exposes for this use case. Setting it is a supported interface, not a monkey-patch. The restored comment names the env var and the specific file-regeneration scenario inline so future readers see the intent.
Also relevant to crisandrews#12 (/agent:update skill): the skill's explicit-manual-update flow assumes in-process auto-update is off in service mode. With auto-update running again, the skill competes with an updater that may silently rewrite service files behind the operator.
crisandrews added a commit that referenced this pull request Apr 17, 2026
Supersedes #17 (conflict resolution against post-#8 main).
JD's #17 targeted pre-#8 main so the block collided with the newly-added HOME/TERM Environment lines. This commit places the same 8-line block immediately after HOME/TERM so systemd sees all three env vars side by side.
The rationale (verbatim from #17, which is correct): PTY wrap and DISABLE_AUTOUPDATER fix different problems:
- PTY wrap: SessionEnd hook needs a controlling terminal to spawn /bin/sh at graceful shutdown.
- DISABLE_AUTOUPDATER: prevents Claude Code's in-process auto-updater from regenerating daemon-relevant files (including the resume-on-restart wrapper from #7) while the daemon is running.
#16 incorrectly treated DISABLE_AUTOUPDATER as redundant defense-in-depth against the crash loop. It is actually addressing the file-integrity scenario, which the PTY wrap does not cover. Setting DISABLE_AUTOUPDATER is also using a documented Claude Code env var, not monkey-patching internals.
macOS plist remains unchanged (JD's #17 scope; parity can follow once the systemd-side default settles).
Co-Authored-By: JD2005L <34459020+JD2005L@users.noreply.github.com>
crisandrews added a commit that referenced this pull request Apr 17, 2026
Supersedes #17 (conflict resolution against post-#8 main).
JD's #17 targeted pre-#8 main so the block collided with the newly-added HOME/TERM Environment lines. This commit places the same 8-line block immediately after HOME/TERM so systemd sees all three env vars side by side.
The rationale (verbatim from #17, which is correct): PTY wrap and DISABLE_AUTOUPDATER fix different problems:
- PTY wrap: SessionEnd hook needs a controlling terminal to spawn /bin/sh at graceful shutdown.
- DISABLE_AUTOUPDATER: prevents Claude Code's in-process auto-updater from regenerating daemon-relevant files (including the resume-on-restart wrapper from #7) while the daemon is running.
#16 incorrectly treated DISABLE_AUTOUPDATER as redundant defense-in-depth against the crash loop. It is actually addressing the file-integrity scenario, which the PTY wrap does not cover. Setting DISABLE_AUTOUPDATER is also using a documented Claude Code env var, not monkey-patching internals.
macOS plist remains unchanged (JD's #17 scope; parity can follow once the systemd-side default settles).
Co-authored-by: crisandrews <crisandrews@users.noreply.github.com>
Co-authored-by: JD2005L <34459020+JD2005L@users.noreply.github.com>
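The side-by-side environment block described in the commit message above might render along these lines. This is a rough sketch, not the generator's code: `renderEnvironmentLines` is a hypothetical helper, and the HOME and TERM values are placeholders, since the commit message names the variables but not their values.

```typescript
// Sketch only: helper name and the HOME/TERM values are placeholders.
function renderEnvironmentLines(env: Record<string, string>): string[] {
  return Object.entries(env).map(([key, value]) => {
    const assignment = `${key}=${value}`;
    // systemd splits Environment= on whitespace, so quote assignments containing spaces.
    return /\s/.test(assignment)
      ? `Environment="${assignment}"`
      : `Environment=${assignment}`;
  });
}

const environmentLines = renderEnvironmentLines({
  HOME: "/home/agent",      // placeholder value
  TERM: "xterm-256color",   // placeholder value
  DISABLE_AUTOUPDATER: "1", // value stated in the commit message
});
```

Emitting the three variables from one ordered map keeps them adjacent in the unit file, which is the layout the commit message asks for.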
crisandrews added a commit that referenced this pull request Apr 17, 2026
Summary of changes in this release (full detail in CHANGELOG.md):
Added
- Resume-on-restart wrapper for service mode (#7)
- Service hardening defaults: HOME/TERM env, StartLimitBurst guard, persistent log path (#8)
- /agent:update skill + heartbeat version-check with day-gate and per-version dedupe (#12)
Fixed
- WORKSPACE resolution so memory_search hits user's project dir, not plugin dir (#6, closes #5)
- Linux systemd crash loop after Claude Code auto-updates mid-run: PTY wrap in ExecStart + DISABLE_AUTOUPDATER=1 for file-integrity (#9, #17/#18)
- macOS launchd PTY wrap parity (#16)
- Cross-user /agent:import discovery + post-import path sanity check (#10)
Performance
- reconcile-crons.sh fast-path on steady-state sessions (#11)
Thanks to @JD2005L for the whole batch.
What
`/agent:service install` now writes a small bash shim at `~/.clawcode/service/<slug>-resume-wrapper.sh` and points the unit's `ExecStart` (or plist `ProgramArguments`) at it. The wrapper runs `claude --continue` so the service rehydrates the prior session on restart, preserving conversation history instead of starting fresh.
Why
Pairs naturally with the new opt-in watchdog from v1.3.0: when an external probe detects a silent failure and triggers a restart, the replacement process resumes the prior conversation instead of losing all context. Previously, every restart — whether from a crash, a deploy, or a watchdog-initiated recovery — meant a clean slate. For agents running long conversations on messaging channels (Telegram, WhatsApp, etc.), context loss across restarts is the kind of papercut that accumulates fast.
The motivating case for me: on my fork, a stalled inference → watchdog kills the process → service restarts → user's next message lands in a brand-new session that has no idea what they were just talking about. With this wrapper the user barely notices the restart happened.
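Wrapper behavior
- Runs `claude --continue` by default.
- Falls back to a plain start when there is no prior session jsonl (first boot; `--continue` would otherwise error).
- Falls back to a plain start when the last session is more than 7 days old.
All three paths exec the same underlying `claude --dangerously-skip-permissions <extraArgs>`; the wrapper only decides whether `--continue` is on the command line.
Opt out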
Call `service_plan({ action: 'install', resumeOnRestart: false })`. This returns the pre-change plan: unit/plist invokes `claude` directly, no wrapper is written, no `extraFiles` in the plan. Intended for users who explicitly want a fresh session on every restart.
Changes
- `lib/service-generator.ts`
  - `ExtraFile` type + `ServicePlan.extraFiles`: lets the plan request auxiliary files be written before install commands run. Used here for the wrapper; general enough for future auxiliary files (e.g. a loader, a pre-flight probe) without needing to extend `ServicePlan` again.
  - `generateResumeWrapper()`: pure generator for the bash shim. Takes `claudeBin` + `workspace` + `extraArgs`, bakes them into a self-contained script.
  - `resumeWrapperPath(slug)`: canonical per-slug install path under `~/.clawcode/service/`.
  - `generateSystemdUnit` + `generatePlist`: new `skipDefaultArgs` flag. When the wrapper is in use it already embeds `--dangerously-skip-permissions` + extraArgs, so the unit/plist point at the wrapper bare without any args.
  - `ServiceOptions.resumeOnRestart` (default `true`).
  - `uninstall` action appends a best-effort `rm -f` for the wrapper so teardown leaves no stragglers.
- `skills/service/SKILL.md`
  - Install flow writes each `plan.extraFiles` entry (mkdir parent, Write content, chmod to declared mode) after the `settings.json` pre-check from v1.3.0 and before writing the unit/plist.
- `docs/service.md`
  - New "Resume-on-restart wrapper" section explaining the default behavior, fallbacks, and the opt-out.
Verification
Sanity-tested the generator directly via `bun -e`:
install / linux -> wrapper emitted at `~/.clawcode/service/<slug>-resume-wrapper.sh`, unit `ExecStart` points at it
install / darwin -> wrapper emitted, plist `ProgramArguments` points at it
install / resumeOnRestart: false -> stock `claude` invocation, no `extraFiles`
uninstall -> wrapper-cleanup `rm -f` appended to commands
Also verified end-to-end on my agent: killed the service while mid-conversation, waited for auto-restart, observed that the new process picked up with full prior context.
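For readers who want a feel for what a pure generator of this kind looks like, here is a minimal sketch. It is not the code from `lib/service-generator.ts`: the PR names `generateResumeWrapper()` and `resumeWrapperPath()` without showing their bodies, so the `WrapperInput` fields, the `sessionDir` and `homeDir` parameters, the `shQuote` helper, and every line of the emitted bash are assumptions.

```typescript
// Sketch only: signatures, sessionDir/homeDir, and the emitted shell are hypothetical.
interface WrapperInput {
  claudeBin: string;
  workspace: string;
  extraArgs: string[];
  sessionDir: string; // assumed: directory holding prior session *.jsonl files
}

/** Single-quote a value so it can be embedded safely in a bash script. */
function shQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

/** Per-slug wrapper location under <home>/.clawcode/service/. */
function resumeWrapperPath(slug: string, homeDir: string): string {
  return `${homeDir}/.clawcode/service/${slug}-resume-wrapper.sh`;
}

/** Pure function: same input always yields the same script text. */
function generateResumeWrapper(input: WrapperInput): string {
  const bin = shQuote(input.claudeBin);
  const args = ["--dangerously-skip-permissions", ...input.extraArgs.map(shQuote)].join(" ");
  return [
    "#!/usr/bin/env bash",
    "set -u",
    `cd ${shQuote(input.workspace)} || exit 1`,
    "# Newest session file, or empty when none exists (first boot).",
    `latest="$(ls -t ${shQuote(input.sessionDir)}/*.jsonl 2>/dev/null | head -n 1)"`,
    "# Resume only when a session exists and was modified within 7 days (10080 minutes).",
    `if [ -n "$latest" ] && [ -n "$(find "$latest" -mmin -10080 2>/dev/null)" ]; then`,
    `  exec ${bin} --continue ${args}`,
    "fi",
    `exec ${bin} ${args}`,
    "",
  ].join("\n");
}
```

Because the function only returns a string, checks like the ones listed under Verification can assert on the script text without installing anything.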
Interactions with the watchdog recipe
No code coupling — they are orthogonal features. The watchdog detects "service is up but silently broken" and issues a restart; the wrapper controls what happens on restart regardless of who triggered it. Running both gives the best recovery experience but neither requires the other.
Not changed
- `generatePlist` on macOS still relies on launchd's `ProcessType=Interactive` to provide a TTY; no systemd-style `script(1)` wrapping here.
- `pkill` ExecStartPre from v1.3.0 left alone; remains the right answer for the multi-instance race.
- … `/agent:service install`.
Alternative considered
Embedding the `--continue` logic directly in `generateSystemdUnit` via `ExecStartPre` + conditional `ExecStart`. Rejected because launchd has no `ExecStartPre` equivalent and we'd end up with two different code paths. A wrapper script generalizes cleanly across both platforms.
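To illustrate why a wrapper keeps the two platforms on one code path, the sketch below picks a single launch target and renders it for both systemd and launchd. All names here (`LaunchOptions`, `launchTarget`, `systemdExecStart`, `plistProgramArguments`) are hypothetical, the direct invocation's default arguments are assumed from the `skipDefaultArgs` description, and details such as the PTY wrap and systemd argument quoting are left out.

```typescript
// Sketch only: helper names and option fields are hypothetical.
interface LaunchOptions {
  claudeBin: string;
  extraArgs: string[];
  wrapperPath?: string; // set when resumeOnRestart is enabled
}

interface LaunchTarget {
  program: string;
  args: string[];
}

/** With a wrapper, the args are already baked into the script, so none are passed. */
function launchTarget(opts: LaunchOptions): LaunchTarget {
  if (opts.wrapperPath !== undefined) {
    return { program: opts.wrapperPath, args: [] };
  }
  return {
    program: opts.claudeBin,
    args: ["--dangerously-skip-permissions", ...opts.extraArgs],
  };
}

/** systemd rendering (no quoting of args containing spaces in this sketch). */
function systemdExecStart(target: LaunchTarget): string {
  return `ExecStart=${[target.program, ...target.args].join(" ")}`;
}

function xmlEscape(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

/** launchd rendering of the same target as a plist fragment. */
function plistProgramArguments(target: LaunchTarget): string {
  const items = [target.program, ...target.args].map(
    (arg) => `  <string>${xmlEscape(arg)}</string>`,
  );
  return ["<key>ProgramArguments</key>", "<array>", ...items, "</array>"].join("\n");
}
```

The resume decision lives entirely in the script that `wrapperPath` points to, so neither renderer needs platform-specific conditionals.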