feat(service): harden defaults — env, crash-loop guard, persistent logs#8
Open
JD2005L wants to merge 1 commit into crisandrews:main
Three orthogonal install hardening items that stand on their own:

1. Inject HOME and TERM explicitly. systemd user services and launchd both start with a largely empty environment. Anything in Claude Code or a plugin that reads $HOME (e.g. resolving `~`), or probes $TERM (color output, TUI detection, some node libraries), behaves unpredictably when those are unset. Setting HOME=<os.homedir()> and TERM=xterm-256color at the unit / plist level is a cheap, universally safe default.

2. Crash-loop guard on systemd (StartLimitIntervalSec + StartLimitBurst). With `Restart=always` and `RestartSec=10`, a deterministic boot-time error (bad config, missing binary, malformed extraArgs) churns forever at 10 s intervals, flooding logs and journald. Adding `StartLimitIntervalSec=300` + `StartLimitBurst=5` tells systemd to give up after 5 restarts in 5 minutes so the failure surfaces in `systemctl status` instead of hiding in noise. No behavior change on healthy services. macOS launchd already throttles respawns on its own (ThrottleInterval; ExitTimeOut is the related stop-timeout key), so no plist change is needed.

3. Persistent log path: /tmp → ~/.clawcode/logs/<slug>.log. /tmp is wiped on reboot, so a service that auto-restarts through a reboot loses the log that explains why it was failing. Moving under ~/.clawcode/logs/ matches where the service's other per-agent state lives (~/.claude/... for Claude Code, ~/.clawcode/ for clawcode scaffolding) and keeps logs around for post-mortem. The install plan now includes a `Create log directory` command up front, because systemd's `append:` and launchd's StandardOutPath do NOT create missing parent dirs and the service silently refuses to start without them.

Docs updated:
- The example systemd unit and plist in `docs/service.md` reflect the new env lines, crash-loop guard, and log path.
- The "Logs" section mentions the persistent default and why the log directory is created at install time.
- The restart-loop troubleshooting row now points at the new log location and mentions StartLimitBurst so users know to check `systemctl status` when the service has given up.

Not changed:
- `pkill` ExecStartPre (v1.3.0 already has the right version).
- Restart policy kept as `always` + `RestartSec=10`. These are opinionated and the defaults are reasonable; the crash-loop guard is the right fix for their downside.
- macOS launchd's EnvironmentVariables is the plist-level equivalent of systemd's `Environment=` lines; added for parity.
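The three items above can be pictured as a small generator. The following is only a sketch under assumptions: `defaultLogPath` and `generateSystemdUnit` are names taken from this PR, but the `UnitOptions` shape, the `Description=` text, and the `[Install]` target are hypothetical and may not match the real `lib/service-generator.ts`.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical option names; the real generator's signature may differ.
interface UnitOptions {
  slug: string;
  execStart: string;
  home?: string; // defaults to os.homedir()
}

// Persistent default log path: ~/.clawcode/logs/<slug>.log
// path.posix is used because systemd and launchd targets use POSIX paths.
function defaultLogPath(slug: string, home: string = os.homedir()): string {
  return path.posix.join(home, ".clawcode", "logs", `${slug}.log`);
}

function generateSystemdUnit(opts: UnitOptions): string {
  const home = opts.home ?? os.homedir();
  const logPath = defaultLogPath(opts.slug, home);
  return [
    "[Unit]",
    `Description=clawcode agent (${opts.slug})`, // illustrative text only
    // Crash-loop guard: start-rate limiting is configured in [Unit].
    "StartLimitIntervalSec=300",
    "StartLimitBurst=5",
    "",
    "[Service]",
    // Explicit environment, since the service starts with very little set.
    `Environment=HOME=${home}`,
    "Environment=TERM=xterm-256color",
    `ExecStart=${opts.execStart}`,
    "Restart=always",
    "RestartSec=10",
    // append: does not create the parent directory; it must already exist.
    `StandardOutput=append:${logPath}`,
    `StandardError=append:${logPath}`,
    "",
    "[Install]",
    "WantedBy=default.target", // assumed target for a user service
  ].join("\n");
}

console.log(
  generateSystemdUnit({ slug: "demo", execStart: "/usr/bin/true", home: "/home/u" }),
);
```

Passing `home` explicitly, as in the usage line, keeps the output deterministic, which is convenient when checking the generator from a one-off script.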
What
Three orthogonal install-hardening items that each stand on their own:
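1. Inject `HOME` and `TERM` explicitly into both the systemd unit and the launchd plist.
2. Crash-loop guard on systemd: `StartLimitIntervalSec=300` + `StartLimitBurst=5`.
3. Persistent log path: `/tmp/clawcode-<slug>.log` → `~/.clawcode/logs/<slug>.log`, with the install plan creating the directory.

Why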
1. Explicit `HOME` + `TERM`
systemd user services and launchd both start with a largely empty environment. Anything in Claude Code or a plugin that reads `$HOME` (resolving `~`) or probes `$TERM` (color output, TUI detection, several Node libraries) behaves unpredictably when those are unset. Some combinations fail loudly; more often they fail subtly, e.g. a skill that works interactively and then no-ops under the service with no clear error. Setting them at the unit/plist level is cheap and universally safe.

2. Crash-loop guard
With `Restart=always` and `RestartSec=10`, a deterministic boot-time error (bad config, missing binary, malformed `extraArgs`) churns forever at 10 s intervals, flooding logs and journald. `StartLimitIntervalSec=300` + `StartLimitBurst=5` tells systemd to give up after 5 restarts in 5 minutes so the failure surfaces in `systemctl status` instead of hiding in noise. No change on healthy services.
macOS launchd already throttles respawns on its own (`ThrottleInterval`; `ExitTimeOut` is the related stop-timeout key), so no plist change is needed for this item.

3. Persistent log path
`/tmp` is wiped on reboot, so a service that auto-restarts through a reboot loses the log that explains why it was failing in the first place. Moving to `~/.clawcode/logs/` keeps logs available for post-mortem and matches where the rest of the agent's per-user state sits.
systemd's `append:` and launchd's `StandardOutPath` do NOT create missing parent dirs and the service will silently refuse to start without them, so the install plan now includes a "Create log directory" command up front.

Changes
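`lib/service-generator.ts`
- `defaultLogPath` now returns `~/.clawcode/logs/<slug>.log` (was `/tmp/clawcode-<slug>.log`).
- `generateSystemdUnit` emits `Environment=HOME=...`, `Environment=TERM=xterm-256color`, `StartLimitIntervalSec=300`, `StartLimitBurst=5`.
- `generatePlist` emits `EnvironmentVariables` with `HOME` and `TERM`.
- `buildPlan` install action prepends a "Create log directory" command.

`docs/service.md`
- The example systemd unit and plist reflect the new env lines, crash-loop guard, and log path.
- The "Logs" section mentions the persistent default and why the log directory is created at install time.
- The restart-loop troubleshooting row points at the new log location and mentions `StartLimitBurst` so users know to check `systemctl status` when systemd has given up.

Verification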
Ran `bun -e` on the generator directly to inspect the systemd unit output. The plist output was verified the same way: `EnvironmentVariables` dict with `HOME` and `TERM`, persistent `StandardOutPath`.

Not changed (explicitly)
- `ExecStartPre=-/usr/bin/pkill ...` from v1.3.0 left alone; it's the right answer for the multi-instance race.
- `Restart=always` and `RestartSec=10` kept. These are opinionated and the current defaults are reasonable; the crash-loop guard is the right fix for their only real downside.
- No `script(1)`-wrap of `ExecStart`. I run this locally and it's arguably helpful for some skills that expect a PTY, but the value is niche enough that it doesn't belong in a default.

Relationship to other PRs
- `fix: resolve WORKSPACE ...`: no overlap.
- `feat(service): resume-on-restart wrapper`: no file overlap; both PRs can land in either order. If they both land, the wrapper PR's `extraFiles` writing happens in the same install flow as this PR's `Create log directory` command.
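
Since the ordering inside the install flow matters here, a rough sketch of prepending the directory step may help reviewers. Everything in it is an assumption made for illustration: the `PlanCommand` shape, the `withLogDirectory` helper, and the `clawcode-demo.service` unit name are hypothetical and are not taken from the real `buildPlan`.

```typescript
import * as path from "node:path";

// Hypothetical plan shape; the real buildPlan output may look different.
interface PlanCommand {
  label: string;
  argv: string[];
}

// Prepend the log-directory step so it runs before anything starts the service.
function withLogDirectory(logPath: string, rest: PlanCommand[]): PlanCommand[] {
  const createLogDir: PlanCommand = {
    label: "Create log directory",
    // mkdir -p is idempotent: it succeeds when the directory already exists.
    argv: ["mkdir", "-p", path.posix.dirname(logPath)],
  };
  return [createLogDir, ...rest];
}

// Example install flow; the unit name below is a placeholder.
const plan = withLogDirectory("/home/u/.clawcode/logs/demo.log", [
  { label: "Reload systemd", argv: ["systemctl", "--user", "daemon-reload"] },
  {
    label: "Enable and start",
    argv: ["systemctl", "--user", "enable", "--now", "clawcode-demo.service"],
  },
]);

for (const step of plan) {
  console.log(`${step.label}: ${step.argv.join(" ")}`);
}
```

Keeping the directory step first means any later step that writes files or starts the service can assume the log directory exists, whichever PR lands first.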