SUP-5805: Add shared plugin path, caching, and file locking for plugins when used in agent-stack-k8s #3652
Tested with an EKS cluster using an EFS-backed PVC. Job 1 checks that the shared plugin path is empty. The fix mainly shows between Jobs 2 and 3: there was no race condition between them, and only one of them downloaded the binary, because the download is guarded by the lock.
Hi @Mykematt, thanks for this. I realise it's been a while since you raised it, but it deserves consideration. Aside from the syscall dependency, which breaks the Windows build, I have reservations about introducing more file-based locking (the prior art for file locks here is in git mirrors, which, let's just say, is not my favourite part of the agent). I would like to solve caching more holistically, especially for the k8s stack, and I imagine plugins would fit into whatever solution we arrive at. I'm going to remove reviewers from this. This isn't "we're not doing this". Maybe it can be made to work well - if that's the case feel free to re-add me.
Description
Problem: In Kubernetes environments with ephemeral pods, multiple jobs starting simultaneously would redundantly download the same plugin, wasting time and bandwidth.
Solution: Implemented file locking for shared plugin storage in Kubernetes agents to prevent race conditions during plugin downloads. Added enhanced logging, including lock acquisition wait states.
Context
Linear: SUP-5805
Changes
Core Implementation (internal/job/plugin.go):
- openCachedPlugin() helper function to DRY up duplicated cached plugin handling code
- acquirePluginLock() with logging for lock acquisition wait states
- checkoutPlugin() to support shared plugin storage with file locking when BUILDKITE_PLUGINS_PATH_INCLUDES_AGENT_NAME=false
Testing
- Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
- Code is formatted (with go tool gofumpt -extra -w .)
Disclosures / Credits
I consulted Claude for potential approaches, then wrote the implementation myself