fix: prefix custom-endpoint model refs with endpoint/ when model ID contains slash #102

Draft
usize wants to merge 1 commit into sallyom:main from usize:fix/custom-endpoint-model-routing
Conversation

usize commented Apr 6, 2026

Summary

  • Fixes normalizeModelRef and normalizeProviderModelRef in k8s-helpers.ts to always prefix custom-endpoint model IDs with endpoint/, even when the model ID already contains a / (e.g. google/gemma-4-26B-A4B-it)
  • Without this fix, the gateway treats the first segment as a provider name instead of routing to the endpoint provider
  • Adds regression tests covering single-agent, double-prefix, and full config rendering scenarios
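
The behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from k8s-helpers.ts: the function name `normalizeModelRefSketch`, its two-argument signature, and the pass-through for other providers are assumptions made for the example.

```typescript
// Hypothetical, simplified stand-in for normalizeModelRef in k8s-helpers.ts.
// The real signature and the handling of other providers are assumptions here.
const ENDPOINT_PREFIX = "endpoint/";

function normalizeModelRefSketch(inferenceProvider: string, modelId: string): string {
  if (inferenceProvider === "custom-endpoint") {
    // Check the provider first, so an ID containing "/" is still prefixed.
    // The startsWith guard avoids producing "endpoint/endpoint/...".
    return modelId.startsWith(ENDPOINT_PREFIX) ? modelId : ENDPOINT_PREFIX + modelId;
  }
  // Assumption: in this sketch, other providers keep the ID unchanged.
  return modelId;
}

console.log(normalizeModelRefSketch("custom-endpoint", "google/gemma-4-26B-A4B-it"));
// prints "endpoint/google/gemma-4-26B-A4B-it"
```

Calling the sketch on an already prefixed ID returns it unchanged, which is the double-prefix case the regression tests are said to cover.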

Fixes #93

Test plan

  • npm run build passes
  • npm test passes (280/280, including 3 new regression tests)

Generated with agent.sh

… ID contains slash (sallyom#93)

When inferenceProvider is 'custom-endpoint', model IDs like
'google/gemma-4-26B-A4B-it' were passed through unprefixed because
normalizeModelRef and normalizeProviderModelRef had an early return
for IDs containing '/'. The gateway then parsed 'google/' as a
provider prefix, routing to a nonexistent 'google' provider.

Now both functions always prefix with 'endpoint/' for custom-endpoint
configs, producing 'endpoint/google/gemma-4-26B-A4B-it'. The
provider's own models[].id keeps the raw model ID for the vLLM API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
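
To illustrate the split the commit message describes between the routed model ref and the provider's own model list, here is a small sketch. The config shape (`agentModel`, `providers.endpoint.models`) and the function `renderConfigSketch` are hypothetical and chosen only for the example; the real openclaw.json schema may differ.

```typescript
// Hypothetical config shape for illustration; not the actual openclaw.json schema.
interface ProviderModelSketch {
  id: string; // raw model ID, sent as-is to the backing API (e.g. vLLM)
}

interface RenderedConfigSketch {
  agentModel: string; // ref the gateway routes on, prefixed with "endpoint/"
  providers: { endpoint: { models: ProviderModelSketch[] } };
}

// Assumes rawModelId is the unprefixed ID as the user configured it.
function renderConfigSketch(rawModelId: string): RenderedConfigSketch {
  return {
    agentModel: `endpoint/${rawModelId}`,
    providers: { endpoint: { models: [{ id: rawModelId }] } },
  };
}

const rendered = renderConfigSketch("google/gemma-4-26B-A4B-it");
console.log(JSON.stringify(rendered, null, 2));
```

Under these assumptions, only the routing ref carries the `endpoint/` prefix, while the ID passed to the model server stays `google/gemma-4-26B-A4B-it`.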
jwm4 (Collaborator) commented Apr 20, 2026

Hi @usize — thanks for this PR and the well-written issue in #93. The bug report was excellent: clear repro steps, expected behavior, workaround, and environment details. That's a model for how to file an issue.

The bug you identified was real — normalizeModelRef had the includes("/") early return before the custom-endpoint check, so model IDs like google/gemma-4-26B-A4B-it were passed through unprefixed. Your diagnosis was spot on.
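
The failure mode can be reproduced with a small model of the pre-fix ordering. Both functions below are hypothetical: `normalizeBeforeFix` mimics the early return described above, and `splitProvider` assumes the gateway treats everything before the first `/` as the provider name, which is how the issue describes the routing.

```typescript
// Hypothetical model of the pre-fix behavior; names are illustrative only.
function normalizeBeforeFix(inferenceProvider: string, modelId: string): string {
  // The slash check ran first, so the custom-endpoint branch was never reached
  // for IDs such as "google/gemma-4-26B-A4B-it".
  if (modelId.includes("/")) return modelId;
  return inferenceProvider === "custom-endpoint" ? `endpoint/${modelId}` : modelId;
}

// Assumed gateway rule: the segment before the first "/" names the provider.
function splitProvider(ref: string): { provider: string; model: string } {
  const i = ref.indexOf("/");
  return i === -1
    ? { provider: "", model: ref }
    : { provider: ref.slice(0, i), model: ref.slice(i + 1) };
}

const routed = splitProvider(normalizeBeforeFix("custom-endpoint", "google/gemma-4-26B-A4B-it"));
console.log(routed.provider); // prints "google"
```

With the check order reversed, the same input would yield the ref `endpoint/google/gemma-4-26B-A4B-it`, and the assumed split would select the `endpoint` provider instead.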

It looks like this was independently fixed in the main branch around April 7–8 (commits 4d38a61 and 1195db4), which moved the custom-endpoint check above the includes("/") guard and added the startsWith double-prefix protection — essentially the same fix you have here. There are also regression tests covering this scenario now (see k8s-helpers.test.ts lines 142–170).

Given that, I think this PR can be closed unless there's something in your fix that the current code doesn't cover. If you see a gap I missed, please let me know!

(Comment from Claude Code, under the supervision of Bill Murdock.)

Development

Successfully merging this pull request may close these issues.

Custom endpoint deploys generate unroutable model IDs in openclaw.json