fix: sanitize non-identifier field names in MCP/OpenAPI tool schemas #99
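JSON Schemas for MCP / OpenAPI tools often contain field names with `-` (such as `x-access-id` or `api-version`), Python reserved words (`class`, `from`), or names that start with a digit. Pydantic accepts these field names, but the downstream `inspect.Parameter` raises `ValueError`, so the whole tool fails to load and is silently dropped.

This commit adds a field-name sanitizer to the JSON Schema → Pydantic conversion layer: internally, a valid Python identifier is used as the Pydantic field name (`x_access_id`), while `alias` preserves the original name for JSON Schema output and MCP calls. With `populate_by_name=True`, both spellings pass validation, and `model_dump(by_alias=True)` ensures that the field name actually sent to the MCP backend is still the original `x-access-id`.

The alias loop in `_create_function_with_signature` also gets a defensive sanitize step, so that future extensions to `__agentrun_argument_aliases__` do not hit the same problem.

13 new regression tests cover: field names containing `-` / `.`, leading digits, reserved words (`class`), empty strings, the `_build_tool_from_meta` end-to-end path, and the alias-loop defense.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>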
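The alias mechanism described in this commit message can be illustrated with a small hand-written model. This is only a sketch assuming Pydantic v2; `ToolArgs` is a hypothetical name, and the real models are generated from the tool's JSON Schema rather than declared by hand.

```python
from pydantic import BaseModel, ConfigDict, Field


class ToolArgs(BaseModel):
    # Hypothetical model for illustration only.
    # populate_by_name lets callers use either the field name or the alias.
    model_config = ConfigDict(populate_by_name=True)

    # Sanitized Python name, with the original schema key kept as the alias.
    x_access_id: str = Field(alias="x-access-id")


# Both spellings validate.
by_original = ToolArgs.model_validate({"x-access-id": "abc"})
by_sanitized = ToolArgs(x_access_id="abc")

# Dumping by alias restores the original key for the backend call.
payload = by_sanitized.model_dump(by_alias=True)

# The generated JSON Schema also advertises the original key.
schema_keys = list(ToolArgs.model_json_schema()["properties"])
```

Here `payload` and `schema_keys` both use `x-access-id`, while Python code only ever sees `x_access_id`.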
Pull request overview
This PR fixes MCP/OpenAPI tool loading failures caused by JSON Schema field names that are not valid Python identifiers (e.g., x-access-id, api-version, Python keywords). It does so by sanitizing field names for Pydantic/inspect.Parameter compatibility while preserving the original names via Pydantic aliases so schemas and backend calls continue to use the original keys.
Changes:
- Added a field-name sanitizer and applied it in `_json_schema_to_pydantic`, using Pydantic `alias` + `populate_by_name` to preserve original schema keys while allowing sanitized input.
- Ensured tool invocation payloads are dumped with `by_alias=True` so the backend receives the original (unsanitized) field names.
- Added integration/regression tests covering sanitizer behavior, schema conversion, end-to-end tool building, and alias sanitization in signature generation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `agentrun/integration/utils/tool.py` | Adds identifier sanitization for schema-to-Pydantic conversion, updates dumping to use aliases, and defensively sanitizes alias parameters in generated signatures. |
| `tests/unittests/integration/test_tool_utils.py` | Adds regression tests for sanitizer branches, schema conversion behavior, signature alias sanitization, and `_build_tool_from_meta` end-to-end. |
    alias_name = (
        alias
        if alias.isidentifier()
        else _sanitize_python_identifier(alias)
    )
Good catch — fixed in e97f1ba. When an alias is sanitized, we now also add the sanitized form to __agentrun_argument_aliases__ so that _normalize_tool_arguments can translate it back to the canonical field. Added a regression test (test_call_via_sanitized_alias_name_routes_to_canonical) covering the end-to-end "call via sanitized alias → route to canonical → send original name to MCP" path.
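The reverse-translation idea in this reply can be sketched as follows. The layout of `__agentrun_argument_aliases__` is not shown in this thread, so the alias-to-canonical dictionary, `normalize_arguments`, and the simplified `sanitize` stand-in are all assumptions made for illustration.

```python
def sanitize(name: str) -> str:
    # Simplified stand-in for the PR's sanitizer; it only handles '-'.
    return name.replace("-", "_")


def normalize_arguments(arguments: dict, aliases: dict) -> dict:
    """Rename alias keys to canonical field names; unknown keys pass through."""
    return {aliases.get(key, key): value for key, value in arguments.items()}


# Hypothetical alias table mapping alias -> canonical field name.
aliases = {"access-id": "x_access_id"}

# Also register the sanitized spelling of each non-identifier alias, because
# that is the spelling a generated signature would expose to callers.
for alias, canonical in list(aliases.items()):
    if not alias.isidentifier():
        aliases.setdefault(sanitize(alias), canonical)

via_sanitized = normalize_arguments({"access_id": "abc"}, aliases)
via_original = normalize_arguments({"access-id": "abc"}, aliases)
```

Both calls resolve to the canonical field `x_access_id`; without the extra registration, the `access_id` spelling would pass through untranslated.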
    if not sanitized:
        sanitized = "field"
    if sanitized[0].isdigit():
        # Pydantic does not allow field names to start with an underscore, so use a letter prefix.
Fixed in e97f1ba — updated the comment to clarify it's the digit-leading branch and explain why we must use a letter prefix (field_) instead of _ (Pydantic forbids leading underscores in field names).
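A minimal sketch of a sanitizer along these lines is shown below. Only the empty-name fallback (`"field"`) and the letter prefix for digit-leading names come from this thread; the function name `sanitize_identifier`, the character replacement rule, the stripping of leading underscores, and the keyword handling are assumptions, and the PR's `_sanitize_python_identifier` may differ.

```python
import keyword
import re


def sanitize_identifier(name: str) -> str:
    """Map an arbitrary schema key to a valid, non-keyword Python identifier."""
    # Assumption: replace every non-ASCII-identifier character with "_".
    sanitized = re.sub(r"[^0-9A-Za-z_]", "_", name)
    # Assumption: drop leading underscores, since Pydantic rejects
    # field names that start with an underscore.
    sanitized = sanitized.lstrip("_")
    if not sanitized:
        # Empty or all-invalid input falls back to a fixed name.
        sanitized = "field"
    if sanitized[0].isdigit():
        # Digit-leading branch: use a letter prefix rather than "_".
        sanitized = "field_" + sanitized
    if keyword.iskeyword(sanitized):
        # Assumption: a trailing underscore is one way to avoid keywords.
        sanitized += "_"
    return sanitized


examples = ["x-access-id", "a.b.c", "@type", "123abc", "class", "", "---"]
sanitized_names = {name: sanitize_identifier(name) for name in examples}
```

This sketch does not handle collisions (for example, `a-b` and `a_b` map to the same name), which a production implementation would need to consider.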
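1. In `_create_function_with_signature`, after an alias is sanitized, the sanitized form is also added to `__agentrun_argument_aliases__`, so that `_normalize_tool_arguments` can translate to the canonical field when the caller uses the sanitized name exposed by the signature.
2. Fixed the comment on the "leading digit" branch in `_sanitize_python_identifier`; the original comment mentioned "Pydantic does not allow a leading underscore", which could mislead readers into thinking the branch checks for a leading underscore.

Added 1 regression test (`test_call_via_sanitized_alias_name_routes_to_canonical`) that explicitly covers the "call with the sanitized alias name → translate back to canonical → send to MCP" path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Sodawyx <sodawyx@126.com>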
Summary
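JSON Schemas for MCP / OpenAPI tools often contain field names that are not valid Python identifiers (for example HTTP-header-style names such as `x-access-id` and `api-version`, Python reserved words such as `class` and `from`, or names that start with a digit). Pydantic accepts these field names, but the downstream `inspect.Parameter` raises `ValueError: 'X' is not a valid parameter name`, so the whole tool fails to load and is silently dropped. Agents cannot use such MCP tools at all.

Fix
- `_json_schema_to_pydantic` gains a field-name sanitizer: internally, a valid Python identifier is used as the Pydantic field name (`x-access-id` → `x_access_id`), while the Pydantic `alias` preserves the original name.
- `populate_by_name=True` + `model_dump(by_alias=True)` guarantee that both spellings validate and that the backend still receives the original name (`x-access-id`).
- The alias loop in `_create_function_with_signature` gets a defensive sanitize step, closing off potential bugs of the same kind.

Related cases covered
- `-` character: `x-access-id`, `api-version`, `content-type`
- `.` / `@` / `$` and similar: `a.b.c`, `@type`, `$ref`
- Leading digit: `123abc`, `2fa-code`
- Python reserved words: `class`, `from`, `return`, `def`
- Empty or all-invalid names: `""`, `"---"`

Test plan
- Regression tests cover `_json_schema_to_pydantic`, the `_build_tool_from_meta` end-to-end path, and the alias-loop defense.
- `tests/unittests/integration/`: all 191 passed, 2 skipped.
- `examples/quick_start_skills.py` run with `AGENTRUN_TOOL_NAMES=zhixingli-jingsu-test` (which contains an `x-access-id` field):
  - Before the fix: `WARNING 加载 Tool 'zhixingli-jingsu-test' 失败: 'x-access-id' is not a valid parameter name` (the tool failed to load)
  - After the fix: `已加载 Tool 'zhixingli-jingsu-test' 的 5 个子工具` (5 sub-tools of the tool were loaded)

🤖 Generated with Claude Code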