
feat(tools): add HugeGraph AI DeepWiki assistant #355

Open
LRriver wants to merge 11 commits into apache:main from LRriver:deepwiki-skill

Contributor

LRriver commented Jun 1, 2026

Purpose

Add an optional HugeGraph AI repository knowledge assistant under tools/ai. The assistant is intended for Claude Code and Codex users who want repository-scoped Q&A for https://github.com/apache/hugegraph-ai, using DeepWiki as the online knowledge source while caching wiki contents locally for repeated context search.

Changes

  • Add tools/ai/hugegraph-ai-deepwiki-skill as a standalone installable module.
  • Include Claude and Codex plugin manifests plus marketplace manifests.
  • Add bilingual installation and usage docs: README.md and README-zh.md.
  • Add a small DeepWiki MCP client CLI for structure, contents, context, ask, and tools.
  • Keep the assistant isolated from runtime code and project dependencies.

Verification

  • python3 -m json.tool on all new JSON manifests.
  • python3 -m py_compile tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/scripts/deepwiki_mcp.py.
  • uv run --extra dev ruff format --check tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/scripts/deepwiki_mcp.py.
  • uv run --extra dev ruff check tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/scripts/deepwiki_mcp.py.
  • claude plugin validate tools/ai/hugegraph-ai-deepwiki-skill.
  • claude plugin validate tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill.
  • Temporary Codex install via codex plugin marketplace add and codex plugin add.
  • DeepWiki smoke tests: structure, cached context, and online ask for apache/hugegraph-ai.
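The JSON manifest check in the list above can also be scripted. The following is a minimal sketch, assuming that every `*.json` file under a directory should parse; the helper name `invalid_json_files` is hypothetical and is not part of this PR.

```python
import json
from pathlib import Path


def invalid_json_files(root):
    """Return (path, error message) pairs for *.json files under root that fail to parse."""
    failures = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError) as exc:
            # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
            failures.append((path, str(exc)))
    return failures
```

Calling `invalid_json_files("tools/ai/hugegraph-ai-deepwiki-skill")` from the repository root would return an empty list when all manifests are well-formed.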

Compatibility

This is an optional tool module only. It does not change HugeGraph AI runtime behavior, public APIs, package dependencies, or default configuration.

Copilot AI review requested due to automatic review settings June 1, 2026 10:57
dosubot (bot) added the size:XL (This PR changes 500-999 lines, ignoring generated files) and enhancement (New feature or request) labels on Jun 1, 2026
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new “HugeGraph AI DeepWiki” skill/plugin that can query the official DeepWiki MCP server, cache wiki contents locally, and provide repository-scoped Q&A and context search for apache/hugegraph-ai.

Changes:

  • Introduces a Python MCP client script (deepwiki_mcp.py) with commands for ask, structure, contents, context, and tools.
  • Adds repository profile mapping (references/repos.json) and agent/tool configuration (agents/openai.yaml).
  • Adds plugin packaging + documentation for Codex/Claude installs (plugin manifests, marketplace entries, READMEs, SKILL.md).

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/scripts/deepwiki_mcp.py | Implements the MCP client, caching, and CLI workflows described in docs. |
| tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/references/repos.json | Defines repository alias → repoName mapping used by the CLI and skill. |
| tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/agents/openai.yaml | Declares the MCP dependency and default prompt wiring for the agent. |
| tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/skills/hugegraph-ai-deepwiki-skill/SKILL.md | Documents the intended workflow (context search first, then ask). |
| tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/.codex-plugin/plugin.json | Codex plugin manifest for distributing the skill. |
| tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/.claude-plugin/plugin.json | Claude plugin manifest for distributing the skill. |
| tools/ai/hugegraph-ai-deepwiki-skill/README.md | Installation and usage docs (English). |
| tools/ai/hugegraph-ai-deepwiki-skill/README-zh.md | Installation and usage docs (Chinese). |
| tools/ai/hugegraph-ai-deepwiki-skill/.claude-plugin/marketplace.json | Marketplace entry for Claude plugin discovery. |
| tools/ai/hugegraph-ai-deepwiki-skill/.agents/plugins/marketplace.json | Marketplace entry for agents/plugin discovery. |


Comment on lines +137 to +145
max_seconds = float(os.environ.get("DEEPWIKI_MCP_STREAM_TIMEOUT", "120"))
deadline = time.monotonic() + max_seconds
timed_out = False

while True:
    if time.monotonic() > deadline:
        timed_out = True
        break
    raw_line = response.readline()

Contributor Author

Fixed in 3a3e8e6.

Contributor Author

Follow-up Python 3.9 socket timeout compatibility fix added in 8305e3d.
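The review comment for this thread is not shown here, so the following is only a sketch of the kind of loop the two fixes point at: an overall deadline plus a read timeout that is caught on Python 3.9 as well. On Python 3.9 `socket.timeout` is a separate exception class, while from 3.10 onward it is an alias of `TimeoutError`. The function name `read_lines_until_deadline` and the `TIMEOUT_ERRORS` tuple are hypothetical and are not taken from `deepwiki_mcp.py`.

```python
import socket
import time

# Catch both names so the same code works on Python 3.9 and on 3.10+.
TIMEOUT_ERRORS = (TimeoutError, socket.timeout)


def read_lines_until_deadline(response, max_seconds):
    """Collect raw lines until EOF, a read timeout, or the overall deadline."""
    deadline = time.monotonic() + max_seconds
    lines = []
    timed_out = False
    while True:
        if time.monotonic() > deadline:
            timed_out = True
            break
        try:
            raw_line = response.readline()
        except TIMEOUT_ERRORS:
            # A blocking read hit the socket timeout before the deadline check ran again.
            timed_out = True
            break
        if not raw_line:  # empty bytes means the server closed the stream
            break
        lines.append(raw_line)
    return lines, timed_out
```

The deadline check alone cannot interrupt a `readline()` call that is already blocked, which is why the sketch also relies on a socket-level timeout raising inside the loop.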

Comment on lines +201 to +203
try:
    with urllib.request.urlopen(req, timeout=90) as response:
        session_id = response.headers.get("Mcp-Session-Id")

Contributor Author

Fixed in 3a3e8e6.

Contributor Author

Follow-up Python 3.9 socket timeout compatibility fix added in 8305e3d.
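For the `urlopen` call quoted in this thread, a timeout can surface in two shapes: raised directly while reading, or wrapped in `urllib.error.URLError` during the connection phase. The sketch below shows one way to classify both under Python 3.9 and later. The helper name `is_timeout_error` is hypothetical, and this is not necessarily how commits 3a3e8e6 or 8305e3d handle it.

```python
import socket
import urllib.error


def is_timeout_error(exc):
    """Return True if exc represents a connect or read timeout."""
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return True
    # urlopen wraps connection-phase failures in URLError; the cause is in .reason.
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, (TimeoutError, socket.timeout))
    return False
```

A caller could wrap `urllib.request.urlopen(req, timeout=90)` in `try`/`except OSError as exc` and use `is_timeout_error(exc)` to decide between a timeout message and a generic network error.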

Comment on lines +117 to +121
def write_text_atomic(path: Path, text: str) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = path.with_suffix(path.suffix + ".tmp")
    tmp_path.write_text(text, encoding="utf-8")
    tmp_path.replace(path)

Contributor Author

Fixed in 3a3e8e6.
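The review text for this thread is not shown, so the exact concern is unknown. One common hardening for a helper like `write_text_atomic` is to use a unique temporary file in the target directory and to clean it up on failure, so that concurrent writers do not share a single `.tmp` path. The sketch below illustrates that idea under those assumptions; it is not the code from commit 3a3e8e6.

```python
import os
import tempfile
from pathlib import Path


def write_text_atomic(path: Path, text: str) -> None:
    """Write text to path via a unique temp file in the same directory."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # Rename over the target; source and target share a directory and filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup so failed writes do not leave stray temp files.
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise
```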

"Accept": "application/json, text/event-stream",
"Content-Type": "application/json",
"Mcp-Protocol-Version": self.protocol_version,
"User-Agent": "hugegraph-ai-deepwiki-skill/0.1.4",

Contributor Author

Fixed in 3a3e8e6.

{
    "protocolVersion": self.protocol_version,
    "capabilities": {},
    "clientInfo": {"name": "hugegraph-ai-deepwiki-skill", "version": "0.1.4"},

Contributor Author

Fixed in 3a3e8e6.
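The two snippets above repeat the literal `"0.1.4"` in the `User-Agent` header and in `clientInfo`. The review text is not shown, so whether that duplication was the concern is an assumption. A sketch of keeping the value in one place follows; the constant and function names are hypothetical.

```python
CLIENT_NAME = "hugegraph-ai-deepwiki-skill"
CLIENT_VERSION = "0.1.4"  # single place to bump; value copied from the quoted snippets


def build_headers(protocol_version: str) -> dict:
    """HTTP headers for a request, mirroring the keys in the quoted snippet."""
    return {
        "Accept": "application/json, text/event-stream",
        "Content-Type": "application/json",
        "Mcp-Protocol-Version": protocol_version,
        "User-Agent": f"{CLIENT_NAME}/{CLIENT_VERSION}",
    }


def build_initialize_params(protocol_version: str) -> dict:
    """Parameters for the initialize request, mirroring the quoted snippet."""
    return {
        "protocolVersion": protocol_version,
        "capabilities": {},
        "clientInfo": {"name": CLIENT_NAME, "version": CLIENT_VERSION},
    }
```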

Member

imbajin left a comment

Blocking: yes. Summary: The cache hardening still fails recoverable cache cases and the new MCP client lacks regression tests for its failure-prone paths. Evidence: py_compile/JSON smoke passed; fake-client cache write failure reproduced a hard McpError.

return path.read_text(encoding="utf-8"), path, False

text = read_wiki_contents(client, repo_name)
write_text_atomic(path, text)

Member

‼️ Do not discard freshly fetched contents on cache write failure

Evidence: read_wiki_contents() runs before this write, but any OSError from write_text_atomic() is converted into McpError, so the command fails even though it already has the fresh wiki text. Impact: a broken cache directory makes context/contents unusable instead of falling back to the live result, which is the recovery path this change is trying to harden. Please return the fetched contents when only the cache write fails, and refetch when an existing cache read fails or is invalid.

Contributor Author

Fixed. ensure_cached_contents() now refetches when an existing cache read fails, and if read_wiki_contents() succeeds but the cache write fails, it returns the fresh DeepWiki contents instead of raising McpError. Added regression coverage for both cache write fallback and invalid cache refetch.
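The behavior described in this reply can be sketched as follows. This is a simplified illustration, not the PR's implementation: the real function takes a client and repository name, whereas this version takes a hypothetical `fetch` callable, treats an empty file as an invalid cache, and uses the third tuple element to mean "fetched from the network". All three are assumptions.

```python
from pathlib import Path
from typing import Callable, Tuple


def ensure_cached_contents(path: Path, fetch: Callable[[], str]) -> Tuple[str, Path, bool]:
    """Return (text, cache_path, fetched); fetched is True when fetch() was called."""
    if path.exists():
        try:
            text = path.read_text(encoding="utf-8")
            if text.strip():
                return text, path, False
        except (OSError, UnicodeDecodeError):
            pass  # unreadable or corrupt cache: fall through and refetch

    text = fetch()  # errors from the live fetch still propagate to the caller
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
    except OSError:
        pass  # the cache is best-effort; keep the fresh contents
    return text, path, True
```

The key ordering is that the write is attempted only after the fetch has succeeded, and a write failure never replaces the value that is returned.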

return parsed


def read_sse_response(response: Any, expected_id: int | None) -> dict[str, Any]:

Member

‼️ Add regression tests for the MCP client failure paths

Evidence: this PR adds a custom MCP/SSE client with timeout handling, atomic cache writes, CLI validation, and context scoring, but no automated tests cover those behaviors. Impact: the same timeout/cache paths already needed follow-up fixes in this PR, and future changes can regress them while py_compile and smoke checks still pass. Please add focused tests for read_sse_response(), cache read/write failure fallback, and the cached-context scoring/selection behavior.
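As an illustration of what a focused test for scoring and selection might look like, here is a self-contained `unittest` sketch. The scorer below is a hypothetical keyword-overlap stand-in, since the PR's actual scoring logic is not shown in this conversation; the section titles are made-up sample data.

```python
import unittest


def score_section(section: str, query: str) -> int:
    """Hypothetical scorer: count distinct query terms that occur in the section."""
    text = section.lower()
    return sum(1 for term in set(query.lower().split()) if term in text)


def select_sections(sections, query, limit=2):
    """Return up to `limit` sections with a positive score, best first."""
    scored = [(score_section(s, query), i, s) for i, s in enumerate(sections)]
    # Sort by descending score, then by original position for stable ties.
    ranked = sorted((item for item in scored if item[0] > 0), key=lambda item: (-item[0], item[1]))
    return [s for _, _, s in ranked[:limit]]


class SelectSectionsTest(unittest.TestCase):
    SECTIONS = ["Graph RAG pipeline", "Vector index setup", "RAG pipeline with vector index"]

    def test_best_match_comes_first(self):
        result = select_sections(self.SECTIONS, "vector rag pipeline")
        self.assertEqual(result[0], "RAG pipeline with vector index")

    def test_unrelated_query_selects_nothing(self):
        self.assertEqual(select_sections(self.SECTIONS, "kubernetes"), [])
```

Saved as a module, this can be run with `python3 -m unittest`, the same runner the author reports using for `test_deepwiki_mcp.py`.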

Contributor Author

LRriver commented Jun 2, 2026

Blocking: yes. Summary: The cache hardening still fails recoverable cache cases and the new MCP client lacks regression tests for its failure-prone paths. Evidence: py_compile/JSON smoke passed; fake-client cache write failure reproduced a hard McpError.

@imbajin Addressed the DeepWiki MCP review feedback in the latest deepwiki-skill commits. Summary:

  • Cache read failures now trigger a fresh read_wiki_contents() fetch.
  • Cache write failures no longer discard already fetched DeepWiki contents.
  • Added regression tests for SSE timeout handling, cache fallback/refetch, and cached-context selection.
  • Fixed Ruff format/lint issues from CI.

Local validation passed: uv run ruff check ., uv run ruff format --check ., python3 -m py_compile ..., and python3 -m unittest .../tests/test_deepwiki_mcp.py.

Latest apache/hugegraph-ai#355 CI is passing.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

Member

imbajin left a comment

Blocking: yes. Summary: two packaging/runtime edge cases remain in the DeepWiki skill. Evidence: git archive/static review plus a local partial-SSE timeout reproduction.

@@ -0,0 +1,559 @@
#!/usr/bin/env python3

Member

‼️ Core script is omitted from source archives

Evidence: on current head 29aae41, git archive HEAD | tar -tf - includes tools/ai/hugegraph-ai-deepwiki-skill/plugins/hugegraph-ai-deepwiki-skill/tests/test_deepwiki_mcp.py but not this scripts/deepwiki_mcp.py file. The existing root .gitattributes has scripts/ export-ignore, so archive-based installs and release source tarballs ship the tests/manifests without the executable client. Please either move/rename the plugin script directory or add an explicit archive rule so the skill package is complete in source archives.
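The archive check described here can be turned into a repeatable script. The sketch below runs the same `git archive HEAD` command quoted in the review and reports which required paths are missing from the resulting tar; the function names are hypothetical.

```python
import io
import subprocess
import tarfile


def head_archive_bytes(repo_dir="."):
    """Run `git archive HEAD` in repo_dir and return the tar stream as bytes."""
    result = subprocess.run(
        ["git", "archive", "HEAD"], cwd=repo_dir, check=True, capture_output=True
    )
    return result.stdout


def missing_from_archive(archive_bytes: bytes, required_paths):
    """Return the required paths that are absent from a tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        names = set(archive.getnames())
    return [path for path in required_paths if path not in names]
```

Passing the full path of `deepwiki_mcp.py` to `missing_from_archive(head_archive_bytes(), [...])` would return that path while the `scripts/ export-ignore` rule still applies, and an empty list once the packaging issue is resolved.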

if expected_id is None or parsed.get("id") == expected_id:
    return parsed

if data_lines:

Member

⚠️ Partial SSE timeouts still surface as parse errors

Evidence: a fake response that first returns b'data: {"jsonrpc":"2.0","id":7,\n' and then raises TimeoutError makes read_sse_response() raise McpError: DeepWiki MCP returned non-JSON content... instead of the timeout error path added above. The new timeout test only covers an immediate timeout before any bytes are buffered. Please treat the buffered partial event as a timeout when timed_out is true before parsing the trailing data_lines, and add a regression test for that branch.
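The requested ordering, checking `timed_out` before parsing any trailing `data_lines`, can be sketched with a simplified reader. This is an illustration under assumptions, not the PR's `read_sse_response()`: it handles only `data:` fields, omits the overall deadline, and uses a hypothetical `McpTimeout` exception in place of the timeout path of `McpError`.

```python
import json
import socket


class McpTimeout(Exception):
    """Hypothetical stand-in for the timeout flavour of McpError."""


def read_sse_response(response, expected_id=None):
    """Return the first JSON event whose id matches, reading line by line."""
    data_lines = []
    timed_out = False
    while True:
        try:
            raw_line = response.readline()
        except (TimeoutError, socket.timeout):
            timed_out = True
            break
        if not raw_line:  # EOF
            break
        line = raw_line.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:  # a blank line terminates one event
            parsed = json.loads("\n".join(data_lines))
            data_lines = []
            if expected_id is None or parsed.get("id") == expected_id:
                return parsed

    # A timeout with buffered lines means the event was cut off mid-stream,
    # so report the timeout instead of trying to parse the fragment.
    if timed_out:
        raise McpTimeout("stream timed out before a complete event arrived")
    if data_lines:
        parsed = json.loads("\n".join(data_lines))
        if expected_id is None or parsed.get("id") == expected_id:
            return parsed
    raise ValueError("stream ended without a matching response")
```

A regression test for this branch can feed the reader a fake response whose `readline()` first returns the partial line from the review and then raises `TimeoutError`, asserting that the timeout exception is raised rather than a JSON parse error.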
