
feat: AgentX v1.0#1970

Merged
cquil11 merged 20 commits into main from feat/agentx-v1.0 on Jul 3, 2026

@cquil11 cquil11 commented Jul 1, 2026

Collaborator

Summary

Reorganizes AgentX v1.0 into five squashed implementation groups. This keeps the existing PR/branch but removes the exploratory commit history and separates utility code, runtime plumbing, recipes, and final config registration.

Groups

1. AgentX result utilities

Implemented the utils.agentic package for result aggregation, backend-specific metric extraction, server-log parsing, trace metadata, dataset helpers, and result validation. The package is intentionally modular: common aggregation flow lives in shared code, while backend adapters under utils.agentic.aggregation.backends isolate vLLM, SGLang, and Dynamo-vLLM metric differences.

This modularity is required because AgentX results are not uniform across backends. Different engines expose cache hits, KV usage, request accounting, and server-side telemetry through different metric names and log formats. Keeping those mappings behind backend-specific adapters lets future contributors add another backend by implementing a focused adapter instead of modifying one large result processor. It also makes unit testing practical because request metrics, server metrics, server-log parsing, trace metadata, and validation can be tested independently.
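A minimal sketch of the adapter idea described above, not the actual utils.agentic code. The class names, the `CacheStats` shape, and every metric key are hypothetical placeholders; the real backends expose different names.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CacheStats:
    """Backend-neutral cache counters consumed by the shared aggregation flow."""
    cache_hits: int
    cache_queries: int

    @property
    def hit_rate(self) -> float:
        # Guard against division by zero when no queries were recorded.
        return self.cache_hits / self.cache_queries if self.cache_queries else 0.0


class BackendAdapter(Protocol):
    """Each backend maps its own raw metric names onto CacheStats."""
    def extract_cache_stats(self, raw: dict) -> CacheStats: ...


class ExampleVllmAdapter:
    # Placeholder metric keys; not the real vLLM metric names.
    def extract_cache_stats(self, raw: dict) -> CacheStats:
        return CacheStats(int(raw["example_vllm_hits"]), int(raw["example_vllm_queries"]))


class ExampleSglangAdapter:
    # Placeholder metric keys; not the real SGLang metric names.
    def extract_cache_stats(self, raw: dict) -> CacheStats:
        return CacheStats(int(raw["example_sglang_cached"]), int(raw["example_sglang_total"]))


# Adding a backend means registering one more adapter here.
ADAPTERS = {"vllm": ExampleVllmAdapter(), "sglang": ExampleSglangAdapter()}


def aggregate_hit_rate(backend: str, raw: dict) -> float:
    """Shared flow: pick the adapter, then work only with CacheStats."""
    return ADAPTERS[backend].extract_cache_stats(raw).hit_rate
```

Because the shared flow only sees `CacheStats`, each adapter can be unit-tested with a small dictionary of raw metrics.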

The utility layer also includes smaller supporting changes, including success-rate calculation updates and shared constants used by AgentX result processing. These keep downstream aggregation behavior consistent between normal benchmark rows and AgentX-derived rows.

2. Runtime and CI plumbing

Wired AgentX into the shared benchmark/runtime path: workflow templates, matrix schema/generation, runner launchers, shared benchmark helpers, and the AIPerf submodule. This provides the common execution path consumed by the recipe/config layers.

AgentX e2e workflow runs now also trigger the InferenceX-app ingest-agentic-results repository-dispatch receiver after successful manual agentic sweeps. The dispatch passes the GitHub run ID and attempt once agentic artifacts and run stats are available, and it is gated to avoid ingesting PR/comment reusable workflow calls or partial failures.
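For orientation, a repository-dispatch call of this kind could be assembled roughly as follows. The endpoint shape is GitHub's REST "create a repository dispatch event" route. The owner name, the payload key names, and the helper itself are assumptions for illustration, and the event type is inferred from the receiver name above.

```python
import json


def build_dispatch_request(owner: str, repo: str, run_id: int, run_attempt: int):
    """Return (url, json_body) for a repository_dispatch POST. No network call is made."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = {
        # Assumed to match the receiver name mentioned in the description.
        "event_type": "ingest-agentic-results",
        # Hypothetical key names; the real payload schema is not shown in this PR text.
        "client_payload": {"run_id": run_id, "run_attempt": run_attempt},
    }
    return url, json.dumps(body)


# "example-org" is a placeholder owner, not the real organization.
url, body = build_dispatch_request("example-org", "InferenceX-app", 123456, 1)
```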

This group also extends runner metadata in runners.yaml. AgentX needs runner-level resource information that fixed-sequence benchmarks mostly did not need, especially host DRAM availability for CPU/DRAM KV-offload configurations. The matrix logic uses runner metadata plus per-config model/runtime fields to reason about whether a proposed offload point is valid for a given runner. In practice, the available host-memory budget is derived from the runner entry rather than hardcoded inside each benchmark script, so config generation can consistently filter or size host-offload points across B200/B300/GB-class runners.

The intent is that resource capacity lives in runner/config metadata, while benchmark scripts focus on launching the server. That keeps decisions like DRAM-offload eligibility, runner selection, and generated sweep shape in the matrix layer instead of scattering those calculations through shell scripts.
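As a rough illustration of resource-aware filtering in the matrix layer, the sketch below derives a host-memory budget from a runner entry and drops offload points that exceed it. The field name `host_dram_gb`, the helper names, and the numbers are hypothetical. The 80% figure echoes the dram-utilization=0.80 policy mentioned later in this thread, and the real logic in utils/matrix_logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    name: str
    host_dram_gb: int  # hypothetical metadata field, standing in for a runners.yaml entry


def host_offload_budget_gb(runner: Runner, dram_utilization_pct: int = 80) -> int:
    """Budget is a fixed percentage of host DRAM (integer math avoids float rounding)."""
    return runner.host_dram_gb * dram_utilization_pct // 100


def filter_offload_points(runner: Runner, requested_pool_gb: list, dram_utilization_pct: int = 80) -> list:
    """Keep only the DRAM-offload pool sizes that fit in this runner's budget."""
    budget = host_offload_budget_gb(runner, dram_utilization_pct)
    return [gb for gb in requested_pool_gb if gb <= budget]


example_runner = Runner(name="example-runner", host_dram_gb=4000)
valid_points = filter_offload_points(example_runner, [1200, 2400, 3600])
```

With a 4000 GB host and an 80% budget (3200 GB), the 3600 GB point is filtered out before any benchmark script runs.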

3. Single-node AgentX recipes

Added and updated single-node benchmark scripts for DSv4, MiniMax, Kimi, and Qwen AgentX runs across NVIDIA and AMD runners. Deprecated AgentX scripts that are no longer part of the v1.0 surface were removed.

These recipes are still best-effort and experimental. They are included in the v1.0 release to document working patterns and provide templates for future contributors, not because every model/backend combination should be treated as fully production-hardened.

4. Multi-node AgentX recipes

Added GB200/GB300 disaggregated AgentX recipes and updated the SRT launcher path. This isolates multi-node recipe review from single-node runtime scripts.

These recipes are also best-effort and experimental. They are intentionally left in v1.0 as examples for future multi-node AgentX contributors, especially around disaggregated serving topology, SRT launcher integration, and backend-specific recipe structure.

5. Final sweep config registration

Updated NVIDIA/AMD master configs, runner metadata, and config docs for the final AgentX v1.0 sweep surface. This intentionally collapses config-testing churn into one reviewable final-state commit.

The config changes include the final AgentX matrix definitions, runner metadata needed for resource-aware generation, and documentation updates describing the new config surface. The goal is to keep sweep registration declarative: model/backend/runner capabilities are described in config files, then matrix generation applies the shared validation/resource logic.

Main sync

Rebased onto latest main and resolved the process_agentic_result conflicts by keeping the new utils.agentic.aggregation package layout and deleting the old top-level processor/test files.

Validation

  • python -m pytest utils/matrix_logic/ utils/agentic/aggregation/test_process_agentic_result.py utils/agentic/aggregation/test_server_log_metrics.py utils/agentic/datasets/test_build_weka_hf_dataset.py utils/agentic/validation/test_validate_agentic_result.py utils/test_calc_success_rate.py -q
  • Result: 220 passed
  • uv run pytest tests/unit/dataset/loader/test_weka_aux_classification.py tests/unit/dataset/loader/test_weka_flat_split_v1_contract_adv.py tests/unit/dataset/loader/test_weka_async_subagent.py tests/unit/dataset/loader/test_weka_overlap_groups.py -q (run from utils/aiperf)
  • Result: 78 passed
  • uv run pytest tests/unit/dataset/loader/test_weka_trace.py -q -k 'not test_flattened_fanout_logs_detection_summary' (run from utils/aiperf)
  • Result: 38 passed, 1 deselected
  • uv run --extra dev ruff check <touched AIPerf Python files>
  • Result: All checks passed

Comment thread utils/calc_success_rate.py Dismissed
Comment thread utils/matrix_logic/generate_sweep_configs.py Dismissed
@cquil11 cquil11 changed the title from "Feat/agentx v1.0" to "feat: AgentX v1.0" on Jul 1, 2026
Comment thread utils/agentic/aggregation/trace_metadata.py Dismissed
Comment thread utils/agentic/aggregation/trace_metadata.py Dismissed
@cquil11 cquil11 force-pushed the feat/agentx-v1.0 branch from 5f0cf20 to 12a7ff1 Compare July 2, 2026 16:12
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
@cquil11 cquil11 force-pushed the feat/agentx-v1.0 branch from 83510ce to c46a8db Compare July 2, 2026 18:16
Comment thread .github/workflows/e2e-tests.yml Fixed
@cquil11 cquil11 force-pushed the feat/agentx-v1.0 branch 2 times, most recently from 9d7393e to 46248ae Compare July 2, 2026 19:47
cquil11 added 4 commits July 2, 2026 15:28
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
@cquil11 cquil11 force-pushed the feat/agentx-v1.0 branch from 46248ae to f37f017 Compare July 2, 2026 20:29
@cquil11 cquil11 marked this pull request as ready for review July 2, 2026 21:20
@cquil11 cquil11 requested a review from a team July 2, 2026 21:20
cquil11 added 5 commits July 2, 2026 16:21
# Conflicts:
#	.github/workflows/README.md
#	AGENTS.md
#	configs/nvidia-master.yaml
#	utils/calc_success_rate.py
#	utils/constants.py
#	utils/runner_setup/RUNNER_SETUP.md
@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.


Comment on lines 218 to +221
  model-prefix: ${{ matrix.config.model-prefix }}
  framework: ${{ matrix.config.framework }}
  precision: ${{ matrix.config.precision }}
- conc-list: '[${{ matrix.config.conc }}]'
+ conc-list: ${{ toJson(matrix.config.conc) }}

Contributor

🔴 The multi-node agentic job in .github/workflows/run-sweep.yml (lines 531, 544) still passes conc-list: '[${{ matrix.config.conc }}]' and conc: ${{ matrix.config.conc }}, but MultiNodeAgenticMatrixEntry.conc is now list[int] (utils/matrix_logic/validation.py:202). When conc is a list like [16, 24, 32, 40], GHA interpolation produces [[16, 24, 32, 40]] and agentic_srt.sh's ^[1-9][0-9]*$ regex rejects the joined string, so every new multi-node agentic recipe added here (e.g. dsv4-fp4-gb200-dynamo-vllm-agentic-3p2d-tep8-tp8) will fail in the post-merge full sweep even though e2e-tests.yml passes. Mirror the e2e-tests.yml update: conc-list: ${{ toJson(matrix.config.conc) }} and conc: ${{ matrix.config.conc[0] }}.

Extended reasoning...

The bug

.github/workflows/e2e-tests.yml was updated in this PR to the new list-valued shape for multi-node agentic entries:

# e2e-tests.yml:221, 234
conc-list: ${{ toJson(matrix.config.conc) }}
conc: ${{ matrix.config.conc[0] }}

The sibling sweep-multi-node-agentic job in .github/workflows/run-sweep.yml was left on the pre-PR scalar shape:

# run-sweep.yml:531, 544
conc-list: '[${{ matrix.config.conc }}]'
conc: ${{ matrix.config.conc }}

That shape assumed matrix.config.conc was a scalar int. This PR changes it to a list:

  • utils/matrix_logic/validation.py:202: MultiNodeAgenticMatrixEntry.conc: list[int].
  • utils/matrix_logic/generate_sweep_configs.py emits multi-node agentic entries with Fields.CONC.value: conc_batch where conc_batch is a list from chunk_multinode_agentic_concurrencies.

Step-by-step proof of failure

Take dsv4-fp4-gb200-dynamo-vllm-agentic-3p2d-tep8-tp8 (added by this PR in .github/configs/nvidia-master.yaml), which has conc-list: [16, 24, 32, 40]:

  1. generate_sweep_configs.py emits a matrix entry with conc: [16, 24, 32, 40].
  2. GitHub Actions expands '[${{ matrix.config.conc }}]'. Because matrix.config.conc is an array, GHA serializes it as JSON and produces the string '[[16, 24, 32, 40]]' (a nested list).
  3. benchmark-multinode-tmpl.yml:144 computes CONC_LIST = join(fromJson(inputs.conc-list), ' '). fromJson('[[16, 24, 32, 40]]') yields a single-element list whose lone element is the inner array; join(..., ' ') stringifies that element as '16,24,32,40'.
  4. benchmarks/multi_node/agentic_srt.sh:15 does read -r -a CONCURRENCIES <<< "$CONC_LIST". read splits on IFS whitespace, so CONCURRENCIES becomes the single-element array ['16,24,32,40'].
  5. The regex validator at agentic_srt.sh:23 (^[1-9][0-9]*$) rejects that element and aborts with ERROR: invalid agentic concurrency: 16,24,32,40.

The whole loop dies before the first server ever starts.
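The failure chain above can be replayed in a few lines. This is a simulation under stated assumptions, not the workflow code: `gha_join` is a hypothetical stand-in that stringifies a nested array element as a comma-joined string, as the comment describes, and the regex mirrors the one quoted from agentic_srt.sh.

```python
import json
import re


def gha_join(values: list, sep: str) -> str:
    """Stand-in for GitHub Actions join(); nested arrays are assumed to
    stringify as comma-joined values, per the analysis above."""
    parts = []
    for value in values:
        if isinstance(value, list):
            parts.append(",".join(str(item) for item in value))
        else:
            parts.append(str(value))
    return sep.join(parts)


def concurrencies_from_input(conc_list_input: str) -> list:
    """Replay: fromJson -> join(' ') -> whitespace split -> per-token validation."""
    conc_list = gha_join(json.loads(conc_list_input), " ")
    tokens = conc_list.split()  # mirrors `read -r -a` splitting on whitespace
    for token in tokens:
        # Same pattern as ^[1-9][0-9]*$ in the shell script.
        if not re.fullmatch(r"[1-9][0-9]*", token):
            raise ValueError(f"invalid agentic concurrency: {token}")
    return [int(token) for token in tokens]


# New shape, toJson(matrix.config.conc): a flat list parses cleanly.
ok = concurrencies_from_input("[16, 24, 32, 40]")

# Old shape, '[${{ matrix.config.conc }}]' with a list value: nested list is rejected.
try:
    concurrencies_from_input("[[16, 24, 32, 40]]")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```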

Why PR CI doesn't catch it

e2e-tests.yml (which PR CI runs) was updated to the list-valued shape (lines 221/234), so it passes for these new recipes. run-sweep.yml runs on push to main via perf-changelog.yaml, so the failure only surfaces after merge, on every multi-node agentic recipe added by this PR (dsv4-fp4-gb200-dynamo-vllm-agentic-3p2d-tep8-tp8, dsv4-fp4-gb200-dynamo-vllm-agentic-2p1d-dep8-dep8, and the existing dsv4-fp4-gb300-dynamo-vllm-agentic whose conc is now list-valued).

Fix

Mirror the exact e2e-tests.yml update in run-sweep.yml lines 531 and 544:

conc-list: ${{ toJson(matrix.config.conc) }}
conc: ${{ matrix.config.conc[0] }}

Trivial two-line diff; identical semantics to the already-validated e2e path.

Comment on lines +436 to +456
salloc --partition=$SLURM_PARTITION --account=$SLURM_ACCOUNT --gres=gpu:$TP --exclusive --mem=0 --time="$SALLOC_TIME_LIMIT" --no-shell --job-name="$RUNNER_NAME"
JOB_ID=$(squeue --name="$RUNNER_NAME" -u "$USER" -h -o %A | head -n1)

# DSv4 is also staged on the compute nodes' local RAID. Loading the 806 GB
# checkpoint independently from Lustre on every TP rank leaves the loader
# threads blocked in Lustre I/O for hours. Select the local copy only after
# Slurm assigns a node, and retain the shared-Lustre path as a fallback for
# nodes whose local staging is incomplete.
if [[ "$MODEL_PREFIX" == "dsv4" && "$PRECISION" == "fp4" && "$FRAMEWORK" == "sglang" ]]; then
    LOCAL_MODEL_PATH=/raid/models/DeepSeek-V4-Pro-NVFP4
    if srun --jobid="$JOB_ID" bash -c \
        'test -f "$1/config.json" && test -f "$1/model.safetensors.index.json" && test "$(find "$1" -maxdepth 1 -name "model-*.safetensors" | wc -l)" -eq 64' \
        _ "$LOCAL_MODEL_PATH"; then
        export MODEL_PATH="$LOCAL_MODEL_PATH"
        export MODEL="$MODEL_PATH"
        echo "Using node-local DSv4 checkpoint: $MODEL_PATH"
    else
        echo "Node-local DSv4 checkpoint unavailable; using shared checkpoint: $MODEL_PATH"
    fi
fi
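For readers less familiar with the `find | wc -l` idiom, the completeness check in the hunk above amounts to roughly the following. This is a sketch only: the function names are hypothetical, and the launcher performs the check on the allocated node via srun rather than locally.

```python
from pathlib import Path


def local_checkpoint_complete(model_dir: str, expected_shards: int = 64) -> bool:
    """True when config, index, and exactly `expected_shards` shard files are present."""
    root = Path(model_dir)
    if not (root / "config.json").is_file():
        return False
    if not (root / "model.safetensors.index.json").is_file():
        return False
    # Non-recursive match, like `find -maxdepth 1 -name "model-*.safetensors"`.
    shards = list(root.glob("model-*.safetensors"))
    return len(shards) == expected_shards


def choose_model_path(local_path: str, shared_path: str, expected_shards: int = 64) -> str:
    """Prefer the node-local copy; fall back to the shared path if staging is incomplete."""
    if local_checkpoint_complete(local_path, expected_shards):
        return local_path
    return shared_path
```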

Contributor

🔴 The workflow input rename offloading → kv-offloading (and env var OFFLOADING → KV_OFFLOADING) missed three references in the launchers. ${OFFLOADING:-none} now always expands to "none", so runners/launch_b200-dgxc.sh:436 (--mem=0), runners/launch_b200-dgxc.sh:452 (DSv4 SGLang HiCache C≥512 time-limit extension), and runners/launch_b300-nv.sh:453 (--mem=0) are dead. Per the in-code comment, without --mem=0 Slurm caps the exclusive job at ~2 TB and OOM-kills multi-TB DRAM KV runs on the ~4 TB B200/B300 nodes, so every agentic kv-offloading: dram configuration this PR introduces (Mooncake, HiCache, LMCache, native) will silently fail its memory sizing. Fix: rename OFFLOADING → KV_OFFLOADING on all three lines and switch the == "hicache" string check to KV_OFFLOAD_BACKEND == "hicache".

Extended reasoning...

What the bug is

The agentic KV-offload knob was renamed by this PR:

  • Workflow input: offloading → kv-offloading (plus new kv-offload-backend)
  • Derived env var: OFFLOADING → KV_OFFLOADING

The rename landed in .github/workflows/benchmark-tmpl.yml:119, benchmark-multinode-tmpl.yml:156, e2e-tests.yml:187, and run-sweep.yml:491. No file in the repo exports OFFLOADING= anymore. But three references to the old name survived in the runner scripts:

  • runners/launch_b200-dgxc.sh:436: if [[ "${OFFLOADING:-none}" != "none" ]]; then SALLOC_MEMORY_ARGS=(--mem=0); fi
  • runners/launch_b200-dgxc.sh:452: elif [[ ... "${OFFLOADING:-none}" == "hicache" && "$CONC" -ge 512 ]]; then DEFAULT_SALLOC_TIME_LIMIT=300
  • runners/launch_b300-nv.sh:453: if [[ "${OFFLOADING:-none}" != "none" ]]; then SALLOC_MEMORY_ARGS=(--mem=0); fi

Since OFFLOADING is never set by the caller anymore, ${OFFLOADING:-none} always evaluates to "none", and every one of these branches is dead code.

Why the existing code doesn't prevent it

The launchers are invoked directly from the workflow via bash ./runners/launch_${RUNNER_NAME%%_*}.sh (benchmark-tmpl.yml:181). They inherit only the env vars exported by the workflow, and the workflow no longer exports OFFLOADING. Nothing in the launchers themselves sets OFFLOADING either — grep -n 'export OFFLOADING\|^OFFLOADING=' returns nothing under runners/.

Impact

The in-code comment above SALLOC_MEMORY_ARGS=(--mem=0) explains the exact scenario this branch is supposed to protect against: without --mem=0, Slurm caps the exclusive job at ~2 TB despite the node having ~4 TB of physical RAM, and multi-TB DRAM KV pools OOM. This PR introduces the following configs (nvidia-master.yaml + amd-master.yaml) that specifically depend on that memory headroom:

  • dsv4-fp4-b200-vllm-agentic (kv-offloading: dram, kv-offload-backend: mooncake, cluster:b200-dgxc)
  • dsv4-fp4-b300-vllm-agentic (mooncake, cluster:b300-nv)
  • dsv4-fp4-b300-sglang-agentic-hicache (hicache, cluster:b300-nv)
  • dsv4-fp4-b200-sglang-agentic-hicache (hicache, cluster:b200-dgxc)
  • kimik2.5-fp4-b200-vllm-agentic-lmcache (lmcache), kimik2.5-int4-b200-vllm-agentic (native), kimik2.5-fp4-b300-vllm-agentic (native)
  • qwen3.5-fp8-b300-sglang-agentic-hicache, and more.

Every one of these will start without --mem=0 and either OOM at DRAM-pool init time or, for the hicache-C512 branch, run out of Slurm wall time before finishing warmup.

Step-by-step proof

Consider dsv4-fp4-b200-vllm-agentic at TP=8, kv-offloading: dram, kv-offload-backend: mooncake, cluster:b200-dgxc:

  1. Matrix generation emits a job with workflow inputs kv-offloading=dram and kv-offload-backend=mooncake (see .github/workflows/e2e-tests.yml:187-189 and run-sweep.yml:491-493).
  2. benchmark-tmpl.yml:119-120 exports KV_OFFLOADING=dram, KV_OFFLOAD_BACKEND=mooncake into the job environment. OFFLOADING is not exported.
  3. The job invokes bash ./runners/launch_b200-dgxc.sh (benchmark-tmpl.yml:181).
  4. At line 436 the launcher evaluates [[ "${OFFLOADING:-none}" != "none" ]]. Since OFFLOADING is unset, this expands to [[ "none" != "none" ]], which is false. SALLOC_MEMORY_ARGS stays empty.
  5. salloc ... --exclusive "${SALLOC_MEMORY_ARGS[@]}" --time=... --no-shell is called without --mem=0, so Slurm applies its implicit ~2 TB per-job cap.
  6. Inside the container, dsv4_fp4_b200_vllm.sh configures a Mooncake pool of TOTAL_CPU_DRAM_GB (typically ~2,400 GB for TP8 under the new dram-utilization=0.80 policy) and calls mooncake_master.
  7. As the pool fills, the cgroup-enforced 2 TB limit is exceeded and the job is OOM-killed — silently, from the workflow's perspective — even though the machine has ~4 TB of physical RAM.

The hicache/CONC≥512 branch on launch_b200-dgxc.sh:452 fails in the same way: dsv4-fp4-b300-sglang-agentic-hicache at conc≥512 loses its 300-minute salloc extension and expires mid-warmup on the default 180 minutes.

Fix

On all three lines, replace ${OFFLOADING:-none} with ${KV_OFFLOADING:-none}. On launch_b200-dgxc.sh:452, additionally replace the == "hicache" string comparison with KV_OFFLOAD_BACKEND == "hicache" — since KV_OFFLOADING is either none or dram, the specific backend (hicache/mooncake/lmcache/native) now lives in KV_OFFLOAD_BACKEND.
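The dead-branch behavior can be reproduced without Slurm. The sketch below mimics bash's `${VAR:-default}` expansion over a plain dictionary standing in for the job environment. The helper names are hypothetical; the environment values are the ones listed in the step-by-step proof.

```python
def shell_default(env: dict, name: str, default: str) -> str:
    """Mimic bash ${name:-default}: the default applies when unset OR empty."""
    value = env.get(name, "")
    return value if value else default


def salloc_memory_args(env: dict, var_name: str) -> list:
    """Mirror the launcher branch: add --mem=0 only when offloading is enabled."""
    if shell_default(env, var_name, "none") != "none":
        return ["--mem=0"]
    return []


# What the workflow exports after the rename; OFFLOADING is no longer set.
workflow_env = {"KV_OFFLOADING": "dram", "KV_OFFLOAD_BACKEND": "mooncake"}

before_fix = salloc_memory_args(workflow_env, "OFFLOADING")     # stale name: branch is dead
after_fix = salloc_memory_args(workflow_env, "KV_OFFLOADING")   # renamed: branch fires
```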

@cquil11 cquil11 force-pushed the feat/agentx-v1.0 branch from 25e71fe to c8abc2e Compare July 2, 2026 21:38
Comment on lines 316 to +371
@@ -376,3 +344,44 @@ jobs:
        with:
          name: "run-stats"
          path: ${{ env.STATS_FILENAME }}.json

  trigger-agentic-ingest:
    needs:
      [
        test-sweep-agentic,
        test-sweep-multi-node-agentic,
        collect-agentic-results,
        calc-success-rate,
      ]
    if: >-
      always() &&
      github.event_name == 'workflow_dispatch' &&
      needs.collect-agentic-results.result == 'success' &&
      needs.calc-success-rate.result == 'success' &&
      (
        needs.test-sweep-agentic.result == 'success' ||
        needs.test-sweep-multi-node-agentic.result == 'success'
      ) &&
      (
        needs.test-sweep-agentic.result == 'success' ||
        needs.test-sweep-agentic.result == 'skipped'
      ) &&
      (
        needs.test-sweep-multi-node-agentic.result == 'success' ||
        needs.test-sweep-multi-node-agentic.result == 'skipped'
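Read as a predicate, the visible part of this condition behaves roughly as sketched below. The hunk is cut off in this view, so any clauses after the last visible one are not modeled, and the function name is hypothetical.

```python
def should_trigger_ingest(event_name: str, results: dict) -> bool:
    """Evaluate the visible clauses of the trigger-agentic-ingest `if:` expression."""
    single = results["test-sweep-agentic"]
    multi = results["test-sweep-multi-node-agentic"]
    return (
        event_name == "workflow_dispatch"                      # manual runs only
        and results["collect-agentic-results"] == "success"
        and results["calc-success-rate"] == "success"
        and (single == "success" or multi == "success")       # at least one sweep ran
        and single in ("success", "skipped")                  # no partial failures
        and multi in ("success", "skipped")
    )


base = {"collect-agentic-results": "success", "calc-success-rate": "success"}
single_only = {**base, "test-sweep-agentic": "success", "test-sweep-multi-node-agentic": "skipped"}
partial_failure = {**base, "test-sweep-agentic": "failure", "test-sweep-multi-node-agentic": "success"}
both_skipped = {**base, "test-sweep-agentic": "skipped", "test-sweep-multi-node-agentic": "skipped"}
```

The three clauses over the sweep jobs together mean: at least one sweep succeeded, and neither sweep failed or was cancelled.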
cquil11 added 2 commits July 2, 2026 18:02
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Comment thread .github/workflows/e2e-tests.yml Outdated
cquil11 added 6 commits July 2, 2026 19:01
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
Signed-off-by: Cam Quilici <cjquilici@gmail.com>
@cquil11 cquil11 merged commit 303669b into main Jul 3, 2026
5 checks passed
@cquil11 cquil11 deleted the feat/agentx-v1.0 branch July 3, 2026 01:30
@cquil11 cquil11 restored the feat/agentx-v1.0 branch July 3, 2026 01:43