feat: accuracy issuer inherits perf concurrency in online mode (#357) by arekay-nv · Pull Request #379 · mlcommons/endpoints

arekay-nv · 2026-06-27T02:18:58Z

When the performance phase runs the CONCURRENCY load pattern (online), the accuracy phase now mirrors that same fixed concurrency instead of always bursting at MAX_THROUGHPUT, so evaluation exercises the endpoint the same way as the performance run.

All other patterns are unchanged: POISSON and offline MAX_THROUGHPUT perf phases keep the accuracy phase at MAX_THROUGHPUT, since inheriting POISSON would silently rate-limit evaluation to the perf QPS (no accuracy QPS-budgeting yet). The gate is purely load_pattern.type == CONCURRENCY, which the schema already constrains to online mode.

Also logs the accuracy issuer's chosen load mode (pattern + target_concurrency) per accuracy dataset. Adds unit tests for the concurrency-inheritance, POISSON-stays-max-throughput, offline-stays-max-throughput, and logging cases.
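The gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `LoadPattern`, `LoadPatternType`, and `target_concurrency` are named in the description, but the enum values, the dataclass layout, and the helper name `accuracy_load_pattern` are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LoadPatternType(Enum):
    # Member names follow the PR text; the string values are assumed.
    MAX_THROUGHPUT = "max_throughput"
    POISSON = "poisson"
    CONCURRENCY = "concurrency"


@dataclass(frozen=True)
class LoadPattern:
    # Hypothetical stand-in for the project's LoadPattern model.
    type: LoadPatternType
    target_concurrency: Optional[int] = None


def accuracy_load_pattern(perf_lp: LoadPattern) -> LoadPattern:
    """Pick the accuracy-phase pattern from the performance-phase pattern."""
    if perf_lp.type == LoadPatternType.CONCURRENCY:
        # Mirror the fixed concurrency used by the performance phase.
        return LoadPattern(
            type=LoadPatternType.CONCURRENCY,
            target_concurrency=perf_lp.target_concurrency,
        )
    # POISSON and MAX_THROUGHPUT perf phases both keep accuracy at MAX_THROUGHPUT.
    return LoadPattern(type=LoadPatternType.MAX_THROUGHPUT)
```

Under these assumptions, a CONCURRENCY perf pattern with `target_concurrency=8` yields the same pattern for accuracy, while a POISSON perf pattern yields MAX_THROUGHPUT.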

What does this PR do?

Type of change

Bug fix
New feature
Documentation update
Refactor/cleanup

Related issues

Testing

Tests added/updated
All tests pass locally
Manual testing completed

Checklist

Code follows project style
Pre-commit hooks pass
Documentation updated (if needed)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-27T02:19:06Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist

Code Review

This pull request updates the benchmark execution logic so that the accuracy phase mirrors the fixed concurrency of the performance phase when a CONCURRENCY load pattern is used, while continuing to default to MAX_THROUGHPUT for other patterns (such as POISSON). It also adds logging for the accuracy issuer's load mode and includes comprehensive unit tests to verify these behaviors. There are no review comments, and I have no additional feedback to provide.


Signed-off-by: arekay-nv <230885705+arekay-nv@users.noreply.github.com>

nvzhihanj · 2026-07-02T21:17:29Z

+    # the (non-agentic) accuracy datasets — create_load_strategy rejects it —
+    # so it (and a missing perf pattern) falls back to MAX_THROUGHPUT.
+    perf_lp = ctx.rt_settings.load_pattern
+    if perf_lp is None or perf_lp.type == LoadPatternType.AGENTIC_INFERENCE:


@hvagadia is this the intended behavior for agentic workloads? I might be missing something here, but I thought we should use the same load pattern there as well.

hvagadia · Jul 2, 2026

For our standalone accuracy, we will likely have to use much lower concurrency than performance due to Docker overhead. @tianmu-li is working on the PR; he will likely add a separate field to control the accuracy load pattern.
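The fallback condition in the quoted diff can be read as a simple predicate. The sketch below is illustrative only: the `LoadPatternType.AGENTIC_INFERENCE` member and the `perf_lp is None` check come from the diff, while the enum values, the `LoadPattern` dataclass, and the helper name `falls_back_to_max_throughput` are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LoadPatternType(Enum):
    # Member names follow the diff and PR text; the string values are assumed.
    MAX_THROUGHPUT = "max_throughput"
    POISSON = "poisson"
    CONCURRENCY = "concurrency"
    AGENTIC_INFERENCE = "agentic_inference"


@dataclass(frozen=True)
class LoadPattern:
    # Hypothetical stand-in for the project's LoadPattern model.
    type: LoadPatternType
    target_concurrency: Optional[int] = None


def falls_back_to_max_throughput(perf_lp: Optional[LoadPattern]) -> bool:
    """True when the accuracy phase cannot reuse the perf pattern.

    Mirrors the guard in the diff: a missing perf pattern or an agentic
    pattern both fall back to MAX_THROUGHPUT.
    """
    return perf_lp is None or perf_lp.type == LoadPatternType.AGENTIC_INFERENCE
```

A separate accuracy-specific field, as mentioned in the reply, would presumably take precedence over this fallback, but its shape is not defined in this thread.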

nvzhihanj · 2026-07-02T21:19:54Z

-        # and QPS-budgeting support are added.
-        acc_load_pattern: LoadPattern | None = LoadPattern(
-            type=LoadPatternType.MAX_THROUGHPUT
+        if acc_load_pattern.type == LoadPatternType.CONCURRENCY:


It seems LoadPatternType could define a __str__ (and/or __repr__) method so we can print it directly instead of using an if statement here.
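One way to read this suggestion, as a minimal sketch: overriding `__str__` on the enum lets a log line interpolate the pattern type directly. The enum values and the message wording below are assumptions, not the project's actual definitions.

```python
from enum import Enum


class LoadPatternType(Enum):
    # Member names follow the PR text; the string values are assumed.
    MAX_THROUGHPUT = "max_throughput"
    POISSON = "poisson"
    CONCURRENCY = "concurrency"

    def __str__(self) -> str:
        # Print the bare member name rather than "LoadPatternType.CONCURRENCY".
        return self.name


# Hypothetical log message; no branching on the pattern type is needed.
message = f"accuracy issuer load mode: {LoadPatternType.CONCURRENCY}"
```

Note that this only removes the branch for the pattern name; including `target_concurrency` in the message would still need either a conditional or a `__str__` on the pattern object itself.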

arekay-nv requested a review from a team June 27, 2026 02:18

github-actions Bot requested a review from nvzhihanj June 27, 2026 02:19

gemini-code-assist Bot reviewed Jun 27, 2026


arekay-nv added 4 commits June 30, 2026 08:11

Merge branch 'main' into arekay/cherry_pick_accuracy_configs

539fd64

Merge branch 'main' into arekay/cherry_pick_accuracy_configs

f6c4e56

Merge branch 'main' into arekay/cherry_pick_accuracy_configs

d61814d

More fixes.

f540abb

Signed-off-by: arekay-nv <230885705+arekay-nv@users.noreply.github.com>

arekay-nv requested a review from viraatc July 1, 2026 19:23

Merge branch 'main' into arekay/cherry_pick_accuracy_configs

75b9e69

nvzhihanj reviewed Jul 2, 2026



arekay-nv wants to merge 6 commits into main from arekay/cherry_pick_accuracy_configs

3 participants
