feat: Allow passing any arguments to vllm and sglang engines (NVIDIA#368)
Put the arguments in a JSON file:
```
{
"dtype": "half",
"trust_remote_code": true
}
```
Pass it like this:
```
dynamo-run out=sglang ~/llm_models/Llama-3.2-3B-Instruct --extra-engine-args sglang_extra.json
```
Requested here ai-dynamo/dynamo#290 (`dtype`) and here ai-dynamo/dynamo#360 (`trust_remote_code`).
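The JSON file and command line from the commit message can also be produced from a script. The following is a minimal sketch, not part of the commit: the helper names `write_engine_args` and `build_command` are hypothetical, and the only things taken from the source are the `--extra-engine-args` flag, the `out=<engine>` syntax, and the two example keys. It assumes the file must contain a single JSON object, as in the example above.

```python
import json
from pathlib import Path


def write_engine_args(path, args):
    """Write extra engine arguments to a JSON file (hypothetical helper).

    Assumes the engine expects one JSON object mapping argument names
    to values, as in the commit message example.
    """
    if not isinstance(args, dict):
        raise TypeError("engine arguments must be a mapping of name to value")
    out = Path(path).expanduser()
    # json.dumps raises TypeError if a value is not JSON-serializable,
    # so a bad value is caught before anything is written to disk.
    out.write_text(json.dumps(args, indent=2) + "\n", encoding="utf-8")
    return out


def build_command(engine, model_path, args_file):
    """Build the argument list for the command shown in the commit message.

    `~` is expanded here because no shell is involved when the list is
    passed to subprocess.run.
    """
    return [
        "dynamo-run",
        f"out={engine}",
        str(Path(model_path).expanduser()),
        "--extra-engine-args",
        str(args_file),
    ]
```

For example, `write_engine_args("sglang_extra.json", {"dtype": "half", "trust_remote_code": True})` writes the file shown above, and the list returned by `build_command("sglang", "~/llm_models/Llama-3.2-3B-Instruct", "sglang_extra.json")` can be passed to `subprocess.run`. The sketch does not check whether a key is one the engine accepts.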
To pass extra arguments to the vllm engine see *Extra engine arguments* below.
## Python bring-your-own-engine
You can provide your own engine in a Python file. The file must provide a generator with this signature:
@@ -434,3 +438,20 @@ The output looks like this:
The input defaults to `in=text`. The output defaults to the `mistralrs` engine. If that engine is not available, the default is whichever engine you compiled in (depending on `--features`).
## Extra engine arguments
The vllm and sglang backends support passing any argument the engine accepts.