GPUStack Higress Plugins

Higress Proxy-Wasm plugins for GPUStack, providing AI API traffic processing, observability, and enhanced gateway features.

Overview

This repository contains custom Higress Proxy-Wasm plugins designed for GPUStack. They are distributed as a Python package that bundles the pre-compiled Wasm plugins with a built-in HTTP file server for serving them.

Installation

pip install gpustack-higress-plugins

Requirements: Python >= 3.10

Available Plugins

gpustack-token-usage - Collects token usage statistics and injects them into AI API responses. Streaming responses report time to first token, time per output token, and tokens per second; non-streaming responses report tokens per second only. Also supports real client IP injection and path-based filtering.
gpustack-set-header-pre-route - Injects the route name and model name into HTTP request headers before routing, matching requests by configurable path suffixes or prefixes.
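
To make the three streaming metrics concrete, here is a minimal sketch that derives them from timestamps recorded around one streamed response. The formulas are commonly used definitions and are an assumption here, not taken from the plugin source. `StreamTiming` and `streaming_metrics` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamTiming:
    """Timestamps (in seconds) and token count for one streamed response."""
    request_start: float
    first_token_at: float
    last_token_at: float
    output_tokens: int


def streaming_metrics(t: StreamTiming) -> dict:
    """Compute TTFT, TPOT, and tokens/s using assumed, common definitions."""
    # Time to first token: delay between sending the request and the first token.
    ttft = t.first_token_at - t.request_start
    # Time per output token: average gap between tokens after the first one.
    decode_time = t.last_token_at - t.first_token_at
    tpot = decode_time / (t.output_tokens - 1) if t.output_tokens > 1 else 0.0
    # Tokens per second over the whole request duration.
    total = t.last_token_at - t.request_start
    tps = t.output_tokens / total if total > 0 else 0.0
    return {"ttft_s": ttft, "tpot_s": tpot, "tokens_per_second": tps}
```

A non-streaming response has no per-token timestamps, which is consistent with only tokens per second being reported in that case.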

Usage

Start Plugin Server

# Start the built-in HTTP file server
gpustack-plugins start --port 8080

# Or with a custom host
gpustack-plugins start --port 8080 --host 0.0.0.0

The server will be available at http://localhost:8080.

API Endpoints

# Health check
curl http://localhost:8080/

# Download a plugin
curl http://localhost:8080/wasm-plugins/gpustack-token-usage/1.0.0/plugin.wasm -o plugin.wasm

# Get metadata
curl http://localhost:8080/wasm-plugins/gpustack-token-usage/1.0.0/metadata.txt
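
The same endpoints can be called from Python. The sketch below uses only the standard library and assumes the URL layout shown in the curl examples (`/wasm-plugins/<name>/<version>/<file>`). `plugin_url` and `download_plugin` are hypothetical helpers, not part of the package.

```python
import urllib.request


def plugin_url(base_url: str, name: str, version: str,
               filename: str = "plugin.wasm") -> str:
    """Build a download URL following the layout in the curl examples."""
    return f"{base_url.rstrip('/')}/wasm-plugins/{name}/{version}/{filename}"


def download_plugin(base_url: str, name: str, version: str, dest: str) -> str:
    """Fetch plugin.wasm from a running plugin server and write it to dest."""
    url = plugin_url(base_url, name, version)
    with urllib.request.urlopen(url, timeout=10) as resp, open(dest, "wb") as out:
        out.write(resp.read())
    return dest


if __name__ == "__main__":
    # Requires a plugin server listening on localhost:8080.
    download_plugin("http://localhost:8080", "gpustack-token-usage",
                    "1.0.0", "plugin.wasm")
```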

Python API

from fastapi import FastAPI
from gpustack_higress_plugins import create_app, router

# Embed in an existing FastAPI app
app = FastAPI()
app.include_router(router)

# Or create a standalone app
standalone_app = create_app()

Configure Higress WasmPlugin

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: gpustack-token-usage
  namespace: higress-system
spec:
  url: http://plugin-server:8080/wasm-plugins/gpustack-token-usage/1.0.0/plugin.wasm
  defaultConfig:
    realIPToHeader: x-gpustack-real-ip

Development

Prerequisites

Go 1.24+
Python 3.10+
oras (brew install oras), required for fetching remote plugins

Build Plugins

# Install Python dependencies
make dev

# Build all plugins (local + remote, requires oras)
make build

# Build only local Go plugins (no oras required)
make -C extensions build-all

# Build a specific plugin
make -C extensions build PLUGIN_NAME=gpustack-token-usage

If oras is not installed, make build will build local plugins only and print a warning.

Run Tests

# Test Go plugins
make test

# Test a single plugin
make -C extensions test PLUGIN_NAME=gpustack-token-usage

Check Wheel Contents

make verify-whl

make verify-whl reports each expected plugin (from extensions/*/VERSION and remote_plugins.yaml) as ✓ present, ✗ missing, or version mismatch, and checks that manifest.json is included.
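
The kind of check described above can be sketched with the standard `zipfile` module, since a wheel is a zip archive. The in-wheel layout used here (`gpustack_higress_plugins/plugins/<name>/<version>/plugin.wasm`) is an assumption inferred from the server URLs, and `check_wheel` is a hypothetical function, not the script behind `make verify-whl`.

```python
import zipfile


def check_wheel(wheel, expected: dict,
                package: str = "gpustack_higress_plugins") -> dict:
    """Classify each expected plugin as present, missing, or version mismatch.

    `wheel` is a path or file-like object; `expected` maps plugin name to version.
    """
    with zipfile.ZipFile(wheel) as zf:
        names = set(zf.namelist())

    report = {}
    for plugin, version in expected.items():
        prefix = f"{package}/plugins/{plugin}/"  # assumed layout
        if f"{prefix}{version}/plugin.wasm" in names:
            report[plugin] = "present"
        elif any(n.startswith(prefix) and n.endswith("/plugin.wasm")
                 for n in names):
            # The plugin is bundled, but under a different version directory.
            report[plugin] = "version mismatch"
        else:
            report[plugin] = "missing"

    manifest = f"{package}/manifest.json"
    report["manifest.json"] = "present" if manifest in names else "missing"
    return report
```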

Deployment

Kubernetes (recommended)

Deploy the plugin server as a separate service and reference it from WasmPlugin resources:

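# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpustack-higress-plugins
spec:
  # apps/v1 requires a selector that matches the pod template labels;
  # the label value below is illustrative
  selector:
    matchLabels:
      app: gpustack-higress-plugins
  template:
    metadata:
      labels:
        app: gpustack-higress-plugins
    spec:
      containers:
        - name: plugins
          image: gpustack/higress-plugins:latest
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /
              port: 8080
          readinessProbe:
            httpGet:
              path: /
              port: 8080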
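
Both probes issue an HTTP GET against `/`, and Kubernetes treats any status from 200 up to (but not including) 400 as success. The sketch below reproduces that check from Python, which can help when debugging a pod that never becomes ready. `is_healthy` is a hypothetical helper, not part of the package.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET / succeeds with a status a Kubernetes probe accepts."""
    url = base_url.rstrip("/") + "/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 4xx/5xx response (HTTPError).
        return False


if __name__ == "__main__":
    print(is_healthy("http://localhost:8080"))
```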

Docker Image

# Build Docker image
make image

# Build with custom Go proxy
GOPROXY=https://goproxy.cn,direct make image

# Run standalone
docker run -p 8080:8080 gpustack/higress-plugins:latest

Project Structure

gpustack-higress-plugins/
├── extensions/                    # Go plugin source code
│   ├── gpustack-token-usage/
│   │   ├── main.go
│   │   ├── go.mod
│   │   └── VERSION
│   ├── gpustack-set-header-pre-route/
│   ├── remote_plugins.yaml        # Remote OCI plugin config
│   └── Makefile
├── gpustack_higress_plugins/      # Python package
│   ├── __init__.py
│   ├── main.py                    # CLI + FastAPI app factory
│   ├── server.py                  # /wasm-plugins router
│   ├── plugins/                   # Compiled .wasm files (generated)
│   └── manifest.json              # Plugin index (generated)
├── scripts/                       # Build scripts
│   ├── generate_manifest.py
│   ├── generate_metadata.py
│   └── fetch_remote_plugins.py
├── Dockerfile
├── pyproject.toml
└── Makefile

Versioning

Package version follows Semantic Versioning (MAJOR.MINOR.PATCH)
Each plugin has its own version in extensions/<name>/VERSION
Package version is set from the git tag at release time (placeholder 0.0.0 in development)
RC releases (e.g. 0.2.0rc1) are published to TestPyPI; stable releases go to PyPI
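
The release rule in the last item can be expressed as a small check on the version string. This sketch assumes versions look like `MAJOR.MINOR.PATCH` with an optional `rcN` suffix, as in the `0.2.0rc1` example. `release_index` is a hypothetical helper and not part of the release tooling.

```python
import re

# MAJOR.MINOR.PATCH with an optional rcN suffix, e.g. 0.2.0 or 0.2.0rc1.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(rc\d+)?$")


def release_index(version: str) -> str:
    """Return 'testpypi' for RC versions and 'pypi' for stable versions."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    return "testpypi" if match.group(4) else "pypi"
```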

License

Apache License 2.0
