Cake Documentation

Cake is a Rust framework for multimodal distributed inference. It shards models across consumer devices (iOS, Android, macOS, Linux, Windows) to run workloads that would not fit on a single GPU.

Cake is built on Candle and supports CUDA, Metal, Vulkan, and CPU backends.

Installation — Building from source, platform support, acceleration backends
Models — Supported text, image, and voice model architectures
Usage — Downloading models, running inference, Web UI, TUI chat
REST API — OpenAI-compatible endpoints for chat, audio, and image generation
Clustering — Zero-config mDNS discovery, manual topology, model splitting
Image Generation — FLUX and Stable Diffusion image synthesis
Voice Generation — VibeVoice TTS with voice cloning
Docker — Container builds for Linux/NVIDIA
Benchmarks — Performance comparison vs reference implementations
