Commit 7112685

unamedkr and claude committed
Rebrand: TurboQuant.cpp → quant.cpp
Minimal C inference engine. 33K LOC. Zero dependencies.

Rename:
- CLI binary: tq_run → quant
- tools/tq_run.c → tools/quant.c
- All user-facing references updated (README, CONTRIBUTING, Dockerfile, Makefile, scripts, Python bindings)
- Internal API (tq_* prefix) stays unchanged

New README:
- "4x longer context, same hardware" — developer-focused message
- Comparison table vs llama.cpp (33K vs 250K LOC, PPL -3.2% vs +10.6%)
- Honest PPL results with delta compression
- Clean, minimal, ~150 lines

Positioning: quant.cpp is the SQLite of LLM inference — small, readable, hackable. KV cache compression is the killer feature, not the whole identity.

34/34 tests passing. Build clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent e08ce72 commit 7112685
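
The "PPL -3.2% vs +10.6%" figures in the commit message read most naturally as relative perplexity changes against an FP32 baseline, although the message does not spell out the formula. Under that assumption, a minimal C sketch of the calculation looks like this. The function name `ppl_delta_percent` and the sample values in the note below are illustrative and do not come from the repository.

```c
#include <assert.h>

/*
 * Relative perplexity change in percent (assumed definition, not confirmed
 * by the commit): 100 * (compressed - baseline) / baseline.
 * A negative result means the compressed-cache run scored a lower
 * (better) perplexity than the baseline run.
 */
double ppl_delta_percent(double ppl_baseline, double ppl_compressed)
{
    assert(ppl_baseline > 0.0);  /* perplexity is always positive */
    return 100.0 * (ppl_compressed - ppl_baseline) / ppl_baseline;
}
```

For example, a hypothetical baseline perplexity of 10.0 and a compressed-cache perplexity of 9.68 give roughly -3.2, and a compressed-cache perplexity of 11.06 gives roughly +10.6.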

19 files changed

Lines changed: 190 additions & 282 deletions

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -46,7 +46,9 @@ models/.claude/worktrees/
 models/
 
 # Makefile build artifacts
-tq_run
+quant
 tq_convert
 libturboquant.a
 *.o
+tq_run
+tq_run.dSYM/

CLAUDE.md

Lines changed: 5 additions & 4 deletions
@@ -1,9 +1,10 @@
-# TurboQuant.cpp — Agent Development Guide
+# quant.cpp — Agent Development Guide
 
 ## Project Overview
 
-TurboQuant.cpp is a cross-platform C/C++ library for extreme KV cache compression in LLM inference.
-It implements PolarQuant + QJL (TurboQuant) algorithms to achieve 5x KV cache memory reduction at 3-bit with zero quality loss.
+quant.cpp is a minimal C inference engine for local LLM with KV cache compression.
+33K LOC, pure C, zero dependencies. Supports 5 architectures via GGUF.
+Killer feature: delta KV compression — 3-bit keys with PPL -3.2% vs FP32.
 
 ## Architecture
 
@@ -196,5 +197,5 @@ When merging worker results back to main:
 ./harness/run.sh --parallel-only
 
 # Manual team spawn
-clawteam launch harness/team.toml --goal "Build TurboQuant.cpp" --workspace
+clawteam launch harness/team.toml --goal "Build quant.cpp" --workspace
 ```

CMakeLists.txt

Lines changed: 2 additions & 2 deletions
@@ -238,8 +238,8 @@ if(TQ_BUILD_BENCH)
 endif()
 
 # CLI inference tool
-add_executable(tq_run tools/tq_run.c)
-target_link_libraries(tq_run turboquant)
+add_executable(quant tools/quant.c)
+target_link_libraries(quant turboquant)
 
 # Debug comparison tool
 add_executable(debug_compare tools/debug_compare.c)

CONTRIBUTING.md

Lines changed: 8 additions & 8 deletions
@@ -1,12 +1,12 @@
-# Contributing to TurboQuant.cpp
+# Contributing to quant.cpp
 
 Thank you for your interest in contributing! Here's how to get started.
 
 ## Quick Setup
 
 ```bash
-git clone https://github.com/quantumaikr/TurboQuant.cpp
-cd TurboQuant.cpp
+git clone https://github.com/quantumaikr/quant.cpp
+cd quant.cpp
 cmake -B build -DCMAKE_BUILD_TYPE=Debug -DTQ_BUILD_TESTS=ON
 cmake --build build -j$(nproc 2>/dev/null || sysctl -n hw.ncpu)
 ctest --test-dir build --output-on-failure
@@ -15,8 +15,8 @@ ctest --test-dir build --output-on-failure
 Or with Docker:
 
 ```bash
-docker build -t turboquant .
-docker run turboquant models/model.tqm -p "Hello" -k turbo_kv_1b
+docker build -t quant .
+docker run quant models/model.gguf -p "Hello"
 ```
 
 ## Running Tests
@@ -38,7 +38,7 @@ bash score.sh --quality # Quantization quality metrics
 
 ## What to Work On
 
-Check [Issues](https://github.com/quantumaikr/TurboQuant.cpp/issues) for tasks labeled `good first issue` or `help wanted`.
+Check [Issues](https://github.com/quantumaikr/quant.cpp/issues) for tasks labeled `good first issue` or `help wanted`.
 
 **High-impact areas:**
 - New model architectures (Llama, Phi, Gemma)
@@ -50,7 +50,7 @@ Check [Issues](https://github.com/quantumaikr/TurboQuant.cpp/issues) for tasks l
 
 1. Add the model config struct to `include/turboquant/tq_engine.h`
 2. Implement the forward pass in `src/engine/` (one file per architecture)
-3. Register the architecture in `tq_load_model()` in `src/engine/tq_model_loader.c`
+3. Register the architecture in `tq_load_model()` in `src/engine/tq_model.c`
 4. Add a test in `tests/` and an example in `examples/`
 5. Verify with `bash score.sh --quick`
 
@@ -61,7 +61,7 @@ Check [Issues](https://github.com/quantumaikr/TurboQuant.cpp/issues) for tasks l
 3. Implement `quantize`/`dequantize`/`attention` in `src/core/tq_<name>.c`
 4. Register in the dispatch table in `src/core/tq_traits.c`
 5. Add unit tests in `tests/test_<name>.cpp`
-6. Update `tools/tq_run.c` to accept the new type name in `parse_kv_type()`
+6. Update `tools/quant.c` to accept the new type name in `parse_kv_type()`
 
 ## Code Standards
 
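
Step 6 of the quantization-type checklist in the CONTRIBUTING.md hunk asks contributors to make `parse_kv_type()` in `tools/quant.c` accept the new type name. This commit does not show the function's signature or the enum it returns, so the following is only a sketch of one common approach: a name-to-enum lookup table. The enum `kv_type_t`, its members, the table, and the name `my_new_type` are all hypothetical. Only the strings `uniform_4b` and `turbo_kv_1b` appear in the diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum; the real project's type names and values may differ. */
typedef enum {
    KV_TYPE_UNKNOWN = -1,
    KV_TYPE_UNIFORM_4B,
    KV_TYPE_TURBO_KV_1B,
    KV_TYPE_MY_NEW_TYPE      /* hypothetical newly added type */
} kv_type_t;

typedef struct {
    const char *name;        /* string accepted after -k on the command line */
    kv_type_t   type;
} kv_type_entry_t;

/* Adding a type means adding one row here (assumed layout, for illustration). */
static const kv_type_entry_t KV_TYPE_TABLE[] = {
    { "uniform_4b",  KV_TYPE_UNIFORM_4B  },
    { "turbo_kv_1b", KV_TYPE_TURBO_KV_1B },
    { "my_new_type", KV_TYPE_MY_NEW_TYPE },
};

/* Returns the matching enum value, or KV_TYPE_UNKNOWN for NULL or unknown names. */
kv_type_t parse_kv_type(const char *name)
{
    if (name == NULL)
        return KV_TYPE_UNKNOWN;

    size_t count = sizeof KV_TYPE_TABLE / sizeof KV_TYPE_TABLE[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, KV_TYPE_TABLE[i].name) == 0)
            return KV_TYPE_TABLE[i].type;
    }
    return KV_TYPE_UNKNOWN;
}
```

A table keeps the accepted names in one place, so the CLI help text and the parser can be generated from the same rows. An `if`/`else if` chain of `strcmp` calls would work equally well for a handful of types.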

Dockerfile

Lines changed: 5 additions & 5 deletions
@@ -13,8 +13,8 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 && rm -rf /var/lib/apt/lists/*
 
 # Copy project source (see .dockerignore for exclusions)
-COPY . /turboquant
-WORKDIR /turboquant
+COPY . /quant
+WORKDIR /quant
 
 # Build the library, tools, and tests
 RUN cmake -B build \
@@ -26,6 +26,6 @@ RUN cmake -B build \
 # Run the test suite
 RUN ctest --test-dir build --output-on-failure
 
-# Default entrypoint: the tq_run inference CLI
-# Usage: docker run turboquant models/model.tqm -p "Hello" -k turbo_kv_1b
-ENTRYPOINT ["./build/tq_run"]
+# Default entrypoint: the quant inference CLI
+# Usage: docker run quant models/model.gguf -p "Hello"
+ENTRYPOINT ["./build/quant"]

Makefile

Lines changed: 11 additions & 11 deletions
@@ -1,16 +1,16 @@
 # TurboQuant.cpp — Standalone Makefile (no CMake needed)
 #
 # Usage:
-# make # build tq_run + tq_convert
-# make tq_run # inference tool only
+# make # build quant + tq_convert
+# make quant # inference tool only
 # make test # build and run tests (requires Google Test)
 # make clean # remove build artifacts
 #
 # Cross-platform:
 # Linux/gcc: make CC=gcc
 # macOS/clang: make (auto-detects Apple Silicon)
 # macOS+Metal: make METAL=1 (enables Metal GPU backend)
-# Windows/mingw: make CC=x86_64-w64-mingw32-gcc TARGET=tq_run.exe
+# Windows/mingw: make CC=x86_64-w64-mingw32-gcc TARGET=quant.exe
 #
 # Options:
 # DEBUG=1 — debug build (-g -O0 -fsanitize=address)
@@ -87,14 +87,14 @@ endif
 
 .PHONY: all clean test
 
-all: tq_run tq_convert
+all: quant tq_convert
 
 # Static library
 libturboquant.a: $(OBJ_LIB)
 $(AR) rcs $@ $^
 
 # Main tools
-tq_run: tools/tq_run.c libturboquant.a
+quant: tools/quant.c libturboquant.a
 $(CC) $(CFLAGS) -o $@ $< -L. -lturboquant $(LDFLAGS)
 
 tq_convert: tools/tq_convert.c libturboquant.a
@@ -115,12 +115,12 @@ tq_convert: tools/tq_convert.c libturboquant.a
 # Test (lightweight — no Google Test dependency)
 # ============================================================
 
-test: tq_run
+test: quant
 @echo "=== Quick sanity test ==="
 @echo "Building..."
-@echo "Running tq_run --info on test..."
+@echo "Running quant --info on test..."
 @if [ -f model.tqm ]; then \
-./tq_run model.tqm --info && echo "PASS: model loads" || echo "FAIL"; \
+./quant model.tqm --info && echo "PASS: model loads" || echo "FAIL"; \
 else \
 echo "SKIP: no model.tqm found (download a model first)"; \
 fi
@@ -131,7 +131,7 @@ test: tq_run
 # ============================================================
 
 clean:
-rm -f $(OBJ_LIB) $(OBJ_METAL) libturboquant.a tq_run tq_convert
+rm -f $(OBJ_LIB) $(OBJ_METAL) libturboquant.a quant tq_convert
 rm -f src/**/*.o
 
 # ============================================================
@@ -142,8 +142,8 @@ help:
 @echo "TurboQuant.cpp Makefile"
 @echo ""
 @echo "Targets:"
-@echo " make Build tq_run + tq_convert"
-@echo " make tq_run Build inference tool only"
+@echo " make Build quant + tq_convert"
+@echo " make quant Build inference tool only"
 @echo " make clean Remove build artifacts"
 @echo " make test Quick sanity test"
 @echo " make help Show this help"

README.ko.md

Lines changed: 2 additions & 2 deletions
@@ -63,8 +63,8 @@ cmake -B build -DCMAKE_BUILD_TYPE=Release -DTQ_BUILD_TESTS=ON
 cmake --build build -j$(nproc)
 ctest --test-dir build # 33/33 통과
 
-./build/tq_run model.gguf -p "Hello" -k uniform_4b -v q4 # 3.8x 압축
-./build/tq_run model.gguf --ppl input.txt -k uniform_4b -v q4 # PPL 측정
+./build/quant model.gguf -p "Hello" -k uniform_4b -v q4 # 3.8x 압축
+./build/quant model.gguf --ppl input.txt -k uniform_4b -v q4 # PPL 측정
 ```
 
 ---
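
The README.ko.md commands pass `-k uniform_4b -v q4`, presumably 4-bit keys and 4-bit values, and annotate the result as "3.8x 압축" (3.8x compression). As a rough illustration of where ratios of that order come from, the sketch below estimates KV cache size from model shape and bits per element. It is idealized and is not the project's accounting. It ignores per-block metadata such as quantization scales, so it yields exactly 4x for 16-bit versus 4-bit storage. The function name `kv_cache_bytes` and the example model shape in the note below are assumptions, not values from the repository.

```c
#include <stdint.h>

/*
 * Idealized KV cache size in bytes.
 * Keys and values each store n_layers * n_kv_heads * head_dim elements
 * per cached token; key_bits and value_bits are the storage widths per element.
 * Per-block scales and other metadata are deliberately not counted.
 */
uint64_t kv_cache_bytes(uint64_t n_layers, uint64_t n_kv_heads,
                        uint64_t head_dim, uint64_t n_tokens,
                        double key_bits, double value_bits)
{
    uint64_t elems_per_token = n_layers * n_kv_heads * head_dim;
    double total_bits = (double)elems_per_token * (double)n_tokens
                        * (key_bits + value_bits);
    return (uint64_t)(total_bits / 8.0);
}
```

For a hypothetical model with 32 layers, 8 KV heads, and a head dimension of 128, a 4096-token cache takes 536870912 bytes (512 MiB) at 16 bits per element and 134217728 bytes (128 MiB) at 4 bits per element. Real block-quantized formats store extra metadata, so measured ratios are usually somewhat below this ideal.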
