model2vec

High-performance, local text embeddings for Dart and Flutter. A Dart wrapper around model2vec-rs using Rust FFI and Native Assets. Model2Vec creates small, fast, and effective text embeddings by distilling knowledge from large language models into a simple vocabulary-based look-up table.
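
To make the look-up idea concrete, here is a minimal sketch in plain Dart. It is not the package's implementation (that lives in the Rust engine). It assumes the table is a flat, row-major `Float32List`, that token ids are already available, and that the sentence vector is the plain mean of the token rows, which is how Model2Vec-style static models are generally described. `embedByLookup`, `table`, and `dim` are hypothetical names used only for illustration.

```dart
import 'dart:typed_data';

/// Averages the table rows selected by [tokenIds] into one vector.
/// [table] is a hypothetical flat row-major matrix of shape [vocab, dim].
Float32List embedByLookup(Float32List table, int dim, List<int> tokenIds) {
  final out = Float32List(dim);
  if (tokenIds.isEmpty) return out; // No tokens: return the zero vector.
  for (final id in tokenIds) {
    final offset = id * dim; // Start of this token's row in the flat table.
    for (var d = 0; d < dim; d++) {
      out[d] += table[offset + d];
    }
  }
  for (var d = 0; d < dim; d++) {
    out[d] /= tokenIds.length; // Mean over tokens.
  }
  return out;
}
```

There is no neural network forward pass at inference time in this scheme, only row reads and an average, which is why such models are fast.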

Key Features

Extreme Performance: Built on top of a highly optimized Rust engine. Roughly 1.7x faster than the official Python implementation at best, generating a single embedding in a few hundred microseconds.
Compact & Quantized: Recommended models range from about 25MB to about 500MB, and the smaller ones (25MB - 80MB) are well suited to edge computing.
Massive Streaming: Built-in generateEmbeddingStream for processing millions of rows without blocking the Event Loop or overflowing RAM.
Hugging Face Integration: Automatically downloads and caches models directly from the Hugging Face Hub.
Zero-Stutter Async: Transparently runs heavy tokenization and math in background Dart Isolates using Async methods.
Vector Utilities: Ships with high-performance mathematical tools (cosineSimilarity, quantizeToInt8, similaritySearch, etc.).

Recommended Models

Model2Vec provides a variety of pre-trained models optimized for different use cases. These can be loaded directly via their Hugging Face model ID.

Model ID	Language	Distilled From	Params	Dimension	Size
`minishlab/potion-base-32M`	English	bge-base-en-v1.5	32.3M	512	~150MB
`minishlab/potion-multilingual-128M`	Multi	bge-m3	128M	768	~500MB
`minishlab/potion-retrieval-32M`	English	bge-base-en-v1.5	32.3M	512	~150MB
`minishlab/potion-code-16M`	Code	CodeRankEmbed	16M	384	~80MB
`minishlab/potion-base-8M`	English	bge-base-en-v1.5	7.5M	256	~50MB
`minishlab/potion-base-4M`	English	bge-base-en-v1.5	3.7M	128	~30MB
`minishlab/potion-base-2M`	English	bge-base-en-v1.5	1.8M	64	~25MB

Installation

Add model2vec to your pubspec.yaml:

dependencies:
  model2vec: any

Or add it using the command line:

dart pub add model2vec

Requires Dart SDK 3.10.0+ and Rust toolchain 1.86.0+ (to build the native library via Native Assets).

Quick Start

import 'package:model2vec/model2vec.dart';

void main() {
  final m2v = Model2Vec.instance;
  
  // Initialize with a model from Hugging Face
  m2v.initEmbedder('minishlab/potion-base-2M');
  
  // Generate an embedding
  final embedding = m2v.generateEmbedding('Dart FFI is blazingly fast 🚀');
  
  print('Vector dimension: ${m2v.embeddingDimension}');
  print('Vocabulary size: ${m2v.vocabularySize}');
}

Recipes & Patterns

1. Advanced Batch Processing

Process multiple strings at once for maximum hardware utilization. You can control sequence truncation and batch sizes.

final texts = ['Dart', 'Rust', 'Flutter'];

final embeddings = m2v.generateBatchEmbeddings(
  texts,
  maxLength: 256,   // Truncate strings longer than 256 tokens
  batchSize: 1024,  // Internal chunks sent to the FFI layer
);

2. Massive Data Streaming

When reading gigabytes of text from files or databases, loading everything into memory will crash the app. Use the Streaming API to handle data in chunks automatically.

import 'dart:convert';
import 'dart:io';

Future<void> processHugeFile() async {
  final fileStream = File('massive_dataset.txt')
      .openRead()
      .transform(utf8.decoder)
      .transform(const LineSplitter());

  // Converts a Stream<String> into a Stream<Float32List>
  final embeddingStream = m2v.generateEmbeddingStream(
    fileStream,
    batchSize: 500, // Process 500 strings at a time
    useIsolate: true, // Run math in background threads
  );

  await for (final embedding in embeddingStream) {
    saveToDb(embedding); // Placeholder for your own persistence logic
  }
}
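
For intuition about what stream batching involves, the following is a minimal plain-Dart sketch of grouping a stream into fixed-size chunks. It is an illustration under assumptions, not how `generateEmbeddingStream` is implemented; `inBatches` is a hypothetical helper name.

```dart
/// Groups [source] into lists of at most [size] items.
/// An invalid [size] surfaces as a stream error when the stream is listened to.
Stream<List<T>> inBatches<T>(Stream<T> source, int size) async* {
  if (size <= 0) {
    throw ArgumentError.value(size, 'size', 'must be positive');
  }
  var batch = <T>[];
  await for (final item in source) {
    batch.add(item);
    if (batch.length == size) {
      yield batch; // Hand the full batch to the listener.
      batch = <T>[]; // Start a new list so the yielded one is not mutated.
    }
  }
  if (batch.isNotEmpty) yield batch; // Flush the final partial batch.
}
```

Because the generator only pulls more input after the listener has taken the previous batch, the pending data stays bounded by roughly one batch rather than the whole file.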

3. Asynchronous Isolate Execution

Never block the main thread. If you are building a Flutter app, always use the Async variants to perform generation in a background Isolate.

final embedding = await m2v.generateEmbeddingAsync('A very long text...');
final batch = await m2v.generateBatchEmbeddingsAsync(['A', 'B', 'C']);
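
As a rough picture of what running work in a background Isolate means, here is a sketch built on the standard `Isolate.run` API from `dart:isolate`. It does not show the package's internals; `heavyWork` is a hypothetical stand-in for any CPU-bound function.

```dart
import 'dart:isolate';
import 'dart:typed_data';

/// Stand-in for CPU-heavy work; not part of model2vec.
Float32List heavyWork(int n) {
  final out = Float32List(n);
  for (var i = 0; i < n; i++) {
    out[i] = i * 0.5;
  }
  return out;
}

/// Runs [heavyWork] on a short-lived background isolate and
/// completes with its result on the calling isolate.
Future<Float32List> heavyWorkInBackground(int n) {
  return Isolate.run(() => heavyWork(n));
}
```

The calling isolate stays free to render frames or handle input while the computation runs, at the cost of spawning an isolate and transferring the result back.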

4. Vector Math & Quantization

The library ships with Model2VecUtils — a powerful suite of math operations tuned for embeddings.

final query = m2v.generateEmbedding('cat');
final candidates = [
  m2v.generateEmbedding('dog'),
  m2v.generateEmbedding('space'),
];

// 1. Semantic Similarity (Cosine)
final sim = Model2VecUtils.cosineSimilarity(query, candidates[0]);

// 2. Threshold Searching (Find all matches > 80%)
final matches = Model2VecUtils.similaritySearchWithThreshold(
  query, candidates, threshold: 0.8,
);

// 3. Scalar Quantization (Compress Float32 to Int8, using a quarter of the RAM)
final compressed = Model2VecUtils.quantizeToInt8(query);

// 4. Mean Pooling (Average multiple vectors into one)
final sentenceVector = Model2VecUtils.meanPooling(candidates);

// 5. DB Serialization
final base64String = Model2VecUtils.toBase64(query);

API Reference

Core Methods (`Model2Vec` class)

Method / Property	Description
`initEmbedder(path)`	Initializes the model from a Hugging Face repo ID or local path.
`initEmbedderAdvanced(...)`	Advanced initialization with custom `cacheDirectory`, `hfToken`, or `normalize` overrides.
`initEmbedderFromBytes(...)`	Initializes the model directly from raw `Uint8List` bytes (`model.safetensors`, `tokenizer.json`, etc).
`getRecommendedModels()`	Returns a list of officially supported models.
`tokenize(text)`	Runs the model's internal tokenizer and returns a `List<String>`.
`generateEmbedding(text)`	Synchronously generates a `Float32List` embedding vector.
`generateBatchEmbeddings(texts)`	Synchronously generates embeddings for a `List<String>` using Rust SIMD.
`generateEmbeddingAsync(text)`	Asynchronously generates an embedding in a background `Isolate`.
`generateEmbeddingStream(stream)`	Processes a huge `Stream<String>` into a `Stream<Float32List>` in batches.
`embeddingDimension`	Property returning the vector size (e.g., 256, 384, 512).
`vocabularySize`	Property returning the number of tokens in the model's vocabulary.

Math Utilities (`Model2VecUtils` class)

Method	Description
`cosineSimilarity(a, b)`	Calculates cosine similarity (-1.0 to 1.0) between two vectors.
`cosineDistance(a, b)`	Calculates cosine distance (0.0 to 2.0).
`euclideanDistance(a, b)`	Calculates Euclidean (L2) distance.
`similaritySearch(query, docs)`	Returns the indices of the Top-K most similar vectors in a database.
`similaritySearchWithThreshold`	Returns all indices with similarity above a given threshold.
`quantizeToInt8(vector)`	Compresses a `Float32List` into an `Int8List` (4x memory savings).
`normalize(vector)`	Applies L2 normalization to a vector.
`meanPooling(vectors)`	Averages multiple vectors into a single vector.
`toBase64` / `fromBase64`	Serializes/Deserializes a vector to/from a Base64 string for DB storage.
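
For readers who want to see the math behind the similarity helpers, here is a plain-Dart sketch of cosine similarity and a brute-force Top-K search. It is written from the textbook definitions, not from the package source; `cosine` and `topK` are hypothetical names, and returning 0.0 for a zero vector is a convention chosen here.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1.0, 1.0].
double cosine(Float32List a, Float32List b) {
  if (a.length != b.length) throw ArgumentError('length mismatch');
  var dot = 0.0, na = 0.0, nb = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na == 0.0 || nb == 0.0) return 0.0; // Convention for zero vectors.
  return dot / (math.sqrt(na) * math.sqrt(nb));
}

/// Indices of the [k] vectors in [docs] most similar to [query],
/// best match first, found by scoring every document.
List<int> topK(Float32List query, List<Float32List> docs, int k) {
  final scores = [for (final d in docs) cosine(query, d)];
  final order = List<int>.generate(docs.length, (i) => i)
    ..sort((x, y) => scores[y].compareTo(scores[x])); // Descending score.
  return order.take(k).toList();
}
```

Cosine distance is then `1 - cosine(a, b)`, which is why its range is 0.0 to 2.0. When vectors are already L2-normalized, the two square roots can be skipped and the dot product alone gives the same ranking.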

Performance

model2vec uses highly optimized FFI bindings. Single-vector math on embeddings runs natively in Dart, so it incurs no FFI call overhead, while batch generation leverages Rust's SIMD auto-vectorization.

Here is a performance benchmark run on a typical machine (AOT compiled):

Model	Load Time (Cache)	Single Embedding	Batch (32)
`minishlab/potion-base-2M`	~40 ms	372.9 μs	3.85 ms
`minishlab/potion-base-4M`	~40 ms	363.7 μs	4.19 ms
`minishlab/potion-base-8M`	~40 ms	382.1 μs	5.60 ms
`minishlab/potion-base-32M`	~120 ms	452.6 μs	6.79 ms
`minishlab/potion-multilingual-128M`	~1050 ms	416.1 μs	5.38 ms

Note: Initial load times may vary slightly with disk speed. In the table above, a single embedding takes roughly 360-450 μs, and batching 32 strings brings the cost down to roughly 120-210 μs per string.

similaritySearch over 100,000 vectors takes <100ms in pure Dart.
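
To check a figure like this on your own hardware, a timing harness along the following lines may help. It is a sketch under assumptions: it times a plain dot-product scan over random vectors rather than calling `similaritySearch`, and `timeScan` and its parameters are hypothetical.

```dart
import 'dart:math';
import 'dart:typed_data';

/// Times one brute-force dot-product scan over [count] random vectors.
Duration timeScan({int count = 100000, int dim = 64, int seed = 42}) {
  final rng = Random(seed);
  Float32List randomVector() =>
      Float32List.fromList(List.generate(dim, (_) => rng.nextDouble()));
  final docs = List.generate(count, (_) => randomVector());
  final query = randomVector();

  final sw = Stopwatch()..start(); // Time only the scan, not data setup.
  var best = -1;
  var bestScore = double.negativeInfinity;
  for (var i = 0; i < docs.length; i++) {
    final d = docs[i];
    var dot = 0.0;
    for (var j = 0; j < dim; j++) {
      dot += query[j] * d[j];
    }
    if (dot > bestScore) {
      bestScore = dot;
      best = i;
    }
  }
  sw.stop();
  // Consume the result so the scan cannot be treated as dead code.
  if (count > 0 && best < 0) throw StateError('no best match found');
  return sw.elapsed;
}
```

Results depend heavily on whether the code runs JIT or AOT compiled, on the vector dimension, and on the machine, so compare like with like.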

Development & Contributing

The library uses Dart Native Assets, meaning cargo build is invoked automatically when running Dart code.

To manually re-build bindings if you modify the Rust C-API (native/src/lib.rs):

dart run ffigen

To run the test suite:

dart test

License

This project is licensed under the MIT License - see the LICENSE file for details.
