Glyph is a visual-text compression system integrated into AbstractCore. It renders long text into optimized images and processes them with Vision-Language Models (VLMs) to reduce effective token usage for long-document workflows.
Requires `pip install "abstractcore[compression]"` (and `pip install "abstractcore[media]"` if you want PDF/Office text extraction).
Glyph transforms the traditional text-processing paradigm by:
- Converting text to optimized images using precise typography and layout
- Processing images with VLMs instead of processing raw text tokens
- Achieving significant compression while preserving semantic information
- Reducing computational overhead for large documents
- Lower effective token usage for long text (often 3–4x compression; depends on content/model)
- Fewer context overflows for long-document analysis
- Preserved analytical quality via vision-capable models (best-effort; validate for your domain)
- Transparent integration - works seamlessly with existing code
Traditional Approach:
Long Text (1M tokens) → Tokenization → Sequential Processing → Context Overflow
Glyph Approach:
Long Text (1M tokens) → Visual Rendering → Image Processing (250K tokens) → VLM Interpretation
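The headline numbers above are simple arithmetic: at a given compression ratio, the effective visual-token budget is the text-token count divided by that ratio. A minimal sketch (the 4x figure matches the 1M → 250K example; real ratios vary by content and model):

```python
def effective_tokens(text_tokens: int, compression_ratio: float) -> int:
    """Estimate the visual-token budget after Glyph-style compression."""
    return int(text_tokens / compression_ratio)

# A 1M-token document at an assumed 4x ratio fits in ~250K visual tokens.
print(effective_tokens(1_000_000, 4.0))  # → 250000
```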
The Glyph pipeline transforms text through these stages:
- Content Analysis: Determines compression suitability and optimal parameters
- Text Extraction (optional): For PDFs/Office docs, extract text via the Media system when installed
- Visual Rendering: Render text into images via a Pillow-based renderer
- Quality Validation: Best-effort checks + fallback to standard processing when needed
- VLM Processing: Vision models process the compressed visual content
- Caching: Store artifacts for repeated content processing
Optional (experimental): For PDFs, the media pipeline can try a direct PDF→image conversion path (requires pdf2image + system dependencies). When unavailable, it falls back to text extraction.
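The stage ordering above, including the fallback on failed validation, can be sketched with stand-in callables (all names here are illustrative, not AbstractCore's internal API):

```python
def process_with_glyph(text, render, validate, vlm, fallback):
    """Illustrative pipeline: render -> validate -> VLM, with text fallback."""
    image = render(text)            # Visual rendering (Pillow-based in Glyph)
    if not validate(image, text):   # Best-effort quality validation
        return fallback(text)       # Fall back to standard text processing
    return vlm(image)               # Vision model reads the compressed image

# Toy stand-ins to show the control flow:
result = process_with_glyph(
    "some long document",
    render=lambda t: f"<image of {len(t)} chars>",
    validate=lambda img, t: len(t) > 10,  # pretend short docs fail validation
    vlm=lambda img: f"analysis of {img}",
    fallback=lambda t: f"plain-text analysis of {t}",
)
print(result)
```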
Glyph automatically optimizes rendering for each provider:
| Provider | DPI | Font Size | Quality Focus |
|---|---|---|---|
| OpenAI | 72 | 9pt | Dense text, aggressive compression |
| Anthropic | 96 | 10pt | Font clarity, conservative settings |
| Ollama | 72 | 9pt | Balanced approach, auto-cropping |
| LMStudio | 96 | 10pt | Quality-focused rendering |
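The table's rendering defaults can be restated as a plain mapping (illustrative only; the dict name is hypothetical, and actual overrides go through `GlyphConfig(provider_profiles=...)` as shown elsewhere in this guide):

```python
# Per-provider rendering defaults from the table (illustrative restatement)
PROVIDER_DEFAULTS = {
    "openai":    {"dpi": 72, "font_size": 9},   # dense text, aggressive compression
    "anthropic": {"dpi": 96, "font_size": 10},  # font clarity, conservative settings
    "ollama":    {"dpi": 72, "font_size": 9},   # balanced approach, auto-cropping
    "lmstudio":  {"dpi": 96, "font_size": 10},  # quality-focused rendering
}

print(PROVIDER_DEFAULTS["anthropic"])  # → {'dpi': 96, 'font_size': 10}
```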
Glyph compression is applied automatically when:
- Document size exceeds configured thresholds
- Provider supports vision capabilities
- Content type is suitable for compression (text-heavy documents)
- Quality requirements can be met
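These activation conditions amount to a simple all-must-hold predicate. A sketch (the function is hypothetical; the default threshold mirrors the `min_token_threshold=10000` setting documented below):

```python
def should_compress(doc_tokens: int, provider_has_vision: bool,
                    is_text_heavy: bool, min_token_threshold: int = 10_000) -> bool:
    """Glyph-style activation check: every condition must hold."""
    return (
        doc_tokens >= min_token_threshold  # document large enough to benefit
        and provider_has_vision            # model can read images
        and is_text_heavy                  # content suits visual rendering
    )

print(should_compress(50_000, True, True))  # large text doc, vision model
print(should_compress(2_000, True, True))   # below threshold: skip compression
```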
Glyph is seamlessly integrated into AbstractCore's architecture:
```python
# Glyph works transparently through existing media handling
llm = create_llm("ollama", model="llama3.2-vision:11b")
response = llm.generate(
    "Analyze this document",
    media=["large_document.pdf"]  # Automatically compressed if beneficial
)
```

Supported vision providers:
- Ollama: Vision models (llama3.2-vision, qwen2.5vl, granite3.2-vision)
- LMStudio: Local vision models with OpenAI-compatible API
- HuggingFace: Vision-language models via transformers
- OpenAI: GPT-4 Vision models
- Anthropic: Claude 3 Vision models
```python
from abstractcore.compression import GlyphConfig

# Configure compression behavior
glyph_config = GlyphConfig(
    enabled=True,
    global_default="auto",  # "auto", "always", "never"
    quality_threshold=0.95,
    target_compression_ratio=3.0
)

llm = create_llm("ollama", model="qwen2.5vl:7b", glyph_config=glyph_config)
```

```python
from abstractcore import create_llm

# Using Ollama with a vision model
llm = create_llm("ollama", model="llama3.2-vision:11b")

# Analyze a research paper - Glyph compression applied automatically
response = llm.generate(
    "What are the key findings and methodology in this research paper?",
    media=["research_paper.pdf"]
)

print(f"Analysis: {response.content}")
print(f"Processing time: {response.gen_time}ms")
```

```python
from abstractcore import create_llm

# Force compression for testing
llm = create_llm("ollama", model="qwen2.5vl:7b")

response = llm.generate(
    "Summarize the main points of this document",
    media=["long_document.pdf"],
    glyph_compression="always"  # Force Glyph compression
)

# Check if compression was used
if response.metadata and response.metadata.get('compression_used'):
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio', 'N/A')}")
    print(f"Original tokens: {stats.get('original_tokens', 'N/A')}")
    print(f"Compressed tokens: {stats.get('compressed_tokens', 'N/A')}")
```

```python
from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

# Configure for LMStudio with custom settings
glyph_config = GlyphConfig(
    enabled=True,
    provider_profiles={
        "lmstudio": {
            "dpi": 96,
            "font_size": 10,
            "quality_threshold": 0.90
        }
    }
)

# Connect to LMStudio
llm = create_llm(
    "lmstudio",
    model="qwen/qwen3-next-80b",  # Your LMStudio model
    base_url="http://localhost:1234/v1",
    glyph_config=glyph_config
)

# Process complex document
response = llm.generate(
    "Provide a detailed analysis of the figures and tables in this paper",
    media=["academic_paper.pdf"]
)
```

```python
from abstractcore import create_llm

# Using HuggingFace vision models
llm = create_llm(
    "huggingface",
    model="microsoft/Phi-3.5-vision-instruct",  # Example vision model
    device="auto"
)

# Batch processing with compression
documents = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
for doc in documents:
    response = llm.generate(
        "Extract key insights and recommendations",
        media=[doc],
        glyph_compression="auto"  # Let Glyph decide
    )
    print(f"Document: {doc}")
    print(f"Insights: {response.content[:200]}...")
    print("---")
```

```python
from abstractcore.compression import GlyphConfig

config = GlyphConfig(
    enabled=True,                  # Enable/disable Glyph
    global_default="auto",         # "auto", "always", "never"
    quality_threshold=0.95,        # Minimum quality score (0-1)
    min_token_threshold=10000,     # Minimum size to consider compression
    target_compression_ratio=3.0,  # Target compression ratio
    provider_optimization=True,    # Enable provider-specific optimization
    cache_directory="~/.abstractcore/glyph_cache",
    cache_size_gb=1.0,
    cache_ttl_days=7,
)
```

```python
config = GlyphConfig(
    provider_profiles={
        "ollama": {
            "dpi": 150,      # Higher DPI for better quality
            "font_size": 9,  # Smaller font for more content
            "quality_threshold": 0.95
        },
        "lmstudio": {
            "dpi": 96,       # Standard DPI for speed
            "font_size": 10,
            "quality_threshold": 0.90
        },
        "huggingface": {
            "dpi": 120,
            "font_size": 8,
            "quality_threshold": 0.92
        }
    }
)
```

```python
# Per-request compression control
response = llm.generate(
    prompt="Analyze this document",
    media=["document.pdf"],
    glyph_compression="always"  # "always", "never", "auto"
)

# Check compression usage
if hasattr(response, 'metadata') and response.metadata:
    compression_used = response.metadata.get('compression_used', False)
    print(f"Glyph compression used: {compression_used}")
```

Based on your system, here are vision-capable models you can test with:
```python
# Large, high-quality model
llm = create_llm("ollama", model="llama3.2-vision:11b")

# Efficient model for faster processing
llm = create_llm("ollama", model="qwen2.5vl:7b")

# Lightweight model for testing
llm = create_llm("ollama", model="granite3.2-vision:latest")
```

```python
# Connect to your LMStudio instance
llm = create_llm(
    "lmstudio",
    model="your-vision-model",  # Replace with your loaded model
    base_url="http://localhost:1234/v1"
)
```

```python
response = llm.generate("Analyze document", media=["doc.pdf"])

# Check performance metrics
print(f"Generation time: {response.gen_time}ms")
print(f"Token usage: {response.usage}")

# Compression-specific metrics
if response.metadata:
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio')}")
    print(f"Quality score: {stats.get('quality_score')}")
```

```python
import time
from abstractcore import create_llm

def benchmark_compression(document_path, model_name):
    """Compare processing with and without Glyph compression."""
    llm = create_llm("ollama", model=model_name)

    # Without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # With compression
    start = time.time()
    response_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

# Test with your documents
benchmark_compression("large_document.pdf", "llama3.2-vision:11b")
```

Common issues and fixes:

- Compression not activating
  - Ensure you're using a vision-capable model
  - Check that document size exceeds the minimum threshold
  - Verify the `glyph_compression` parameter is not set to `"never"`
- Quality concerns
  - Adjust `quality_threshold` in the configuration
  - Use higher DPI settings for better image quality
  - Test with different font sizes
- Performance issues
  - Lower DPI for faster processing
  - Reduce `target_compression_ratio`
  - Enable caching for repeated documents
Enable verbose logging via AbstractCore's centralized config:

```bash
abstractcore --set-console-log-level DEBUG
# or:
abstractcore --enable-debug-logging
```

Then inspect response metadata for compression decisions:

```python
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
resp = llm.generate("Analyze", media=["doc.pdf"])
print(resp.metadata)
```

Glyph is a best-effort optimization. Compression ratio and accuracy depend on the vision model, rendering settings (DPI/font size), and the content type (prose vs code vs tables).
Treat it as an optional acceleration technique:
- Validate outputs on your workload
- Keep `glyph_compression="auto"` unless you have a strong reason to force it
- Prefer higher DPI / lower compression ratios for quality-critical tasks
✅ Recommended for:
- Documents > 10,000 tokens
- Prose and natural language content
- Technical documentation
- Research papers and reports
- Large configuration files
❌ Not recommended for:
- Mathematical notation (OCR challenges)
- Very dense special characters
- Content < 5,000 tokens
- Real-time chat applications
- OpenAI GPT-4o: Excellent OCR, handles dense text well
- Anthropic Claude: Good OCR, font-sensitive, quality-focused
- Ollama qwen2.5vl: Balanced performance, good for local deployment
- LMStudio: Variable quality, depends on specific model
```python
# High-quality compression for critical applications
config = GlyphConfig(
    quality_threshold=0.98,        # Higher quality requirement
    target_compression_ratio=2.5,  # Conservative compression
    provider_optimization=True     # Use provider-specific settings
)

# Performance-focused compression
config = GlyphConfig(
    quality_threshold=0.90,        # Lower quality for speed
    target_compression_ratio=4.0,  # Aggressive compression
    cache_ttl_days=30,             # Keep artifacts longer for repeated runs
    cache_size_gb=5.0,             # Increase cache size for many documents
)
```

For a comprehensive, runnable example that demonstrates all Glyph features, see:
`examples/glyph_complete_example.py`
This complete example includes:
```python
#!/usr/bin/env python3
"""
Complete Glyph Visual-Text Compression Example

Demonstrates all aspects of Glyph compression with AbstractCore
"""
import time

from abstractcore import create_llm
from abstractcore.compression import GlyphConfig


def basic_glyph_example():
    """Basic usage with automatic compression detection."""
    # Create LLM with vision model
    llm = create_llm("ollama", model="llama3.2-vision:11b")

    # Process document - Glyph decides automatically
    response = llm.generate(
        "Analyze this research paper and summarize key findings.",
        media=["research_paper.pdf"]  # Auto-compressed if beneficial
    )

    # Check if compression was used
    if response.metadata and response.metadata.get('compression_used'):
        stats = response.metadata.get('compression_stats', {})
        print("🎨 Glyph compression used!")
        print(f"Compression ratio: {stats.get('compression_ratio')}")
        print(f"Quality score: {stats.get('quality_score')}")
    return response


def benchmark_comparison():
    """Compare performance with and without compression."""
    llm = create_llm("ollama", model="qwen2.5vl:7b")

    # Test without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # Test with compression
    start = time.time()
    response_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")


def custom_configuration():
    """Advanced configuration for specific use cases."""
    # High-quality configuration
    config = GlyphConfig(
        enabled=True,
        quality_threshold=0.98,        # Very high quality
        target_compression_ratio=2.5,  # Conservative compression
        provider_profiles={
            "ollama": {
                "dpi": 150,      # High DPI for quality
                "font_size": 9,  # Optimal font size
                "quality_threshold": 0.98
            }
        }
    )

    llm = create_llm("ollama", model="granite3.2-vision:latest", glyph_config=config)
    response = llm.generate(
        "Provide detailed analysis with high accuracy requirements",
        media=["critical_document.pdf"]
    )
    return response


# Run the complete example
if __name__ == "__main__":
    print("🎨 Glyph Compression Complete Example")

    # Basic usage
    basic_response = basic_glyph_example()

    # Performance benchmark
    benchmark_comparison()

    # Custom configuration
    custom_response = custom_configuration()

    print("✅ All examples completed successfully!")
```

```bash
# Make sure you have a vision model available
ollama pull llama3.2-vision:11b

# Run the complete example
cd examples
python glyph_complete_example.py
```

The complete example demonstrates:
- Basic automatic compression with intelligent decision-making
- Performance benchmarking comparing compressed vs uncompressed processing
- Custom configuration for different quality/speed requirements
- Multi-provider testing across different vision models
- Error handling and debugging techniques
- Real-world usage patterns with sample documents
- Explore the Vision Capabilities documentation
- Learn about Media Handling System
- Check out Examples for more use cases
- Review Configuration for advanced settings
For implementation details, API specifications, and research background, see:
- Glyph Technical Report - Detailed technical specifications
- Glyph Research Paper - Original research by Z.ai/THU-COAI
Glyph compression represents a paradigm shift in document processing, making large-scale text analysis more efficient while maintaining the quality and accuracy you expect from AbstractCore.