Glyph Visual-Text Compression
Glyph is a visual-text compression system integrated into AbstractCore. It renders long text into optimized images and processes them with Vision-Language Models (VLMs) to reduce effective token usage for long-document workflows.
Table of Contents
- What is Glyph?
- How Glyph Works
- Integration with AbstractCore
- Practical Examples
- Configuration Options
- Available Models for Testing
- Performance Monitoring
- Troubleshooting
- Performance and Quality
- Best Practices
- Complete Working Example
- Next Steps
- Technical Details

Requires pip install "abstractcore[compression]" (and pip install "abstractcore[media]" if you want PDF/Office text extraction).
What is Glyph?
Glyph transforms the traditional text-processing paradigm by:
- Converting text to optimized images using precise typography and layout
- Processing images with VLMs instead of processing raw text tokens
- Achieving significant compression while preserving semantic information
- Reducing computational overhead for large documents
Key Benefits
- Lower effective token usage for long text (often 3–4x compression; depends on content/model)
- Fewer context overflows for long-document analysis
- Preserved analytical quality via vision-capable models (best-effort; validate for your domain)
- Transparent integration - works seamlessly with existing code
How Glyph Works
The Compression Pipeline
Traditional Approach:
Long Text (1M tokens) → Tokenization → Sequential Processing → Context Overflow
Glyph Approach:
Long Text (1M tokens) → Visual Rendering → Image Processing (250K tokens) → VLM Interpretation
The Glyph pipeline transforms text through these stages:
1. Content Analysis: determines compression suitability and optimal parameters
2. Text Extraction (optional): for PDFs and Office documents, text is extracted via the Media system when the media extra is installed
3. Visual Rendering: text is rendered into images via a Pillow-based renderer
4. Quality Validation: best-effort checks, with fallback to standard processing when needed
5. VLM Processing: vision models process the compressed visual content
6. Caching: artifacts are stored for repeated content processing
Optional (experimental): For PDFs, the media pipeline can try a direct PDF→image conversion path (requires pdf2image + system dependencies). When unavailable, it falls back to text extraction.
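To make the flow concrete, here is a minimal sketch of the dispatch logic. All helper names (analyze_content, render_to_images, validate_quality, and so on) are hypothetical placeholders for illustration, not AbstractCore internals:
# Illustrative sketch only - hypothetical helpers, not AbstractCore's actual internals
def process_with_glyph(text: str, provider: str):
    params = analyze_content(text, provider)      # stage 1: suitability + parameters
    if not params.suitable:
        return process_as_text(text)              # fall back to standard processing
    images = render_to_images(text, params)       # stages 2-3: extraction + rendering
    if not validate_quality(images, params):      # stage 4: best-effort validation
        return process_as_text(text)              # fall back on low quality
    cache_store(text, images)                     # stage 6: cache artifacts for reuse
    return vlm_generate(images)                   # stage 5: VLM processes the images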
Provider Optimization
Glyph automatically optimizes rendering for each provider; the defaults below can be overridden per provider via provider_profiles (see Configuration Options):
| Provider | DPI | Font Size | Quality Focus |
|---|---|---|---|
| OpenAI | 72 | 9pt | Dense text, aggressive compression |
| Anthropic | 96 | 10pt | Font clarity, conservative settings |
| Ollama | 72 | 9pt | Balanced approach, auto-cropping |
| LMStudio | 96 | 10pt | Quality-focused rendering |
When Glyph Activates
Glyph compression is applied automatically when:
- Document size exceeds configured thresholds
- Provider supports vision capabilities
- Content type is suitable for compression (text-heavy documents)
- Quality requirements can be met
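A minimal sketch of that decision, using the documented config fields plus hypothetical provider_caps and content_type inputs:
# Rough sketch of the activation check - the real logic lives inside
# AbstractCore's compression module; provider_caps/content_type are hypothetical
def should_activate_glyph(doc_tokens, provider_caps, content_type, config):
    return (
        config.enabled
        and doc_tokens >= config.min_token_threshold               # size threshold
        and provider_caps.get("vision", False)                     # vision-capable model
        and content_type in {"prose", "report", "documentation"}   # text-heavy content
    )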
Integration with AbstractCore
Glyph is seamlessly integrated into AbstractCore's architecture:
Media Processing Pipeline
# Glyph works transparently through existing media handling
from abstractcore import create_llm

llm = create_llm("ollama", model="llama3.2-vision:11b")
response = llm.generate(
    "Analyze this document",
    media=["large_document.pdf"]  # Automatically compressed if beneficial
)
Provider Support
- Ollama: Vision models (llama3.2-vision, qwen2.5vl, granite3.2-vision)
- LMStudio: Local vision models with OpenAI-compatible API
- HuggingFace: Vision-language models via transformers
- OpenAI: GPT-4 Vision models
- Anthropic: Claude 3 Vision models
Configuration System
from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

# Configure compression behavior
glyph_config = GlyphConfig(
    enabled=True,
    global_default="auto",  # "auto", "always", "never"
    quality_threshold=0.95,
    target_compression_ratio=3.0
)
llm = create_llm("ollama", model="qwen2.5vl:7b", glyph_config=glyph_config)
Practical Examples
Example 1: Document Analysis with Ollama
from abstractcore import create_llm

# Using Ollama with a vision model
llm = create_llm("ollama", model="llama3.2-vision:11b")

# Analyze a research paper - Glyph compression applied automatically
response = llm.generate(
    "What are the key findings and methodology in this research paper?",
    media=["research_paper.pdf"]
)
print(f"Analysis: {response.content}")
print(f"Processing time: {response.gen_time}ms")
Example 2: Explicit Compression Control
from abstractcore import create_llm

# Force compression for testing
llm = create_llm("ollama", model="qwen2.5vl:7b")
response = llm.generate(
    "Summarize the main points of this document",
    media=["long_document.pdf"],
    glyph_compression="always"  # Force Glyph compression
)

# Check if compression was used
if response.metadata and response.metadata.get('compression_used'):
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio', 'N/A')}")
    print(f"Original tokens: {stats.get('original_tokens', 'N/A')}")
    print(f"Compressed tokens: {stats.get('compressed_tokens', 'N/A')}")
Example 3: LMStudio Integration
from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

# Configure for LMStudio with custom settings
glyph_config = GlyphConfig(
    enabled=True,
    provider_profiles={
        "lmstudio": {
            "dpi": 96,
            "font_size": 10,
            "quality_threshold": 0.90
        }
    }
)

# Connect to LMStudio
llm = create_llm(
    "lmstudio",
    model="qwen/qwen3-next-80b",  # Your LMStudio model
    base_url="http://localhost:1234/v1",
    glyph_config=glyph_config
)

# Process complex document
response = llm.generate(
    "Provide a detailed analysis of the figures and tables in this paper",
    media=["academic_paper.pdf"]
)
Example 4: HuggingFace Vision Models
from abstractcore import create_llm

# Using HuggingFace vision models
llm = create_llm(
    "huggingface",
    model="microsoft/Phi-3.5-vision-instruct",  # Example vision model
    device="auto"
)

# Batch processing with compression
documents = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
for doc in documents:
    response = llm.generate(
        "Extract key insights and recommendations",
        media=[doc],
        glyph_compression="auto"  # Let Glyph decide
    )
    print(f"Document: {doc}")
    print(f"Insights: {response.content[:200]}...")
    print("---")
Configuration Options
Global Configuration
from abstractcore.compression import GlyphConfig

config = GlyphConfig(
    enabled=True,                  # Enable/disable Glyph
    global_default="auto",         # "auto", "always", "never"
    quality_threshold=0.95,        # Minimum quality score (0-1)
    min_token_threshold=10000,     # Minimum size to consider compression
    target_compression_ratio=3.0,  # Target compression ratio
    provider_optimization=True,    # Enable provider-specific optimization
    cache_directory="~/.abstractcore/glyph_cache",
    cache_size_gb=1.0,
    cache_ttl_days=7,
)
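With caching enabled, repeated runs over the same document can reuse rendered artifacts. A quick, informal way to observe this (assuming the cache directory above is writable; generation itself is not cached, so the difference may be modest):
import time
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
for attempt in ("cold", "warm"):
    start = time.time()
    llm.generate(
        "Summarize this document",
        media=["large_document.pdf"],
        glyph_compression="always"
    )
    print(f"{attempt} run: {time.time() - start:.2f}s")  # warm run can skip re-rendering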
Provider-Specific Optimization
config = GlyphConfig(
    provider_profiles={
        "ollama": {
            "dpi": 150,      # Higher DPI for better quality
            "font_size": 9,  # Smaller font for more content
            "quality_threshold": 0.95
        },
        "lmstudio": {
            "dpi": 96,       # Standard DPI for speed
            "font_size": 10,
            "quality_threshold": 0.90
        },
        "huggingface": {
            "dpi": 120,
            "font_size": 8,
            "quality_threshold": 0.92
        }
    }
)
Runtime Control
# Per-request compression control
response = llm.generate(
    prompt="Analyze this document",
    media=["document.pdf"],
    glyph_compression="always"  # "always", "never", "auto"
)

# Check compression usage
if hasattr(response, 'metadata') and response.metadata:
    compression_used = response.metadata.get('compression_used', False)
    print(f"Glyph compression used: {compression_used}")
Available Models for Testing
The following vision-capable models are good candidates for testing:
Ollama Models (Recommended)
# Large, high-quality model
llm = create_llm("ollama", model="llama3.2-vision:11b")
# Efficient model for faster processing
llm = create_llm("ollama", model="qwen2.5vl:7b")
# Lightweight model for testing
llm = create_llm("ollama", model="granite3.2-vision:latest")
LMStudio (if running)
# Connect to your LMStudio instance
llm = create_llm(
"lmstudio",
model="your-vision-model", # Replace with your loaded model
base_url="http://localhost:1234/v1"
)
Performance Monitoring
Built-in Metrics
response = llm.generate("Analyze document", media=["doc.pdf"])

# Check performance metrics
print(f"Generation time: {response.gen_time}ms")
print(f"Token usage: {response.usage}")

# Compression-specific metrics
if response.metadata:
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio')}")
    print(f"Quality score: {stats.get('quality_score')}")
Benchmarking
import time
from abstractcore import create_llm

def benchmark_compression(document_path, model_name):
    """Compare processing with and without Glyph compression"""
    llm = create_llm("ollama", model=model_name)

    # Without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # With compression
    start = time.time()
    response_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

# Test with your documents
benchmark_compression("large_document.pdf", "llama3.2-vision:11b")
Troubleshooting
Common Issues
- Compression not activating
  - Ensure you're using a vision-capable model
  - Check that the document size exceeds the minimum threshold
  - Verify the glyph_compression parameter is not set to "never"
- Quality concerns
  - Adjust quality_threshold in the configuration
  - Use higher DPI settings for better image quality
  - Test with different font sizes
- Performance issues
  - Lower the DPI for faster processing
  - Reduce target_compression_ratio
  - Enable caching for repeated documents
Debug Mode
Enable verbose logging via AbstractCore’s centralized config:
abstractcore --set-console-log-level DEBUG
# or:
abstractcore --enable-debug-logging
Then inspect response metadata for compression decisions:
from abstractcore import create_llm
llm = create_llm("ollama", model="qwen2.5vl:7b")
resp = llm.generate("Analyze", media=["doc.pdf"])
print(resp.metadata)
Performance and Quality
Glyph is a best-effort optimization. Compression ratio and accuracy depend on the vision model, rendering settings (DPI/font size), and the content type (prose vs code vs tables).
Treat it as an optional acceleration technique:
- Validate outputs on your workload
- Keep glyph_compression="auto" unless you have a strong reason to force it
- Prefer higher DPI and lower compression ratios for quality-critical tasks
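One way to validate on your workload is to log the compression metadata (field names as in the examples above) across a representative document set:
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
for doc in ["contract.pdf", "paper.pdf", "manual.pdf"]:  # your representative documents
    resp = llm.generate("Summarize this document", media=[doc], glyph_compression="auto")
    stats = (resp.metadata or {}).get('compression_stats', {})
    print(doc, stats.get('compression_ratio'), stats.get('quality_score'))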
Best Practices
When to Use Compression
✅ Recommended for:
- Documents > 10,000 tokens
- Prose and natural language content
- Technical documentation
- Research papers and reports
- Large configuration files

❌ Not recommended for:
- Mathematical notation (OCR challenges)
- Very dense special characters
- Content < 5,000 tokens
- Real-time chat applications
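These guidelines can be wrapped in a small per-request policy. The helper below is illustrative; the ~4 characters per token estimate is a rough assumption, not an AbstractCore API:
def glyph_mode_for(text: str, realtime: bool = False) -> str:
    """Pick a glyph_compression value from rough content heuristics (illustrative)."""
    est_tokens = len(text) / 4           # rough assumption: ~4 chars per token
    if realtime or est_tokens < 5_000:   # chat or short content: skip compression
        return "never"
    return "auto"                        # long documents: let Glyph decide

# Usage: response = llm.generate(prompt, media=[doc], glyph_compression=glyph_mode_for(doc_text))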
Provider Selection
- OpenAI GPT-4o: Excellent OCR, handles dense text well
- Anthropic Claude: Good OCR, font-sensitive, quality-focused
- Ollama qwen2.5vl: Balanced performance, good for local deployment
- LMStudio: Variable quality, depends on specific model
Quality Optimization
# High-quality compression for critical applications
config = GlyphConfig(
    quality_threshold=0.98,        # Higher quality requirement
    target_compression_ratio=2.5,  # Conservative compression
    provider_optimization=True     # Use provider-specific settings
)

# Performance-focused compression
config = GlyphConfig(
    quality_threshold=0.90,        # Lower quality bar for speed
    target_compression_ratio=4.0,  # Aggressive compression
    cache_ttl_days=30,             # Keep artifacts longer for repeated runs
    cache_size_gb=5.0,             # Increase cache size for many documents
)
Complete Working Example
For a comprehensive, runnable example that demonstrates all Glyph features, see:
examples/glyph_complete_example.py
This complete example includes:
#!/usr/bin/env python3
"""
Complete Glyph Visual-Text Compression Example

Demonstrates all aspects of Glyph compression with AbstractCore
"""
import time

from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

def basic_glyph_example():
    """Basic usage with automatic compression detection"""
    # Create LLM with vision model
    llm = create_llm("ollama", model="llama3.2-vision:11b")

    # Process document - Glyph decides automatically
    response = llm.generate(
        "Analyze this research paper and summarize key findings.",
        media=["research_paper.pdf"]  # Auto-compressed if beneficial
    )

    # Check if compression was used
    if response.metadata and response.metadata.get('compression_used'):
        stats = response.metadata.get('compression_stats', {})
        print("🎨 Glyph compression used!")
        print(f"Compression ratio: {stats.get('compression_ratio')}")
        print(f"Quality score: {stats.get('quality_score')}")
    return response

def benchmark_comparison():
    """Compare performance with and without compression"""
    llm = create_llm("ollama", model="qwen2.5vl:7b")

    # Test without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # Test with compression
    start = time.time()
    response_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

def custom_configuration():
    """Advanced configuration for specific use cases"""
    # High-quality configuration
    config = GlyphConfig(
        enabled=True,
        quality_threshold=0.98,        # Very high quality
        target_compression_ratio=2.5,  # Conservative compression
        provider_profiles={
            "ollama": {
                "dpi": 150,      # High DPI for quality
                "font_size": 9,  # Optimal font size
                "quality_threshold": 0.98
            }
        }
    )
    llm = create_llm("ollama", model="granite3.2-vision:latest", glyph_config=config)
    response = llm.generate(
        "Provide detailed analysis with high accuracy requirements",
        media=["critical_document.pdf"]
    )
    return response

# Run the complete example
if __name__ == "__main__":
    print("🎨 Glyph Compression Complete Example")

    # Basic usage
    basic_response = basic_glyph_example()

    # Performance benchmark
    benchmark_comparison()

    # Custom configuration
    custom_response = custom_configuration()

    print("✅ All examples completed successfully!")
Running the Complete Example
# Make sure you have a vision model available
ollama pull llama3.2-vision:11b
# Run the complete example
cd examples
python glyph_complete_example.py
The complete example demonstrates:
- Basic automatic compression with intelligent decision-making
- Performance benchmarking comparing compressed vs uncompressed processing
- Custom configuration for different quality/speed requirements
- Multi-provider testing across different vision models
- Error handling and debugging techniques
- Real-world usage patterns with sample documents
Next Steps
- Explore the Vision Capabilities documentation
- Learn about Media Handling System
- Check out Examples for more use cases
- Review Configuration for advanced settings
Technical Details
For implementation details, API specifications, and research background, see:
- Glyph Technical Report - Detailed technical specifications
- Glyph Research Paper - Original research by Z.ai/THU-COAI
Glyph compression shifts the document-processing paradigm, making large-scale text analysis more efficient while aiming to preserve the quality and accuracy you expect from AbstractCore.