Glyph Visual-Text Compression
Glyph is a visual-text compression system integrated into AbstractCore. It renders long text into optimized images and processes them with Vision-Language Models (VLMs) to reduce effective token usage for long-document workflows.
Table of Contents
- What is Glyph?
- How Glyph Works
- Integration with AbstractCore
- Practical Examples
- Configuration Options
- Available Models for Testing
- Performance Monitoring
- Troubleshooting
- Performance and Quality
- Best Practices
- Complete Working Example
- Next Steps
- Technical Details

Requires pip install "abstractcore[compression]" (and pip install "abstractcore[media]" if you want PDF/Office text extraction).
What is Glyph?
Glyph transforms the traditional text-processing paradigm by:
- Converting text to optimized images using precise typography and layout
- Processing images with VLMs instead of processing raw text tokens
- Achieving significant compression while preserving semantic information
- Reducing computational overhead for large documents
Key Benefits
- Lower effective token usage for long text (often 3–4x compression; depends on content/model)
- Fewer context overflows for long-document analysis
- Preserved analytical quality via vision-capable models (best-effort; validate for your domain)
- Transparent integration - works seamlessly with existing code
How Glyph Works
The Compression Pipeline
Traditional Approach:
Long Text (1M tokens) → Tokenization → Sequential Processing → Context Overflow
Glyph Approach:
Long Text (1M tokens) → Visual Rendering → Image Processing (250K tokens) → VLM Interpretation
The Glyph pipeline transforms text through these stages:
1. Content Analysis: determines compression suitability and optimal parameters
2. Text Extraction (optional): for PDFs and Office documents, text is extracted via the Media system when the media extra is installed
3. Visual Rendering: text is rendered into images via a Pillow-based renderer
4. Quality Validation: best-effort checks, with fallback to standard processing when needed
5. VLM Processing: vision models process the compressed visual content
6. Caching: artifacts are stored for repeated content processing
Optional (experimental): For PDFs, the media pipeline can try a direct PDF→image conversion path (requires pdf2image + system dependencies). When unavailable, it falls back to text extraction.
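To make the flow concrete, here is a minimal sketch of the dispatch logic. All helper names (analyze_content, render_to_images, validate_quality, and so on) are hypothetical placeholders for illustration, not AbstractCore internals:
# Illustrative sketch only - hypothetical helpers, not AbstractCore's actual internals
def process_with_glyph(text: str, provider: str):
    params = analyze_content(text, provider)      # stage 1: suitability + parameters
    if not params.suitable:
        return process_as_text(text)              # fall back to standard processing
    images = render_to_images(text, params)       # stages 2-3: extraction + rendering
    if not validate_quality(images, params):      # stage 4: best-effort validation
        return process_as_text(text)              # fall back on low quality
    cache_store(text, images)                     # stage 6: cache artifacts for reuse
    return vlm_generate(images)                   # stage 5: VLM processes the images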
Provider Optimization
Glyph automatically optimizes rendering for each provider; the defaults below can be overridden per provider via provider_profiles (see Configuration Options):
| Provider | DPI | Font Size | Quality Focus |
|---|---|---|---|
| OpenAI | 72 | 9pt | Dense text, aggressive compression |
| Anthropic | 96 | 10pt | Font clarity, conservative settings |
| Ollama | 72 | 9pt | Balanced approach, auto-cropping |
| LMStudio | 96 | 10pt | Quality-focused rendering |
When Glyph Activates
Glyph compression is applied automatically when:
- Document size exceeds configured thresholds
- Provider supports vision capabilities
- Content type is suitable for compression (text-heavy documents)
- Quality requirements can be met
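A minimal sketch of that decision, using the documented config fields plus hypothetical provider_caps and content_type inputs:
# Rough sketch of the activation check - the real logic lives inside
# AbstractCore's compression module; provider_caps/content_type are hypothetical
def should_activate_glyph(doc_tokens, provider_caps, content_type, config):
    return (
        config.enabled
        and doc_tokens >= config.min_token_threshold               # size threshold
        and provider_caps.get("vision", False)                     # vision-capable model
        and content_type in {"prose", "report", "documentation"}   # text-heavy content
    )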
Integration with AbstractCore
Glyph is seamlessly integrated into AbstractCore's architecture:
Media Processing Pipeline
# Glyph works transparently through existing media handling
from abstractcore import create_llm

llm = create_llm("ollama", model="llama3.2-vision:11b")
response = llm.generate(
    "Analyze this document",
    media=["large_document.pdf"]  # Automatically compressed if beneficial
)
Provider Support
- Ollama: Vision models (llama3.2-vision, qwen2.5vl, granite3.2-vision)
- LMStudio: Local vision models with OpenAI-compatible API
- HuggingFace: Vision-language models via transformers
- OpenAI: GPT-4 Vision models
- Anthropic: Claude 3 Vision models
Configuration System
from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

# Configure compression behavior
glyph_config = GlyphConfig(
    enabled=True,
    global_default="auto",  # "auto", "always", "never"
    quality_threshold=0.95,
    target_compression_ratio=3.0
)
llm = create_llm("ollama", model="qwen2.5vl:7b", glyph_config=glyph_config)
Practical Examples
Example 1: Document Analysis with Ollama
from abstractcore import create_llm

# Using Ollama with a vision model
llm = create_llm("ollama", model="llama3.2-vision:11b")

# Analyze a research paper - Glyph compression applied automatically
response = llm.generate(
    "What are the key findings and methodology in this research paper?",
    media=["research_paper.pdf"]
)
print(f"Analysis: {response.content}")
print(f"Processing time: {response.gen_time}ms")
Example 2: Explicit Compression Control
from abstractcore import create_llm

# Force compression for testing
llm = create_llm("ollama", model="qwen2.5vl:7b")
response = llm.generate(
    "Summarize the main points of this document",
    media=["long_document.pdf"],
    glyph_compression="always"  # Force Glyph compression
)

# Check if compression was used
if response.metadata and response.metadata.get('compression_used'):
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio', 'N/A')}")
    print(f"Original tokens: {stats.get('original_tokens', 'N/A')}")
    print(f"Compressed tokens: {stats.get('compressed_tokens', 'N/A')}")
Example 3: LMStudio Integration
from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

# Configure for LMStudio with custom settings
glyph_config = GlyphConfig(
    enabled=True,
    provider_profiles={
        "lmstudio": {
            "dpi": 96,
            "font_size": 10,
            "quality_threshold": 0.90
        }
    }
)

# Connect to LMStudio
llm = create_llm(
    "lmstudio",
    model="qwen/qwen3-next-80b",  # Your LMStudio model
    base_url="http://localhost:1234/v1",
    glyph_config=glyph_config
)

# Process complex document
response = llm.generate(
    "Provide a detailed analysis of the figures and tables in this paper",
    media=["academic_paper.pdf"]
)
Example 4: HuggingFace Vision Models
from abstractcore import create_llm

# Using HuggingFace vision models
llm = create_llm(
    "huggingface",
    model="microsoft/Phi-3.5-vision-instruct",  # Example vision model
    device="auto"
)

# Batch processing with compression
documents = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
for doc in documents:
    response = llm.generate(
        "Extract key insights and recommendations",
        media=[doc],
        glyph_compression="auto"  # Let Glyph decide
    )
    print(f"Document: {doc}")
    print(f"Insights: {response.content[:200]}...")
    print("---")
Configuration Options
Global Configuration
from abstractcore.compression import GlyphConfig

config = GlyphConfig(
    enabled=True,                  # Enable/disable Glyph
    global_default="auto",         # "auto", "always", "never"
    quality_threshold=0.95,        # Minimum quality score (0-1)
    min_token_threshold=10000,     # Minimum size to consider compression
    target_compression_ratio=3.0,  # Target compression ratio
    provider_optimization=True,    # Enable provider-specific optimization
    cache_directory="~/.abstractcore/glyph_cache",
    cache_size_gb=1.0,
    cache_ttl_days=7,
)
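With caching enabled, repeated runs over the same document can reuse rendered artifacts. A quick, informal way to observe this (assuming the cache directory above is writable; generation itself is not cached, so the difference may be modest):
import time
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
for attempt in ("cold", "warm"):
    start = time.time()
    llm.generate(
        "Summarize this document",
        media=["large_document.pdf"],
        glyph_compression="always"
    )
    print(f"{attempt} run: {time.time() - start:.2f}s")  # warm run can skip re-rendering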
Provider-Specific Optimization
config = GlyphConfig(
    provider_profiles={
        "ollama": {
            "dpi": 150,      # Higher DPI for better quality
            "font_size": 9,  # Smaller font for more content
            "quality_threshold": 0.95
        },
        "lmstudio": {
            "dpi": 96,       # Standard DPI for speed
            "font_size": 10,
            "quality_threshold": 0.90
        },
        "huggingface": {
            "dpi": 120,
            "font_size": 8,
            "quality_threshold": 0.92
        }
    }
)
Runtime Control
# Per-request compression control
response = llm.generate(
    prompt="Analyze this document",
    media=["document.pdf"],
    glyph_compression="always"  # "always", "never", "auto"
)

# Check compression usage
if hasattr(response, 'metadata') and response.metadata:
    compression_used = response.metadata.get('compression_used', False)
    print(f"Glyph compression used: {compression_used}")
Available Models for Testing
The following vision-capable models are good candidates for testing:
Ollama Models (Recommended)
# Large, high-quality model
llm = create_llm("ollama", model="llama3.2-vision:11b")
# Efficient model for faster processing
llm = create_llm("ollama", model="qwen2.5vl:7b")
# Lightweight model for testing
llm = create_llm("ollama", model="granite3.2-vision:latest")
LMStudio (if running)
# Connect to your LMStudio instance
llm = create_llm(
"lmstudio",
model="your-vision-model", # Replace with your loaded model
base_url="http://localhost:1234/v1"
)
Performance Monitoring
Built-in Metrics
response = llm.generate("Analyze document", media=["doc.pdf"])

# Check performance metrics
print(f"Generation time: {response.gen_time}ms")
print(f"Token usage: {response.usage}")

# Compression-specific metrics
if response.metadata:
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio')}")
    print(f"Quality score: {stats.get('quality_score')}")
Benchmarking
import time
from abstractcore import create_llm

def benchmark_compression(document_path, model_name):
    """Compare processing with and without Glyph compression"""
    llm = create_llm("ollama", model=model_name)

    # Without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # With compression
    start = time.time()
    response_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

# Test with your documents
benchmark_compression("large_document.pdf", "llama3.2-vision:11b")
Troubleshooting
Common Issues
- Compression not activating
  - Ensure you're using a vision-capable model
  - Check that the document size exceeds the minimum threshold
  - Verify the glyph_compression parameter is not set to "never"
- Quality concerns
  - Adjust quality_threshold in the configuration
  - Use higher DPI settings for better image quality
  - Test with different font sizes
- Performance issues
  - Lower the DPI for faster processing
  - Reduce target_compression_ratio
  - Enable caching for repeated documents
Debug Mode
Enable verbose logging via AbstractCore’s centralized config:
abstractcore --set-console-log-level DEBUG
# or:
abstractcore --enable-debug-logging
Then inspect response metadata for compression decisions:
from abstractcore import create_llm
llm = create_llm("ollama", model="qwen2.5vl:7b")
resp = llm.generate("Analyze", media=["doc.pdf"])
print(resp.metadata)
Performance and Quality
Glyph is a best-effort optimization. Compression ratio and accuracy depend on the vision model, rendering settings (DPI/font size), and the content type (prose vs code vs tables).
Treat it as an optional acceleration technique:
- Validate outputs on your workload
- Keep glyph_compression="auto" unless you have a strong reason to force it
- Prefer higher DPI and lower compression ratios for quality-critical tasks
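One way to validate on your workload is to log the compression metadata (field names as in the examples above) across a representative document set:
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
for doc in ["contract.pdf", "paper.pdf", "manual.pdf"]:  # your representative documents
    resp = llm.generate("Summarize this document", media=[doc], glyph_compression="auto")
    stats = (resp.metadata or {}).get('compression_stats', {})
    print(doc, stats.get('compression_ratio'), stats.get('quality_score'))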
Best Practices
When to Use Compression
✅ Recommended for:
- Documents > 10,000 tokens
- Prose and natural language content
- Technical documentation
- Research papers and reports
- Large configuration files

❌ Not recommended for:
- Mathematical notation (OCR challenges)
- Very dense special characters
- Content < 5,000 tokens
- Real-time chat applications
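These guidelines can be wrapped in a small per-request policy. The helper below is illustrative; the ~4 characters per token estimate is a rough assumption, not an AbstractCore API:
def glyph_mode_for(text: str, realtime: bool = False) -> str:
    """Pick a glyph_compression value from rough content heuristics (illustrative)."""
    est_tokens = len(text) / 4           # rough assumption: ~4 chars per token
    if realtime or est_tokens < 5_000:   # chat or short content: skip compression
        return "never"
    return "auto"                        # long documents: let Glyph decide

# Usage: response = llm.generate(prompt, media=[doc], glyph_compression=glyph_mode_for(doc_text))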
Provider Selection
- OpenAI GPT-4o: Excellent OCR, handles dense text well
- Anthropic Claude: Good OCR, font-sensitive, quality-focused
- Ollama qwen2.5vl: Balanced performance, good for local deployment
- LMStudio: Variable quality, depends on specific model
Quality Optimization
# High-quality compression for critical applications
config = GlyphConfig(
    quality_threshold=0.98,        # Higher quality requirement
    target_compression_ratio=2.5,  # Conservative compression
    provider_optimization=True     # Use provider-specific settings
)

# Performance-focused compression
config = GlyphConfig(
    quality_threshold=0.90,        # Lower quality bar for speed
    target_compression_ratio=4.0,  # Aggressive compression
    cache_ttl_days=30,             # Keep artifacts longer for repeated runs
    cache_size_gb=5.0,             # Increase cache size for many documents
)
Complete Working Example
For a comprehensive, runnable example that demonstrates all Glyph features, see:
examples/glyph_complete_example.py
This complete example includes:
#!/usr/bin/env python3
"""
Complete Glyph Visual-Text Compression Example

Demonstrates all aspects of Glyph compression with AbstractCore
"""
import time

from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

def basic_glyph_example():
    """Basic usage with automatic compression detection"""
    # Create LLM with vision model
    llm = create_llm("ollama", model="llama3.2-vision:11b")

    # Process document - Glyph decides automatically
    response = llm.generate(
        "Analyze this research paper and summarize key findings.",
        media=["research_paper.pdf"]  # Auto-compressed if beneficial
    )

    # Check if compression was used
    if response.metadata and response.metadata.get('compression_used'):
        stats = response.metadata.get('compression_stats', {})
        print("🎨 Glyph compression used!")
        print(f"Compression ratio: {stats.get('compression_ratio')}")
        print(f"Quality score: {stats.get('quality_score')}")
    return response

def benchmark_comparison():
    """Compare performance with and without compression"""
    llm = create_llm("ollama", model="qwen2.5vl:7b")

    # Test without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # Test with compression
    start = time.time()
    response_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

def custom_configuration():
    """Advanced configuration for specific use cases"""
    # High-quality configuration
    config = GlyphConfig(
        enabled=True,
        quality_threshold=0.98,        # Very high quality
        target_compression_ratio=2.5,  # Conservative compression
        provider_profiles={
            "ollama": {
                "dpi": 150,      # High DPI for quality
                "font_size": 9,  # Optimal font size
                "quality_threshold": 0.98
            }
        }
    )
    llm = create_llm("ollama", model="granite3.2-vision:latest", glyph_config=config)
    response = llm.generate(
        "Provide detailed analysis with high accuracy requirements",
        media=["critical_document.pdf"]
    )
    return response

# Run the complete example
if __name__ == "__main__":
    print("🎨 Glyph Compression Complete Example")

    # Basic usage
    basic_response = basic_glyph_example()

    # Performance benchmark
    benchmark_comparison()

    # Custom configuration
    custom_response = custom_configuration()

    print("✅ All examples completed successfully!")
Running the Complete Example
# Make sure you have a vision model available
ollama pull llama3.2-vision:11b
# Run the complete example
cd examples
python glyph_complete_example.py
The complete example demonstrates:
- Basic automatic compression with intelligent decision-making
- Performance benchmarking comparing compressed vs uncompressed processing
- Custom configuration for different quality/speed requirements
- Multi-provider testing across different vision models
- Error handling and debugging techniques
- Real-world usage patterns with sample documents
Next Steps
- Explore the Vision Capabilities documentation
- Learn about Media Handling System
- Check out Examples for more use cases
- Review Configuration for advanced settings
Technical Details
For implementation details, API specifications, and research background, see:
- Glyph Technical Report - Detailed technical specifications
- Glyph Research Paper - Original research by Z.ai/THU-COAI
Glyph compression shifts the document-processing paradigm, making large-scale text analysis more efficient while aiming to preserve the quality and accuracy you expect from AbstractCore.