Glyph Visual-Text Compression

Glyph is a visual-text compression system integrated into AbstractCore. It renders long text into optimized images and processes them with Vision-Language Models (VLMs) to reduce effective token usage for long-document workflows.

Table of Contents

  • What is Glyph?
  • How Glyph Works
  • Integration with AbstractCore
  • Practical Examples
  • Configuration Options
  • Available Models for Testing
  • Performance Monitoring
  • Troubleshooting
  • Performance and Quality
  • Best Practices
  • Complete Working Example
  • Next Steps
  • Technical Details

Requires pip install "abstractcore[compression]" (and pip install "abstractcore[media]" if you want PDF/Office text extraction).

What is Glyph?

Glyph transforms the traditional text-processing paradigm by:

  1. Converting text to optimized images using precise typography and layout (sketched below)
  2. Processing images with VLMs instead of processing raw text tokens
  3. Achieving significant compression while preserving semantic information
  4. Reducing computational overhead for large documents
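
The core rendering step can be sketched with Pillow. This is a minimal illustration only: the font, wrap width, and page dimensions below are assumptions, not the provider-tuned typography Glyph actually uses.

# Minimal sketch of text-to-image rendering with Pillow. Wrap width and
# page size are illustrative assumptions, not Glyph's real settings.
import textwrap
from PIL import Image, ImageDraw, ImageFont

def render_text_page(text: str, width: int = 800) -> Image.Image:
    font = ImageFont.load_default()          # a real renderer loads a tuned TTF
    lines = textwrap.wrap(text, width=120)   # characters per line (assumed)
    line_height = 12
    image = Image.new("RGB", (width, 20 + line_height * max(len(lines), 1)), "white")
    draw = ImageDraw.Draw(image)
    for i, line in enumerate(lines):
        draw.text((10, 10 + i * line_height), line, fill="black", font=font)
    return image

render_text_page(open("long_document.txt").read()).save("page_001.png")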

Key Benefits

  • Lower effective token usage for long text (often 3–4x compression; depends on content/model)
  • Fewer context overflows for long-document analysis
  • Preserved analytical quality via vision-capable models (best-effort; validate for your domain)
  • Transparent integration - works seamlessly with existing code

How Glyph Works

The Compression Pipeline

Traditional Approach:
Long Text (1M tokens) → Tokenization → Sequential Processing → Context Overflow

Glyph Approach:  
Long Text (1M tokens) → Visual Rendering → Image Processing (250K tokens) → VLM Interpretation

The Glyph pipeline transforms text through these stages:

  1. Content Analysis: Determines compression suitability and optimal parameters
  2. Text Extraction (optional): For PDFs/Office docs, extract text via the Media system when installed
  3. Visual Rendering: Render text into images via a Pillow-based renderer
  4. Quality Validation: Best-effort checks + fallback to standard processing when needed
  5. VLM Processing: Vision models process the compressed visual content
  6. Caching: Store artifacts for repeated content processing

Optional (experimental): For PDFs, the media pipeline can try a direct PDF→image conversion path (requires pdf2image + system dependencies). When unavailable, it falls back to text extraction.
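
A hedged sketch of that fallback follows. extract_text() here is a hypothetical stand-in for the Media system's text extractor, and the real pipeline's control flow may differ.

# Sketch of direct PDF-to-image conversion with text-extraction fallback.
def extract_text(pdf_path: str) -> str:
    # Hypothetical stand-in for the Media system's text extractor.
    return ""

def pdf_to_images_or_text(pdf_path: str):
    try:
        from pdf2image import convert_from_path  # needs the poppler system package
    except ImportError:
        return "text", extract_text(pdf_path)
    try:
        return "images", convert_from_path(pdf_path, dpi=96)
    except Exception:  # e.g. pdf2image installed but poppler binaries missing
        return "text", extract_text(pdf_path)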

Provider Optimization

Glyph automatically optimizes rendering for each provider:

Provider     DPI   Font Size   Quality Focus
OpenAI       72    9pt         Dense text, aggressive compression
Anthropic    96    10pt        Font clarity, conservative settings
Ollama       72    9pt         Balanced approach, auto-cropping
LMStudio     96    10pt        Quality-focused rendering

When Glyph Activates

Glyph compression is applied automatically when all of the following hold (a sketch of the rule follows):

  • Document size exceeds configured thresholds
  • Provider supports vision capabilities
  • Content type is suitable for compression (text-heavy documents)
  • Quality requirements can be met
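
As an illustration, the activation rule can be approximated as a simple predicate. The threshold name mirrors GlyphConfig's min_token_threshold; the actual decision logic lives inside AbstractCore and may weigh additional signals.

# Illustrative approximation of the activation rule, not the real implementation.
def should_compress(token_count: int, provider_has_vision: bool,
                    is_text_heavy: bool, min_token_threshold: int = 10000) -> bool:
    return provider_has_vision and is_text_heavy and token_count >= min_token_threshold

should_compress(50_000, provider_has_vision=True, is_text_heavy=True)  # True
should_compress(2_000, provider_has_vision=True, is_text_heavy=True)   # False: below threshold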

Integration with AbstractCore

Glyph is seamlessly integrated into AbstractCore's architecture:

Media Processing Pipeline

# Glyph works transparently through existing media handling
llm = create_llm("ollama", model="llama3.2-vision:11b")
response = llm.generate(
    "Analyze this document",
    media=["large_document.pdf"]  # Automatically compressed if beneficial
)

Provider Support

  • Ollama: Vision models (llama3.2-vision, qwen2.5vl, granite3.2-vision)
  • LMStudio: Local vision models with OpenAI-compatible API
  • HuggingFace: Vision-language models via transformers
  • OpenAI: GPT-4 Vision models
  • Anthropic: Claude 3 Vision models

Configuration System

from abstractcore.compression import GlyphConfig

# Configure compression behavior
glyph_config = GlyphConfig(
    enabled=True,
    global_default="auto",  # "auto", "always", "never"
    quality_threshold=0.95,
    target_compression_ratio=3.0
)

llm = create_llm("ollama", model="qwen2.5vl:7b", glyph_config=glyph_config)

Practical Examples

Example 1: Document Analysis with Ollama

from abstractcore import create_llm

# Using Ollama with a vision model
llm = create_llm("ollama", model="llama3.2-vision:11b")

# Analyze a research paper - Glyph compression applied automatically
response = llm.generate(
    "What are the key findings and methodology in this research paper?",
    media=["research_paper.pdf"]
)

print(f"Analysis: {response.content}")
print(f"Processing time: {response.gen_time}ms")

Example 2: Explicit Compression Control

from abstractcore import create_llm

# Force compression for testing
llm = create_llm("ollama", model="qwen2.5vl:7b")

response = llm.generate(
    "Summarize the main points of this document",
    media=["long_document.pdf"],
    glyph_compression="always"  # Force Glyph compression
)

# Check if compression was used
if response.metadata and response.metadata.get('compression_used'):
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio', 'N/A')}")
    print(f"Original tokens: {stats.get('original_tokens', 'N/A')}")
    print(f"Compressed tokens: {stats.get('compressed_tokens', 'N/A')}")

Example 3: LMStudio Integration

from abstractcore import create_llm
from abstractcore.compression import GlyphConfig

# Configure for LMStudio with custom settings
glyph_config = GlyphConfig(
    enabled=True,
    provider_profiles={
        "lmstudio": {
            "dpi": 96,
            "font_size": 10,
            "quality_threshold": 0.90
        }
    }
)

# Connect to LMStudio
llm = create_llm(
    "lmstudio",
    model="qwen/qwen3-next-80b",  # Your LMStudio model
    base_url="http://localhost:1234/v1",
    glyph_config=glyph_config
)

# Process complex document
response = llm.generate(
    "Provide a detailed analysis of the figures and tables in this paper",
    media=["academic_paper.pdf"]
)

Example 4: HuggingFace Vision Models

from abstractcore import create_llm

# Using HuggingFace vision models
llm = create_llm(
    "huggingface",
    model="microsoft/Phi-3.5-vision-instruct",  # Example vision model
    device="auto"
)

# Batch processing with compression
documents = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]

for doc in documents:
    response = llm.generate(
        "Extract key insights and recommendations",
        media=[doc],
        glyph_compression="auto"  # Let Glyph decide
    )

    print(f"Document: {doc}")
    print(f"Insights: {response.content[:200]}...")
    print("---")

Configuration Options

Global Configuration

from abstractcore.compression import GlyphConfig

config = GlyphConfig(
    enabled=True,                    # Enable/disable Glyph
    global_default="auto",           # "auto", "always", "never"
    quality_threshold=0.95,          # Minimum quality score (0-1)
    min_token_threshold=10000,       # Minimum size to consider compression
    target_compression_ratio=3.0,    # Target compression ratio
    provider_optimization=True,      # Enable provider-specific optimization
    cache_directory="~/.abstractcore/glyph_cache",
    cache_size_gb=1.0,
    cache_ttl_days=7,
)

Provider-Specific Optimization

config = GlyphConfig(
    provider_profiles={
        "ollama": {
            "dpi": 150,              # Higher DPI for better quality
            "font_size": 9,          # Smaller font for more content
            "quality_threshold": 0.95
        },
        "lmstudio": {
            "dpi": 96,               # Standard DPI for speed
            "font_size": 10,
            "quality_threshold": 0.90
        },
        "huggingface": {
            "dpi": 120,
            "font_size": 8,
            "quality_threshold": 0.92
        }
    }
)

Runtime Control

# Per-request compression control
response = llm.generate(
    prompt="Analyze this document",
    media=["document.pdf"],
    glyph_compression="always"    # "always", "never", "auto"
)

# Check compression usage
if hasattr(response, 'metadata') and response.metadata:
    compression_used = response.metadata.get('compression_used', False)
    print(f"Glyph compression used: {compression_used}")

Available Models for Testing

Ollama

Here are vision-capable models you can test with:

# Large, high-quality model
llm = create_llm("ollama", model="llama3.2-vision:11b")

# Efficient model for faster processing
llm = create_llm("ollama", model="qwen2.5vl:7b")

# Lightweight model for testing
llm = create_llm("ollama", model="granite3.2-vision:latest")

LMStudio (if running)

# Connect to your LMStudio instance
llm = create_llm(
    "lmstudio",
    model="your-vision-model",  # Replace with your loaded model
    base_url="http://localhost:1234/v1"
)

Performance Monitoring

Built-in Metrics

response = llm.generate("Analyze document", media=["doc.pdf"])

# Check performance metrics
print(f"Generation time: {response.gen_time}ms")
print(f"Token usage: {response.usage}")

# Compression-specific metrics
if response.metadata:
    stats = response.metadata.get('compression_stats', {})
    print(f"Compression ratio: {stats.get('compression_ratio')}")
    print(f"Quality score: {stats.get('quality_score')}")

Benchmarking

import time
from abstractcore import create_llm

def benchmark_compression(document_path, model_name):
    """Compare processing with and without Glyph compression"""

    llm = create_llm("ollama", model=model_name)

    # Without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # With compression
    start = time.time()
    response_glyph = llm.generate(
        "Summarize this document",
        media=[document_path],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

# Test with your documents
benchmark_compression("large_document.pdf", "llama3.2-vision:11b")

Troubleshooting

Common Issues

  1. Compression not activating
     • Ensure you're using a vision-capable model
     • Check that the document size exceeds the minimum threshold
     • Verify the glyph_compression parameter is not set to "never"

  2. Quality concerns
     • Adjust quality_threshold in the configuration
     • Use higher DPI settings for better image quality
     • Test with different font sizes

  3. Performance issues
     • Lower DPI for faster processing
     • Reduce target_compression_ratio
     • Enable caching for repeated documents

Debug Mode

Enable verbose logging via AbstractCore’s centralized config:

abstractcore --set-console-log-level DEBUG
# or:
abstractcore --enable-debug-logging

Then inspect response metadata for compression decisions:

from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
resp = llm.generate("Analyze", media=["doc.pdf"])
print(resp.metadata)

Performance and Quality

Glyph is a best-effort optimization. Compression ratio and accuracy depend on the vision model, rendering settings (DPI/font size), and the content type (prose vs code vs tables).

Treat it as an optional acceleration technique:

  • Validate outputs on your workload
  • Keep glyph_compression="auto" unless you have a strong reason to force it
  • Prefer higher DPI / lower compression ratios for quality-critical tasks

Best Practices

When to Use Compression

Recommended for:

  • Documents > 10,000 tokens (a routing helper is sketched after these lists)
  • Prose and natural language content
  • Technical documentation
  • Research papers and reports
  • Large configuration files

Not recommended for:

  • Mathematical notation (OCR challenges)
  • Very dense special characters
  • Content < 5,000 tokens
  • Real-time chat applications
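
A simple helper that applies this size guidance when routing requests. The 4-characters-per-token estimate and the cutoff are rough rules of thumb taken from the lists above, not Glyph internals.

from abstractcore import create_llm

def pick_compression_mode(text: str) -> str:
    # ~4 characters per token is a crude but common heuristic
    approx_tokens = len(text) / 4
    return "never" if approx_tokens < 5_000 else "auto"

llm = create_llm("ollama", model="qwen2.5vl:7b")
text = open("report.txt").read()
response = llm.generate(
    "Summarize this report",
    media=["report.txt"],
    glyph_compression=pick_compression_mode(text),
)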

Provider Selection

  • OpenAI GPT-4o: Excellent OCR, handles dense text well
  • Anthropic Claude: Good OCR, font-sensitive, quality-focused
  • Ollama qwen2.5vl: Balanced performance, good for local deployment
  • LMStudio: Variable quality, depends on specific model

Quality Optimization

# High-quality compression for critical applications
config = GlyphConfig(
    quality_threshold=0.98,        # Higher quality requirement
    target_compression_ratio=2.5,  # Conservative compression
    provider_optimization=True     # Use provider-specific settings
)

# Performance-focused compression
config = GlyphConfig(
    quality_threshold=0.90,        # Lower quality for speed
    target_compression_ratio=4.0,  # Aggressive compression
    cache_ttl_days=30,             # Keep artifacts longer for repeated runs
    cache_size_gb=5.0,             # Increase cache size for many documents
)

Complete Working Example

For a comprehensive, runnable example that demonstrates all Glyph features, see:

examples/glyph_complete_example.py

This complete example includes:

#!/usr/bin/env python3
"""
Complete Glyph Visual-Text Compression Example
Demonstrates all aspects of Glyph compression with AbstractCore
"""

from abstractcore import create_llm
from abstractcore.compression import GlyphConfig
import time

def basic_glyph_example():
    """Basic usage with automatic compression detection"""
    # Create LLM with vision model
    llm = create_llm("ollama", model="llama3.2-vision:11b")

    # Process document - Glyph decides automatically
    response = llm.generate(
        "Analyze this research paper and summarize key findings.",
        media=["research_paper.pdf"]  # Auto-compressed if beneficial
    )

    # Check if compression was used
    if response.metadata and response.metadata.get('compression_used'):
        stats = response.metadata.get('compression_stats', {})
        print(f"🎨 Glyph compression used!")
        print(f"Compression ratio: {stats.get('compression_ratio')}")
        print(f"Quality score: {stats.get('quality_score')}")

    return response

def benchmark_comparison():
    """Compare performance with and without compression"""
    llm = create_llm("ollama", model="qwen2.5vl:7b")

    # Test without compression
    start = time.time()
    response_no_glyph = llm.generate(
        "Analyze this document",
        media=["large_document.pdf"],
        glyph_compression="never"
    )
    time_no_glyph = time.time() - start

    # Test with compression
    start = time.time()
    response_glyph = llm.generate(
        "Analyze this document", 
        media=["large_document.pdf"],
        glyph_compression="always"
    )
    time_glyph = time.time() - start

    print(f"Without Glyph: {time_no_glyph:.2f}s")
    print(f"With Glyph: {time_glyph:.2f}s")
    print(f"Speedup: {time_no_glyph/time_glyph:.2f}x")

def custom_configuration():
    """Advanced configuration for specific use cases"""
    # High-quality configuration
    config = GlyphConfig(
        enabled=True,
        quality_threshold=0.98,        # Very high quality
        target_compression_ratio=2.5,  # Conservative compression
        provider_profiles={
            "ollama": {
                "dpi": 150,            # High DPI for quality
                "font_size": 9,        # Optimal font size
                "quality_threshold": 0.98
            }
        }
    )

    llm = create_llm("ollama", model="granite3.2-vision:latest", glyph_config=config)

    response = llm.generate(
        "Provide detailed analysis with high accuracy requirements",
        media=["critical_document.pdf"]
    )

    return response

# Run the complete example
if __name__ == "__main__":
    print("🎨 Glyph Compression Complete Example")

    # Basic usage
    basic_response = basic_glyph_example()

    # Performance benchmark  
    benchmark_comparison()

    # Custom configuration
    custom_response = custom_configuration()

    print("✅ All examples completed successfully!")

Running the Complete Example

# Make sure you have a vision model available
ollama pull llama3.2-vision:11b

# Run the complete example
cd examples
python glyph_complete_example.py

The complete example demonstrates:

  • Basic automatic compression with intelligent decision-making
  • Performance benchmarking comparing compressed vs uncompressed processing
  • Custom configuration for different quality/speed requirements
  • Multi-provider testing across different vision models
  • Error handling and debugging techniques
  • Real-world usage patterns with sample documents

Next Steps

Run the complete example above against your own documents, then consult the technical details below.

Technical Details

For implementation details, API specifications, and research background, see:

  • Glyph Technical Report - detailed technical specifications
  • Glyph Research Paper - original research by Z.ai/THU-COAI


Glyph compression makes large-scale text analysis more efficient by shifting long documents from token streams to compressed visual input, while aiming to preserve the quality and accuracy you expect from AbstractCore.