Media Handling System
AbstractCore provides a production-ready unified media handling system that enables seamless file attachment and processing across all LLM providers and models. The system automatically processes images, documents, and other media files using the same simple API, with intelligent provider-specific formatting and graceful fallback handling.
Table of Contents
- Key Benefits
- Quick Start
- How It Works Behind the Scenes
- Supported File Types
- Provider Compatibility
- Usage Examples
- Advanced Features
- Recommended Practices
- Model-Specific Examples
- Installation
- Troubleshooting
- API Reference
- Next Steps
Key Benefits
- Universal API: The same media=[] parameter works across all providers (OpenAI, Anthropic, Ollama, LMStudio, etc.)
- Intelligent Processing: Automatic file type detection with specialized processors for each format
- Provider Adaptation: Automatic formatting for each provider's API requirements (OpenAI content parts, Anthropic Messages API blocks, plain-text embedding for local models, etc.)
- Robust Fallback: Graceful degradation when advanced processing fails, always providing meaningful results
- CLI Integration: Simple @filename syntax in the CLI for instant file attachment
- Production Quality: Comprehensive error handling, logging, and performance optimization
- Cross-Format Support: Images, PDFs, Office documents, CSV/TSV, and text files all work seamlessly
Quick Start
from abstractcore import create_llm
# Works with any provider - just change the provider name
llm = create_llm("openai", model="gpt-4o", api_key="your-key")
response = llm.generate(
"What's in this image and document?",
media=["photo.jpg", "report.pdf"]
)
print(response.content)
# Same code works with Anthropic
llm = create_llm("anthropic", model="claude-3.5-sonnet", api_key="your-key")
response = llm.generate(
"Analyze these materials",
media=["chart.png", "data.csv", "presentation.ppt"]
)
# Or with local models
llm = create_llm("ollama", model="qwen2.5vl:7b")
response = llm.generate(
"Describe this image",
media=["screenshot.png"]
)
How It Works Behind the Scenes
AbstractCore's media system uses a multi-layer architecture that processes each supported file type and formats the result correctly for each LLM provider:
1. File Attachment Processing
CLI Integration (@filename syntax):
# User types: "Analyze this @report.pdf and @chart.png"
# MessagePreprocessor extracts files and cleans text:
clean_text = "Analyze this and" # File references removed
media_files = ["report.pdf", "chart.png"] # Extracted file paths
Python API:
# Direct media parameter usage
llm.generate("Analyze these files", media=["report.pdf", "chart.png"])
2. Intelligent File Processing Pipeline
AutoMediaHandler Coordination:
# 1. Detect file types automatically
MediaType.IMAGE -> ImageProcessor (PIL-based)
MediaType.DOCUMENT -> PDFProcessor (PyMuPDF4LLM) or OfficeProcessor (Unstructured)
MediaType.TEXT -> TextProcessor (pandas for CSV/TSV)
# 2. Process each file with specialized processor
pdf_content = PDFProcessor.process("report.pdf") # → Markdown text
image_content = ImageProcessor.process("chart.png") # → Base64 + metadata
Graceful Fallback System:
try:
# Advanced processing (PyMuPDF4LLM, Unstructured)
content = advanced_processor.process(file)
except Exception:
# Always falls back to basic processing
content = basic_text_extraction(file) # Never fails
3. Provider-Specific Formatting
The same processed content gets formatted differently for each provider:
OpenAI Format (JSON):
{
"role": "user",
"content": [
{"type": "text", "text": "Analyze these files"},
{"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0..."}},
{"type": "text", "text": "PDF Content: # Report Title\n\nExecutive Summary..."}
]
}
Anthropic Format (Messages API):
{
"role": "user",
"content": [
{"type": "text", "text": "Analyze these files"},
{"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "iVBORw0..."}},
{"type": "text", "text": "PDF Content: # Report Title\n\nExecutive Summary..."}
]
}
Local Models (Text Embedding):
# For local models without native multimodal support
combined_prompt = """
Analyze these files:
Image Analysis: [A business chart showing quarterly revenue trends...]
PDF Content: # Report Title
Executive Summary...
"""
4. Cross-Provider Workflow
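The same prompt and media list run unchanged across providers; only the provider-specific formatting step differs. A minimal sketch (the model names and file names are illustrative):
from abstractcore import create_llm

files = ["report.pdf", "chart.png"]
prompt = "Summarize the report and relate it to the chart"

for provider, model in [("openai", "gpt-4o"),
                        ("anthropic", "claude-3.5-sonnet"),
                        ("ollama", "qwen2.5vl:7b")]:
    llm = create_llm(provider, model=model)
    # Identical call; AbstractCore formats the media for each provider's API.
    response = llm.generate(prompt, media=files)
    print(provider, "->", response.content)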
5. Error Handling & Resilience
Multi-Level Fallback Strategy:
1. Advanced Processing: Try specialized libraries (PyMuPDF4LLM, Unstructured)
2. Basic Processing: Fall back to simple text extraction
3. Metadata Only: If all else fails, provide file metadata
4. Graceful Degradation: Best-effort results with clear errors (no silent semantic changes)
Example of Robust Error Handling:
try:
# Try advanced PDF processing with PyMuPDF4LLM
content = pdf_processor.extract_with_formatting(file)
except PDFProcessingError:
try:
# Fall back to basic text extraction
content = pdf_processor.extract_basic_text(file)
except Exception:
# Ultimate fallback - provide metadata
content = f"PDF file: {file.name} ({file.size} bytes)"
# Result: Callers get a best-effort output or a clear error message (no silent truncation).
Supported File Types
Images (Vision Models)
- Formats: PNG, JPEG, GIF, WEBP, BMP, TIFF
- Automatic: Optimization, resizing, format conversion
- Features: EXIF handling, quality optimization for vision models
Documents
- Text Files: TXT, MD, CSV, TSV, JSON with intelligent parsing and data analysis
- PDF: Text extraction with PyMuPDF4LLM (when installed), with best-effort structure preservation
- Office: DOCX, XLSX, PPTX via Unstructured (when installed), with best-effort extraction
- Word: section/paragraph extraction
- Excel: sheet-by-sheet extraction
- PowerPoint: slide-by-slide extraction
Audio (policy-driven; optional STT fallback)
- Formats: common audio/* types (WAV, MP3, M4A, …) as attachments via media=[...]
- Default behavior: audio_policy="native_only" (fails loudly unless the model supports native audio input)
- Speech-to-text: audio_policy="speech_to_text" runs STT via the capability plugin layer (llm.audio.transcribe(...); typically install abstractvoice) and injects a transcript into the main request (see the sketch after the requirements below)
- Auto: audio_policy="auto" uses native audio when supported, otherwise STT when configured, otherwise errors
- Reserved: audio_policy="caption" is not configured in v0 (must error; non-speech audio analysis needs an explicit capability)
Transparency:
- When STT fallback is used, GenerateResponse.metadata.media_enrichment[] records what was injected and which backend was used.
Requirements:
- Native audio requires an audio-capable model.
- STT fallback requires installing an STT capability plugin (typically pip install abstractvoice) and using audio_policy="auto"/"speech_to_text" (or setting a default via abstractcore --set-audio-strategy ...).
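A minimal sketch of the speech-to-text path, assuming a recording named meeting.mp3 (hypothetical file), the abstractvoice STT plugin installed, and audio_policy accepted as a generate() keyword (a default can also be set via abstractcore --set-audio-strategy ...):
from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o")
response = llm.generate(
    "Summarize the key decisions from this call",
    media=["meeting.mp3"],
    audio_policy="speech_to_text",  # transcribe via the STT plugin, then answer from the transcript
)
print(response.content)
# If the STT fallback ran, metadata.media_enrichment[] records the injected
# transcript and which backend produced it.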
Video (policy-driven; native or frames fallback)
- Formats: common video/* types as attachments via media=[...]
- Default behavior: video_policy="auto" (native video when supported; otherwise sample frames and route through the vision pipeline; see the sketch after the requirements below)
- Budgets: frame count and downscale are explicit and logged (see abstractcore/providers/base.py)
Requirements:
- Frame sampling fallback requires ffmpeg/ffprobe available on PATH.
- For the sampled-frame path, you also need image/vision handling: either a vision-capable main model or configured vision fallback, and (for local frame attachments) pip install "abstractcore[media]" so Pillow-based image processing is available.
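A minimal sketch of the video path, assuming a clip named demo.mp4 (hypothetical file), ffmpeg/ffprobe on PATH for the sampled-frame fallback, and video_policy accepted as a generate() keyword; the model below is vision-capable, so sampled frames can be analyzed directly:
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen2.5vl:7b")
response = llm.generate(
    "Describe what happens in this clip",
    media=["demo.mp4"],
    video_policy="auto",  # native video when supported, otherwise sampled frames via the vision pipeline
)
print(response.content)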
Processing Features
- Intelligent Detection: Automatic file type recognition and processor selection
- Content Optimization: Format-specific processing optimized for LLM consumption
- Robust Fallback: Graceful degradation ensures users always get meaningful results
- Performance Optimized: Lazy loading and efficient memory usage
- Testing status: Coverage varies by provider and modality; see the test suite under tests/media_handling/
Token Estimation & No Truncation Policy
AbstractCore processors do not silently truncate content. This design decision ensures:
- No data loss: Full file content is always preserved
- User control: Callers decide how to handle large files (summarize, chunk, error)
- Model flexibility: Works correctly across models with different context limits (8K to 200K+)
Token estimation is automatically added to MediaContent.metadata:
result = processor.process_file("data.csv")
print(result.media_content.metadata['estimated_tokens']) # e.g., 1500
print(result.media_content.metadata['content_length']) # e.g., 6000 chars
Handlers use this for validation:
handler = OpenAIMediaHandler()
tokens = handler.estimate_tokens_for_media(media_content)
# Uses metadata['estimated_tokens'] if available, falls back to heuristic
For large files that exceed model context limits, use BasicSummarizer or implement custom chunking at the application layer.
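A minimal application-layer sketch, using only the process_file API and the estimated_tokens metadata shown above; the 8,000-token budget and character-based chunk size are illustrative assumptions, not library defaults:
from abstractcore import create_llm
from abstractcore.media import process_file

llm = create_llm("openai", model="gpt-4o")
result = process_file("large_report.pdf")  # hypothetical file

if result.success and result.media_content.metadata.get("estimated_tokens", 0) > 8000:
    # Too large for the target context: summarize chunk by chunk, then merge.
    text = result.media_content.content
    chunk_size = 24_000  # characters per chunk (very roughly 6K tokens)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    partials = [llm.generate("Summarize this excerpt:\n\n" + c).content for c in chunks]
    response = llm.generate("Merge these partial summaries into one:\n\n" + "\n\n".join(partials))
else:
    # Small enough: attach the file directly.
    response = llm.generate("Summarize this document", media=["large_report.pdf"])

print(response.content)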
Provider Compatibility
Vision-Enabled Providers
| Provider | Vision Models | Image Support | Document Support |
|---|---|---|---|
| OpenAI | GPT-4o, GPT-4 Turbo with Vision | Supported: Multi-image | Supported: All formats |
| Anthropic | Claude 3.5 Sonnet, Claude 4 series | Supported: Up to 20 images | Supported: All formats |
| Ollama | qwen2.5vl:7b, gemma3:4b, llama3.2-vision:11b | Supported: Single image | Supported: All formats |
| LMStudio | qwen2.5-vl-7b, gemma-3n-e4b, magistral-small-2509 | Supported: Multiple images | Supported: All formats |
Text-Only Providers
All providers support document processing even without vision capabilities:
| Provider | Document Processing | Text Extraction |
|---|---|---|
| HuggingFace | Supported: All formats | Supported: Embedded in prompt |
| MLX | Supported: All formats | Supported: Embedded in prompt |
| Any Provider | Supported: Automatic fallback | Supported: Text extraction |
⚠️ Model Compatibility Notes (Updated: 2025-10-17)
Some newer vision models may not be immediately available due to rapid development:
LMStudio Limitations:
- qwen3-vl models (8B, 30B) - Not yet supported in LMStudio
- Use qwen2.5-vl-7b as a proven alternative
HuggingFace Limitations:
- Qwen3-VL models - Require newer transformers architecture
- Install latest transformers: pip install --upgrade transformers
- Or use bleeding edge: pip install git+https://github.com/huggingface/transformers.git
Recommended Stable Models (2025-10-17):
- LMStudio: qwen/qwen2.5-vl-7b, google/gemma-3n-e4b, mistralai/magistral-small-2509
- Ollama: qwen2.5vl:7b, gemma3:4b, llama3.2-vision:11b
- OpenAI: gpt-4o, gpt-4-turbo-with-vision
- Anthropic: claude-3.5-sonnet, claude-4-series
Usage Examples
Vision Analysis
from abstractcore import create_llm
# Analyze images with any vision model
llm = create_llm("openai", model="gpt-4o")
# Single image analysis
response = llm.generate(
"What's happening in this image?",
media=["photo.jpg"]
)
# Multiple images comparison
response = llm.generate(
"Compare these two charts and explain the trends",
media=["chart1.png", "chart2.png"]
)
# Mixed media analysis
response = llm.generate(
"Summarize the report and relate it to what you see in the image",
media=["financial_report.pdf", "stock_chart.png"]
)
Document Processing
# PDF analysis
response = llm.generate(
"Summarize the key findings from this research paper",
media=["research_paper.pdf"]
)
# Office document processing
response = llm.generate(
"Create a summary of this presentation and spreadsheet",
media=["quarterly_results.ppt", "financial_data.xlsx"]
)
# CSV data analysis
response = llm.generate(
"What patterns do you see in this sales data?",
media=["sales_data.csv"]
)
CLI Usage
These examples work in the AbstractCore CLI when abstractcore[media] is installed and your selected provider/model supports the requested media (or you have configured fallbacks):
# PDF Analysis - Working
python -m abstractcore.utils.cli --prompt "What is this document about? @report.pdf"
# Office Documents - Working
python -m abstractcore.utils.cli --prompt "Summarize this presentation @slides.pptx"
python -m abstractcore.utils.cli --prompt "What data is in @spreadsheet.xlsx"
python -m abstractcore.utils.cli --prompt "Analyze this document @contract.docx"
# Data Files - Working
python -m abstractcore.utils.cli --prompt "What patterns are in @sales_data.csv"
python -m abstractcore.utils.cli --prompt "Analyze this data @metrics.tsv"
# Images - Working
python -m abstractcore.utils.cli --prompt "What's in this image? @screenshot.png"
# Mixed Media - Working
python -m abstractcore.utils.cli --prompt "Compare @chart.png and @data.csv and explain trends"
Cross-provider semantics (what’s consistent)
# AbstractCore exposes a single `media=[...]` parameter across providers, but behavior
# depends on provider/model capabilities and your media policies.
# Documents (PDF/Office/text/CSV/TSV/...) are extracted to text/metadata and injected into the request.
# This generally works across providers because the final payload is text.
media_files = ["report.pdf", "data.xlsx"]
prompt = "Analyze these documents and provide insights"
# OpenAI
openai_llm = create_llm("openai", model="gpt-4o")
openai_response = openai_llm.generate(prompt, media=media_files)
# Anthropic
anthropic_llm = create_llm("anthropic", model="claude-haiku-4-5")
anthropic_response = anthropic_llm.generate(prompt, media=media_files)
# Image/audio/video inputs are policy-driven and require native support or explicit fallbacks.
# See: docs/vision-capabilities.md and docs/media-handling-system.md (policies + fallbacks).
Streaming with Media
# Real-time streaming responses with media
llm = create_llm("openai", model="gpt-4o") # requires: pip install "abstractcore[openai]"
for chunk in llm.generate(
"Describe this image in detail",
media=["complex_diagram.png"],
stream=True
):
print(chunk.content or "", end="", flush=True)
Advanced Features
Maximum Resolution Optimization (NEW)
AbstractCore automatically optimizes image resolution for each model's maximum capability, ensuring optimal vision results:
from abstractcore import create_llm
# Images are automatically optimized for each model's maximum resolution
llm = create_llm("openai", model="gpt-4o")
response = llm.generate(
"Analyze this image in detail",
media=["photo.jpg"] # Auto-resized to 4096x4096 for GPT-4o
)
# Different model, different optimization
llm = create_llm("ollama", model="qwen2.5vl:7b")
response = llm.generate(
"What's in this image?",
media=["photo.jpg"] # Auto-resized to 3584x3584 for qwen2.5vl
)
Model-Specific Resolution Limits:
- GPT-4o: Up to 4096x4096 pixels
- Claude 3.5 Sonnet: Up to 1568x1568 pixels
- qwen2.5vl:7b: Up to 3584x3584 pixels
- gemma3:4b: Up to 896x896 pixels
- llama3.2-vision:11b: Up to 560x560 pixels
Benefits:
- Better Accuracy: Higher resolution means more detail for the model to analyze
- Automatic: No manual configuration required
- Provider-Aware: Adapts to each provider's optimal settings
- Quality Optimization: Increased JPEG quality (90%) to preserve detail during compression
Capability Detection
The system automatically detects model capabilities and adapts accordingly:
from abstractcore.media.capabilities import is_vision_model, supports_images
# Check if a model supports vision
if is_vision_model("gpt-4o"):
print("This model can process images")
if supports_images("claude-3.5-sonnet"):
print("This model supports image analysis")
# Text-only model + image input is policy-driven
llm = create_llm("openai", model="gpt-4") # text-only example
response = llm.generate(
"Analyze this image",
media=["photo.jpg"], # Errors unless vision fallback is configured; see below.
)
Vision fallback (optional; config-driven)
AbstractCore includes an optional vision fallback that enables text-only models to process images using a transparent two-stage pipeline (caption → inject short observations).
How Vision Fallback Works
When vision fallback is configured and you use a text-only model with images, AbstractCore:
- Detects Model Limitations: Identifies when a text-only model receives an image
- Uses Vision Fallback: Employs a configured vision model to analyze the image
- Provides Description: Passes the image description to the text-only model
- Returns Results: Your text model answers using the injected observations (recorded in metadata.media_enrichment[])
Example
Configure a vision captioner once:
abstractcore --set-vision-provider lmstudio qwen/qwen3-vl-4b
Then use any text model with images:
from abstractcore import create_llm
llm = create_llm("lmstudio", model="qwen/qwen3-next-80b") # text-only
resp = llm.generate("What's in this image?", media=["whale_photo.jpg"])
print(resp.content)
Behind the Scenes
What actually happens (transparent to the user):
1. Stage 1: The configured vision captioner (qwen/qwen3-vl-4b in this example) analyzes whale_photo.jpg → detailed description
2. Stage 2: qwen/qwen3-next-80b (text-only) processes the description + user question → final analysis
Configuration Commands
# Check current status
abstractcore --status
# Download local caption models (optional)
abstractcore --download-vision-model # BLIP base (990MB)
abstractcore --download-vision-model vit-gpt2 # ViT-GPT2 (500MB, CPU-friendly)
abstractcore --download-vision-model git-base # GIT base (400MB, smallest)
# Use an existing vision-capable model as the fallback captioner
abstractcore --set-vision-provider ollama qwen2.5vl:7b
abstractcore --set-vision-provider lmstudio qwen/qwen3-vl-4b
abstractcore --set-vision-provider openai gpt-4o
abstractcore --set-vision-provider anthropic claude-sonnet-4-5
# Interactive setup
abstractcore --config
# Advanced: Fallback chains
abstractcore --add-vision-fallback ollama qwen2.5vl:7b
abstractcore --add-vision-fallback openai gpt-4o
Benefits of Vision Fallback
- Universal Compatibility: Any text-only model can now process images
- Cost Optimization: Use cheaper text models for reasoning, vision models only for description
- Transparent Operation: Users don't need to change their code
- Flexible Configuration: Local models, cloud APIs, or hybrid setups
- Offline-First: Works without internet after downloading local models
- Automatic Fallback: Graceful degradation when vision not configured
Supported Vision Models
Local Models (Downloaded):
- BLIP Base: 990MB, high quality, CPU/GPU compatible
- ViT-GPT2: 500MB, CPU-friendly, good performance
- GIT Base: 400MB, smallest size, basic quality
Provider Models:
- Ollama: qwen2.5vl:7b, llama3.2-vision:11b, gemma3:4b
- LMStudio: qwen/qwen2.5-vl-7b, google/gemma-3n-e4b
- OpenAI: gpt-4o, gpt-4-turbo-with-vision
- Anthropic: claude-3.5-sonnet, claude-4-series
Custom Processing Options
# Advanced image processing
from abstractcore.media.processors import ImageProcessor
processor = ImageProcessor(
optimize_for_vision=True,
max_dimension=1024,
quality=85
)
# Advanced PDF processing
from abstractcore.media.processors import PDFProcessor
pdf_processor = PDFProcessor(
extract_images=True,
markdown_output=True,
preserve_tables=True
)
Direct Media Processing
# Process files directly (without LLM)
from abstractcore.media import process_file
# Process any supported file
result = process_file("document.pdf")
if result.success:
print(f"Content: {result.media_content.content}")
print(f"Type: {result.media_content.media_type}")
print(f"Metadata: {result.media_content.metadata}")
Recommended Practices
File Size and Limits
# Check model-specific limits
from abstractcore.media.capabilities import get_media_capabilities
caps = get_media_capabilities("gpt-4o")
print(f"Max images per message: {caps.max_images}")
print(f"Supported formats: {caps.supported_formats}")
Error Handling
try:
response = llm.generate(
"Analyze this file",
media=["large_document.pdf"]
)
except Exception as e:
print(f"Media processing error: {e}")
# Fallback to text-only processing
response = llm.generate("Analyze the uploaded document content")
Performance Tips
# For large documents, consider chunking
from abstractcore.media.processors import PDFProcessor
processor = PDFProcessor(chunk_size=8000) # Process in chunks
# For multiple images, process in batches
image_files = ["img1.jpg", "img2.jpg", "img3.jpg"]
for batch in [image_files[i:i+3] for i in range(0, len(image_files), 3)]:
response = llm.generate("Analyze these images", media=batch)
Model-Specific Examples
OpenAI GPT-4o
# Multi-image analysis with high detail
llm = create_llm("openai", model="gpt-4o")
response = llm.generate(
"Compare these architectural photos and identify the styles",
media=["building1.jpg", "building2.jpg", "building3.jpg"]
)
Anthropic Claude 3.5 Sonnet
# Document analysis with specialized prompts
llm = create_llm("anthropic", model="claude-3.5-sonnet")
response = llm.generate(
"Provide a comprehensive analysis of this research paper",
media=["academic_paper.pdf"]
)
Local Vision Models
# Ollama with qwen2.5-vl
ollama_llm = create_llm("ollama", model="qwen2.5vl:7b")
response = ollama_llm.generate(
"What objects do you see in this image?",
media=["scene.jpg"]
)
# LMStudio with qwen2.5-vl
lmstudio_llm = create_llm("lmstudio", model="qwen/qwen2.5-vl-7b")
response = lmstudio_llm.generate(
"Describe this chart and its trends",
media=["business_chart.png"]
)
# Ollama with Llama 3.2 Vision
llama_llm = create_llm("ollama", model="llama3.2-vision:11b")
response = llama_llm.generate(
"Analyze this document layout",
media=["document.jpg"]
)
Installation
Basic Installation
# Core media handling (images, text, basic documents)
pip install "abstractcore[media]"
Full Installation
# Media features (PDF + Office docs) are covered by `abstractcore[media]`.
# If you want the full framework install (providers + tools + server + docs), pick one:
pip install "abstractcore[all-apple]" # macOS/Apple Silicon (includes MLX, excludes vLLM)
pip install "abstractcore[all-non-mlx]" # Linux/Windows/Intel Mac (excludes MLX and vLLM)
pip install "abstractcore[all-gpu]" # Linux NVIDIA GPU (includes vLLM, excludes MLX)
Advanced: If you prefer to install only the pieces you need (instead of abstractcore[media]),
these are the main libraries AbstractCore uses:
- Pillow (images)
- pymupdf4llm + pymupdf-layout (PDF extraction)
- unstructured[docx,pptx,xlsx,odt,rtf] (Office docs)
- pandas (tabular helpers)
Troubleshooting
Common Issues
Media not processed:
# Check if media dependencies are installed
try:
response = llm.generate("Test", media=["test.jpg"])
except ImportError as e:
print(f"Missing dependency: {e}")
print('Install with: pip install "abstractcore[media]"')
Vision model not detecting images:
# Verify model capabilities
from abstractcore.media.capabilities import is_vision_model
if not is_vision_model("your-model"):
print("This model doesn't support vision")
print("Try: gpt-4o, claude-3.5-sonnet, qwen2.5vl:7b, or llama3.2-vision:11b")
Large file processing:
# For large files, check size limits
import os
file_size = os.path.getsize("large_file.pdf")
if file_size > 10 * 1024 * 1024: # 10MB
print("File may be too large for some providers")
Validation
# Test your installation
python validate_media_system.py
# Run comprehensive tests
python -m pytest tests/media_handling/ -v
API Reference
Core Functions
# Main generation with media
llm.generate(prompt, media=files, **kwargs)
# Direct file processing
from abstractcore.media import process_file
result = process_file(file_path)
# Capability detection
from abstractcore.media.capabilities import (
is_vision_model,
supports_images,
get_media_capabilities
)
Media Types
from abstractcore.media.types import MediaType, ContentFormat
# MediaType.IMAGE, MediaType.DOCUMENT, MediaType.TEXT
# ContentFormat.BASE64, ContentFormat.TEXT, ContentFormat.BINARY
Processors
from abstractcore.media.processors import (
ImageProcessor, # Images with PIL
TextProcessor, # Text, CSV, JSON with pandas
PDFProcessor, # PDFs with PyMuPDF4LLM
OfficeProcessor # DOCX, XLSX, PPT with unstructured
)
Next Steps
- Getting Started Guide - Complete AbstractCore tutorial
- API Reference - Full Python API documentation
- Glyph + Vision Example - End-to-end document analysis with a vision model
- Supported Formats Utility - Inspect available processors and supported formats
The media handling system makes AbstractCore multimodal while maintaining the same "write once, run everywhere" philosophy. Focus on your application logic while AbstractCore handles the complexity of different provider APIs and media formats.