Media Handling System

Attach images, documents, and data files to your LLM requests. One API works across all providers, with automatic processing and graceful fallback.

Overview

AbstractCore provides a unified media handling system for attaching and processing files across all LLM providers. Images, documents, and other media files go through the same API: the system detects each file's type, processes it, formats the result for the target provider, and falls back gracefully when processing fails.

Key Features

  • Universal API - Same media=[] parameter works across all providers
  • CLI Integration - Simple @filename syntax for instant file attachment
  • Intelligent Processing - Automatic file type detection with specialized processors
  • Provider Adaptation - Automatic formatting for each provider's API requirements
  • Robust Fallback - Graceful degradation when advanced processing fails
  • Cross-Format Support - Images, PDFs, Office docs, CSV/TSV all work seamlessly

Quick Start

Python API

from abstractcore import create_llm

# Works with any provider - just change the provider name
llm = create_llm("openai", model="gpt-4o", api_key="your-key")
response = llm.generate(
    "What's in this image and document?",
    media=["photo.jpg", "report.pdf"]
)

# Same code works with any provider
llm = create_llm("anthropic", model="claude-3.5-sonnet")
response = llm.generate(
    "Analyze these materials",
    media=["chart.png", "data.csv", "presentation.pptx"]
)

CLI Integration

Use the @filename syntax to attach any supported file type:

# PDF Analysis
python -m abstractcore.utils.cli --prompt "What is this document about? @report.pdf"

# Office Documents
python -m abstractcore.utils.cli --prompt "Summarize this presentation @slides.pptx"
python -m abstractcore.utils.cli --prompt "What data is in @spreadsheet.xlsx"
python -m abstractcore.utils.cli --prompt "Analyze this document @contract.docx"

# Data Files
python -m abstractcore.utils.cli --prompt "What patterns are in @sales_data.csv"

# Images
python -m abstractcore.utils.cli --prompt "What's in this image? @screenshot.png"

# Mixed Media
python -m abstractcore.utils.cli --prompt "Compare @chart.png and @data.csv and explain trends"
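
To make the @filename convention concrete, here is a minimal sketch of how attachment tokens could be pulled out of a prompt. The pattern and the extract_attachments helper are hypothetical illustrations, not AbstractCore's actual parser, whose rules may differ (for example around quoting or paths with spaces).

```python
import re

# Hypothetical pattern: "@" at the start of a word, followed by path-like characters.
# The lookbehind keeps e-mail addresses such as user@example.com from matching.
ATTACHMENT_PATTERN = re.compile(r"(?<!\S)@([\w./\\-]+)")

def extract_attachments(prompt: str) -> list[str]:
    """Return the file paths referenced with @filename tokens, in order."""
    matches = ATTACHMENT_PATTERN.findall(prompt)
    # Drop a sentence-ending period that got attached to the token.
    return [match.rstrip(".") for match in matches]

print(extract_attachments("Compare @chart.png and @data.csv and explain trends"))
```

The returned list has the same shape as the media=[] argument shown in the Python API section.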

Supported File Types

Images (Vision Models)

  • Formats: PNG, JPEG, GIF, WEBP, BMP, TIFF
  • Features: Automatic optimization, resizing, format conversion, EXIF handling
  • Max Size: Large images are resized automatically to suit the target model
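
As an illustration of the resizing step, the sketch below computes target dimensions that fit inside a square bound while preserving aspect ratio. The fit_within helper and the 1024-pixel default are assumptions chosen for the example; the limits AbstractCore applies per model are not stated here.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    max_side=1024 is an illustrative default, not an AbstractCore setting.
    Images that already fit are returned unchanged (no upscaling).
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to whole pixels and never return a zero-sized dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))

print(fit_within(4000, 3000))
```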

Documents

  • PDF: Full text extraction with PyMuPDF4LLM, preserves formatting and structure
  • Word (DOCX): Full document analysis with structure preservation
  • Excel (XLSX): Sheet-by-sheet extraction with data analysis
  • PowerPoint (PPTX): Slide content extraction with comprehensive analysis

Data Files

  • Text Files: TXT, MD with intelligent parsing
  • Data: CSV, TSV with data analysis
  • Structured: JSON with intelligent parsing
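
For data files, the model ultimately receives text. The following sketch shows one way a CSV or TSV file could be turned into a compact text summary using only the standard library. The summarize_csv function and its output layout are hypothetical and are not the format AbstractCore produces.

```python
import csv
import io

def summarize_csv(text: str, delimiter: str = ",", preview_rows: int = 3) -> str:
    """Describe delimited text: column names, row count, and a short preview."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return "Empty file"
    header, data = rows[0], rows[1:]
    lines = [
        f"Columns ({len(header)}): {', '.join(header)}",
        f"Rows: {len(data)}",
    ]
    # Include only the first few rows so large files stay within token limits.
    lines += [" | ".join(row) for row in data[:preview_rows]]
    return "\n".join(lines)

print(summarize_csv("region,revenue\nEMEA,10\nAPAC,20\n"))
```

Passing delimiter="\t" handles TSV input the same way.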

How It Works

The media system processes attachments in four stages:

  1. File Attachment Processing - CLI @filename syntax and Python media=[] parameter
  2. Intelligent Processing - AutoMediaHandler selects appropriate processors (Image, PDF, Office, Text)
  3. Provider Formatting - Same content formatted differently for each provider's API
  4. Graceful Fallback - Multi-level fallback ensures users always get meaningful results
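
The processor-selection step can be pictured as a lookup from file extension to processor family. The sketch below is an assumption-laden illustration: the table is built from the formats listed under Supported File Types, and the labels and select_processor function are hypothetical, not AutoMediaHandler's real interface.

```python
from pathlib import Path

# Hypothetical extension table; the labels are illustrative, not AbstractCore class names.
PROCESSOR_BY_EXTENSION = {
    **dict.fromkeys([".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".tiff"], "image"),
    ".pdf": "pdf",
    **dict.fromkeys([".docx", ".xlsx", ".pptx"], "office"),
    **dict.fromkeys([".txt", ".md", ".csv", ".tsv", ".json"], "text"),
}

def select_processor(path: str) -> str:
    """Pick a processor label from the file extension, case-insensitively."""
    suffix = Path(path).suffix.lower()
    try:
        return PROCESSOR_BY_EXTENSION[suffix]
    except KeyError:
        supported = ", ".join(sorted(PROCESSOR_BY_EXTENSION))
        raise ValueError(
            f"Unsupported file type {suffix!r}; supported: {supported}"
        ) from None

for name in ["photo.JPG", "report.pdf", "slides.pptx", "data.csv"]:
    print(name, "->", select_processor(name))
```

A real implementation may also inspect file contents rather than trusting the extension alone.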

Provider-Specific Formatting Example

AbstractCore automatically formats the same content differently for each provider:

# OpenAI Format (JSON)
{
  "role": "user",
  "content": [
    {"type": "text", "text": "Analyze these files"},
    {"type": "image_url", "image_url": {"url": "..."}},
    {"type": "text", "text": "PDF Content: # Report Title\n\nExecutive Summary..."}
  ]
}

# Anthropic Format (Messages API)
{
  "role": "user",
  "content": [
    {"type": "text", "text": "Analyze these files"},
    {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "iVBORw0..."}},
    {"type": "text", "text": "PDF Content: # Report Title\n\nExecutive Summary..."}
  ]
}
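
The difference between the two payloads above comes down to how the image block is wrapped. The sketch below builds that block for each provider from raw image bytes. The format_image_block function is a hypothetical helper written for this page, not part of AbstractCore, and it assumes the OpenAI image is sent inline as a base64 data URL.

```python
import base64

def format_image_block(provider: str, image_bytes: bytes,
                       media_type: str = "image/png") -> dict:
    """Build one image content block in the shape each provider expects."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if provider == "openai":
        # Assumption: inline images are passed as a data: URL in image_url.url.
        return {
            "type": "image_url",
            "image_url": {"url": f"data:{media_type};base64,{encoded}"},
        }
    if provider == "anthropic":
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": encoded},
        }
    raise ValueError(f"No formatter for provider {provider!r}")

png_signature = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file
print(format_image_block("anthropic", png_signature)["source"]["data"])
```

Extracted document text needs no such conversion: in both payloads it travels as an ordinary text block.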

Common Use Cases

Document Analysis

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o")

# Analyze PDF
response = llm.generate(
    "Summarize the key findings in this research paper",
    media=["research_paper.pdf"]
)

# Extract data from Excel
response = llm.generate(
    "What are the top 5 sales regions by revenue?",
    media=["sales_report.xlsx"]
)

# Analyze PowerPoint
response = llm.generate(
    "List the main talking points from this presentation",
    media=["quarterly_review.pptx"]
)

Multi-File Analysis

# Compare multiple files
response = llm.generate(
    "Compare the financial data across these three reports",
    media=["q1_report.pdf", "q2_report.pdf", "q3_report.pdf"]
)

# Mixed media types
response = llm.generate(
    "Verify that the chart matches the data in the spreadsheet",
    media=["sales_chart.png", "sales_data.csv"]
)

Image Analysis with Documents

# Combine images and documents
response = llm.generate(
    "Compare the architectural designs with the specifications",
    media=["design1.jpg", "design2.jpg", "specifications.pdf"]
)
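
When scripting calls like the ones above, it can help to check attachment paths before sending a request. The wrapper below is a hypothetical convenience, not an AbstractCore function: it assumes only that the llm object exposes generate(prompt, media=[...]) as shown in this guide. AbstractCore may already report missing files on its own.

```python
from pathlib import Path

def generate_with_files(llm, prompt: str, media: list[str]):
    """Fail fast on missing paths, then delegate to llm.generate()."""
    missing = [path for path in media if not Path(path).is_file()]
    if missing:
        raise FileNotFoundError(f"Attachment(s) not found: {', '.join(missing)}")
    return llm.generate(prompt, media=media)
```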

HTTP Server Support

The media handling system is fully integrated with the OpenAI-compatible HTTP server:

Using @filename Syntax

import openai

client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Analyze @report.pdf and @chart.png"}]
)

Using OpenAI Responses API Format

import requests

response = requests.post(
    "http://localhost:8000/v1/responses",
    json={
        "model": "gpt-4o",
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": "Analyze this document"},
                    {"type": "input_file", "file_url": "https://example.com/report.pdf"}
                ]
            }
        ]
    }
)

Error Handling and Fallback

AbstractCore provides robust error handling with graceful degradation:

  • Format Detection Failure - Falls back to basic text extraction
  • Processing Errors - Returns partial content with error indication
  • Unsupported Files - Clear error messages with supported format list
  • Size Limits - Automatic chunking for large documents
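
The fallback behavior listed above can be sketched as a chain: try each processor in turn, and if all of them fail, return a lossy plain-text read together with an error note. This is an illustrative pattern under stated assumptions. The extract_with_fallback function, the processor call signature, and the note format are hypothetical and do not describe AbstractCore internals.

```python
import tempfile
from pathlib import Path
from typing import Callable

def extract_with_fallback(path: str, processors: list[Callable[[str], str]]) -> str:
    """Try each processor in order; fall back to a lossy plain-text read."""
    errors = []
    for processor in processors:
        try:
            return processor(path)
        except Exception as exc:  # broad on purpose: any failure moves to the next level
            errors.append(f"{processor.__name__}: {exc}")
    # Last resort: decode raw bytes as UTF-8, replacing undecodable bytes.
    text = Path(path).read_bytes().decode("utf-8", errors="replace")
    note = "; ".join(errors)
    return f"[fallback after errors: {note}]\n{text}"

def broken_processor(path: str) -> str:
    """Stand-in for an advanced processor whose dependency is unavailable."""
    raise RuntimeError("parser unavailable")

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_text("hello", encoding="utf-8")
    result = extract_with_fallback(str(sample), [broken_processor])

print(result)
```

Keeping the error note in the returned text matches the "partial content with error indication" behavior described in the list.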

Best Practices

  • File Size - Keep individual files under 10MB for optimal performance
  • Image Quality - Use high-quality images but let AbstractCore handle optimization
  • Multiple Files - Limit to 5-10 files per request to avoid token limits
  • File Types - Stick to supported formats for reliable processing
  • Clear Prompts - Specify what you want to extract or analyze from the files
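
The size and count guidelines are easy to check before a request is sent. The following sketch takes a mapping of path to size in bytes and returns warnings. The review_attachments function and its thresholds mirror the recommendations above but are hypothetical, not limits enforced by AbstractCore.

```python
MAX_FILE_BYTES = 10 * 1024 * 1024  # 10 MB guideline from the list above
MAX_FILES = 10                     # upper end of the 5-10 file guideline

def review_attachments(sizes: dict[str, int]) -> list[str]:
    """Return human-readable warnings for a {path: size_in_bytes} mapping."""
    warnings = []
    if len(sizes) > MAX_FILES:
        warnings.append(f"{len(sizes)} files attached; consider {MAX_FILES} or fewer")
    for path, size in sizes.items():
        if size > MAX_FILE_BYTES:
            megabytes = size / 1024 / 1024
            warnings.append(f"{path} is {megabytes:.1f} MB; guideline is 10 MB")
    return warnings

print(review_attachments({"report.pdf": 2_000_000, "scan.tiff": 12 * 1024 * 1024}))
```

The sizes can be gathered with pathlib, for example {p: Path(p).stat().st_size for p in media}.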

Installation

To use media handling features, install the media extras:

# Install with media support (quotes keep shells such as zsh from expanding the brackets)
pip install "abstractcore[media]"

# Or install everything
pip install "abstractcore[all]"
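
After installing, you may want to confirm that optional dependencies can be imported. The sketch below checks for modules by name without importing them. The missing_modules helper is hypothetical, and the module names in the usage line (pymupdf4llm for PDF extraction, PIL for images) are assumptions about what the media extras include.

```python
from importlib.util import find_spec

def missing_modules(module_names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]

# Assumed module names; adjust to the packages your installation actually uses.
print(missing_modules(["pymupdf4llm", "PIL"]))
```

An empty list means every named module was found.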

Related Documentation

  • Getting Started - Quick setup guide
  • Vision Capabilities - Image analysis across providers
  • Centralized Configuration - Global defaults and settings
  • HTTP Server - OpenAI-compatible REST API
  • Internal CLI - Built-in CLI with @filename syntax
  • API Reference - Complete Python API