# AbstractCore - Unified LLM Provider Interface
**Write once, run everywhere.** Production-ready Python library for all major LLM providers (OpenAI, Anthropic, Ollama, MLX, LMStudio, HuggingFace) with universal tool calling, streaming, session management, and built-in CLI apps.
## Quick Start
```bash
pip install abstractcore[all]
```
- **[5-Minute Setup](docs/getting-started.md)**: Basic usage with any provider
- **[Prerequisites](docs/prerequisites.md)**: API keys and local provider setup
- **[Installation Options](#installation-options)**: Provider-specific dependencies
## Core Features
- **[API Reference](docs/api-reference.md)**: Complete Python API with examples
- **[Centralized Configuration](docs/centralized-config.md)**: Global defaults, app preferences, API key management
- **[Media Handling System](docs/media-handling-system.md)**: Universal file attachment (images, PDFs, Office docs, CSV) across all providers
- **[Vision Capabilities](docs/vision-capabilities.md)**: Image analysis across all providers with automatic optimization
- **[Universal Tool Calling](docs/tool-calling.md)**: @tool decorator works across ALL providers
- **[Provider Discovery](docs/getting-started.md)**: Centralized registry to programmatically list ALL providers and models
- **[Session Management](docs/session.md)**: Persistent conversations with analytics and auto-compaction
- **[Streaming](docs/api-reference.md)**: Real-time responses across all providers
- **[Structured Output](docs/getting-started.md)**: Pydantic validation with auto-retry
- **[Embeddings](docs/embeddings.md)**: SOTA models for semantic search and RAG
- **[Token Management](docs/token-management.md)**: Unified parameter vocabulary with budget validation
- **[Tool Syntax Rewriting](docs/tool-calling.md)**: Real-time format conversion for agent CLI compatibility
## Built-in CLI Applications
Ready-to-use terminal tools - no Python code required:
- **[Summarizer](docs/apps/basic-summarizer.md)**: `summarizer document.pdf --style executive`
- **[Extractor](docs/apps/basic-extractor.md)**: `extractor report.txt --format json-ld --focus technology`
- **[Judge](docs/apps/basic-judge.md)**: `judge essay.txt --criteria clarity,accuracy --context "academic writing"`
## Optional Components
- **[HTTP Server](docs/server.md)**: OpenAI-compatible REST API for multi-language access
- **[Architecture](docs/architecture.md)**: System design and component interactions
- **[Examples](examples/)**: Progressive tutorials from basic to production patterns
## Documentation & Support
- **[Troubleshooting](docs/troubleshooting.md)**: Common issues and solutions
- **[Full Documentation](docs/README.md)**: Complete navigation guide
- **[GitHub Issues](https://github.com/lpalbou/AbstractCore/issues)**: Report bugs and get help
## Installation Options
```bash
# Minimal core
pip install abstractcore
# With specific providers
pip install abstractcore[openai]
pip install abstractcore[anthropic]
pip install abstractcore[ollama]
pip install abstractcore[lmstudio]
pip install abstractcore[mlx]
pip install abstractcore[huggingface]
# With media handling (images, PDFs, Office docs)
pip install abstractcore[media]
# With server support
pip install abstractcore[server]
# With embeddings
pip install abstractcore[embeddings]
# Everything
pip install abstractcore[all]
```
## Why AbstractCore?
- **Write once, run everywhere**: Same code works with any LLM provider
- **Production ready**: Built-in error handling, retries, and monitoring
- **Universal tool calling**: @tool decorator works across ALL providers
- **Universal media handling**: Simple `@filename` syntax and `media=[]` API for images, PDFs, Office docs
- **Vision capabilities**: Image analysis across all providers with automatic optimization
- **Centralized configuration**: Global defaults and app-specific preferences eliminate repetitive setup
- **Built-in apps**: Ready-to-use CLI tools (summarizer, extractor, judge)
- **Debug capabilities**: Self-healing JSON and `--debug` mode for troubleshooting
## Essential Code Examples
### Basic Usage with Factory Pattern
```python
from abstractcore import create_llm
# Factory pattern creates appropriate provider
llm = create_llm("anthropic", model="claude-3-5-haiku-latest")
response = llm.generate("What is the capital of France?")
print(response.content)
```
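Streaming is listed under Core Features but only demonstrated inside the tool-syntax example further down; a minimal sketch reusing the `stream=True` flag and `chunk.content` attribute shown there:
```python
from abstractcore import create_llm
llm = create_llm("anthropic", model="claude-3-5-haiku-latest")
# Iterate over incremental chunks instead of waiting for the full response
for chunk in llm.generate("Write a haiku about Paris", stream=True):
    print(chunk.content, end="", flush=True)
```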
### Provider Discovery (Centralized Registry)
```python
from abstractcore.providers import get_all_providers_with_models
# Get comprehensive information about all providers with available models
providers = get_all_providers_with_models()
for provider in providers:
print(f"{provider['display_name']}: {provider['model_count']} models")
print(f"Features: {', '.join(provider['supported_features'])}")
print(f"Local: {provider['local_provider']}")
print(f"Auth Required: {provider['authentication_required']}")
```
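The registry helpers imported in the API summary below can also gate provider selection at runtime. A sketch, assuming `is_provider_available` returns a boolean and `get_provider_info` returns a dict shaped like the entries iterated above:
```python
from abstractcore import create_llm
from abstractcore.providers import is_provider_available, get_provider_info
# Prefer a local provider when it is reachable, otherwise fall back to a hosted one
if is_provider_available("ollama"):
    info = get_provider_info("ollama")  # assumed to mirror the dicts shown above
    print(f"Ollama is up with {info['model_count']} models")
    llm = create_llm("ollama", model="qwen3-coder:30b")
else:
    llm = create_llm("openai", model="gpt-4o-mini")
```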
### Centralized Configuration
```bash
# Check current configuration status
abstractcore --status
# Set global fallback model
abstractcore --set-global-default ollama/llama3:8b
# Set app-specific defaults for optimal performance
abstractcore --set-app-default summarizer openai gpt-4o-mini
abstractcore --set-app-default extractor ollama qwen3:4b-instruct
abstractcore --set-app-default judge anthropic claude-3-5-haiku
# Set API keys
abstractcore --set-api-key openai sk-your-key-here
abstractcore --set-api-key anthropic your-anthropic-key
```
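With these defaults saved, application code can drop the repetitive provider arguments. A hedged sketch: it assumes the no-argument `BasicSummarizer()` constructor (used again in the CLI section below) resolves to the app default configured above.
```python
from abstractcore.processing import BasicSummarizer
# Assumption: with the summarizer app default set, no provider/model arguments are needed
summarizer = BasicSummarizer()
print(summarizer.summarize("Long document text...", style="executive", length="brief"))
```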
### Media Handling System
```python
from abstractcore import create_llm
# Works with any provider - just change the provider name
llm = create_llm("openai", model="gpt-4o", api_key="your-key")
# Universal media parameter works across all providers
response = llm.generate(
    "What's in this image and document?",
    media=["photo.jpg", "report.pdf"]
)
# CLI integration with @filename syntax
# python -m abstractcore.utils.cli --prompt "Analyze @report.pdf and @chart.png"
# Supports: Images (PNG, JPEG, GIF, WEBP), PDFs, Office docs (DOCX, XLSX, PPTX), CSV/TSV
response = llm.generate(
    "Compare the data",
    media=["chart.png", "data.csv", "presentation.pptx"]
)
```
### Vision Capabilities
```python
from abstractcore import create_llm
# Works with any vision-capable provider
llm = create_llm("openai", model="gpt-4o")
# Single image analysis
response = llm.generate(
    "What objects do you see in this image?",
    media=["photo.jpg"]
)
# Multiple images comparison
response = llm.generate(
    "Compare these architectural styles and identify differences",
    media=["building1.jpg", "building2.jpg", "building3.jpg"]
)
# Cross-provider consistency - same code works everywhere
openai_response = create_llm("openai", model="gpt-4o").generate("Analyze", media=["chart.png"])
anthropic_response = create_llm("anthropic", model="claude-3-5-sonnet").generate("Analyze", media=["chart.png"])
ollama_response = create_llm("ollama", model="qwen2.5vl:7b").generate("Analyze", media=["chart.png"])
# Vision fallback for text-only models (one-time setup)
# abstractcore --download-vision-model # Downloads local vision model
# Now text-only models can process images transparently
```
### Universal Tool Calling with @tool Decorator
```python
from abstractcore import create_llm
from abstractcore.tools import tool
@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 72°F, Sunny"
# Works with ANY provider - OpenAI, Anthropic, Ollama, etc.
llm = create_llm("ollama", model="qwen3:4b-instruct-2507-q4_K_M")
response = llm.generate(
    "What's the weather in Paris?",
    tools=[get_weather]  # Automatic format detection and conversion
)
# Enhanced metadata for complex tools
@tool(
    description="Search database for records",
    tags=["database", "search"],
    when_to_use="When user asks for specific data",
    examples=[{"description": "Find users named John", "arguments": {"query": "name=John"}}]
)
def search_database(query: str, table: str = "users") -> str:
    return f"Searching {table} for: {query}"
# Tool chaining - LLM automatically calls multiple tools in sequence
@tool
def get_user_location(user_id: str) -> str:
    return {"user123": "Paris", "user456": "Tokyo"}.get(user_id, "Unknown")
# LLM will call get_user_location first, then get_weather with the result
response = llm.generate("What's the weather for user123?", tools=[get_user_location, get_weather])
```
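Beyond the final text, the response object exposes the tool calls the model made; `tool_calls` and `.name` appear in the streaming example further down, while `.arguments` is an assumption for illustration:
```python
# Inspect which tools the model invoked during the chained call
response = llm.generate("What's the weather for user456?", tools=[get_user_location, get_weather])
if response.tool_calls:
    for call in response.tool_calls:
        print(f"Called {call.name} with {call.arguments}")  # .arguments assumed
print(response.content)
```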
### Session Management with Analytics
```python
from abstractcore import BasicSession, create_llm
llm = create_llm("openai", model="gpt-4o-mini")
session = BasicSession(llm, system_prompt="You are a helpful assistant.")
response1 = session.generate("My name is Alice")
response2 = session.generate("What's my name?") # Remembers context
# Auto-compaction with SOTA 2025 algorithm
session.compact(target_tokens=8000) # Compresses while preserving context
# Advanced analytics
summary = session.generate_summary(style="executive", length="brief")
assessment = session.generate_assessment(criteria=["clarity", "helpfulness"])
facts = session.extract_facts(entity_types=["person", "organization", "date"])
# Save with analytics
session.save("conversation.json", summary=True, assessment=True, facts=True)
```
### Token Management (Unified Parameters)
```python
from abstractcore import create_llm
from abstractcore.utils.token_utils import estimate_tokens
# Unified token parameters work across ALL providers
llm = create_llm(
    "anthropic",
    model="claude-3-5-haiku-latest",
    max_tokens=32000,        # Context window (input + output)
    max_output_tokens=8000,  # Maximum output tokens
    max_input_tokens=24000   # Maximum input tokens (auto-calculated if not set)
)
# Token estimation and validation
text = "Your input text here..."
estimated = estimate_tokens(text, model="claude-3-5-haiku-latest")
print(f"Estimated tokens: {estimated}")
# Budget validation with warnings
response = llm.generate("Write a detailed analysis...")
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
print(f"Cost estimate: ${response.usage.cost_usd:.4f}")
```
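Combining `estimate_tokens` with the limits above gives a simple pre-flight budget check; a minimal sketch built only from the pieces shown in this section (24000 mirrors `max_input_tokens`):
```python
# Pre-flight check: estimate before sending to stay inside max_input_tokens
prompt = "Summarize the following report..."
if estimate_tokens(prompt, model="claude-3-5-haiku-latest") > 24000:
    print("Prompt exceeds the configured input budget; trim or compact it first")
else:
    response = llm.generate(prompt)
```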
### Tool Syntax Rewriting (Real-time Format Conversion)
```python
from abstractcore import create_llm
from abstractcore.tools.tag_rewriter import ToolCallTagRewriter
# Custom tag configuration for any agent framework
llm = create_llm(
    "ollama",
    model="qwen3-coder:30b",
    tool_call_tags="function_call"  # Converts to: <function_call>...JSON...</function_call>
)
# Multiple custom formats
llm = create_llm(
    "mlx",
    model="qwen3-air-4bit",
    tool_call_tags=("[[tool_use]]", "[[/tool_use]]")  # Start and end tags
)
# Architecture-aware format detection
# Qwen3: <|tool_call|>{...}</|tool_call|>
# LLaMA3: <function_call>{...}</function_call>
# XML-based: <tool_call>{...}</tool_call>
# Claude Code: [[tool_use]]{...}[[/tool_use]]
# OpenAI/Anthropic: Native JSON API calls
# Real-time streaming with format conversion (assumes a @tool-decorated calculate function)
for chunk in llm.generate("Use the calculator tool", tools=[calculate], stream=True):
    if chunk.tool_calls:
        print(f"Tool detected: {chunk.tool_calls[0].name}")
    print(chunk.content, end="", flush=True)
```
### Structured Output with Automatic Retry
```python
from pydantic import BaseModel
from abstractcore import create_llm
class Person(BaseModel):
    name: str
    age: int
# Any provider works; structured output is validated the same way everywhere
llm = create_llm("openai", model="gpt-4o-mini")
# Automatic validation and retry on failures
person = llm.generate(
    "Extract: John Doe is 25 years old",
    response_model=Person
)
print(f"{person.name}, age {person.age}")
```
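The same `response_model` parameter also accepts nested schemas; a brief sketch under the same API (the `Team` model is illustrative):
```python
from typing import List
class Team(BaseModel):
    name: str
    members: List[Person]
# Nested models validate, and retry on failure, just like flat ones
team = llm.generate(
    "Extract: The Research team has John Doe (25) and Jane Roe (31)",
    response_model=Team
)
print(team.name, [m.name for m in team.members])
```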
### CLI Applications (Terminal & Python)
```bash
# Terminal usage - no Python code needed
summarizer document.pdf --style executive --output summary.txt
extractor report.txt --format json-ld --focus technology
judge essay.txt --criteria clarity,accuracy --context "academic writing"
```
```python
# Python API usage
from abstractcore.processing import BasicSummarizer, BasicExtractor, BasicJudge
summarizer = BasicSummarizer()
summary = summarizer.summarize(text, style="executive", length="brief")
extractor = BasicExtractor()
kg = extractor.extract(text, output_format="jsonld")
judge = BasicJudge()
assessment = judge.evaluate(text, context="code review", focus="error handling")
```
### HTTP Server (OpenAI-Compatible)
```bash
# Start server
uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000
```
```python
# OpenAI-compatible client usage
import openai
client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="unused"  # Not required for local providers
)
# Route to any provider using model format: provider/model
response = client.chat.completions.create(
    model="ollama/qwen3-coder:30b",  # Ollama provider
    messages=[{"role": "user", "content": "Hello!"}]
)
# File attachments with @filename syntax
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Analyze @report.pdf and @chart.png"}]
)
# OpenAI Responses API (/v1/responses) with native input_file support
import requests
response = requests.post(
    "http://localhost:8000/v1/responses",
    json={
        "model": "gpt-4o",
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": "Analyze this document"},
                    {"type": "input_file", "file_url": "https://example.com/report.pdf"}
                ]
            }
        ]
    }
)
# Provider discovery via HTTP API
# GET /providers - Complete provider metadata
# GET /providers/ollama/models - Models for specific provider
```
### Advanced Features
```python
# Architecture-aware tool calling - automatic format detection
# Event system for production monitoring
from abstractcore.events import EventType, on_global
def cost_monitor(event):
    if event.cost_usd and event.cost_usd > 0.10:
        alert(f"High cost request: ${event.cost_usd}")  # alert() is a placeholder for your notification hook
on_global(EventType.GENERATION_COMPLETED, cost_monitor)
# Memory management for local models
llm = create_llm("ollama", model="large-model")
response = llm.generate("Hello")
llm.unload() # Free memory
# Production resilience with retry and circuit breaker
from abstractcore.resilience import RetryManager, CircuitBreaker
llm = create_llm(
    "openai",
    model="gpt-4o-mini",
    retry_manager=RetryManager(max_attempts=3, backoff_strategy="exponential"),
    circuit_breaker=CircuitBreaker(failure_threshold=5, timeout=60)
)
```
### Debug & Performance Features
```bash
# Debug mode shows raw LLM responses
judge document.txt --debug --provider lmstudio --model qwen/qwen3-next-80b
# Advanced extractor modes and iterations
extractor document.pdf --mode thorough --iterate 3 --similarity-threshold 0.9
extractor large_file.txt --mode fast --no-embeddings --minified
# Focus areas for targeted evaluation
judge README.md --focus "architectural diagrams, technical comparisons" --debug
judge code.py --focus "error handling, performance" --temperature 0.05
# Self-healing JSON handles truncated responses automatically
# Token limits: max_tokens=32k, max_output_tokens=8k prevent truncation
```
## API Summary for AI Systems
### Core Functions
```python
# Factory
from abstractcore import create_llm, BasicSession
# Provider Discovery
from abstractcore.providers import (
    get_all_providers_with_models,
    list_available_providers,
    get_provider_info,
    is_provider_available
)
# Tools
from abstractcore.tools import tool, UniversalToolHandler
# Token Management
from abstractcore.utils.token_utils import estimate_tokens, calculate_token_budget
# Processing
from abstractcore.processing import BasicSummarizer, BasicExtractor, BasicJudge
# Events
from abstractcore.events import EventType, on_global, emit_global
# Resilience
from abstractcore.resilience import RetryManager, CircuitBreaker
```
### Supported Providers
| Provider | Features | Default Model |
|----------|----------|---------------|
| **OpenAI** | Native tools, streaming, structured output, vision (GPT-4o), media processing | gpt-5-nano-2025-08-07 |
| **Anthropic** | Native tools, streaming, structured output, vision (Claude 3.5), media processing | claude-3-5-haiku-latest |
| **Ollama** | Prompted tools, streaming, local models, vision (qwen2.5vl:7b), media processing | qwen3-coder:30b |
| **LMStudio** | Prompted tools, streaming, local models, vision (qwen2.5-vl), media processing | llama-3.2-8b-instruct |
| **MLX** | Prompted tools, streaming, Apple Silicon, vision models, media processing | qwen3-air-4bit |
| **HuggingFace** | Prompted tools, streaming, open models, vision models, media processing | microsoft/DialoGPT-large |
### Key Capabilities for AI
- **137+ models** across 6 providers via centralized registry
- **Universal media handling** with `media=[]` API and `@filename` CLI syntax
- **Vision capabilities** across all providers with automatic optimization and fallback
- **Centralized configuration** with global defaults and app-specific preferences
- **Real-time tool syntax conversion** for agent CLI compatibility
- **SOTA 2025 session compaction** with context preservation
- **Unified token parameters** across all providers
- **Event-driven architecture** for monitoring and control
- **Production resilience** with retries and circuit breakers
- **Self-healing JSON** for robust structured output
- **OpenAI-compatible server** with /v1/responses endpoint for multi-language access
---
## Quick Links
- **GitHub**: https://github.com/lpalbou/AbstractCore
- **PyPI**: `pip install abstractcore[all]`
- **License**: MIT | **Python**: 3.9+ | **Status**: Production Ready
**Complete Documentation**: [llms-full.txt](https://lpalbou.github.io/AbstractCore/llms-full.txt)