Unified LLM Interface
Write once, run everywhere

Open-source first, provider-agnostic LLM infrastructure for Python: run local models to control the full stack end-to-end (software + models), or switch to cloud providers when you need them. Code once, run everywhere across Ollama, LM Studio, MLX, HuggingFace, vLLM, OpenAI, Anthropic, and more with consistent APIs for tools, structured output, streaming, and media.

🤖 For AI Agents: Use our AI-optimized documentation at llms.txt (concise) or llms-full.txt (complete) for seamless integration.

9+ Providers · Version 2.11.2 · MIT License
example.py
from abstractcore import create_llm

# Works with any provider - just change the name
llm = create_llm("anthropic", model="claude-haiku-4-5", temperature=0.0)  # Use temp=0 for consistency
response = llm.generate("What is the capital of France?")
print(response.content)

# Switch providers with zero code changes
llm = create_llm("openai", model="gpt-4o-mini")
llm = create_llm("ollama", model="qwen3:4b-instruct-2507-q4_K_M")

# Same interface across providers

Why Choose AbstractCore?

Production-ready LLM infrastructure with everything you need to build reliable AI applications.

Centralized Configuration

One-time setup with ~/.abstractcore/config/abstractcore.json. Set global defaults, app-specific preferences, and API keys once, so you never have to repeat the provider and model on every call.

Universal Media Handling

Attach files with media=[...] (or @file in the CLI). Images and documents are processed automatically; audio/video inputs are policy-driven (audio_policy/video_policy) to avoid silent semantic changes.

Vision Capabilities

Image and video understanding across providers with automatic optimization. Vision fallback for text-only models through smart captioning, plus explicit frame sampling for video when native input isn’t available.

Provider Discovery

Centralized registry for providers, defaults, and capabilities, from open-source local stacks to cloud APIs. Discover what's installed and what each provider supports.

Production Ready

Built-in retry logic, circuit breakers, comprehensive error handling, and event-driven observability.

Universal Tools + Syntax Rewriting

Tool calling across ALL providers with real-time format conversion for agent CLI compatibility. Architecture-aware detection and custom tag support.

Type Safe

Full Pydantic integration for structured outputs with automatic validation and retry on failures (see structured_output.py below).

Local & Cloud

Run open-source models locally to control the full stack end-to-end (software + models), or use cloud APIs for maximum performance.

Token Management + Streaming

Unified token parameters across all providers with budget validation. Real-time streaming with proper tool call handling and cost estimation.

Session Management + Analytics

Persistent conversations with SOTA 2025 auto-compaction algorithm. Built-in analytics: summaries, assessments, fact extraction with complete serialization.

Production Resilience + OpenAI Server

Production-grade retry logic, circuit breakers, and event-driven monitoring. OpenAI-compatible HTTP server for multi-language access.

Get Started in Minutes

Install AbstractCore and make your first LLM call in under 5 minutes.

1. Install

Install AbstractCore with your preferred providers

# Core (lightweight default)
pip install abstractcore

# Providers (install only what you use; zsh: keep quotes)
pip install "abstractcore[openai]"
pip install "abstractcore[anthropic]"

# Optional features
pip install "abstractcore[media]"   # images, PDFs, Office docs
pip install "abstractcore[server]"  # OpenAI-compatible HTTP gateway

2. Configure

Set up your API keys or local providers

# For cloud providers
export OPENAI_API_KEY="your-key-here"
export ANTHROPIC_API_KEY="your-key-here"

# For local providers (no keys needed)
# Install Ollama, LMStudio, or MLX

3. Code

Start building with the unified interface

from abstractcore import create_llm

# Create LLM instance
llm = create_llm("openai", model="gpt-4o-mini")

# Generate response
response = llm.generate("Hello, world!")
print(response.content)
centralized_configuration.sh
# Check current configuration
abstractcore --status

# Set global fallback model (used when no app-specific default)
abstractcore --set-global-default ollama/qwen3:4b-instruct

# Set app-specific defaults (examples)
abstractcore --set-app-default summarizer openai gpt-4o-mini
abstractcore --set-app-default cli ollama qwen3:4b-instruct

# Configure vision fallback for text-only models
abstractcore --download-vision-model  # Download local caption model
# OR use an existing vision model:
# abstractcore --set-vision-provider ollama qwen2.5vl:7b

# Set API keys
abstractcore --set-api-key openai sk-your-key-here

# Configure logging
abstractcore --set-console-log-level WARNING

# Now use without specifying provider/model every time!
abstractcore-chat --prompt "Hello!"  # Uses configured defaults
media_handling.py
from abstractcore import create_llm

# Works with any provider - same API everywhere
llm = create_llm("openai", model="gpt-4o")

# Attach any file type with media parameter
response = llm.generate(
    "What's in this image and document?",
    media=["photo.jpg", "report.pdf"]
)

# Or use CLI with @filename syntax
# abstractcore-chat --prompt "Analyze @report.pdf"

# Supported file types:
# - Images: PNG, JPEG, GIF, WEBP, BMP, TIFF
# - Documents: PDF, DOCX, XLSX, PPTX
# - Data: CSV, TSV, TXT, MD, JSON

# Same code works with any provider
llm = create_llm("anthropic", model="claude-haiku-4-5")
response = llm.generate(
    "Summarize these materials",
    media=["chart.png", "data.csv", "presentation.pptx"]
)
vision_capabilities.py
from abstractcore import create_llm

# Vision works across all providers with same interface
openai_llm = create_llm("openai", model="gpt-4o")
response = openai_llm.generate(
    "Describe this image in detail",
    media=["photo.jpg"]
)

# Same code with local provider
ollama_llm = create_llm("ollama", model="qwen2.5vl:7b")
response = ollama_llm.generate(
    "What objects do you see?",
    media=["scene.jpg"]
)

# Vision fallback for text-only models
# Configure once: abstractcore --download-vision-model
text_llm = create_llm("lmstudio", model="qwen/qwen3-4b-2507")  # No native vision
response = text_llm.generate(
    "What's in this image?",
    media=["complex_scene.jpg"]
)
# Works transparently: vision model analyzes → text model processes description

# Multi-image analysis
response = openai_llm.generate(
    "Compare these architectural styles",
    media=["building1.jpg", "building2.jpg", "building3.jpg"]
)
audio_and_voice.py
from abstractcore import create_llm

llm = create_llm('openai', model='gpt-4o-mini')

# Speech audio as input (policy-driven)
resp = llm.generate(
    'Summarize this call.',
    media=['call.wav'],
    audio_policy='speech_to_text',  # requires: pip install abstractvoice
)
print(resp.content)

# Deterministic STT/TTS surfaces (capability plugin)
text = llm.audio.transcribe('speech.wav')
wav_bytes = llm.voice.tts('Hello', format='wav')
print(len(wav_bytes))
tool_syntax_rewriting.py
from abstractcore import create_llm, tool

@tool
def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

llm = create_llm("ollama", model="qwen3:4b-instruct-2507-q4_K_M")

# Convert tool-call markup in `content` for downstream parsers.
# Tool calls are always available as structured data in `chunk.tool_calls`.
for chunk in llm.generate(
    "Compute 15 * 23 using the tool.",
    tools=[multiply],
    stream=True,
    tool_call_tags="llama3",
):
    if chunk.tool_calls:
        print(f"Tool detected: {chunk.tool_calls[0].name}")
    print(chunk.content or "", end="", flush=True)
session_analytics.py
from abstractcore import BasicSession, create_llm

llm = create_llm("openai", model="gpt-4o-mini")
session = BasicSession(llm, system_prompt="You are a helpful assistant.")

response1 = session.generate("My name is Alice")
response2 = session.generate("What's my name?")  # Remembers context

# Compact chat history (in-place)
session.force_compact(preserve_recent=6, focus="key details")

# Advanced analytics
summary = session.generate_summary(focus="key decisions")
assessment = session.generate_assessment(criteria={"clarity": True, "helpfulness": True})
facts = session.extract_facts(output_format="triples")

# Save (includes analytics if generated)
session.save("conversation.json")
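
The Type Safe card has no dedicated example on this page, so here is a minimal structured-output sketch. It assumes generate() accepts a Pydantic model through a response_model parameter and returns a validated instance; the parameter name and return shape are assumptions, so check the structured-output docs for the exact API.

structured_output.py
from pydantic import BaseModel
from abstractcore import create_llm

class CityFacts(BaseModel):
    name: str
    country: str
    population: int

llm = create_llm("openai", model="gpt-4o-mini")

# Assumed API: pass the Pydantic model via `response_model` and receive a
# validated CityFacts instance (retried on validation failure per the Type Safe card)
city = llm.generate(
    "Give me basic facts about Paris as structured data.",
    response_model=CityFacts,
)
print(city.name, city.country, city.population)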
token_management.py
from abstractcore import create_llm
from abstractcore.utils.token_utils import estimate_tokens

# Unified token parameters work across ALL providers
llm = create_llm(
    "anthropic",
    model="claude-haiku-4-5",
    max_tokens=32000,           # Context window (input + output)
    max_output_tokens=8000,     # Maximum output tokens
    max_input_tokens=24000      # Maximum input tokens (auto-calculated if not set)
)

# Token estimation and validation
text = "Your input text here..."
estimated = estimate_tokens(text, model="claude-haiku-4-5")
print(f"Estimated tokens: {estimated}")

# Budget validation with warnings
response = llm.generate("Write a detailed analysis...")
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
print(f"Cost estimate: ${response.usage.cost_usd:.4f}")
deterministic_generation.py
from abstractcore import create_llm

# Deterministic outputs with seed + temperature=0
llm = create_llm("openai", model="gpt-4o-mini", seed=42, temperature=0.0)

# Best-effort determinism depends on provider/model
response1 = llm.generate("Write exactly 3 words about coding")
response2 = llm.generate("Write exactly 3 words about coding")
print(f"Response 1: {response1.content}")  # "Innovative, challenging, rewarding."
print(f"Response 2: {response2.content}")  # "Innovative, challenging, rewarding."

# Notes:
# - Many local providers support `seed` (Ollama/MLX/HF/LMStudio best-effort).
# - Anthropic issues a warning when `seed` is provided; use temperature=0.0 for consistency.

# Works across all providers with same interface
ollama_llm = create_llm("ollama", model="qwen3:4b-instruct", seed=42, temperature=0.0)
mlx_llm = create_llm("mlx", model="mlx-community/Qwen3-4B-4bit", seed=42, temperature=0.0)
http_server.py
# Start OpenAI-compatible server
# uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000

import requests

# NEW: OpenAI Responses API with native file support
response = requests.post(
    "http://localhost:8000/v1/responses",
    json={
        "model": "gpt-4o",
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": "Analyze this document"},
                    {"type": "input_file", "file_url": "https://example.com/report.pdf"}
                ]
            }
        ],
        "stream": False  # Optional streaming
    }
)

# Or use standard chat completions with @filename syntax
import openai
client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
response = client.chat.completions.create(
    model="ollama/qwen3:4b-instruct",
    messages=[{"role": "user", "content": "Analyze @report.pdf"}]
)

Supported Providers

One interface for all major LLM providers. Switch between them with a single line change.

*SEED Support Note: Anthropic doesn't support the seed parameter; a warning is issued when one is provided. Use temperature=0.0 for more consistent outputs with Claude models.

Comprehensive Documentation

Everything you need to build production-ready LLM applications. AI agents: Use our optimized llms.txt or llms-full.txt for direct integration.

Real-World Examples

Learn from practical examples and use cases.

Universal API Gateway

Server

Deploy a single OpenAI-compatible /v1 gateway for chat + tools, and optionally add /v1/images/* and /v1/audio/* via capability plugins.

# Start server
python -m abstractcore.server.app --port 8000

# Route by changing model:
#   ollama/qwen3:4b-instruct
#   openai/gpt-4o-mini
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"ollama/qwen3:4b-instruct","messages":[...]}'
View Full Example →

Vision Fallback (Images β†’ Any LLM)

Vision

Attach images to text-only models (AbstractCore-exclusive fallback). A configured vision model captions and injects short observations into your request.

from abstractcore import create_llm

# Text-only local model (no native vision)
llm = create_llm("ollama", model="qwen3:4b-instruct")

# Configure vision fallback once, then attach images anyway
resp = llm.generate(
    "What are the key numbers in this chart?",
    media=["chart.png"],
)
print(resp.content)
View Full Example →

Audio & Voice Agents (STT/TTS)

Audio

Speech-to-text for audio inputs is policy-driven (no silent semantic changes) and works across providers via an optional capability plugin (AbstractVoice).

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")

resp = llm.generate(
    "Summarize this call and list action items.",
    media=["call.wav"],
    audio_policy="speech_to_text",  # requires: pip install abstractvoice
)
print(resp.content)
View Full Example →

Provider Flexibility

Core Feature

Switch between providers with identical code. Perfect for development vs production environments.

from abstractcore import create_llm

# Development (free, local)
llm_dev = create_llm("ollama", model="qwen3:4b-instruct")

# Production (high quality, cloud)
llm_prod = create_llm("openai", model="gpt-4o-mini")

# Same interface, different capabilities
View Full Example →

RAG with Embeddings

Advanced

Build retrieval-augmented generation systems with built-in embedding support.

from abstractcore.embeddings import EmbeddingManager

embedder = EmbeddingManager()
docs_embeddings = embedder.embed_batch(documents)

# Find most similar document
query = "user query"
query_embedding = embedder.embed(query)
similarity = embedder.compute_similarity(query, documents[0])
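
To close the retrieval loop, rank the documents with the similarity call above and feed the best match to any provider; only embed_batch/compute_similarity and create_llm come from AbstractCore, the ranking itself is plain Python.

from abstractcore import create_llm

# Rank all documents against the query and keep the best match
scores = [embedder.compute_similarity(query, doc) for doc in documents]
best_doc = documents[scores.index(max(scores))]

# Ground the answer in the retrieved document
llm = create_llm("openai", model="gpt-4o-mini")
answer = llm.generate(f"Context:\n{best_doc}\n\nQuestion: {query}")
print(answer.content)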
View Full Example →

CLI Apps & Debug Mode

New

Ready-to-use terminal tools with debug capabilities and focus areas for targeted processing.

# Extract knowledge with debug mode
extractor document.pdf --format json-ld --debug --iterate 3

# Evaluate with focus areas
judge README.md --focus "examples, completeness" --debug

# Self-healing JSON handles truncated responses automatically
View CLI Docs →