Architecture Overview
Interactive exploration of AbstractCore's system design, components, and data flow. Click on components to learn more about how they work together.
🏗️ System Overview
AbstractCore operates as both a Python library and an optional HTTP server with centralized configuration, universal media handling, and cross-provider vision capabilities. Click on components to explore their functionality:
🐍 Your Python App
Direct library integration
🌐 HTTP Clients
REST API consumers
🎯 AbstractCore API
Unified interface layer
🖥️ AbstractCore Server
OpenAI-compatible REST API
🔌 Provider Interface
Common abstraction layer
⚙️ Core Systems
Events, Tools, Retry, Streaming
🤖 LLM Providers
OpenAI, Anthropic, Ollama, MLX, LMStudio, and more
🚀 Your Application
Your Python application uses AbstractCore through the simple factory pattern:
from abstractcore import create_llm
# Factory creates the right provider
llm = create_llm("openai", model="gpt-4o-mini")
response = llm.generate("Hello, world!")
Benefits: Same interface for all providers, automatic reliability, built-in tool support.
🌐 HTTP Clients
Any HTTP client can use AbstractCore through the OpenAI-compatible REST API:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "openai/gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'
Benefits: Drop-in OpenAI replacement, universal provider access, tool call format conversion.
🎯 AbstractCore API
The unified interface that provides consistent behavior across all providers:
- Provider Abstraction: Same methods for all LLM providers
- Universal Tools: Tools work everywhere, even without native support
- Streaming: Unified streaming with real-time tool detection
- Structured Output: Type-safe responses with automatic validation
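For structured output, a Pydantic model can describe the expected shape. A minimal sketch (the response_model keyword is an assumption about the exact parameter name; check the structured-output docs):
from pydantic import BaseModel
from abstractcore import create_llm

class CityInfo(BaseModel):
    name: str
    population: int

llm = create_llm("openai", model="gpt-4o-mini")
# response_model= is illustrative; the validated CityInfo instance is returned on success.
city = llm.generate("Name one large city and its population.", response_model=CityInfo)
print(city.name, city.population)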
🖥️ AbstractCore Server
FastAPI-based HTTP server providing OpenAI-compatible endpoints:
- /v1/chat/completions: Chat completions with streaming
- /v1/embeddings: Multi-provider embeddings
- /v1/models: Dynamic model discovery
- /providers: Provider status and capabilities
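Because the endpoints follow the OpenAI wire format, existing OpenAI clients can point at the server. For example, with the official openai Python package (the URL and model name are placeholders for your deployment):
from openai import OpenAI

# Any OpenAI-compatible client works; only the base_url changes.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused-for-local-server")

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)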
🔌 Provider Interface
Common abstraction layer ensuring consistent behavior:
Features: Memory management, capability detection, unified error handling.
⚙️ Core Systems
Event System
Observability and control through comprehensive events
Tool System
Universal tool execution with format conversion
Retry System
Production-grade reliability with circuit breakers
Streaming
Unified streaming with real-time tool detection
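As a quick illustration of the streaming interface from Python (stream=True and the chunk.content attribute are assumptions about the exact API; the streaming docs have the real names):
from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")

# Illustrative: chunks arrive incrementally, and tool calls are detected as they stream.
for chunk in llm.generate("Write a haiku about the sea.", stream=True):
    print(chunk.content, end="", flush=True)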
🔄 Request Lifecycle
Click on each step to see how requests flow through the system:
1. Request Initiation
Your app calls llm.generate() or an HTTP client sends a request
2. Event Emission
System emits a GENERATION_STARTED event for monitoring
3. Provider Call
Request routed to appropriate provider with retry logic
4. Tool Detection
Response parsed for tool calls using architecture-specific patterns
5. Tool Execution
Tools executed locally with security controls and error handling
6. Response Assembly
Final response assembled with tool results and metadata
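Each of these stages is observable through the event system. A hypothetical sketch of hooking the GENERATION_STARTED event (the import path, registration helper, and event object shape are assumptions; only the event name comes from the lifecycle above):
from abstractcore import create_llm
# Illustrative only: abstractcore.events, EventType, and on_global are assumed names.
from abstractcore.events import EventType, on_global

def log_event(event):
    # Called when a generation starts, before the provider is invoked.
    print("event:", event)

on_global(EventType.GENERATION_STARTED, log_event)

llm = create_llm("openai", model="gpt-4o-mini")
llm.generate("Hello, world!")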
📎 Media Handling System Architecture
AbstractCore provides a production-ready unified media handling system with intelligent processing and graceful fallback.
Multi-Layer Architecture
1. File Attachment Processing
CLI @filename syntax and the Python media=[] parameter
2. Intelligent Processing
AutoMediaHandler selects appropriate processors (Image, PDF, Office, Text)
3. Provider Formatting
Same content formatted differently for each provider's API requirements
4. Graceful Fallback
Multi-level fallback ensures users always get meaningful results
Supported File Types
Images
PNG, JPEG, GIF, WEBP, BMP, TIFF with automatic optimization
Documents
PDF, DOCX, XLSX, PPTX with intelligent extraction
Data/Text
CSV, TSV, TXT, MD, JSON with parsing and analysis
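From Python, all of these file types go through the same media=[] parameter. A minimal sketch with placeholder file names (accessing response.content is an assumption about the response object):
from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")

# One document and one image: AutoMediaHandler picks a processor for each
# and formats the content for the selected provider.
response = llm.generate(
    "Summarize the attached report and describe the chart.",
    media=["report.pdf", "chart.png"],
)
print(response.content)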
Vision Fallback System: Text-only models can process images through a transparent two-stage pipeline where a vision model analyzes the image and the text model processes the description.
🛠️ Tool System Architecture
Universal Tool Support
AbstractCore provides tool calling across all providers through two mechanisms:
Native Tool Support
For providers with native tool APIs (OpenAI, Anthropic)
- Direct API integration
- Optimal performance
- Full feature support
Intelligent Prompting
For providers without native support (Ollama, MLX, LMStudio)
- Automatic prompt injection
- Architecture-aware formatting
- Format conversion support
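Calling code looks the same under both mechanisms. A sketch of passing a plain Python function as a tool (the tools= keyword, automatic schema extraction from the signature and docstring, and the model names are assumptions):
from abstractcore import create_llm

def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"Sunny in {city}"

llm = create_llm("ollama", model="qwen3:4b")

# tools= is illustrative: with native APIs the provider returns structured tool
# calls; otherwise AbstractCore injects a tool prompt and parses the reply.
response = llm.generate("What is the weather in Paris?", tools=[get_weather])
print(response.content)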
🔄 Tool Call Format Conversion
AbstractCore automatically converts between different tool call formats:
Qwen3 Format
<|tool_call|>...</|tool_call|>
Compatible with Codex CLI
LLaMA3 Format
<function_call>...</function_call>
Compatible with Crush CLI
XML Format
<tool_call>...</tool_call>
Compatible with Gemini CLI
Custom Format
[TOOL]...JSON...[/TOOL]
User-defined formats
⚙️ Centralized Configuration System
Global configuration is stored in ~/.abstractcore/config/abstractcore.json with a clear priority hierarchy.
Configuration Priority Hierarchy
Managed Settings
- Default Models: Global and app-specific model defaults
- API Keys: Secure provider authentication
- Vision Fallback: Vision model for text-only models
- Logging: Console and file logging configuration
- Storage: Cache directories and model storage
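An illustrative sketch of what ~/.abstractcore/config/abstractcore.json might contain (the key names below are assumptions, not the actual schema):
{
  "default_model": "openai/gpt-4o-mini",
  "app_defaults": { "summarizer": "ollama/qwen3:4b" },
  "api_keys": { "openai": "sk-..." },
  "vision_fallback_model": "openai/gpt-4o-mini",
  "logging": { "console_level": "INFO", "file": "~/.abstractcore/logs/abstractcore.log" },
  "storage": { "cache_dir": "~/.abstractcore/cache" }
}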
🎯 Design Principles
Provider Abstraction
Goal: Same interface for all providers
Implementation: Common interface with provider-specific implementations
Production Reliability
Goal: Handle real-world failures gracefully
Implementation: Built-in retry logic, circuit breakers, comprehensive error handling
Universal Media Handling
Goal: Same code for images, documents, and data
Implementation: Intelligent processors with automatic provider formatting
Universal Tool Support
Goal: Tools work everywhere
Implementation: Native support where available, intelligent prompting as fallback
Centralized Configuration
Goal: Single source of truth for settings
Implementation: Hierarchical config with clear priority and easy management
Simplicity Over Features
Goal: Clean, focused API
Implementation: Minimal core with clear extension points
⚡ Performance Characteristics
💾 Memory Usage
- Core: ~15MB base
- Per Provider: ~2-5MB
- Scaling: Linear with requests
⚡ Latency Overhead
- Provider abstraction: ~1-2ms
- Event system: ~0.5ms per event
- Tool parsing: ~1-5ms
🚀 Throughput
- Single instance: 100+ req/sec
- Bottleneck: Usually the LLM provider
- Scaling: Horizontal (run multiple server instances)