HTTP Server Guide
Run AbstractCore as an OpenAI-compatible API gateway: one server, any provider, any model, any client.
Why use the server?
- OpenAI-compatible: use standard clients and route via model="provider/model".
- One endpoint: switch providers without changing your application code.
- Tools + media: tools, images, PDFs, Office docs, and @filename attachments.
Quick Start
Install and run the server:
pip install "abstractcore[server]"
# Start server
python -m abstractcore.server.app
# Or with uvicorn directly
uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000
# Health check
curl http://localhost:8000/health
First request (cURL)
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
First request (Python)
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
resp = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{"role": "user", "content": "Explain quantum computing in 3 bullets."}],
)
print(resp.choices[0].message.content)
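Streaming works through the same endpoint. A minimal sketch using only the Python standard library (no openai package required); the try/except lets the snippet degrade gracefully when no server is listening:

```python
import json
import urllib.request

# OpenAI-compatible chat body; "stream": True asks for SSE chunks.
body = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        for line in resp:  # each SSE line looks like b'data: {...}'
            print(line.decode().rstrip())
except OSError as exc:
    print(f"server not reachable: {exc}")
```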
Configuration
Environment variables
# Provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENROUTER_API_KEY="sk-or-..."
# Local providers
export OLLAMA_BASE_URL="http://localhost:11434" # (or legacy: OLLAMA_HOST)
export LMSTUDIO_BASE_URL="http://localhost:1234/v1"
export VLLM_BASE_URL="http://localhost:8000/v1"
# Defaults (optional)
export ABSTRACTCORE_DEFAULT_PROVIDER=openai
export ABSTRACTCORE_DEFAULT_MODEL=gpt-4o-mini
# Debug mode
export ABSTRACTCORE_DEBUG=true
# Multi-tenant hazard: allow unload_after for providers that can unload shared server state
export ABSTRACTCORE_ALLOW_UNSAFE_UNLOAD_AFTER=1
Startup options
python -m abstractcore.server.app --help
python -m abstractcore.server.app --debug
python -m abstractcore.server.app --host 127.0.0.1 --port 8080
# uvicorn examples
uvicorn abstractcore.server.app:app --reload
uvicorn abstractcore.server.app:app --workers 4
API Endpoints
The server is OpenAI-compatible by default, with a few useful AbstractCore extensions.
- Chat: POST /v1/chat/completions
- Responses API: POST /v1/responses
- Embeddings: POST /v1/embeddings
- Images (generate): POST /v1/images/generations
- Images (edit): POST /v1/images/edits
- Audio (STT): POST /v1/audio/transcriptions
- Audio (TTS): POST /v1/audio/speech
- Model discovery: GET /v1/models
- Provider status: GET /providers
- Health: GET /health
Model discovery
curl http://localhost:8000/v1/models
curl "http://localhost:8000/v1/models?provider=ollama"
curl "http://localhost:8000/v1/models?type=text-embedding"
curl "http://localhost:8000/v1/models?provider=ollama&type=text-embedding"
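The same discovery queries can be issued from Python; a stdlib-only sketch that mirrors the last curl example above (the try/except keeps it harmless when no server is running):

```python
import json
import urllib.parse
import urllib.request

# Filter the model listing by provider and model type.
params = {"provider": "ollama", "type": "text-embedding"}
url = "http://localhost:8000/v1/models?" + urllib.parse.urlencode(params)

try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        listing = json.load(resp)
        # OpenAI-style listings put the models under "data".
        print([m["id"] for m in listing.get("data", [])])
except OSError as exc:
    print(f"server not reachable: {exc}")
```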
Media generation endpoints (optional)
AbstractCore Server can optionally expose OpenAI-compatible image generation and audio endpoints.
These endpoints are interoperability-first: if the required plugin/backend is not installed, the server returns HTTP 501 with an actionable error message.
Images (generate/edit) — requires abstractvision
POST /v1/images/generations
POST /v1/images/edits
pip install "abstractcore[server]" abstractvision
python -m abstractcore.server.app
abstractvision typically needs an OpenAI-compatible images backend. Configure
ABSTRACTVISION_BASE_URL + ABSTRACTVISION_API_KEY + ABSTRACTVISION_MODEL_ID
(or the equivalent config settings) to point to that backend.
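For example (placeholder values; the actual backend URL, key, and model id depend on your deployment):

```shell
export ABSTRACTVISION_BASE_URL="http://localhost:9000/v1"
export ABSTRACTVISION_API_KEY="unused"
export ABSTRACTVISION_MODEL_ID="your-image-model"
```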
Audio (STT/TTS) — requires abstractvoice
These endpoints do not require an LLM provider configuration.
- /v1/audio/transcriptions requires pip install abstractvoice (multipart parsing via python-multipart is included in abstractcore[server]).
- /v1/audio/speech requires pip install abstractvoice on the server.
# Speech-to-text (multipart)
curl -X POST http://localhost:8000/v1/audio/transcriptions \
-F file=@./call.wav \
-F language=en
# Text-to-speech (JSON)
curl -X POST http://localhost:8000/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"input":"Hello from AbstractCore","format":"wav"}' \
--output out.wav
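The TTS request can also be issued from Python; a stdlib-only sketch equivalent to the curl call above, writing the returned audio to out.wav (the try/except keeps it harmless when no server is running):

```python
import json
import urllib.request

# Same JSON body as the curl text-to-speech example.
body = {"input": "Hello from AbstractCore", "format": "wav"}

req = urllib.request.Request(
    "http://localhost:8000/v1/audio/speech",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        with open("out.wav", "wb") as fh:
            fh.write(resp.read())
except OSError as exc:
    print(f"server not reachable: {exc}")
```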
See Audio & Voice (STT/TTS) for library-side usage and configuration defaults.
Model Routing
Route to any provider/model using model="provider/model-name", for example:
- openai/gpt-4o-mini
- anthropic/claude-haiku-4-5
- ollama/qwen3:4b-instruct
- lmstudio/qwen/qwen3-4b-2507
Per-request extensions
AbstractCore also supports a few optional request fields:
- api_key: per-request provider key (falls back to env vars if omitted)
- base_url: override provider endpoint (include /v1 for OpenAI-compatible servers)
- unload_after: unload model after request (restricted for some providers unless explicitly enabled)
- thinking: unified thinking/reasoning control (best-effort)
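As a sketch, the extension fields ride alongside the standard OpenAI fields in the same JSON body (the key and URL below are placeholders, not working values):

```python
import json

# Standard OpenAI fields plus AbstractCore's optional per-request extensions.
body = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hi"}],
    "api_key": "sk-...",                      # per-request provider key (placeholder)
    "base_url": "https://api.openai.com/v1",  # provider override; note the /v1
}

# POST this to /v1/chat/completions exactly like any other chat request.
print(json.dumps(body, indent=2))
```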
Multimodal Requests (Images, Documents, Audio/Video)
There are two common ways to attach files. Images and documents are handled automatically; audio/video inputs are policy-driven (audio_policy, video_policy) to avoid silent semantic changes.
1) @filename syntax (AbstractCore extension)
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{"role": "user", "content": "Summarize this report. @/path/to/report.pdf"}
]
}'
2) OpenAI vision-style messages
For full examples (image URLs, base64, and policy-driven audio/video fallbacks), see Media Handling, Vision Capabilities, and Audio & Voice.
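As a quick sketch, a vision-style message combines a text part with an image_url part carrying a base64 data URL (the bytes below are a placeholder for real image file contents):

```python
import base64
import json

# Placeholder bytes stand in for the contents of a real image file.
image_b64 = base64.b64encode(b"<png bytes here>").decode()

# Standard OpenAI vision-style message: one text part, one image part.
body = {
    "model": "openai/gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
}

# POST to /v1/chat/completions as usual.
print(json.dumps(body, indent=2))
```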
Tools + agent_format
Tool calls are supported on /v1/chat/completions and /v1/responses.
- Use tools as you would with OpenAI-compatible clients.
- Set agent_format to request a specific tool-call syntax (useful for agentic CLIs/parsers).
See Tool Syntax Rewriting for details and supported formats.
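A minimal tool definition follows the standard OpenAI function schema (get_weather is a made-up example tool, not part of AbstractCore):

```python
import json

# One OpenAI-style tool definition attached to a chat request.
body = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# POST to /v1/chat/completions or /v1/responses; optionally add a top-level
# agent_format field to request a specific tool-call syntax.
print(json.dumps(body, indent=2))
```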
Agentic CLI Integration
Point agentic CLIs at the server by setting OpenAI-compatible environment variables.
Codex CLI
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="unused"
export ABSTRACTCORE_API_KEY="unused"
codex --model "ollama/qwen3-coder:30b" "Write a factorial function"
Tool-call format defaults
# Example: prefer LLaMA3-style tool markup by default
export ABSTRACTCORE_DEFAULT_TOOL_CALL_TAGS=llama3
export ABSTRACTCORE_DEFAULT_EXECUTE_TOOLS=false
Deployment Notes
- Use uvicorn ... --workers N for multiple workers in production.
- Use /health for monitoring and readiness checks.
- Be cautious with unload_after in multi-tenant environments.