FAQ

Common questions about installs, providers, tools, structured output, media, embeddings, and the server.

Table of Contents

  • Install & Extras
  • Providers & Models
  • Tools
  • Structured Output
  • Media / Vision
  • Embeddings
  • HTTP Server
  • Debugging & Downloads
  • Scope & Philosophy

Install & Extras

What do I get with pip install abstractcore?

The default install is intentionally lightweight. It includes the core API (create_llm, BasicSession, tool definitions, structured output plumbing) and only small dependencies. Heavy dependencies live behind install extras.

See Getting Started and Prerequisites.

Which extra do I need for my provider?

pip install "abstractcore[openai]"       # OpenAI SDK
pip install "abstractcore[anthropic]"    # Anthropic SDK
pip install "abstractcore[huggingface]"  # Transformers / torch (heavy)
pip install "abstractcore[mlx]"          # Apple Silicon local inference (heavy)
pip install "abstractcore[vllm]"         # GPU server integration (heavy)

These providers work with the core install (no provider extra): ollama, lmstudio, openrouter, openai-compatible.

How do I combine extras?

# zsh: keep quotes
pip install "abstractcore[openai,media,tools]"

For turnkey installs, see the project README (extras like all-apple, all-non-mlx, all-gpu).

Why did my install pull torch / take a long time?

You probably installed a heavy extra (most commonly abstractcore[huggingface], abstractcore[mlx], or a turnkey all-* extra). The core install (pip install abstractcore) does not include torch/transformers.

Providers & Models

What’s the difference between “provider” and “model”?

  • Provider: a backend adapter (openai, anthropic, ollama, lmstudio, …)
  • Model: a provider-specific model name (for example gpt-4o-mini or qwen3:4b-instruct)

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")

How do I connect to a local server (Ollama / LMStudio / vLLM / OpenAI-compatible)?

Use the matching provider and set base_url (or the provider’s base-url env var).

from abstractcore import create_llm

llm = create_llm("ollama", model="qwen3:4b-instruct", base_url="http://localhost:11434")
llm = create_llm("lmstudio", model="qwen/qwen3-4b-2507", base_url="http://localhost:1234/v1")
llm = create_llm("vllm", model="Qwen/Qwen3-Coder-30B-A3B-Instruct", base_url="http://localhost:8000/v1")

# Generic OpenAI-compatible endpoint
llm = create_llm("openai-compatible", model="my-model", base_url="http://localhost:1234/v1")

See Prerequisites for setup details and env var names.

How do I set API keys and defaults?

Use env vars, or persist settings via the config CLI:

abstractcore --configure
abstractcore --set-api-key openai sk-...
abstractcore --set-api-key anthropic sk-ant-...
abstractcore --status
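
Or export the usual SDK environment variables directly (a sketch using the standard OpenAI/Anthropic variable names; see Prerequisites for each provider's exact names):

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."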

Config is stored in ~/.abstractcore/config/abstractcore.json. See Centralized Configuration.

Tools

Why aren’t tools executed automatically?

By default, AbstractCore runs in pass-through mode (execute_tools=False): it returns tool calls in resp.tool_calls, and your host/runtime decides whether/how to execute them.

Automatic execution (execute_tools=True) exists but is deprecated for most use cases. See Tool Calling.
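
A minimal sketch of the pass-through flow, assuming tools are passed to generate() as a list of Python callables via a tools=... parameter (see Tool Calling for the exact tool-definition format and the fields on each entry of resp.tool_calls):

from abstractcore import create_llm

def get_weather(city: str) -> str:
    """Return a short weather summary for a city."""
    return f"Sunny in {city}"

llm = create_llm("openai", model="gpt-4o-mini")

# Pass-through (the default): the model may request tool calls,
# but nothing is executed automatically.
resp = llm.generate("What's the weather in Paris?", tools=[get_weather])

for call in resp.tool_calls or []:
    # Your host/runtime decides whether and how to execute each call.
    print(call)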

What’s the difference between web_search, skim_websearch, skim_url, and fetch_url?

These built-in web tools live in abstractcore.tools.common_tools and require:

pip install "abstractcore[tools]"
  • web_search: fuller DuckDuckGo result set (good when you want breadth or more options).
  • skim_websearch: compact/filtered search results (good default for agents to keep prompts smaller). Defaults to 5 results and truncates long snippets.
  • skim_url: fast URL triage (fetches only a prefix and extracts lightweight metadata + a short preview). Defaults: max_bytes=200_000, max_preview_chars=1200, max_headings=8.
  • fetch_url: full fetch + parsing for text-first types (HTML→Markdown, JSON/XML/text). For PDFs/images/other binaries it returns metadata and optional previews; it does not do full PDF text extraction. It downloads up to 10MB by default; use include_full_content=False for smaller outputs.

Recommended workflow: skim_websearch → skim_url → fetch_url (use include_full_content=False when you want a smaller fetch_url output).
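
A sketch of that workflow, calling the tools directly as plain Python functions (the first positional argument is assumed to be the query/URL; check the signatures in abstractcore.tools.common_tools):

from abstractcore.tools.common_tools import skim_websearch, skim_url, fetch_url

# 1. Compact search results (5 results by default, truncated snippets)
results = skim_websearch("http/3 deployment guide")

# 2. Cheap triage of a candidate URL (prefix fetch + lightweight metadata)
preview = skim_url("https://example.com/article")

# 3. Full fetch of the page you actually want, with a smaller output
page = fetch_url("https://example.com/article", include_full_content=False)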

How do I preserve tool-call markup in response.content for agentic CLIs?

Use tool-call syntax rewriting:

  • Python: pass tool_call_tags=... to generate() / agenerate()
  • Server: set agent_format in requests

See Tool Syntax Rewriting.

Structured Output

How do I get structured output (typed objects) instead of parsing JSON?

Pass a Pydantic model via response_model=...:

from pydantic import BaseModel
from abstractcore import create_llm

class Answer(BaseModel):
    title: str
    bullets: list[str]

llm = create_llm("openai", model="gpt-4o-mini")
result = llm.generate("Summarize HTTP/3 in 3 bullets.", response_model=Answer)
print(result.bullets)

See Structured Output.

Why does structured output retry or fail validation?

Validation failures trigger retries with schema feedback (up to the configured retry limit). Common fixes:

  • Simplify schemas (fewer nested structures, fewer strict constraints)
  • Tighten prompts (explicit allowed values and ranges)
  • Increase timeouts for slow backends
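
For example, a flat schema with the allowed values spelled out in both the type and the prompt (a sketch; adjust provider and model to your setup):

from typing import Literal

from pydantic import BaseModel
from abstractcore import create_llm

class Verdict(BaseModel):
    # Flat schema, no nesting; allowed labels are explicit in the type
    label: Literal["positive", "neutral", "negative"]
    confidence: float

llm = create_llm("openai", model="gpt-4o-mini")
result = llm.generate(
    "Classify the sentiment of: 'Great product, arrived late.' "
    "label must be one of positive/neutral/negative; confidence is between 0 and 1.",
    response_model=Verdict,
)
print(result.label, result.confidence)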

See Structured Output and Troubleshooting.

Media / Vision

Why do PDFs / Office docs / images not work?

Those require the media extra:

pip install "abstractcore[media]"

Then pass media=[...] to generate() or use the media pipeline. See Media Handling.
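
For example (a sketch assuming plain file paths are accepted in the media list; see Media Handling for the full set of accepted inputs):

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")

# Assumption: file paths can be passed directly in media=[...]
resp = llm.generate("Summarize the attached report.", media=["report.pdf"])
print(resp.content)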

How do I attach audio or video?

Audio and video attachments are supported via media=[...], but they are policy-driven by design:

  • Audio defaults to audio_policy="native_only" (fails unless the selected model supports native audio input).
  • Video defaults to video_policy="auto" (native when supported; otherwise samples frames and routes them through vision handling).

For speech audio, use audio_policy="speech_to_text" (typically requires installing abstractvoice). See Media Handling, Vision Capabilities, and Audio & Voice.
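
A sketch for speech audio, assuming the policy is passed to generate() alongside media=[...] (see Media Handling for where audio_policy/video_policy are actually configured):

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")

# Assumption: audio_policy is accepted as a generate() keyword argument.
# speech_to_text typically requires the abstractvoice package.
resp = llm.generate(
    "Transcribe and summarize this recording.",
    media=["meeting.wav"],
    audio_policy="speech_to_text",
)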

How do I do speech-to-text (STT) or text-to-speech (TTS)?

Install the optional capability plugin package:

pip install abstractvoice

Then use the deterministic capability surfaces:

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")  # provider/model is for LLM calls; STT/TTS are deterministic
print(llm.capabilities.status())  # availability + selected backend ids + install hints

wav_bytes = llm.voice.tts("Hello", format="wav")
text = llm.audio.transcribe("speech.wav")

If you run the optional HTTP server, you can also use OpenAI-compatible endpoints (POST /v1/audio/transcriptions, POST /v1/audio/speech). See Server.

How do I generate or edit images?

Generative vision is intentionally not part of AbstractCore’s default install. Use abstractvision:

pip install abstractvision

You can use it through AbstractCore’s llm.vision.* capability plugin surface, or via AbstractCore Server’s optional endpoints (POST /v1/images/generations, POST /v1/images/edits). See Server and Capabilities.

What are “glyphs” and what do they require?

Glyph visual-text compression is an optional feature for long documents. Until this website includes a dedicated page, see the GitHub docs: Glyph Visual-Text Compression.

Embeddings

How do I use embeddings?

Embeddings are opt-in:

pip install "abstractcore[embeddings]"

Then import and use the embeddings module:

from abstractcore.embeddings import EmbeddingManager
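
A hypothetical usage sketch (the constructor arguments and method name below are illustrative, not confirmed by this FAQ; see Embeddings for the actual API):

from abstractcore.embeddings import EmbeddingManager

manager = EmbeddingManager()                           # hypothetical: may take a model name
vector = manager.embed("How do I cache embeddings?")   # hypothetical method name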

See Embeddings.

HTTP Server

Do I need the HTTP server?

No. The server is optional and is mainly for:

  • Exposing one OpenAI-compatible /v1 endpoint that can route to multiple providers/models
  • Integrating with OpenAI-compatible clients and agentic CLIs

Install and run:

pip install "abstractcore[server]"
python -m abstractcore.server.app
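
Once it is running, any OpenAI-compatible client can talk to it. A sketch using the official openai Python client (the host, port, and model-routing convention shown are assumptions; check the server's startup output and the HTTP Server Guide):

from openai import OpenAI

# Assumed host/port; adjust to wherever the server actually listens.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused-but-required")
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # how models/providers are addressed depends on server config
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)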

See HTTP Server Guide.

Debugging & Downloads

Where are logs and traces?

  • Logging: configured via the config CLI and config file.
  • Interaction tracing: opt-in (enable_tracing=True).

See Structured Logging and (for tracing) Interaction Tracing.
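
A sketch for enabling tracing (the assumption here is that enable_tracing is accepted when creating the LLM; see Interaction Tracing for where the flag actually lives):

from abstractcore import create_llm

# Assumption: enable_tracing=True is passed at creation time.
llm = create_llm("openai", model="gpt-4o-mini", enable_tracing=True)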

I’m getting HTTP timeouts. What should I change?

  • Per-provider: pass timeout=... to create_llm(...) (timeout=None means unlimited).
  • Process-wide default: abstractcore --set-default-timeout 0 (0 = unlimited).
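
For example, with a slow local backend:

from abstractcore import create_llm

# timeout=None disables the per-request HTTP timeout entirely
llm = create_llm("lmstudio", model="qwen/qwen3-4b-2507",
                 base_url="http://localhost:1234/v1", timeout=None)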

See Troubleshooting and Centralized Config.

HuggingFace won’t download models — why?

The HuggingFace provider respects offline-first settings. If you want it to fetch from the Hub, update ~/.abstractcore/config/abstractcore.json:

  • Set "offline_first": false
  • Set "force_local_files_only": false

Restart your Python process after changing this (the provider reads these settings at import time).

Scope & Philosophy

Is AbstractCore a full agent/RAG framework?

AbstractCore focuses on provider abstraction + infrastructure (tools, structured output, media handling, tracing). It does not ship a full RAG pipeline or multi-step agent orchestration framework.

See Capabilities for current scope and limitations.

Related Documentation

  • Getting Started: First call + core concepts
  • Prerequisites: Provider setup + env vars
  • Tool Calling: Pass-through tool calls + safety boundaries
  • HTTP Server: OpenAI-compatible gateway