core.yml — detailed introduction
core.yml is the main configuration file for HomeClaw Core. It lives in config/core.yml (relative to the project root). This page explains the main sections and what each setting means so you can tune Core for your setup.
Overview
| Section | What it controls |
|---|---|
| Server | Where Core listens (host, port). |
| Models | Which LLM is used for chat and embedding (local and/or cloud); mix-mode routing. |
| Memory | RAG memory, agent memory files, daily memory, session lifecycle. |
| Knowledge base | User documents and URLs for RAG; file upload handling. |
| Routing | Mix mode: how Core chooses local vs cloud per request (heuristic, semantic, classifier). |
| Skills & plugins | Skills (SKILL.md), plugins (built-in + external), system plugins. |
| Tools | File read base, web search, browser, run_skill, exec allowlist, timeouts. |
| Auth | API key when Core is exposed (e.g. tunnel, remote app). |
1. Server
| Setting | Meaning |
|---|---|
| `name` | Core instance name (e.g. `core`). Used in logs and plugin registration. |
| `host` | Bind address. `0.0.0.0` = accept connections from any interface (e.g. LAN, tunnel); `127.0.0.1` = local only. |
| `port` | HTTP port Core listens on. Default `9000`. Clients (Companion app, WebChat, channels) use `http://<host>:9000`. |
| `mode` | Environment hint (e.g. `dev`, `prod`). Affects logging and optional behavior. |
| `model_path` | Base path for local model files (GGUF). Relative to the project root or absolute. Default `../models/`. |
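Putting these together, a minimal server block could look like this (a sketch with illustrative values, assuming these keys sit at the top level of core.yml; check your own file for the exact nesting):

```yaml
name: core
host: 0.0.0.0          # accept LAN/tunnel connections; use 127.0.0.1 for local only
port: 9000             # clients connect to http://<host>:9000
mode: dev              # environment hint; affects logging and optional behavior
model_path: ../models/ # base path for local GGUF files
```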
2. Models
These settings define which model Core uses for chat (main LLM) and for embeddings (e.g. memory, knowledge base).
Main and embedding model
| Setting | Meaning |
|---|---|
| `main_llm` | Single main model when not using mix mode: `local_models/<id>` or `cloud_models/<id>`. |
| `main_llm_mode` | `local` = always use local; `cloud` = always use cloud; `mix` = the router picks local or cloud per request (see Routing below). |
| `main_llm_local` | When the mode is `local` or `mix`: which local model to use (e.g. `local_models/main_vl_model_4B`). |
| `main_llm_cloud` | When the mode is `cloud` or `mix`: which cloud model to use (e.g. `cloud_models/Gemini-2.5-Flash`). |
| `embedding_llm` | Model used for embeddings (RAG, knowledge base). Can be local or cloud (e.g. `local_models/embedding_text_model`). |
| `main_llm_language` | Preferred response languages (e.g. `[en, zh]`). The first item is the primary language for prompts. |
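For example, a mix-mode model selection could look like this (a sketch; the ids must match entries in `local_models` / `cloud_models` described below):

```yaml
main_llm_mode: mix
main_llm_local: local_models/main_vl_model_4B      # used when the router picks local
main_llm_cloud: cloud_models/Gemini-2.5-Flash      # used when the router picks cloud
embedding_llm: local_models/embedding_text_model   # embeddings for memory and knowledge base
main_llm_language: [en, zh]                        # first entry is the primary prompt language
```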
local_models
List of local models (llama.cpp servers). Each entry has:
- `id` — Unique id; you refer to it as `local_models/<id>` in `main_llm` / `embedding_llm`.
- `path` — GGUF file path relative to `model_path`.
- `host`, `port` — Where the llama.cpp server for this model runs.
- `capabilities` — `[Chat]` for chat, `[embedding]` for embedding, or both.
- `mmproj` — (Optional) Path to the vision projector .gguf for image input.
- `supported_media` — (Optional) e.g. `[image]` for vision; defaults to `[image]` when `mmproj` is set.
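A single entry could look like this (a sketch; the file name and port are made up):

```yaml
local_models:
  - id: main_vl_model_4B
    path: main-vl-4b-q4_k_m.gguf   # hypothetical file name, relative to model_path
    host: 127.0.0.1
    port: 9100                     # where this model's llama.cpp server listens
    capabilities: [Chat]
    mmproj: mmproj-vl-4b.gguf      # optional: vision projector for image input
    supported_media: [image]
```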
You run a separate llama.cpp server per model (or let Core start them for you). See Models and llama.cpp-master/README.md.
cloud_models
List of cloud models (LiteLLM / provider APIs). Each entry has:
- `id` — Unique id; you refer to it as `cloud_models/<id>`.
- `path` — LiteLLM model name (e.g. `gemini/gemini-2.5-flash`, `openai/gpt-4o`).
- `host`, `port` — LiteLLM proxy (often the same for all cloud models).
- `api_key_name` — Environment variable name for the API key (e.g. `GEMINI_API_KEY`).
- `api_key` — (Optional) Set the key here instead of in the environment (convenience only; avoid committing secrets).
- `supported_media` — (Optional) e.g. `[image]` for vision models.
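An entry for a Gemini model routed through LiteLLM could look like this (a sketch; proxy host and port are illustrative):

```yaml
cloud_models:
  - id: Gemini-2.5-Flash
    path: gemini/gemini-2.5-flash   # LiteLLM model name
    host: 127.0.0.1                 # LiteLLM proxy
    port: 4000
    api_key_name: GEMINI_API_KEY    # the key is read from this environment variable
    supported_media: [image]
```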
API key can be set via environment variable (recommended) or api_key in core.yml. See Models.
llama_cpp and completion
- `llama_cpp` — Defaults for local llama.cpp servers: `ctx_size` (context window), `predict` (max output tokens), `temp`, `threads`, `n_gpu_layers`, `repeat_penalty`, `function_calling`. The sub-key `embedding` holds overrides that apply only to the embedding model.
- `completion` — Parameters sent with every chat request: `max_tokens`, `temperature`, `top_p`, `repeat_penalty`, `image_max_dimension` (resize images for vision).
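A sketch of both blocks with illustrative values:

```yaml
llama_cpp:
  ctx_size: 8192          # context window
  predict: 1024           # max output tokens
  temp: 0.7
  threads: 8
  n_gpu_layers: 99        # offload as many layers to GPU as fit
  repeat_penalty: 1.1
  function_calling: true
  embedding:
    ctx_size: 2048        # overrides that apply to the embedding model only

completion:
  max_tokens: 1024
  temperature: 0.7
  top_p: 0.95
  repeat_penalty: 1.1
  image_max_dimension: 1024   # resize images before sending them to a vision model
```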
3. Routing (mix mode)
When main_llm_mode: mix, Core chooses local or cloud for each user message. These settings control how that choice is made.
| Setting | Meaning |
|---|---|
| `hybrid_router.default_route` | Fallback when no other rule applies: `local` or `cloud`. |
| `hybrid_router.fallback_on_llm_error` | If the chosen model fails (timeout/error), retry once with the other route. |
| `hybrid_router.show_route_in_response` | Include which route (local/cloud) was used in the reply (e.g. for tuning). |
Layer 1: Heuristic rules
- `hybrid_router.heuristic.enabled` — Use keyword/simple rules (e.g. "translate" → cloud).
- `hybrid_router.heuristic.threshold` — Score threshold for a heuristic match.
- `hybrid_router.heuristic.rules_path` — YAML file with the rules (e.g. `config/hybrid/heuristic_rules.yml`).
Layer 2: Semantic routes
- `hybrid_router.semantic.enabled` — Use semantic similarity to route (e.g. "complex coding" → cloud).
- `hybrid_router.semantic.threshold` — Similarity threshold.
- `hybrid_router.semantic.routes_path` — YAML with example phrases per route (e.g. `config/hybrid/semantic_routes.yml`).
Layer 3: Classifier or perplexity
- `hybrid_router.slm.enabled` — Use a small model or a perplexity probe to decide.
- `hybrid_router.slm.mode` — `classifier` = a small model answers "Local or Cloud?"; `perplexity` = use the main local model's confidence (logprobs) to decide.
- `hybrid_router.slm.model` — Local model id for the classifier (e.g. `local_models/classifier_0_6b`).
- `hybrid_router.slm.threshold` — Score threshold for Layer 3.
Order: heuristic → semantic → Layer 3 → default_route. See Mix mode and reports.
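Putting the three layers together, a hybrid_router block could look like this (a sketch; the thresholds are illustrative and need tuning for your models):

```yaml
hybrid_router:
  default_route: local          # used when no layer decides
  fallback_on_llm_error: true   # retry once on the other route
  show_route_in_response: false
  heuristic:                    # Layer 1: keyword/simple rules
    enabled: true
    threshold: 0.8
    rules_path: config/hybrid/heuristic_rules.yml
  semantic:                     # Layer 2: similarity to example phrases
    enabled: true
    threshold: 0.75
    routes_path: config/hybrid/semantic_routes.yml
  slm:                          # Layer 3: small classifier model
    enabled: true
    mode: classifier            # or perplexity
    model: local_models/classifier_0_6b
    threshold: 0.6
```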
4. Memory
These settings control RAG memory (vector + relational), agent memory files, and session lifecycle.
| Setting | Meaning |
|---|---|
| `use_memory` | Turn RAG memory (search, store) on or off. |
| `memory_backend` | `cognee` (default) or `chroma`. Cognee uses its own DB; Chroma uses the `vectorDB` and `graphDB` settings in core.yml. To clear memory, call POST or GET `http://<core_host>:<core_port>/memory/reset`. |
Agent and daily memory (file-based)
| Setting | Meaning |
|---|---|
| `use_agent_memory_file` | Inject long-term context from AGENT_MEMORY.md (path set by `agent_memory_path`; defaults to `workspace_dir/AGENT_MEMORY.md`). |
| `agent_memory_max_chars` | Max characters to inject from that file; `0` = no limit. |
| `use_agent_memory_search` | If true, index agent memory and use the `agent_memory_search` / `agent_memory_get` tools instead of injecting the full file (recommended for large files). |
| `use_daily_memory` | Inject short-term context from daily files (e.g. `memory/YYYY-MM-DD.md` in `daily_memory_dir`). |
| `daily_memory_dir` | Directory for daily files; empty = `workspace_dir/memory`. |
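Combined, the memory settings could look like this (a sketch with illustrative values):

```yaml
use_memory: true
memory_backend: cognee          # or chroma (then vectorDB/graphDB apply)

use_agent_memory_file: true     # long-term notes from AGENT_MEMORY.md
agent_memory_max_chars: 8000    # 0 = no limit
use_agent_memory_search: false  # true = index and search instead of injecting the file

use_daily_memory: true          # short-term notes from memory/YYYY-MM-DD.md
daily_memory_dir: ""            # empty = workspace_dir/memory
```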
Session
| Setting | Meaning |
|---|---|
| `session.dm_scope` | How conversations are isolated: `main` = one shared session; `per-peer` = by sender id; `per-channel-peer` = by channel + sender; `per-account-channel-peer` = by account + channel + sender. |
| `session.identity_links` | Map one user id to several channel ids (e.g. the same person on Telegram and Discord) so they share one session. |
| `session.prune_keep_last_n` | Keep at most this many turns per session when pruning. |
| `session.prune_after_turn` | If true, prune after each reply to avoid unbounded context. |
| `session.daily_reset_at_hour` | `0`–`23` = start a new session when the last activity was before today at this hour; `-1` = disabled. |
| `session.idle_minutes` | Start a new session when the last activity is older than N minutes; `-1` = disabled. |
| `session.api_enabled` | Expose GET `/api/sessions` for UIs. |
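A session block could look like this (a sketch; the `identity_links` entry format in particular is an assumption, so check your own file):

```yaml
session:
  dm_scope: per-channel-peer   # isolate by channel + sender
  identity_links:              # hypothetical format: one user id -> channel ids
    alice: ["telegram:12345", "discord:67890"]
  prune_keep_last_n: 40        # keep at most 40 turns when pruning
  prune_after_turn: true
  daily_reset_at_hour: 4       # new session after 04:00 each day; -1 = disabled
  idle_minutes: 120            # new session after 2 h idle; -1 = disabled
  api_enabled: true            # expose GET /api/sessions
```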
Reset memory via POST/GET `http://<core_host>:<core_port>/memory/reset`. See Tools and the design docs in the repo.
5. Knowledge base
User documents and URLs for RAG (separate from chat memory).
| Setting | Meaning |
|---|---|
| `knowledge_base.enabled` | Turn the knowledge base and its tools (e.g. `knowledge_base_add`, retrieval) on or off. |
| `knowledge_base.backend` | `auto` = same as the memory backend; `cognee` or `chroma` to override. |
| `knowledge_base.collection_name` | Chroma collection name when the backend is `chroma`. |
| `knowledge_base.chunk_size`, `chunk_overlap` | How documents are split for embedding. |
| `knowledge_base.unused_ttl_days` | Remove sources not used for this many days (Cognee: age-based; Chroma: by `last_used`). |
| `knowledge_base.retrieval_min_score` | Minimum similarity (0–1) for retrieved chunks; `null` = no filter. |
| `file_understanding.add_to_kb_max_chars` | When the user sends only file(s), auto-add them to the KB only if the extracted text length is at most this value; `0` = never auto-add. |
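A knowledge base block could look like this (a sketch with illustrative values):

```yaml
knowledge_base:
  enabled: true
  backend: auto               # follow memory_backend; or cognee / chroma
  collection_name: kb_main    # used only with the chroma backend
  chunk_size: 1000            # characters per chunk for embedding
  chunk_overlap: 200
  unused_ttl_days: 90         # remove sources unused for 90 days
  retrieval_min_score: 0.35   # null = no similarity filter

file_understanding:
  add_to_kb_max_chars: 20000  # 0 = never auto-add uploaded files
```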
Reset the KB via POST/GET `http://<core_host>:<core_port>/knowledge_base/reset`.
6. Skills and plugins
| Setting | Meaning |
|---|---|
| `use_skills` | Enable skills (folders under `skills_dir` containing a SKILL.md). The model sees "Available skills" and can call `run_skill` or use a skill's tools. |
| `skills_dir` | Directory for skill folders (default: `skills` in the project root). |
| `skills_use_vector_search` | Retrieve skills by similarity to the user query instead of loading all of them; reduces prompt size. |
| `skills_similarity_threshold` | Minimum score for a skill to be kept in the prompt. |
| `skills_force_include_rules` | When the user query matches a pattern, always include the listed skill folders (and an optional instruction). |
| `use_tools` | Enable built-in tools (`file_read`, `web_search`, `browser_*`, `run_skill`, etc.). |
| `plugins_max_in_prompt` | When `plugins_use_vector_search` is true, the maximum number of plugins in the routing block after RAG; not used when false (all plugins are included). |
| `plugins_use_vector_search` | Retrieve plugins by similarity (like skills). |
| `plugins_force_include_rules` | When the query matches, always include the listed plugin ids. |
| `system_plugins_auto_start` | If true, Core starts the plugins in `system_plugins/` (e.g. homeclaw-browser) and registers them. |
| `system_plugins` | Allowlist of plugin ids to auto-start; empty = all discovered. |
| `system_plugins_env` | Per-plugin environment variables (e.g. `BROWSER_HEADLESS: "false"` for homeclaw-browser). |
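A sketch of the skills and plugin settings (values illustrative; the `system_plugins_env` layout is an assumption based on the example above):

```yaml
use_skills: true
skills_dir: skills
skills_use_vector_search: true      # include only skills similar to the query
skills_similarity_threshold: 0.4

use_tools: true

plugins_use_vector_search: true
plugins_max_in_prompt: 5            # max plugins in the routing block after RAG

system_plugins_auto_start: true
system_plugins: [homeclaw-browser]  # empty = start all discovered plugins
system_plugins_env:
  homeclaw-browser:
    BROWSER_HEADLESS: "false"
```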
See Plugins, Writing plugins and skills, Tools.
7. Tools
Under the `tools:` block you configure:
| Setting | Meaning |
|---|---|
| `file_read_base` | Base directory for `file_read`, `folder_list`, `document_read`; paths are resolved relative to this. |
| `file_read_max_chars` | Max characters returned by `file_read` when not overridden per call. |
| `run_skill_allowlist` | If set, only these script names under `skill/scripts/` are allowed; `[]` = allow all. |
| `run_skill_timeout` | Timeout in seconds for `run_skill`. |
| `web.search` | Web search provider (`duckduckgo`, `google_cse`, `bing`, `tavily`, `brave`, `serpapi`) and API keys (or env vars). |
| `browser_enabled` | If false, Core does not register browser tools; use a plugin (e.g. homeclaw-browser) for browser actions. |
| `tool_timeout_seconds` | Per-tool execution timeout; `0` = no timeout. |
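A tools block could look like this (a sketch; the exact nesting under `web.search` and its provider key name are assumptions):

```yaml
tools:
  file_read_base: /srv/homeclaw/files   # hypothetical base directory
  file_read_max_chars: 20000
  run_skill_allowlist: []               # [] = allow all scripts under skill/scripts/
  run_skill_timeout: 120                # seconds
  web:
    search:
      provider: duckduckgo              # or google_cse, bing, tavily, brave, serpapi
  browser_enabled: false                # rely on a browser plugin instead
  tool_timeout_seconds: 60              # 0 = no timeout
```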
API keys for tools (e.g. Tavily, Google CSE) can be set in core.yml under the tool block or via environment variables where documented.
8. Auth
When Core is reachable from the internet (e.g. Cloudflare Tunnel, Tailscale Funnel), enable auth so only you can use it.
| Setting | Meaning |
|---|---|
| `auth_enabled` | If true, POST `/inbound` and the WebSocket `/ws` require an API key. |
| `auth_api_key` | The secret key. Clients must send `X-API-Key` or `Authorization: Bearer <key>` on each request. |
Use a long, random value (e.g. 32+ characters). See Remote access.
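For example:

```yaml
auth_enabled: true
auth_api_key: "REPLACE_WITH_A_LONG_RANDOM_VALUE"   # 32+ characters; do not commit real keys
```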
Public URL and Pinggy (Companion scan-to-connect)
When you want to reach Core from another network (e.g. Companion on your phone), GET /pinggy shows a public URL and a QR code for the Companion app (Settings → Scan QR to connect). You can supply the URL in either of two ways:
| Setting | Meaning |
|---|---|
| `core_public_url` | Your public Core URL (e.g. from Cloudflare Tunnel or Tailscale Funnel). When set, `/pinggy` shows this URL and a QR code for Companion; it is also used for file/report links (`/files/out`). Leave empty for local-only use or when using Pinggy. |
| `pinggy.token` | Your Pinggy token from pinggy.io. When set, Core starts the Pinggy tunnel and `/pinggy` shows the tunnel URL and QR code. Leave empty if you use `core_public_url` or another service. |
| `pinggy.open_browser` | If true, open the browser at `/pinggy` when the Pinggy tunnel is ready (default: true). Only applies when `pinggy.token` is set. |
Use core_public_url when you expose Core yourself (e.g. Cloudflare Tunnel, Tailscale Funnel). Use pinggy.token when you want Core to start the Pinggy tunnel. See Remote access.
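A sketch of both options (pick one; the URL is hypothetical):

```yaml
# Option A: you expose Core yourself (Cloudflare Tunnel, Tailscale Funnel, ...)
core_public_url: https://core.example.com

# Option B: let Core start a Pinggy tunnel
pinggy:
  token: ""            # set your pinggy.io token to enable
  open_browser: true   # open /pinggy when the tunnel is ready
```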
9. Other important settings
| Setting | Meaning |
|---|---|
| `profile.enabled` | Per-user profile (name, preferences) stored in JSON; the model can read and update it via tools. |
| `workspace_dir` | Workspace root (e.g. for AGENT_MEMORY.md and identity files). |
| `use_workspace_bootstrap` | Inject workspace files (IDENTITY.md, AGENTS.md, TOOLS.md) into the system prompt. |
| `database` | Relational DB for chat history/sessions: `backend` (`sqlite`, `mysql`, `postgresql`) and `url`. |
| `vectorDB` | Used when `memory_backend: chroma`; `backend` (`chroma`, `qdrant`, etc.) and connection settings. |
| `graphDB` | Used when `memory_backend` is `chroma`; `backend` (`kuzu`, `neo4j`) for the entity/relationship graph. |
| `cognee` | Used when `memory_backend: cognee`; relational, vector, and graph providers plus optional LLM/embedding overrides. |
| `silent` | If true, reduce logging for memory, tools, skills, plugins, and the orchestrator. |
| `llm_max_concurrent_local` | Max concurrent local (llama.cpp) calls; default `1`. |
| `llm_max_concurrent_cloud` | Max concurrent cloud (LiteLLM) calls; default `4`; keep it roughly between 2 and 10 depending on your provider's RPM/TPM limits. |
| `compaction` | When the context approaches the model limit, trim or summarize messages; `reserve_tokens` keeps room for the reply. |
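A few of these in context (a sketch; the database URL and exact nesting are assumptions):

```yaml
workspace_dir: workspace
use_workspace_bootstrap: true   # inject IDENTITY.md, AGENTS.md, TOOLS.md

database:
  backend: sqlite               # or mysql, postgresql
  url: sqlite:///homeclaw.db    # hypothetical connection URL

llm_max_concurrent_local: 1     # llama.cpp usually serves one request at a time
llm_max_concurrent_cloud: 4     # stay within your provider's RPM/TPM limits
```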
Summary
- core.yml controls server, models (local/cloud, mix routing), memory, knowledge base, skills & plugins, tools, and auth.
- Models: Set main_llm_local, main_llm_cloud, embedding_llm; use main_llm_mode: mix and hybrid_router for per-request local/cloud routing.
- Memory: use_memory, memory_backend (cognee/chroma), agent/daily memory files, session scope and pruning.
- Knowledge base: knowledge_base.enabled, backend, chunking, file_understanding for auto-add on file upload.
- Routing: hybrid_router (default_route, heuristic, semantic, slm) when main_llm_mode: mix.
- API keys: For cloud models and some tools, set via environment variable or in core.yml (env recommended for secrets).
For the full file with every key, see config/core.yml (and config/core.yml.reference if present) in the repo. For model examples and tested configs, see Models.