core.yml — detailed introduction
core.yml is the main configuration file for HomeClaw Core. It lives in config/core.yml (relative to the project root). This page explains the main sections and what each setting means so you can tune Core for your setup.
Overview
| Section | What it controls |
|---|---|
| Server | Where Core listens (host, port). |
| Models | Which LLM is used for chat and embedding (local and/or cloud); mix-mode routing. |
| Memory | RAG memory, agent memory files, daily memory, session lifecycle. |
| Knowledge base | User documents and URLs for RAG; file upload handling. |
| Routing | Mix mode: how Core chooses local vs cloud per request (heuristic, semantic, classifier). |
| Skills & plugins | Skills (SKILL.md), plugins (built-in + external), system plugins. |
| Tools | File read base, web search, browser, run_skill, exec allowlist, timeouts. |
| Auth | API key when Core is exposed (e.g. tunnel, remote app). |
1. Server
| Setting | Meaning |
|---|---|
| `name` | Core instance name (e.g. `core`). Used in logs and plugin registration. |
| `host` | Bind address. `0.0.0.0` = accept connections from any interface (e.g. LAN, tunnel); `127.0.0.1` = local only. |
| `port` | HTTP port Core listens on. Default `9000`. Clients (Companion app, WebChat, channels) use `http://<host>:9000`. |
| `mode` | Environment hint (e.g. `dev`, `prod`). Affects logging and optional behavior. |
| `model_path` | Base path for local model files (GGUF). Relative to the project root or absolute. Default `../models/`. |
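Putting these together, a minimal server block could look like this (a sketch with illustrative values, assuming these keys sit at the top level of core.yml; check your own file for the exact nesting):

```yaml
name: core
host: 0.0.0.0          # accept LAN/tunnel connections; use 127.0.0.1 for local only
port: 9000             # clients connect to http://<host>:9000
mode: dev              # environment hint; affects logging and optional behavior
model_path: ../models/ # base path for local GGUF files
```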
2. Models
These settings define which model Core uses for chat (main LLM) and for embeddings (e.g. memory, knowledge base).
Main and embedding model
| Setting | Meaning |
|---|---|
| `main_llm` | Single main model when not using mix mode: `local_models/<id>` or `cloud_models/<id>`. |
| `main_llm_mode` | `local` = always use local; `cloud` = always use cloud; `mix` = the router picks local or cloud per request (see Routing below). |
| `main_llm_local` | When the mode is `local` or `mix`: which local model to use (e.g. `local_models/main_vl_model_4B`). |
| `main_llm_cloud` | When the mode is `cloud` or `mix`: which cloud model to use (e.g. `cloud_models/Gemini-2.5-Flash`). |
| `embedding_llm` | Model used for embeddings (RAG, knowledge base). Can be local or cloud (e.g. `local_models/embedding_text_model`). |
| `main_llm_language` | Preferred response languages (e.g. `[en, zh]`). The first item is the primary language for prompts. |
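For example, a mix-mode model selection could look like this (a sketch; the ids must match entries in `local_models` / `cloud_models` described below):

```yaml
main_llm_mode: mix
main_llm_local: local_models/main_vl_model_4B      # used when the router picks local
main_llm_cloud: cloud_models/Gemini-2.5-Flash      # used when the router picks cloud
embedding_llm: local_models/embedding_text_model   # embeddings for memory and knowledge base
main_llm_language: [en, zh]                        # first entry is the primary prompt language
```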
local_models
List of local models (llama.cpp servers). Each entry has:
- `id` — Unique id; you refer to it as `local_models/<id>` in `main_llm` / `embedding_llm`.
- `path` — GGUF file path relative to `model_path`.
- `host`, `port` — Where the llama.cpp server for this model runs.
- `capabilities` — `[Chat]` for chat, `[embedding]` for embedding, or both.
- `mmproj` — (Optional) Path to the vision projector .gguf for image input.
- `supported_media` — (Optional) e.g. `[image]` for vision; defaults to `[image]` when `mmproj` is set.
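A single entry could look like this (a sketch; the file name and port are made up):

```yaml
local_models:
  - id: main_vl_model_4B
    path: main-vl-4b-q4_k_m.gguf   # hypothetical file name, relative to model_path
    host: 127.0.0.1
    port: 9100                     # where this model's llama.cpp server listens
    capabilities: [Chat]
    mmproj: mmproj-vl-4b.gguf      # optional: vision projector for image input
    supported_media: [image]
```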
You run a separate llama.cpp server per model (or let Core start them for you). See Models and llama.cpp-master/README.md.
cloud_models
List of cloud models (LiteLLM / provider APIs). Each entry has:
- `id` — Unique id; you refer to it as `cloud_models/<id>`.
- `path` — LiteLLM model name (e.g. `gemini/gemini-2.5-flash`, `openai/gpt-4o`).
- `host`, `port` — LiteLLM proxy (often the same for all cloud models).
- `api_key_name` — Environment variable name for the API key (e.g. `GEMINI_API_KEY`).
- `api_key` — (Optional) Set the key here instead of in the environment (convenience only; avoid committing secrets).
- `supported_media` — (Optional) e.g. `[image]` for vision models.
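An entry for a Gemini model routed through LiteLLM could look like this (a sketch; proxy host and port are illustrative):

```yaml
cloud_models:
  - id: Gemini-2.5-Flash
    path: gemini/gemini-2.5-flash   # LiteLLM model name
    host: 127.0.0.1                 # LiteLLM proxy
    port: 4000
    api_key_name: GEMINI_API_KEY    # the key is read from this environment variable
    supported_media: [image]
```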
API key can be set via environment variable (recommended) or api_key in core.yml. See Models.
llama_cpp and completion
- `llama_cpp` — Defaults for local llama.cpp servers: `ctx_size` (context window), `predict` (max output tokens), `temp`, `threads`, `n_gpu_layers`, `repeat_penalty`, `function_calling`. The sub-key `embedding` holds overrides that apply only to the embedding model.
- `completion` — Parameters sent with every chat request: `max_tokens`, `temperature`, `top_p`, `repeat_penalty`, `image_max_dimension` (resize images for vision).
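A sketch of both blocks with illustrative values:

```yaml
llama_cpp:
  ctx_size: 8192          # context window
  predict: 1024           # max output tokens
  temp: 0.7
  threads: 8
  n_gpu_layers: 99        # offload as many layers to GPU as fit
  repeat_penalty: 1.1
  function_calling: true
  embedding:
    ctx_size: 2048        # overrides that apply to the embedding model only

completion:
  max_tokens: 1024
  temperature: 0.7
  top_p: 0.95
  repeat_penalty: 1.1
  image_max_dimension: 1024   # resize images before sending them to a vision model
```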
3. Routing (mix mode)
When main_llm_mode: mix, Core chooses local or cloud for each user message. These settings control how that choice is made.
| Setting | Meaning |
|---|---|
| `hybrid_router.default_route` | Fallback when no other rule applies: `local` or `cloud`. |
| `hybrid_router.fallback_on_llm_error` | If the chosen model fails (timeout/error), retry once with the other route. |
| `hybrid_router.show_route_in_response` | Include which route (local/cloud) was used in the reply (e.g. for tuning). |
Layer 1: Heuristic rules
- `hybrid_router.heuristic.enabled` — Use keyword/simple rules (e.g. "translate" → cloud).
- `hybrid_router.heuristic.threshold` — Score threshold for a heuristic match.
- `hybrid_router.heuristic.rules_path` — YAML file with the rules (e.g. `config/hybrid/heuristic_rules.yml`).
Layer 2: Semantic routes
- `hybrid_router.semantic.enabled` — Use semantic similarity to route (e.g. "complex coding" → cloud).
- `hybrid_router.semantic.threshold` — Similarity threshold.
- `hybrid_router.semantic.routes_path` — YAML with example phrases per route (e.g. `config/hybrid/semantic_routes.yml`).
Layer 3: Classifier or perplexity
- `hybrid_router.slm.enabled` — Use a small model or a perplexity probe to decide.
- `hybrid_router.slm.mode` — `classifier` = a small model answers "Local or Cloud?"; `perplexity` = use the main local model's confidence (logprobs) to decide.
- `hybrid_router.slm.model` — Local model id for the classifier (e.g. `local_models/classifier_0_6b`).
- `hybrid_router.slm.threshold` — Score threshold for Layer 3.
Order: heuristic → semantic → Layer 3 → default_route. See Mix mode and reports.
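Putting the three layers together, a hybrid_router block could look like this (a sketch; the thresholds are illustrative and need tuning for your models):

```yaml
hybrid_router:
  default_route: local          # used when no layer decides
  fallback_on_llm_error: true   # retry once on the other route
  show_route_in_response: false
  heuristic:                    # Layer 1: keyword/simple rules
    enabled: true
    threshold: 0.8
    rules_path: config/hybrid/heuristic_rules.yml
  semantic:                     # Layer 2: similarity to example phrases
    enabled: true
    threshold: 0.75
    routes_path: config/hybrid/semantic_routes.yml
  slm:                          # Layer 3: small classifier model
    enabled: true
    mode: classifier            # or perplexity
    model: local_models/classifier_0_6b
    threshold: 0.6
```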
4. Memory
These settings control RAG memory (vector + relational), agent memory files, and session lifecycle.
| Setting | Meaning |
|---|---|
| `use_memory` | Turn RAG memory (search, store) on or off. |
| `memory_backend` | `cognee` (default) or `chroma`. Cognee uses its own DB; Chroma uses the `vectorDB` and `graphDB` settings in core.yml. To clear memory, call POST or GET `http://<core_host>:<core_port>/memory/reset`. |
Agent and daily memory (file-based)
| Setting | Meaning |
|---|---|
| `use_agent_memory_file` | Inject long-term context from AGENT_MEMORY.md (path set by `agent_memory_path`; defaults to `workspace_dir/AGENT_MEMORY.md`). |
| `agent_memory_max_chars` | Max characters to inject from that file; `0` = no limit. |
| `use_agent_memory_search` | If true, index agent memory and use the `agent_memory_search` / `agent_memory_get` tools instead of injecting the full file (recommended for large files). |
| `use_daily_memory` | Inject short-term context from daily files (e.g. `memory/YYYY-MM-DD.md` in `daily_memory_dir`). |
| `daily_memory_dir` | Directory for daily files; empty = `workspace_dir/memory`. |
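Combined, the memory settings could look like this (a sketch with illustrative values):

```yaml
use_memory: true
memory_backend: cognee          # or chroma (then vectorDB/graphDB apply)

use_agent_memory_file: true     # long-term notes from AGENT_MEMORY.md
agent_memory_max_chars: 8000    # 0 = no limit
use_agent_memory_search: false  # true = index and search instead of injecting the file

use_daily_memory: true          # short-term notes from memory/YYYY-MM-DD.md
daily_memory_dir: ""            # empty = workspace_dir/memory
```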
Session
| Setting | Meaning |
|---|---|
| `session.dm_scope` | How conversations are isolated: `main` = one shared session; `per-peer` = by sender id; `per-channel-peer` = by channel + sender; `per-account-channel-peer` = by account + channel + sender. |
| `session.identity_links` | Map one user id to several channel ids (e.g. the same person on Telegram and Discord) so they share one session. |
| `session.prune_keep_last_n` | Keep at most this many turns per session when pruning. |
| `session.prune_after_turn` | If true, prune after each reply to avoid unbounded context. |
| `session.daily_reset_at_hour` | `0`–`23` = start a new session when the last activity was before today at this hour; `-1` = disabled. |
| `session.idle_minutes` | Start a new session when the last activity is older than N minutes; `-1` = disabled. |
| `session.api_enabled` | Expose GET `/api/sessions` for UIs. |
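A session block could look like this (a sketch; the `identity_links` entry format in particular is an assumption, so check your own file):

```yaml
session:
  dm_scope: per-channel-peer   # isolate by channel + sender
  identity_links:              # hypothetical format: one user id -> channel ids
    alice: ["telegram:12345", "discord:67890"]
  prune_keep_last_n: 40        # keep at most 40 turns when pruning
  prune_after_turn: true
  daily_reset_at_hour: 4       # new session after 04:00 each day; -1 = disabled
  idle_minutes: 120            # new session after 2 h idle; -1 = disabled
  api_enabled: true            # expose GET /api/sessions
```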
Reset memory via POST/GET `http://<core_host>:<core_port>/memory/reset`. See Tools and the design docs in the repo.
5. Knowledge base
User documents and URLs for RAG (separate from chat memory).
| Setting | Meaning |
|---|---|
| `knowledge_base.enabled` | Turn the knowledge base and its tools (e.g. `knowledge_base_add`, retrieval) on or off. |
| `knowledge_base.backend` | `auto` = same as the memory backend; `cognee` or `chroma` to override. |
| `knowledge_base.collection_name` | Chroma collection name when the backend is `chroma`. |
| `knowledge_base.chunk_size`, `chunk_overlap` | How documents are split for embedding. |
| `knowledge_base.unused_ttl_days` | Remove sources not used for this many days (Cognee: age-based; Chroma: by `last_used`). |
| `knowledge_base.retrieval_min_score` | Minimum similarity (0–1) for retrieved chunks; `null` = no filter. |
| `file_understanding.add_to_kb_max_chars` | When the user sends only file(s), auto-add them to the KB only if the extracted text length is at most this value; `0` = never auto-add. |
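A knowledge base block could look like this (a sketch with illustrative values):

```yaml
knowledge_base:
  enabled: true
  backend: auto               # follow memory_backend; or cognee / chroma
  collection_name: kb_main    # used only with the chroma backend
  chunk_size: 1000            # characters per chunk for embedding
  chunk_overlap: 200
  unused_ttl_days: 90         # remove sources unused for 90 days
  retrieval_min_score: 0.35   # null = no similarity filter

file_understanding:
  add_to_kb_max_chars: 20000  # 0 = never auto-add uploaded files
```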
Reset the KB via POST/GET `http://<core_host>:<core_port>/knowledge_base/reset`.
6. Skills and plugins
| Setting | Meaning |
|---|---|
| `use_skills` | Enable skills (folders under `skills_dir` containing a SKILL.md). The model sees "Available skills" and can call `run_skill` or use a skill's tools. |
| `skills_dir` | Directory for skill folders (default: `skills` in the project root). |
| `skills_use_vector_search` | Retrieve skills by similarity to the user query instead of loading all of them; reduces prompt size. |
| `skills_similarity_threshold` | Minimum score for a skill to be kept in the prompt. |
| `skills_force_include_rules` | When the user query matches a pattern, always include the listed skill folders (and an optional instruction). |
| `use_tools` | Enable built-in tools (`file_read`, `web_search`, `browser_*`, `run_skill`, etc.). |
| `plugins_max_in_prompt` | When `plugins_use_vector_search` is true, the maximum number of plugins in the routing block after RAG; not used when false (all plugins are included). |
| `plugins_use_vector_search` | Retrieve plugins by similarity (like skills). |
| `plugins_force_include_rules` | When the query matches, always include the listed plugin ids. |
| `system_plugins_auto_start` | If true, Core starts the plugins in `system_plugins/` (e.g. homeclaw-browser) and registers them. |
| `system_plugins` | Allowlist of plugin ids to auto-start; empty = all discovered. |
| `system_plugins_env` | Per-plugin environment variables (e.g. `BROWSER_HEADLESS: "false"` for homeclaw-browser). |
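A sketch of the skills and plugin settings (values illustrative; the `system_plugins_env` layout is an assumption based on the example above):

```yaml
use_skills: true
skills_dir: skills
skills_use_vector_search: true      # include only skills similar to the query
skills_similarity_threshold: 0.4

use_tools: true

plugins_use_vector_search: true
plugins_max_in_prompt: 5            # max plugins in the routing block after RAG

system_plugins_auto_start: true
system_plugins: [homeclaw-browser]  # empty = start all discovered plugins
system_plugins_env:
  homeclaw-browser:
    BROWSER_HEADLESS: "false"
```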
See Plugins, Writing plugins and skills, Tools.
7. Tools
Under the `tools:` block you configure:
| Setting | Meaning |
|---|---|
| `file_read_base` | Base directory for `file_read`, `folder_list`, `document_read`; paths are resolved relative to this. |
| `file_read_max_chars` | Max characters returned by `file_read` when not overridden per call. |
| `run_skill_allowlist` | If set, only these script names under `skill/scripts/` are allowed; `[]` = allow all. |
| `run_skill_timeout` | Timeout in seconds for `run_skill`. |
| `web.search` | Web search provider (`duckduckgo`, `google_cse`, `bing`, `tavily`, `brave`, `serpapi`) and API keys (or env vars). |
| `browser_enabled` | If false, Core does not register browser tools; use a plugin (e.g. homeclaw-browser) for browser actions. |
| `tool_timeout_seconds` | Per-tool execution timeout; `0` = no timeout. |
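A tools block could look like this (a sketch; the exact nesting under `web.search` and its provider key name are assumptions):

```yaml
tools:
  file_read_base: /srv/homeclaw/files   # hypothetical base directory
  file_read_max_chars: 20000
  run_skill_allowlist: []               # [] = allow all scripts under skill/scripts/
  run_skill_timeout: 120                # seconds
  web:
    search:
      provider: duckduckgo              # or google_cse, bing, tavily, brave, serpapi
  browser_enabled: false                # rely on a browser plugin instead
  tool_timeout_seconds: 60              # 0 = no timeout
```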
API keys for tools (e.g. Tavily, Google CSE) can be set in core.yml under the tool block or via environment variables where documented.
8. Auth
When Core is reachable from the internet (e.g. Cloudflare Tunnel, Tailscale Funnel), enable auth so only you can use it.
| Setting | Meaning |
|---|---|
| `auth_enabled` | If true, POST `/inbound` and the WebSocket `/ws` require an API key. |
| `auth_api_key` | The secret key. Clients must send `X-API-Key` or `Authorization: Bearer <key>` on each request. |
Use a long, random value (e.g. 32+ characters). See Remote access.
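For example:

```yaml
auth_enabled: true
auth_api_key: "REPLACE_WITH_A_LONG_RANDOM_VALUE"   # 32+ characters; do not commit real keys
```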
Public URL and Pinggy (Companion scan-to-connect)
When you want to reach Core from another network (e.g. Companion on your phone), GET /pinggy shows a public URL and a QR code for the Companion app (Settings → Scan QR to connect). You can supply the URL in either of two ways:
| Setting | Meaning |
|---|---|
| `core_public_url` | Your public Core URL (e.g. from Cloudflare Tunnel or Tailscale Funnel). When set, `/pinggy` shows this URL and a QR code for Companion; it is also used for file/report links (`/files/out`). Leave empty for local-only use or when using Pinggy. |
| `pinggy.token` | Your Pinggy token from pinggy.io. When set, Core starts the Pinggy tunnel and `/pinggy` shows the tunnel URL and QR code. Leave empty if you use `core_public_url` or another service. |
| `pinggy.open_browser` | If true, open the browser at `/pinggy` when the Pinggy tunnel is ready (default: true). Only applies when `pinggy.token` is set. |
Use core_public_url when you expose Core yourself (e.g. Cloudflare Tunnel, Tailscale Funnel). Use pinggy.token when you want Core to start the Pinggy tunnel. See Remote access.
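A sketch of both options (pick one; the URL is hypothetical):

```yaml
# Option A: you expose Core yourself (Cloudflare Tunnel, Tailscale Funnel, ...)
core_public_url: https://core.example.com

# Option B: let Core start a Pinggy tunnel
pinggy:
  token: ""            # set your pinggy.io token to enable
  open_browser: true   # open /pinggy when the tunnel is ready
```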
9. Other important settings
| Setting | Meaning |
|---|---|
| `profile.enabled` | Per-user profile (name, preferences) stored in JSON; the model can read and update it via tools. |
| `workspace_dir` | Workspace root (e.g. for AGENT_MEMORY.md and identity files). |
| `use_workspace_bootstrap` | Inject workspace files (IDENTITY.md, AGENTS.md, TOOLS.md) into the system prompt. |
| `database` | Relational DB for chat history/sessions: `backend` (`sqlite`, `mysql`, `postgresql`) and `url`. |
| `vectorDB` | Used when `memory_backend: chroma`; `backend` (`chroma`, `qdrant`, etc.) and connection settings. |
| `graphDB` | Used when `memory_backend` is `chroma`; `backend` (`kuzu`, `neo4j`) for the entity/relationship graph. |
| `cognee` | Used when `memory_backend: cognee`; relational, vector, and graph providers plus optional LLM/embedding overrides. |
| `silent` | If true, reduce logging for memory, tools, skills, plugins, and the orchestrator. |
| `llm_max_concurrent_local` | Max concurrent local (llama.cpp) calls; default `1`. |
| `llm_max_concurrent_cloud` | Max concurrent cloud (LiteLLM) calls; default `4`; keep it roughly between 2 and 10 depending on your provider's RPM/TPM limits. |
| `compaction` | When the context approaches the model limit, trim or summarize messages; `reserve_tokens` keeps room for the reply. |
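A few of these in context (a sketch; the database URL and exact nesting are assumptions):

```yaml
workspace_dir: workspace
use_workspace_bootstrap: true   # inject IDENTITY.md, AGENTS.md, TOOLS.md

database:
  backend: sqlite               # or mysql, postgresql
  url: sqlite:///homeclaw.db    # hypothetical connection URL

llm_max_concurrent_local: 1     # llama.cpp usually serves one request at a time
llm_max_concurrent_cloud: 4     # stay within your provider's RPM/TPM limits
```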
Summary
- core.yml controls server, models (local/cloud, mix routing), memory, knowledge base, skills & plugins, tools, and auth.
- Models: Set main_llm_local, main_llm_cloud, embedding_llm; use main_llm_mode: mix and hybrid_router for per-request local/cloud routing.
- Memory: use_memory, memory_backend (cognee/chroma), agent/daily memory files, session scope and pruning.
- Knowledge base: knowledge_base.enabled, backend, chunking, file_understanding for auto-add on file upload.
- Routing: hybrid_router (default_route, heuristic, semantic, slm) when main_llm_mode: mix.
- API keys: For cloud models and some tools, set via environment variable or in core.yml (env recommended for secrets).
For the full file with every key, see config/core.yml (and config/core.yml.reference if present) in the repo. For model examples and tested configs, see Models.