Hybrid router scripts
This page describes how to use the scripts that maintain Layer 1 (heuristic) and Layer 2 (semantic) routing rules for mix mode. Run all commands from the repository root.
Overview
| Script | Purpose |
|---|---|
| validate_heuristic_rules.py | Check that no keyword appears in both local and cloud rules (avoids ambiguous routing). |
| generate_utterances.py | Generate many local/cloud example utterances for the semantic router (verb×object + expert list). |
| generate_heuristics.py | Generate heuristic keywords from a verb×noun matrix and/or {{a|b}} templates. |
When to use
- After editing
config/hybrid/heuristic_rules.yml→ run validate_heuristic_rules.py. - To enrich Layer 2 utterances (more coverage, EN+ZH) → run generate_utterances.py (optionally with
--merge). - To enrich Layer 1 keywords (more verb/noun combos or templates) → run generate_heuristics.py (optionally with
--merge), then validate.
Prerequisites
- Python 3 and PyYAML (
pip install pyyaml). - No need to start Core or install full app dependencies to run these scripts.
1. validate_heuristic_rules.py
Validates heuristic_rules.yml for keyword conflicts: the same normalized phrase must not be assigned to both local and cloud. The router is first-match-wins; conflicts make behavior unclear.
Usage
python3 scripts/validate_heuristic_rules.py
python3 scripts/validate_heuristic_rules.py path/to/heuristic_rules.yml
Behavior
- Reads
config/hybrid/heuristic_rules.yml(or the path you pass). - Normalizes keywords (lowercase + Unicode NFC), same as the router.
- If any keyword appears in both a local and a cloud rule → prints conflicts and exits with code 1.
- If no conflicts → prints
OKand exits 0.
When to run
- After hand-editing heuristic rules.
- After running generate_heuristics.py --merge.
2. generate_utterances.py (Layer 2 semantic routes)
Generates dense local and cloud example utterances for the semantic router. Uses:
- Verb×object templates (e.g. find+file, open+browser) in several categories.
- Expert lists of hand-written phrases (including simple conversation: hi, thanks, bye, 你好, 谢谢, etc.) so chit-chat routes to local in Layer 2.
Usage
# Generate to default file (config/hybrid/generated_utterances.yml)
python3 scripts/generate_utterances.py
# Custom output path
python3 scripts/generate_utterances.py -o path/to/output.yml
# Merge into semantic_routes.yml (existing + generated, deduped; no cross-list duplicates)
python3 scripts/generate_utterances.py --merge
# Limit size (for latency: ~300–500 local is a good balance)
python3 scripts/generate_utterances.py --max-local 500 --max-cloud 300
Options
| Option | Default | Description |
|---|---|---|
--output, -o |
config/hybrid/generated_utterances.yml |
Output YAML path. |
--max-local |
500 | Max local utterances. |
--max-cloud |
300 | Max cloud utterances. |
--max-from-templates |
450 | Max local utterances from verb×object combinations. |
--merge |
off | Merge into config/hybrid/semantic_routes.yml (existing first, then generated; deduped within and across lists). |
Using the output
- Option A: In
config/core.yml, sethybrid_router.semantic.routes_path: config/hybrid/generated_utterances.ymlso the router uses the generated file. - Option B: Run with
--mergeto add generated utterances into semantic_routes.yml (one file, existing + generated). - Option C: Run with
-o config/hybrid/semantic_routes.ymlto overwrite semantic_routes with only the generated lists (no merge).
Tips
- Keep local utterances around 300–500 so embedding lookup stays fast (~10 ms).
- When you add new features (e.g. a Spotify plugin), add verbs/objects in the script’s
LOCAL_TEMPLATESor expert lists and re-run.
3. generate_heuristics.py (Layer 1 heuristic rules)
Generates heuristic keywords from:
- A simple-conversation rule (hello, hi, thanks, bye, ok, 你好, 谢谢, 再见, …) so chit-chat routes to local in Layer 1.
- A verb×noun matrix (e.g. open+wifi, search+file) per category (screen, system, files, app, cloud research, …).
- Templates using
{{a|b|c}}: e.g.{{open|launch}} {{browser|app}}expands to all permutations.
Usage
# Generate to default file (config/hybrid/generated_heuristic_rules.yml)
python3 scripts/generate_heuristics.py
# Custom output
python3 scripts/generate_heuristics.py -o path/to/output.yml
# Merge into heuristic_rules.yml (existing first; generated local then cloud; skips keywords that would conflict with the opposite route)
python3 scripts/generate_heuristics.py --merge
Options
| Option | Default | Description |
|---|---|---|
--output, -o |
config/hybrid/generated_heuristic_rules.yml |
Output YAML path. |
--merge |
off | Merge into config/hybrid/heuristic_rules.yml (existing rules kept; generated appended; conflicting keywords skipped). |
Template syntax (in script or in YAML)
{{a|b|c}}→ one ofa,b, orc.- Optional word:
{{a |}}→"a "or empty (so you get “take a screenshot” and “take screenshot”). - Be specific; do not use greedy patterns like
{{.*}}(they are rejected).
After merge
Run the validator:
python3 scripts/validate_heuristic_rules.py
Customization
- Edit MATRIX (verbs/nouns per category) and TEMPLATES (list of
{route, tmpl}) inscripts/generate_heuristics.py, then re-run. - You can also put
tmplin heuristic_rules.yml; the router expands templates to keywords when loading (see Mix mode and reports).
Recommended workflow
- Heuristic rules
- Edit
config/hybrid/heuristic_rules.ymlby hand and/or rungenerate_heuristics.py --merge. - Run
validate_heuristic_rules.py; fix any reported conflicts. - Semantic utterances
- Edit
config/hybrid/semantic_routes.ymlby hand and/or rungenerate_utterances.py --merge. - Optionally point
routes_pathtogenerated_utterances.ymlinstead of merging. - Restart Core (if it’s running) so it reloads the updated YAML files.
Config file locations
| File | Purpose |
|---|---|
config/hybrid/heuristic_rules.yml |
Layer 1 rules (keywords / tmpl; long_input_*). Loaded by Core when mix mode is on. |
config/hybrid/semantic_routes.yml |
Layer 2 example utterances (local_utterances, cloud_utterances). |
config/hybrid/generated_utterances.yml |
Optional output of generate_utterances.py; use as semantic routes_path if desired. |
config/hybrid/generated_heuristic_rules.yml |
Optional output of generate_heuristics.py; merge into heuristic_rules or use as reference. |
Core chooses the heuristic and semantic files via config/core.yml under hybrid_router.heuristic.rules_path and hybrid_router.semantic.routes_path when main_llm_mode: mix.