Karajan Code — multiagent coding orchestrator

Karajan Code

Multiagent coding orchestrator. 24 roles, 5 agents, deterministic guards, TDD, SonarQube, automated review.

Instead of running one AI agent and manually reviewing its output, kj orchestrates specialized roles — each executed by the AI agent you choose. The coder writes code, guards check for destructive patterns, SonarQube scans it, the reviewer checks it, and if issues are found, the coder gets another attempt. Roles define what to do; agents define who does it.

Zero API Costs

Runs on your existing AI subscriptions. No additional API keys or cloud services required. Pair with RTK and Squeezr (see below) for 60-90% extra token savings.

RTK Integration — Bash output compression

RTK (Rust Token Killer) compresses the output of 13 Bash commands that the coder agent uses constantly (including git, ls, find, grep, cat, head, tail, wc, diff, tree, du, and file). When rtk --version is detected at preflight, Karajan transparently wraps every supported command via wrapWithRtk() and accumulates byte savings per session via RtkSavingsTracker. Optional and opt-in by install: no config flag is needed. See installation docs.

Squeezr-compatible — MCP response compression

Squeezr is an MCP proxy that compresses the responses Karajan’s MCP server returns to the host (Claude Code, Cursor, etc.). It’s architecturally orthogonal to RTK: RTK compresses inside the pipeline (Bash command output); Squeezr compresses above it (MCP messages over the wire). Karajan doesn’t integrate Squeezr — Squeezr sits on the host’s MCP transport — but they stack cleanly. Install Squeezr in your MCP host config and Karajan benefits with zero changes.

Documentation Links in Errors

Preflight, bootstrap, and MCP errors include a See: <url> pointer to the relevant docs page. Specific anchors for SonarQube, Docker, agent install, RTK install, and config issues; other failures fall back to the troubleshooting guide.

Telemetry (Opt-Out)

Anonymous usage statistics (version, OS, command, pipeline duration, success rate) to help improve Karajan. Fully opt-out with telemetry: false in config.

Auto-Detect Stack

kj init scans package.json, go.mod, Cargo.toml and more to detect your framework and language. Auto-enables impeccable for frontend projects.

kj status Dashboard

Terminal dashboard showing HU states, current stage, timing, and progress. MCP returns structured JSON for programmatic access.

kj undo

Revert the last pipeline run with a soft reset or --hard. Safely undo changes when a run produces unexpected results.

HU Board Dashboard

Web dashboard for visualizing HU stories and sessions across all projects. Kanban board, session timeline, quality scores. Docker-ready, auto-syncs from local files.

HU Story Certification

Mandatory quality gate that evaluates user stories across 6 dimensions (JTBD context, user specificity, behavior change, control zone, time constraints, survivable experiment). Detects 7 antipatterns, rewrites weak stories, pauses for FDE context. Supports dependency graphs.

Codebase Health Audit

Read-only analysis across 5 dimensions: security, code quality (SOLID/DRY/KISS/YAGNI), performance, architecture, and testing. Generates a health report with A/B/C/D/F scores per dimension and prioritized recommendations without modifying any files.

5 368+ Automated Tests

482 test files covering every pipeline role (incl. rag-context-stage), guard, config option, MCP tool (27 total), the full HU Board package (writable settings modal + shared-plan badge + per-HU assignee), the Project RAG subsystem with multi-language AST chunkers (Python, Rust, Go, Java), hybrid mode + rerank + MMR diversification, golden-query recall@k / MRR harness, the retrieval dashboard, the team-shared HU cohort workflow, and the audit-history-display suite (12 tests). Full suite runs in around 60 s with Vitest. Opt-in subsystems (brain, ci, sonar, hu-board, webperf) are labelled [opt-in: <feature>] so fast feedback loops can skip them via KJ_SKIP_ALL_OPTIN=1. Coverage v8 reports (text + html + lcov) ship as a CI artifact since v2.32 — see KJC-TSK-0465.

Zero-Config Pipeline

Auto-detects TDD based on project test framework. Auto-manages SonarQube Docker lifecycle and config generation. Skips sonar/TDD for infra and doc tasks automatically. Simple tasks run lightweight (coder-only), complex tasks get full pipeline — automatically based on triage.

Skills Mode

8 slash commands (/kj-run, /kj-code, /kj-review, /kj-test, /kj-security, /kj-discover, /kj-architect, /kj-sonar) with built-in guardrails. No MCP needed — works directly in Claude Code.

Host-as-Coder

When the MCP host is the same agent as the coder (e.g., Claude calling kj_run with coder=claude), Karajan delegates directly — no subprocess, no overhead. All guardrails still apply.

Resilient Run

Auto-diagnoses failures and resumes crashed sessions — up to 2 retries. Non-recoverable errors (config, auth, missing agent) fail immediately. Configurable via session.max_auto_resumes.

Standalone Role Commands

Run any pre-build role independently: kj discover, kj triage, kj researcher, kj architect. Available as both CLI commands and MCP tools.

SonarQube + optional SonarCloud

SonarQube (local Docker, blocking quality gates) runs by default and powers the static analysis stage. SonarCloud is opt-in and complementary — enable via --enable-sonarcloud flag, enableSonarcloud: true (MCP), or sonarcloud.enabled: true in kj.config.yml. Requires sonarcloud.token and sonarcloud.organization (or KJ_SONARCLOUD_TOKEN / KJ_SONARCLOUD_ORG env vars). When both run, SonarCloud results are advisory.

Impeccable Design Audit

Automated UI/UX quality gate. Audits changed frontend files for accessibility, performance, theming, responsive, and anti-pattern issues. Runs after SonarQube, applies fixes automatically.

Deterministic Guards

Output guard blocks destructive operations and credential leaks. Perf guard catches frontend anti-patterns. Intent classifier pre-triages obvious tasks without LLM cost. All configurable with custom patterns.

Pre-Execution Discovery

kj_discover analyzes tasks for gaps before coding begins. 5 modes: gap detection, Mom Test questions, Wendel behavior change checklist, START/STOP/DIFFERENT classification, and Jobs-to-be-Done generation.

BecarIA Gateway

Full CI/CD integration with GitHub PRs as source of truth. All agents post comments and reviews on PRs. Early PR creation, configurable dispatch events, and embedded workflow templates.

Real-Time Monitoring

Stall detector, continuous heartbeats, max-silence guardrails, planner runtime cap. kj-tail for colorized live log. kj_status for parsed status.

Intelligent Reviewer Mediation

Scope filter auto-defers out-of-scope reviewer issues instead of stalling the pipeline. Deferred issues tracked as tech debt and fed back to the coder.

Solomon — Pipeline Boss

Evaluates every reviewer rejection, classifies issues as critical vs. style-only, and can override style-only blocks. 6 rules including scope guard, reviewer overreach, and smart iteration control.

Preflight Handshake

kj_preflight requires human confirmation of agent assignments before execution. 3-tier config: session > project > global.

Rate-Limit Standby

Auto-detects rate-limit / quota messages from Claude / Codex / Gemini CLIs and HTTP 429/5xx errors. Parses cooldown when the message uses a recognised format (ISO timestamp, Retry-After: <seconds>, retry in N minutes, or Claude’s resets at YYYY-MM-DD HH:MM UTC) and waits exactly that long with 30 s heartbeats — even if it’s hours away. When no time is parseable, falls back to 5 min default with exponential backoff (cap 30 min) and up to 5 retries before asking a human.
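A minimal sketch of the cooldown parsing and fallback described above. The function names parseCooldownMs and fallbackDelayMs are hypothetical, the regular expressions are illustrative, and the doubling factor for the backoff is an assumption (the text only says "exponential").

```javascript
// Hypothetical sketch, not Karajan's actual implementation.
const FIVE_MIN_MS = 5 * 60 * 1000;
const THIRTY_MIN_MS = 30 * 60 * 1000;

// Returns the wait in milliseconds, or null when no time can be parsed.
function parseCooldownMs(message, now = Date.now()) {
  // "Retry-After: <seconds>"
  const retryAfter = /Retry-After:\s*(\d+)/i.exec(message);
  if (retryAfter) return Number(retryAfter[1]) * 1000;

  // "retry in N minutes"
  const inMinutes = /retry in (\d+) minutes?/i.exec(message);
  if (inMinutes) return Number(inMinutes[1]) * 60 * 1000;

  // "resets at YYYY-MM-DD HH:MM UTC"
  const resetsAt = /resets at (\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}) UTC/i.exec(message);
  if (resetsAt) {
    const target = Date.parse(`${resetsAt[1]}T${resetsAt[2]}:00Z`);
    return Math.max(0, target - now);
  }

  // Bare ISO timestamp such as 2025-01-01T12:00:00Z
  const iso = /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z/.exec(message);
  if (iso) return Math.max(0, Date.parse(iso[0]) - now);

  return null;
}

// Fallback when nothing is parseable: 5 min default, doubling per attempt
// (assumed factor), capped at 30 min.
function fallbackDelayMs(attempt) {
  return Math.min(FIVE_MIN_MS * 2 ** attempt, THIRTY_MIN_MS);
}
```

A caller would try parseCooldownMs first and only fall back to fallbackDelayMs(attempt) when it returns null, stopping after the configured number of retries.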

Pipeline Tracker

Cumulative progress view during kj_run — see which stages are done, running, or pending in real time via MCP and CLI.

Plugin System

Extend with custom agents via .karajan/plugins/. Auto-discovered at startup.

TDD Enforcement

Test changes required when source files change. The pipeline rejects iterations without matching tests.

MCP Server

24 tools exposed via MCP — including kj_discover, kj_triage, kj_researcher, kj_architect for standalone role execution, kj_preflight for human-confirmed agent config, kj_board for HU Board management, kj_status for live parsed status, and kj_undo for reverting pipeline runs. Real-time progress notifications for all tools. Graceful restart after npm updates.

5 AI Agents

Claude, Codex, Gemini, Aider, and OpenCode. Mix and match — use Claude as coder and Codex as reviewer, or any combination. Extensible via plugins.

Multi-Agent Pipeline

24 configurable roles spanning pre-loop, iteration and post-loop phases — triage, planner, coder, reviewer, sonar, solomon, audit and more. Full catalogue in Pipeline roles. Mandatory audit post-approval ensures generated code is certified clean before completing.

Solomon — AI Judge (v2.0)

Refined from pipeline boss to AI judge. Consulted only on genuine dilemmas: security-vs-deadline, conflicting quality gates, stalled loops, risk evaluation. Security issues bypass Solomon deterministically and go straight back to the coder.

Karajan Brain (v2.0)

AI-powered central orchestrator that routes all role-to-role communication, enriches feedback with file hints, verifies outputs via git diff, executes direct actions (npm install, gitignore), and compresses role outputs for 40-70% token savings. Consults Solomon only on genuine dilemmas.

Executable Acceptance Tests (v2.4)

Each HU carries acceptance_tests: an array of shell commands Brain runs after every coder iteration. All pass → HU approved. Any fail → Brain reads the exact error output and sends a concrete diagnostic to the coder. No reviewer. No generic tester. Concrete pass/fail.
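As a rough illustration of the pass/fail loop above, the sketch below runs an acceptance_tests array through the shell and reports the first failure with its output. The function name runAcceptanceTests, the result shape, and the stop-at-first-failure behaviour are assumptions, not Karajan's real API.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical sketch: run each shell command; approve only if all exit 0.
function runAcceptanceTests(commands, cwd = process.cwd()) {
  for (const command of commands) {
    const result = spawnSync(command, { shell: true, cwd, encoding: "utf8" });
    if (result.status !== 0) {
      // Hand the exact output back so the coder gets a concrete diagnostic.
      const output = `${result.stdout || ""}${result.stderr || ""}`.trim();
      return { approved: false, command, exitCode: result.status, output };
    }
  }
  return { approved: true };
}
```

The returned command and output are what a diagnostic for the next coder iteration could be built from.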

Budget: With KJ vs Without KJ (v2.6)

At session end, the budget display projects the cost you would have paid without Karajan’s compression and token savings (RTK + Brain). Clear -88% delta lines keep expectations grounded in real numbers.

Rich Session Journal (v2.6)

Every run writes .reviews/<session>/decisions.md, iterations.md, summary.md, and tree.txt. You get a per-iteration log of coder/reviewer/sonar/Solomon steps, an executive summary with a stages table and budget breakdown, and a directory-grouped view of every file touched during the pipeline.

Valibot Config Validation (v2.6)

Config is now schema-validated at load time with Valibot. review_mode typos, max_iterations: 0, out-of-range hu_board.port, negative max_budget_usd, or budget.warn_threshold_pct outside 0-100 fail fast with a readable message. CLI falsy overrides (--no-rebase, --reviewer-retries 0) finally work. Co-authored with Jorge del Casar.

Infrastructure Dependency Injection (v2.6)

FileSystemService and CommandRunner adapters live under src/infrastructure/. BaseAgent takes an optional Environment; createAgent(…, env) threads it through. Tests swap in MockFileSystem + MockCommandRunner via buildMockEnvironment() so every agent path (Claude, Codex, Gemini, Aider, OpenCode) is unit-testable without spawning real subprocesses.

Modular Orchestrator (v2.6)

src/orchestrator.js shrunk from a 2 084-line monolith to a 22-line public barrel over src/orchestrator/flow-runner.js. New StageExecutor contract (canRun / execute / onFailure) plus a StageRegistry lets future stages register themselves without touching the core. Adding a new stage is now a drop-in: put a StageExecutor subclass under src/orchestrator/stages/, register it, done.
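The contract can be pictured roughly as follows. Only the method names canRun / execute / onFailure and the class names StageExecutor and StageRegistry come from the text; the constructor shape, the register and runAll methods, and the LintStage example are assumptions made for illustration.

```javascript
// Illustrative sketch of a stage contract; not the project's actual classes.
class StageExecutor {
  constructor(name) {
    this.name = name;
  }
  canRun(_ctx) {
    return true;
  }
  async execute(_ctx) {
    throw new Error(`${this.name}: execute() not implemented`);
  }
  async onFailure(_ctx, error) {
    return { stage: this.name, ok: false, error: error.message };
  }
}

class StageRegistry {
  constructor() {
    this.stages = new Map();
  }
  register(stage) {
    if (this.stages.has(stage.name)) throw new Error(`duplicate stage: ${stage.name}`);
    this.stages.set(stage.name, stage);
    return this;
  }
  // Runs stages in registration order, skipping those that cannot run.
  async runAll(ctx) {
    const results = [];
    for (const stage of this.stages.values()) {
      if (!stage.canRun(ctx)) continue;
      try {
        results.push(await stage.execute(ctx));
      } catch (error) {
        results.push(await stage.onFailure(ctx, error));
      }
    }
    return results;
  }
}

// Hypothetical drop-in stage.
class LintStage extends StageExecutor {
  constructor() {
    super("lint");
  }
  canRun(ctx) {
    return ctx.changedFiles.length > 0;
  }
  async execute(ctx) {
    return { stage: this.name, ok: true, files: ctx.changedFiles.length };
  }
}
```

With this shape, adding a stage means writing one subclass and one register call; the loop in runAll never needs to change.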

addyosmani/agent-skills (v2.7)

First-source process skills from addyosmani/agent-skills: TDD, code-review-and-quality, security-and-hardening, performance-optimization, git-workflow-and-versioning, CI/CD, debugging, spec-driven-development, and more. Auto-cloned to ~/.karajan/agent-skills/, refreshed weekly via git pull. Role-aware: each Karajan role (tester, reviewer, security, architect, coder…) receives the workflows that match its job. Fully orthogonal to OpenSkills — process skills and stack skills compose.

Audit Reports + Token Cost Transparency (v2.9)

--report-file <path> persists the audit to .md (with reproducibility header: timestamp, branch, commit, invocation flags) or .json. $KJ_AUDIT_REPORT_DIR for CI defaults. Every audit ends with a ## LLM Usage section showing provider + model + duration + tokens (in/out/total) + estimated cost in USD. Visible in stdout, JSON, and persisted reports. CLI/MCP parity bug fixed — both paths now drive the same AuditRole flow.

Stack-Aware Audit (v2.9)

detectProjectStack feeds the LLM auditor what kind of project it’s looking at: frontend-only, backend-only, fullstack, language, frameworks. Heuristics get filtered — no more N+1 query nags on Astro sites, no more bundle-size nags on Express APIs. New accessibility dimension auto-activates for frontend / fullstack projects with WCAG 2.x checks (alt text, labels, ARIA, focus management, contrast hints). New WebPerf section with 10 frontend-perf patterns when no live CWV measurement is available.

Three Deterministic Security Collectors (v2.9)

SonarQube findings as ground truth in the prompt (rule ID + line precision). OSV-Scanner integration covers CVEs across the entire OSV.dev DB — broader than npm audit, no account, no upload. Semgrep SAST catches XSS, SQLi, taint flow, hardcoded secrets, language-specific anti-patterns — equivalent to snyk code but free for OSS. All three are best-effort: missing binary or unreachable host silently skips the section.

Two-Phase Audit (v2.9)

kj audit now collects deterministic findings (basalCost, Sonar, OSV-Scanner, Semgrep, WebPerf, stack detection) in parallel — zero tokens — and prints them BEFORE asking Continue with LLM analysis? [y/N]. New --deterministic-only flag for zero-token runs, -y/--yes to auto-confirm, --json bypasses the prompt for pipeable output. CI / non-TTY paths auto-confirm — zero behaviour change for pipelines.

HU Board Hardening (v2.10)

Default bind is now 127.0.0.1 (was: all interfaces). New --bind 0.0.0.0 for the explicit LAN-exposure case, with auto-generated token at ~/.karajan/hu-board/token (mode 0600). Auth middleware enforces the token only for non-loopback peers — same-machine browser keeps working without ?token=. helmet headers + express-rate-limit 300 req/min on /api. Three accepted carriers: Authorization: Bearer, ?token=, kj_board_token cookie.
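A minimal sketch of the loopback exemption and the three token carriers. The helper names and the request shape (remoteAddress, headers, query) are assumptions; a real middleware would read these from the Express request and should compare tokens in constant time.

```javascript
// Hypothetical helpers; only the carrier names come from the text above.
function isLoopback(address) {
  return address === "127.0.0.1" || address === "::1" || address === "::ffff:127.0.0.1";
}

// Checks Authorization: Bearer, then ?token=, then the kj_board_token cookie.
function extractToken(req) {
  const auth = req.headers.authorization || "";
  if (auth.startsWith("Bearer ")) return auth.slice("Bearer ".length);
  if (req.query && req.query.token) return req.query.token;
  const cookie = req.headers.cookie || "";
  const match = /(?:^|;\s*)kj_board_token=([^;]+)/.exec(cookie);
  return match ? decodeURIComponent(match[1]) : null;
}

// Same-machine peers pass without a token; everyone else must present it.
function isAuthorized(req, expectedToken) {
  if (isLoopback(req.remoteAddress)) return true;
  return extractToken(req) === expectedToken;
}
```

The plain === comparison keeps the sketch short; crypto.timingSafeEqual on equal-length buffers is the usual hardening step.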

Webperf Quality Gate (v2.10)

PerfStage slots into the iteration loop right after Impeccable when pipeline.perf.enabled is true. Wraps Lighthouse for a Core Web Vitals verdict per iteration. PASS continues; FAIL pushes blocking-metric feedback (e.g. LCP=5500 (poor>4000) plus top opportunities like render-blocking resources) back to the coder for the next iteration; scanner unavailable skips best-effort. CLI: --enable-perf. MCP: enablePerf. No retry-loop — max_iterations is the natural ceiling.

SKILL.md per CLI Command (v2.10)

docs/agents/SKILL.kj-{plan,run,audit,doctor,init,board,review,resume,clean}.md — one fetch per CLI capability (~ 2-4 KB tokens each), all under the same contract: What it does · Inputs · Outputs · Constraints · Side effects · Common failure modes · Example · Related. CI-guarded: every link in llms.txt must resolve to a file with all four required sections, or the build fails.

Agent-Readiness Score (v2.10)

kj audit --agent-readiness scores any repo 0–100 across 7 LLM-free checks: llms.txt presence + validity, robots.txt AI-bot allowlist, per-doc token budget (≤ 32 KB), heading hierarchy (markdown + HTML <h1>), docs/agents/README.md entry point, SKILL.md coverage. Pure data transformation — no network, no LLM, no side effects. --json for CI. Karajan-on-Karajan: 100/100. Run it on your own repo, see what agents struggle with, fix from the top-fixes list.

hu-board: Ephemeral-Project Cleanup + Help (v2.11)

On board start, projects whose id matches tmp_* / test_* / demo_* / kj-test-* AND have been inactive for >24 h are cascade-deleted (project + stories + sessions). Per-project override via a 3-state toggle on each card (🧪 force-test / 📌 pin / · default heuristic) and PATCH /api/projects/:id/is-test. The header also gained a ? button: opens a modal explaining each of the five views (Board / Graph / Dashboard / Sessions / Pipeline), and every nav tab carries a native title for the standard 1-second hover tooltip.
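The default heuristic (id prefix AND more than 24 h of inactivity) might be expressed like this. The field names id and lastActiveAt are assumptions, and the per-project override toggle is deliberately not modelled.

```javascript
// Hypothetical sketch of the default ephemeral-project heuristic.
const EPHEMERAL_ID = /^(?:tmp_|test_|demo_|kj-test-)/;
const DAY_MS = 24 * 60 * 60 * 1000;

// lastActiveAt is assumed to be a millisecond epoch timestamp.
function isEphemeral(project, now = Date.now()) {
  return EPHEMERAL_ID.test(project.id) && now - project.lastActiveAt > DAY_MS;
}
```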

Dogfooding-Hardened (v2.11)

A two-day, ten-level dogfooding pass through every Karajan surface, from kj --version to a full plan-driven multi-HU sub-pipeline, fixed three latent bugs that only surfaced on fresh /tmp repos: the SonarStage no longer burns max_iterations looping on Missing git remote.origin.url, commitAll tolerates the locale-specific “nothing to commit” race, and the HU sub-pipeline now branches off master/HEAD when the configured main doesn’t exist. runFlow seals session.status at the boundary, so kj status never shows a zombie running run again. All N0–N8 levels re-validated green.

Coder fs-leak detector, second layer (v2.14)

The original fs-leak-detector snapshot-diffed $HOME before and after the coder ran. It caught the original incident (cd /home/manu/assistant && pnpm init creating 36 MB outside projectDir) only because ~/assistant was new. If the target dir pre-existed, the snapshot diff missed it. v2.14 adds detectTranscriptCdLeaks() as a second layer: it scans the coder’s transcript for cd <abs-out-of-project> && <write-cmd> patterns and flags them regardless of disk state. Write commands recognised: mkdir, touch, cp, mv, git init, {pnpm,npm,yarn} init/create, npx create-*, cat >, echo >, shell redirects. Pure-read commands (ls, which, grep) don’t flag, and /tmp is exempt by convention.
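To make the transcript scan concrete, here is a simplified sketch. It is not the real detectTranscriptCdLeaks(): the function name detectCdLeaks, the regular expressions, and the subset of write commands covered are assumptions, and projectDir is expected to be an absolute path without a trailing slash.

```javascript
const path = require("node:path");

// Illustrative subset of the write commands listed above.
const WRITE_CMD =
  /^(?:(?:mkdir|touch|cp|mv)\b|git init\b|(?:pnpm|npm|yarn) (?:init|create)\b|npx create-\S+|cat\s*>|echo\b.*>)/;

// Flags "cd <abs-out-of-project> && <write-cmd>" regardless of disk state.
function detectCdLeaks(transcript, projectDir) {
  const leaks = [];
  const pattern = /cd\s+(\/\S+)\s*&&\s*([^\n;]+)/g;
  for (const [, dir, rawCmd] of transcript.matchAll(pattern)) {
    const target = path.posix.normalize(dir);
    const cmd = rawCmd.trim();
    const inside = target === projectDir || target.startsWith(`${projectDir}/`);
    const inTmp = target === "/tmp" || target.startsWith("/tmp/");
    if (!inside && !inTmp && WRITE_CMD.test(cmd)) leaks.push({ dir: target, cmd });
  }
  return leaks;
}
```

Because the check works on text rather than on a before/after snapshot, it still fires when the target directory already existed.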

Solomon no longer rubber-stamps security blockers (v2.14)

Rule 6 of the Solomon rules engine (reviewer_style_block) used to classify any blocking issue with severity low/minor or matching cosmetic keywords (name, format, documentation, …) as “style” — even legitimate security blockers got passed through. v2.14 adds an anti-classifier: severities critical/high/blocker/major, categories security/correctness, and a security-keyword regex (SQL injection, XSS, CSRF, auth, password, secret, hash, traversal, …) all force-disqualify an issue from being “style”. 6 regression tests cover the false-positive cases from the original incident.
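The anti-classifier logic can be sketched as below. The issue shape (severity, category, description), the function name isStyleOnly, and the exact keyword lists are assumptions based only on the examples named above.

```javascript
// Hypothetical sketch of the "style" classifier with its disqualifiers.
const BLOCKING_SEVERITIES = new Set(["critical", "high", "blocker", "major"]);
const BLOCKING_CATEGORIES = new Set(["security", "correctness"]);
const SECURITY_KEYWORDS = /sql injection|xss|csrf|auth|password|secret|hash|traversal/i;
const COSMETIC_KEYWORDS = /\b(?:name|format|documentation)\b/i;

function isStyleOnly(issue) {
  const severity = (issue.severity || "").toLowerCase();
  const category = (issue.category || "").toLowerCase();
  const text = issue.description || "";

  // Anti-classifier: any one of these disqualifies the issue from "style".
  if (BLOCKING_SEVERITIES.has(severity)) return false;
  if (BLOCKING_CATEGORIES.has(category)) return false;
  if (SECURITY_KEYWORDS.test(text)) return false;

  // Previous behaviour: low/minor severity or cosmetic wording counts as style.
  return severity === "low" || severity === "minor" || COSMETIC_KEYWORDS.test(text);
}
```

The keyword match is intentionally broad ("auth" also matches "author"), so this sketch errs toward treating an issue as blocking rather than letting it through.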

Planner self-fix loop (v2.14)

The plan-reviewer used to be flag-only: it surfaced missing HUs, missing dependencies and scope overlaps, then left them for the user to apply by hand. v2.14 closes that loop. After the first review pass, the new plan-fixer.js module asks the planner to PATCH the plan (additions / deps_to_add / deletions), applies the patch in-process via addHu / removeHu / blocked_by mutations, and re-reviews. Loops up to 2 iterations or until the reviewer reports zero issues. Opt-out via --no-plan-fixer / --quick. Combined with three planner prompt fixes (scope respect, transversal one-to-many deps, explicit reuse marker), the four pathologies that the GRETA Plan 2 dogfooding kept surfacing are now closed at the source.

Team guardrails — recommended setup

Drop-in config for a team that uses AI in its workflow: multi-account SSH (one key per identity), global git hooks (commit-msg blocking AI attribution, pre-push blocking direct push to main, git-secrets for credential scanning), per-agent permission policies (Claude Code, Codex, Codex rules, Gemini CLI), GitHub branch protection rulesets, PR / Issue templates and CODEOWNERS routing. Paste, adapt, deploy. → Read the guide

Team-shared HU Board (v2.31)

Multiple machines, one plan. kj plan share <planId> opts a plan into the .karajan-shared/ cohort: the loader merges shared HUs into the per-project plan, the board scans the cohort and surfaces them with a shared badge, and a new per-HU assignee field lets each runner claim its slice without trampling the others. Selective --only / --exclude filters, kj plan unshare round-trip, and a sharedConflictPolicy escape hatch (local-wins / shared-wins / error) cover the conflict edge cases. Seven PRs (#859–#865) close the team-shared prerequisite (KJC-PRP-0002).

AI Harness Scorecard hardening (v2.32)

Plan A of KJC-PCS-0051 closes five FAILs from the external scorecard audit in a single sprint. Prettier --check (PR #868) blocks PRs whose formatting drifts. Coverage v8 (PR #870) emits text + html + lcov and uploads coverage/ as a CI artifact with per-glob thresholds enforced when opted in. Conventional Commits (PR #872) gates PR commit messages with wagoid/commitlint-github-action@v6 on top of the local pre-commit hook. Nightly drift workflow (PR #873) re-runs the full CI suite at 04:17 UTC daily and auto-files a tracking issue on red. eslint-plugin-security (PR #874) hard-bans eval, new Function, dynamic require, pseudoRandomBytes and mustache-escape disabling. Plus two bug fixes shipped alongside.

AI Harness Scorecard golden metric (v2.33)

Plan B of KJC-PCS-0051 turns kj audit into a quality measurement loop with a single golden number. Docker bootstrap (PR #877, KJC-TSK-0470) auto-pulls addyosmani/ai-harness-scorecard and runs a one-shot scan in ~10 s. Audit integration (PR #878, KJC-TSK-0471) splices the deterministic 0–100 score and A–F grade into the audit report headline. History DB (PR #879, KJC-TSK-0472) persists every run to a per-project audit-history.db (SQLite + WAL, PRAGMA user_version=1). Diff + trend sparkline (PR #880, KJC-TSK-0473) renders the delta vs the previous baseline plus a Unicode-bar trend over the last N runs. One golden number for “how AI-friendly is this repo today vs last week,” zero LLM tokens spent.
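A Unicode-bar trend like the one mentioned above could be rendered along these lines. The eight block characters, the 0 to 100 scaling, and the function name sparkline are assumptions for illustration.

```javascript
// Hypothetical sketch: map each score onto one of eight bar heights.
const BARS = "▁▂▃▄▅▆▇█";

function sparkline(scores, min = 0, max = 100) {
  return scores
    .map((score) => {
      const clamped = Math.min(max, Math.max(min, score));
      const index = Math.round(((clamped - min) / (max - min)) * (BARS.length - 1));
      return BARS[index];
    })
    .join("");
}
```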

Multi-language RAG + Quality & Observability (v2.34)

Two epics close in one window. KJC-PCS-0052 Multi-language RAG adds first-class AST chunkers for Python, Rust, Go and Java via vendored web-tree-sitter WASM grammars (SEA-safe), wires a language adapter registry, extends the watcher and kj onboard / kj audit to multi-stack repos, and ships kj rag index --since <ref> for incremental git-diff-based reindex with a post-merge hook + pre-run drift check. KJC-PCS-0053 RAG Quality & Observability introduces a golden-query harness (kj rag eval) with recall@k + MRR scoring, content-hash sha256 dedup that skips re-embed on unchanged chunks, MMR diversification in the retriever (λ=0.5), and a deep-dive expansion of docs/RAG.md. Seventeen PRs.
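For readers unfamiliar with the two metrics named above, here is a small sketch using their standard definitions. The function names and the input shapes (arrays of chunk ids) are assumptions and say nothing about how kj rag eval stores its golden queries.

```javascript
// recall@k: fraction of the relevant ids that appear in the top k results.
function recallAtK(retrieved, relevant, k) {
  if (relevant.length === 0) return 0;
  const topK = new Set(retrieved.slice(0, k));
  const hits = relevant.filter((id) => topK.has(id)).length;
  return hits / relevant.length;
}

// MRR: mean over queries of 1 / rank of the first relevant result (0 if none).
function meanReciprocalRank(queries) {
  if (queries.length === 0) return 0;
  const total = queries.reduce((sum, { retrieved, relevant }) => {
    const index = retrieved.findIndex((id) => relevant.includes(id));
    return sum + (index === -1 ? 0 : 1 / (index + 1));
  }, 0);
  return total / queries.length;
}
```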

Latest release notes, oldest first. The current version is shown in the navbar; full version history lives in Architecture history.

16 PRs absorb bug blockers (Solomon no longer rubber-stamps security issues misclassified as ‘style’, coder fs-leak detection gains a second layer that catches cd <abs> && pnpm init even when the dir pre-existed, Sonar admin password rotation now surfaces silent failures), the four planner pathologies surfaced by GRETA Plan 2 dogfooding (scope respect, transversal one-to-many deps, explicit reuse marker, and a brand-new self-fix loop where the plan-reviewer re-invokes the planner with structured feedback until zero issues remain), HU Board polish (zombie-TTL for crashed-runner prompts, less aggressive rate-limit with SSE exempt), and the first wave of tests/ reorg (issue #368): 93 files moved from root to mirror-subfolders. 4577/4577 tests passing, 0 regressions across all 16 PRs.

Two more planner pathologies surfaced by dogfooding v2.14.0 against GRETA Plan 2: the self-fix loop could regress (iter 1 dropped 15→10 issues, iter 2 then deleted HUs the first iter had added and reached 17 — worse than before iter 2 started) and the planner declared blocked_by on async observers (HUs marked as depending on guardrails or cron jobs that merely react to them, breaking GRETA’s AVISA-no-BLOQUEA). Fixes: P5 snapshots plan.hus + plan.review before each self-fix iter and reverts if newCount > currentCount; P6 lists six async-observer patterns in the planner prompt with a ‘consume vs react’ heuristic. Regenerating Plan 2 GRETA returns to baseline-iter-1 quality (9 findings on 58 HUs, 15%) instead of v2.14.0’s 17. 2 PRs (#684, #685). 4580/4580 tests passing. Safe upgrade from 2.14.0.
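The P5 guard amounts to a snapshot, a comparison, and a conditional revert. In the sketch below, selfFixIteration, applyFix, and review are hypothetical stand-ins for the real planner and plan-reviewer calls, and plan.review.issues is an assumed shape.

```javascript
// Hypothetical sketch of one guarded self-fix iteration.
function selfFixIteration(plan, applyFix, review) {
  // Snapshot plan.hus + plan.review before touching anything.
  const snapshot = structuredClone({ hus: plan.hus, review: plan.review });
  const currentCount = plan.review.issues.length;

  applyFix(plan);
  plan.review = review(plan);
  const newCount = plan.review.issues.length;

  // Revert when the iteration made the plan worse.
  if (newCount > currentCount) {
    plan.hus = snapshot.hus;
    plan.review = snapshot.review;
    return { reverted: true, issues: currentCount };
  }
  return { reverted: false, issues: newCount };
}
```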

More dogfooding fallout from GRETA Plan 2 v2.14.1: the ▶ Run button appeared on every pending card regardless of blocked_by (you could launch a HU whose deps don’t exist yet), and titles on the board lost their [EPICA] prefix so you couldn’t tell at a glance which area of the plan a card belonged to. Fixes: canRunHu now requires blockedBy.length === 0 before showing ▶; the planner prompt demands description: "[EPICA] one-sentence description" with INFRA/SHARED fallbacks. Plus a new doc spec-conventions.md collecting the 6 SPEC conventions the planner v2.14.x understands, so users don’t have to rediscover them by dogfooding. 2 PRs (#687, #688). 4584/4584 tests passing. Safe upgrade from 2.14.1.

Three preflight improvements surfaced by the first real kj run against a greenfield project: (1) gh keyring auth is now recognized (no more demanding GH_TOKEN env var when gh auth login --web already worked), (2) new degradable checks system that disables optional features (auto_pr/auto_push) with a WARN instead of aborting the run, and (3) new project-aware preflight that detects signals (Dockerfile/firebase.json/pyproject.toml/Cargo.toml/*.tf/.env.example), checks the corresponding tools, validates write permissions on the project paths, compares .env vs .env.example, and tests gh push access to the actual remote. New command kj doctor --project runs only this phase. 2 PRs (#690, #691). 4608/4608 tests passing.

Three epics, 30+ commits, ~4 000 LOC. Brain Recovery (KJC-PCS-0044): universal error classifier with 7 rich classes wired into ALL agent invocations (no silent failures), persistent hibernation in ~/.kj/standby/ with event-driven scheduler (no polling), kj standby list + kj standby resume, board reconcile at startup, and a fallback chain that switches provider automatically when quota is exhausted with retryAfter > 12h — critical for Anthropic’s new $200/month Agent SDK cap from June 15. Model Routing + Undo (KJC-PCS-0043): each HU gets coder_model + reviewer_model with cross-provider review by default (claude↔codex), per-HU override from the board modal, OpenCode + Aider as first-class providers, and a ⏪ Undo button that restores files via git snapshots. Self-Healing Plan (KJC-PCS-0042): structural integrity pass breaks cycles and cleans orphan refs, smart convergence guard for the self-fix loop, new kj plan fix [planId] [--prompt] command to iterate without regenerating, skip Sonar for SPIKE/DOC/RESEARCH HUs, and the Failed kanban column gives way to a result=fail badge in Pending. 4 835/4 835 tests passing across 400 files.

Quality-focused release. The headline feature is a deterministic Sonar false-positive filter (KJC-TSK-0416). Before Sonar findings reach the coder or the auditor they are pre-filtered by two mechanisms: (1) static rules { rule, filePattern, reason } from a built-in catalog (covering common false positives like javascript:S2699 on tests/architecture/ where assertions use expect(off, msg).toEqual([]) that Sonar can’t detect), extensible per project via config.sonar.false_positives; (2) inline ignores with // karajan-sonar-ignore: <ruleId> on the issue line. Result: the coder stops burning tokens “fixing” assertions that aren’t broken. Brain Recovery now wraps every AI call in the pipeline: semantic-detector was the last legacy caller, wired via adapter (TSK-0413 step D). Codemod .replace(/regex/g, …) → .replaceAll(/regex/g, …) in 41 sites, plus audit cleanup of BLOCKER false positives in tests. 4 846/4 846 tests passing across 401 files.
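The two filtering mechanisms could look roughly like this. The rule shape { rule, filePattern, reason } and the inline marker come from the text; treating filePattern as a path prefix, the issue shape, and the function name filterFalsePositives are assumptions.

```javascript
// Hypothetical sketch of a false-positive pre-filter.
const INLINE_MARKER = /\/\/\s*karajan-sonar-ignore:\s*(\S+)/;

// sourceLines maps a file path to an array of its lines (assumed input).
function filterFalsePositives(issues, rules, sourceLines) {
  return issues.filter((issue) => {
    // (1) Static rules: same rule id and a matching file location.
    const matchedRule = rules.some(
      (r) => r.rule === issue.rule && issue.file.startsWith(r.filePattern)
    );
    if (matchedRule) return false;

    // (2) Inline ignore on the issue line itself.
    const line = (sourceLines[issue.file] || [])[issue.line - 1] || "";
    const marker = INLINE_MARKER.exec(line);
    return !(marker && marker[1] === issue.rule);
  });
}
```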

kj audit gains two new deterministic structural collectors and the v2.16 FP filter is generalised to apply across every collector. Knip dead-exports collector (codeQuality dimension): detects unused exports / types (MINOR) and unused files (MAJOR). Stack-aware: JS/TS only, needs package.json. Subprocess via --reporter json, 120 s timeout. Madge circular-import collector (architecture dimension): detects import cycles. Severity by chain length (≥4 files = MAJOR, shorter = MINOR). Honours tsconfig.json for path-alias resolution, 60 s timeout. Generalised FP filter: every collector (sonar, knip, madge, osv, semgrep) uses the same config.audit.false_positives shape with a new tool field plus the inline marker // karajan-audit-ignore: <tool>:<ruleId>; legacy config.sonar.false_positives and // karajan-sonar-ignore: markers keep working. Built-in FP catalogue ships 4 entries by default: knip:unused-files in tests/fixtures/ + examples/, knip:unused-exports on barrel files (index.{js,ts,…}), madge:circular-import in node_modules/. BREAKING(engines): Node >=20.10 → >=20.19 (knip 6.x requirement). 4 872/4 872 tests passing across 402 files. Safe upgrade from 2.16.0 if you’re on Node ≥ 20.19.

Fixes KJC-BUG-0055: a project deleted from the HU Board (🗑️) no longer resurrects on the next kj plan or board restart. Four independent leaks closed: (1) sync.js temporal gate — the unconditional removeTombstone introduced by KJC-BUG-0050 becomes a plan.updatedAt > tombstone.deleted_at comparison, so a tombstoned project revives only when the plan is genuinely newer than the delete; stale plans on disk are ignored and removed. (2) ephemeral-cleaner.js now writes a tombstone and rm -rf’s hu-stories/, sessions/ and ~/.kj/plans/ directories when wiping ephemeral projects at boot — previously it only deleted the DB row, so the orphan directories revived the project on the next scan. (3) fullScan boot GC sweeps orphan tombstoned directories at startup (the manual DB-wipe case). (4) DELETE /api/projects/:id honours KJ_PLANS_DIR instead of the hardcoded path. New getTombstone() helper in db.js. Also fixes a silent kj board start failure (#753): the daemon’s entry-point guard compared import.meta.url to a hand-built file:// path and wrongly returned false on Windows, symlinked / global installs and paths with spaces — the board exited 0 with an empty log. server.js now trusts a KJ_BOARD_DAEMON launcher flag and adds uncaughtException / unhandledRejection handlers plus an actionable better-sqlite3 load error; board.js detects an early daemon exit instead of reporting a phantom PID. 4 909/4 909 tests passing across 408 files. Safe upgrade from 2.17.0.
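The temporal gate in (1) reduces to a single timestamp comparison. In this sketch the function name shouldRevive is hypothetical, and both timestamps are assumed to be ISO 8601 strings.

```javascript
// Hypothetical sketch: a tombstoned project revives only when the plan
// is genuinely newer than the delete.
function shouldRevive(plan, tombstone) {
  if (!tombstone) return true; // never deleted, nothing to gate
  return Date.parse(plan.updatedAt) > Date.parse(tombstone.deleted_at);
}
```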

Wires quota-exhaustion hibernation end to end. A kj run / kj plan that hits a provider session or usage cap used to abort the task with an opaque UNKNOWN_FATAL; now it suspends, persists its state, and tells you how to resume. (1) Session-limit classification (#756) — "You've hit your session limit · resets 10:10pm" matched no rate-limit pattern; session limit / weekly limit are now recognised and parseCooldown learns the 12-hour resets 10:10pm clock so the Brain knows when the quota resets. (2) Hibernation persists (#757) — withBrainRecovery only wrote ~/.kj/standby/<id>.json when given a sessionState, which no caller passed; new buildStandbyState() assembles it with an allowlisted env subset (never the full process.env). (3) Orchestrator consumes action:"hibernate" (#758) — no code path checked for it, so a hibernation sealed the HU failed; the coder and refactorer stages now stop cleanly on a quota cap and the session is sealed hibernated (resumable), not failed. (4) Resume hint (#759) — a stopped kj run / kj plan’s last line is now the exact command, kj standby resume <id> for a hibernation. Plus CI now runs the packages/hu-board suite (#755). 4 931/4 931 tests passing across 410 files. Safe upgrade from 2.17.1.

Closes the resilience audit triggered by the public launch: 15 PRs across 5 phases hardening Karajan against the silent-failure family of bugs — the problem is not that something fails, the problem is failing without telling the user why. Phase 1 — hibernation end to end (#756–#759): a quota cap is classified as a recoverable quota class (incl. Claude Code’s "You've hit your session limit · resets 10:10pm"); withBrainRecovery persists a standby JSON; the orchestrator consumes action:"hibernate" and seals the session hibernated (resumable) instead of failed; the last line printed is the exact resume command. Phase 2 — don’t lie (#761–#763): runCommand surfaces ENOENT so a missing agent CLI gets an actionable error (not an empty one); silenceTimeoutMs is forwarded to every role so a hung agent is killed, not waited on forever; the six state-file writers go through writeJsonAtomic (write-temp + rename) — interrupted writes can no longer truncate plans, sessions, standby snapshots. Phase 3 — don’t lose or block (#764–#767): a corrupt plan JSON is renamed aside with a loud warn; kj.config.yml parse errors throw Invalid YAML in <path> (used to brick every kj command including kj doctor); injectLoadedPlan reconciles HUs left in coding / reviewing / running by a killed kj run; board SQLite gets busy_timeout, a user_version schema gate and corruption recovery. Phase 4 — don’t degrade silently (#768–#769): TriageRole warns loudly when LLM output is unparseable (used to skip researcher / architect / security / tester silently); verifyCoderOutput distinguishes a git failure from “the coder did nothing” — no more iterations blaming the agent for infra. Phase 5 — safety net (#770): a tests/resilience/ suite indexes every silent-failure mode and pins each one with a test. Plus CI runs the packages/hu-board suite on every PR (#755). 4 959/4 959 tests passing across 416 files. Safe upgrade from 2.17.1.
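The write-temp + rename pattern behind writeJsonAtomic can be sketched as follows. This is a generic illustration of the technique, not the project's code; the temp-file naming is an assumption.

```javascript
const fs = require("node:fs");

// Write to a sibling temp file, then rename over the target.
// The rename replaces the file in one step on the same filesystem, so an
// interrupted write leaves the old file intact instead of a truncated one.
function writeJsonAtomic(filePath, data) {
  const tmpPath = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2));
  fs.renameSync(tmpPath, filePath);
}
```

Keeping the temp file in the same directory as the target is what keeps the rename on a single filesystem.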

Six user-feedback follow-ups to v2.18.0: kj-tail now works after kj resume (#772; it was silent because resume.js skipped withCliRunLog); standby waits in-process instead of exiting on a short cooldown (#773; kj stays alive for waits ≤ 12 h and retries on its own, and Ctrl+C during the wait prints kj standby resume <id>); KJC-BUG-0040 (SEA Linux binaries) is closed (#774; the cause was a race between gh release create and softprops, not better-sqlite3 as the memory said, fixed with a 60 s poll + make_latest:false); stack bias is fixed, so Python repos no longer get vitest (#775 + #776 + #777; detectProjectStack is finally wired into the coder prompt, the HU auto-generator selects templates by language, and the synthesizer + auto-hu-batch take the stack from the filesystem). 4 971/4 971 tests passing across 416 files. Safe upgrade from 2.18.0.

Closes KJC-PCS-0047, the home-directory consolidation epic. Three back-to-back PRs (#781, #782, #783) unify the HOME-level state of Karajan into a single ~/.karajan/ root. Previously ~/.kj/ held plans, hibernated standby state, run-registry entries and worktrees, while ~/.karajan/ held sessions, hu-stories and config; no ADR justified the split, and four divergent getKjHome() implementations had drifted. PR #781 unifies the resolver behind KARAJAN_HOME (with KJ_HOME honoured + one-shot deprecation warning); PR #782 ships an idempotent auto-migrator that runs once on the next kj invocation: it writes a tarball backup at ~/.karajan/backup/kj-pre-migration-<ISO>.tar.gz BEFORE moving anything, a marker file prevents re-runs, and the move is cross-device safe (rename falls back to cp + rm on EXDEV); PR #783 flips every default to ~/.karajan/ and adds a legacy-kj-home check to kj doctor. The HU Board reads BOTH locations until the migrator fires, so users who start the board first never see “missing plans”. Restore is one tar -xzf away. 4 984/4 984 tests passing across 418 files. ⚠ Shipped with a packaging bug that broke kj board start on fresh installs; use v2.19.1 or later.
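A minimal sketch of the unified resolver from PR #781, with the one-shot deprecation warning. The factory name createHomeResolver, the warning text and the precedence when both variables are set (KARAJAN_HOME wins here) are assumptions, not the shipped behaviour.

```javascript
// Hypothetical resolver sketch; names, precedence and warning text are assumed.
function createHomeResolver({ homedir, warn }) {
  let warned = false; // latch so the deprecation warning prints once per process

  return function resolveHome(env) {
    if (env.KARAJAN_HOME) return env.KARAJAN_HOME;
    if (env.KJ_HOME) {
      if (!warned) {
        warned = true;
        warn("KJ_HOME is deprecated; set KARAJAN_HOME instead");
      }
      return env.KJ_HOME; // legacy variable is still honoured
    }
    return `${homedir}/.karajan`;
  };
}

const resolveHome = createHomeResolver({
  homedir: "/home/dev",
  warn: (message) => console.warn(message),
});
console.log(resolveHome({})); // prints "/home/dev/.karajan"
```

Routing every caller through one function like this is what lets a single environment variable redirect all components at once.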

APPLICATION BLOCKER fix for kj board start on every fresh npm install -g karajan-code since the HU Board feature shipped. Two combined causes (PR #791): (1) the root package.json::files array did not include packages/, so npm pack --dry-run produced tarballs with zero board files; (2) even when the user copied packages/hu-board/ manually, the server crashed with Cannot find package 'helmet' imported from .../server.js — the five HU Board dependencies (helmet, chokidar, better-sqlite3, express, express-rate-limit) were declared in packages/hu-board/package.json but missing from root dependencies, and npm install -g karajan-code only resolves root deps. Fix: add packages/hu-board/{src,public,package.json} to files; add the five deps to root at exact versions so npm dedupe collapses to one copy reachable by upward traversal from server.js; regenerate package-lock.json. Verified end-to-end: npm pack ships 28 board files; the server boots cleanly. Also internal (no user-visible change): 38 direct os.homedir() callers routed through the unified resolver (#790 — KARAJAN_HOME now redirects every component) and 5 inline constructions of ~/.karajan/hu-board-runs/ unified under one helper (#789). Reported by @aitormf.

SonarQube 401 now triggers automatic token re-bootstrap instead of failing the run (KJC-BUG-0057, PR #793). bootstrapSonarToken() had lived in src/sonar/token-bootstrap.js since v2.10.2 (it probes admin/admin, rotates the default password if still in place, revokes the existing karajan-cli token and generates a fresh GLOBAL_ANALYSIS_TOKEN) but was ONLY invoked from kj init. Every other code path that hit Sonar with a missing / stale / revoked / inconsistent-instance token just threw HTTP 401 with the hint “Regenerate with kj init”, putting the user in the loop for plumbing Karajan has the credentials to do itself. The user feedback was unambiguous (translated from Spanish): “If Karajan sees that Sonar is not working and it has the user/password, it should generate a new token; Karajan must be able to do this and the AI should not have to do it, it is something programmatic.” Fix: src/sonar/api.js::sonarFetchOnce now invokes a new src/sonar/token-recovery.js::recoverSonarToken() on the first 401. Recovery has a per-process latch (N endpoints 401-ing trigger ONE bootstrap, not N), reuses bootstrapSonarToken, mutates config.sonarqube.token in place, mirrors the new token to ~/.karajan/sonar-credentials.json, and retries the original request once. The user never sees the 401 when recovery succeeds. Programmatic, zero LLM involvement. Reported by @aitormf.
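The per-process latch and single retry can be sketched like this. It is a simplified illustration: createTokenRecovery, fetchWithRecovery and the request/response shapes are hypothetical, and `bootstrap` merely stands in for bootstrapSonarToken.

```javascript
// Sketch of a per-process latch: concurrent 401s share one recovery promise.
// `bootstrap` stands in for bootstrapSonarToken; its shape is assumed.
function createTokenRecovery(bootstrap) {
  let inFlight = null;
  return function recover() {
    if (!inFlight) inFlight = Promise.resolve().then(bootstrap);
    return inFlight; // every caller awaits the same bootstrap
  };
}

// Retries the request once after a 401; a second 401 is returned as-is.
async function fetchWithRecovery(doRequest, config, recover) {
  const first = await doRequest(config.token);
  if (first.status !== 401) return first;
  config.token = await recover(); // mutate the shared config in place
  return doRequest(config.token);
}

async function demo() {
  let bootstraps = 0;
  const recover = createTokenRecovery(async () => {
    bootstraps += 1;
    return "fresh-token";
  });
  const config = { token: "stale-token" };
  const doRequest = async (token) => ({ status: token === "fresh-token" ? 200 : 401 });

  // Three endpoints hit a 401 at the same time.
  const responses = await Promise.all(
    [1, 2, 3].map(() => fetchWithRecovery(doRequest, config, recover))
  );
  return { bootstraps, statuses: responses.map((r) => r.status), token: config.token };
}

demo().then((out) => console.log(out.bootstraps, out.statuses.join(",")));
```

Note that in this sketch the latch also caches a failed bootstrap for the lifetime of the process, so a broken recovery is attempted once rather than on every request.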

HU Board now reads + writes plans from the canonical home dir (KJC-BUG-0059, PR #795). Reported by @aitormf: the board’s top card showed Directorio del proyecto — no detectado (“Project directory: not detected”) even when the run had a valid projectDir and the coder was reading files from it. Root cause: five board call sites still hard-coded ~/.kj/plans/ as their plans root, a leftover from the v2.19.0 home consolidation, which fixed sync.js but missed packages/hu-board/. After the auto-migrator ran (or the user created new plans post-v2.19.0), plans landed under ~/.karajan/plans/<slug>/; the board kept looking under ~/.kj/plans/<slug>/ and silently found nothing. As a result, GET /api/projects/:id/preflight could not extract projectDir (the literal Aitor saw), GET /api/projects/:id/plans-outcome returned plans: [] for every project, DELETE /api/projects/:id swept the wrong path leaving residue on disk, DELETE /api/plans/:planId failed silently, plan-mutations.plansRoot wrote new per-HU run logs to the legacy root, splitting state across both, and cleanup-zombies never GC’d zombies under ~/.karajan/plans/. Fix: three new exports in packages/hu-board/src/db.js: getHuBoardPlansDir() (canonical, or KJ_PLANS_DIR override), getHuBoardLegacyPlansDir() (legacy, null when the override is set), and getHuBoardPlansDirs(), ordered [canonical, legacy?], for read callers. Single-write callers (plan-mutations) use the canonical root; read / delete / GC iterate both so users mid-migration with plans still under ~/.kj/ don’t regress. 349 hu-board tests still green.

Two bugs closed in one release: kj resume now continues from where it stopped, and autoInit() no longer commits zombies on the user’s main branch.

KJC-BUG-0058 (PR #798, reported by @aitormf): a session that stopped during Sonar would re-run the full pre-loop on kj resume <id>: HU-reviewer, intent, discover, triage, domainCurator, researcher, architect, planner, all from scratch. That doubled token cost and broke the value-prop of the command. Root cause: resumeFlow called runFlow without rehydrating stage state, and the session never persisted stage outputs in the first place. Fix: two new mutators in src/session/mutators.js. setStageResult(session, name, result) mirrors writes into stage_results[name] + stages_completed[] (idempotent on the array), and setStageBundle(session, name, bundle) adds stage_bundles[name] for cross-stage context that the stageResult alone does not carry (researcher → researchContext, architect → architectContext, planner → plannedTask). Two closures inside runPreLoopStages (persistStage + resumeSkip) wrap every cacheable pre-loop stage. init-context.js rehydrates ctx.stageResults from the loaded session before invoking the pre-loop. Triage is not skipped: it produces roleOverrides the Brain decisor depends on and is cheap to re-run. 10 orchestrator test files / 57 tests still green.
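A rough sketch of the two mutators and a resume check follows. The field names (stage_results, stages_completed, stage_bundles) come from the note above; the function bodies and the canSkipStage helper are assumptions for illustration only.

```javascript
// Sketch of the session mutators; bodies are assumed, field names are from the note.
function setStageResult(session, name, result) {
  session.stage_results = session.stage_results || {};
  session.stages_completed = session.stages_completed || [];
  session.stage_results[name] = result;
  // Idempotent on the array: a stage name is recorded at most once.
  if (!session.stages_completed.includes(name)) session.stages_completed.push(name);
  return session;
}

function setStageBundle(session, name, bundle) {
  session.stage_bundles = session.stage_bundles || {};
  session.stage_bundles[name] = bundle; // e.g. researcher -> researchContext
  return session;
}

// Hypothetical resume check: skip a stage whose result is already persisted.
function canSkipStage(session, name) {
  if (name === "triage") return false; // triage always re-runs
  return Array.isArray(session.stages_completed) && session.stages_completed.includes(name);
}

const session = {};
setStageResult(session, "researcher", { ok: true });
setStageResult(session, "researcher", { ok: true, retried: true });
setStageBundle(session, "researcher", { researchContext: "notes" });
console.log(session.stages_completed.length); // prints 1
```

On resume, a driver would consult canSkipStage before each cacheable stage and read the persisted result and bundle instead of invoking the agent again.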

KJC-BUG-0060 (PR #797, reported during the v2.19.3 release): git checkout main reported [ahead 27] of origin/main after running kj on the karajan-code repo itself (kj-linked). Every commit was titled initial commit, authored by the karajan-code-local git user.email (which diverges from the global one), and pointed to the exact same tree as its parent, so each one was completely empty. The reflog held 2 495 such SHAs accumulated since April 2026. None ever reached origin/main (push / CI would have rejected them) so runtime impact was zero, but local history was noisy and on every release it looked like a sync loss. Root cause: src/orchestrator/config-init.js::autoInit() guarded with !(await exists(projectDir/.git)), which fails in two ways: (a) dogfooding kj on karajan-code from a subdirectory → exists() returns false → git init reinitializes the parent’s .git/ (idempotent, harmless) → git commit --allow-empty then resolves upward and lands an empty commit on the parent’s main; (b) transient FS hiccups (EACCES/ENOENT) flip exists() to a false negative. Fix: swap the static FS probe for git rev-parse --is-inside-work-tree, which performs the same upward search git would use for the commit itself, so the guard cannot disagree with the operation it guards. The git commit --allow-empty -m "initial commit" step is also dropped entirely: no downstream stage needs a root commit, the 2 495 zombies never broke anything, and the seed was decorative and turned out to be the actual user-visible symptom.

HU Board polish + UX papercuts cluster — 5 cards closed. Two net-new features, two PG housekeeping syncs for work that had already landed quietly in earlier minors, one docs refresh.

KJC-TSK-0397 (PR #801): kj plan generate now prepends a [PREFLIGHT-000] HU to every plan and gates every functional HU on it via blocked_by. The HU carries stack-aware shell acceptance tests so the sub-pipeline naturally refuses to spend tokens on functional work until the environment is reproducible: a clean tree via git status --porcelain, always; Node projects get node --version + npm install + conditional npm test / npm run lint; Python projects get python --version + pip install -r requirements.txt (or poetry install) + pytest --collect-only; Firebase projects add firebase projects:list; GCP projects add gcloud auth list. Adding a stack is one branch in composePreflightTests. The injection is idempotent: a plan that already has a HU titled PREFLIGHT-000 / “verificar entorno” is respected, not duplicated. Opt out per-invocation with --no-preflight-hu. New module src/plan/preflight-hu.js (102 LOC) + 6 acceptance tests in tests/plan/preflight-hu.test.js.
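To make the stack branching and the idempotent injection concrete, here is a sketch under assumptions. The `stack` flags, the `scripts` argument, the HU object shape and the `test -z` wrapper around git status are all hypothetical; only the listed commands and the title match come from the note above.

```javascript
// Hypothetical sketch; `stack`, `scripts` and the HU shape are assumptions.
function composePreflightTestsSketch(stack, scripts = {}) {
  // Assumed wrapper: exits 0 only when `git status --porcelain` prints nothing.
  const tests = ['test -z "$(git status --porcelain)"'];
  if (stack.node) {
    tests.push("node --version", "npm install");
    if (scripts.test) tests.push("npm test"); // only when the script exists
    if (scripts.lint) tests.push("npm run lint");
  }
  if (stack.python) {
    tests.push("python --version");
    tests.push(stack.poetry ? "poetry install" : "pip install -r requirements.txt");
    tests.push("pytest --collect-only");
  }
  if (stack.firebase) tests.push("firebase projects:list");
  if (stack.gcp) tests.push("gcloud auth list");
  return tests;
}

// Prepends the preflight HU and gates every other HU on it, at most once.
function injectPreflightHu(hus, preflightHu) {
  const alreadyThere = hus.some((hu) => /PREFLIGHT-000|verificar entorno/i.test(hu.title));
  if (alreadyThere) return hus;
  const gated = hus.map((hu) => ({
    ...hu,
    blocked_by: [...(hu.blocked_by || []), preflightHu.id],
  }));
  return [preflightHu, ...gated];
}

const preflight = {
  id: "PREFLIGHT-000",
  title: "[PREFLIGHT-000] Verify environment",
  acceptance_tests: composePreflightTestsSketch({ node: true }, { test: "vitest run" }),
};
const plan = injectPreflightHu([{ id: "HU-1", title: "Login form" }], preflight);
console.log(plan.map((hu) => hu.id).join(" -> ")); // prints "PREFLIGHT-000 -> HU-1"
```

Supporting another stack in this shape means adding one more `if (stack.x)` branch, which mirrors the "one branch in composePreflightTests" remark.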

KJC-TSK-0395 (PR #802): kj init learns a config scope wizard plus --global / --local flags. In an interactive TTY without flags, the wizard now asks where the config should land: ~/.karajan/kj.config.yml (global, applies to all projects) or ./.karajan/kj.config.yml (local override, project-scoped). --global and --local skip the prompt; passing both throws Cannot pass both --global and --local. Non-interactive without flags stays on global for legacy CI compatibility. More important: loadConfig (src/config/loader.js) now refuses to load a project that has a local config without a global counterpart (the override-only-on-top-of-base invariant), with an actionable message pointing at kj init --global to create the base. New exported function resolveConfigScope({ flags, interactive }) so the resolution is unit-testable without spinning up the rest of initCommand. 5 acceptance tests in tests/commands/init-scope.test.js.
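The resolution rules above fit in a few lines. This sketch assumes string return values ("global", "local", and "prompt" for the wizard case); the real resolveConfigScope may return a different shape.

```javascript
// Sketch of the scope resolution; the return values are assumptions.
function resolveConfigScopeSketch({ flags = {}, interactive = false }) {
  if (flags.global && flags.local) {
    throw new Error("Cannot pass both --global and --local");
  }
  if (flags.global) return "global";
  if (flags.local) return "local";
  // No flags: ask in a TTY, otherwise stay on global for legacy CI runs.
  return interactive ? "prompt" : "global";
}

console.log(resolveConfigScopeSketch({ flags: { local: true }, interactive: true })); // prints "local"
console.log(resolveConfigScopeSketch({ flags: {}, interactive: false })); // prints "global"
```

Keeping this as a pure function of its inputs is what makes it testable without a TTY or the rest of the init command.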

KJC-TSK-0396 (PG sync, originally PRs #702 + #703) — HU Board ⏹ Stop button. Aborts every kj run associated with a plan: SIGTERM to all tracked PIDs first, 5-second timeout, escalates to SIGKILL on anything that did not respond. Running HUs revert to pending (not failed) so the plan can be re-launched cleanly. Available only when at least one HU is in coding/reviewing. Frontend uses the same delegate-on-document pattern as the ▶ Run button (data-plan-id + data-pids); backend exposes POST /api/runs/:planId/stop returning { stopped, killed, errors, hu_reset_count }; a persistent run-tracker registry bridges terminal-launched and board-launched runs so both Stop paths see each other’s PIDs. Already shipped in v2.10.x; today’s release closes the PG card with the canonical commits as evidence.

KJC-TSK-0377 (PG sync, originally PR #683): extended auto-cleanup. The HU Board’s boot ephemeral-project sweep already culled tmp_* / test_* / demo_* / kj-test-* projects; today’s sync confirms the same sweep also handles s_* (stray session-id projects), plan-* (stray plan-id projects), auto-tmp_* and auto-test_*. Projects with is_test = 2 (📌 keep) stay exempt regardless of prefix; home-style home_<path> projects with real git repos are never swept. Already shipped in v2.12.x.

KJC-TSK-0385 (PR #800) — docs refresh. docs/task-templates/spec-conventions.md gains two sections: Section 8 documents that numbered headings (## 1., ### 2.1, §5) in a task file activate the spec_section REQUIRED field on every emitted step — without this, users saw ‘missing spec_section’ findings on otherwise good plans and didn’t understand the activation rule. Section 9 documents the acceptance_tests shape per step: 2-4 tests, mix of gherkin (observable behaviour) and shell (concrete commands that exit 0 on success), pre-implementation, never the placeholder npx vitest run. The top quick-reference table and the pre-generate checklist were updated. Plus docs/task-templates/plan-generate.md flips its two stale ~/.kj/plans/ example paths to ~/.karajan/plans/ (post-v2.19.0 home consolidation).

Brownfield Onboarder role. Karajan ships a dedicated path to bootstrap an Architecture Brief from any existing codebase, and the planner can consume that brief as automatic context. Closes KJC-TSK-0384 in three PRs.

KJC-TSK-0384 (PRs #804 + #805 + #806) — Three layers of work.

Layer 1 (PR #804) — deterministic collectors. New module src/onboarder/collectors/index.js exposes five pure, fail-soft extractors. collectTree walks the project root ignoring node_modules / .git / dist / build / coverage / .karajan / .next / __pycache__. collectGitHistory returns null on greenfield, or { commitCount, branches, hotFiles, headSha } on a real repo — hot files are top N by appearance count in the last 200 commits’ --name-only output. collectConfigs probes presence of 18 well-known config patterns (package.json, pyproject.toml, Cargo.toml, go.mod, tsconfig, eslint, vitest/jest, firebase/gcloud/Docker, CI workflows, Makefile) and reads package.json::scripts. collectAdrs scans docs/adr/, docs/adrs/, docs/architecture/ for ADR-style filenames. collectAll is the Promise.all bundle. Every slot is independent; no collector aborts the others. 0 LLM calls, JSON-serialisable output.
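As an illustration of the hot-files ranking, the sketch below counts path appearances in name-only log output. It assumes the input comes from something like `git log -n 200 --name-only --pretty=format:` (one path per line, blank lines between commits); the function name and the alphabetical tie-break are hypothetical.

```javascript
// Counts how often each path appears in `git log --name-only` style output.
// The tie-break (alphabetical) is an assumption, not documented behaviour.
function hotFilesFromNameOnly(nameOnlyOutput, topN = 10) {
  const counts = new Map();
  for (const line of nameOnlyOutput.split("\n")) {
    const file = line.trim();
    if (!file) continue; // blank lines separate commits
    counts.set(file, (counts.get(file) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, topN)
    .map(([path, count]) => ({ path, count }));
}

const sample = [
  "src/a.js", "src/b.js", "",
  "src/a.js", "",
  "README.md", "src/a.js", "src/b.js",
].join("\n");
console.log(hotFilesFromNameOnly(sample, 2).map((f) => `${f.path}:${f.count}`).join(" "));
// prints "src/a.js:3 src/b.js:2"
```

Because the function only sees text, it stays deterministic and JSON-serialisable, which matches the "0 LLM calls" property of the collectors.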

Layer 2 (PR #805) — kj onboard command + OnboarderRole. New CLI command runs collectAll → optional OnboarderRole.run → writes ~/.karajan/onboarding/<slug>.md. The role itself is a thin AgentRole subclass: extractInput accepts both { bundle } and a string projectDir; parseOutput unwraps a fenced markdown block if the agent emitted one or trims the raw output; handleParseNull returns a soft-success so greenfield never errors. The role instructions live in templates/roles/onboarder.md (Markdown output, sections-as-needed). Flags: --no-synth (skip the LLM call entirely and write the raw bundle inside a JSON fence — useful for CI / token-cost-sensitive contexts), --output <path> (override default target).

Layer 3 (PR #806) — kj plan generate --use-onboarding. New src/onboarder/cache.js::readCachedBrief(projectDir) reads the cached brief deterministically using the same briefPath() slug rule the writer uses. kj plan generate gains the --use-onboarding flag; when set, the brief is prepended to the planner’s context under a ## Architecture Brief (from kj onboard) heading. Silent on cache miss without the flag; loud warn when the flag is set but no cache exists, so a missed kj onboard invocation surfaces immediately. The brief composes with any explicit --context the user passes.

kj onboard # produces ~/.karajan/onboarding/<slug>.md once
kj plan generate task.md \
--use-onboarding # next plan reads the brief as context

What’s next — The Project RAG epic (KJC-PCS-0049) starts in v2.22.0: vector store (better-sqlite3 + sqlite-vec), Ollama embedder, indexer (chokidar watcher), retriever + ranking, CLI commands (kj rag <query> / kj rag index), MCP tool (kj_rag_query), and HU Board search panel. The Onboarder is its prerequisite — the brief feeds the indexer’s first pass with the project structure already digested. ~1200 LOC estimated across 8 PRs.

Project RAG MVP (KJC-PCS-0049, six PRs). Karajan indexes its plans + onboarding briefs (and optionally the project’s source files) into a local vector store and lets you query them semantically from the CLI. End-to-end from terminal in v2.22.0; MCP tool + HU Board search panel land in v2.23.0.

The six PRs map one-to-one onto the architectural layers — the modules are small, single-responsibility, and the chain is replaceable piece-by-piece.

| Step | PR | Module | LOC | What it does |
| --- | --- | --- | --- | --- |
| 1 | #808 | src/rag/vec-store.js | 197 | openVecStore / insertChunk / searchSimilar / deleteChunksBySource over better-sqlite3 + new dep sqlite-vec ^0.1.9. DB at ~/.karajan/rag.db (override with the KJ_RAG_DB env var). Idempotent open. |
| 2 | #809 | src/rag/embedder.js | 160 | OllamaEmbedder.embed/embedBatch against the local Ollama server. Defaults: localhost:11434 + nomic-embed-text (dim 768). Throws OllamaEmbedderError on dim mismatch, because silent dim drift would corrupt the vec store over time. Zero new deps (global fetch). |
| 3 | #810 | src/rag/chunker.js | 191 | Three pure chunkers: chunkMarkdown keeps headingPath (so a retriever can show ‘from # Plan## Auth’), chunkPlan emits one chunk per HU with hu_id + title, chunkSource splits JS/TS by top-level export symbol via regex. Shared windowed splitter for oversized sections (limit 800, overlap 100). |
| 4 | #811 | src/rag/indexer.js | 194 | indexFile(path, { db, embedder }) dispatches by extension + path → chunker → embed → insertChunk. Idempotent (calls deleteChunksBySource first). Embedder failure on one chunk = warn + continue; the loop never aborts. indexProject(projectDir, { withSources }) walks plans + onboarding brief; sources are gated behind --with-sources. |
| 5 | #812 | src/rag/retriever.js | 105 | query(db, embedder, text, { topK, scope, kindBoost }) over-fetches topK * 2 (clamped at 50) for rerank headroom, then applies kind boosts (plan +0.05, onboarding +0.03, code 0) that break ties without reordering across big distance gaps. |
| 6 | #813 | src/commands/rag.js + src/cli/register-meta.js | 160 | kj rag index [--with-sources] [--json] + kj rag query <text> [--scope plans\|code\|onboarding\|all] [--top-k N] [--json]. Empty-store bail-out with a “Run kj rag index first” hint. Hits print as three-line blocks with the most-specific metadata label (hu_id > symbol > headingPath). |
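The shared windowed splitter from step 3 (limit 800, overlap 100) might look roughly like this. The unit is assumed to be characters, and the function name and edge-case handling are hypothetical; the real chunker may count tokens or split on nicer boundaries.

```javascript
// Sketch of a windowed splitter; units (characters) are an assumption.
function splitWindowed(text, limit = 800, overlap = 100) {
  if (overlap >= limit) throw new RangeError("overlap must be smaller than limit");
  if (text.length <= limit) return [text];

  const chunks = [];
  const step = limit - overlap; // each window starts `step` after the previous one
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + limit));
    if (start + limit >= text.length) break; // last window reached the end
  }
  return chunks;
}

const longSection = Array.from({ length: 2000 }, (_, i) =>
  String.fromCharCode(97 + (i % 26))
).join("");
const windows = splitWindowed(longSection);
console.log(windows.map((w) => w.length).join(",")); // prints "800,800,600"
```

Each window repeats the last 100 characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk.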

End-to-end workflow:

cd ~/your-project
kj onboard # Architecture Brief at ~/.karajan/onboarding/<slug>.md
kj plan generate task.md -y # Plans at ~/.karajan/plans/<slug>/plan-*.json
kj rag index # Seed the vec store
kj rag query "how did I handle auth in module X?"

The chain composes with the Onboarder from v2.21.0: the brief seeds the indexer with the project’s structure (stack, hot files, ADRs), so semantic queries can find the right plan or HU without re-walking the repo each time.

SEA binary stubs RAG and HU Board alike. src/rag/* + src/commands/rag.js join packages/hu-board/src/* in the list of modules scripts/esbuild-sea.config.mjs rewrites at bundle time — both subsystems depend on better-sqlite3 (native node-gyp, no JS entry point) and would break the SEA build. Standalone-binary users get an actionable error pointing at npm install -g karajan-code.

Coming in v2.23.0 — MCP tool kj_rag_query so other agents (Claude Desktop, the Karajan pipeline itself) can ask the same questions; HU Board search panel with the retriever wired into the same UI as the kanban; chokidar watcher for live re-indexing instead of on-demand; AST-aware source chunker (tree-sitter or @babel/parser); BM25 + cosine hybrid scoring.

RAG exposed to agents and humans alike (KJC-PCS-0049 Steps 7+8+Camino A). After v2.22.0’s CLI MVP, the corpus stops being a humans-only feature. Three PRs land three new consumers:

| Consumer | PR | Module | How |
| --- | --- | --- | --- |
| MCP agents (Claude Desktop, Cursor, Claude Code, Karajan’s own roles) | #815 | src/mcp/handlers/rag-handler.js + src/mcp/tools.js | New kj_rag_query + kj_rag_index tools. Tool count 25 → 27. Same shape as the CLI. An empty store responds with empty: true so the agent has a deterministic recovery signal. |
| Browser users | #816 | packages/hu-board/src/routes/api.js + public/app.js | New POST /api/rag/query endpoint + a panel between the preflight card and the kanban: input + scope dropdown (All / Plans / Onboarding / Code) + Search button + results pane. Hits render as bordered cards with kind + label + score + source + truncated text. |
| Pipeline roles (coder, researcher, architect, planner, spec-reviewer) | #817 | templates/roles/*.md | Tailored ‘Prior context (RAG, opt-in)’ section per role. The agents now KNOW the tool exists, what topK / scope to use, and that an empty store means proceed-without-retrieval, not block-and-ask. |

The unifying contract across all three: opt-in retrieval. When the corpus is empty (greenfield project, brand-new install, kj rag index not run yet), every consumer degrades to no-retrieval instead of bailing or pestering the human.

End-to-end workflow:

kj onboard # one-time per project — Architecture Brief
kj rag index # one-time per project — seed the corpus
kj plan generate task.md --use-onboarding
# From here on:
# - Coder calls kj_rag_query before reaching for unfamiliar APIs.
# - Planner queries for prior plans that might already expose a reuse marker.
# - Researcher avoids re-walking the repo when the corpus already has the signal.
# - Architect aligns with prior patterns.
# - Spec-reviewer surfaces scope_overlap findings against approved HUs.
# Humans get the same retriever in the HU Board search panel.

Per-role guidance (verbatim from the role templates):

  • coder (topK: 3, scope: 'all'). Use cases: ‘how the project handled a similar concern before (auth, error model, retry policy, naming convention…)’. Hard cap: one query per concern, not per file.
  • researcher (topK: 5, scope: 'plans'). Before re-walking the repo. Quote retrieved evidence as bullets under affects; do NOT paste raw chunks.
  • architect (topK: 3, scope: 'all'). When a design decision touches an area the project already shaped.
  • planner (topK: 5, scope: 'plans'). Before emitting blocked_by / dependencies / reuse. If a prior plan produced the utility the new one needs, mark reuse: [<hu-id>] instead of re-implementing; this closes the loop with KJC-BUG-0044 (P3).
  • spec-reviewer (topK: 3, scope: 'all'). If retrieved chunks describe overlapping HUs already approved, surface as kind: 'scope_overlap' referencing the prior HU id.

Coming in v2.24.0+: Camino B (/kj-rag-query slash command for hosts in Skills mode without MCP), Camino C (automatic pre-loop stage that pre-fetches retrieval and prepends it to the coder/researcher/architect prompt without the agent having to ask, transparent + automatic), Camino D (Brain decisor heuristic for when retrieval is worth the tokens). Plus chokidar watcher for live re-indexing, AST-aware source chunker, BM25 + cosine hybrid scoring.

RAG Camino C — pre-loop auto-retrieval (KJC-PCS-0049). The fourth layer of RAG integration after v2.22.0 (CLI), v2.23.0 (MCP + Board + role instructions). Where Camino A taught the agents that kj_rag_query exists for on-demand queries, Camino C makes Karajan inject the context for them automatically — transparent, no MCP call from the agent required.

The architectural move: instead of fanning out a ‘RAG-aware prompt builder’ across every role (researcher, architect, planner, coder), Karajan mutates the task parameter once, in the pre-loop driver, between triage and domainCurator. Because task already flows through runPlanningPhases to every downstream stage via parameter passing, one mutation feeds six consumers with zero per-stage code change. The token cost is paid once per kj run, not per role.

Five guards before the retrieval ever fires (src/orchestrator/stages/rag-context-stage.js):

| Guard | Returns skipped with reason | Behaviour |
| --- | --- | --- |
| config.rag.preload.enabled missing/false | disabled | Silent no-op. Default state (opt-in). |
| task empty/non-string | no-task | Silent; the intent classifier sometimes hands us an empty string. |
| Vec store has zero chunks | empty | Info log pointing at kj rag index; the agent’s own role template (Camino A) already tells it not to block on this. |
| Retriever returns zero hits | no-hits | Silent; the query may simply not match anything in the corpus. |
| Embedder / network / vec store throws | error | Warn log + continue. The pipeline never sees the exception. |

The stage never throws: it is best-effort enrichment, off by default (opt-in), and it backs out on failure. The contract is that it is safe to wire into the pre-loop unconditionally, because the worst case is a no-op + warn.
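The guard order can be sketched as a single function that always resolves. The `deps` object (countChunks, retrieve, log) is a hypothetical injection seam, and the log messages are made up; only the five reasons and the never-throw contract come from the description above.

```javascript
// Sketch of the guard chain; `deps` and the log texts are hypothetical.
async function ragContextStageSketch({ config, task, deps }) {
  const preload = config && config.rag && config.rag.preload;
  if (!preload || !preload.enabled) return { skipped: true, reason: "disabled" };
  if (typeof task !== "string" || task.trim() === "") {
    return { skipped: true, reason: "no-task" };
  }
  try {
    if ((await deps.countChunks()) === 0) {
      deps.log("info", "RAG store is empty; run kj rag index");
      return { skipped: true, reason: "empty" };
    }
    const hits = await deps.retrieve(task, {
      topK: preload.topK ?? 5,
      scope: preload.scope ?? "all",
    });
    if (hits.length === 0) return { skipped: true, reason: "no-hits" };
    return { skipped: false, hits };
  } catch (err) {
    // Any embedder, network or store failure degrades to a warning.
    deps.log("warn", `RAG preload failed: ${err.message}`);
    return { skipped: true, reason: "error" };
  }
}

const demoDeps = {
  countChunks: async () => 0,
  retrieve: async () => [],
  log: (level, message) => console.log(`[${level}] ${message}`),
};
ragContextStageSketch({
  config: { rag: { preload: { enabled: true } } },
  task: "add login",
  deps: demoDeps,
}).then((result) => console.log(result.reason)); // prints "empty"
```

Because every path returns a plain object, the caller can persist the reason and move on without wrapping the stage in its own try/catch.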

When all guards pass, the task parameter receives a ready-to-prepend markdown block:

## Prior context from RAG (auto-retrieved, top N)
### [plan · AUTH-1] /path/to/plan.json
<chunk text, truncated 600 chars>
### [plan · ARCH-3] /path/to/plan-2.json
<chunk text…>

The label resolution rule (hu_id > symbol > headingPath > 'block') matches the CLI’s renderer in src/commands/rag.js and the HU Board panel from v2.23.0: agents seeing the context via auto-retrieval, humans seeing it via the CLI, and humans seeing it via the Board panel all read the same shape.
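A sketch of how the block above could be rendered with the same label precedence. The hit shape (kind, meta, source, text) and both function names are assumptions, and headingPath is treated as a plain string here.

```javascript
// Label precedence from the text: hu_id > symbol > headingPath > 'block'.
function labelFor(meta = {}) {
  return meta.hu_id || meta.symbol || meta.headingPath || "block";
}

// Renders hits into the markdown block; the hit shape is hypothetical.
function renderPriorContext(hits, maxChars = 600) {
  const lines = [`## Prior context from RAG (auto-retrieved, top ${hits.length})`];
  for (const hit of hits) {
    lines.push(`### [${hit.kind} · ${labelFor(hit.meta)}] ${hit.source}`);
    // Chunk text is cut at maxChars so the prompt stays bounded.
    lines.push(hit.text.length > maxChars ? hit.text.slice(0, maxChars) : hit.text);
  }
  return lines.join("\n");
}

const rendered = renderPriorContext([
  { kind: "plan", meta: { hu_id: "AUTH-1" }, source: "/path/to/plan.json", text: "Login uses sessions." },
  { kind: "code", meta: {}, source: "src/auth.js", text: "x".repeat(900) },
]);
console.log(rendered.split("\n")[1]); // prints "### [plan · AUTH-1] /path/to/plan.json"
```

Sharing one label function between the CLI, the board and the pre-loop stage is one simple way to keep the three surfaces in the same shape.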

Toggle and tuning:

~/.karajan/kj.config.yml
rag:
  preload:
    enabled: true # default: false (opt-in)
    topK: 5 # default: 5
    scope: all # default: all; also: plans | code | onboarding
  embedder:
    dim: 768 # default: 768 (nomic-embed-text)
    model: nomic-embed-text
    url: http://localhost:11434

Compatibility with Camino A (v2.23.0): complementary, not redundant. The role templates from PR #817 still tell each agent that kj_rag_query exists for on-demand queries — different from the automatic pre-loop injection. With both active, the typical flow is: pre-loop gives the agent prior context at kickoff; the agent uses it; if it needs more granular retrieval mid-task, it invokes kj_rag_query directly.

Workflow with v2.24.0:

cd ~/your-project
kj onboard
kj rag index
yq -i '.rag.preload.enabled = true' ~/.karajan/kj.config.yml
kj run task.md
# researcher/architect/planner/coder all see prior context in their
# task prompt at the start of the iteration. They can still call
# kj_rag_query for follow-up questions.

Coming in v2.27.0+: chokidar watcher for live re-indexing without on-demand kj rag index, AST-aware source chunker (tree-sitter or @babel/parser) to replace the export-symbol regex, BM25 + cosine hybrid scoring for better precision on keyword-heavy queries, OpenAI / Voyage embedder adapters for users who cannot run a local Docker.

RAG Camino B + Camino D (KJC-PCS-0049). Closes the consumer-surface plan started in v2.22.0. The retriever is now reachable from every relevant host context, and the pre-loop stage from v2.24.0 only pays its cost when triage signals make retrieval worthwhile.

Camino B — /kj-rag-query slash command for Skills hosts. New templates/skills/kj-rag-query.md. kj init ships it to .claude/commands/ so hosts that load Karajan via the Skills mechanism — Claude Code without MCP, Cursor without MCP — can reach the RAG retriever through a slash command instead of needing the kj_rag_query MCP tool. Thin wrapper over the existing CLI:

/kj-rag-query <text> [--scope <all|plans|onboarding|code>] [--top-k <n>]

The skill defers flag validation to the CLI (single source of truth), surfaces empty-store as a one-line hint without blocking the conversation, and renders hits as [score] source — snippet blocks rather than raw JSON.

Camino D — Brain decisor heuristic for pre-loop retrieval. New module src/orchestrator/stages/rag-preload-decisor.js. Pure function shouldPreloadRag({ triage, task, config }) → { pull, reason } wired in pre-loop.js before runRagContextStage (v2.24.0). Adds a new config knob:

rag:
  preload:
    enabled: true
    policy: auto # auto | always | never (default: auto)
    brownfield: false # explicit signal; always pulls when true
  • always — pull every run (back-compat with v2.24.0 default behaviour).
  • never — never pull (benchmarking, debugging).
  • auto (default) — heuristic. Pulls when any of:
    • triage.shouldDecompose === true (multi-HU run, prior context benefits researcher/architect/planner)
    • triage.level ∈ {complex, high, epic}
    • task.length >= 200 chars (long brief usually = non-trivial scope)
    • config.rag.preload.brownfield === true

Otherwise the stage persists ragPreload: { skipped: true, reason: 'auto:low-value' } and the pipeline pays no retrieval cost on trivial tasks. The decisor reason is also persisted alongside hit counts when retrieval does fire — resume and audit can see exactly why retrieval ran (or didn’t) on any past session.
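The heuristic reads naturally as a pure function. In this sketch only the signature, the policy values, the four signals and the 'auto:low-value' reason come from the description above; the other reason strings and the order in which signals are checked are assumptions.

```javascript
// Sketch of the decisor; reason strings other than 'auto:low-value' are assumed.
function shouldPreloadRagSketch({ triage = {}, task = "", config = {} }) {
  const preload = (config.rag && config.rag.preload) || {};
  const policy = preload.policy || "auto";

  if (policy === "never") return { pull: false, reason: "policy:never" };
  if (policy === "always") return { pull: true, reason: "policy:always" };

  // policy === "auto": pull when any signal suggests non-trivial scope.
  if (preload.brownfield === true) return { pull: true, reason: "auto:brownfield" };
  if (triage.shouldDecompose === true) return { pull: true, reason: "auto:decompose" };
  if (["complex", "high", "epic"].includes(triage.level)) {
    return { pull: true, reason: "auto:level" };
  }
  if (typeof task === "string" && task.length >= 200) {
    return { pull: true, reason: "auto:long-task" };
  }
  return { pull: false, reason: "auto:low-value" };
}

console.log(shouldPreloadRagSketch({ triage: { level: "trivial" }, task: "fix typo" }).reason);
// prints "auto:low-value"
```

Returning the reason alongside the boolean is what allows a session record to explain, after the fact, why retrieval did or did not run.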

Why both at once: B and D are independent surfaces — one extends reach, the other refines cost. Shipped together so the v2.22.0 → v2.25.0 RAG arc closes in a single release rather than dribbling across several minors.

Coming in v2.27.0+ — chokidar watcher (live re-indexing), AST source chunker, BM25 + cosine hybrid scoring, OpenAI/Voyage embedder adapters.

RAG Auto-Bootstrap (KJC-PCS-0049). Closes the friction caught in the v2.25.0 dogfooding: the RAG feature was technically complete but invisible to new users — they had to install Ollama manually before any of kj rag index, kj rag query, the MCP tool or pre-loop auto-retrieval would do anything. From v2.26.0 onwards, the embedder is just there.

kj init provisions Ollama in Docker. Three integrated steps:

  1. Capability check: src/rag/ollama-capability.js validates Docker daemon reachability and >= 4 GB free RAM. Returns an aggregated { capable, reasons[] }.
  2. Bootstrap — when capable, ollamaUp() writes ~/.karajan/docker-compose.ollama.yml and starts the kj-ollama container, then waitForOllamaReady() polls /api/tags until healthy.
  3. Model pull: docker exec kj-ollama ollama pull nomic-embed-text populates the embedder model (~270 MB on first run; cached afterwards in the kj_ollama_data volume).

Never crashes init: Windows without Docker Desktop, Linux under 4 GB free, or explicit --no-ollama opt-out all degrade to a one-line warning and continue. Hosts with Ollama already running on :11434 are detected via the discover-before-spawn pattern (same as SonarQube): the existing instance is reused, no second container.

kj doctor surfaces Ollama state:

| rag.preload.enabled | Container | Doctor reports |
| --- | --- | --- |
| false (default) | irrelevant | info: Disabled in config (quiet on greenfield) |
| true | reachable | info: Reachable at <host> |
| true | unreachable | warn with fix hint kj ollama start |

kj ollama subcommand for lifecycle without docker compose:

kj ollama start # provision + start (idempotent)
kj ollama stop # docker compose stop
kj ollama status # host + container + reachable
kj ollama pull <model> # docker exec ... ollama pull <m>

Bug fix bundled — the same v2.25.0 smoke test that surfaced the bootstrap gap also caught KJC-BUG-0061: kj onboard --no-synth was silently ignored because Commander maps --no-synth to flags.synth=false, not flags.noSynth=true; the synth branch invoked OnboarderRole.run() without init(), breaking the command outright; and kj rag query --json on empty store emitted [] instead of the { hits, empty, topK, scope } shape the MCP handler returns, which broke the contract the /kj-rag-query skill from v2.25.0 promised to Skills hosts. All three fixed in PR #824 and shipped with v2.26.0.
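Two of the three fixes are easy to show in miniature. The helper names below are hypothetical; the flag mapping (a negated `--no-synth` option arrives as `synth: false`) and the `{ hits, empty, topK, scope }` shape are taken from the note above.

```javascript
// Commander stores a negated option such as --no-synth under `synth`,
// so the check must read flags.synth, not flags.noSynth.
function shouldSynthesize(flags) {
  return flags.synth !== false;
}

// Empty-store response with the same shape the MCP handler returns.
function emptyStoreResponse({ topK, scope }) {
  return { hits: [], empty: true, topK, scope };
}

console.log(shouldSynthesize({ synth: false })); // prints false
console.log(shouldSynthesize({})); // prints true
console.log(JSON.stringify(emptyStoreResponse({ topK: 5, scope: "all" })));
// prints {"hits":[],"empty":true,"topK":5,"scope":"all"}
```

Keeping the CLI's JSON output identical to the MCP shape means a thin wrapper such as a slash command can parse either one with the same code.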

Coming in v2.28.0+ — chokidar watcher (live re-indexing), AST source chunker, BM25 + cosine hybrid scoring, OpenAI/Voyage embedder adapters for users without local Docker.

RAG polish (KJC-PCS-0049). Three independent improvements triggered by the v2.26.0 smoke test on karajan-code itself, plus a CI workflow exclude fix surfaced while landing the docs.

Per-project isolation: ~/.karajan/rag.db is a single SQLite database shared across every indexed project. The v2.26.0 smoke caught chunks from a sandbox mixing into queries against karajan-code, so the schema gained a project_slug TEXT column on chunks (the migration is non-breaking; pre-v2.27 DBs keep working with NULL slugs). indexProject auto-stamps every chunk with the normalized projectDir basename; kj rag query --project <slug> filters retrieval; --project all disables the filter; the default is the cwd basename. The same flag is honoured by the MCP tool and the /kj-rag-query slash command.

docs/RAG.md unified — the v2.26.0 release exposed that RAG documentation was scattered across CHANGELOG entries, README banner, role templates and landing. New unified guide at docs/RAG.md (EN) + docs/es/RAG.md (ES) covering: quick start, architecture diagram, installation (auto-bootstrap + fallback paths), six workflows (CLI, MCP, slash command, HU Board, pre-loop, role instructions — Caminos A→D mapped explicitly), configuration matrix, limitations + their v2.28.0+ roadmap items, troubleshooting. README banner now links both languages. Bonus fix: docs/**/*.md only matched files in subdirectories — docs/RAG.md at the root was slipping past shrink-budget. Workflow patched to mirror every doc-extension exclude with both docs/*.ext and docs/**/*.ext patterns.

Asymmetric source-vs-test ranking — the v2.26.0 smoke caught a systematic ranking bias: natural-language queries (how does X work) ranked tests/X.test.js above src/X.js because tests carry more descriptive prose and cosine-only similarity rewarded that. New rule in retriever.js: when the query does NOT mention test-flavoured terms (test|spec|expect|describe|it|jest|vitest|mocha), code chunks whose source path is NOT a test file get +0.05 boost on top of the kind boost. Test-flavoured queries keep the baseline so vitest mock setup still surfaces test files. Helpers shouldBoostSources and isTestPath exported for tests.
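The rule can be sketched as follows. shouldBoostSources and isTestPath are named in the note, but their bodies here are assumptions: the word-boundary matching, the path heuristic, the hit shape ({ path, kind, score }) and the higher-is-better score are all illustrative, not taken from retriever.js.

```javascript
// Test-flavoured terms from the release note; matching on word boundaries is an assumption.
const TEST_TERMS = /\b(test|spec|expect|describe|it|jest|vitest|mocha)\b/i;
const SOURCE_BOOST = 0.05;

function shouldBoostSources(query) {
  return !TEST_TERMS.test(query);
}

function isTestPath(path) {
  // Assumed heuristic: a test(s)/ or __tests__/ directory, or a .test./.spec. suffix.
  return /(^|\/)(tests?|__tests__)\//.test(path) || /\.(test|spec)\.[a-z]+$/i.test(path);
}

// hits: [{ path, kind, score }] where a higher score ranks first (assumed shape).
function applySourceBoost(query, hits) {
  const boost = shouldBoostSources(query);
  return hits
    .map((h) => ({
      ...h,
      score: h.score + (boost && h.kind === "code" && !isTestPath(h.path) ? SOURCE_BOOST : 0),
    }))
    .sort((a, b) => b.score - a.score);
}
```

With this sketch, a query such as "how does retriever work" lifts src/retriever.js over a slightly better-scoring tests/retriever.test.js, while "vitest mock setup" leaves the baseline order untouched.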

KJC-BUG-0063 (bundled). tests/resilience/hibernate-end-to-end.test.js was TZ-dependent and broke every CI run on every PR (TZ=Europe/Madrid passed, TZ=UTC failed). Skipped in CI via process.env.CI === 'true' until parseCooldown becomes TZ-aware. Local runs are unaffected (4/4 still pass).

RAG quality lift (KJC-PCS-0049 continued). Five PRs land the next layer of the RAG track: visibility, more providers, finer queries, better final ranking.

Retrieval dashboard on HU Board (PR #843). New standalone page at /rag.html, linked from the main nav. Shows total chunks, DB size, last-index timestamp, active embedder provider + dim, chunk counts grouped by kind (code / plan / onboarding) and project (top 20). Backend GET /api/rag/stats reads the local rag.db read-only and the embedder block from kj.config.yml. Empty-state when the DB has not been initialized. This is the first piece of the v2.30 config-UI roadmap — the board surface will grow writable controls (role toggles, embedder swap, alpha/mode/rerank) on top of the same shape.

Cohere + Mistral embedder adapters (PR #848). embed-multilingual-v3.0 (1024 dim, strong multilingual) and mistral-embed (1024 dim, EU-hosted for GDPR-conscious users). KJ_COHERE_KEY / KJ_MISTRAL_KEY Karajan-scoped env vars. The original “Anthropic via OAuth” roadmap slot is dropped — Anthropic has no embeddings API; Cohere + Mistral cover that gap with first-party services.

ONNX local embedder (PR #??). Sixth provider, fully local: no Docker, no API key, no external service. Loaded dynamically via @huggingface/transformers (legacy @xenova/transformers honoured as fallback). Default Xenova/all-MiniLM-L6-v2 (384 dim, ~80 MB cached on first use). Higher-quality option Xenova/jina-embeddings-v2-base-en (768 dim) selectable via KJ_ONNX_EMBED_MODEL. Both transformers packages are optional peer deps — not auto-installed (combined ~500 MB with WASM + ONNX runtime); the adapter throws a helpful install hint on missing dep. This is the missing piece for v2.31 zero-config init: a sensible default with zero infrastructure.

Per-chunk metadata --where filter (PR #??). New CLI flag with KEY=VALUE AND KEY=VALUE grammar (case-insensitive AND, quoted strings for values with spaces):

kj rag query 'login flow' --where 'symbol=loadConfig'
kj rag query 'auth' --where 'hu_id=HU-003 AND kind=plan'
kj rag query 'router' --where 'kind=code AND symbol=createRouter'

kind special-cases to the column; every other key routes through SQLite json_extract(c.metadata, '$.<key>') = ?, so anything the chunker emits as metadata (symbol, hu_id, headingPath, file, …) is queryable without schema changes. The filter applies uniformly to both semantic and BM25 sides of the hybrid retriever.
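A sketch of how the grammar could be parsed and turned into a parameterized SQL fragment. parseWhere and toSql are hypothetical names, the c.kind column reference assumes the same c alias as the json_extract expression, and restricting keys to identifier characters is an assumption made so a key can be placed safely inside the JSON path.

```javascript
// Parses "KEY=VALUE AND KEY=VALUE" into [{ key, value }].
function parseWhere(input) {
  const clauses = [];
  // One clause: KEY = VALUE, where VALUE is "double quoted", 'single quoted', or bare.
  const clauseRe = /\s*([A-Za-z_]\w*)\s*=\s*(?:"([^"]*)"|'([^']*)'|(\S+))\s*/y;
  const andRe = /and\s+/iy; // case-insensitive AND, anchored at the current offset
  let pos = 0;
  for (;;) {
    clauseRe.lastIndex = pos;
    const m = clauseRe.exec(input);
    if (!m) throw new Error(`Invalid --where clause at offset ${pos}`);
    clauses.push({ key: m[1], value: m[2] ?? m[3] ?? m[4] });
    pos = clauseRe.lastIndex;
    if (pos >= input.length) return clauses;
    andRe.lastIndex = pos;
    if (!andRe.test(input)) throw new Error(`Expected AND at offset ${pos}`);
    pos = andRe.lastIndex;
  }
}

// Turns clauses into a SQL fragment plus bound parameters.
function toSql(clauses) {
  const sql = [];
  const params = [];
  for (const { key, value } of clauses) {
    // kind is a real column; every other key is looked up inside the JSON metadata.
    sql.push(key === "kind" ? "c.kind = ?" : `json_extract(c.metadata, '$.${key}') = ?`);
    params.push(value);
  }
  return { sql: sql.join(" AND "), params };
}
```

Values always travel as bound parameters; only the validated key is interpolated into the query text.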

Cross-encoder rerank (PR #??). Opt-in --rerank flag re-scores the topK survivors with a (query, passage) cross-encoder. Default Xenova/ms-marco-MiniLM-L-6-v2 (sentence-transformers de-facto standard, ~80 MB cached on first use). Cross-encoders are slower than bi-encoders (jointly encode the pair instead of caching the passage), but materially more precise for final ranking — Karajan runs it only on the post-fusion, post-boost survivors so latency stays bounded. Plugs in after the kind+source boosts, acting as a finer-grained quality lever.

Coming in v2.30.0+ — writable config UI on the HU Board (toggle roles, switch coder/reviewer, adjust alpha/mode/rerank without re-editing kj.config.yml), then v2.31 zero-config init (reduce the wizard to one critical question, smart defaults for everything).

Writable config UI on HU Board (KJC-PCS-0042). Four PRs land the settings modal end-to-end. The kj.config.yml is no longer an editor-only file — toggle roles, switch coder/reviewer, swap embedder provider, adjust topK/scope without leaving the board. Hand-editing of YAML keeps working: the modal writes only whitelisted fields and preserves everything else verbatim. Atomic-write + .bak discipline applies.

Pipeline role toggles (PR #854, KJC-TSK-0450). Eight new boolean fields exposed on the board modal: pipeline.planner.enabled, researcher, architect, tester, security, refactorer, impeccable, brain. Mirrors the source-of-truth defaults in src/config/defaults.js. Brain stays opt-in (false default); the rest reflect Karajan’s recommended baseline.

RAG controls (PR #855, KJC-TSK-0451). Four new fields close the “I want to swap embedder without vim” gap: rag.preload.enabled (boolean), rag.preload.topK (1–20), rag.preload.scope (all / code / plans / onboarding), rag.embedder.provider (ollama / openai / voyage / cohere / mistral / onnx — matches the six adapters shipped through v2.28.0–v2.29.0).

Grouped sections in config modal (PR #856, KJC-TSK-0452). Fields are now categorised: Agentes y modelos, Roles del pipeline, RAG, Tiempos de sesión, Calidad. Each section has an icon and a deterministic order. Backend exports a CATEGORIES array so the front renders sections without hard-coding. Unknown categories fall back to “Otros” — defensive: a new field always renders, even if someone forgets to label it.

Scope toggle: global vs per-project (PR #857, KJC-TSK-0453). A two-pill toggle in the modal header switches between ~/.karajan/kj.config.yml (global, default — same path the rest of Karajan uses) and <projectDir>/.karajan/kj.config.yml (per-project override; created on demand on first save). Resolution uses KJ_PROJECT_DIR || cwd(), the same pattern as journal-parser.js. Unknown scopes are rejected on both read (throws) and write (no disk touch).
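The scope-to-path mapping can be sketched as a pure function. resolveConfigPath and the scope names "global" and "project" are assumptions, as is the POSIX-style path joining; home, cwd and env are passed in so the sketch stays testable without touching the real environment.

```javascript
// Hypothetical resolver; scope names and signature are illustrative.
function resolveConfigPath(scope, { home, cwd, env = {} }) {
  if (scope === "global") return `${home}/.karajan/kj.config.yml`;
  if (scope === "project") {
    // Same pattern as described above: KJ_PROJECT_DIR wins over the cwd.
    const projectDir = env.KJ_PROJECT_DIR || cwd;
    return `${projectDir}/.karajan/kj.config.yml`;
  }
  // Unknown scopes are rejected before any path is produced or any disk access happens.
  throw new Error(`Unknown config scope: ${scope}`);
}
```

Because both the read and the write path would go through one resolver, an unknown scope fails in the same place for both operations.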

Coming in v2.31.0+ — zero-config init: reduce the wizard to one critical question, smart defaults for everything else. The writable UI of v2.30 makes that defensible — users can change defaults later from the board.

RAG advanced (KJC-PCS-0049). The four roadmap items announced in v2.27.0 shipped together, plus a real fix for the v2.27.0 workaround:

Live re-index: kj watch (PR #836). New kj watch [start|stop|status] daemon backed by chokidar. It watches ~/.karajan/onboarding/, ~/.karajan/plans/ and, optionally, the projectDir sources; every change/add/unlink event triggers an incremental reindex after a 1 s debounce. Deleting a file removes its chunks. The PID file ~/.karajan/watcher.pid ensures that only one daemon runs at a time. This ends the manual kj rag index after every code edit.

Cloud embedders: OpenAI + Voyage (PR #841). For users without local Docker. A new factory, src/rag/embedders/factory.js, resolves the provider from config.rag.embedder.provider:

rag:
  embedder:
    provider: openai # ollama | openai | voyage (default: ollama)
    api_key: ${KJ_OPENAI_KEY} # KJ_OPENAI_KEY / KJ_VOYAGE_KEY are Karajan-scoped env vars
    model: text-embedding-3-small
    dim: 1536

Voyage is the embedder Anthropic recommends (Anthropic has no first-party embedder); voyage-code-3 is tuned for code. The env vars are KJ_* (Karajan-scoped): Karajan never reads OPENAI_API_KEY directly, which preserves the architecture invariant.

BM25 + cosine hybrid (PR #838). SQLite FTS5 virtual table chunks_fts (contentless mirror kept in sync by triggers). A new function searchBM25(db, queryText, topK), plus fuseHits(), which normalizes both scores to [0,1] and combines them linearly via alpha:

distance = α · norm(cosine_distance) + (1−α) · norm(bm25)

CLI: kj rag query "<text>" --mode hybrid|semantic|keyword --alpha 0.6. hybrid (the default) covers both cases; keyword skips the embedder call when the query is an exact symbol. This fixes the poor ranking of queries such as projectSlug or runRagContextStage.
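A sketch of the fusion step under stated assumptions: min-max normalization, a worst-case value of 1 for a hit missing from one side, and the hit shapes are all illustrative choices, and the default alpha of 0.6 is simply taken from the CLI example. The actual fuseHits() in Karajan may differ on each point.

```javascript
// Min-max normalization to [0, 1]; a constant list maps to all zeros.
function normalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map((v) => (max === min ? 0 : (v - min) / (max - min)));
}

// semantic: [{ id, distance }] cosine distance, lower is better
// keyword:  [{ id, bm25 }]     FTS5 bm25() value, lower is better
function fuseHits(semantic, keyword, alpha = 0.6) {
  const sem = new Map();
  normalize(semantic.map((h) => h.distance)).forEach((n, i) => sem.set(semantic[i].id, n));
  const kw = new Map();
  normalize(keyword.map((h) => h.bm25)).forEach((n, i) => kw.set(keyword[i].id, n));

  const ids = new Set([...sem.keys(), ...kw.keys()]);
  return [...ids]
    .map((id) => ({
      id,
      // A hit missing from one side gets the worst normalized value (1) there.
      distance: alpha * (sem.get(id) ?? 1) + (1 - alpha) * (kw.get(id) ?? 1),
    }))
    .sort((a, b) => a.distance - b.distance);
}
```

With alpha = 1 the result follows the semantic ranking alone, and with alpha = 0 the keyword ranking alone, which mirrors the semantic and keyword modes at the extremes.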

AST source chunker (PR #839). src/rag/chunker-ast.js uses @babel/parser (plugins: typescript, jsx, decorators-legacy, classProperties, topLevelAwait). Each top-level declaration goes whole into one chunk; JSDoc and leading comments are folded in. It replaces the regex chunker for JS/TS/JSX. Long functions are no longer split mid-statement.

KJC-BUG-0064 (bundled) (PR #840). parseCooldown is now TZ-aware via Intl.DateTimeFormat. When Claude Code's stderr contains (Continent/City), it resolves the wall-clock target in that specific TZ instead of the host's local TZ. This undoes the skip-in-CI workaround shipped in v2.27.0 (KJC-BUG-0063). Tests pass in TZ=UTC, Europe/Madrid, Asia/Tokyo and America/Los_Angeles.

Coming in v2.29.0+ — extended cloud providers, per-chunk metadata search, rerank with cross-encoder, retrieval-quality dashboard on the HU Board. (All landed in v2.29.0.)

Team-shared HU Board (KJC-PRP-0002). Seven PRs (#859–#865) land the multi-machine HU cohort end-to-end. A plan no longer lives only in ~/.karajan/plans/<planId>/: opting it in with kj plan share <planId> copies it into .karajan-shared/plans/<planId>/, where every machine running Karajan on the same project can see and claim slices of the same plan.

Loader merge + board scan (PR1 / PR2). loadPlan() merges shared HUs into the per-project plan, deduping by id. The HU Board scanner picks up .karajan-shared/ next to the local cohort, sets is_shared = 1 on the chunks it reads from there, and the API surfaces it as a shared badge — visible at a glance, no menu dive.

Unshare round-trip + assignee (PR3 / PR6). kj plan unshare reverses the share atomically (the shared copy is removed; the local copy stays). The new per-HU assignee field is part of EDITABLE_HU_FIELDS, so two runners can split work on the same cohort without overwriting each other.

Selective share + conflict policy (PR4 / PR5). --only id1,id2 and --exclude id3,id4 filter which HUs leave the local cohort. When the same HU exists in both local and shared (concurrent edits across machines), sharedConflictPolicy decides: local-wins (default), shared-wins, or error (refuse to load, force human resolution).
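The three policies can be sketched as a merge keyed by HU id. mergeHus is a hypothetical name, and treating byte-identical copies as "no conflict" (compared here with JSON.stringify) is an assumption; the note only says the policy applies when the same HU exists on both sides.

```javascript
// Merges local and shared HUs, deduping by id under the given policy.
function mergeHus(localHus, sharedHus, sharedConflictPolicy = "local-wins") {
  const merged = new Map(localHus.map((hu) => [hu.id, hu]));
  for (const shared of sharedHus) {
    const local = merged.get(shared.id);
    if (!local) {
      merged.set(shared.id, shared); // only exists in the shared cohort
      continue;
    }
    // Assumption: identical copies are not a conflict.
    if (JSON.stringify(local) === JSON.stringify(shared)) continue;
    switch (sharedConflictPolicy) {
      case "local-wins":
        break; // keep the local copy already in the map
      case "shared-wins":
        merged.set(shared.id, shared);
        break;
      case "error":
        throw new Error(`HU ${shared.id} differs between local and shared copies`);
      default:
        throw new Error(`Unknown sharedConflictPolicy: ${sharedConflictPolicy}`);
    }
  }
  return [...merged.values()];
}
```

The error policy fails the whole load rather than skipping the conflicting HU, which matches the "refuse to load, force human resolution" wording above.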

Frontend cache (PR2). projectIsSharedCache memoizes the shared-or-not lookup per project so the board UI doesn’t hammer the API on every HU row render.

This closes the team-shared HU Board prerequisite (KJC-PRP-0002) — the last piece before the v3.0 Brain rewrite can rely on a stable multi-runner substrate.

AI Harness Scorecard hardening (KJC-PCS-0051). Plan A closes five FAILs from the external scorecard audit in a single sprint, plus two bug fixes shipped alongside.

Prettier --check CI gate (PR #868, KJC-TSK-0464). A new format job blocks PRs whose formatting drifts from .prettierrc.json. Scope intentionally narrow at first (.github/workflows/, root config) per .prettierignore; future PRs fold in additional directories under the shrink-budget cap.

Coverage v8 + CI artifact (PR #870, KJC-TSK-0465). vitest.config.js now emits text + html + lcov via @vitest/coverage-v8. The new coverage job runs npm run test:coverage and uploads coverage/ as a downloadable artifact (14-day retention). Per-glob thresholds are enforced when the user opts in; the src/mcp/handlers/** floor was lowered to 70/60 (from 80/80) to lock in the current state, with a follow-up tracked to climb back.

Conventional Commits on PR head (PR #872, KJC-TSK-0466). wagoid/commitlint-github-action@v6 checks every PR commit against .commitlintrc.json. CI-side enforcement on top of the existing pre-commit local hook — bypassing husky no longer escapes the gate.

Nightly drift workflow (PR #873, KJC-TSK-0467). New .github/workflows/nightly.yml re-runs the full CI suite (lint + syntax + tests + format) every night at 04:17 UTC against main. Failures auto-file or update a tracking issue tagged drift via actions/github-script@v8, so a flaky dep or upstream regression surfaces within 24 h instead of on the next unrelated PR.

eslint-plugin-security policy (PR #874, KJC-TSK-0468). eslint.config.js hard-bans eval, new Function, Function-style implied evals, dynamic require, pseudoRandomBytes and mustache-escape disabling; it flags detect-non-literal-regexp as warn (14 acceptable warnings noted for follow-up). High-noise members of the recommended preset are intentionally NOT enabled (detect-object-injection, detect-non-literal-fs-filename, detect-child-process) to avoid false-positive churn on legitimate orchestrator code.

Bug fixes shipped alongside. PR #869 (KJC-BUG-0065) repaired 42 failing tests on main so the hardening sprint sits on a clean base. PR #871 (KJC-BUG-0066) added the missing await to openEditor in the spec-reviewer refine-loop, eliminating a race where the SHA hash diff read the v2 contents before the user finished editing.

AI Harness Scorecard golden metric (KJC-PCS-0051, Plan B). Plan B turns kj audit into a measurable quality loop. Every run now produces one deterministic number and an A–F grade for “how AI-friendly is this repo today vs last week,” with zero LLM tokens spent on the metric itself.

Docker bootstrap of ai-harness-scorecard (PR #877, KJC-TSK-0470). kj audit now auto-pulls addyosmani/ai-harness-scorecard on first use (~10 s warm) and runs a one-shot scan against the cwd. The bootstrap follows the same default-on-with---no-* pattern as Ollama in v2.26.0 — --no-harness opts out for air-gapped environments.

Audit report integration (PR #878, KJC-TSK-0471). The harness score (0–100) and grade (A–F) splice into the audit report headline alongside the deterministic findings. CLI/MCP parity preserved; the JSON payload exposes harness.score, harness.grade, harness.checks[] for downstream tooling.

Per-project history DB (PR #879, KJC-TSK-0472). Every audit run persists to .karajan/audit-history.db (SQLite + WAL, PRAGMA user_version=1). Schema captures run_id, started_at, score, grade, checks_json, commit_sha. The DB is per-project (gitignored by default) and survives across releases via versioned migrations.

Diff vs baseline + trend sparkline (PR #880, KJC-TSK-0473). The audit report now shows the delta vs the previous baseline (Δ +7 vs run #12 from 2026-05-21) and an optional Unicode-bar trend sparkline (▁▂▃▄▅▆▇█) over the last N runs. Edge cases covered: first run (no diff), stale baseline (>30 days old), missing commit SHA. 12 new tests in tests/audit/audit-history-display.test.js cover firstRun, diff, biggest delta, stale baseline, and sparkline edge cases.
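A sketch of the two display helpers. The function names, the record fields (score, runId, date) and the absolute 0 to 100 scaling of the bars are assumptions; the real report may scale the sparkline relative to the observed range instead.

```javascript
const BARS = "▁▂▃▄▅▆▇█";

// Maps scores in [0, 100] onto the eight bar glyphs (absolute scale).
function sparkline(scores) {
  return scores
    .map((s) => {
      const clamped = Math.min(100, Math.max(0, s));
      const idx = Math.min(BARS.length - 1, Math.floor((clamped / 100) * BARS.length));
      return BARS[idx];
    })
    .join("");
}

// Formats the delta against the previous run; returns null on the first run.
function formatDelta(current, baseline) {
  if (!baseline) return null;
  const delta = current.score - baseline.score;
  const sign = delta > 0 ? "+" : "";
  return `Δ ${sign}${delta} vs run #${baseline.runId} from ${baseline.date}`;
}
```

An absolute scale keeps bars comparable across projects; a relative scale would make small week-to-week movements easier to see.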

5 250+ tests passing across 466 files. The four PRs (#877–#880) close KJC-PCS-0051 in its entirety — Plan A (v2.32) fixed the FAILs flagged by the external scorecard audit; Plan B (v2.33) makes the scorecard a first-class signal inside Karajan’s own audit.

Multi-language RAG + Quality & Observability. Two epics close in one window: KJC-PCS-0052 brings the Project RAG out of the JS/TS island and into Python, Rust, Go and Java; KJC-PCS-0053 turns retrieval quality from a vibe into a measured signal. Eighteen PRs.

web-tree-sitter AST chunkers for Python, Rust, Go and Java (KJC-TSK-0478, 0479, 0481, 0486). Grammars are vendored as WASM under vendor/tree-sitter-grammars/ so SEA binaries keep working without runtime downloads. A language adapter registry (src/lang/registry.js, adapterForPath(file)) routes each file to its chunker; the indexer wires preparePython / prepareRust / prepareGo / prepareJava alongside the existing JS/TS path.
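An extension-keyed registry is one way such routing could look. adapterForPath is named in the note, but the table below, the adapter labels and the null return for unsupported files are assumptions, not the contents of src/lang/registry.js.

```javascript
// Illustrative extension table; the real registry may key on more than the extension.
const ADAPTERS = new Map([
  [".py", "python"], [".rs", "rust"], [".go", "go"], [".java", "java"],
  [".js", "js"], [".jsx", "js"], [".ts", "js"], [".tsx", "js"],
]);

function adapterForPath(file) {
  const dot = file.lastIndexOf(".");
  const slash = Math.max(file.lastIndexOf("/"), file.lastIndexOf("\\"));
  // No extension, or a dotfile such as ".env": nothing to route.
  if (dot <= slash + 1) return null;
  return ADAPTERS.get(file.slice(dot).toLowerCase()) ?? null;
}
```

Returning null for unknown extensions lets the indexer skip a file or fall back to another chunker without special-casing each language.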

Multi-stack collectors and onboarder (KJC-PCS-0052 PR-C, PR-D). kj audit now recognises Python (pyproject/poetry/requirements), Rust (Cargo.toml), Go (go.mod), and Java (pom.xml/build.gradle) projects and adapts its checks; kj onboard mirrors the multi-stack detection so the first-run experience matches what the rest of the pipeline sees.

Incremental reindex by git diff (KJC-TSK-0455). New vec_store_meta table tracks last_indexed_commit; kj rag index --since <ref> reindexes only files changed since that ref. A post-merge git hook and a pre-run drift check (HEAD != last_indexed_commit) keep the index in sync without manual intervention.

Golden-query harness with recall@k + MRR (KJC-TSK-0483). New kj rag eval runs a curated set of golden queries against the current index and reports recall@k (binary) and MRR (pure mean reciprocal rank) — finally a number that says “did the change to chunker / embedder / hybrid weight make things better or worse?” instead of vibes.
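The two metrics can be sketched as below. The function names, the golden-query shape ({ query, expected }) and the default k of 5 are assumptions; "binary" recall is read here as 1 when any expected id appears in the top k, and 0 otherwise.

```javascript
// results: ranked list of ids returned for one query
// expected: ids that count as relevant for that query
function recallAtK(results, expected, k) {
  const relevant = new Set(expected);
  return results.slice(0, k).some((id) => relevant.has(id)) ? 1 : 0;
}

// 1 / rank of the first relevant result, or 0 when none is returned.
function reciprocalRank(results, expected) {
  const relevant = new Set(expected);
  const idx = results.findIndex((id) => relevant.has(id));
  return idx === -1 ? 0 : 1 / (idx + 1);
}

// Averages both metrics over the golden set; search(query) returns ranked ids.
function evaluate(goldenQueries, search, k = 5) {
  let recall = 0;
  let mrr = 0;
  for (const { query, expected } of goldenQueries) {
    const results = search(query);
    recall += recallAtK(results, expected, k);
    mrr += reciprocalRank(results, expected);
  }
  const n = goldenQueries.length || 1;
  return { recallAtK: recall / n, mrr: mrr / n };
}
```

Recall@k answers "did the right chunk make the cut at all", while MRR also rewards placing it near the top, so the two can move independently after a chunker or weighting change.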

Content-hash dedup + MMR diversification (KJC-TSK-0484). The indexer fingerprints each chunk with sha256 and skips re-embed when the hash matches the stored row; on the retrieve side, an MMR pass (λ=0.5) diversifies the top-k so the LLM receives spread instead of N copies of the same paragraph.
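A sketch of the MMR pass only (the sha256 hash-skip is not shown). It uses the standard greedy formulation, λ · relevance − (1 − λ) · max similarity to already-selected items; the candidate shape ({ id, score, vector }), the higher-is-better score and the cosine similarity over chunk vectors are assumptions about how Karajan represents hits.

```javascript
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

// candidates: [{ id, score, vector }] where score is relevance (higher is better).
function mmr(candidates, k, lambda = 0.5) {
  const pool = [...candidates];
  const selected = [];
  while (selected.length < k && pool.length > 0) {
    let bestIdx = 0;
    let bestValue = -Infinity;
    pool.forEach((c, i) => {
      // Redundancy: similarity to the closest item already selected.
      const redundancy = selected.length
        ? Math.max(...selected.map((s) => cosine(c.vector, s.vector)))
        : 0;
      const value = lambda * c.score - (1 - lambda) * redundancy;
      if (value > bestValue) {
        bestValue = value;
        bestIdx = i;
      }
    });
    selected.push(pool.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```

At λ = 1 the pass degenerates to plain relevance ordering; lowering λ trades relevance for spread, so a near-duplicate of an already-selected chunk loses to a less similar one.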

docs/RAG.md deep-dive expansion (KJC-TSK-0485). The RAG reference doubles in scope: per-stack chunker behaviour, hybrid weighting, eval workflow, hash-skip semantics, MMR tuning, multi-stack quirks.

5 368 / 5 368 tests passing across 482 test files. PRs #882, #883, #884, #885, #886, #888, #889, #890, #891, #892, #893, #894, #895, #896, #898, #899, #900, #901 close epics KJC-PCS-0052 and KJC-PCS-0053 in full.