Local expert chat (Agno)
This site is built with MkDocs and deployed to GitHub Pages. The indexed Flink expert chat is a separate application that runs only on your machine. It is not part of the static site and is not available on the published documentation URL.
Note for GitHub Pages visitors: links like http://localhost:8000 work only when you have started the chat stack locally. Visitors browsing the published docs see this page as setup instructions, not a live chat.
What it does
The km-agent assistant compiles documentation into a wiki, embeds wiki content into pgvector for semantic search, and exposes an AgentOS HTTP API plus a Vue chat UI. Chat runs through the kma team leader (POST /teams/kma/runs), which coordinates Navigator, Researcher, Compiler, and Linter.
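As a rough illustration of the chat entry point, the sketch below builds a POST to /teams/kma/runs with the Python standard library. The form field names (message, stream) and the form-encoded body are assumptions about the AgentOS API, and build_run_request and run_kma are hypothetical helper names; check http://localhost:8000/docs on your running stack for the actual request schema.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # AgentOS API of the local stack


def build_run_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to the kma team leader."""
    # Assumption: the endpoint accepts form fields named "message" and "stream".
    body = urllib.parse.urlencode({"message": message, "stream": "false"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/teams/kma/runs",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def run_kma(message: str, timeout: float = 120.0) -> str:
    """Send the request and return the raw response body as text."""
    # Only works while the local chat stack is running.
    with urllib.request.urlopen(build_run_request(message), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the stack up, run_kma("How do Flink watermarks work?") would return whatever the team leader answers. The response format is not assumed here, so the helper returns the raw body.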
For research or news requests, the team ingests sources to raw/, answers immediately, then compiles and lints the wiki in the background. Set KMA_PARALLEL_API_KEY (Researcher) and KMA_AUTO_COMPILE_AFTER_RESEARCH=1 in assistants/km-agent/.env to enable the full enrichment pipeline.
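A small preflight check can confirm that both settings are present before you start the stack. This is a minimal sketch that assumes the .env file uses plain KEY=VALUE lines. The function names are hypothetical and not part of km-agent.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skips blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def enrichment_problems(env: dict[str, str]) -> list[str]:
    """Return reasons the full enrichment pipeline would not be enabled."""
    problems = []
    if not env.get("KMA_PARALLEL_API_KEY"):
        problems.append("KMA_PARALLEL_API_KEY is missing or empty (Researcher)")
    if env.get("KMA_AUTO_COMPILE_AFTER_RESEARCH") != "1":
        problems.append("KMA_AUTO_COMPILE_AFTER_RESEARCH is not set to 1")
    return problems


def check_env_file(path: str = "assistants/km-agent/.env") -> list[str]:
    """Read the .env file (path relative to the repository root) and check it."""
    return enrichment_problems(parse_env(Path(path).read_text(encoding="utf-8")))
```

An empty list from check_env_file() means both variables are set as described above. It does not validate the API key itself.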
Prerequisites
- Docker Desktop
- OMLX on this Mac or on a remote host (chat only; set KMA_LLM_BASE_URL in assistants/km-agent/.env)
- Embeddings run inside the container via fastembed (KMA_EMBED_PROVIDER=fastembed); no OMLX embed server is needed
- No km-agent repository clone: flink-studies pulls the published image jbcodeforce/km-agent:latest (must include fastembed + index_wiki.py)
Quick start
- Copy and edit the environment file, assistants/km-agent/.env: set KMA_LLM_BASE_URL to your OMLX server (same Mac: http://host.docker.internal:7999/v1, remote: http://<host>:7999/v1). Leave KMA_EMBED_PROVIDER=fastembed unless you intentionally use OMLX for embeddings.
- Start OMLX on the host that runs the chat model.
- From the flink-studies repository root, run the start script to bring up the stack.
After the stack is up you get:
| Service | URL | Purpose |
|---|---|---|
| MkDocs | http://localhost:8003 | This documentation |
| Expert chat UI | http://localhost:8000 | Browser chat (UI served by the km-agent container) |
| AgentOS API | http://localhost:8000/docs | HTTP API |
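To confirm that the three endpoints in the table respond, a quick probe such as the following can help. It is a sketch using only the standard library. The is_up and report helpers are illustrative names, and treating any status below 500 as "up" is an assumption you may want to tighten.

```python
import urllib.error
import urllib.request

# URLs taken from the service table above.
SERVICES = {
    "MkDocs": "http://localhost:8003",
    "Expert chat UI": "http://localhost:8000",
    "AgentOS API": "http://localhost:8000/docs",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """True if the URL answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


def report(services=SERVICES, probe=is_up) -> dict:
    """Map each service name to whether its URL responded."""
    return {name: probe(url) for name, url in services.items()}
```

Calling report() shortly after startup may show False for the chat UI while the container is still loading, so retrying after a few seconds is reasonable.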
Prepare knowledge (compile + embed)
Two phases:
- Compile: build context/wiki/ from flink-studies docs (LLM via OMLX). See the km-agent compile workflow (compile_docs_folder.py with --context pointing at assistants/km-agent/context).
- Embed wiki offline: load wiki markdown into pgvector for the Navigator search_wiki tool (no LLM). Context persists under assistants/km-agent/context/ (bind-mounted into the container).

Re-embed after wiki changes: after a background or manual compile, refresh semantic search by running the embed step again.
Index-first vs embeddings
Wiki retrieval has two modes:
- Index-first (default): Navigator reads wiki/index.md (article catalog with one-line summaries) and pulls full articles via read_file. No embed step required after compile.
- Semantic search (optional): index_wiki.sh embeds chunks into pgvector for the search_wiki tool. Use when the wiki outgrows a single index read or queries do not match index wording.
.state.json tracks compile/lint timestamps and counts — it is not an article index. See km-agent wiki RAG architecture for how Knowledge, Learnings, and Wiki differ.
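To make the trade-off concrete, the sketch below imitates an index-first lookup with plain keyword overlap. It is not how Navigator works internally (Navigator is an LLM agent that reads the index). The index.md line shape "- [Title](path.md): summary" and both function names are assumptions for illustration only.

```python
import re

# Assumed catalog line shape: "- [Title](relative/path.md): one-line summary"
ENTRY = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<path>[^)]+)\)\s*[:\-]?\s*(?P<summary>.*)$"
)


def parse_index(text):
    """Extract title, path, and summary from each catalog line."""
    return [m.groupdict() for line in text.splitlines() if (m := ENTRY.match(line))]


def pick_articles(entries, query, limit=3):
    """Rank entries by how many query words appear in title + summary."""
    words = {w for w in re.findall(r"\w+", query.lower()) if len(w) > 2}
    scored = []
    for entry in entries:
        text = (entry["title"] + " " + entry["summary"]).lower()
        score = len(words & set(re.findall(r"\w+", text)))
        if score:
            scored.append((score, entry["path"]))
    scored.sort(key=lambda pair: -pair[0])  # stable sort: ties keep catalog order
    return [path for _, path in scored[:limit]]
```

A query that shares no words with any title or summary returns an empty list. That is the "queries do not match index wording" case where semantic search over pgvector becomes useful.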
Stop the stack
To stop the containers automatically, set KMA_STOP_DOCKER_ON_EXIT=1 in .env; they are then stopped when the start script exits.