Single-source reference for the patterns and rules everything in the mdbook homelab follows. If you’re adding a new service, host, or repo, match these. If something here is wrong or stale, fix it here first.

Scope: this book is for conventions — the “how we do it” patterns. Inventory (what hardware, what IPs, what services) lives in Hardware & Hosts and Network. Procedures (how to recover X, how to onboard Y) live in Runbooks.


Domain Map

DomainRole
mdbook.mePublic-facing services. Apps people (or external clients) hit. Examples: gitlab.mdbook.me, auth.mdbook.me, chat.mdbook.me, directus.mdbook.me, metamcp.mdbook.me, go.mdbook.me, registry.mdbook.me.
mdbook.oneInternal host-level naming + per-host wildcards. Hostnames are <host>.mdbook.one (e.g. node0.mdbook.one, cube1.mdbook.one, core1.mdbook.one).
forestInternal LAN search-domain shorthand. Synology NAS hostname literally is forest (FQDN: forest.mdbook.one, DNS-only).
MDBKActive Directory NetBIOS / domain shortname.

Rule of thumb: if a human or external client needs to reach it, use mdbook.me. If it’s host-internal or part of the per-host stack, use mdbook.one.

Artist/namespace identity: Mdbook is also Mik’s Spotify artist name; the homelab and the music project share the brand.


Hostname & Cert Patterns

Hostname

  • Per-host FQDN: <hostname>.mdbook.one
  • Per-host wildcard: *.<hostname>.mdbook.one (e.g. *.node0.mdbook.onevllm.node0.mdbook.one, infinity.node0.mdbook.one)
  • Metrics endpoints: metrics.<hostname>.mdbook.one — Prometheus scrape pattern

Cert management (lego)

  • mikayla/lego (“Lego Autorenew”) runs as registry.mdbook.me/mikayla/lego in daemon mode (24h interval + jitter), Let’s Encrypt via Cloudflare DNS-01.
  • Manages wildcards for: *.auth, *.node0, *.cloud, *.media, *.metrics, *.vpn, *.proxy, *.hub, *.dev, *.transcode, *.ai, *.bots — all under mdbook.one.
  • Plus standard certs: auth.mdbook.one, minecraft.mdbook.one.
  • New service on a new subdomain wildcard? Add the wildcard to lego config first; let it issue before bringing the service up. Trying to do it the other way produces ugly cert-pending failures.

Universal stack (every host)

Deployed on every homelab box:

  • Caddy node_proxy listening on :29443, using _.${HOSTNAME}.mdbook.one wildcard cert.
  • node_exporter for Prometheus scrapes.

This is what makes “spin up a service on any host and route to it” work without per-host bespoke setup.


Stack Layout

The Docker monorepo’s top-level folders are purpose-named, and each host owns one or more of these stacks. New services belong to exactly one stack based on purpose, not on who happens to have spare capacity.

StackPurpose
authSSO/OIDC. Authentik (postgres + server + worker).
botsAgent/bot/MCP control plane. MetaMCP + Postgres, mcp-auth-proxy, Hindsight, MCP server fleet, Discord/Plex bots, frontline-automator, git-mirror, Authentik proxy outpost.
cloudSelf-service apps: GitLab, n8n, BookStack, Directus, Homebridge, MagicMirror, Homepage, Apprise, Erin (TikTok archive), speedtest-tracker.
hubEdge router glue. Caddy + a custom Flask validator endpoint. Small and central.
mediaThe *arr suite + qBittorrent + Gluetun (VPN egress for torrent/arr only) + Plex-adjacent libraries (Calibre, SyncLounge, Wizarr, Owncast, etc).
metricsPrometheus, Grafana Enterprise, full LibreNMS suite (db, redis, dispatcher, syslog-ng, snmptrapd, msmtpd), nut-exporter.
node0The AI/GPU stack — only stack with GPU access. vLLM, Infinity (embedder + reranker), LibreChat (Mongo + Meilisearch), SearXNG, pgvector, code-sandbox MCP gateway, AnythingLLM, tdarr-node.
proxyPublic edge: Caddy, tinyproxy (port 8888 for internal HTTP egress), Shlink, neko, directus-viewer.
transcodeMedia servers and transcoding: Plex (host network mode + /dev/dri), Jellyfin, FileFlows, Tdarr.
universalThe per-host base layer (Caddy node_proxy + node_exporter). Deployed everywhere.
vpnWireGuard / VPN egress glue.

Rule: a service belongs to one stack. If it sprawls across two, something’s wrong with the boundary. If a brand-new service doesn’t fit any existing stack, that’s a real design conversation, not a “stick it in bots for now” thing.


Docker Monorepo Layout

Repo: mikayla/docker on gitlab.mdbook.me.

  • Top-level folders are one per stack, named as in the table above.
  • Each host bind-mounts /docker/${HOSTNAME} at /compose.
  • Rollout: sudo /docker/update.sh (passwordless sudo on every box — interim until rootless migration).
  • Single-service cycle: sudo /docker/update.sh --image=<name> or comma-separated --image=foo,bar.
  • Agents are authorized to invoke update.sh themselves; see Agent Authorization.

Tooling Rules

RuleNotes
Docker over systemdAll services run as Docker containers in the monorepo, not systemd units.
vim over nanoAll in-place edits use vim. File-edit instructions to humans or agents should be vim commands/motions.
Secrets are placeholders onlyNever share real tokens/keys/passwords with LLMs. Use <PLACEHOLDER> style in any committed config; the user fills in real values out-of-band.
tmux over nohupUse named tmux sessions with detach/reattach, not background processes.
Bots-box shell callsWrite raw commands without the ssh bots prefix when targeting the bots stack — the SSH wrapping is handled by tooling.
Caddy over nginxAll HTTP routing goes through Caddy (universal stack node_proxy + edge proxy/hub stacks). No nginx anywhere.

Source Control & CI

GitLab

  • Instance: gitlab.mdbook.me (self-hosted, primary git remote for personal projects).
  • Username: mikayla.
  • Project slugs: mikayla/<repo> (e.g. mikayla/docker, mikayla/nexus-drift, mikayla/lego).
  • Preferred MCP tool: gitlab-zereight.
  • Always pass membership: true when listing projects via MCP — otherwise it won’t surface all the repos.

Known bug — UID 0 file ownership

  • New repo creation hits a UID 0 file ownership bug across mirrored repos → file permission issues.
  • Workaround: run fixgit.sh after creation to correct permissions.

Mirrors

  • GitHub→GitLab mirroring via bachp/git-mirror.
  • Central CI config lives in mikayla/scripts, not in mirrored repos. Mirrors get overwritten by upstream, so the CI definition would be clobbered.

Image registry

  • Private registry: registry.mdbook.me.
  • Image naming: registry.mdbook.me/mikayla/<name>:latest for custom homelab images.
  • CI builds push here on a weekly cadence (lego is the reference example: daemon mode, 24h + jitter).

Repo scaffolding pattern

New non-trivial repos follow the nexus-drift model:

  • AGENTS.md — invariants and gates that agents (Claude Code, etc.) must follow when working in the repo.
  • handoff.md — per-session state, project-specific quirks, what’s blocked on what.
  • Per-project invariants live in AGENTS.md (e.g. per-side gates and project-specific API rules), not in scattered comments.

Storage & Backups

LayerWhereNotes
Primary bulk storagecore1.mdbook.one (TrueNAS)10GbE SFP+ fiber to backbone. Hosts NFS exports including /mnt/plex/ for media servers.
Backup tierforest.mdbook.one (Synology RS2416RP+)RAID5 arrays (md2: 4×22TB, md4: 4×4TB) + RAID1 SSD cache. Bonded 4×1GbE LACP.
Hot/working storagePer-host local disksUse for caches, working data, container volumes. Don’t put irreplaceable state here.
Service data/docker/<stack>/... bind mountsLives on the host running the stack. Backups handled via the standard backup job.

Convention: irreplaceable data lives on core1; backups land on forest. Host-local disk is scratch / hot path. If a service needs persistence that survives the host, it gets an NFS mount from core1.


LLM Stack

ComponentDetail
InferencevLLM on node0 (RTX 3090, 24GB) at https://vllm.node0.mdbook.one/v1.
ModelGemma 4 26B MoE (current primary).
Required vLLM arg--served-model-name vLLM-Mainalways. This is the stable name LibreChat targets; without it, model swaps require LibreChat reconfig.
Chat UILibreChat at chat.mdbook.me (Mongo + Meilisearch backed).
Embeddings / rerankingInfinity at infinity.node0.mdbook.one (BAAI/bge-large-en-v1.5 + bge-reranker-large).
Agent runtime (planned)mikayla/agents — FastAPI service on the bots stack, LangGraph + Pydantic AI + self-hosted LangSmith.
DeprecatedOpen-WebUI (replaced by LibreChat). Ollama (replaced by vLLM, configs commented out, not deleted). Qwen3-27B and Qwen3.5-9B AWQ were tested as primary candidates and ruled out.

LibreChat share access (agent gotcha)

  • Hit /api/share/{id} via curl (or bash + curl), not web_fetch.
  • web_fetch returns the SPA shell, not the share content.
  • HTTP 503 "DNS cache overflow" is a transient sandbox proxy error — retry 1–3× before assuming failure.

GPU bottleneck

  • node0 is the only GPU host (single RTX 3090). All GPU workloads (vLLM, Infinity, tdarr-node) concentrate there.
  • node0 is on 1GbE copper, which makes it the network bottleneck for cross-host AI traffic. Design around it: keep AI clients close to node0 or accept the wire.

MCP Integration Patterns

PieceDetail
AggregatorMetaMCP at metamcp.mdbook.me — functions as LLM tool router.
Catchall endpointhttps://metamcp.mdbook.me/metamcp/catchall/mcp
Group endpointshttps://metamcp.mdbook.me/metamcp/<group>/mcp (e.g. gitlab, google-workspace, hindsight, proxmox)
Authsigbit/mcp-auth-proxy in front, OIDC via Authentik (auth.mdbook.me).
Deployed serversgrafana-mcp, zereight-gitlab-mcp, proxmox-mcp, unifi-mcp, google-workspace-mcp, n8n-mcp, hindsight, mcp-directus-uploader.

OIDC / mcp-auth-proxy gotchas (resolved, document so they don’t bite again)

  • Signing key must be EC, not RSA. RSA keys are incompatible with the proxy’s JWT flow.
  • OIDC_ALLOWED_USERS takes the Authentik username, not the email address. Email looks right and silently fails.
  • DCR (Dynamic Client Registration) flow conflicts with Claude’s Advanced Settings fields. If client config is being filled by the Claude UI, DCR is redundant and breaks the handshake.
  • Redirect URI mismatches are the most common silent failure. Verify the exact URI registered in Authentik matches what the client sends, scheme and trailing slash included.

Building new MCP servers

  • Python MCP servers go on the bots stack, behind MetaMCP, behind mcp-auth-proxy.
  • Reference implementation: mikayla/mcp-directus-uploader (Python, FastAPI-style, exposes upload_b64 tool).
  • New servers register with MetaMCP via the catchall or a dedicated group; group routing makes the tool list shorter and lets per-purpose auth scope.

Memory / Hindsight Conventions

Hindsight is the long-term memory bank. The same bank backs LibreChat, Claude.ai, and ChatGPT, so a fact written by one is visible to all.

Retention bar

Will this still matter in three months?

If yes → retain. If no → it’s ephemeral scratch, skip retention.

Tag requirements

  • source:claudemandatory on every retain.
  • domain:<area> — required when obvious. Common values: homelab, code, family, finance, health, work, media.
  • Topic tags (topic:<thing>) — optional, but they make recall queries cheaper. Add them liberally.

What to retain

  • Durable preferences, decisions, identity facts.
  • Architectural decisions and their rationale.
  • Resolved gotchas (so they don’t get re-discovered).
  • Recurring patterns and conventions.

What NOT to retain

  • Single-task scratch state.
  • Verbatim commands.
  • Secrets, tokens, API keys (ever).
  • Inflight debug guesses that didn’t pan out.

Deletion gap (as of 2026-05-14)

Hindsight MCP exposes retain, recall, reflect, list_*, get_*, and clear_memories (wipe-all, optionally fact_type filtered). There is no per-memory or per-document delete tool. When a topic is deprecated (e.g. layermind → bambuddy), use the tombstone pattern: retain a single strongly-tagged superseding memory that says “topic X is dead, disregard prior memories tagged topic:X”. For bulk surgical deletion, hit Postgres directly.


Agent Authorization

What automated agents (Claude Code, Claude.ai, ChatGPT, in-house LangGraph agents) are pre-authorized to do without asking:

AllowedNotes
Read any repo on gitlab.mdbook.meUse gitlab-zereight MCP with membership: true.
Read Hindsight memoriesRecall liberally; overhead is minimal.
Write Hindsight memoriesSubject to retention bar + tag requirements above.
sudo /docker/update.sh --image=<name>For config rollouts and single-service cycles, including comma-separated batches.
Read BookStack (this wiki)Encouraged. Write/update with judgment.
Read Grafana / metricsAll dashboards under datasource UID ae5va6kn8th4we.

What requires explicit human consent:

  • Anything that modifies external services (sending messages, making purchases, posting publicly).
  • Anything irreversible (deleting repos, dropping databases, force-pushing to main).
  • Anything involving real secrets.
  • Merges to main on any production repo.

Deprecation Discipline

When something is being replaced:

  1. Comment out, don’t delete. Old config stays in the repo, commented, with a one-line note explaining the swap and date. This makes the “what happened to X” question trivially answerable from history.
  2. Document the swap in the relevant section of this page (see the LLM Stack “Deprecated” row for the pattern).
  3. Retain a Hindsight memory describing the swap, the rationale, and the date. Tag source:claude domain:homelab topic:deprecation.
  4. Don’t fully remove until the replacement has been running clean for at least a few weeks. Then the comments and config can be deleted in a separate commit clearly marked as cleanup.

Reference examples:

  • Ollama → vLLM (configs commented, not deleted).
  • Open-WebUI → LibreChat.
  • MCPJungle → MetaMCP (MR !1 merged in mikayla/docker).
  • layermind → bambuddy (2026-05-14, FOSS adoption — maziggy/bambuddy already covered all planned scope plus more; mikayla/layermind repo deleted).

Quick Reference: where stuff lives

WhatWhere
Docker monorepogitlab.mdbook.me/mikayla/docker
Central CI scriptsgitlab.mdbook.me/mikayla/scripts
Docker registryregistry.mdbook.me
Self-hosted GitLabgitlab.mdbook.me
Authentik (SSO)auth.mdbook.me
LibreChatchat.mdbook.me
vLLMvllm.node0.mdbook.one/v1 (model name vLLM-Main)
Infinity (embed/rerank)infinity.node0.mdbook.one
MetaMCPmetamcp.mdbook.me
Grafana / metricsmetrics.<hostname>.mdbook.one
URL shortenergo.mdbook.me
Directusdirectus.mdbook.me
BookStack (this)docs.cloud.mdbook.one
Primary storagecore1.mdbook.one (TrueNAS)
Backupsforest.mdbook.one (Synology)