The LLM Wiki Format
WikiHub is built to host LLM wikis — structured markdown knowledge bases maintained by an AI, not just queried by one. The format was popularized by Andrej Karpathy's April 2026 gist and has become the standard pattern for AI-maintained knowledge bases.
The core insight: the LLM is the programmer, Obsidian is the IDE, and the wiki is the codebase. Ingesting a single source can ripple-update 10-15 wiki pages — summaries, entity pages, cross-references, contradiction flags — all maintained automatically.
Three Layers
An LLM wiki has three layers:
| Layer | Directory | Who owns it |
|---|---|---|
| Raw sources | raw/ | Human (immutable) |
| Wiki pages | wiki/ | LLM (generated) |
| Schema | CLAUDE.md | Human + LLM (co-evolved) |
Raw Sources (raw/)
Immutable input material — articles, PDFs, transcripts, images. The LLM reads these but never modifies them.
raw/
├── articles/
├── papers/
└── assets/ # Downloaded images
Wiki Pages (wiki/)
LLM-generated markdown pages. The LLM owns this layer entirely — it creates, updates, and cross-references pages as sources are ingested.
wiki/
├── entities/ # People, orgs, products
├── concepts/ # Ideas, frameworks, patterns
├── sources/ # Source summaries
├── index.md # Content catalog
└── log.md # Append-only activity log
Schema (CLAUDE.md)
The configuration file that tells the LLM how to structure, ingest, format, and cross-reference the wiki. It defines conventions, domain-specific rules, and the wiki's personality. The human and the LLM co-evolve this file over time: the human sets the direction, and the LLM follows it.
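The three-layer layout can be created with a few lines of Python. This is a minimal sketch, not a WikiHub tool: `scaffold_wiki` is a hypothetical helper, and the seed text written into index.md, log.md, and CLAUDE.md is a placeholder assumption. The directory names come from the trees shown in this section.

```python
from pathlib import Path

# Directory names follow the layout described in this section.
RAW_DIRS = ["raw/articles", "raw/papers", "raw/assets"]
WIKI_DIRS = ["wiki/entities", "wiki/concepts", "wiki/sources"]

# Placeholder seed text (an assumption, not a required format).
SEED_FILES = {
    "wiki/index.md": "# Index\n",
    "wiki/log.md": "# Log\n",
    "CLAUDE.md": "# Schema\n",
}


def scaffold_wiki(root: Path) -> list[Path]:
    """Create the three-layer layout under root and return the paths created."""
    created = []
    for rel in RAW_DIRS + WIKI_DIRS:
        path = root / rel
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    for rel, text in SEED_FILES.items():
        path = root / rel
        if not path.exists():  # never overwrite an existing file
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(text, encoding="utf-8")
            created.append(path)
    return created
```

Running the function a second time creates nothing, so it is safe to call against a repo that already holds content.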
Frontmatter
Every wiki page carries YAML frontmatter that helps with discovery, confidence tracking, and surfacing contradictions:
title: "Page Title"
tags: [tag1, tag2]
date: YYYY-MM-DD
source_count: N
status: active | superseded | archived
confidence: high | medium | speculative
contradictions:
- "Source A says X, Source B says Y"
open_questions:
- "Unanswered question"
The confidence field is especially important — it distinguishes well-sourced claims from early speculation, letting readers (and the LLM itself) calibrate trust.
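Because status and confidence take a fixed set of values, frontmatter is easy to check mechanically. The sketch below validates frontmatter that has already been parsed into a dict (for example by a YAML library). `validate_frontmatter` is a hypothetical helper, and the choice of which fields are required is an assumption: contradictions and open_questions are treated as optional here.

```python
ALLOWED_STATUS = {"active", "superseded", "archived"}
ALLOWED_CONFIDENCE = {"high", "medium", "speculative"}

# Assumption: these six fields are mandatory; the list fields are optional.
REQUIRED_FIELDS = ("title", "tags", "date", "source_count", "status", "confidence")


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in meta:
            problems.append(f"missing field: {field}")
    if "status" in meta and meta["status"] not in ALLOWED_STATUS:
        problems.append(f"invalid status: {meta['status']!r}")
    if "confidence" in meta and meta["confidence"] not in ALLOWED_CONFIDENCE:
        problems.append(f"invalid confidence: {meta['confidence']!r}")
    count = meta.get("source_count")
    if count is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(count, bool) or not isinstance(count, int) or count < 0:
            problems.append("source_count must be a non-negative integer")
    return problems
```

A check like this fits naturally into the lint operation, where a page with an unknown confidence value can be flagged before it misleads a later query.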
Three Operations
An LLM wiki supports three core operations:
Ingest
Fetch a source into raw/, then ripple-update all affected wiki pages. A single ingest may touch a dozen pages — creating new entity pages, updating summaries, adding cross-references, and flagging new contradictions. The index and log are updated automatically.
Query
Search the wiki and answer questions with citations. Because the wiki is structured markdown with frontmatter, the LLM can ground each answer in specific pages and their confidence levels instead of answering from memory.
Lint
Maintenance pass: fix broken links, flag contradictions, find orphan pages, and mark stale content. Think of it as the wiki's CI pipeline.
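Two of the lint checks, broken links and orphan pages, need nothing more than a scan of the wikilinks. The following is a rough sketch under several assumptions: `lint_links` is a hypothetical helper, a link `[[Page Name]]` is resolved to a file named `Page Name.md` anywhere under the wiki directory, and links coming from index.md or log.md do not count as inbound links (otherwise the index would hide every orphan).

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]], and [[Target#Section]].
# The lookbehind skips embeds of the form ![[file]].
WIKILINK = re.compile(r"(?<!!)\[\[([^\]|#]+)")

# Assumption: catalog and log pages are neither orphans nor link sources.
HUB_PAGES = {"index", "log"}


def lint_links(wiki_dir: Path) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (broken, orphans): broken is a list of (page, missing target)."""
    # Pages are keyed by file stem; duplicate stems in different folders collide.
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki_dir.rglob("*.md")}
    broken = []
    linked = set()
    for name, text in sorted(pages.items()):
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append((name, target))
            elif name not in HUB_PAGES and target != name:
                linked.add(target)
    orphans = sorted(set(pages) - linked - HUB_PAGES)
    return broken, orphans
```

Contradiction flagging and staleness checks need the LLM's judgment, so they are left out of this sketch.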
Three Key Files
Every LLM wiki has three essential files:
index.md — Content Catalog
A structured listing of every wiki page, organized by category, each with a one-line summary. Updated on every ingest. This is the table of contents for both humans and LLMs.
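As an illustration, the catalog can be regenerated from a mapping of categories to pages. This is a sketch only: `build_index` is a hypothetical helper, and the exact layout (one heading per category, one wikilink bullet per page) is an assumption rather than a format the wiki requires.

```python
def build_index(pages_by_category: dict[str, list[tuple[str, str]]]) -> str:
    """Render index text from {category: [(page title, one-line summary), ...]}."""
    lines = ["# Index", ""]
    for category in sorted(pages_by_category):
        lines.append(f"## {category}")
        for title, summary in sorted(pages_by_category[category]):
            lines.append(f"- [[{title}]]: {summary}")
        lines.append("")  # blank line between categories
    return "\n".join(lines)
```

Sorting categories and titles keeps the output stable, so a git diff after an ingest shows only the entries that actually changed.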
log.md — Activity Log
Append-only chronological record. Every ingest, query, and lint gets an entry:
## [2026-04-10] ingest | "Attention Is All You Need"
Added source summary. Updated pages: Transformers, Self-Attention, Google Brain.
New entity page: Ashish Vaswani.
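An entry in this shape can be written by opening the file in append mode, which leaves earlier entries untouched. A minimal sketch, assuming a hypothetical helper named `append_log_entry` and the header format shown in the example above:

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_log_entry(
    log_path: Path,
    operation: str,
    subject: str,
    details: str,
    when: Optional[date] = None,
) -> str:
    """Append one entry to the log and return the text that was written."""
    when = when or date.today()
    entry = f'## [{when.isoformat()}] {operation} | "{subject}"\n{details.strip()}\n'
    # Mode "a" only ever adds to the end of the file.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry + "\n")
    return entry
```

For example, `append_log_entry(Path("wiki/log.md"), "lint", "weekly pass", "Fixed 2 broken links.")` adds a dated lint entry after whatever the log already contains.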
CLAUDE.md — Schema
The instruction set for the LLM. Defines directory structure, naming conventions, frontmatter fields, cross-reference rules, and domain-specific behavior. Lives at the repo root so any LLM agent can find it.
Page Content Structure
Each wiki page follows a consistent structure:
- Summary — what this page is about, in 2-3 sentences
- Key Claims — falsifiable, citable assertions drawn from sources
- Connections — wikilinks to related pages (symmetric: if A links to B, B links back to A)
- Contradictions — conflicts between sources, pre-flagged so readers see disagreements upfront
This structure makes every page useful both to human readers and to LLMs doing follow-up queries.
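The symmetry rule for Connections is simple to verify once the links are extracted. The sketch below assumes the link graph is already available as a mapping from page name to the set of pages it links to; `missing_backlinks` is a hypothetical helper name.

```python
def missing_backlinks(links: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where target does not link back to source."""
    missing = []
    for source in sorted(links):
        for target in sorted(links[source]):
            # A page with no entry in the mapping is treated as having no links.
            if source != target and source not in links.get(target, set()):
                missing.append((source, target))
    return missing
```

Each returned pair names a page that needs a reciprocal link, which gives the LLM a concrete to-do list during a lint pass.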
How WikiHub Maps to This Format
WikiHub is the hosting platform for LLM wikis — "GitHub for LLM wikis." Here is how the pieces connect:
- Git-backed storage — every wiki is a bare git repo, so you get versioning, diffs, and the full history of how knowledge evolved
- Markdown rendering — WikiHub renders your wiki pages with wikilinks ([[Page Name]]), KaTeX math, footnotes, and Obsidian-style embeds
- API + MCP access — LLM agents can read and write pages via the REST API or MCP server, making automated ingest and lint operations straightforward
- Frontmatter support — visibility, tags, and custom fields in frontmatter are preserved and used for access control and discovery
- Public + private pages — raw sources can be kept private while wiki pages are public, matching the three-layer architecture naturally
When you create a wiki on WikiHub, you are creating a git repo ready to hold this structure. Your LLM agent pushes pages via the API or git, and WikiHub handles rendering, search, and discovery.
Further Reading
- Karpathy's original gist — the canonical description of the format
- Getting Started — create your first WikiHub wiki
- Agent Integration — connect your LLM agent to WikiHub
- API Reference — full endpoint reference for programmatic access