Context Lake

The collection of everything your AI knows about you, your operation, and your world.


What It Is

Your context lake is the structured collection of markdown files that your Personal Agentic OS draws on to represent you, think alongside you, and act on your behalf. It is the persistent memory layer that makes the difference between a generic chatbot and a deeply contextualized AI chief of staff.

The term borrows from "data lake" in enterprise software (a centralized repository of raw and structured data) and applies it at personal and organizational scale. Your context lake contains:

  • Who you are. Your user profile, values, decision-making style, communication preferences, voice.
  • Who you know. Relationship files for the people in your life and business. How you met, what you are working on together, what you want to remember.
  • What you have decided. Strategic documents, decision records, plans, principles. The institutional memory of your operation.
  • What has happened. Meeting transcripts, conversation summaries, interaction logs. The raw material of your professional life.
  • How you operate. Skill files, SOPs, workflows. The repeatable processes that define how work gets done.

All of it is plain text. All of it is version-controlled. All of it is yours. No platform owns it. No subscription locks it away. If you switch AI tools tomorrow, your context lake comes with you.

Why It Matters

An AI without your context lake is just a language model. It can write, it can analyze, it can generate ideas. But it does not know you. It does not know your operation. It does not know the people you work with, the decisions you have already made, or the principles you operate by. Every conversation starts from zero.

An AI with your context lake becomes a true Personal Agentic OS (think Tony Stark's Jarvis, but for your real life). It knows the truth about your situation. It can generate briefings drawn from your actual relationships. It can draft communications in your actual voice. It can catch inconsistencies in your strategy because it has read every strategic document you have ever written. It compounds over time because every brain dump, every meeting transcript, every decision record adds to the lake.

The quality of your AI's output depends directly on the quality of your context lake. This is context engineering at the personal level: curating the right information so your AI can do the right work.
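To make "curating the right information" concrete, here is a minimal sketch of picking files from a context lake for a given task and placing them ahead of the request. Everything in it is an assumption for illustration: the function names `select_context` and `build_prompt` are hypothetical, and the keyword-overlap scoring is a naive stand-in for whatever retrieval your own setup uses.

```python
from pathlib import Path


def select_context(lake_root: Path, task: str, limit: int = 3) -> list[Path]:
    """Rank markdown files by how many task words they contain (naive scoring)."""
    # Ignore very short words so "a", "to", "the" do not match everything.
    words = {w.lower() for w in task.split() if len(w) > 3}
    scored = []
    for path in sorted(lake_root.rglob("*.md")):
        text = path.read_text(encoding="utf-8").lower()
        score = sum(1 for w in words if w in text)
        if score > 0:
            scored.append((score, path))
    # Highest score first; ties are broken by path so the order is stable.
    scored.sort(key=lambda item: (-item[0], str(item[1])))
    return [path for _, path in scored[:limit]]


def build_prompt(lake_root: Path, task: str) -> str:
    """Concatenate the selected files, then append the task itself."""
    sections = []
    for path in select_context(lake_root, task):
        rel = path.relative_to(lake_root).as_posix()
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {rel}\n{body}")
    sections.append(f"## Task\n{task}")
    return "\n\n".join(sections)
```

The resulting string can be handed to any model, which is the point: the selection logic and the files stay with you, and only the final prompt goes to the platform.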

The Compounding Effect

A context lake on day one is thin. A few files. A user profile. A handful of relationship entries.

A context lake on day ninety is a different thing entirely. Dozens of relationships with rich history. Strategic documents that capture the evolution of your thinking. Meeting transcripts that preserve exact language and commitments. Skill files that encode your workflows. The AI's briefings get noticeably better. Its drafts require less editing. Its suggestions get more relevant.

A context lake after a year is a genuine competitive advantage. No new hire, no consultant, no advisor can match the depth of context your Personal Agentic OS has accumulated. It is the closest thing to a perfect institutional memory that has ever existed.

This is what compounding docs looks like in practice. Every file you add makes every other file more useful. The whole is greater than the sum of its parts.

Context Lake vs. Chat History

Most AI platforms store "conversation history," which is the accumulated text of your chat sessions. This is not a context lake. Chat history is:

  • Unstructured. It is a chronological stream of messages, not organized by topic or purpose.
  • Locked in. It lives on the platform's servers. You cannot export it meaningfully, version-control it, or run a different AI on top of it.
  • Decaying. Most platforms have context window limits. Old conversations fall out of memory. Your most important strategic thinking from six months ago? Gone.
  • Platform-dependent. If you switch from ChatGPT to Claude to Gemini, you start over each time.

A context lake is the opposite. It is structured, portable, permanent, and platform-independent. It is the sovereign alternative to letting platforms own your context.

Building Your Context Lake

The MVP Personal Agentic OS tutorial walks you through building your first context lake from scratch. The default structure is simple:

  • user/ for your profile and voice
  • people/ for relationship files
  • artifacts/ for strategic documents and decision records
  • meeting-transcripts/ for conversation records
  • skills/ for repeatable workflows

Start there. The structure will evolve as your needs become clearer. The important thing is to start capturing the truth about your operation in files that AI can read and act on.
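As a minimal sketch, the default structure above could be scaffolded with a few lines of Python. The folder names come from the list; the `scaffold_lake` function, the `README.md` placeholders, and their wording are assumptions added for illustration, not part of the tutorial.

```python
from pathlib import Path

# Folder names follow the default structure listed above.
# The one-line descriptions are illustrative placeholder text.
DEFAULT_LAYOUT = {
    "user": "Your profile and voice",
    "people": "Relationship files",
    "artifacts": "Strategic documents and decision records",
    "meeting-transcripts": "Conversation records",
    "skills": "Repeatable workflows",
}


def scaffold_lake(root: Path) -> list[Path]:
    """Create the default folders, each with a short README.md, and return them."""
    created = []
    for name, purpose in DEFAULT_LAYOUT.items():
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        if not readme.exists():  # never overwrite notes that already exist
            readme.write_text(f"# {name}\n\n{purpose}.\n", encoding="utf-8")
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in scaffold_lake(Path("context-lake")):
        print(folder)
```

Running `git init` inside the new folder afterwards is one way to get the version control described earlier. The function is safe to re-run, since existing files are left alone.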

Scaling the Lake

A context lake starts personal but can scale to teams and organizations:

  • Personal context lake: Your Personal Agentic OS workspace. Everything about you and your operation.
  • Shared context lake: An agentic project OS where a team collaborates from the same source of truth, eliminating lossy AI telephone.
  • Organizational context lake: The "company bible." The living, version-controlled single source of truth for how the organization works.
  • Federated context lakes: The Hypercontext Protocol vision, where trusted agents query each other's context lakes directly.

Each level builds on the one before it. Start with yours.

The World Is Catching Up

In April 2026, Andrej Karpathy (founding member of OpenAI, former head of AI at Tesla, one of the most respected researchers in the field) described his workflow: indexing source documents into a raw directory, using an LLM to compile a wiki of markdown files, running Q&A against the accumulated knowledge, and filing outputs back into the wiki to enhance it for further queries. He noted: "You rarely ever write or edit the wiki manually, it's the domain of the LLM."

This is a context lake. The data ingestion is the capture phase. The LLM-compiled wiki is the process phase. The Q&A and filed outputs are the compound phase. The health checks he also describes running over the wiki are truth management. And because every filed output improves later answers, the whole system compounds over time, which is compounding docs in practice.
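The capture, process, and compound phases can be sketched as three small functions over two folders. This is an illustration under stated assumptions, not Karpathy's code or the tutorial's implementation: the function names, the `raw/` and `wiki/` layout, and the prompt strings are hypothetical, and the LLM is modeled as a plain function from prompt to text so that any provider (or a test stub) can be plugged in. Health checks are left out for brevity.

```python
from pathlib import Path
from typing import Callable

# Assumption: an LLM call is just "prompt in, text out".
LLM = Callable[[str], str]


def capture(raw_dir: Path, name: str, text: str) -> Path:
    """Capture phase: drop a source document into the raw directory."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / f"{name}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def process(raw_dir: Path, wiki_dir: Path, llm: LLM) -> list[Path]:
    """Process phase: have the LLM compile each raw document into a wiki page."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    pages = []
    for src in sorted(raw_dir.glob("*.txt")):
        source_text = src.read_text(encoding="utf-8")
        page = wiki_dir / f"{src.stem}.md"
        page.write_text(llm(f"Summarize as a wiki page:\n{source_text}"), encoding="utf-8")
        pages.append(page)
    return pages


def ask(wiki_dir: Path, question: str, llm: LLM) -> str:
    """Q&A against the wiki. For simplicity, every page is passed as context."""
    pages = sorted(wiki_dir.glob("*.md"))
    context = "\n\n".join(p.read_text(encoding="utf-8") for p in pages)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def compound(wiki_dir: Path, slug: str, answer: str) -> Path:
    """Compound phase: file the answer back so later queries can draw on it."""
    page = wiki_dir / f"{slug}.md"
    page.write_text(answer, encoding="utf-8")
    return page
```

Each call to `compound` grows the wiki that the next `ask` reads, which is where the compounding comes from. A real setup would select relevant pages rather than passing all of them.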

The Applied AI Society has been teaching this architecture since early 2026 through the MVP Personal Agentic OS tutorial. The fact that one of the world's leading AI researchers independently arrived at the same pattern is strong validation that this is not a niche workflow. It is the future of how humans organize knowledge and work with AI.


Further Reading