Karpathy shares a workflow for building personal knowledge bases with LLMs: most of his token usage now goes to managing knowledge rather than writing code.

BlockBeatNews

According to monitoring by 1M AI News, OpenAI co-founder Andrej Karpathy shared a recent finding on X: building a personal knowledge base with an LLM is more valuable to him than using it to write code. Most of his token consumption has now shifted from operating on code to operating on knowledge.

The complete workflow has five steps:

  1. Data ingestion: index source documents (articles, papers, code repositories, datasets, images) into the raw/ directory, then have the LLM incrementally “compile” them into a markdown wiki with summaries, backlinks, concept categorization, and interlinked articles
  2. Browsing interface: use Obsidian as the front end to view the raw data, the compiled wiki, and derived visualizations. The wiki content is maintained entirely by the LLM and is rarely edited by hand
  3. Q&A queries: once the wiki reaches a certain scale (one of his research directions already covers about 100 papers and 400,000 words), you can ask the LLM complex questions, and it retrieves the relevant wiki content itself to answer them. He originally expected RAG would be needed, but at this scale the index files and summaries the LLM automatically maintains are already sufficient
  4. Output backflow: generate query results as markdown, Marp slides, or matplotlib charts; after viewing them in Obsidian, archive them back into the wiki so that personal explorations accumulate over time
  5. Quality inspection: use the LLM to periodically perform a “health check” on the wiki—find data inconsistencies, fill in missing information, and uncover cross-concept relationships—incrementally improving data completeness
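Steps 1 and 5 hinge on incremental compilation: only new or changed source files get re-summarized on each run. A minimal sketch of that loop in Python — the raw/ and wiki/ layout comes from the article, but the stub `compile_page` merely stands in for the actual LLM call, and the fingerprint file and all function names are assumptions, not Karpathy's code:

```python
import hashlib
from pathlib import Path

RAW = Path("raw")    # source documents (sketch covers markdown only)
WIKI = Path("wiki")  # LLM-"compiled" wiki pages

def fingerprint(path: Path) -> str:
    """Hash file contents so unchanged sources are skipped on re-runs."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def compile_page(text: str, name: str) -> str:
    """Stand-in for the LLM call: in the real workflow the LLM writes the
    summary, backlinks, and concept tags. Here we emit a trivial stub."""
    first_line = text.strip().splitlines()[0] if text.strip() else ""
    return f"# {name}\n\n> {first_line}\n\n({len(text.split())} words in source)\n"

def compile_wiki() -> list[str]:
    """Incrementally 'compile' raw/ into wiki/: only new or changed files
    are (re)summarized, and an index page links every entry."""
    WIKI.mkdir(exist_ok=True)
    seen_file = WIKI / ".fingerprints"
    seen = (
        dict(line.split("\t") for line in seen_file.read_text().splitlines())
        if seen_file.exists() else {}
    )
    updated = []
    for src in sorted(RAW.glob("*.md")):
        fp = fingerprint(src)
        if seen.get(src.name) != fp:          # new or changed since last run
            (WIKI / src.name).write_text(compile_page(src.read_text(), src.stem))
            seen[src.name] = fp
            updated.append(src.name)
    seen_file.write_text("\n".join(f"{k}\t{v}" for k, v in seen.items()))
    # Rebuild the index page with Obsidian-style [[wikilinks]].
    links = "\n".join(
        f"- [[{p.stem}]]"
        for p in sorted(WIKI.glob("*.md")) if p.name != "index.md"
    )
    (WIKI / "index.md").write_text("# Index\n\n" + links + "\n")
    return updated
```

A second invocation with no changed sources returns an empty list, which is what makes re-running the pipeline after every new paper cheap.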

Karpathy says he has also built a few supporting tools, such as a simple wiki search engine that he can use himself in a web interface or hand to the LLM as a command-line tool for larger queries. He calls the workflow still just a “hodgepodge of scripts” for now, but sees an opportunity for an “incredible new product” underneath it. The farther-reaching vision: for every question posed to a frontier model, dispatch a set of LLMs to automatically build a temporary wiki, run quality inspections, iterate through multiple rounds, and finally output a complete report, “far beyond a single .decode().”
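The search engine he mentions is not public. A minimal sketch of what such a dual-use tool could look like — a plain keyword-frequency ranker over the wiki's markdown, callable as a function or from the command line; the function name, scoring scheme, and directory layout are all assumptions:

```python
import re
import sys
from pathlib import Path

def search_wiki(query: str, wiki_dir: str = "wiki") -> list[tuple[str, int]]:
    """Rank wiki pages by how often the query terms appear (hypothetical
    stand-in for the search tool described in the article)."""
    terms = [t.lower() for t in query.split()]
    hits = []
    for page in Path(wiki_dir).glob("*.md"):
        text = page.read_text().lower()
        score = sum(len(re.findall(re.escape(t), text)) for t in terms)
        if score:
            hits.append((page.name, score))
    return sorted(hits, key=lambda h: -h[1])  # most hits first

if __name__ == "__main__":
    # CLI usage, e.g.:  python search.py transformer attention
    for name, score in search_wiki(" ".join(sys.argv[1:])):
        print(f"{score:4d}  {name}")
```

The same entry point serving both a human at a terminal and an LLM invoking it as a tool is the design point the article highlights: the LLM can shell out to it for “larger queries” that exceed what fits in context.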
