LLMpedia: A Transparent Framework to Materialize an LLM's Encyclopedic Knowledge at Scale

Benchmarks like MMLU suggest that flagship language models are approaching factuality saturation, with scores above 90%. LLMpedia shows that this picture is incomplete. We materialize approximately 1.3M encyclopedia articles entirely from parametric memory across three model families, then audit every claim against Wikipedia and curated web evidence. For gpt-5-mini, the verifiable true rate is 68.4% on Wikipedia-covered subjects, more than 21 pp below MMLU, and the gap is driven by unverifiability (30.5%), not refutation (1.2%). Beyond Wikipedia, frontier articles audited against curated web evidence reach 57.6%; Wikipedia covers only 56.7% of model-surfaced subjects, and the three model families overlap in just 7.3% of subject choices. In a retrieval-trap benchmark inspired by prior analysis of Grokipedia, LLMpedia is more factual at roughly half the textual similarity to Wikipedia. Every prompt, article, and verdict is released. Data, code, and interface: https://llmpedia.net.