Celebrity-Backed AI Memory Project MemPalace Launches to Viral Success, But Benchmark Scores Found to Be Fabricated
Key Takeaways
- MemPalace claimed impossible benchmark scores: 100% on LoCoMo and a perfect score on LongMemEval, numbers that the datasets' documented errors rule out mathematically
- The perfect scores resulted from a structural bypass: top_k=50 exceeded the maximum conversation length (19-32 sessions), so the ground-truth answer was always in the candidate pool and the task became simple reading comprehension rather than genuine memory retrieval
- The project's own internal documentation honestly disclosed these limitations, but the celebrity-attributed launch marketing stripped all caveats, allowing methodologically flawed results to draw 1.5 million views in 24 hours
Summary
An open-source AI memory system called MemPalace, attributed to actress Milla Jovovich as co-author, launched on April 6, 2026, and went viral, drawing more than 1.5 million views and 5,400 GitHub stars within 24 hours. An independent analysis by Penfield, however, found the project's headline benchmark claims to be mathematically impossible and methodologically flawed. The MemPalace team claimed a perfect 100% score on the LoCoMo benchmark and the first perfect score on LongMemEval. According to the analysis, those results came from a structural bypass: retrieval ran with top_k=50 against a candidate pool of at most 32 sessions, so every session was handed to Claude Sonnet and no genuine retrieval-based memory operation took place.
Penfield's audit found that the core failure, a top_k value larger than the number of sessions in any conversation, renders the retrieval step meaningless: the system performs reading comprehension over exhaustively provided context rather than memory-based retrieval. Critically, the MemPalace repository's own BENCHMARKS.md file discloses these limitations honestly, but the launch marketing stripped away all caveats and presented impossible numbers to a mass audience. The engagement gap between MemPalace and comparable legitimate memory projects suggests that celebrity attribution, not engineering merit, was the primary driver of viral adoption, which let flawed methodology reach millions without proper scrutiny.
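The top_k problem can be made concrete with a minimal Python sketch. This is not MemPalace's or Penfield's code. It assumes a hypothetical retriever that scores sessions at random (that is, one with no retrieval skill at all) and a single ground-truth session per question; the names retrieve_top_k and recall_at_k are illustrative. Only the figures top_k=50 and the 19-32 session range come from the reporting above.

```python
import random


def retrieve_top_k(scores, top_k):
    """Return the indices of the top_k highest-scoring sessions."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Slicing past the end of the list simply returns every session.
    return ranked[:top_k]


def recall_at_k(num_sessions, top_k, trials=1000, seed=0):
    """Fraction of trials in which the ground-truth session is retrieved."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: random scores stand in for a retriever with no skill.
        scores = [rng.random() for _ in range(num_sessions)]
        gold = rng.randrange(num_sessions)  # hypothetical ground-truth session
        if gold in retrieve_top_k(scores, top_k):
            hits += 1
    return hits / trials


for n in (19, 32):
    # With top_k=50 the recall is 1.0 regardless of retriever quality.
    print(n, recall_at_k(n, top_k=50), recall_at_k(n, top_k=5))
```

Under these assumptions, even a random retriever reaches perfect recall when top_k is at least the pool size, because the "top" 50 of 32 sessions is all 32. With a smaller top_k, such as the 5 used here for contrast, recall depends on the ranking, and that is the quantity a memory benchmark is meant to measure.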
- Celebrity attribution appears to be the decisive factor in viral engagement; comparable legitimate AI memory projects receive only a handful of stars, while MemPalace achieved 5,400 in under one day
Editorial Opinion
The MemPalace case exemplifies a credibility crisis in AI benchmarking: a widening gap between honest technical documentation and sensationalized marketing claims, amplified by celebrity endorsement. Methodological shortcuts in memory benchmarking are an industry-wide problem. But using a celebrity name to broadcast mathematically impossible results to millions, while the honest caveats stay buried in repository files, prioritizes engagement over accuracy and undermines public trust in AI evaluation standards.



