A Proposal to Finally Benchmark AI's Long-Term Memory Properly
AI memory systems promise the world, but their benchmarks are a joke. A new proposal demands real tests over months of chats—here's why it'll change everything.
theAIcatchup · Apr 10, 2026 · 4 min read
⚡ Key Takeaways
Current AI memory benchmarks like LoCoMo are riddled with errors and unfair comparisons.
Proposal demands 2,400 questions from real 6-month conversations across 6 categories.
Standard and open tracks ensure transparency; could ignite a memory benchmark revolution.