New Benchmark Method Reveals Proprietary LLM Parameter Counts Through Factual Knowledge Measurement
Key Takeaways
- IKPs provide the first principled method for estimating proprietary LLM parameter counts, grounded in fundamental information theory rather than inference-economics measurements, which are expensive and carry high uncertainty
- Factual capacity scales predictably and log-linearly with model size; saturation narratives look premature even as benchmarks plateau on reasoning tasks
- For Mixture-of-Experts models, total parameters, not active parameters, are the far stronger predictor of factual knowledge retention, offering new insight for MoE architecture optimization (see the sketch after this list)
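The MoE takeaway reduces to comparing two one-variable fits. Below is a minimal sketch of that comparison; the total/active parameter pairs loosely echo familiar published MoE configurations, but the IKP scores are invented placeholders, not the paper's data.

```python
import numpy as np

def r_squared(x, y):
    """R^2 of an ordinary least-squares line of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return 1.0 - resid.dot(resid) / ((y - y.mean()) ** 2).sum()

# Hypothetical MoE calibration rows: (total params, active params, IKP score).
# Scores are made-up placeholders used only to illustrate the comparison.
rows = np.array([
    (46.7e9, 12.9e9, 0.52),
    (141e9,  39e9,   0.61),
    (389e9,  52e9,   0.68),
    (671e9,  37e9,   0.73),
    (1.6e12, 32e9,   0.80),
])
total, active, score = rows.T

print(f"R^2 vs log total params : {r_squared(np.log10(total), score):.2f}")
print(f"R^2 vs log active params: {r_squared(np.log10(active), score):.2f}")
```

Because active parameter counts barely grow across these rows while scores climb, the total-parameter fit dominates, mirroring the paper's reported 0.79 vs 0.51 gap.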
Summary
Researchers have developed Incompressible Knowledge Probes (IKPs), a novel method for estimating the parameter counts of closed-source LLMs without access to internal model details. The approach exploits an information-theoretic bound: storing F facts requires at least F × (bits per fact) / (bits per parameter) weights. By measuring what a model knows through 1,400 carefully calibrated factual questions spanning seven tiers of obscurity, researchers can reliably estimate model size.
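A minimal sketch of the counting argument follows. The constants `bits_per_fact` and `bits_per_param` are assumptions; this summary does not give the paper's actual calibration, and the ~2 bits-per-parameter figure is a placeholder borrowed from prior capacity studies.

```python
def min_parameters(facts_known: float,
                   bits_per_fact: float = 32.0,   # assumed entropy of one fact
                   bits_per_param: float = 2.0) -> float:  # assumed storable bits/weight
    """Information-theoretic lower bound on parameter count.

    Storing `facts_known` incompressible facts of `bits_per_fact` bits each
    takes facts_known * bits_per_fact bits in total; since each parameter
    encodes at most `bits_per_param` bits, the weight count is at least
    the ratio of the two.
    """
    return facts_known * bits_per_fact / bits_per_param

# Under these assumed constants, a model answering ~10M distinct obscure
# facts must carry at least 160M parameters.
print(f"{min_parameters(10e6):,.0f}")
```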
Validated on 89 open-weight models (135M–1.6T parameters) from 19 vendors, the method achieved R² = 0.917, and leave-one-out cross-validation placed 87.6% of estimates within 3× of the actual size. Applied to 188 proprietary models from 27 vendors, including all major frontier AI systems, IKPs provide the first vendor-independent parameter estimates. For Mixture-of-Experts models, total parameters proved a much stronger predictor of factual knowledge (R² = 0.79) than active parameters alone (R² = 0.51).
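A hedged sketch of the leave-one-out check: fit log10(params) against the knowledge score, hold each model out in turn, and count held-out predictions within 3× of truth, i.e. within log10 3 ≈ 0.48 in log space. The calibration data below are synthetic stand-ins, not the paper's 89-model scores.

```python
import numpy as np

def loo_within_factor(scores, log10_params, factor=3.0):
    """Leave-one-out CV for a log-linear size estimator.

    Refits log10(params) ~ score without each model, then counts how often
    the held-out prediction lands within `factor`x of the true size.
    """
    s = np.asarray(scores, float)
    p = np.asarray(log10_params, float)
    hits = 0
    for i in range(len(s)):
        mask = np.arange(len(s)) != i
        slope, intercept = np.polyfit(s[mask], p[mask], 1)
        hits += abs(slope * s[i] + intercept - p[i]) <= np.log10(factor)
    return hits / len(s)

# Synthetic stand-in for the 89-model set (135M to 1.6T parameters); the
# noise level is tuned only so the output lands in a plausible range.
rng = np.random.default_rng(0)
true_logp = rng.uniform(np.log10(135e6), np.log10(1.6e12), size=89)
ikp_score = 0.25 * true_logp + rng.normal(0.0, 0.08, size=89)
print(f"within 3x: {loo_within_factor(ikp_score, true_logp):.1%}")
```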
The research challenges recent scaling pessimism: factual capacity continues to scale log-linearly with parameters across generations and vendors, showing no sign of saturation. Analysis of 96 dated open-weight models finds monthly knowledge decay statistically indistinguishable from zero, directly contradicting predictions of imminent scaling limits. Safety-tuned models score lower partly because refusal policies mask underlying knowledge: refusals can hide tens of percentage points of actual capacity, so estimates for safety-tuned models should be read as lower bounds on their true parameter counts.
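One way to read the zero-decay finding above is as a slope test: regress knowledge score on model age and check whether the slope differs from zero. Here is a sketch using SciPy's linregress on placeholder data generated with no true decay; the actual 96-model measurements are not in this summary.

```python
import numpy as np
from scipy.stats import linregress

# Placeholder data for 96 dated models: months since knowledge cutoff vs
# IKP score, drawn with zero underlying decay (the paper's finding).
rng = np.random.default_rng(1)
months = rng.uniform(0.0, 36.0, size=96)
score = 0.70 + rng.normal(0.0, 0.03, size=96)

fit = linregress(months, score)
print(f"decay: {fit.slope:+.5f} score/month, p = {fit.pvalue:.2f}")
# A p-value far above 0.05 fails to reject zero monthly decay.
```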
Editorial Opinion
This research fills a critical transparency gap in closed-source AI development. By grounding parameter estimates in information theory rather than inference costs, IKPs offer a reproducible, vendor-independent way to assess frontier model capabilities. The findings of continued log-linear scaling should shift the conversation away from premature "peak scaling" narratives: the real question isn't whether scaling works, but which architectural innovations are needed to push further.