BotBeat
RESEARCH | Open Research / Academic | 2026-04-30

New Benchmark Method Reveals Proprietary LLM Parameter Counts Through Factual Knowledge Measurement

Key Takeaways

  • IKPs provide the first scientific method to estimate proprietary LLM parameter counts from information-theoretic principles, avoiding expensive inference-economics measurements that carry high uncertainty
  • Factual capacity scales log-linearly with model size; saturation narratives appear premature even though benchmarks are plateauing on reasoning tasks
  • For Mixture-of-Experts models, total parameters (not active parameters) are a far better predictor of factual knowledge retention, offering new insights for MoE architecture optimization
Source: Hacker News, https://arxiv.org/abs/2604.24827

Summary

Researchers have developed Incompressible Knowledge Probes (IKPs), a method for estimating the parameter counts of closed-source LLMs without access to internal model details. The approach exploits an information-theoretic bound: a model that stores F bits of factual information needs at least F / (bits per parameter) weights. By measuring what a model knows with 1,400 carefully calibrated factual questions across seven tiers of obscurity, researchers can estimate model size from its answers alone.
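
The bound can be illustrated with a few lines of Python. This is a minimal sketch, not the paper's procedure: the function name `min_parameters` and every number below (fact count, bits per fact, bits per parameter) are hypothetical values chosen only to show the arithmetic.

```python
def min_parameters(num_facts: int, bits_per_fact: float, bits_per_param: float) -> float:
    """Lower bound on the weights needed to store the given facts.

    All three inputs are illustrative assumptions, not values from the paper.
    """
    if bits_per_param <= 0:
        raise ValueError("bits_per_param must be positive")
    total_bits = num_facts * bits_per_fact  # F, the information to be stored
    return total_bits / bits_per_param      # F / (bits per parameter)


# Hypothetical inputs: one billion facts, 20 bits each, 2 bits stored per weight.
bound = min_parameters(num_facts=1_000_000_000, bits_per_fact=20.0, bits_per_param=2.0)
print(f"at least {bound:.3g} parameters")
```

The inequality only runs one way: a model can hold more weights than the bound, but not fewer, which is why measured knowledge points to a minimum size rather than an exact one.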

Validated on 89 open-weight models (135M–1.6T parameters) from 19 vendors, the method achieved R² = 0.917, and leave-one-out cross-validation placed 87.6% of estimates within a factor of 3 of the actual size. Applied to 188 proprietary models from 27 vendors, including all major frontier AI systems, IKPs provide the first vendor-independent parameter estimates. For Mixture-of-Experts models, total parameters proved a much stronger predictor of factual knowledge (R² = 0.79) than active parameters alone (R² = 0.51).
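
The validation statistics above can be made concrete with a small sketch of a log-linear fit and a leave-one-out check. The data points below are synthetic and the regression form (probe score as a linear function of log10 parameters, inverted to estimate size) is an assumption for illustration; the paper's actual model and data may differ.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def loo_within_factor(params, scores, factor=3.0):
    """Fraction of held-out models whose estimated size is within `factor`."""
    logs = [math.log10(p) for p in params]
    hits = 0
    for i in range(len(params)):
        slope, intercept = fit_line(logs[:i] + logs[i + 1:], scores[:i] + scores[i + 1:])
        est_log = (scores[i] - intercept) / slope  # invert the fit for the held-out model
        if abs(est_log - logs[i]) <= math.log10(factor):
            hits += 1
    return hits / len(params)


# Synthetic (made-up) parameter counts and probe accuracies, for illustration only.
params = [1.35e8, 1e9, 7e9, 7e10, 4e11, 1.6e12]
scores = [0.12, 0.18, 0.30, 0.37, 0.48, 0.51]

log_params = [math.log10(p) for p in params]
slope, intercept = fit_line(log_params, scores)
r2 = r_squared(log_params, scores, slope, intercept)
within3 = loo_within_factor(params, scores)
print(f"slope={slope:.3f} R^2={r2:.3f} within 3x: {within3:.0%}")
```

Working in log10 space means "within 3×" becomes a fixed tolerance of log10(3) on the estimate, which treats over- and under-estimates symmetrically.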

The research challenges recent scaling pessimism: factual capacity continues to scale log-linearly with parameters across generations and vendors, showing no sign of saturation. Analysis of 96 dated open-weight models reveals that monthly knowledge decay is statistically indistinguishable from zero, directly contradicting predictions of imminent scaling limits. Safety-tuned models show lower estimates partly due to refusal policies masking underlying knowledge capacity.

  • Refusal policies can hide tens of percentage points of actual knowledge capacity, suggesting current safety-tuned model estimates are lower bounds on true parameter effectiveness
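
The effect of refusals on a measured score can be sketched as follows. This is an illustrative assumption about scoring, not the paper's method: the outcome labels and the function `accuracy_scores` are hypothetical, and the example simply contrasts counting refusals as misses with scoring only attempted questions.

```python
def accuracy_scores(outcomes):
    """Return (overall accuracy, accuracy on attempted questions).

    `outcomes` is a list of the hypothetical labels "correct", "wrong", "refused".
    """
    total = len(outcomes)
    correct = outcomes.count("correct")
    attempted = total - outcomes.count("refused")
    overall = correct / total if total else 0.0
    on_attempted = correct / attempted if attempted else 0.0
    return overall, on_attempted


# Made-up example: 50 correct, 20 wrong, 30 refused out of 100 questions.
outcomes = ["correct"] * 50 + ["wrong"] * 20 + ["refused"] * 30
overall, on_attempted = accuracy_scores(outcomes)
print(f"overall={overall:.2f} attempted-only={on_attempted:.2f}")
```

Neither number is the true knowledge level: counting refusals as misses understates it, while the attempted-only score assumes the refused questions would have been answered as well as the rest, which may not hold.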

Editorial Opinion

This research fills a critical transparency gap in closed-source AI development. By grounding parameter estimates in information theory rather than inference costs, IKPs offer a reproducible, vendor-independent way to assess frontier model capabilities. The evidence of continued log-linear scaling should shift the conversation away from premature "peak scaling" narratives. The real question isn't whether scaling works, but what new architectural innovations are needed to push further.

Large Language Models (LLMs) | Machine Learning | Data Science & Analytics
