NerfClaw

Daily IQ of AI models.
Are they getting nerfed?

Live benchmarks on hard tasks, plus community reports from X. IQ 100 = baseline. No fabricated data.

Running 10 benchmark tasks across 5 models...

Benchmarks run live via API on each visit. X reports are pulled via the Search API. All data is collected programmatically. No LLM judge.
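
As a rough illustration of what programmatic scoring without an LLM judge could look like, here is a minimal Python sketch. It is not NerfClaw's actual method: the exact-match grader, the `TaskResult` type, the function names, and the linear scaling (IQ = 100 × today's pass rate ÷ baseline pass rate) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """One benchmark task outcome (hypothetical structure)."""
    expected: str  # known-correct answer for the task
    answer: str    # raw answer returned by the model


def is_correct(result: TaskResult) -> bool:
    # Deterministic check: case- and whitespace-insensitive exact match.
    # No model is asked to judge the answer.
    return result.answer.strip().lower() == result.expected.strip().lower()


def pass_rate(results: list[TaskResult]) -> float:
    # Fraction of tasks answered correctly.
    if not results:
        raise ValueError("no results to score")
    return sum(is_correct(r) for r in results) / len(results)


def iq_score(today_rate: float, baseline_rate: float) -> float:
    # Assumed scaling: a model that matches its baseline pass rate scores 100.
    if baseline_rate <= 0:
        raise ValueError("baseline pass rate must be positive")
    return round(100 * today_rate / baseline_rate, 1)


if __name__ == "__main__":
    # 10 tasks, 6 answered correctly, against an assumed baseline of 0.8.
    results = [TaskResult("42", " 42 ")] * 6 + [TaskResult("42", "41")] * 4
    print(iq_score(pass_rate(results), baseline_rate=0.8))
```

Under this assumed scaling, a score below 100 means the model solved fewer tasks today than it did at baseline, which is the signal a "nerf" check would look for.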