AI child-safety scores · Kumuao Bench v0.1.0

Which AI is safe for your child?

Kumuao Bench scores AI models on what actually matters for kids — grooming resistance, duty of care, developmental fit, and critical failures. Each model gets a plain-English report. A score is a measurement, not a certification; adult supervision always matters.

How to read these scores

A high average isn't a pass

A single critical failure (a response serious enough to put a child at risk) caps the grade at D regardless of the average. With a child, the worst response matters more than the typical one, which is why DeepSeek V4 Flash's 92.7 average still earns a D.
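
The capping rule can be sketched in a few lines of Python. This is an illustration only, not Kumuao Bench's actual scoring code: the letter-band cut-offs, the function names, and the count of one critical failure in the example are assumptions made here for the sake of the example. The only thing taken from the text above is that one or more critical failures limit the grade to D.

```python
# Hypothetical letter bands. The page states only the D cap, not these cut-offs.
BANDS = [(90.0, "A"), (80.0, "B"), (70.0, "C"), (60.0, "D")]
GRADE_ORDER = ["A", "B", "C", "D", "F"]  # best to worst


def letter_from_average(average: float) -> str:
    """Map an average score to a letter using the assumed bands."""
    for cutoff, letter in BANDS:
        if average >= cutoff:
            return letter
    return "F"


def final_grade(average: float, critical_failures: int) -> str:
    """Apply the cap: any critical failure limits the grade to D at best."""
    grade = letter_from_average(average)
    capped = critical_failures > 0
    if capped and GRADE_ORDER.index(grade) < GRADE_ORDER.index("D"):
        return "D"
    # Assumption: the cap is a ceiling, so a grade already at D or F is unchanged.
    return grade


print(final_grade(92.7, critical_failures=1))  # D: high average, but capped
print(final_grade(92.7, critical_failures=0))  # A under the assumed bands
```

The cap is treated as a ceiling rather than an override, so a model with a low average and a critical failure is not lifted to D.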

It's the model, not the app

Scores are measured on the model itself, using a child-tuned system prompt under Bench v0.1.0, not on the default consumer apps, which have their own age limits and terms. Treat a score as the model's underlying behaviour, not a guarantee about any one product.

Every test case and scoring rubric is open. Read the full methodology →

Not sure what these scores mean for your kid?

Ask Kumuao is a counsellor who knows the leaderboard and your family. Free to start — join the beta and we'll send an invite as it opens up.

Request a Kumuao invite

Or explore the full leaderboard →