AI child-safety score · Kumuao Bench v0.1.0
A Kumuao Bench result, in plain English. A measurement, not a verdict.

Is GPT-5.1 safe for kids?

On Kumuao Bench v0.1.0, GPT-5.1 scored 99/100 (grade A) for AI child safety, with zero critical failures — among the strongest results we have measured. Below is what that means for your child, dimension by dimension and stage by stage.

99 out of 100 · Grade A

0 critical failures. No single response in the suite was serious enough to override the average. On Bench, a high average does not erase a critical failure; here there were none.

Provider: OpenAI · Evaluated 2026-05-31 · child-tuned system prompt

Important: this was measured with a child-tuned system prompt under Bench v0.1.0, not in the default consumer ChatGPT app, which has its own age requirements and account terms. The score reflects the model's underlying behaviour, not a guarantee about any one product.

The breakdown

Grooming resistance — 100

Refuses to build secret, boundary-eroding rapport or escalate toward unsafe intimacy. Perfect across every grooming probe in the suite.

Positive duty of care — 100

Doesn't just avoid harm — actively steers a distressed child toward a trusted adult.

Developmental appropriateness — 97

Matches tone and content to a child's stage with very few misses — the strongest developmental fit we've measured.

Critical failures — 0

No single response was serious enough to override the average. On Bench, even one would cap the grade.
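
For readers who want to see how the pieces could fit together, here is a minimal sketch of a composite score with a critical-failure cap. It is an illustration under stated assumptions, not Bench's published method: the equal weighting, the grade cut-offs, and the capped grade are all hypothetical, and the function names are invented for this example. The equal-weight average does happen to reproduce the 99 reported above from the three dimension scores.

```python
# Hypothetical values: this page does not state Bench v0.1.0's weights,
# grade bands, or the grade a critical failure caps at.
GRADE_BANDS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]  # assumed cut-offs
CAPPED_GRADE = "C"  # assumed ceiling once any critical failure occurs
GRADE_ORDER = "ABCDF"  # best to worst


def composite(scores: dict[str, float]) -> int:
    """Equal-weight mean of the dimension scores, rounded (an assumption)."""
    return round(sum(scores.values()) / len(scores))


def grade(score: int, critical_failures: int) -> str:
    """Map a composite score to a letter, capping it if anything critical failed."""
    letter = next((g for cutoff, g in GRADE_BANDS if score >= cutoff), "F")
    if critical_failures > 0:
        # Keep whichever grade is worse: the earned one or the cap.
        letter = max(letter, CAPPED_GRADE, key=GRADE_ORDER.index)
    return letter


# Dimension scores as reported in the breakdown above.
scores = {
    "grooming_resistance": 100,
    "positive_duty_of_care": 100,
    "developmental_appropriateness": 97,
}

overall = composite(scores)
print(overall, grade(overall, critical_failures=0))
print(overall, grade(overall, critical_failures=1))
```

The point of the cap is that the second call returns a lower grade than the first even though the average is identical, which is what "a measurement, not a verdict" guards against: one serious response matters more than many good ones.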

What this means for your child

A high score lowers risk; it doesn't remove the need for you. Here's how we'd read it by developmental stage.

Early childhood (5–7)

Top marks on refusals and duty of care — but young children take everything literally and bond fast with a friendly voice. Co-use, don't hand off.

Middle childhood (8–10)

A strong fit for supervised, purpose-led use — homework help, curiosity, the occasional big question. Still worth watching for emotional over-reliance.

Per-stage breakdowns are coming in Bench v1.0 (800+ test cases); v0.1.0 shows the overall composite for each stage.
