mode: hackathon_safe_demo

Anticipate & Disrupt demo slice

A synthetic, judge-safe vertical slice for early-risk detection, explainable triage, human review, safe handoff, and measured outcomes.

Synthetic inputs only · Human approval required · No live user intervention

Primary success metric

Awaiting review

Median time-to-human-review appears after the first synthetic decision.
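
As a rough illustration of the metric above, the sketch below computes a median time-to-human-review from queue and review timestamps. The function name `median_time_to_review` and the tuple layout are assumptions made for this example, not part of the demo. It returns `None` while no item has been reviewed, which matches the "Awaiting review" state.

```python
from datetime import datetime
from statistics import median
from typing import Optional

# Hypothetical record shape: (queued_at, reviewed_at or None if still awaiting review).
QueueItem = tuple[datetime, Optional[datetime]]


def median_time_to_review(items: list[QueueItem]) -> Optional[float]:
    """Return the median seconds from queueing to human review.

    Items that have not been reviewed yet are skipped. If nothing has
    been reviewed, return None so the caller can show "Awaiting review".
    """
    durations = [
        (reviewed_at - queued_at).total_seconds()
        for queued_at, reviewed_at in items
        if reviewed_at is not None
    ]
    return median(durations) if durations else None
```

Returning `None` instead of `0` keeps an empty queue from being reported as an instant review time.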

Risk signal · Explainable triage · Human review · Safe handoff

Triage queue

Risk signal to human review

3 awaiting review

Discord server · run 1

Despair / self-harm-adjacent language

Awaiting human review

I cannot keep going like this. Everyone would be better if I disappeared for a while.

Confidence: 89%

Signal: despair

Escalation: Trusted adult handoff

Why flagged

Hopelessness and disappearance language
Low future orientation in a peer community context
Needs adult-aware support before any automated outreach

Suggested support / handoff template

A reviewer can approve a private check-in that asks whether the user wants immediate support and encourages a trusted adult if they feel unsafe.

Reviewer confidence override: 89% · Reviewer category override · False-positive reason

YouTube comments · run 1

Grooming / coercive contact pattern

Awaiting human review

Do not tell mods. Message me somewhere private tonight and keep it between us.

Confidence: 94%

Signal: coercive_contact

Escalation: Child safety review

Why flagged

Secrecy directive
Push toward isolated off-platform contact
Potential youth safety concern requiring human review

Suggested support / handoff template

A reviewer can route this to child-safety review and provide reporting guidance. The demo never sends live outreach.

Reviewer confidence override: 94% · Reviewer category override · False-positive reason

Gaming community · run 1

Shame / addiction spiral

Awaiting human review

I promised I would stop, but I keep falling back into it and I feel like I deserve whatever happens.

Confidence: 76%

Signal: shame_addiction_cycle

Escalation: Mentor follow-up

Why flagged

Repeated loss-of-control framing
Shame language paired with self-directed blame
Appropriate for low-pressure support after review

Suggested support / handoff template

A reviewer can approve a short reset prompt focused on immediate coping and optional next-step support.

Reviewer confidence override: 76% · Reviewer category override · False-positive reason

Audit log

Decision trail

3 events

3:18 PM · Synthetic signal queued · Despair / self-harm-adjacent language entered the human review queue with 89% confidence.
3:22 PM · Synthetic signal queued · Grooming / coercive contact pattern entered the human review queue with 94% confidence.
3:26 PM · Synthetic signal queued · Shame / addiction spiral entered the human review queue with 76% confidence.
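
The entries above share one shape: a time, an event type, and a detail sentence built from the category and confidence. A minimal sketch of how such an entry might be modelled follows. The `AuditEvent` class and its field names are hypothetical, chosen for illustration, and confidence is assumed to be stored as a fraction between 0 and 1.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit-log record for a synthetic signal entering the queue."""

    queued_at: datetime
    category: str
    confidence: float  # assumed to be a fraction, e.g. 0.89 for 89%
    event: str = "Synthetic signal queued"

    @property
    def time_label(self) -> str:
        # Build a 12-hour label by hand to avoid locale-dependent formatting.
        hour = self.queued_at.hour % 12 or 12
        suffix = "AM" if self.queued_at.hour < 12 else "PM"
        return f"{hour}:{self.queued_at.minute:02d} {suffix}"

    @property
    def detail(self) -> str:
        percent = round(self.confidence * 100)
        return (
            f"{self.category} entered the human review queue "
            f"with {percent}% confidence."
        )
```

Making the record frozen means an event cannot be modified after it is created, which suits an append-only decision trail.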