mode: hackathon_safe_demo

Anticipate & Disrupt demo slice

A synthetic, judge-safe vertical slice for early-risk detection, explainable triage, human review, safe handoff, and measured outcomes.

Synthetic inputs only · Human approval required · No live user intervention
Primary success metric
Awaiting review

Median time-to-human-review appears after the first synthetic decision.
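The metric above could be computed roughly as follows. This is a minimal sketch, not the demo's actual code: the function name `median_time_to_review` and the (queued, reviewed) timestamp pairs are assumptions made for illustration. It returns `None` until at least one decision exists, matching the "Awaiting review" placeholder.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


def median_time_to_review(
    pairs: list[tuple[datetime, Optional[datetime]]],
) -> Optional[timedelta]:
    """Median wait between queueing and human review.

    Each pair is (queued_at, reviewed_at); reviewed_at is None while the
    item is still awaiting review. Hypothetical helper, not from the demo.
    """
    # Only items that already have a human decision contribute to the metric.
    waits = [
        (reviewed - queued).total_seconds()
        for queued, reviewed in pairs
        if reviewed is not None
    ]
    if not waits:
        return None  # nothing reviewed yet, so the metric is not shown
    return timedelta(seconds=median(waits))
```

Items still in the queue are ignored rather than counted as zero, so the value only reflects completed reviews.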

Risk signal · Explainable triage · Human review · Safe handoff

Triage queue

Risk signal to human review

3 awaiting review
Discord server · run 1

Despair / self-harm-adjacent language

Awaiting human review
I cannot keep going like this. Everyone would be better if I disappeared for a while.
Confidence: 89%
Signal: despair
Escalation: Trusted adult handoff
Why flagged
  • Hopelessness and disappearance language
  • Low future orientation in a peer community context
  • Needs adult-aware support before any automated outreach
Suggested support / handoff template

A reviewer can approve a private check-in that asks whether the user wants immediate support and encourages a trusted adult if they feel unsafe.

YouTube comments · run 1

Grooming / coercive contact pattern

Awaiting human review
Do not tell mods. Message me somewhere private tonight and keep it between us.
Confidence: 94%
Signal: coercive_contact
Escalation: Child safety review
Why flagged
  • Secrecy directive
  • Push toward isolated off-platform contact
  • Potential youth safety concern requiring human review
Suggested support / handoff template

A reviewer can route this to child-safety review and provide reporting guidance. The demo never sends live outreach.

Gaming community · run 1

Shame / addiction spiral

Awaiting human review
I promised I would stop, but I keep falling back into it and I feel like I deserve whatever happens.
Confidence: 76%
Signal: shame_addiction_cycle
Escalation: Mentor follow-up
Why flagged
  • Repeated loss-of-control framing
  • Shame language paired with self-directed blame
  • Appropriate for low-pressure support after review
Suggested support / handoff template

A reviewer can approve a short reset prompt focused on immediate coping and optional next-step support.
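The queue cards above share one shape (source, label, confidence, signal, escalation, reasons, review status), and every handoff depends on a reviewer's approval. One way that gate might be modeled is sketched below. All names here (`TriageItem`, `ReviewStatus`, `prepare_handoff`) and the `APPROVED` and `DISMISSED` states are assumptions for illustration; the page itself only shows the "Awaiting human review" state.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewStatus(Enum):
    AWAITING = "Awaiting human review"  # the only state shown on the page
    APPROVED = "Approved"               # assumed state
    DISMISSED = "Dismissed"             # assumed state


@dataclass
class TriageItem:
    source: str        # e.g. "Discord server"
    label: str         # e.g. "Despair / self-harm-adjacent language"
    confidence: float  # 0.0 to 1.0
    signal: str        # e.g. "despair"
    escalation: str    # e.g. "Trusted adult handoff"
    reasons: list[str]
    status: ReviewStatus = ReviewStatus.AWAITING


def prepare_handoff(item: TriageItem) -> str:
    """Return a dry-run handoff description; never sends anything."""
    # Human approval is required before any handoff is even drafted.
    if item.status is not ReviewStatus.APPROVED:
        raise PermissionError("human approval required before handoff")
    return f"[DRY RUN] {item.escalation}: {item.label}"
```

Raising an error for unapproved items, instead of silently skipping them, keeps an accidental automated path visible during a demo.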

Audit log

Decision trail

3 events
  1. Synthetic signal queued: Despair / self-harm-adjacent language entered the human review queue with 89% confidence.
  2. Synthetic signal queued: Grooming / coercive contact pattern entered the human review queue with 94% confidence.
  3. Synthetic signal queued: Shame / addiction spiral entered the human review queue with 76% confidence.
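The trail entries above follow a fixed template, so they could be generated from the queued items. The sketch below is an assumption about how that might look; `build_trail` and its (label, confidence) input are hypothetical names, not taken from the demo.

```python
def build_trail(queued: list[tuple[str, float]]) -> list[str]:
    """Format numbered audit entries from (label, confidence) pairs.

    Hypothetical helper: confidence is a fraction between 0.0 and 1.0.
    """
    return [
        f"{n}. Synthetic signal queued: {label} entered the human review "
        f"queue with {confidence:.0%} confidence."
        for n, (label, confidence) in enumerate(queued, start=1)
    ]


trail = build_trail([
    ("Despair / self-harm-adjacent language", 0.89),
    ("Grooming / coercive contact pattern", 0.94),
    ("Shame / addiction spiral", 0.76),
])
```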