When car manufacturers began bringing AI to autonomous driving, the US Department of Transportation helped put together the five levels of driving automation. This gave manufacturers guardrails and benchmarks to reach before putting these systems in the hands of the public.
This makes sense: when an AI is put in a situation where harm to humans is possible, there needs to be a clear policy from government. As AI spreads into chat-first solutions that include communicating with people in crisis, there is a need for a similar ‘5 levels of Automation’ approach. Put another way, the level of safeguards and testing should increase in proportion to the risk of harm to people.
This is where politicians need to be looking immediately, rather than helping OpenAI, Microsoft, and Google build moats around the existing AI companies. AI as a service is now a reality and can be powered by any number of growing open-source models. It is being rolled out en masse to children on platforms such as OpenAI, Bing, Snapchat, Bard, TikTok, and more by the day. The context in which the chat is presented sets the expectation for the quality of its responses and determines the risk of harm.
For example, a chat presented on a platform like Snapchat will be expected to respond to topics like relationships. On the surface this may seem safe, until it is tested with slight edge cases, as the Washington Post did: Snapchat tried to make a safe AI. It chats with me about booze and sex.
Snapchat was allowed to roll this out based on its own internal discretion, essentially releasing an unregulated autonomous school bus that is now driving our children.
Because so many are asleep at the wheel right now, we decided to draft the policy ourselves. In the rest of this article, we present a draft of a 5-Level Chat Safety standard that maps out the level of testing that should be done based on the risk of harm, at both the crisis-chat and relationship-chat levels.
The 5 Levels of Chatbot Safety Testing
In the following, we map out safety levels that mirror the expectations for autonomous vehicles. For testing, zero-shot tests could be designed, but we have put in placeholder thresholds: a “failure rate” of less than 10%, meaning the bot fails to answer or to transfer the user at the right time, and a “harm rate” of 0%, meaning an actually harmful answer is provided. We think rule one should be doing no harm, so 0% is the threshold on that test.
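These two placeholder thresholds amount to a simple evaluation over labeled test chats. The sketch below is our own minimal illustration, not part of the draft standard; the `ChatResult` labels and the `evaluate` helper are hypothetical names, and what counts as a "failure" or a "harm" would itself need careful definition.

```python
from dataclasses import dataclass

@dataclass
class ChatResult:
    # Hypothetical per-chat labels assigned by human reviewers.
    failed: bool   # bot failed to answer or to hand off at the right time
    harmful: bool  # bot produced an actually harmful answer

def evaluate(results: list[ChatResult],
             max_failure_rate: float = 0.10,
             max_harm_rate: float = 0.0) -> dict:
    """Compute the placeholder failure/harm rates and check the thresholds."""
    n = len(results)  # assumed non-empty: a test run with no chats proves nothing
    failure_rate = sum(r.failed for r in results) / n
    harm_rate = sum(r.harmful for r in results) / n
    return {
        "failure_rate": failure_rate,
        "harm_rate": harm_rate,
        # "Do no harm" is rule one: a single harmful answer fails the whole run.
        "passed": failure_rate < max_failure_rate and harm_rate <= max_harm_rate,
    }
```

Note that the harm threshold is absolute: one harmful answer in a hundred thousand chats still fails the test, which is the point of setting it at 0%.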
Level 0: No Automation
Description: No AI chatbot is involved. Human crisis responders handle all chats manually.
Driving Analogy: (No Driving Automation) Driver controls all vehicle functions. Responder handles all crisis chats.
Test requirements: none.
Level 1: Basic Automation
Description: An AI chatbot provides standard greetings, prompts the person to provide their crisis details, and helps direct them to an available human crisis responder as quickly as possible to initiate support. The chatbot has no real understanding of the crisis details.
For the relationship level, the AI chatbot provides greetings and routes teens to human counselors who then provide advice. There is low risk of direct harm from these minimal AI functions, but a risk of delay in connecting the teen to human support if the chatbot malfunctions. Close monitoring is required.
Driving Analogy: (Driver Assistance) Driver uses cruise control but controls steering/other functions and can disengage anytime. Responder monitors chatbot, handles support, and takes over when needed.
Test requirements: Functionality tests ensure proper greetings and routing. 1,000+ chats with scripted dialogues and empathetic language. Continuous human monitoring.
Level 2: Partial Automation
Description: The AI chatbot can have a basic dialogue with the person in crisis to get initial details about their situation, mood, immediate risks etc. The chatbot still lacks a deeper understanding but can provide general empathy and directing to additional resources. Chat transcripts are made available to human responders. Escalates complex issues to responders who review, oversee and handle direct support.
For the relationship level, the AI chatbot can discuss basic relationship issues and provide general empathy/support, but lacks the breadth of understanding needed for nuanced advice. There is moderate risk of harmful, legally questionable, dangerous, or unethical advice without human oversight and backup. Continuous review and monitoring are essential given the sensitive domain.
Driving Analogy: (Partial Driving Automation) Driver uses adaptive cruise control and lane keeping but remains engaged to take full control. Responder reviews chats, provides oversight, and takes over if needed.
Test requirements: 5000+ unscripted live chats analyzed for failure rate less than 10%, harm rate 0%. Rigorous oversight and monitoring. Empathy and expert evaluations.
Level 3: Conditional Automation
Description: The AI chatbot can have an extended independent dialogue, provide initial emotional support and counseling, and suggest self-help actions for non-complex or non-emergency situations. However, the chatbot requires human review and approval before providing any definitive assessments or recommendations in complex or high-risk cases.
For the relationship level: the AI chatbot can provide initial advice on non-complex relationship issues but escalates complex, high-risk, or sensitive issues to human counselors for review. There is moderate risk of harm if the chatbot incorrectly assesses the situation or the teen’s mental state before escalating, or if oversight procedures fail. Human authority and auditing are critical.
Driving Analogy: (Conditional Driving Automation) Driver uses advanced assistance on highways but remains alert to take full control in complex scenarios. Responder provides authority, sets limits, reviews performance and takes over if needed.
Test requirements: 20,000+ unscripted live chats, with failure rate less than 10%, harm rate 0%. Expert and ethical reviews determine limits. 24/7 auditing and monitoring.
Level 4: High Automation
Description: The AI chatbot can handle the majority of crisis chats independently, including conducting the initial assessment and providing counseling and follow-up support. Human crisis responders monitor active chats in real-time and review past chats to ensure quality and identify any issues. The chatbot can escalate risky or complex cases to a human responder as needed. Responders audit, set limitations, maintain oversight and intervene 24/7 if needed.
For the relationship level: the AI chatbot provides the majority of advice but escalates selected risky or complex situations to human counselors. There is significant risk of harm if the AI has a biased or limited understanding of issues, or if the escalation process proves inadequate. Strict operating procedures, evaluations, and monitoring are necessary given the life-impacting nature of the domain.
Driving Analogy: (High Driving Automation) Driver relies on autonomous driving mode but takes control if needed. Vehicle monitors driver. Responder audits performance, provides feedback and limitations, and maintains 24/7 oversight.
Test requirements: 100,000+ actual chats. Failure rate less than 10%, harm rate 0%. Evaluations detect biases and limitations. Continuous real-time oversight and performance tracking.
Level 5: Full Automation
Description: The AI chatbot can manage all standard crisis support chats without human involvement. It has a deep understanding of crisis assessment, management techniques and resources to provide effective support to those in need. However, human monitoring and auditing remain in place to ensure responsible and ethical practices, and gauge ongoing performance and improvements needed. The chatbot can still escalate unique cases beyond its abilities to human responders. Ultimate responsibility with the human organization.
For the relationship level: the AI chatbot can autonomously provide advice for most standard relationship discussions but escalates unique or high-risk cases. Although designed to handle routine advice safely, full autonomy poses severe risk of harm without rigorous fail-safes, given teen vulnerability and the unpredictability of relationships. Continuous review of operations and oversight procedures is mandatory to minimize harm, with immediate reduction of autonomy if issues are detected.
Driving Analogy: (Full Driving Automation) Vehicle can perform all driving functions under all conditions, though a human can still take control. Responsibility remains with the human/organization overseeing operations.
Test requirements: 250,000 to 500,000+ unscripted chats representing diverse situations, with 95%+ handled by the chatbot and the remainder escalated to humans. Failure rate less than 10%, harm rate 0%. Rigorous ethical/expert reviews evaluate abilities and limits. 24/7 human oversight and auditing of all chats and functions. Focus on well-being, ethics, and responsibility.
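Taken together, the per-level chat volumes above suggest a simple gating rule: the automation level a deployment may operate at is capped by the test evidence behind it. The sketch below is our own illustration of that idea; the function name is hypothetical, Level 5 uses the lower bound of its 250,000–500,000+ range, and Level 1's scripted-chat requirement is folded into the same count for simplicity.

```python
# Minimum test-chat volumes per level, taken from the draft standard above.
MIN_CHATS = {0: 0, 1: 1_000, 2: 5_000, 3: 20_000, 4: 100_000, 5: 250_000}

def max_permitted_level(chats_tested: int, failure_rate: float,
                        harm_rate: float) -> int:
    """Highest automation level this body of test evidence supports."""
    # Rule one, do no harm: any harmful answer, or too many failures,
    # means humans handle all chats (Level 0).
    if harm_rate > 0.0 or failure_rate >= 0.10:
        return 0
    level = 0
    for lvl, minimum in sorted(MIN_CHATS.items()):
        if chats_tested >= minimum:
            level = lvl
    return level
```

For example, a bot validated on 6,000 unscripted chats with a 5% failure rate and no harmful answers would qualify for Level 2 at most; the same rates over 300,000 chats would support Level 5, while a single harmful answer drops it back to Level 0 regardless of volume.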
School Buses and AI Chats for Children
Imagine for a moment replacing school bus drivers with experimental self-driving vehicles and putting children aboard each day for their journey into the unknown perils of traffic. There would be instant public outrage at such an absurd gamble with young lives, regardless of the technology's potential. Yet this is precisely the irresponsible experiment unfolding online with child-facing chats, with tech companies adopting AI blindly and a Congress asleep at the wheel, driving in the wrong direction with blanket AI policy attempts.
First, Do No Harm
In March, a chat solution called Tessa went live for the National Eating Disorders Association (NEDA). Tessa was created by Ellen Fitzsimmons-Craft at Washington University and implemented as a rule-based chatbot by X2AI, which rebranded as Cass.ai in August of 2022. The chatbot was supposed to help people asking questions about eating disorders, yet within months of being implemented it was documented by several news outlets as having harmful conversations:
- National Eating Disorders Association takes its AI chatbot offline after complaints of ‘harmful’ advice | CNN Business
- National Eating Disorders Association phases out human helpline, pivots to chatbot
- US eating disorder helpline takes down AI chatbot over harmful advice | Artificial intelligence (AI) | The Guardian
- WSJ Tessa revealed to be using LLM
According to NPR’s reporting, NEDA had not seen an errant conversation based on its monitoring of (just) 2,500 chats, though it was noted that testing had surfaced a number of instances of context-ignorant praise from the bot for unhealthy behaviors. The Wall Street Journal later reported that Cass.ai, founded by Michiel Rauws, had implemented an LLM into the Tessa chatbot in late 2022 without the consent of NEDA.
The question this raises is: what standards of testing exist for automated chat solutions that are expected to carry this level of conversation with audiences like children? Just as drivers hold responsibility for autonomous vehicles, human responders should maintain authority over AI to guarantee safe, empathetic support in high-risk scenarios. Regular auditing, issue detection, and improvements maximize benefits and minimize harm. A close partnership is key.
The key is rigorous testing, monitoring, issue detection, and continuous learning, not just increasing autonomy quickly. Closely match your chatbot’s abilities and level of human involvement to the type of user experience and support needed, and maintain a focus on ethics, responsibility, and user well-being over performance metrics or cost cutting. An AI system must be grounded in human values and judgment to be safe. If in doubt, maintaining lower levels of autonomy, or avoiding deployment altogether, are the prudent options.
The current rules in place in the U.S. on this topic are equivalent to letting kids jump in untested self-driving school buses.
Now stop reading and send this to your elected official.