When You Say One Thing but Mean Your Motherboard

In early 2014, I attended the demonstration of an experimental technology funded by DARPA, the US military’s advanced research and development agency. In a conference room at a Los Angeles office park, a young man sat on stage before a computer screen, with cameras trained on his face and body. On the screen, a rudimentary 3D model of a casually dressed young woman sat upright in a plush armchair of the type one might find in a psychotherapist’s office. The character said her name was Ellie and that she was not a therapist, but created “to talk to people in a safe and secure environment.” Another monitor, visible to the audience but not to the man, displayed real-time information from the cameras, tracking his posture, eye movement, and the changing shape of his mouth.

“I’ll ask a few questions to get us started,” Ellie said, regarding the man with an unnervingly neutral gaze. “And please feel free to tell me anything. Your answers are totally confidential.”

As the demonstration proceeded, Ellie asked if he could recall the last time he felt truly happy. He didn’t have a ready answer, and the system’s computer vision software detected that he was breaking eye contact more than he had earlier in the conversation. “I noticed you were hesitant on that one,” Ellie said.

Ellie’s goal was to conduct conversations that would allow the software to pick up “distress indicators,” from evasive glances to “negatively valenced words,” that are “correlated with depression, anxiety, or post-traumatic stress disorder,” researchers from the Institute for Creative Technologies at the University of Southern California, where the system was designed, wrote in a subsequent paper. To that end, Ellie was made to seem highly empathetic. “A particular concern in the design … was whether users would feel comfortable enough in the interaction to speak openly about their mental health issues to a virtual agent,” the researchers wrote.

Though Ellie was obviously virtual, there were potential advantages to this. DARPA was funding ICT’s research through a larger program called DCAPS, the Detection and Computational Analysis of Psychological Signals, which aimed to “develop novel analytical tools to assess psychological status of warfighters in the hopes of improving psychological health awareness and enabling them to seek timely help,” the agency said. After the demo finished, one of the researchers explained to the room that speaking to a virtual avatar about mental health issues could be ideal for military personnel who are taught to be and act tough, leaving them too ashamed to open up to another human being.

Something like Ellie could be useful to the military in other ways, too. To identify and help all current and former personnel with PTSD would be a massive undertaking. Estimates from the US Department of Veterans Affairs suggest that between 11 and 20 percent of the 2.7 million service members who deployed to Iraq and Afghanistan—roughly 300,000 to 540,000 people—suffer from the disorder in any given year. Of those, DARPA says that only a small fraction seek help. It’s difficult to imagine recent administrations deploying the battalions of people—therapists, trainers, outreach personnel—needed to find and care for half a million or more people with PTSD. Automation, of the kind represented by Ellie, seems to hold out the possibility of treating mental health problems at scale, or even keeping soldiers on active duty for longer periods. If successful, computerized therapy could also be applied in other circumstances where human-to-human treatment is undesirable or impractical—such as in the midst of a pandemic.

Behind this possibility lurks a larger vision, too. Though the Ellie program is in some ways crude, it seems to herald a future system that can continuously track, report on, and support users’ mental health. At the time of the demo, consumer devices like the Fitbit and Apple Watch were being marketed on the basis of their round-the-clock monitoring and data-collection features for physical health—information which would yield life-improving insights and interventions, the companies behind these technologies implied. More recently, researchers affiliated with Amazon published a paper describing efforts to determine a user’s emotional state from their voice. If an Amazon Alexa divined you were upset, it could ask what was wrong—and maybe upsell you on some indulgent self-care items. Supporting mental health could be one more reason to justify the ambient collection and interpretation of vast streams of data from our bodies and behavior.

After seeing the DARPA demo, I was unsettled by the idea of an emotionally aware technology ecosystem constantly reporting back to companies or governments about our mental states, and then trying to intervene in them. But the thing I kept coming back to most often was the avatar of Ellie, sitting in her chair with her hands folded in her lap, calmly interviewing an actual human being with a potential mental illness. As a designer and writer of video games, I know that well-crafted interactive digital characters can elicit deep emotions from players, causing changes in their mood and outlook, just as powerful works in any medium can. Until I encountered Ellie, though, I hadn’t imagined what it would mean for people to share their most private thoughts and feelings with a machine. I wondered whether this artificial interaction could actually help people change, or even heal. So, in a spirit of curiosity, I set out to create a sort of Ellie of my own.

An Algorithm for Thoughts

When I began researching computerized therapy, virtual mental health care was already a booming category—and that was before the world was struck by the coronavirus. Following the outbreak of COVID-19, the possibility of inexpensive, scalable virtual mental health tools may very well become a necessity. Social isolation, unemployment, pervasive uncertainty, death—the pandemic and society’s response to it have created a wave of emotional distress while at the same time stripping millions of people of their jobs, healthcare, and access to therapy. “With the coronavirus pandemic causing unprecedented levels of stress and grief, companies offering virtual mental health care say they’re seeing a massive surge in interest,” the medical news site STAT recently reported.

There’s currently a bewildering array of mental health apps to download, with names such as Calm, Happify, and MindShift. Some of these programs are simply collections of user-directed exercises (“write down how you felt today”). Some seek to connect clients with human therapists over video or text chat. Others contain characters like Ellie. But most of them claim to implement elements of cognitive behavioral therapy, or CBT, a kind of therapy that emerged in the 1960s to become one of today’s most heavily used and studied methods of treating depression, anxiety, and PTSD. For many years, CBT has been considered the most effective treatment for such disorders. One recent meta-analysis of mental health app trials found that over 80 percent of them used CBT in one form or another.

CBT is a heavily “manualized” form of therapy, meaning that it expects providers to stick closely to highly specific, almost scripted interactions with clients. Providers teach clients to recognize negative thoughts and “dispute” them in order to reduce their effect and replace them with positive ones. If the recurrent thought “I’m destined for failure” runs through a patient’s head, for example, CBT prompts them to systematically dispute the thought with alternative interpretations: Is this, perhaps, an exaggeration? Is there really any evidence that this is true? It’s akin to running an algorithm on a thought until its power diminishes and eventually disappears.

As a result, a common complaint from those on the receiving end of CBT is that it is cold, mechanical, and lacks empathy. Writing for Vice about the game I eventually made about a virtual CBT therapist, the critic Rob Zacny said he recognized in it “that sense of emotional whiplash that comes from finally starting to confide something serious and scary, only to be met with weirdly programmatic responses from well-meaning counselors.” But the mechanistic quality of CBT also makes it especially well-suited to be supplemented by, or turned entirely into, a computer program. If your human therapist already acts and sounds like an automaton, what would be lost by replacing him with one?

But virtual characters can only be effective if on some level we believe or buy into them. In a 2016 article for the Guardian about new questions surrounding CBT’s efficacy, the journalist Oliver Burkeman spoke with a woman who had experienced postnatal depression. “I don’t think anything has ever made me feel as lonely and isolated as having a computer program ask me how I felt on a scale of one to five, and—after I’d clicked the sad emoticon on the screen—telling me it was ‘sorry to hear that’ in a prerecorded voice,” she told him. What she realized she needed, Burkeman writes, was “real connection: that fundamental if hard-to-express sense of being held in the mind of another person, even if only for a short period each week.”

Patients aren’t the only ones who have recognized this need. In 1976, a computer scientist at MIT named Joseph Weizenbaum declared the notion that “a computer can be programmed to become an effective psychotherapist” was nothing but “perverse.” This was a renegade position for the MIT of the era, where a powerful belief in the capabilities of artificial intelligence reigned. Weizenbaum was well aware that he was committing apostasy, but his assault on the idea of computerized therapy was all the more forceful because he had created the first virtual therapist himself.

Alien Worlds

The ELIZA program that Weizenbaum created, which he released in 1966, is often called the world’s first chatbot. Cleverly, Weizenbaum designed ELIZA around the stereotype of a psychotherapist’s propensity to turn a client’s statement into a question. (People don’t like me. Why do you think people don’t like you?) This was an elegant way to create the effect of a computer holding its own in a conversation with the user.
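To make the trick concrete, here is a minimal sketch in Python of turning a statement into a question. It is an illustration only, not Weizenbaum's program: the real ELIZA worked from a script of keywords and transformation rules, while this version assumes a single fixed question template and a small, hypothetical table of pronoun swaps.

```python
# Hypothetical first-person -> second-person swaps, chosen for illustration.
# This is not ELIZA's actual script.
PRONOUN_SWAPS = {
    "i": "you",
    "me": "you",
    "my": "your",
    "mine": "yours",
    "myself": "yourself",
    "am": "are",
}


def reflect(statement: str) -> str:
    """Echo a client's statement back as a question."""
    # Drop surrounding whitespace and trailing sentence punctuation.
    text = statement.strip().rstrip(".!?")
    words = text.split()
    if not words:
        # Nothing to reflect, so fall back to a noncommittal prompt.
        return "Please go on."

    # Swap first-person words for second-person ones; leave the rest alone.
    swapped = [PRONOUN_SWAPS.get(word.lower(), word) for word in words]

    # Lowercase the first letter so the statement reads as part of the
    # question. (A proper noun in first position would be lowercased too;
    # this sketch does not handle that case.)
    swapped[0] = swapped[0][0].lower() + swapped[0][1:]

    return "Why do you think " + " ".join(swapped) + "?"


print(reflect("People don't like me."))
```

Given the statement from the example above, the sketch prints "Why do you think people don't like you?" Nothing in it models what the sentence means; it only rearranges the words it was handed.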

Of course, ELIZA, which Weizenbaum named after Eliza Doolittle, the character in the George Bernard Shaw play Pygmalion, possessed no real understanding of what its human partner was saying. As the information technologist Ted Nelson wrote of the program in his classic 1974 book about hacker culture, Computer Lib / Dream Machines: “Now, do not be fooled. There is no Ghost in the Machine. The program does not ‘understand’ the user. THE PROGRAM IS MERELY ACTING OUT THE BAG OF TRICKS THAT JOE WEIZENBAUM THOUGHT UP.” The explanation (and emphasis) was necessary because ELIZA’s noncommittal responses were surprisingly good at convincing users either that it did understand them, or that there was actually another human being on the other side of the teletype.

Indeed, ELIZA sometimes seemed to perform flawlessly, as in this example from Weizenbaum’s original 1966 paper describing his creation:

Men are all alike.
IN WHAT WAY
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I’m depressed much of the time.
I AM SORRY TO HEAR YOU ARE DEPRESSED
It’s true. I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.

But it is also easy to have a conversation with ELIZA that goes nowhere:

HOW DO YOU DO. PLEASE TELL ME YOUR PROBLEM
I have so much work to do.
THAT’S QUITE INTERESTING
Not really.
THAT IS INTERESTING. PLEASE CONTINUE
It’s not.
DO YOU FEEL STRONGLY ABOUT DISCUSSING SUCH THINGS

It usually becomes clear to modern users within minutes how fragile ELIZA’s illusion of real conversation is. In the late 1960s, however, the program left a lasting impression. The astrophysicist and science popularizer Carl Sagan, in a 1975 essay for Natural History magazine, predicted a time in which such therapy would be commonplace, delivered through “a network of computer psychotherapeutic terminals, something like arrays of large telephone booths.” Even established mental health professionals began to seriously consider the idea of computerized therapy. Perhaps most notable was the psychiatrist Kenneth Colby, who later developed his own mental health chatbots and once told a reporter, “after all, the computer doesn’t burn out, look down on you, or try to have sex with you.”

Weizenbaum himself believed that ELIZA only demonstrated that computers did not have to actually understand anything in order to mimic everyday conversations. The year after ELIZA was released, his more famous colleague at MIT, Marvin Minsky, declared that “within a generation, the problem of creating ‘artificial intelligence’ will substantially be solved.” But ELIZA helped Weizenbaum to develop a more skeptical view of computer science, and of the relationship between computer and human intelligence.

Weizenbaum argued that even if a future computer were powerful enough to perform automated therapy, it would still be wrong. Human intelligence and computer logic are fundamentally different processes and wholly “alien” to one another, he wrote in his 1976 book Computer Power and Human Reason. As Zachary Loeb explains in the introduction to Islands in the Cyberstream, a posthumously published interview with Weizenbaum, “Computers excelled at tasks involving quantification, but for Weizenbaum there was much about human beings that simply could not be quantified.” As tempting as it was for computer scientists to believe that computers could model the world around them or human thought, in truth they could only create their own separate reality. “If Weizenbaum called for renunciation of the computer, in certain contexts,” Loeb continues later, “it was because the embrace of the computer in all contexts had led to a renunciation of the human.”

In her paper “Authenticity in the age of digital companions,” Sherry Turkle, a fellow MIT professor who taught classes with Weizenbaum, recounted how ELIZA’s reception informed his stance:

Weizenbaum found it disturbing that the program was being treated as more than a parlor game. If the software elicited trust, it was only by tricking those who used it. From this viewpoint, if Eliza was a benchmark, it was because the software marked a crisis in authenticity: people did not care if their life narratives were really understood. The act of telling them created enough meaning on its own.

Perhaps this is the most salient question when it comes to automated therapy: can the act of telling be separated from the act of listening? Does simply speaking create enough meaning on its own? If being listened to, recognized and understood by another human being is the source of therapy’s power, then there is no way bundles of code—such as ELIZA, Ellie, or the fictional app in the game I eventually created—could work. Automated mental health care would be built on a faulty premise about what therapy should be and what computers can ultimately do.

Talking to the Dog

Almost all of today’s mental health apps are built on that potentially flawed premise. Many take the form of an instant message–like interface with a friendly character, making comparisons to ELIZA difficult to avoid. One prominent example is Woebot, which boasts Stanford-linked founders and closed an $8 million funding round in 2018. As with its distant ancestor, Woebot’s lack of genuine human understanding is easily revealed. Telling the program I was feeling anxious about the fact that “I have so much work to do” resulted in this stilted interrogation, which reads as if it comes straight out of a CBT manual:

Are you perhaps catastrophizing in your statement?
No
Are you Future Predicting with your thought, “I have so much work to do”?
No
You’re doing awesome thinking through this by the way… we’re nearly there
Are you ‘Mind-reading’ in your thought, “I have so much work to do”?
No
How about black-or-white thinking in this thought: “I have so much work to do”?
Nope
Got it
And is there a chance that you may be mistaking your feelings as evidence of the truth?
No
OK Matthew, you did a great job there…
Now for the most important bit… rewrite “I have so much work to do” without those distortions.
I have so much work to do
So how did your rewrite go?

The smarmy attitude and rigid dialogue of apps like Woebot make one sympathetic to Weizenbaum’s view that computerized therapy is inhumane. At the same time, I couldn’t help but notice again the similarity to my own work as a game developer. Using a bag of tricks not unlike the ones Weizenbaum used to create ELIZA, many games invite players to believe that digital characters are friends who can perceive and understand them in some way. Combinations of animations, sounds, and contextually appropriate behaviors create illusions not only of life, but of genuine communication between the alien worlds of humans and computers.

Taking advantage of “the human propensity to suspend disbelief in the presence of a persuasive dramatic presence” was the “true source of Eliza’s power,” the digital media pioneer Janet Murray wrote in her groundbreaking 1997 book Hamlet on the Holodeck. In the world of games, these persuasive dramatic presences are (usually) clear about what they really are—characters. Game developers still rely on audiences to voluntarily buy into them at some level, in a parallel to the way the live theater becomes reality, though only temporarily. The actor who was stabbed to death during the play returns to the stage at the curtain call; the gamer eventually takes off their VR headset. What happened inside the magic circle was certainly not “true,” but was also, in a way, real, and exerted a powerful effect on those who experienced it.

Might this be all that is needed for automated psychotherapy to work? In the nearly fifty-five years since ELIZA first appeared, very few digital characters have made an attempt to understand their users in any real way. The reason might be that they simply don’t have to in order to be compelling. And in that case, does it really matter if a therapy program understands its user or not? One of the principal investigators on the project that created Ellie, the computer science professor Louis-Philippe Morency, compared the experience to talking to a pet. “Some people talk to their dogs,” he said, speaking to a journalist for the Atlantic. “Even though the dogs don’t understand it ... I think there’s a little bit of that effect—just talking with someone makes you feel better.”

Aside from the fact that dogs aren’t, so far as we know, uploading their data anywhere, this is an important consideration. If the feeling of connection is all that’s needed for successful therapy, then human-to-human interaction may be superfluous. As Turkle asks, “If a person feels understood by an object lacking sentience, whether that object be an imitative computer program or a robot that makes eye contact and responds to touch, can that illusion of understanding be therapeutic? What is the status—therapeutic, moral, and relational—of the simulation of understanding?”

Between a Novel and a Game

Perhaps because of the shifting priorities of a new administration, DARPA’s interest in mental health has waned since 2014, and it isn’t currently funding research into goals like treating PTSD. But witnessing Ellie at work caused me, over the course of the next several years, to design and draft a video game about the experience of computerized psychotherapy. By early 2019, the world I had slowly been building took its final shape: a present-day Seattle not so different from the real one, where gig workers called “proxies” read prompts given to them by “Eliza,” a cloud-based, AI-powered virtual therapist program developed by a fictional tech giant. The company names the product after Weizenbaum’s program in an attempt to ride its notoriety while, in true tech giant fashion, ignoring its creator’s misgivings.

Players guide the game’s main character, Evelyn Ishino-Aubrey, as she takes a job as one of these proxies, performing therapy sessions for a wide variety of people. As a proxy, all she can do is read what the Eliza algorithm has told her to read. Evelyn is warned multiple times not to deviate from the script, priming savvy gamers to believe they will immediately be able to do just that. But Evelyn performs session after session without deviating. For most of the game, I lock the character into being a “good” proxy, even at the expense of upsetting players’ expectations. If Evelyn expresses doubts, she does so privately.

I deny players choice for such a long time because I couldn’t accept the idea of a game where different phrases uttered by a therapist would be considered better or worse for a patient. It’s easy to imagine a game in which a client comes in with certain combinations of conditions, and the challenge for the player is to pick the right things to say to match those conditions. But that would be to accept in advance the notion that a certain type of person needs to hear a certain type of thing in order to be guided to the best possible outcome—an assumption encoded in materials like the CBT manuals I had studied. I wanted to create therapy clients who were irreducible individuals, real fictional human beings. So in opposition to prevailing game design wisdom, I ended up with something in between a novel and a game, something that requires players to be patient and to give up control.

Between sessions, Evelyn has lengthy conversations with people who work, or used to work, on Eliza. Some believe the product is working well; others very much don’t. Some see it only as a ladder for their own ambitions in the tech world and beyond. Evelyn must also confront her own past and struggle with depression. As the story of the game concludes, it allows players to decide the best way for Evelyn to move forward in life. Each choice includes some compromise. She could continue to contribute to Eliza. She could take her newfound interest in therapy to pursue a career as a traditional human-to-human therapist. She could turn her back on all of it. This is the first and last major choice the game asks players to make.

A human player playing a virtual human (Evelyn) whose choices are circumscribed by a computer program (Eliza) which is at once both real and a fiction created by me, its human developer: my hope is that these layers of human-computer interaction help to challenge the simple binary between warm and empathetic human-to-human therapy and the encroachment of cold, computerized systems. The reality, of course, is more complicated. Under the manualized framework of CBT and similar modes of treatment, even human therapists act robotically, whereas in ELIZA, a rudimentary but surprisingly expressive computer program emulated a much more humanistic, open-ended style of psychotherapy. It is not about a “computer” approach versus a “human” approach—it is about an indifferent versus a compassionate one.