Do No Harm

In early 2017, an investigative journalist uncovered a private Facebook group called Marines United, where hundreds of veterans and active-duty marines were circulating nude or invasive photos of military servicewomen without their knowledge. “Dozens of now-deleted Google Drive folders linked from the Facebook page included dossiers of women containing their names, military branches, nude photographs, screenshots of their social media accounts and images of sexual acts,” the journalist, Thomas Brennan, later wrote for the news site Reveal.

The Marines United scandal stood out for the coordinated nature and scale of the abuse, but it was only one egregious example of how toxic social media platforms have become. According to a study by the Pew Research Center in 2017, 41 percent of Americans have personally been subject to abusive behavior online, and one in five have been the targets of particularly severe forms such as sexual harassment, stalking, revenge porn, physical threats, and sustained harassment over time. Those who experience online harassment suffer from mental or emotional stress or even fear for their personal safety, and the stakes are particularly high for young internet users: other studies have found a significant association between cyberbullying and depression, self-harm, and suicidal ideation in people aged 12–18.

Following the Marines United revelations, Facebook quickly shut down the private group, but similar ones immediately began cropping up on the platform. Most major social media platforms ban sexually explicit photos, particularly those flagged as non-consensual, so if a victim sees the content they can report it. But this doesn’t keep the images from being shared on private groups where they are less likely to be reported. As a result of mounting pressure from advocacy groups for victims of sexual assault, Facebook promised to take a more active role in addressing such harassment.

The result, which Facebook began rolling out in April 2017, and which now includes partnerships with nine organizations across eight countries, is one of the most proactive efforts by any of the social media companies to address online abuse. And yet, despite the amount of time and money that Facebook has spent on the program, the whole thing is ultimately doomed to fail. It’s a revealing failure, though, because it points to fundamental limitations in the way that social media companies think about—and have encouraged the rest of us to think about—the problem of online harm.

The Narrowsight Board

Facebook’s process for addressing revenge porn and other non-consensual sexual images requires victims to upload their images to the company, where a “specially trained” employee reviews the image and then creates a digital fingerprint of it. This allows image-matching software to detect if the same photo appears elsewhere on the site or is later uploaded again. This can potentially help some victims, but it has major shortcomings. It assumes that the victim has access to the non-consensual image, and it requires that the victim trust Facebook with extremely sensitive content. The image-matching software is also remarkably easy to fool; slight alterations to an image, such as changing the background, have been shown to elude the technology.
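To make the fragility concrete, here is a minimal sketch of fingerprint matching. It is an illustration under stated assumptions, not Facebook's actual system, whose details are not described here. The toy 8x8 grayscale "image", the `average_hash` function, and the `max_distance` threshold are all hypothetical. The sketch shows two things: an exact cryptographic fingerprint breaks on a one-pixel change, and a more tolerant "perceptual" fingerprint survives that change but can still be thrown off by swapping the background.

```python
import hashlib


def make_image(background, subject):
    """Toy 8x8 grayscale image: a 4x4 'subject' square on a flat background."""
    return [
        [subject if 2 <= r <= 5 and 2 <= c <= 5 else background for c in range(8)]
        for r in range(8)
    ]


def exact_fingerprint(pixels):
    """Cryptographic digest of the raw pixel bytes; any change alters it."""
    raw = bytes(value for row in pixels for value in row)
    return hashlib.sha256(raw).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_match(hash_a, hash_b, max_distance=4):
    """Treat two images as the same if their hashes differ in only a few bits.
    The threshold of 4 is an arbitrary value chosen for this illustration."""
    return hamming_distance(hash_a, hash_b) <= max_distance


original = make_image(background=20, subject=200)

# A barely visible retouch: one background pixel made slightly brighter.
retouched = [row[:] for row in original]
retouched[0][0] = 25

# The same subject pasted onto a bright background.
new_background = make_image(background=240, subject=200)

print("exact match after retouch:",
      exact_fingerprint(original) == exact_fingerprint(retouched))
print("perceptual match after retouch:",
      is_match(average_hash(original), average_hash(retouched)))
print("perceptual match after background swap:",
      is_match(average_hash(original), average_hash(new_background)))
```

Production systems use far more robust perceptual hashes than this toy one, but the trade-off is the same: the matcher has to tolerate small edits without flagging unrelated images, and an edit that falls outside that tolerance goes undetected.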

More importantly, the technology takes control away from the victim by assuming that deleting the images automatically is all that a victim wants. The victim never learns whether anyone else has tried to upload the images and has no proof for further action. Platforms even routinely ignore “preservation letters” from lawyers of victims of revenge porn, and delete crucial evidence.

At the most basic level, Facebook’s process, like other attempts to address online harm, suffers not from faulty algorithms, but from a crucial misrepresentation of the problem. Social media companies have construed a wide range of online harms as essentially problems of content (violating photos, violent or threatening posts, Nazi symbolism). As a result of this framing, the solution to online harm has largely been presented by these companies as “content moderation”: removing posts that a platform deems against the rules or toxic, and occasionally banning the user who posted the content.

Social media companies have a strong incentive to adopt the content moderation framework, which was originally developed to minimize spam, and all of the large social media companies moderate content to some extent. That’s because the quality of their platforms would spiral downward if they didn’t. Imagine logging into Facebook and seeing an unabated stream of violent images and junk messages—you probably wouldn’t want to log in again. Most advertisers don’t want their ads to show up next to such content either. A decline in user engagement and ad sales is bad for a social media company’s bottom line, and removing potentially offensive content is the cheapest way to ensure that doesn’t happen.

But although it might maximize profits, this way of thinking about online harassment is almost entirely unable to address the harm that harassment causes. It assumes that the problem is individual pieces of harmful content that must be moderated—not people and their relationships. As a result, content moderation fails to serve the needs of those who are harmed online or to change the conditions that make such harm possible.

Once the problem of online harm is framed as content moderation, it is already a lost cause for victims. Inevitably, platforms claim that the sheer amount of content makes it impossible to monitor. This is true, but it conveniently leaves out the fact that every single decision made by platforms prioritizes scale, and platforms generally avoid taking actions that might reduce user engagement. At the same time, as the scholar Sarah T. Roberts has detailed, they strive to minimize costs, especially for things like human moderators.

If platforms accept that something needs to be done about online harm, but frame the problem as needing simply to detect and remove content that breaks platform rules, the logical next step becomes automating this process. When Mark Zuckerberg was questioned by Congress in 2018, he mentioned “artificial intelligence” more than thirty times, framing it as “the scalable way to identify and root out most of this harmful content.”

Many experts doubt that AI will ever be able to apply content moderation rules effectively by itself. Even if it could, abusers quickly adapt to automatically enforced content moderation rules, for instance by editing non-consensual sexual images to bypass AI detection, by creating fake porn videos of victims using deepfake techniques, or by coordinating attacks and using memes to mask their intent. What the vague promise of a forthcoming technological solution has been effective at doing is dampening calls for greater regulation of social media platforms.

The content moderation framework also severely limits our collective ability to push back against the way social media companies deal with harm on their platforms. Most platforms have no mechanism for contesting moderation decisions. In response to pressure from users and lawmakers, Facebook recently created an oversight board for content moderation, but because the problem itself is so narrowly defined, so is the solution. The oversight board can only review individual pieces of content, and only if the author of the content objects to its removal from the platform. This means that the oversight board has no influence over what content is allowed, how private groups are governed, or what the platform’s algorithms promote. Nor can it do anything to support people who have already been victims of harassment.

In order to do that, we need to think differently about the problem of online harm. Over the past two years I have been working with collaborators in communications, information studies, and computer science to imagine alternatives to the content moderation model. Our greatest resource has been abolitionist traditions that have challenged the criminal justice system as a means of dealing with offline harm, and which have been extremely influential in shifting public perceptions of policing in the wake of the police murder of George Floyd.

In particular, those rallying to abolish prisons have developed models of restorative and transformative justice that can expand our imaginations of what can be done to address harm more generally. In listening to members of online communities, my research group has found that what is needed is not more sophisticated ways to identify and remove offending content—just as we don’t need better ways of policing and imprisoning people—but ways of supporting survivors and transforming the societies in which harm happens, including our online social worlds.

Obligation and Restoration

Practices of restorative justice, which have their roots in Indigenous ways of repairing harm, are particularly useful to rethinking how we address online harm.

Restorative justice views harm not as a crime against the rules of the state, but as a violation of people and their interpersonal relationships. Violations create obligations, and the central obligation of restorative justice is to right the wrong. At minimum, restorative justice requires that we address the victim’s needs related to the harm; hold offenders accountable to right those wrongs; and involve victims, offenders, and communities in this process.

The primary tool of restorative justice is communication, and practitioners have developed structured modes of communication to take people through a process of reckoning with and repairing harm as much as possible. Although restorative justice has to be adapted to different communities and circumstances, its principles have been successfully codified into processes and formal training and embedded in churches, workplaces, and neighborhoods. In Oakland schools that have implemented restorative justice, students feel more respected by and connected to their peers, and detention rates have fallen to zero.

What would an approach to online harm grounded in restorative justice look like? My research group has conducted extensive interviews with restorative justice practitioners, as well as with the moderators of online communities on platforms such as Reddit and Discord.

As part of this work, we practiced speculative design: giving participants real-world scenarios of online harm and asking them to discuss the ways that those harms might be addressed, including assessing which scenarios were most likely to happen, most desirable, and most unexpected. Here is a basic table contrasting the kinds of questions that would be asked when using a content moderation framework versus one based on restorative justice:

Content moderation | Restorative justice
What content has been reported? | Who has been hurt?
Is the content against the rules? | What are their needs?
Should the content be removed, demoted, flagged, or ignored? | Whose obligation is it to meet those needs?

For example, in the case of non-consensual sexual images like the ones shared in the Marines United scandal, a restorative justice process might begin with the following questions:

Who has been hurt? The person whose private, intimate photos were shared without their consent.
What are their needs? Victims have a diverse range of needs, including: protection, if they are in immediate danger; validation; and contextual information (Are they being stalked? Have the images been shared? Where?). While “blocking” is often presented to victims as a way to deal with abusers, in practice many victims seek to actively monitor their abuser so that they are not caught off-guard. Some victims want the harm to be acknowledged by the person who committed it, or by the community. Many victims want the harm not to happen again, to themselves or anyone else.
Whose obligation is it to meet those needs? The obligations mentioned above fall to many people: the person who committed the harm, the wider community of people who know and care about these people, and the platform that provided the space for the harm to occur.

While the obligations of the person who committed the harm and of the wider community have been extensively discussed by restorative justice researchers and practitioners, the obligations of platforms are new and unique to online harm, and deserve special attention. Each case is different, and should be addressed individually and with care, but here are several possible approaches, which may need to be combined:

Assign a trained caseworker. In the most extreme cases of harm, such as sharing non-consensual sexual images, trained caseworkers should be assigned to the case. Facebook currently assigns a “partner” to someone who seeks to report revenge porn, but their role is only to instruct the person on how to upload their images to Facebook. Instead, the caseworker should support the victim, provide validation, share information on their options, and help them identify and activate a community of people that cares for them and can be involved in the restorative justice process.
Be trauma-aware. Current methods of addressing online harm, including Facebook’s revenge porn process, are detached from the reality of being a victim of sexual harm. For instance, sharing private, intimate photos with unknown Facebook employees can retraumatize the victim. An alternative could be giving victims the tools to create a digital fingerprint of the image themselves with the help of trained outside support groups, who could verify and share the symbolic identifier with Facebook without sharing the actual photos.
Support the harmer to take accountability. With the victim’s consent, their caseworker may reach out to the offender, have a conversation about the harm, and gather and provide information. At present, Facebook deletes the photo or prevents it from being uploaded, and in some cases it bans the offender with little explanation. This actively discourages responsibility on the part of the offender. But offenders need to be encouraged to understand the harm and to work towards repairing it. That might mean apologizing and working to not repeat the harm, becoming educated about the broader effects of the harm, or paying reparations for harms done in the past.
Stop the continuation of the harm. This may mean placing limitations on posting, or even removing the person who has committed harm from the platform. This does not mean that anyone who commits harm should be immediately banned forever, but does mean that removal from the space is a valid and possible consequence of harm. Removal should be done with care so that it is not simply a punitive response but is actually necessary to stop the harm and is done with explanations and an opportunity to engage with the process.
Create structures for continued accountability. Platforms should make addressing harm a priority before it occurs. This can mean setting the expectation, when members join, that sexual harm will not be tolerated, periodically reminding them of it, and explaining what accountability protocols exist to address harm. It can also mean training and supporting users to intervene in sexual violence. Facebook Groups currently rely on admins from the group to moderate it; the platform could mandate that groups above a certain size conduct a yearly review of their protocols for dealing with sexual harm, and this could include designating at least two point people to whom such issues should be reported and a process for addressing incidents.
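The trauma-aware alternative described above, in which only a fingerprint ever leaves the victim's device, can be sketched as a simple data flow. This is a hypothetical illustration, not an existing Facebook feature: the `FingerprintReport` and `PlatformBlocklist` names are invented for the example, and SHA-256 stands in for whatever fingerprinting scheme a real system would use. A real system would likely need a fingerprint that tolerates small edits.

```python
import hashlib
from dataclasses import dataclass


def fingerprint_locally(image_bytes: bytes) -> str:
    """Runs on the victim's own device; only the digest is ever shared."""
    return hashlib.sha256(image_bytes).hexdigest()


@dataclass(frozen=True)
class FingerprintReport:
    """What the support group forwards to the platform: no image, only an ID."""
    fingerprint: str  # hex digest computed on the victim's device
    verified_by: str  # hypothetical name of the vouching support group


class PlatformBlocklist:
    """Hypothetical platform-side store of reported fingerprints."""

    def __init__(self) -> None:
        self._blocked: set[str] = set()

    def accept_report(self, report: FingerprintReport) -> None:
        # The platform records the fingerprint without ever seeing the photo.
        self._blocked.add(report.fingerprint)

    def allows_upload(self, image_bytes: bytes) -> bool:
        # Each attempted upload is fingerprinted and checked against the list.
        return fingerprint_locally(image_bytes) not in self._blocked


# Placeholder bytes standing in for a real image file.
private_photo = b"placeholder bytes for a private photo"

report = FingerprintReport(
    fingerprint=fingerprint_locally(private_photo),
    verified_by="example support group",
)

platform = PlatformBlocklist()
platform.accept_report(report)

print("reported photo can be uploaded:", platform.allows_upload(private_photo))
print("unrelated photo can be uploaded:", platform.allows_upload(b"another image"))
```

The design choice that matters here is where the fingerprint is computed. Because that happens on the victim's side and a trusted outside group does the verifying, the platform never holds the image itself unless someone later tries to upload it.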

Communities of Care

Restorative justice approaches to online harm require resources, time, training, and labor. Each harm case—from Donald Trump threatening violence against protesters to trolls sending journalists violent memes—will be unique and require a creative response. It’s a revolutionary idea that victims of harm, who are often people with the least power in society, should be cared for, and we cannot expect today’s large social media companies to commit the resources necessary to do this of their own accord. But we can pressure them to, and we can create alternative platforms that do.

The project of content moderation is costly, complicated, and controversial. To move beyond it, platforms need to spend considerably more resources to hire experts who are trained in restorative justice, trauma response, community accountability, and anti-racism. They would also need to help fund outside organizations such as victim support groups. This would mean lower profit margins; it would also mean adapting protocols to different cultures and geographical regions.

All of this would be difficult and expensive to achieve at the scale of current platforms, which is an argument for breaking those platforms down into smaller online communities governed by the users themselves. If companies whose main activity is to host user-generated content can’t effectively address harm on their platforms, maybe they shouldn’t have such high profit margins—or maybe they shouldn’t exist at all.

My research group has begun to practice some aspects of restorative justice in online communities in coordination with the moderators of those communities. Pre-conferencing, which involves one-on-one conversation between the mediator and different people involved in the harm, is often the first step of a restorative process. In order to get a deeper understanding of the types of harm that happen, the needs of those who are harmed, and what potential next steps could look like, we are currently conducting pre-conferencing interviews with people who have been harmed in online gaming communities, those who have been banned from certain games, and moderators.

In building a just future, however, we cannot rely solely on the intervention of platforms, or on restoring justice one harm at a time. Even as we work towards restoring justice right now, our long-term aim must be to transform the societies in which harm occurs. This is the work of transformative justice, which was popularized by women and trans people of color as a way to address interpersonal violence and tie it to structural and systemic forms of violence. As the organizer and educator Mariame Kaba puts it: “I am actively working towards abolition, which means that I am trying to create the necessary conditions to ensure the possibility of a world without prisons.”

The future we should be working toward is one in which every single person has the skills to identify harm, hold themselves and others accountable, and work towards justice. At the same time, we must transform the social conditions, including patriarchy and racism, in which harm thrives. This kind of work leads us to fundamentally transform our relationships with one another, and it cannot be scaled or outsourced. When building a future that addresses online harm, we should not seek mere alternatives to content moderation; we should work towards a world where no content moderation is needed.