Given the scale that Twitter is at, a one-in-a-million chance happens 500 times a day. It’s the same for other companies dealing at this sort of scale. For us, edge cases, those rare situations that are unlikely to occur, are more like norms. Say 99.999 percent of tweets pose no risk to anyone. There’s no threat involved… After you take out that 99.999 percent, that tiny percentage of tweets remaining works out to roughly 150,000 per month. The sheer scale of what we’re dealing with makes for a challenge.
— Del Harvey, Vice President of Trust and Safety at Twitter, from “Protecting Twitter Users (Sometimes from Themselves),” March 2014.
Scale and size, though, are not the same thing at all. If we think about simple massification, we might talk about the difference between a single cell and ten million cells, but that’s not the same as talking about a human arm. Scale is a dimension along which different objects arise; the arm exists as an object at one scale, the cell at another.
—Paul Dourish, The Stuff of Bits: An Essay on the Materialities of Information (2017)
Social media platforms must moderate, in some form or another.
They must moderate to protect one user from another, or one group from its antagonists. They must moderate to remove offensive, vile, or illegal content. Finally, they must moderate to present their best face to new users, to their advertisers and partners, and to the public at large.
Whether they want to or not, then, platforms must serve as setters of norms, interpreters of laws, arbiters of taste, adjudicators of disputes, and enforcers of whatever rules they choose to establish. Having in many ways taken custody of the web, they now find themselves its custodians.
But the problem of moderation is not new. Broadcasters, booksellers, publishers, and music labels all grappled with the problem of being in the middle: not just between producer and audience, but between providing and restricting, between audience preference and public propriety. They have all had to set and enforce rules about what they will and will not make available.
Moderation is not new to the web either. From the earliest days, users disagreed about what online spaces were for. Someone always wanted to circulate porn; someone always looked for ways to harass others. So parameters of proper content and behavior were always necessary.
What is different about today’s social media platforms is that they host and oversee an unprecedented amount of content and an unprecedented number of people. As a result, they must moderate on a qualitatively different scale.
From Workshop to Factory
The most obvious difference is the sheer number of users, the sheer amount of content, and the relentless pace at which they circulate. It was once difficult to imagine how traditional media could review, rate, and police all of the US television programming running twenty-four hours a day across multiple channels. But that was a small challenge compared to what social media platforms must now do. As one policy manager at Flickr put it to me recently, “the scale is just unfathomable.”
At this size, certain approaches to content moderation are practically impossible. For instance, there is simply too much content and activity to conduct “proactive review,” in which a moderator would examine each contribution before it appeared. Instead, nearly all platforms have embraced a “publish-then-filter” approach: user posts are immediately public, without review, and platforms can remove questionable content only after the fact.
This means that everything, no matter how reprehensible or illegal, can be posted to these platforms and will be available until it is noticed and removed. Vile or criminal behavior may occur, and have its intended impact, before anything is done in response.
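The publish-then-filter sequence can be sketched in miniature. This is an illustrative toy, not any platform's actual pipeline; the function names, the queue, and the policy check are all assumptions made for the sake of the sketch.

```python
from collections import deque

published = []          # posts visible to everyone, immediately
review_queue = deque()  # user complaints awaiting a moderator

def post(content):
    """Publish-then-filter: content goes live with no prior review."""
    published.append(content)

def flag(content, reason):
    """Moderation begins only after the fact, when someone complains."""
    if content in published:
        review_queue.append((content, reason))

def moderate(violates_policy):
    """A moderator (here, any decision function) drains the queue,
    removing content only after it has already been visible."""
    while review_queue:
        content, reason = review_queue.popleft()
        if violates_policy(content, reason):
            published.remove(content)

post("vacation photo")
post("harassing message")
flag("harassing message", "abuse")
moderate(lambda content, reason: reason == "abuse")
```

Note that in this sketch the harassing message is public from the moment it is posted until a moderator reaches it in the queue: removal is always a response, never a precondition.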
There is always something on a social media platform that violates the rules—and typically lots of it. Someone is being harassed right now. Plenty of porn, graphic violence, animal cruelty, and hate speech are available as you read this. They will remain there for hours, days, or years, because of the challenges of policing platforms as immense as these. Because social media platforms operate at this scale, we as a society are being asked to tolerate the fact that even content as universally abhorred and clearly illegal as child pornography can be and is available on our favorite platforms, if only briefly.
But the question of scale is more than just the sheer number of users. Social media platforms are not just big; at this scale, they become fundamentally different than they once were. They are qualitatively more complex. While these platforms may speak of their online “community,” singular, at a billion active users there can be no such thing. Platforms must manage multiple and shifting communities, across multiple nations and cultures and religions, each participating for different reasons, often with incommensurable values and aims. And communities do not independently coexist on a platform. Rather, they overlap and intermingle—by proximity, and by design.
At this scale, moderation techniques that might have fit smaller venues simply will not translate. For instance, the techniques of online community management are ill-suited to the scale of major social media platforms. Managing early online communities depended in part on community members knowing the webmaster, regulars knowing one another, and users sharing an accumulated history of interactions that provided the familiarity and trust necessary for a moderator to arbitrate when members disagreed. Tough cases could be debated collectively; policies could be weighed and changed by the community. The scale of the forum made self-government possible. But as these platforms have grown, traditional community management has become increasingly untenable.
This means that the approaches social media platforms take, toward not just content moderation but all types of information management, are tied to this immense scale. Content is policed at scale, and most complaints are fielded at scale. More important, the ways that moderators understand problems have been shaped by working at scale. As a content policy manager from Facebook noted:
The huge scale of the platforms has robbed anyone who is at all acquainted with the torrent of reports coming in of the illusion that there was any such thing as a unique case… On any sufficiently large social network everything you could possibly imagine happens every week, right? So there are no hypothetical situations, and there are no cases that are different or really edgy. There’s no such thing as a true edge case. There’s just more and less frequent cases, all of which happen all the time.
What to do with a questionable photo or a bad actor changes when you’re facing not one violation but hundreds exactly like it, and thousands much like it. This is not just a difference of size—it is fundamentally a different problem. For large-scale platforms, moderation is industrial, not artisanal.
Industrial moderation requires industrial-scale human resources. These include community managers who work at the platform, crowdworkers who perform tasks farmed out to them, and users who avail themselves of complaint mechanisms.
The labor that platforms put toward moderation, and the labor that we as users are conscripted to perform, are not just part of how platforms function—they constitute it. Platforms are made by the work that goes into content moderation. They are not platforms without it. But because this work is distributed among different labor forces, because it is unavailable to public or regulatory scrutiny, and because it is performed under high-pressure conditions, there is a great deal of room for slippage, distortion, and failure.
In their earliest days, many platforms did not anticipate that content moderation would be a significant problem. Some began with relatively homogenous user populations who shared values and norms with one another and with the developers—like back when “TheFacebook” was only open to Ivy League university students. Many of the social norms that first emerged were familiar from college life, and the diversity of opinions, values, and intentions would be attenuated by the narrow band of people who were even there in the first place. Facebook, when it began, relied on Harvard students as volunteers, until the backlog of user complaints reached the tens of thousands. The company made its first permanent hire for content moderation in late 2005, almost eighteen months after launching.
As these sites grew, so did the volume and variety of concerns coming from users. Platforms experienced these in waves, especially as they grew in cultural prominence, changed dramatically in their demographics, or expanded to an international audience. Some tried to address these concerns the same way that online discussion groups had: through collective deliberation and public rule-making. But this proved increasingly difficult.
So the major social media platforms found that they had to expand their internal content policy groups. Today, most platforms have a team charged with overseeing moderation. The team sets the rules of the platform, supervises their enforcement, adjudicates the particularly hard cases, and crafts new policies in response.
At the scale at which most platforms now operate, however, these internal teams would be insufficient by themselves. Back in 2009, 150 of Facebook’s then 850 employees—based in California and in Dublin, Ireland—handled moderation, one click at a time. But as Facebook expanded, such an artisanal approach became infeasible. So platforms turned to outsourcing much of the front-line work of content moderation to crowdworkers.
As recently as 2014, Twitter was still claiming, “Every report by a user is reviewed by a member of Twitter’s Trust and Safety team.” Even for Twitter, which has leaner rules than similar-sized platforms, this claim is hard to believe, at least if it meant review by the company’s handful of permanent employees devoted to content moderation.
But Twitter, like many social media platforms, now employs a substantially larger group of people to provide a first wave of review, beneath the internal moderation team. These workers might be employed by the platform itself, either at its headquarters or in satellite offices located around the world in places like Dublin and Hyderabad.
But more commonly they are hired on a contract basis: through third-party outsourcing firms such as Accenture or TaskUs, or as on-demand labor through crowdwork services such as Amazon’s Mechanical Turk or Upwork—or both, in a two-tiered system. Crowdworkers are now used as a first-response team, looking at flagged posts and images from users and making quick decisions about how to respond.
Human review must happen fast. “Fast” can mean mere seconds per complaint—approve, reject, approve—and moderators are often evaluated on their speed as well as their accuracy, so they must keep up the pace. Each complaint thus gets just a sliver of human attention, and the moderator is under enormous pressure to be responsive not just to the current complaint but to the queue of complaints behind it. To speed things along, and to protect users’ privacy, the images and posts are also usually detached from their original context.
Users expect the judgments of moderators to weigh competing values, show cultural sensitivity, and appear fair and consistent. This is hard to achieve when the content is distilled from its context, and the labor is performed under the intense pressure of an endless queue of complaints.
But crowdworkers aren’t the only labor pool that platforms can draw on to perform moderation work. Platforms have another resource close at hand: the users themselves.
Most platforms invite users to “flag” problematic content and behavior, generating a queue of complaints that can be fed to their armies of crowdworkers to adjudicate. Flagging puts the work of review right at the point of offense, in front of those most motivated to complain. Even if moderators can’t be everywhere, users can. If internal content policy teams cannot anticipate emerging problems, users can serve as their nerve endings.
Flaggers may also help an internal team better understand the norms of the community. Content that is not flagged will remain, even if members of the team might have worried about it had they encountered it. Similarly, they may look twice at materials they think are acceptable but that users keep flagging. This becomes especially useful as a site expands into different countries: the population of possible flaggers automatically includes people with the necessary linguistic and cultural expertise.
Logistics and Translations
Each of these three labor forces—the internal team, the crowdworkers, and the flaggers—is an answer to the problem of moderation at scale. Each provides a way either for a few people to do a great deal with limited resources, or for many people each to do a little, together.
The challenge of content moderation, then, is as much about the coordination of work as it is about making judgments. What the press or disgruntled users might see as mistaken or hypocritical might be the result of slippage between these divisions of labor—between what is allowed and what is flagged; between how a policy is set and how it is conveyed to a fluctuating population of crowdworkers; between how a violation is understood in one cultural climate and how it is understood in another; between what does trigger a complaint and what should.
The challenges of coordinating the distributed and decentralized work of moderation are probably most acute at the points of contact. Between flaggers and crowdworkers, and between crowdworkers and policy teams, are membranes across which must flow expressions of principle in one direction and expressions of concern in the other. At these points of contact, expressions get rewritten, translated into a new form that is meant to both retain the meaning and fit it to the work that will respond to it. These translations can introduce distortions, ambiguities, and even new meanings.
For example, enlisting the crowd to police itself can only work if the concerns of users can be transformed into manageable bits of data. When a user flags some problematic content, the character of their concern must be reduced to fit the data entry format provided: the flag itself, its particular submenus, time codes, and so on. This takes an affective, socially loaded, and meaningful expression of a user, scrubs it of emotion and detail, and restates it.
What may have been “Ugh! That’s terrible!” becomes “flag :: hateful or abusive content :: promotes hatred and violence :: 5:40.” The data comes to stand in for the users and their objections. Something is lost, and something is added.
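The reduction can be pictured as a simple record. The schema below is invented for illustration—it is not any platform’s actual flag format—but it captures what such a data entry form retains and what it discards.

```python
from dataclasses import dataclass

@dataclass
class FlagRecord:
    """What survives of a user's objection once it becomes data."""
    category: str     # picked from a fixed menu, not written freely
    subcategory: str  # a narrower fixed menu
    timestamp: str    # where in the video the offense occurs

def reduce_complaint(category, subcategory, timestamp):
    """'Ugh! That's terrible!' goes in; a structured record comes out.
    The emotion, the context, and the detail have no field to live in."""
    return FlagRecord(category, subcategory, timestamp)

record = reduce_complaint("hateful or abusive content",
                          "promotes hatred and violence", "5:40")
```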
Similarly, when policy teams craft specific guidance for crowdworkers, they must translate the platform’s community guidelines into instructions that will suit the modular, rapid work of the reviewers. But these two documents are not congruent in their spirit, emphasis, or detail. Again, much is lost and much is added.
The Robots Aren’t Coming
In May 2016, TechCrunch reported a significant milestone in how Facebook handles moderation: “Facebook’s artificial intelligence systems now report more offensive photos than humans do.” This does not mean that Facebook is using automatic techniques to remove photos, as some of the press coverage that followed incorrectly assumed. The platform’s automatic detection software is designed to detect nudity, hate speech, and the like—to identify it and flag it for human review. Still, the fact that the majority of what is reviewed is flagged by software speaks to how important automated techniques for moderation are becoming.
Is AI the answer? Software-based detection techniques promise, first, to solve the problem of scale. Particularly for sites that are too vast, like YouTube, or that emphasize real-time communication, like Twitter, platforms would like to have moderation techniques that do not depend on direct and real-time human oversight, and that can immediately and automatically identify unacceptable content. Ideally, these automated methods could be paired with automated interventions: algorithmically identified porn or hate speech would be instantly removed or withheld from some users.
Automated detection isn’t easy, however. Arguably, it’s impossible, given that what constitutes an offense depends so critically on both interpretation and context. State-of-the-art detection algorithms have a difficult time discerning offensive content or behavior even when they know precisely what they are looking for. There are fundamental limitations that may be impossible to overcome: the lack of context, the evasive tactics of users, and the fluid nature of offense.
Machine-learning recognition tools typically measure their success in terms of their detection rate. One leading nudity-detection algorithm claims to detect nudity with 94 percent accuracy and a false positive rate of 5 percent—that is, it spots nearly all the nude images while misidentifying as nude only one of every twenty non-nude images. In 2014 Twitter purchased Madbits, which promised an algorithm that could identify NSFW (not safe for work) images, including porn, violence, and gore, with 99 percent accuracy and a 7 percent false positive rate.
Is 94 percent good? What about 99 percent? It depends on what the platforms hope to do with these tools. These are, from one vantage point, phenomenal achievements. But when it comes to culture and expression, even a few false positives can be a cause for real concern, depending on whether those errors are idiosyncratic or systemic.
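A back-of-the-envelope calculation shows what is at stake. The upload volume and the share of violating images below are invented assumptions, not reported figures; only the accuracy and false positive rates echo the claims above. Because violating content is rare relative to everything else, even a small false positive rate can wrongly flag far more innocent material than the violations it catches.

```python
# Illustrative base-rate arithmetic; the volumes are assumptions, not data.
uploads_per_day = 300_000_000  # assumed daily image uploads
nude_fraction = 0.01           # assume 1% of uploads actually violate

detection_rate = 0.94          # claimed: 94% of nude images caught
false_positive_rate = 0.05     # claimed: 5% of non-nude images misflagged

nude = uploads_per_day * nude_fraction
non_nude = uploads_per_day - nude

caught = nude * detection_rate                    # true positives
missed = nude - caught                            # false negatives
wrongly_flagged = non_nude * false_positive_rate  # false positives

print(f"violations missed per day:   {missed:,.0f}")
print(f"innocent images flagged/day: {wrongly_flagged:,.0f}")
```

Under these assumptions, roughly 180,000 violating images would slip through each day while nearly fifteen million innocent ones would be flagged—an error volume that dwarfs the violations themselves.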
The stakes for false positives and false negatives differ, depending on the context and audience. While marking the occasional baby photo as pornography is one kind of problem for the users involved, incorrectly identifying black skin in ways systemically different from white skin is a different kind of problem—a public problem about representation and equity, rather than a consumer problem about efficiency and inconvenience.
These platforms now function at a scale and under a set of expectations that increasingly demand automation. Yet the kinds of decisions that platforms must make, especially in content moderation, are precisely the kinds of decisions that should not be automated, and perhaps cannot be. They are judgments of value, meaning, importance, and offense. They depend both on a human revulsion to the horrific and a human sensitivity to contested cultural values.
There is, in many cases, no right answer for what to allow or disallow, except in relation to specific individuals, communities, or nations that have debated and regulated standards of propriety and legality. And even then, the edges of what is considered appropriate are constantly recontested, and the values they represent are always shifting.
Feeling Like a Data Point
No matter how they handle content moderation, what their politics and premises are, or what tactics they choose, platforms must work at an impersonal scale: the scale of data. Platforms must treat users as data points, subpopulations, and statistics, and their interventions must be semi-automated so as to keep up with the relentless pace of both violations and complaints. This is not customer service or community management but logistics—where concerns must be addressed not individually, but procedurally.
However, the user experiences moderation very differently. Even if a user knows, intellectually, that moderation is an industrial-sized effort, it feels like it happens on an intimate scale. “This is happening to me; I am under attack; I feel unsafe. Why won’t someone do something about this?” Or, “That’s my post you deleted; my account you suspended. What did I do that was so wrong?”
The press also tends to cover moderation at this intimate scale. Journalists challenge the actions of a behemoth company by pointing out a single violation, and telling the story of the people who were harmed by it. Compelling, human-sized stories fit the journalistic frame well, but they say little about the scale at which social media platforms operate.
It may be that this gulf can be bridged. Users may become more aware of this sociotechnical apparatus and their place in it; social media platforms might better recognize that impersonal interventions make users feel personally judged or silenced. On the other hand, it may be that these scales are simply irreconcilable. Interventions made at a data scale may always run up against the lived experience of those interventions, felt intimately by individuals.