Since the early months of the Covid-19 pandemic, María’s house has been run like a factory. Every day, her family of six synchronizes their routines so two people are always behind a computer. María, her husband Rodrigo, and their children, Daniela (20), Andrés (18), and Camila (13), are among the unknown number of Venezuelans who, after years of political and economic crisis exacerbated by the pandemic, now try to make a living by annotating data through crowdsourcing platforms. Using two Canaima laptops, which the Hugo Chávez government provided a decade ago for school children, they tag images and videos, transcribe text and audio, search for information online, and send videos and pictures of themselves to developers at companies and research institutions in Europe and North America. The developers use this data to train machine learning algorithms, like the ones that do facial recognition, moderate content, and guide self-driving cars.
The family’s activities all revolve around data production because this is their only source of income and, according to María, they have to “focus on the same objective to survive.” She and Rodrigo do most of the work, although she also takes care of many domestic duties. Camila, Andrés, and Daniela work part time on data annotation while attending high school and university. Only María’s youngest child, Sebastián (7), is able to focus exclusively on school. Although most crowdsourcing platforms’ terms of use state that each account must be run by a single adult, often the only hard requirement to set up an account is for someone to prove that they are at least eighteen years old by taking pictures of an identification card and their face, and praying that a third-party facial recognition verification system called Onfido detects a match.
The platform the family works for pays them a few cents per task, in cryptocurrency. They are only allowed to transfer the money to their online wallet once they have made at least the equivalent of ten dollars. After working every day of the week, they usually earn around twice that much, but recently they have barely made the minimum. “Last week, we couldn’t cash in,” María told me. “We couldn’t even make five dollars in total.” Her family dreads the day when the tasks stop coming, the computer breaks, or they lose access to the internet and electricity. Ofelia, another data annotation worker, who has diabetes, depends entirely on the platform to purchase insulin. “I would die without this income,” she told me. “I would literally die.”
Income from data annotation is essential to Ofelia, María’s family, and the other Venezuelans who do this work because hyperinflation has made the official monthly minimum wage in the country worth only a few dollars, which is not enough to afford staple foods to survive even a week. That has rendered most jobs paid in bolivars, the national currency, unsustainable. After years of economic mismanagement due to government corruption and economic dependency on oil, Venezuela faces a shortage of goods and services and inflation levels that are consistently among the highest in the world. This situation, combined with its existing internet infrastructure, has made the country an appealing target of crowdsourcing platforms. In the absence of a robust social safety net, workers often see these platforms as their most reliable source of income in US dollars.
Before the pandemic, María and her family were migrants in neighboring Colombia for a year. María worked at a beauty salon while her husband Rodrigo worked selling coffees in the streets. The children all studied in the public education system. These were difficult but more stable times for the family. When the pandemic hit, María lost her job and, with deserted streets, Rodrigo couldn’t find many clients. With no other choice, they decided to return to Venezuela. “Here we had to look for options, and a friend recommended the platform to us,” María said. When the pandemic stopped in-person teaching, it meant that her three eldest children were stuck at home too, and could also perform data annotation work. In dozens of interviews with platform workers in Latin America, many of whom are or were migrants, I have heard similar stories: they were collectivizing platform labor across their household members, with teenage children doing more and more work after the onset of the pandemic.
In these ways, the political and economic crisis in Venezuela, as well as the pandemic and remote schooling, have turned out to be productive for data annotation platforms, their clients, and the venture capitalists that back them. (These crises have also generated profits for companies selling information to carceral states: Onfido, the identity verification company used by electronic wallets, shares the identity and facial recognition data it collects with the United Kingdom police.) The thousands of companies and research institutions that develop artificial intelligence are using platforms to find cheap outsourced labor, especially from low-income economies, for global markets in which data and labor are sold as commodities. One of the results is a race to the bottom in which wages get lower and lower as competition between platforms—and their ability to find pools of ready labor even among people living in refugee camps—goes up.
The invisibility of the workers in this process, and the myth of “one user, one account,” which permeates the technology industry, are at the center of many tech companies’ business models; in many cases, they pretend that their products are entirely automated and devoid of human intervention. In fact, the most popular data-annotation platform in the United States and India, Amazon Mechanical Turk, is named after an eighteenth-century automaton that deceived spectators by seeming to play chess autonomously while concealing a human player inside. From the clients’ perspective, workers are just users or, even worse, less than robots: “To be successful at this job, you have to think like a machine,” said one of the platform administrators to Cecilia, a worker I interviewed.
Once we see through the single-user facade, we can begin to appreciate the ways in which workers and their networks have gamed data annotation platforms and collaborated to mitigate the crises they face. These survival tactics can be important resources for other workers and communities facing similar exploitation. At the same time, though, these tactics indirectly serve to prop up the neocolonial labor practices of the platforms, their clients, and venture capital. In order to challenge these larger forces, we need to do more than hold companies and research institutions accountable for the value they extract from “invisibilized” workers. More importantly, we need to support emerging community-based alternatives to data annotation platforms: alternatives built by the people actually subject to this highly extractive form of work.
Survival Tactics
In the suburbs of Valencia, in Carabobo State, data-annotation workers like Alfredo must increasingly rely on their own efforts to survive. “Our water comes from a pump,” Alfredo told me recently. “Every block of houses has a well, and every day the community designates someone to operate the pump to fill those wells.” The country’s ongoing economic and political crisis has caused the state and traditional businesses to become less present and effective in people’s lives; with little support from their local institutions or from employers, workers have increasingly had to rely on families and local communities for survival. Many of these communities manage water and waste disposal locally as common goods and services. These community-based services are necessary so workers can be ready for work, but they are not always robust or safe; for example, workers in communities that have had to resort to private waste incineration have been poisoned by smoke pollution.
Community support is also critical online, where workers share resources and organize with others through social media platforms like Facebook and Discord. Originally, the data-annotation platforms themselves created internal online groups to communicate with workers; however, these groups were heavily policed by moderators from the platforms. On one occasion, Roberto, a Black Venezuelan, wondered why Onfido’s AI could not manage to match his face and ID to validate his account. The moderators responded by expelling him from the online group. “I was astonished,” he told me. “I was expelled for asking a question!”
This kind of policing prompted many workers to form their own groups. Through my interviews, I found groups where workers would seek help learning how to complete tasks, complain about the platforms, and, on one occasion, organize a strike: members of a major Facebook group for data-annotation workers in Venezuela tried to convince their peers not to work for a few days, inspired by colleagues in the Philippines who, according to the organizers, successfully improved their wages by refusing to work. Despite the organizers’ efforts, most workers and their families were so dependent on the platform income that they couldn’t join the strike.
Through my interviews, I also found that some workers are part of smaller, closed groups on text-based apps like Telegram. These groups have a few dozen members, who pay fees of a few dollars per month. (It was through Rogelio, the administrator of one of these groups, that I interviewed María and her family.) These sorts of online professional associations were built for workers to help each other, to generate trust, and to access currency traders, who exchange virtual dollars and cryptocurrency for bolivars. In these groups, workers share resources, like bots that alert them when tasks are available, and guides that explain how to solve tasks more efficiently. I was told that one group pooled their savings to pay a programmer in Spain to code one of these bots for them.
Workers also use the groups to buy and sell accounts on the data-annotation platforms. New platform workers are not allowed by the platforms to perform many annotation tasks, and it takes a great investment of time to gain access to the best work. As a result, there is an informal market for the highest, “level 3” accounts, which are sometimes sold individually for roughly ten US dollars, or in packages of at least as many as ten.
Online worker groups can also transcend the virtual—for example, when a fellow member cannot work from home and needs a place to go, or when a member tries to take advantage of another one. A worker named Rodolfo told me about a colleague who refused to pay after receiving login details for ten platform accounts. “Hopefully, the moderators have personal information of every single member, including addresses,” he told me. “They contacted the seller physically and realized that she had lost access to electricity and couldn’t complete the transaction.” Trust in his fellow workers is essential in a context where online scams are common.
These forms of community support are vital to workers, but they also put the onus on workers to make the data platforms’ business models sustainable. Even then, workers are disposable: when a platform can no longer drive down wages in a particular country, it can simply look to other places and other crises.
Beyond the Visible
Since platforms are not usually physically and administratively present in the countries where workers are located, they can relocate quickly. I have conducted quantitative research on the web traffic of ninety-three crowdsourcing platforms and shown that some of those present in Venezuela are now targeting workers in Kenya. This repeats the same model used by non-data-annotation gig work apps, such as Uber, which launch in a country with incentives that make workers dependent on the platform and then remove those incentives once they create dependency.
Making workers and their communities more visible may be one way to demand that these platforms change their business models. But pointing out the collective exploitation at the heart of data annotation should also be used to pressure the platforms’ clients, including developers, management, and individual researchers, as well as investors, shareholders, and university administrators. In order to do this, we need more efforts to document the origins of datasets, third-party audits of AI models to assess their compliance with local and international labor standards, and assessments of the working conditions of platforms.
However, as the scholar Noopur Raval argues, making workers and their communities visible is not enough. Since the economic incentives for platforms and their clients will remain, major actors in the data-annotation pipeline need to be directed to change their practices through regulation. In addition, in place of exploitative multinational gig work platforms, companies and research institutions that require annotation need to support local initiatives, from unionization to cooperative ownership of locally created platforms.
In fact, there is a burgeoning ecosystem of platform companies owned and managed by workers. These companies have the potential to be more sustainable alternatives to the mainstream gig economy platforms. The Platform Cooperativism Consortium lists 506 co-op projects in thirty-three countries. Many have emerged in sectors like ride-hailing and delivery, but the market for cooperative data annotation remains untapped. It is also worth mentioning that local impact sourcing companies, where data annotation occurs on-site by employed workers, could represent a more reliable alternative, in terms of data quality and labor conditions, than many platforms currently in the market. Ultimately, only solutions that recognize the communal nature of work and economic justice can have transformative effects on the lives of workers like María and her family. Inequality in platform labor is not an issue of individual workers, but of networks of people who resist the paradox of a technology that innovates for some by exploiting others.