The Art of Eyeball Harvesting: Shengwu Li on Online Advertising

Can you start by telling us a little bit about your area of expertise?

I work on behavioral economics and game theory, so I study insights from psychology and build them into mathematical models of human behavior. In particular, I’m interested in what we call market design.

For a lot of the history of economics, we've taken as given the way that market institutions work. Market design turns that on its head. It says that instead of taking the rules of the market as given, we should look at how those rules are written.

There are any number of times where the government or a private company needs to set up a market. For example, the Federal Communications Commission (FCC) might be selling the rights for companies to broadcast over a certain band of the wireless spectrum. The agency needs to come up with a set of rules that determines who pays them what for which portion of the spectrum. That's a market design question.

How are big tech companies engaged in market design?

Google and Facebook get most of their revenue from selling advertising. So they’ve had to design a system that enables advertisers to transact with them.

The evolution of that system has been very ad hoc. It started at Google. Early on, the company realized that it needed a way to allocate the different ad slots that appear next to its search results to different advertisers. So it started looking at something called a “second-price auction.” This is a kind of auction that gets invented by William Vickrey in the 1960s.

At the time, there were two standard auction formats: the first-price auction and the ascending auction. The first-price auction is where we all simultaneously submit bids and the highest bidder wins. The ascending auction is the one we generally think of when we picture an auction: there's somebody with a gavel and the price keeps going up and everybody drops out except the highest bidder and that person wins.

Vickrey recognized that both types of auctions have their advantages. One obvious virtue of the first-price auction is that everybody can participate asynchronously at a distance. We don’t all have to be in the room bidding at the same time. Logistically, it’s very efficient.

On the other hand, one virtue of the ascending auction is that it’s easy to know how to bid. You watch the price go up and once it's above your value, you quit. It’s simple: you keep bidding at all prices below your value, and you quit at all prices above your value.

Whereas in a first-price auction, you really need to strategize. You need to say, “Okay, I guess this object is worth $1000 to me, but if I get it for $1000 then I walk away no better off than I was before, so I should put in a bid somewhere below $1000.” But how much below $1000? That depends on what I think the other bidders are going to do, and what the other bidders do depends on what they think I’m going to do. It’s potentially a very complicated calculation.

So in 1961 Vickrey writes a paper where he proposes a new kind of auction that combines the benefits of both formats. In this new kind of auction, we can all submit bids asynchronously and we don’t need to strategize. Here’s how it works: everybody bids, the highest bidder wins, and then the winner pays the second-highest bid. Thus the second-price auction was born.
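The rule described here (everybody bids, the highest bidder wins, the winner pays the second-highest bid) can be written in a few lines. This is a minimal sketch; the function name, the bidder names, and the tie-breaking rule are illustrative assumptions, not part of the interview.

```python
def second_price_auction(bids):
    """Return (winner, price) for a sealed-bid second-price auction.

    bids: dict mapping bidder name -> bid.
    Assumption: ties are broken in favor of whoever appears first in the dict.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # Sort bidders from highest to lowest bid (Python's sort is stable).
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


winner, price = second_price_auction({"ann": 1000, "bob": 700, "cat": 400})
print(winner, price)  # ann 700
```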

Why does that solve the problem of strategy? Why don't you have to strategize in a second-price auction?

The trick is to see that there is a neat isomorphism between the price you choose to quit at in an ascending auction and a bid that you place in a second-price auction. Instead of running an ascending auction and having everybody decide dynamically at the time when to quit, you can just ask everybody when they are going to quit in advance and call that their bid. The highest bidder will win and then pay the second-highest bid.
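One way to see the correspondence is to simulate a toy ascending auction in which each bidder has a fixed quit price, and compare the result with the second-highest quit price. This is only a sketch: it assumes integer prices, a price that rises one unit at a time, bidders who drop out as soon as the price reaches their quit price, and a unique highest quit price. The names are hypothetical.

```python
def ascending_auction(quit_prices):
    """Raise the price one unit at a time until a single bidder remains.

    quit_prices: dict of bidder -> integer price at which they drop out.
    Assumption: the highest quit price is unique.
    """
    price = 0
    active = set(quit_prices)
    while len(active) > 1:
        price += 1
        # A bidder stays in only while the price is below their quit price.
        active = {b for b in active if quit_prices[b] > price}
    return active.pop(), price


quit_prices = {"ann": 1000, "bob": 700, "cat": 400}
print(ascending_auction(quit_prices))           # ('ann', 700)
print(sorted(quit_prices.values())[-2])         # 700, the second-highest "bid"
```

Under these assumptions the ascending auction stops exactly when the second-highest bidder quits, so the winner and the price match what a sealed second-price auction would produce from the same numbers.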

This encourages you to submit a bid at however much you value the object. So if you value the object at $1000, it is better to bid $1000 than to bid any other amount. And that's true regardless of the other players’ behavior.
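The claim that bidding your value is never worse than any other bid can be checked by brute force on a small grid. The sketch below considers a single rival bid, since only the highest competing bid matters to your payoff, and assumes that ties go against you. The function names and the grid are illustrative.

```python
def payoff(my_bid, my_value, highest_rival_bid):
    """Payoff to one bidder in a second-price auction.

    Assumption: we win only with a strictly higher bid than the best rival.
    """
    if my_bid > highest_rival_bid:
        return my_value - highest_rival_bid  # pay the second-highest bid
    return 0


def truthful_is_weakly_dominant(my_value, grid):
    """Check every (rival bid, alternative bid) pair on the grid."""
    for rival in grid:
        truthful = payoff(my_value, my_value, rival)
        for alternative in grid:
            if payoff(alternative, my_value, rival) > truthful:
                return False  # some other bid did strictly better
    return True


print(truthful_is_weakly_dominant(1000, range(0, 2001, 50)))  # True
```

Overbidding only changes the outcome when it wins at a price above your value, and underbidding only changes it when it loses an object you would have gained from, which is why no alternative bid ever beats the truthful one here.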

What happens to Vickrey’s idea?

Mostly it gets ignored. No real auctioneers adopt this way of selling things. Then, about a half-century later, in the early 2000s, Google picks it up and dusts it off as they’re figuring out how to sell ads.

If you’re searching for car insurance on Google, a bunch of companies that sell car insurance want to place their ads next to those search results. Google needs to figure out a way to determine which companies get to put their ads where. Let's say there are three slots for ads on the page with the search results. There is the top slot, which is the most valuable because people see it first. Then there is the second slot below it, which is less valuable, and the third slot even further down, which is the least valuable.

Google decides it’s going to run an auction to determine which advertiser gets which slot. Google knows roughly how many clicks each slot will get. For example, they may know that the top slot gets clicked 300 times for every 10,000 views, the second slot 200, the third slot 100.

When advertisers submit their bids in the auction, they’re submitting their bid per click. They’re saying how much they’re willing to pay Google for each click they get. The highest bidder gets the top slot, the second-highest bidder gets the second, and the third-highest bidder gets the third.

The payment works in the following way. The top slot is worth 300 clicks. If you bid ten and I bid nine, you win the top slot and you’re going to pay my bid, the second-highest bid, times 300 clicks. And if I bid nine and somebody else bids eight, I'm going to pay eight, the third-highest bid, times 200 clicks, which is how many clicks the second slot is worth. And then let's say a fourth person bids five. The third-highest bidder, the one who bid eight, pays that fourth-highest bid times 100 clicks, which is how many clicks the third slot is worth.
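The arithmetic in this example can be reproduced directly. The sketch below uses the bids (ten, nine, eight, five) and click counts (300, 200, 100) from the interview; the function name and the choice to charge zero when there is no lower bid are assumptions made for illustration.

```python
def gsp_payments(bids, clicks):
    """Generalized second-price payments as described in the interview.

    bids: per-click bids in any order.
    clicks: expected clicks per slot, best slot first.
    Returns a list of (bid, slot_index, total_payment), one per filled slot.
    """
    ranked = sorted(bids, reverse=True)
    results = []
    for slot, slot_clicks in enumerate(clicks):
        if slot >= len(ranked):
            break  # more slots than bidders
        # Each winner pays the next-highest bid per click (0 if none exists).
        next_bid = ranked[slot + 1] if slot + 1 < len(ranked) else 0
        results.append((ranked[slot], slot, next_bid * slot_clicks))
    return results


for bid, slot, total in gsp_payments([10, 9, 8, 5], [300, 200, 100]):
    print(f"bid {bid} -> slot {slot + 1}, pays {total}")
# bid 10 -> slot 1, pays 2700
# bid 9 -> slot 2, pays 1600
# bid 8 -> slot 3, pays 500
```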

So Google implements something that looks a lot like Vickrey's idea.

Do they cite Vickrey?

Their advertising materials for what they called the “generalized second-price auction” claimed that it used “Nobel Prize-winning economic theory to eliminate... that feeling that you’ve paid too much.” So they directly draw a line to Vickrey and his academic credentials as a justification for the format that they're using. Except they get it wrong.

How so?

A defining feature of Vickrey's auction is the “dominant strategy property.” This means there is a strategy that will always perform the best no matter what any of the other players do. In the case of Vickrey’s auction, the dominant strategy is to bid your value.

Now, there is a correct way to generalize Vickrey's auction to preserve this property. But Google did not do it correctly, at least not at first. The way Google did it, there were times when you should strategize. Sometimes you can benefit by misrepresenting your value per click.

When is that?

Let's say there are a bunch of advertisers who are already in the three slots. You come in as a new advertiser, you submit the highest bid per click, and you take the top slot. Now, Google's auction made you pay for displacing the previous top bidder down to the second slot. But that's not the full effect of your participation. What you've really done is knock everybody below you down one rung of the ladder, but Google’s auction instead charged you as though you knocked the top bidder out entirely. It doesn’t account for this waterfall effect.

The point of Vickrey's auction is that it charges you your externality. It charges you an amount equal to what all the other bidders lose due to your participation. It turns out that there were certain circumstances where Google wasn’t calculating that externality correctly—and advertisers could benefit from bidding slightly less than their value.
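To make the difference concrete, the sketch below compares the generalized second-price rule with a standard textbook externality (VCG) payment for slot auctions, using the earlier numbers. The interview does not give a formula, so the VCG pricing here is an assumption, as are the function name, the 8.5 "shaded" bid, and the simplification that clicks depend only on the slot. It is not a description of what Google actually implemented.

```python
CLICKS = [300, 200, 100]  # clicks per slot, from the interview's example


def outcome(my_bid, my_value, rival_bids, rule):
    """Payoff to one advertiser under 'gsp' or 'vcg' pricing.

    Assumes distinct bids, with bids and values expressed per click.
    """
    ranked = sorted(rival_bids + [my_bid], reverse=True)
    slot = ranked.index(my_bid)
    if slot >= len(CLICKS):
        return 0  # no slot won
    padded_bids = ranked + [0] * len(CLICKS)
    padded_clicks = CLICKS + [0]
    if rule == "gsp":
        # Pay the next-highest bid for every click in your slot.
        payment = padded_bids[slot + 1] * CLICKS[slot]
    else:
        # VCG: pay for the clicks each lower bidder loses because you are
        # present, valued at that bidder's own bid (the "waterfall").
        payment = sum(
            padded_bids[j] * (padded_clicks[j - 1] - padded_clicks[j])
            for j in range(slot + 1, len(CLICKS) + 1)
        )
    return my_value * CLICKS[slot] - payment


rivals = [9, 8, 5]
for rule in ("gsp", "vcg"):
    truthful = outcome(10, 10, rivals, rule)
    shaded = outcome(8.5, 10, rivals, rule)
    print(rule, "truthful:", truthful, "shaded:", shaded)
# gsp truthful: 300 shaded: 400
# vcg truthful: 800 shaded: 700
```

In this example an advertiser who values a click at ten does better under the generalized second-price rule by bidding 8.5 and settling for the second slot, while under the externality-based payment the truthful bid gives the higher payoff.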

Was that miscalculation good for Google?

It's certainly possible that if people misunderstand the auction this way, Google might make more money. But this also gets to an important issue: whether or not people are willing to play in your auction depends on how user-friendly it is. If Google's auction requires players to strategize in order to bid well, it's entirely possible that fewer bidders will be willing to participate.

It's one thing if your only job in Google's auction is to figure out how much a click is worth to you. That's difficult enough because you need to rule out the bots, you have to think about your sales model, and so on. But if on top of all of that you also have to think about how everybody else is bidding—and come up with an optimal strategy given the possible strategies of other bidders—the ensuing complexity will deter a lot of participation.

Anyway, the format that Google starts using in the early 2000s has this problem. And it remains for almost a decade. It's only fairly recently that Google has adopted the correct generalization of Vickrey's auction that accounts for the waterfall effect.

What about Facebook? How do they sell advertising?

In Google you're mostly bidding on keywords. You're placing ads around certain search terms, like “car insurance.”

Facebook is selling a more personal product because Facebook knows a lot about you. It also has expert data scientists who can take everything Facebook knows about you and infer clever things from it.

The result is that advertisers can’t bid on Facebook the way they bid on Google because they don’t know what an ad is worth. They can’t take all that data that Facebook has—your gender, race, age, interests, social graph, online history, and so on—and turn it into, "Here's how much we think it's worth to show this person an ad for a ski holiday." Moreover, there are black-box machine learning algorithms that Facebook uses internally to help calibrate each campaign. It would be difficult to explain them to potential advertisers—and it's not even clear that Facebook can explain them to itself.

So Facebook adopted a different solution to selling ads. Instead of placing a bid, you tell Facebook what kind of ad campaign you want to run and they bid on your behalf. And then there's something like Vickrey's auction running in the background deciding who bids what and at what price.

That doesn’t really sound like a market anymore.

That's the oddness of it. It’s as if you went to a supermarket and rather than the owner saying, “Here are all the prices, please buy what you want,” the owner says, “Why don't you tell me how you're feeling this week and what you have a taste for, and I’ll find the optimal bundle and tell you how much it costs. Don't worry, I know my warehouse much better than you do.”

One of the weaknesses of game theory as a way of thinking about the world is that it assumes we all know the rules of the game we're playing. But when some players have the lion’s share of the information, and when there are black-box algorithms in the middle, it becomes impossible for all of the players to understand the game they’re playing.

Facebook isn’t telling advertisers, “Here's the price you would need to pay to buy Mary’s eyeballs versus Adam’s eyeballs and here's all of the information you would need to make the decision about whose eyeballs you should purchase.” Instead, Facebook is saying, “Tell us about your advertising campaign, and we will figure out what you would have rationally done if you had access to our troves of information.”

Do advertisers not care about that lack of transparency?

Whenever there is a black-box element to a system, that system has to run on trust. When Facebook says they know better than you what kind of advertising you want to be buying, you really have to believe that Facebook has your interests at heart. And in some cases it’s not that Facebook doesn’t want to explain their reasoning—it’s that they can’t explain it, because the machine-learning models they’re using aren’t explainable. So it’s up to Facebook to convince advertisers that they’re not taking advantage of them. That may be difficult to prove when everything is so opaque.

Thinking more broadly about this new kind of advertising, what do you think is most distinctive about it? How does it differ from what came before?

One difference that springs to mind is the sheer individualization of it. There are some auctions where you can even bid for an individual human impression. For example, there’s a startup that will let you target a particular person with an ad campaign.

How does that work?

Maybe you want your partner to stop smoking. This startup will generate a special link for you that looks like it’s an e-commerce site. You send it to your partner and when they click it, they get a cookie secretly loaded into their browser. This cookie enables the company to track your partner across the web. You write up an anti-smoking ad, and the company will ensure that your partner sees that ad everywhere. Now your partner’s entire internet experience is permeated with pressures to stop smoking.

You can design a similar campaign for a coworker you don't like. You can show them ads for job-hunting websites, to encourage them to get another job.

That's pretty funny. I knew about the extent to which Facebook and Google track you around the web, but I hadn’t realized that smaller companies could do it as well.

Most people don't realize how many companies have access to the cookies that are in your browser, and how much information those companies can learn about you from those cookies.

Let's say you go to the New York Times website. Now, at various points in your browsing history, all sorts of cookies have attached to your browser. When you click through to the New York Times, as the page is loading, there is an instantaneous algorithmic auction for the right to show you an ad impression. That auction takes about a tenth of a second. In that tenth of a second, the New York Times passes on all of your cookies to an online auctioneer called a supply-side platform. The auctioneer then sends out a query to a whole bunch of bidding companies that exist to help advertisers run online ad campaigns. It will say, essentially, "At 4:01 pm, from the following IP address in Cambridge, Massachusetts, somebody we think is female, twenty-five to thirty-five years old, is looking at an article about Trump."

Presumably the time and the IP address are logged by the New York Times, but how do they know I’m female and twenty-five to thirty-five years old?

They’re guessing that based on your cookies, which are just strings of text that contain information about your online activity. But the New York Times may not be able to interpret your cookies. So they’ll pass the cookies on to the auctioneer, who passes them on to the bidders, and the bidders will interpret them.

Now maybe the bidders can understand your cookies better than the New York Times. Maybe they can make a pretty good guess that you’ve been looking for a ski holiday or that you've been looking for a divorce lawyer. And from that they'll compute how much they think your eyeballs are worth to them. They'll send their bid back to the auctioneer, who determines the winner and then sends the winning impression back to the New York Times. That's how you get an ad for a ski holiday or a divorce lawyer one tenth of a second later.
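The flow described here (publisher passes cookies to an auctioneer, the auctioneer queries bidders, each bidder interprets the cookies and replies with a bid, the top bid wins) can be sketched as a toy program. Everything in it is hypothetical: the class, the cookie keys, the bidders, and the bid amounts are invented for illustration, and the price the winner pays is left out because the interview does not say how it is set.

```python
from dataclasses import dataclass


@dataclass
class BidRequest:
    timestamp: str
    ip_region: str
    cookies: dict  # passed along uninterpreted; bidders make sense of them


def ski_bidder(request):
    # Hypothetical bidder: bids more if the cookies suggest ski interest.
    bid = 2.50 if request.cookies.get("seen_ski_site") else 0.10
    return ("ski holiday ad", bid)


def lawyer_bidder(request):
    # Hypothetical bidder: bids more if the cookies suggest a lawyer search.
    bid = 4.00 if request.cookies.get("searched_divorce") else 0.05
    return ("divorce lawyer ad", bid)


def run_auction(request, bidders):
    """Auctioneer forwards the request to every bidder and picks the top bid."""
    responses = [bidder(request) for bidder in bidders]
    return max(responses, key=lambda response: response[1])


request = BidRequest("16:01", "Cambridge, MA", {"seen_ski_site": True})
ad, bid = run_auction(request, [ski_bidder, lawyer_bidder])
print(ad)  # ski holiday ad
```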

And so every time I go to the New York Times it is taking all this information and sending it to third parties who send it along to fourth parties?

Yes. Whenever you're accessing one of the many websites that sells advertising in this way, all of your cookies are being made public in this fashion. I think people don't realize that this is part of their everyday internet experience.

Is there any government oversight of this process? Who's even supposed to be regulating that?

No, there's essentially no oversight. It's the Wild West out there.