We’re constantly inundated with stories about how data science, machine learning, deep learning, and artificial intelligence are revolutionizing everything. But what do these terms even mean? And are they likely to have anywhere near the impact that the media hype would lead us to believe?
We sat down with a veteran data scientist to help us answer these questions. Over beer and Chinese food, we spent hours discussing a wide array of subjects, ranging from neural networks to algorithmic racism.
Later, when we transcribed and edited our conversation, we realized we had way more than we needed for a single piece. So we broke our conversation into four installments, each organized around a different theme—“Introduction,” “Deep Learning,” “FinTech,” and “The Future”—and decided to distribute them throughout the issue. Read the first installment below.
LOGIC: Alright, let’s get started with the basics. What is a data scientist? Do you self-identify as one?
DATA SCIENTIST: I would say the people who are the most confident about self-identifying as data scientists are almost universally frauds. They are not people that you would voluntarily spend a lot of time with. There are a lot of people in this category who have only been exposed to a little bit of real stuff; they’re sort of peripheral. You actually see a lot of this with these strong AI companies: companies that claim to be able to build human intelligence using some inventive “Neural Pathway Connector Machine System,” or something. You can look at the profiles of every single one of these companies. They are always people who have strong technical credentials, and they are in a field that is just slightly adjacent to AI, like physics or electrical engineering.
And that’s close, but the issue is that no person with a PhD in AI starts one of these companies, because if you get a PhD in AI, you’ve spent years building a bunch of really shitty models, or you see robots fall over again and again and again. You become so acutely aware of the limitations of what you’re doing that the interest just gets beaten out of you. You would never go and say, “Oh yeah, I know the secret to building human-level AI.”
In a way it’s sort of like when I told my dad, who has a PhD in biology and is a researcher back East, a little bit about the Theranos story. I told him their shtick: “Okay, you remove this small amount of blood, and run these tests…” He asked me what the credentials were of the person starting it, and I was like, “She dropped out of Stanford undergrad.” And he was like, “Yeah, I was wondering, since the science is just not there.” Only somebody who never actually killed hundreds of mice and looked at their blood, like my dad did, would ever be crazy enough to think that was a viable idea.
So I think a lot of the strong AI stuff is like that. A lot of data science is like that too. Another way of looking at it is that it’s a bunch of people who got PhDs in the wrong thing, and realized they wanted to have a job. Another way of looking at it—I think the most positive way, which is maybe a bit contrarian—is that it’s really, really good marketing.
As someone who tries not to sell fraudulent solutions to people, it actually has made my life significantly better because you can say “big data machine learning,” and people will be like, “Oh, I’ve heard of that, I want that.” It makes it way easier to sell them something than having to explain this complex series of mathematical operations. The hype around it, and the sheer amount of it, has made the actual sales process so much easier. The fact that there is a thing with a label is really good for me professionally.
But that doesn’t mean there’s not a lot of ridiculous hype around the discipline.