Companies keep overpromising about AI
The insurance company Lemonade is all about automation. To file a claim, customers upload a selfie video, and Lemonade’s chatbot, AI Jim, handles some claims automatically. On Twitter this week, Lemonade got in trouble for suggesting that its artificial intelligence handles fraud detection and uses nonverbal cues to assess some claims. Researchers said that capability doesn’t exist as described and that attempting it could be discriminatory.
Lemonade quickly downplayed how much AI it uses and said its fraud detection isn’t based on physical features. I spoke with Ryan Calo, a professor of law at the University of Washington who studies emerging tech and policy. He and I talked just a month ago about the Federal Trade Commission warning companies about how they communicate about AI. The following is an edited transcript of our conversation.
Ryan Calo: [Lemonade] has a bot that is called AI Jim. And it does handle some automatic transactions, but it does not use AI. However, Lemonade is a company, and a big part of its supposed competitive edge is the fact that it uses AI. And so that’s what’s so confusing. I mean, you call something AI Jim and you say, “Well, it actually doesn’t use AI.” But what I love about it is it’s a beautiful illustration of all the problems in the industry: calling something a bot, only to have to backpedal and deny that you’re using AI in circumstances where it looks like you are, because it would be so concerning and problematic.
Molly Wood: We just had a conversation, you and I, about the Federal Trade Commission saying it’s going to crack down on companies that exaggerate what their AI can do or tout AI capabilities that could prove discriminatory. Seems like this is that.
Calo: The FTC, the Federal Trade Commission, publishes a blog post, says, “Don’t do A, don’t do B.” A couple of weeks later, a company puts out a thread on Twitter saying, “We do A, we do B,” basically. And so people said, “Look, if you try to analyze video footage, and you try to use these nonverbal cues, what you’re going to wind up finding is that people who don’t behave exactly the way the honest people in your training set do are going to look like they’re dishonest when, in fact, they’re not.” So they did something that seems like it would be discriminatory, if they really did it. As an observer of the space, it does look like they’re doing the very same things that the FTC said not to do two weeks ago.
Wood: And what does this do to public confidence? At a minimum, it seems that the FTC’s declaration has raised awareness, so that when Lemonade comes out and tweets these things, it immediately gets called on it. But Twitter is kind of a walled garden. What does this mean for the public in terms of trying to understand exactly what happens? It’s possible Lemonade uses AI in some processes but not in others, and we just don’t know.
Calo: Well, I just want to say that, yes, the FTC blog raised awareness. But it’s also these researchers who have been doing this work for years and years, often [Black, Indigenous and people of color] researchers, but also other researchers, [including] Kate Crawford and her book “Atlas of AI,” Luke Stark and one of my own students, Jevan Hutson. People in the community have been doing this research, and it’s so validating that it has reached the government agencies and the public. I think it’s a great development that people are so attuned. You’re right, of course, Twitter is not the whole world. And so what we do need is policymakers and enforcers to actually come up with laws and enforce them, or at least formalize the standards they’re articulating. And I do think that’s necessary. People criticized Lemonade; many people spotted it way before me. I do think that some of the talk about the Federal Trade Commission might have also contributed to them taking it down, because it’s one thing to draw the ire of this community, and another to think that you’re going to be in the spotlight of the nation’s leading consumer watchdog. But I do want to give credit where it’s really due, which is to the many researchers, often themselves from marginalized populations, who have evidence that this happens again and again and have called attention to it.
Wood: How common do you think it has become for companies to essentially say, “We use AI, and it’s magic”? We certainly see it all the time. There was literally a point in one of Lemonade’s either tweets or documents where it said, “We do this with technology we can’t understand.”
Calo: That’s right. So imagine bragging about using something that makes a consequential decision, that has a consequential impact, like whether you get insurance or whether your claim is validated, and then bragging about the fact that you don’t even know how it works. Kate Crawford calls this “enchanted determinism” in her book about AI. It calls to mind the Sorcerer’s Apprentice, this idea that this stuff is just magical. It’s not. I mean, the other thing I think is important is that when it goes wrong these days, it’s harder to say that this is not foreseeable and not predictable. We’ve seen so many examples of AI that purports to tell you whether someone is politically left or right, or whether they’re gay, or whether they’re honest, or whatever.
We’ve seen it happen again and again, and again and again we’ve seen it get debunked. We’ve also seen these systems deployed, and again and again they don’t work well for the most vulnerable and minority populations because of the way training data works. And then, sometimes when they’re working only too well, they’re deployed against vulnerable populations, like the Uyghurs at the border of China. What was so shocking to me was that Lemonade hadn’t read the tea leaves, that the company wasn’t paying enough attention to the space to realize that this series of claims, and its attempts to roll them back, would encounter such a sophisticated, knowledgeable audience. And so I’m hoping that the awareness among the researchers and the journalists and everyone else on Twitter is going to spill into broader society, and we’ll start to see some guardrails put in place by policymakers and enforcers.
Wood: Yeah, I mean, Lemonade ran into a wall of public sentiment. How important is it, though, that the FTC follow up, either in this case or in others, and actually do what it said it would do in its blog post?
Calo: I think it’s important. News cycles are fast. Consent decrees last for 20 years. I think for the long run, it really matters that the Federal Trade Commission or other bodies pursue these kinds of things. Being in a consent order with the FTC matters, and it matters for a long time.
Related links: More insight from Molly Wood
I simplified a lot of the details of what actually happened with Lemonade. One of the problems is that the company first tweeted that it uses AI to handle fraud, then said it doesn’t use AI to auto-deny claims. But CNN reporter Rachel Metz found an SEC filing in which Lemonade quite clearly said it resolved about a third of claims without any human intervention.
We had a long back-and-forth with Lemonade to try to understand when AI does and does not intervene in claims. The company told us that AI Jim doesn’t use AI to deal with claims without human intervention, and that what it meant in the Securities and Exchange Commission filing is that AI Jim is a product. When it denies claims without intervention, it’s because those claims fail basic checks: the person didn’t have a policy when the claim was filed, say, or the amount claimed is below their deductible. Its system can check that without AI or machine learning. Just regular, old computers comparing datasets using an algorithm, but not the learning kind.
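For a sense of what that kind of non-learning check looks like, here’s a minimal sketch in Python. The field names and rules are hypothetical, not Lemonade’s actual system; the point is just that a claim can be auto-denied with plain comparisons, no machine learning involved.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Policy:
    start: date          # first day of coverage
    end: date            # last day of coverage
    deductible: float    # claims at or below this amount pay out nothing

@dataclass
class Claim:
    incident_date: date
    amount: float

def auto_deny_reason(claim: Claim, policy: Policy) -> Optional[str]:
    """Return a denial reason if the claim fails a plain rule check,
    or None if it should continue to human (or further) review."""
    # No active policy on the date of the incident.
    if not (policy.start <= claim.incident_date <= policy.end):
        return "no active policy on the incident date"
    # The claimed amount doesn't exceed the deductible, so nothing is owed.
    if claim.amount <= policy.deductible:
        return "claimed amount is at or below the deductible"
    return None

# Example: a $75 claim against a policy with a $250 deductible is denied.
policy = Policy(date(2021, 1, 1), date(2021, 12, 31), 250.0)
print(auto_deny_reason(Claim(date(2021, 6, 1), 75.0), policy))
```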
So we asked when Lemonade does use AI, since that is a big part of its business claims. The company said that “AI Jim, the chatbot, does not detect fraud on its own but detects explicit signals that then flag claims that get escalated to our investigators. None of these data points are based on the physical attributes of the person submitting the claim. These flagged claims then get reviewed by our human investigators.”
The statement went on to say that the type of detection it does is described in a blog post, and that it is “basically using facial recognition technology to flag claims submitted by the same person under different identities.” In the post, Lemonade cites the example of a person who tried to file multiple claims by changing account details and putting on wigs and lipstick. Which seems to me to be physical attributes, or at least nonverbal cues, but honestly, at this point, I just don’t know.
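For what it’s worth, duplicate-identity flagging like that is commonly built on face embeddings: fixed-length vectors a face-recognition model produces for each image, where similar faces yield similar vectors. Here’s a rough sketch of the flag-then-escalate pattern the company describes; the embedding inputs, the threshold, and all names are assumptions for illustration, not Lemonade’s implementation.

```python
import numpy as np
from typing import Dict, Optional

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def flag_possible_duplicate(
    new_face: np.ndarray,                # embedding for the new claimant
    known_faces: Dict[str, np.ndarray],  # account ID -> stored embedding
    threshold: float = 0.8,              # hypothetical cutoff
) -> Optional[str]:
    """If the new claimant's face embedding closely matches one already
    on file under a different account, return that account ID so a human
    investigator can review it. The system flags; people decide."""
    for account_id, known in known_faces.items():
        if cosine_similarity(new_face, known) >= threshold:
            return account_id
    return None
```

The design matches what Lemonade’s statement claims: the automated step only surfaces a candidate match, and the denial decision stays with a human reviewer.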