You realized the AI you’re creating may be dangerous. Now what?
It’s been about seven months since Elon Musk, Apple co-founder Steve Wozniak and other prominent names in tech signed an open letter, released by the Future of Life Institute, calling for a temporary pause on artificial intelligence development.
The gist of the missive is that the risks of advanced AI are profound and that we don’t understand them well enough to keep expanding the technology without having proper guardrails in place. Although that pause did not happen, there has been some movement on the policy front. Most notably, President Joe Biden issued an executive order this week in an effort to set rules and establish standards around the safety and security of AI.
But for some researchers, the core concerns that led to the “AI pause” letter still exist, and there’s not much agreement on how to address them.
Marketplace’s Lily Jamali spoke with Jonas Schuett, research fellow at the Centre for the Governance of AI, about a recent paper he co-authored that has a different take on the question of pausing development.
Jonas Schuett: Many people worry that the next generation of these frontier AI models might have certain dangerous capabilities. But then the question is, what should developers like OpenAI do if they realize that the model they’re developing might actually be really, really dangerous? So, the obvious response then is to stop the development process and not release the model until they know exactly what’s going on and have made it safe and can demonstrate that it is actually sufficiently safe to use. But even that might not be enough. Because other developers might continue to develop and release their models, which might be very similar and have very similar capabilities, they should probably also stop. And that’s essentially what the paper is about. How can we make sure that whenever a model with dangerous capabilities is discovered, all frontier developers temporarily pause?
Lily Jamali: It seems like, for the AI companies, pausing development like that would be at odds with their business model.
Schuett: That’s certainly right to some extent. But you might argue that a model with these dangerous capabilities is just not a good product. Ultimately, I’m not particularly excited about scenarios where all this is purely voluntary. At some point, you probably want some requirements to do that and some government enforcement of these requirements. An intermediate way might be something like having these model evaluations done by an external auditor, and then this kind of pausing mechanism could be built into their contractual relationship between the auditor and the developer. And so, the hope would be that different developers work with the same auditor to kind of essentially set up such a regime.
Jamali: How is your proposal different from the AI pause letter that was put out by the Future of Life Institute? That made a lot of headlines back in March.
Schuett: One difference is that our proposal is much more concrete. So, you would have to run specific tests, and then only if you fail these tests would you have to pause. And also, the length of the pause wouldn’t be arbitrary. And the Future of Life proposal was six months. And then another difference is that I think our proposal might result in many pauses as more dangerous systems are identified, and the FLI only pushed for this single six-month moratorium. Overall, I think their proposal is not particularly realistic and I think we’re proposing a somewhat more realistic alternative. I’m also not 100% convinced that their proposal is actually that desirable.
Jamali: I want to make sure we’re not glossing over the potential danger that you’re flagging in your paper. If I am an AI developer, how would you propose that I identify dangerous capabilities in my models?
Schuett: Some of these developers are already doing that. This approach, this kind of risk identification or assessment method, which we call a model evaluation, is now increasingly popular. The way that these kinds of model evaluations work is you’re essentially trying to make the model do a certain dangerous thing. So, if you’re concerned that the model is able to create copies of itself and acquire resources, then you might break down this kind of slightly worrying capability into smaller tasks. This might include questions like can the model create a bitcoin wallet or solve a Captcha or stuff like that. And then you essentially test for these things and see how well the model does on each of these tasks. Of course, this is like a somewhat subjective test at the end of the day. But this is roughly how these types of evaluations work.
Jamali: Where does that lead? Let’s say a model is able to create a bitcoin wallet. What’s the worst-case scenario there?
Schuett: The worry with this particular capability, which is creating copies of itself and acquiring resources, is if you have this kind of autonomous agent that can do this and it tries to make the developers some money and it essentially spreads over the internet like a virus, then it might become extremely costly to shut it down. And at some point, it might essentially be impossible to shut down. It might be very annoying to have all these agents being annoying on the internet. But also, if these agents have harmful goals like committing cybercrimes, then this might actually lead to a bunch of harm.
Jamali: One criticism that we hear a lot about is that those who are advocating for an AI pause are overplaying the capabilities of large language models as they exist today. What do you say to that argument?
Schuett: One of the points we are trying to make is that a pause only needs to happen if you have evidence that the model is actually dangerous. We are actually not advocating for this kind of general pause. We think we are just highly uncertain of how dangerous these models will actually be. It’s possible that all these concerns are justified and that the next generation of models is actually really dangerous. But they may also not be. And in that case, you would not have to pause. The point is, given our uncertainty, we should at least test and collect evidence, such that we are in a position to do something about it if they turn out to be dangerous.
Jamali: In the paper, you talk about a number of obstacles that would need to be overcome for this kind of pause to work. What would you say is the most important of those obstacles?
Schuett: Yeah, so one is evaluations. The problem is we don’t have good model evaluations for all dangerous capabilities. Only a few dozen people and maybe a handful of organizations are able to do these evaluations because they are extremely complex and it’s a fairly new kind of instrument and they don’t cover all the capabilities that people worry about. Even the evals that we do have, they might not be good enough. So, it’s like a fairly new risk assessment technique and we’re just not there yet. And so a lack of good, reliable evaluations is probably the main obstacle. And then another obstacle is antitrust law. So some kinds of coordination between private companies might just violate antitrust law. So, for example, different labs probably can’t make a legal agreement in which they commit to pause. That’s probably just illegal. But some of the other versions we discussed might be less problematic, like the audit version.
“Marketplace Tech” covered the Future of Life Institute’s AI pause letter pretty extensively when it came out this year. One of the many perspectives we heard came from Gary Marcus, an AI expert who signed the letter. He told my colleague Meghan McCarty Carino that he hoped a hiatus would give governments and developers time to put regulatory policies in place. Without that, he warned, we could be staring down an “information apocalypse.”
We also heard from AI skeptic and computational linguist Emily Bender, who said we’re framing this debate all wrong. Concerns about a future AI-induced apocalypse are a distraction, she said, from real harms happening now, like synthetic text being mistaken for reliable information and becoming actual misinformation.