
OpenAI’s Jan Leike is trying to ensure superintelligent AI remains on our side

It might just be the most important job in the world.

Illustrated portrait of Jan Leike. (Lauren Tamaki for Vox)


Kelsey Piper
Kelsey Piper is a contributing editor at Future Perfect, Vox’s effective altruism-inspired section on the world’s biggest challenges. She explores wide-ranging topics like climate change, artificial intelligence, vaccine development, and factory farms, and also writes the Future Perfect newsletter.

OpenAI, the maker of ChatGPT, believes it’s on the cusp of transforming our world with powerful AI systems. At minimum, it thinks these systems will fundamentally change how we work and live. At maximum, they could make our world unrecognizable overnight.

To make this go well, instead of catastrophically badly, OpenAI has created what it calls the superalignment team, which tries to understand how to make superhuman AI do what we want, instead of doing its own thing.

The team head is Jan Leike, a machine learning researcher who worked at Google’s DeepMind before joining OpenAI. His team is in a race against time: The goal is to figure out how to align powerful AI systems before unaligned powerful AI systems get developed. (An AI system is “aligned” if it’s trying to do the things that humans want, and “unaligned” if it’s trying to do other things outside our control. A big, unanswered question is how well we can tell what our AI systems are trying to do at all.)

“I think alignment is tractable,” Leike told Rob Wiblin on the 80,000 Hours podcast this August. “I think we can actually make a lot of progress if we focus on it and put effort into it. … Honestly, it really feels like we have a real angle of attack on the problem that we can actually iterate on, we can actually build towards. And I think it’s pretty likely going to work, actually. And that’s really, really wild, and it’s really exciting. It’s like we have this hard problem that we’ve been talking about for years and years and years, and now we have a real shot at actually solving it.”

The basic approach is to develop techniques that align systems slightly more powerful than the ones we have today, safely build those systems, and then use them to help align their successors.
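That bootstrapping loop can be sketched as a toy program. This is purely illustrative: the function names, the integer “capability” levels, and the `align` stand-in are all hypothetical constructs for this sketch, not OpenAI’s actual method or code.

```python
# Toy sketch of iterative alignment bootstrapping (hypothetical, illustrative only).
def align(model, assistant=None):
    """Stand-in for real alignment research: in this sketch we simply
    mark the model as aligned, optionally 'helped' by an assistant model."""
    return {**model, "aligned": True}

def bootstrap_alignment(current_model, target_capability):
    """Repeatedly: align the current model, then use it as an assistant
    to align a slightly more capable successor."""
    while current_model["capability"] < target_capability:
        # Step 1: apply alignment techniques validated at this capability level.
        aligned = align(current_model)
        # Step 2: safely build a slightly more capable successor...
        successor = {"capability": aligned["capability"] + 1, "aligned": False}
        # Step 3: ...and use the aligned model to help align that successor.
        current_model = align(successor, assistant=aligned)
    return current_model

final = bootstrap_alignment({"capability": 1, "aligned": True}, target_capability=4)
```

The point of the loop is that no single leap has to span the gap between today’s systems and superhuman ones; each step only has to align a model slightly stronger than the last.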

Our methodology

To select this year’s Future Perfect 50, our team went through a months-long process. Starting with last year’s list, we brainstormed, researched deeply, and connected with our audience and sources. We didn’t want to overrepresent any one category, so we aimed for diversity in theories of change, academic specialties, age, geographic location, identity, and many other criteria.


Many people justifiably don’t want to gamble the fate of the world on the success of OpenAI’s internal alignment research team (I don’t want to take that gamble myself). But even for those who would like to see technical alignment research accompanied by much stronger external oversight, governance, auditing, and measures to prevent the deployment of potentially dangerous systems, technical work on making AI systems safe will certainly be a huge element of any solution to this pressing challenge.

Sometimes, progress on the technical side can open up new options for political and governance solutions. And I think it’s to their immense credit that Leike’s team openly admits the insane stakes of the work they’re doing, and that they are willing to explain how they intend to do it. Their candor means that other researchers can evaluate their approach and figure out if this approach will get us to safe superintelligences — and if not, what will go wrong.
