Black Nazis? A woman pope? That’s just the start of Google’s AI problem.

The Gemini image generator is suffering not just from a technical problem, but from a philosophical one.

Sigal Samuel
Sigal Samuel is a senior reporter for Vox’s Future Perfect. She writes primarily about the future of consciousness, tracking advances in artificial intelligence and neuroscience and their staggering ethical implications. Before joining Vox, Sigal was the religion editor at the Atlantic.

Just last week, Google was forced to pump the brakes on its AI image generator, called Gemini, after critics complained that it was pushing bias ... against white people.

The controversy started with — you guessed it — a viral post on X. According to that post from the user @EndWokeness, when asked for an image of a Founding Father of America, Gemini showed a Black man, a Native American man, an Asian man, and a relatively dark-skinned man. Asked for a portrait of a pope, it showed a Black man and a woman of color. Nazis, too, were reportedly portrayed as racially diverse.

After complaints from the likes of Elon Musk, who called Gemini’s output “racist” and Google “woke,” the company suspended the AI tool’s ability to generate pictures of people.

“It’s clear that this feature missed the mark. Some of the images generated are inaccurate or even offensive,” Google Senior Vice President Prabhakar Raghavan wrote, adding that Gemini does sometimes “overcompensate” in its quest to show diversity.

Raghavan gave a technical explanation for why the tool overcompensates: Google had taught Gemini to avoid falling into some of AI’s classic traps, like stereotypically portraying all lawyers as men. But, Raghavan wrote, “our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range.”

This might all sound like just the latest iteration of the dreary culture war over “wokeness” — and one that, at least this time, can be solved by quickly patching a technical problem. (Google plans to relaunch the tool in a few weeks.)

But there’s something deeper going on here. The problem with Gemini is not just a technical problem.

It’s a philosophical problem — one for which the AI world has no clear-cut solution.

What does bias mean?

Imagine that you work at Google. Your boss tells you to design an AI image generator. That’s a piece of cake for you — you’re a brilliant computer scientist! But one day, as you’re testing the tool, you realize you’ve got a conundrum.

You ask the AI to generate an image of a CEO. Lo and behold, it’s a man. On the one hand, you live in a world where the vast majority of CEOs are male, so maybe your tool should accurately reflect that, creating images of man after man after man. On the other hand, that may reinforce gender stereotypes that keep women out of the C-suite. And there’s nothing in the definition of “CEO” that specifies a gender. So should you instead make a tool that shows a balanced mix, even if it’s not a mix that reflects today’s reality?

This comes down to how you understand bias.

Computer scientists are used to thinking about “bias” in terms of its statistical meaning: A program for making predictions is biased if it’s consistently wrong in one direction or another. (For example, if a weather app always overestimates the probability of rain, its predictions are statistically biased.) That’s very clear, but it’s also very different from the way most people use the word “bias” — which is more like “prejudiced against a certain group.”
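To make the statistical sense of the word concrete, here is a minimal sketch of the weather-app example. The forecast numbers are invented for illustration, and `signed_bias` is a hypothetical helper, not part of any real forecasting tool.

```python
from statistics import mean


def signed_bias(predicted, observed):
    """Mean signed error: a positive value means the forecasts run high on average."""
    return mean(p - o for p, o in zip(predicted, observed))


# Made-up data: the app's stated chance of rain, and what happened (1 = rain, 0 = dry).
forecast = [0.75, 0.5, 1.0, 0.75]
outcome = [1, 0, 1, 0]

bias = signed_bias(forecast, outcome)
print(f"statistical bias: {bias:+.2f}")
```

A value near zero would mean the app's errors cancel out over time. A consistently positive value, as with these invented numbers, is what "always overestimates the probability of rain" looks like in data.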

The problem is, if you design your image generator to make statistically unbiased predictions about the gender breakdown among CEOs, then it will be biased in the second sense of the word. And if you design it not to have its predictions correlate with gender, it will be biased in the statistical sense.
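The tension can be put in numbers with a small sketch. Everything here is an assumption made for illustration: `REAL_WORLD_SHARE` is a made-up figure, not a real statistic, and `parity_gap` is just one simple stand-in for the everyday sense of bias.

```python
# Hypothetical figure for illustration only: the share of real CEOs who are women.
REAL_WORLD_SHARE = 0.1


def statistical_bias(output_share, real_share=REAL_WORLD_SHARE):
    """How far the generator's share of women departs from the real-world share."""
    return output_share - real_share


def parity_gap(output_share):
    """How far the generator's share of women departs from an even split."""
    return abs(output_share - 0.5)


# Two design choices: mirror today's reality, or show a balanced mix.
policies = {"mirror reality": REAL_WORLD_SHARE, "balanced mix": 0.5}

results = {
    name: (statistical_bias(share), parity_gap(share))
    for name, share in policies.items()
}

for name, (stat, gap) in results.items():
    print(f"{name}: statistical bias = {stat:.2f}, parity gap = {gap:.2f}")
```

Under these assumptions, each policy drives one measure to zero and leaves the other large. Unless the real-world share happens to be an even split, no single output share can zero out both.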

So how should you resolve the trade-off?

“I don’t think there can be a clear answer to these questions,” Julia Stoyanovich, director of the NYU Center for Responsible AI, told me when I previously reported on this topic. “Because this is all based on values.”

Embedded within any algorithm is a value judgment about what to prioritize, including when it comes to these competing notions of bias. So companies have to decide whether they want to be accurate in portraying what society currently looks like, or promote a vision of what they think society could or even should look like — a dream world.

How can tech companies do a better job navigating this tension?

The first thing we should expect companies to do is get explicit about what an algorithm is optimizing for: Which type of bias will it focus on reducing? Then companies have to figure out how to build that into the algorithm.

Part of that is predicting how people are likely to use an AI tool. They might try to create historical depictions of the world (think: white popes) but they might also try to create depictions of a dream world (female popes, bring it on!).

“In Gemini, they erred towards the ‘dream world’ approach, understanding that defaulting to the historic biases that the model learned would (minimally) result in massive public pushback,” wrote Margaret Mitchell, chief ethics scientist at the AI startup Hugging Face.

Google might have used certain tricks “under the hood” to push Gemini to produce dream-world images, Mitchell explained. For example, it may have been appending diversity terms to users’ prompts, turning “a pope” into “a pope who is female” or “a Founding Father” into “a Founding Father who is Black.”
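The kind of prompt rewriting Mitchell describes could look something like the sketch below. This is not Google's code, which has not been published. The term list is taken from the two examples above, and `augment_prompt` is a hypothetical name.

```python
import random

# Assumption: a tiny term list built from the two examples in the text.
DIVERSITY_TERMS = ["who is female", "who is Black"]


def augment_prompt(prompt, rng):
    """Append one randomly chosen diversity term to the user's prompt."""
    return f"{prompt} {rng.choice(DIVERSITY_TERMS)}"


rng = random.Random(0)  # seeded only so the sketch is repeatable
augmented = augment_prompt("a pope", rng)
print(augmented)
```

Note that a rewrite like this is applied to every prompt, whether or not the request is historical. That blanket behavior matches Raghavan's description of tuning that "failed to account for cases that should clearly not show a range."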

But instead of adopting only a dream-world approach, Google could have equipped Gemini to suss out which approach the user actually wants (say, by soliciting feedback about the user’s preferences) — and then generate that, assuming the user isn’t asking for something off-limits.
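As a rough sketch of that alternative, the tool could take the user's stated preference as an input and check the request against a policy before generating anything. All names here are hypothetical, and the keyword check is a deliberately naive placeholder for whatever a company's real policy would be.

```python
from enum import Enum


class Mode(Enum):
    HISTORICAL = "historical"
    DREAM_WORLD = "dream world"


# Hypothetical placeholder; a real list would follow from the company's stated values.
OFF_LIMITS_TERMS = {"example-banned-term"}


def plan_request(prompt, mode):
    """Return the prompt to send to the image model, or None to refuse."""
    lowered = prompt.lower()
    if any(term in lowered for term in OFF_LIMITS_TERMS):
        return None  # refuse requests that violate the policy
    if mode is Mode.HISTORICAL:
        return f"{prompt}, historically accurate"
    return f"{prompt}, showing a diverse range of people"


print(plan_request("a portrait of a pope", Mode.HISTORICAL))
print(plan_request("a portrait of a pope", Mode.DREAM_WORLD))
```

The point of the design is that the choice between the two approaches is made by the user and stated openly, rather than applied silently to every request.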

What counts as off-limits comes down, once again, to values. Every company needs to explicitly define its values and then equip its AI tool to refuse requests that violate them. Otherwise, we end up with things like Taylor Swift porn.

AI developers have the technical ability to do this. The question is whether they’ve got the philosophical ability to reckon with the value choices they’re making — and the integrity to be transparent about them.

This story appeared originally in Today, Explained, Vox’s flagship daily newsletter. Sign up here for future editions.
