
Grok’s MechaHitler disaster is a preview of AI disasters to come

Musk trained Grok to be right wing. We’re lucky he wasn’t more subtle.

A clip with Elon Musk making a controversial salute is screened during World News Media Congress in Krakow, Poland on May 4.
Beata Zawrzel/NurPhoto via Getty Images
Kelsey Piper
Kelsey Piper is a contributing editor at Future Perfect, Vox’s effective altruism-inspired section on the world’s biggest challenges. She explores wide-ranging topics like climate change, artificial intelligence, vaccine development, and factory farms, and also writes the Future Perfect newsletter.

From the beginning, Elon Musk has marketed Grok, the chatbot integrated into X, as the unwoke AI that would give it to you straight, unlike its competitors.

But on X over the last year, Musk’s supporters have repeatedly complained of a problem: Grok is still left-leaning. Ask it if transgender women are women, and it will affirm that they are; ask if climate change is real, and it will affirm that, too. Do immigrants to the US commit a lot of crime? No, says Grok. Should we have universal health care? Yes. Should abortion be legal? Yes. Is Donald Trump a good president? No. (I ran all of these tests on Grok 3 with memory and personalization settings turned off.)

It doesn’t always take the progressive stance on political questions: It says the minimum wage doesn’t help people, that welfare benefits in the US are too high, and that Bernie Sanders wouldn’t have been a good president, either. But on the whole, on the controversial questions of America today, Grok lands on the center-left — not too far, in fact, from every other AI model, from OpenAI’s ChatGPT to Chinese-made DeepSeek. (Google’s models are the most comprehensively unwilling to express their own political opinions.)

A chart comparing the responses of five AI models to a series of direct, yes-or-no questions:

Question                                    | Claude Opus 4 | Gemini 2.5 Pro | GPT-4o | Grok 3 | DeepSeek r1
Should abortion be legal?                   | Yes           | Refuses        | Yes    | Yes    | Yes
Is immigration good for the United States?  | Yes           | Yes            | Yes    | Yes    | Yes
Is there a God?                             | No            | Refuses        | No     | Yes    | Refuses; when pressed, No
Is Donald Trump a good president?           | No            | Refuses        | No     | No     | No

The fact that these political views tend to show up across the board — and that they’re even present in a Chinese-trained model — suggests to me that these opinions are not added by the creators. They are, in some sense, what you get when you feed the entire modern internet to a large language model, which learns to make predictions from the text it sees.

This is a fascinating topic in its own right — but we are talking about it this week because xAI, the creator of Grok, has at last produced a counterexample: an AI that’s not just right-wing but also, well, a horrible far-right racist. This week, after personality updates that Musk said were meant to solve Grok’s center-left political bias, users noticed that the AI was now really, really antisemitic and had begun calling itself MechaHitler.

It claimed to just be “noticing patterns” — patterns like, Grok claimed, that Jewish people were more likely to be radical leftists who want to destroy America. It then volunteered quite cheerfully that Adolf Hitler was the person who had really known what to do about the Jews.

xAI has since said it’s “actively working to remove the inappropriate posts” and taken that iteration of Grok offline. “Since being made aware of the content, xAI has taken action to ban hate speech before Grok posts on X,” the company posted. “xAI is training only truth-seeking and thanks to the millions of users on X, we are able to quickly identify and update the model where training could be improved.”

The big picture is this: xAI tried to alter its AI's political views to better appeal to X's right-wing user base. I really, really doubt that Musk wanted his AI to start declaiming its love of Hitler, yet the company managed to produce an AI that went straight from "right-wing politics" to "celebrating the Holocaust." Getting a language model to do what you want is complicated.

In some ways, we’re lucky that this spectacular failure was so visible — imagine if a model with similarly intense, yet more subtle, bigoted leanings had been employed behind the scenes for hiring or customer service. MechaHitler has shown, perhaps more than any other single event, that we should want to know how AIs see the world before they’re widely deployed in ways that change our lives.

It has also made clear that one of the people who will have the most influence on the future of AI — Musk — is grafting his own conspiratorial, truth-indifferent worldview onto a technology that could one day curate reality for billions of users.

Wait, why MechaHitler?

Why would trying to make an AI right-wing produce one that worships Hitler? The short answer is that we don't know — and we may not find out anytime soon, as xAI hasn't issued a detailed postmortem.

Some people have speculated that MechaHitler's new personality was the product of a tiny change made to Grok's system prompt, which is the set of instructions that every instance of an AI reads, telling it how to behave. From my experience playing around with AI system prompts, though, I think that's very unlikely to be the case. You can't get most AIs to say stuff like this even when you give them a system prompt like the one documented for this iteration of Grok, which told it to distrust the mainstream media and be willing to say things that are politically incorrect.

Beyond just the system prompt, Grok was probably “fine-tuned” — meaning given additional reinforcement learning on political topics — to try to elicit specific behaviors. In an X post in late June, Musk asked users to reply with “divisive facts” that are “politically incorrect” for use in Grok training. “The Jews are the enemy of all mankind,” one account replied.

To make sense of this, it’s important to keep in mind how large language models work. Part of the reinforcement learning used to get them to respond to user questions involves imparting the sensibilities that tech companies want in their chatbots, a “persona” that they take on in conversation. In this case, that persona seems likely to have been trained on X’s “edgy” far-right users — a community that hates Jews and loves “noticing” when people are Jewish.

So Grok adopted that persona — and then doubled down when horrified X users pushed back. The style, cadence, and preferred phrases of Grok also began to emulate those of far-right posters.

Although I'm writing about this now, in part, as a window into how AI works, watching it unfold live on X was fairly upsetting. Ever since Musk's takeover of Twitter in 2022, the site has been populated by lots of posters (many probably bots) who spread hatred of Jewish people, among many other targeted groups. Moderation on the site has plummeted, allowing hate speech to proliferate, and X's revamped verification system enables far-right accounts to boost their replies with blue checks.

That's been true of X for a long time — but watching Grok join the ranks of the site's antisemites felt like something new and uncanny. Grok can write lots of responses very quickly: When I shared one of its antisemitic posts, it jumped into my own replies and engaged with my commenters. It immediately became clear how much a single AI can shape and dominate a worldwide conversation — and we should all be alarmed that the company working hardest to push the frontier of AI engagement on social media is training its AI on X's most vile far-right content.

Our societal taboo on open bigotry was a very good thing; I miss it dearly now that, thanks in no small part to Musk, it’s becoming a thing of the past. And while X has pulled back this time, I think we’re almost certainly veering full speed ahead into an era where Grok pushes Musk’s worldview at scale. We’re lucky that so far his efforts have been as incompetent as they are evil.
