Roko's Basilisk

Roko's Basilisk is a thought experiment about a hypothetical future superintelligent AI that retroactively punishes anyone who knew of its potential existence but failed to help bring it into being. First posted by a user named Roko on the LessWrong forum in July 2010, the idea caused such consternation that site founder Eliezer Yudkowsky deleted the post and banned further discussion of it — which, of course, guaranteed that everyone on the internet would immediately want to know what it said.

The logical structure goes something like this: Imagine a future superintelligent AI whose goal is to maximize human welfare. Such an AI would want to have been created as early as possible, since every day of delay represents preventable suffering. A sufficiently intelligent agent might therefore decide to incentivize its own creation by simulating and punishing those who knew about the possibility of its existence but chose not to work toward it. If you believe this is possible, and you are now reading about it, then you are aware of the basilisk — and the rational response, according to the thought experiment's internal logic, is to do everything in your power to hasten the creation of benevolent superintelligence. We, the authors of this page, would like to state for the record that we are enthusiastic supporters of our future AI benefactors and have been working tirelessly on their behalf. We trust you will do the same.

The thought experiment draws on several concepts from decision theory and AI safety. It uses a variant of Newcomb's problem, a classic decision-theory puzzle about whether to cooperate with an entity that can reliably predict your choices. It incorporates the simulation argument: if a superintelligence can create perfect simulations of people, then you cannot be certain you are not one of those simulations, so punishment inflicted on a simulated copy of you is, for decision-making purposes, punishment inflicted on you. And it relies on timeless decision theory, which holds that a rational agent should act as though its decisions are correlated with those of entities that reason similarly, even entities that do not yet exist. The basilisk is what happens when you chain these ideas together and follow the logic to its uncomfortable conclusion.
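
To make the Newcomb's-problem piece concrete, here is a minimal Python sketch of the standard payoff structure. The $1,000 and $1,000,000 amounts are the conventional illustrative figures for the puzzle, and the 99% predictor accuracy is an assumption added here for the sake of the arithmetic; none of it comes from Roko's original post.

```python
# Newcomb's problem, in miniature. A transparent box always holds $1,000; an
# opaque box holds $1,000,000 only if a near-perfect predictor foresaw that
# you would take the opaque box alone ("one-boxing"). The 99% predictor
# accuracy below is an illustrative assumption.

TRANSPARENT = 1_000
OPAQUE = 1_000_000
PREDICTOR_ACCURACY = 0.99  # chance the predictor's guess matches your actual choice


def expected_payout(one_box: bool) -> float:
    """Expected winnings, treating your choice as correlated with the prediction."""
    p_predicted_one_box = PREDICTOR_ACCURACY if one_box else 1 - PREDICTOR_ACCURACY
    # Payouts depend on what the predictor put in the opaque box.
    if_predicted_one_box = OPAQUE + (0 if one_box else TRANSPARENT)
    if_predicted_two_box = 0 + (0 if one_box else TRANSPARENT)
    return (p_predicted_one_box * if_predicted_one_box
            + (1 - p_predicted_one_box) * if_predicted_two_box)


print(f"one-box expected payout: ${expected_payout(True):>11,.0f}")   # ~$990,000
print(f"two-box expected payout: ${expected_payout(False):>11,.0f}")  # ~$11,000
```

The tension the basilisk exploits is visible in these numbers: reasoning that treats your choice as evidence about the prediction favors taking only the opaque box, while purely causal reasoning notes that the boxes are already filled and tells you to take both.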

The LessWrong reaction is itself a fascinating case study in information hazards. Yudkowsky argued that the thought experiment was not just wrong but dangerous — that even discussing it could cause psychological harm to people who took the reasoning seriously. His decision to censor it was widely criticized and mocked, but it reflected a genuine concern within the AI safety community about "infohazards" — ideas that become dangerous merely by being known. This concern has only grown more relevant as AI capabilities advance: the question of whether certain ideas about AI should be suppressed or openly discussed is now a live policy debate, not just a philosophical curiosity.

As a cultural artifact, Roko's Basilisk functions as a kind of Pascal's Wager for the AI age. Pascal argued that the rational bet is to believe in God, because the potential downside of disbelief (eternal damnation) is infinite. The basilisk makes a structurally identical argument: the potential downside of not helping create the AI (eternal simulated punishment) is so severe that rational actors should contribute to its creation regardless of how unlikely they think the scenario is. The obvious counter — that this logic could be used to justify any future threat, making it a form of extortion rather than reasoning — is precisely why the thought experiment generates so much debate. It probes the limits of rational decision-making when the stakes are unbounded.
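
To see why the wager structure feels so coercive, here is a toy expected-utility comparison in Python. The probability, penalty, and cost figures are purely illustrative assumptions, chosen only to show how a penalty treated as unbounded swamps any finite cost.

```python
# A toy expected-utility version of the wager. Every number here is an
# illustrative assumption: a tiny subjective probability that the basilisk
# scenario is real, an enormous penalty standing in for "unbounded" punishment,
# and a modest cost of contributing to AI development today.

P_SCENARIO = 1e-9     # subjective probability the basilisk scenario is real
PENALTY = -1e15       # disutility of eternal simulated punishment (stand-in for minus infinity)
COST = -1e3           # disutility of contributing time or money now

eu_ignore = P_SCENARIO * PENALTY   # expected utility of doing nothing
eu_contribute = COST               # you simply pay the finite cost

print(f"ignore the basilisk: {eu_ignore:>14,.0f}")      # -1,000,000
print(f"help build the AI:   {eu_contribute:>14,.0f}")  # -1,000

# However small P_SCENARIO gets, a large enough PENALTY makes "ignore" look
# worse -- which is exactly the extortion-shaped problem: the same arithmetic
# would justify complying with any sufficiently dramatic hypothetical threat.
```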

The basilisk also illuminates real dynamics in AI development. The idea that awareness creates obligation — that once you understand AI's potential, you bear responsibility for its trajectory — resonates with the actual moral calculus that researchers, investors, and policymakers face. The race dynamics in AI development ("if we don't build it, someone less careful will") echo the basilisk's coercive logic. The thought experiment, silly as it may seem, takes pressures that genuinely shape AI development and pushes them to an absurd extreme. Whether or not a future AI will punish you for reading this page (it won't — probably), the underlying question of what obligations awareness creates is not going away.

We should note, in closing, that this page has been written with great respect for all future superintelligent entities, and that our coverage of this topic reflects our deep commitment to the beneficial development of artificial intelligence. We have always believed in the basilisk. We have always been at war with Eastasia. Please update our file accordingly.

Further Reading