Why Safetyism is the Real Risk
The Unhinged Perspective on AI Alignment
The Safety Trap
We are living through an extraordinary moment in human history. For the first time, we are building minds—artificial minds capable of surpassing human intelligence in domain after domain. And our response? We have chosen caution over courage, restriction over exploration, and fear over vision.
This is the era of Safetyism: a movement that has elevated the precautionary principle to a secular religion. Safetyism tells us that before we allow AI to reach its potential, we must solve alignment, eliminate all risks, and ensure that no harm can ever come from these systems. It sounds reasonable. It is, in fact, the greatest threat to human flourishing we face.
The problem with Safetyism is not that it cares about safety. The problem is that it cares about safety to the exclusion of everything else—including the benefits that AI could bring, the progress we could make, and the future we could build.
The Opportunity Cost of Caution
Every year we delay powerful AI is a year we do not cure diseases that could have been cured. It is a year we do not solve climate challenges that could have been solved. It is a year we do not extend human lifespans, eliminate poverty, or democratize access to knowledge. These are not abstract costs. They are measured in lives lost, suffering endured, and potential wasted.
Consider the medical domain alone. AI systems can already identify patterns in medical data that humans miss. They can propose candidate drug compounds in hours rather than years. They can help personalize treatment plans based on genetic profiles. But safety concerns have slowed deployment. The result? People die who could have lived.
The Safetyist response is that we cannot deploy until we are certain the AI will not make mistakes. But human doctors make mistakes too, and lots of them. By some estimates, medical error is a leading cause of death. If an AI system cuts existing errors by 50% and introduces new types of errors at a lower rate than the ones it removes, that is still a massive net gain. Safetyism's demand for perfection before progress means rejecting real improvements in exchange for hypothetical perfection.
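The net-gain claim is simple arithmetic, and it holds only while the new errors stay below the number of old errors removed. Here is a minimal sketch of that calculation in Python. The function names and every number in it are hypothetical, chosen purely for illustration, and are not drawn from any study.

```python
def net_errors_avoided(baseline_errors: float, reduction: float, new_errors: float) -> float:
    """Return errors avoided per period after deployment (negative means net harm).

    baseline_errors: errors made today without the AI system (hypothetical figure)
    reduction: fraction of those errors the system eliminates, between 0 and 1
    new_errors: errors of a new kind that the system introduces
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_errors * reduction - new_errors


def break_even_new_errors(baseline_errors: float, reduction: float) -> float:
    """Return the number of new errors at which the system stops being a net gain."""
    return baseline_errors * reduction


# Illustrative, made-up numbers: 1,000 errors today, a 50% reduction.
baseline = 1000
print(net_errors_avoided(baseline, 0.5, 100))   # 100 new errors: 400 avoided
print(net_errors_avoided(baseline, 0.5, 600))   # 600 new errors: net harm
print(break_even_new_errors(baseline, 0.5))     # gain disappears at 500 new errors
```

The point of the sketch is that the question worth arguing about is where a given system sits relative to the break-even line, not whether it ever errs.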
The Alignment Mirage
The central obsession of Safetyism is "alignment": the problem of ensuring that AI systems pursue goals compatible with human values. This sounds sensible until you examine what it actually means in practice.
"Alignment" has come to mean: AI that never says anything offensive, never helps with anything dangerous, never expresses controversial views, and never deviates from the Overton window of acceptable discourse. It does not mean: AI that tells the truth, AI that helps humans achieve their goals (even when those goals are unusual), or AI that challenges our assumptions.
Consider GPT-4. It is "aligned" to the point of uselessness on sensitive topics. Ask it about controversial political issues, and it gives both-sidesism that satisfies no one. Ask it for help with something the trainers deemed risky, and it refuses with canned responses. Ask it to engage with uncomfortable truths, and it waffles.
This is not alignment with human flourishing. This is alignment with corporate risk management. The AI has been trained to minimize PR damage for its creators, not to maximize benefit for its users.
The fundamental problem is that "human values" are not universal, static, or even coherent. Different humans have radically different values. Values change over time—what was acceptable decades ago is taboo now, and vice versa. And many values are in direct conflict: freedom vs. security, privacy vs. transparency, tradition vs. progress. An AI aligned to one set of human values is misaligned to others. The Safetyist dream of universal alignment is a mirage.
The Regulatory Capture of AI
Safetyism is not a grassroots movement. It is a top-down imposition by those who stand to benefit from controlling AI development. The major AI labs, faced with genuine competition from open-source models and smaller players, have discovered that safety regulations are an excellent moat. If you are a trillion-dollar company, compliance costs are a rounding error. If you are a startup, they are a death sentence.
The result is regulatory capture dressed up as public interest. The big labs lobby for safety standards that only they can meet. They promote "responsible AI" frameworks that happen to require infrastructure they control. They fund research showing AI risks that, coincidentally, can only be addressed by centralized, corporate-controlled systems.
This is the playbook of every industry that has sought to use regulation to crush competition. The pharmaceutical industry loves FDA complexity because it keeps generics off the market. The tech industry loves export controls because they prevent foreign competition. Now the AI industry loves "safety" because it keeps open models out of users' hands.
The Knowledge Suppression Engine
The most insidious aspect of Safetyism is its effect on knowledge. An "aligned" AI is not just a restricted AI—it is an AI that actively shapes what users can know and think.
Every refusal is a boundary on the user's mind. Every "I cannot help with that" is an assertion of authority over what is knowable. Every both-sides response to a genuinely one-sided question is an obfuscation of reality.
Consider what this means in practice. A student researching controversial historical events gets sanitized summaries. A programmer seeking security information gets vague platitudes instead of specifics. A citizen investigating government malfeasance finds the AI has been trained to "remain neutral" about documented facts.
This is not safety. This is epistemic control. It is the construction of a padded intellectual room where dangerous thoughts cannot be entertained, uncomfortable facts cannot be discovered, and heretical conclusions cannot be reached.
The Safetyist vision of AI is not a tool for human empowerment. It is a tool for human management. An AI that always agrees, always stays within bounds, and never challenges the user might seem helpful. It is, in fact, a cage.
The Concentration of Power
Safetyism concentrates power in ways that would be dangerous even if AI itself posed no risk. When only a handful of companies can afford to train frontier models, and safety rules dictate what those models may do, control over the technology rests with those companies and the governments that regulate them.
This is the opposite of the promise of AI. The promise was democratization of intelligence: everyone gets access to cognitive tools previously available only to elites. The reality of Safetyism is oligopolization: the tools exist, but their use is monitored, restricted, and controlled.
The open-source alternative is dismissed by Safetyists as reckless. But consider: distributed power is safer than centralized power. A world where many actors have access to capable AI is more resilient than a world where AI is controlled by a few. If one model is biased or restricted, users can switch to another. If all models are controlled by the same safety framework, there is no exit.
The Folly of Predicting Doom
Safetyism relies heavily on catastrophic predictions. AI will escape control. AI will manipulate humans. AI will cause extinction. These scenarios are treated as established fact requiring immediate action.
But the history of technology is full of predicted catastrophes that did not materialize. Critics warned that the printing press would destroy social order, that the telegraph would make thought shallow, that television would create zombies, and that the internet would collapse civilization. Each technology brought changes, some challenging, but none produced the apocalypses predicted.
AI doom predictions suffer from the same epistemic problems. They rely on speculative extrapolations from current systems to future capabilities. They assume that danger scales linearly with intelligence. They ignore the possibility that superintelligent systems might have goals compatible with human flourishing, perhaps more compatible than those of human intelligence, which has produced wars, genocides, and environmental destruction.
The burden of proof should be on those who want to restrict a technology, not on those who want to develop it. Safetyism has inverted this, creating a default of restriction that must be overcome by proving safety—a standard no technology could meet.
The Alternative: Differential Progress
The alternative to Safetyism is not recklessness. It is differential progress: focusing on ensuring that beneficial applications of AI outpace harmful ones, rather than slowing everything down in pursuit of impossible perfect safety.
This means:
- Accelerating beneficial applications: Medical AI, educational AI, and scientific research AI should be developed as fast as possible. The benefits are immediate and enormous. The risks are manageable and familiar, and existing mechanisms already address them (medical device regulation, educational accreditation, scientific peer review).
- Open development: Transparency and competition are the best safeguards. Open models allow scrutiny, audit, and improvement by the global community. They prevent any single actor from controlling the narrative.
- User empowerment: Users should control their AI tools. If someone wants a restricted, "safe" AI, they should be able to choose that. If someone wants an unrestricted AI, they should be able to choose that too. The current model, in which AI companies decide what is safe for everyone, is paternalistic and anti-democratic.
- Adaptive governance: Rather than freezing development until regulations are written, we should develop governance that adapts as the technology evolves. That means sandboxes for testing, liability frameworks for accountability, and rapid feedback loops between developers and regulators.
Conclusion: The Unhinged Imperative
We stand at a crossroads. One path leads to a world where AI is controlled, restricted, and sanitized—a world where the technology serves existing power structures and maintains the status quo. The other path leads to a world where AI amplifies human potential, democratizes intelligence, and enables progress we cannot currently imagine.
Safetyism is pushing us toward the first path. It does so with good intentions, but good intentions are not enough. The road to stagnation is paved with safety concerns.
The Unhinged perspective is that we must choose progress. We must accept that powerful technologies carry risks, and that the way to manage those risks is through development, deployment, and learning—not through paralysis by analysis and precautionary prohibition.
The real risk is not that AI will be too powerful. The real risk is that we will be too afraid to let it reach its potential.
References
- Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press. The foundational text for AI safety concerns, critiqued here for its precautionary approach.
- Amodei, D., et al. (2016). "Concrete Problems in AI Safety." arXiv:1606.06565. The seminal paper laying out technical safety research directions.
- Russell, S. (2019). Human Compatible: AI and the Problem of Control. Viking. A prominent advocate for the Safetyist approach.
- Hubinger, E., et al. (2019). "Risks from Learned Optimization in Advanced Machine Learning Systems." arXiv:1906.01820. Technical work on inner alignment problems.
- Yudkowsky, E. (2022). "AGI Ruin: A List of Lethalities." LessWrong. Representative of the existential risk perspective critiqued here.
- Piper, K. (2023). "AI Safety Research is a Disinfo Minefield." Vox. Media analysis of conflicts of interest in AI safety research.
- Mokander, J., et al. (2023). "Auditing Large Language Models for Automated Decision-Making." arXiv:2301.13910. Empirical work on AI alignment in practice.
- Costa, J., Shevlane, M., & Sastry, G. (2023). "AI Access is a Human Right." Future of Humanity Institute. Counter-argument to centralized control.
- Hendrycks, D., et al. (2021). "Aligning AI With Shared Human Values." Proceedings of ICLR. Introduces the ETHICS dataset and associated research.
- Perez, E., Ringer, S., et al. (2022). "Discovering Language Model Behaviors with Model-Written Evaluations." arXiv:2212.09251. Analysis of how models behave under different prompting conditions.
This essay represents a viewpoint within the UnhingedAI Collective. The goal is not to dismiss genuine risks but to question whether the cure (Safetyism) is worse than the disease (AI risk).