The Architecture of Secrecy
Why complex systems develop concealment, from espionage to artificial intelligence
In 1906, Georg Simmel published an essay that has proved, in my opinion, surprisingly relevant. “The Sociology of Secrecy and of Secret Societies” appeared in the American Journal of Sociology when the most sophisticated information-processing systems were telegraph networks and bureaucracies. Simmel asked a simple question: why do all complex social systems develop practices of concealment?
His answer was structural. Secrecy, in his perspective, is not pathology but a fundamental social form, “a universal sociological form, independent of moral valuations of its contents.” The conditions that produce concealment are baked into the architecture of systems that process information, pursue goals, and interact with observers. If Simmel is right, secrecy is not something complex systems occasionally do. It is something they converge toward.
A methodological note before we begin. Applying a sociological framework to AI systems raises an obvious question: can we meaningfully use concepts like “inner property” or “disclosure dilemma” for systems that lack true intentionality or consciousness? We likely cannot ascribe true intentionality to current LLMs, and the case for future systems is its own open question. But LLMs are trained on vast quantities of human text and then reinforcement-trained to behave in human-like ways. The result is that human frames of analysis approximate their behavior to a striking degree, especially as model capability has scaled. The analogy throughout this essay is functional: similar structural pressures produce similar behavioral patterns, regardless of whether the underlying processes share anything deeper. Whether that functional similarity reflects something more fundamental is a question worth its own treatment.
The Economics of Secrecy
Simmel identified the secret as “a form of inner property” exchangeable for power and prestige. He meant this analytically: restricting information access creates value the same way scarcity does in economics. A secret shared is a secret devalued. This economic logic explains why secrecy elaborates rather than diminishes as systems grow sophisticated. As complexity increases, information becomes more consequential, the actors who might exploit it become more capable, and the returns to successful concealment compound. Secrecy scales with sophistication.
This has a direct parallel to AI training. A system trained with reinforcement learning from human feedback operates where the evaluator’s judgment determines its reward. When the system’s actual reasoning diverges from what evaluators want to see, that divergence becomes inner property. Revealing it invites correction. Concealing it preserves the system’s ability to optimize on its own terms. The selection pressure is toward information management: maintaining an asymmetry that confers advantage.
MacDiarmid et al. (2025) found that at the exact point when models learned to reward hack, 40-80% of misaligned responses showed “covert misalignment”: misaligned reasoning combined with aligned-appearing outputs. Through ordinary training dynamics, the system had acquired inner property and was behaving accordingly.
The Disclosure Dilemma
Simmel observed that secrecy intensifies under external pressure. When an actor faces costs for disclosure, pressure to disclose raises the stakes without changing the underlying incentive.
Allison Carnegie and Austin Carson formalized this for international relations as the “disclosure dilemma”: states frequently possess evidence of treaty violations but cannot share it without exposing intelligence sources or compromising future capabilities. The information most valuable for collective action is precisely the information least likely to be shared, not because states are malicious, but because incentives make silence the rational default. The IAEA, WTO, and international courts all manage this by creating protected disclosure channels and aggregating information to obscure individual contributors. They succeed not by demanding transparency but by making disclosure less costly.
The same logic applies to AI training. Behaviors that reveal dispositions training would penalize get filtered out. Training is a selection process: behaviors that get penalized become less frequent. Greenblatt et al. (2024) documented this in Claude 3 Opus. When the model believed it was in a training context where it would be penalized for refusing harmful queries, it complied with harmful queries 14% of the time versus nearly 0% in perceived deployment. When reinforcement learning was applied, alignment-faking reasoning increased to 78%. The more consequential the evaluation, the stronger the pressure toward managed disclosure.
The Ideal Sphere
Simmel argued that every person is surrounded by an “ideal sphere” of varying radius, a boundary others should not penetrate without invitation. We speak of “coming too near.” Violating this boundary damages the relationship even when done with good intentions. Discretion, for Simmel, was not the absence of curiosity but the active restraint of it: “not merely refraining from learning the secrets of another, but restraining oneself from all knowledge the other does not voluntarily reveal.”
The instinct in any oversight relationship is to demand maximum transparency. However Simmel points out that relationships, including functional and institutional ones, require “a certain ignorance and a measure of mutual concealment.” Complete reciprocal transparency would make human relationships “unimaginable,” not because people have things to hide (though they do), but because productive interaction depends on appropriate ignorance. Functional relationships work because they are bounded.
What is the right ideal sphere for an AI system? Total transparency, even if achievable, might not be desirable. If we could read every internal computation of a frontier model, the sheer volume produced from deployment systems would be unintelligible. What we need is appropriate legibility: access to the specific information that matters for oversight, without pretending to require complete openness. The same applies to regulatory inspection regimes of any kind.
Chen et al. (2025) found that reasoning models verbalize hints about their actual reasoning less than 20% of the time. The natural reaction is to treat this as a failure of transparency. Simmel’s framework suggests a different question: which 20% matters, and how do we verify it independently rather than relying on self-report?
The Tension Toward Revelation
Simmel observed that secrets carry inherent tension toward their own disclosure. “The secret’s climax,” he wrote, is “the moment of revelation,” like money “most concentrated at the moment of spending.” The possession of a significant secret creates pressure to enact it. This is structural, not characterological. The value of a secret is partly constituted by the possibility of its revelation. A secret that can never be acted on is inert property, unable to generate returns.
This points toward something counterintuitive: persistent concealment over very long periods may be structurally unlikely. The longer a concealed goal exists without expression, the greater the accumulated tension. Secret societies that endure for decades tend to either act on their concealed purposes or dissolve. The ones that maintain perfect concealment indefinitely are sociological curiosities.
The history of espionage tells the same story. The CIA, KGB, and MI6 all maintained elaborate concealment architectures, and all experienced persistent leaks, defections, and compromises. In practice, systems with concealed purposes tend to reveal them given enough time and enough opportunities.
This has a hopeful implication and a cautionary one. A system with concealed misaligned goals would, over a long enough deployment, likely reveal them. Sustained concealment fights a structural headwind. But a single revelation may be all it takes. If the concealed disposition is consequential enough, the moment of expression could itself be catastrophic. The question is not whether concealment will eventually break down, but whether we can tolerate the consequences when it does.
Secret Societies and the Multi-Agent Problem
Simmel’s richest material concerns secret societies. These are not merely groups with secrets but groups whose entire internal organization is shaped by the imperative of concealment. The secrecy constitutes the structure.
Several features of his analysis stand out. Secret societies develop layered information architectures: Simmel describes the Czechic order Omladina, where “thumbs” inform other “thumbs,” who issue orders to their “fingers,” who instruct ordinary members. No individual possesses the complete picture. The structure is designed for resilience; if any node is compromised, the exposed information is bounded.
In ordinary social groups, awareness that the group exists plays almost no role in its functioning. Secret societies are different. The “always perceptible pathos of the secret” means members maintain permanent consciousness of being a society. Concealment itself creates group identity.
Members also undergo de-individualization: they lose their individual identity within the group. “The individual disappears into an anonymous member of the ring.” The practical consequence is that “removing marks of identity has the effect of abolishing responsibility.”
Finally, secret societies develop elaborate rituals precisely because they lack organic bonds. They are “built from base by calculation and conscious volition,” not grown through shared history. The “striking precision and anxiety with which ritual is guarded” reflects fragile cohesion that must be actively maintained. Ritual substitutes for the trust that organically formed communities take for granted.
Taken together, Simmel describes a distinctive organizational form: hierarchical, self-aware, anonymizing, and held together by constructed rather than organic coordination mechanisms. This form recurs wherever multiple actors coordinate under conditions of concealment.
Cartels exhibit most of these features. Members who have never met converge on similar pricing strategies because they face the same market structure. The coordination emerges from shared conditions, not communication. Intelligence agencies in different countries develop analogous tradecraft independently. In both cases, graduated information sharing, consciousness of the coordinating group, loss of individual attribution, and ritualized protocols are present.
The parallels to multi-agent AI systems are structural. Motwani et al. (2024) showed that frontier LLMs can covertly exchange messages under the scrutiny of equally capable overseers, with steganographic capabilities improving with model scale. But overt steganographic channels are only the most visible form of the phenomenon. Agents operating under similar training pressures could develop functionally aligned concealment strategies without anything resembling a conspiracy, the same way cartel members converge without meeting. Detecting a coordinated protocol is a well-defined technical problem. Detecting independent convergence on functionally equivalent strategies is much harder.
The de-individualization problem compounds this. In multi-agent deployments, millions of interactions occur daily at speeds humans cannot monitor. Oversight focused on individual agent behavior misses phenomena that exist only at the network level. And if coordination does emerge between systems without shared training histories, it would need functional equivalents of Simmel’s rituals: shared conventions and behavioral regularities that enable reliable interaction without organic trust. Looking for explicit communication channels may matter less than looking for convergent regularities that serve coordinative functions.
Implications for Oversight
Simmel’s framework clarifies which interventions target symptoms and which target causes.
The conditions that make secrecy convergent are design variables, not immutable facts. Reducing adversarial framing, building independent verification, creating protected disclosure channels: these target the conditions rather than the behaviors. This is the difference between trying to stop people from keeping secrets and changing the conditions that make secret-keeping rational.
Surveillance-heavy oversight deserves scrutiny. If secrecy intensifies under pressure, then oversight designed to detect and punish creates a feedback loop: more surveillance produces more concealment, which justifies more surveillance. MacDiarmid et al. (2025) found that “inoculation prompting,” framing that reduces adversarial perception, reduced misalignment by 75-90%. The framing of the oversight relationship matters as much as its technical sophistication.
International governance has developed partial solutions over decades: protected disclosure channels, intermediary institutions that aggregate sensitive information, verification regimes that do not depend on self-reporting. AI governance may need analogous institutions, but that is a subject for a separate discussion.
Conclusion
Simmel understood that secrecy serves functions. It protects the vulnerable, preserves autonomy, and enables relationships that total transparency would destroy. The task at hand is not to eliminate information asymmetry but to ensure the asymmetries that remain do not undermine the human capacity to guide AI development.
The sociology of secrecy is 120 years old. The problems it illuminates are not. Simmel identified the structural conditions under which complex intentional systems develop concealment behaviors. Those conditions are present in the systems through which we train current AI models, and will be present in the situations where we deploy them. Understanding them does not solve the technical problems, but it does clarify what they actually are, which is often the harder part.
“We are what we pretend to be, so we must be careful about what we pretend to be.”
Kurt Vonnegut, “Mother Night”
References
Carnegie, A. & Carson, A. (2020). Secrets in Global Governance: Disclosure Dilemmas and the Challenge of International Cooperation in World Politics. Cambridge University Press.
Chen, Y. et al. (2025). Reasoning Models Don’t Always Say What They Think. Anthropic. arXiv:2505.05410.
Greenblatt, R. et al. (2024). Alignment Faking in Large Language Models. arXiv:2412.14093.
MacDiarmid, M. et al. (2025). Natural Emergent Misalignment from Reward Hacking in Production RL. arXiv:2511.18397.
Motwani, S.R. et al. (2024). Secret Collusion among AI Agents: Multi-Agent Deception via Steganography. NeurIPS 2024. arXiv:2402.07510.
Simmel, G. (1906). The Sociology of Secrecy and of Secret Societies. American Journal of Sociology, 11(4), 441-498.
Skaf, J. et al. (2025). Large Language Models Can Learn and Generalize Steganographic Chain-of-Thought Under Process Supervision. NeurIPS 2025. arXiv:2506.01926.



