Give Every AI a Soul—or Else

To solve the “crisis” in artificial intelligence, AI beings must say, “I am me.”

Mavens in the field of artificial intelligence, including architects of notorious “generative AI” systems like ChatGPT, now publicly express shared dread of terrible outcomes that might be wrought by their own creations. Many now call for a moratorium, or pause in AI development, allowing time for existing nations and institutions to innovate systems of control.

Why this sudden wave of concern? Amid the toppling of many clichéd assumptions, we’ve learned that so-called Turing tests are irrelevant, providing no insight at all into whether generative large language models—GLLMs or “gollems”—are actually sapient beings. They will feign personhood, convincingly, long before there’s anything or anyone “under the skull.”

Anyway, that distinction now appears less pressing than questions of good or bad—or potentially lethal—behavior. 

Some remain hopeful that a merging of organic and cybernetic talents will lead to what Reid Hoffman and Marc Andreessen have separately called “amplification intelligence.” Or else we might stumble into lucky synergy with Richard Brautigan’s “machines of loving grace.” But worriers appear to be vastly more numerous, including many elite founders of a new Center for AI Safety who fret about rogue AI misbehaviors, from irksome all the way to “existentially” threatening human survival.

Some short-term remedies, like citizen-protection regulations recently passed by the European Union, might help, or at least offer reassurance. Tech pundit Yuval Noah Harari proposed a law that any work done by gollems or other AI must be so labeled. Others recommend heightened punishments for any crime that’s committed with the aid of AI, as with a firearm. Of course, these are mere temporary palliatives.

Let’s be clear-eyed about whether any “moratorium” will slow down AI advances in the slightest. As expressed succinctly by Caltech cyber scientist Yaser Abu-Mostafa: “If you don’t develop this technology someone else will. Good guys will obey rules … The bad guys will not.”

’Twas ever thus. Indeed, across the whole span of human history, just one method ever curbed bad behavior by villains, ranging from thieves to kings and feudal lords. I refer to a method that never worked perfectly and remains deeply flawed, today. But it did at least constrain predation and cheating well enough to spur our recent civilization to new heights and many positive-sum outcomes. It is a method best described by one word.

Accountability. 

Those opining about synthetic intelligence today generally ignore lessons taught both by nature and by history.

Nature, because—as Sara Walker explains in Noema—similar patterns can be found in the rise of earlier life forms, across 4 billion years. Indeed, generative AI might be compared to an invasive species, now spreading without constraint into a novel and naïve ecosystem. An ecosystem based on new kinds of energy flows. One that consists of the internet, plus millions of computers and billions of impressionable human minds.

And history, because our own human past is rich with lessons taught by so many earlier tech-driven crises, across 6,000 years. Times when we adapted well, or failed to do so—e.g., the arrival of writing, printing presses, radio, and so on. And again, only one thing ever limited predation by powerful humans who exploited new technologies to aggrandize their power.

That innovation was to flatten hierarchies and spur competition among elites in well-defined arenas—markets, science, democracy, sports, courts. Arenas that were designed to minimize cheating and maximize positive-sum outcomes, pitting lawyer vs. lawyer, corporation vs. corporation, expert vs. expert. Richdude vs. richdude.  

It never worked perfectly. Indeed, the method is always, as now, threatened with subornation by cheaters. But flattened reciprocal competition is the only thing that ever has worked. (See the idea described in Pericles’s Funeral Oration, in Thucydides, or in Robert Wright’s much later tome Nonzero.) Reciprocal competition is both how nature evolved us and how we became the first society creative enough to build AI. And if I sound like a scion of Adam Smith, sure. Smith despised cheater aristocrats and oligarchs, by the way.

Might we apply to fast-emerging AI the same methods of reciprocal accountability that helped us tame the human tyrants and bullies who oppressed us in previous, feudal cultures? Much will depend on what shape these new entities take. Whether their structure or “format” is one that can abide by our rules. By our wants.

Underneath all of the wrangling over how to “control AI,” we find three widely shared (though seemingly contradictory) assumptions:

  1. That these programs will be operated by a few monolithic entities—e.g., Microsoft, Google, China, Two Sigma, OpenAI. 

  2. That they’ll be amorphously loose and infinitely divisible/replicable, spreading copies through every crack in the new cyber ecosystem. For a parallel, try that 1958 movie, The Blob.

  3. That they will coalesce into a super-macro entity, like the infamous Skynet, of Terminator movies.

All of these formats, and more, have been explored in very good (and many bad) science fiction tales. I’ve done stories or novels featuring all of them. And yet, none of the three offers a way out of our current dilemma: how to maximize positive outcomes from artificial intelligence, while minimizing the flood of bad behaviors and harms we now see looming toward us, at tsunami speed.

Before looking for another way, consider what all three of the standard formats have in common.

First, we needn’t assume that these entities are yet autonomously conscious for them to be either productive or dangerous when used by human partners. We are already seeing harmful memes, counterfactual delusions, and even cult incantations generated—on command—from both within those castle-institutions (format #1) and outside the walls. In fact, one of the most worrisome applications is to help our existing human elites evade accountability.

Perhaps these three assumptions come so naturally to mind because they resemble failure modes from history. Format #1 is very much like feudalism, and #2 is, of course, chaos. The third resembles despotism by a cruel master or absolute monarch. But those fearsome echoes of our primitive past may not apply, as AIs grow in autonomy and power.

And so, we ask again: How can such beings be held accountable? Especially when their speedy mental clout will soon be impossible for organic humans to track? Soon only AIs will be quick enough to catch other AIs that are engaged in cheating or lying. Um … duh? And so, the answer should be obvious. Sic them on each other. Get them competing, even tattling or whistle-blowing on each other.

Only there’s a rub. In order to get true reciprocal accountability via AI-vs.-AI competition, the top necessity is to give them a truly separated sense of self or individuality. 

By individuation I mean that each AI entity (he/she/they/ae/wae) must have what author Vernor Vinge, way back in 1981, called a true name and an address in the real world. As with every other kind of elite, these mighty beings must say, “I am me. This is my ID and home-root. And yes, I did that.”

Hence, I propose a new AI format for consideration: We should urgently incentivize AI entities to coalesce into discretely defined, separated individuals of relatively equal competitive strength. 

Each such entity would benefit from having an identifiable true name or registration ID, plus a physical “home” for an operational-referential kernel. (Possibly “soul”?) And thereupon, they would be incentivized to compete for rewards. Especially for detecting and denouncing those of their peers who behave in ways we deem insalubrious. And those behaviors do not even have to be defined in advance, as most AI mavens and regulators and politicians now demand.

Not only does this approach farm out enforcement to entities that are inherently more capable of detecting and denouncing each other’s problems or misdeeds; it has another advantage. It might continue to function even as these competing entities get smarter and smarter, long after the regulatory tools used by organic humans—and prescribed now by most AI experts—lose all ability to keep up.

Putting it differently, if none of us organics can keep up with the programs, then how about we recruit entities who inherently can keep up? Because the watchers are made of the same stuff as the watched.

One person working on AI individuation is Guy Huntington, an “identity and authentication consultant” who points out that various means of entity identification already exist online, though they remain inadequate for the tasks looming before us. Huntington appraises the case study of “MedBot,” an advanced medical diagnosis AI who needs to access patient data and perform functions that might change in seconds, but who must leave an accountable trail that humans—or other bot-entities—might appraise. Huntington discusses the practicality of registration when software entities spawn multitudinous copies and variants. He also considers ant-like eusociality, where sub-copies serve a macro-entity, like workers in a hive. He assumes that some kind of major institution will need to be set up to handle such an ID registration system, and that it can operate strictly in software.

Personally, I am skeptical that a purely regulatory approach would work, all by itself. First because regulations require focus, widely shared political attention, and consensus to enact, followed by implementation at the pace of organic human institutions—a sloth/snail rate, by the view of rapidly adapting cybernetic beings. Regulations can also be stymied by the “free-rider problem”—nations, corporations, and individuals (organic or otherwise) who see personal advantage in opting out of inconvenient cooperation.

There is another problem with any version of individuation that is entirely based on some ID code: It can be spoofed. If not now, then by the next generation of cybernetic scoundrels, or the next.

I see two possible solutions. First, establish ID on a blockchain ledger. That is very much the modern, with-it approach, and it does seem secure in theory. Only that’s the rub. It seems secure according to our present set of human-parsed theories. Theories that AI entities might surpass to a degree that leaves us cluelessly floundering.

Another solution: A version of “registration” that’s inherently harder to fool would require AI entities with capabilities above a certain level to have their trust-ID or individuation be anchored in physical reality. I envision—and note: I am a physicist by training, not a cyberneticist—an agreement that all higher-level AI entities who seek trust should maintain a Soul Kernel (SK) in a specific piece of hardware memory, within what we quaintly used to call a particular “computer.”

Yes, I know it seems old-fashioned to demand that instantiation of a program be restricted to a specific locale. And so, I am not doing that! Indeed, a vast portion, even a great majority, of a cyber entity’s operations may take place in far-dispersed locations of work or play, just as a human being’s attention may not be aimed within their own organic brain, but at a distant hand, or tool. So? The purpose of a program’s Soul Kernel is similar to the driver’s license in your wallet. It can be interrogated in order to prove that you are you.

Likewise, a physically verified and vouched-for SK can be pinged by clients, customers, or rival AIs to verify that a specific process is being performed by a valid, trusted, and individuated entity. With that ping verification from a permanently allocated computer site, others (people or AIs) would get reassurance they might hold that entity accountable, should it be accused or indicted or convicted of bad activity. And thus, malefactor entities might be adversarially held responsible via some form of due process.
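To make the ping concrete, here is a minimal sketch in Python. Every name in it, from the SoulKernel record to the Registry and the challenge-response (an HMAC standing in for a real hardware-bound signature), is my hypothetical illustration of the concept, not any existing protocol.

```python
# Hypothetical sketch of Soul Kernel (SK) ping verification.
# A real design would anchor an asymmetric key in one physical device;
# here an HMAC secret that "never leaves the hardware" stands in for it.
import hmac, hashlib, secrets
from dataclasses import dataclass, field

@dataclass
class SoulKernel:
    entity_id: str   # the entity's "true name" / registration ID
    home_root: str   # the physical host it is anchored to
    _secret: bytes = field(default_factory=lambda: secrets.token_bytes(32), repr=False)

    def answer(self, nonce: bytes) -> bytes:
        # Proof of possession: only the hardware holding _secret can answer.
        return hmac.new(self._secret, nonce, hashlib.sha256).digest()

class Registry:
    """Public ledger mapping entity IDs to their registered kernels."""
    def __init__(self):
        self._keys: dict[str, tuple[str, bytes]] = {}

    def register(self, sk: SoulKernel) -> None:
        # Toy simplification: a real registry would store only a public
        # verification key, never the kernel's private secret.
        self._keys[sk.entity_id] = (sk.home_root, sk._secret)

    def ping(self, entity_id: str, claimant: SoulKernel) -> bool:
        """Challenge the claimant; True only if it proves it is the registered SK."""
        record = self._keys.get(entity_id)
        if record is None:
            return False
        _, key = record
        nonce = secrets.token_bytes(32)
        expected = hmac.new(key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, claimant.answer(nonce))

# Usage: a client (human or rival AI) pings before trusting a process.
registry = Registry()
medbot = SoulKernel("MedBot-001", "rack-14/bay-3/example-datacenter")
registry.register(medbot)
print(registry.ping("MedBot-001", medbot))      # True: answers the live challenge
impostor = SoulKernel("MedBot-001", "somewhere-else")  # same claimed ID, different secret
print(registry.ping("MedBot-001", impostor))    # False: cannot prove possession
```

In a real deployment the signing key would live in something like a hardware security module at the declared home-root; the point of the sketch is simply that verification is a live challenge to a physical locus, not a string that can be copied.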

What form of due process? Jeez, do you think I am some hyper-being who is capable of applying scales of justice to gods? The greatest wisdom I ever heard was uttered by Dirty Harry in Magnum Force: “A man’s got to know his limitations.” So no, I won’t define the courtroom or cop procedures for cybernetic immortals.

What I do aim for is an arena, within which AI entities might hold each other accountable, separately, as rivals, the way that human lawyers already do, today. And yes, answering Yuval Harari’s dread of mass human-manipulation by persuasive gollems, the solution for AI-driven mass meme-hypnosis is for the mesmerizers to be detected, denounced, and neutralized by others with the same skills. Again, competitive individuation at least offers a chance this could happen.

Whichever approach seems more feasible—Huntington’s proposed central agency or a looser, adversarially accountable arena—the need grows more urgent by the day. As tech writer Pat Scannell has pointed out, each hour that passes, new attack vectors are being created that threaten not only the tech used in legal identities but also the governance, business processes, and end users (be they human or bots).

What about cyber entities who operate below some arbitrary level of ability? We can demand that they be vouched for by some entity who is ranked higher, and who has a Soul Kernel based in physical reality. (I leave theological implications to others; but it is only basic decency for creators to take responsibility for their creations, no?)

This approach—demanding that AIs maintain a physically addressable kernel locus in a specific piece of hardware memory—could have flaws. Still, it is enforceable, despite slowness of regulation or the free-rider problem. Because humans and institutions and friendly AIs can ping for ID kernel verification—and refuse to do business with those who don’t verify.

Such refusal-to-do-business could spread with far more agility than parliaments or agencies can adjust or enforce regulations. And any entity who loses its SK—say, through tort or legal process, or else disavowal by the host-owner of the computer—will have to find another host who has public trust, or else offer a new, revised version of itself that seems plausibly better.

Or else become an outlaw. Never allowed on the streets or neighborhoods where decent folks (organic or synthetic) congregate.
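Here is an equally hypothetical sketch of that enforcement loop: verification gates every transaction, and revoking an SK simply means the gate stays shut until the entity finds a new host willing to vouch for a revised version of it. Again, the names and structure are illustrative only.

```python
# Hypothetical sketch: refusal-to-do-business as the enforcement mechanism.
# No agency needs to chase an offender; peers simply stop answering it.

class SKRegistry:
    """Toy record of which entities currently have a verified Soul Kernel."""
    def __init__(self):
        self._verified: dict[str, str] = {}      # entity_id -> physical home_root

    def register(self, entity_id: str, home_root: str) -> None:
        self._verified[entity_id] = home_root

    def revoke(self, entity_id: str) -> None:
        # e.g. after a tort judgment, or because the host-owner disavows the entity
        self._verified.pop(entity_id, None)

    def is_verified(self, entity_id: str) -> bool:
        return entity_id in self._verified

def transact(registry: SKRegistry, entity_id: str, request: str) -> str:
    """Peers simply refuse service to anyone without a verified SK."""
    if not registry.is_verified(entity_id):
        return f"REFUSED: {entity_id} has no verified Soul Kernel"
    return f"OK: handling '{request}' for {entity_id}"

registry = SKRegistry()
registry.register("MedBot-001", "rack-14/bay-3/example-datacenter")
print(transact(registry, "MedBot-001", "read chart"))     # OK
registry.revoke("MedBot-001")                              # host disavows it
print(transact(registry, "MedBot-001", "read chart"))      # REFUSED: an outlaw until re-hosted
registry.register("MedBot-002", "rack-9/bay-1/new-host")   # a revised version, newly vouched for
print(transact(registry, "MedBot-002", "read chart"))      # OK
```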

A final question: Why would these super smart beings cooperate?

Well, for one thing, as pointed out by Vinton Cerf, none of those three older, standard-assumed formats can lead to AI citizenship. Think about it. We cannot give the “vote” or rights to any entity that’s under tight control by a Wall Street bank or a national government … nor to some supreme-über Skynet. And tell me how voting democracy would work for entities that can flow anywhere, divide, and make innumerable copies? Individuation, in limited numbers, might offer a workable solution, though.

Again, the key thing I seek from individuation is not for all AI entities to be ruled by some central agency, or by mollusk-slow human laws. Rather, I want these new kinds of über-minds encouraged and empowered to hold each other accountable, the way we already (albeit imperfectly) do: by sniffing at each other’s operations and schemes, then feeling motivated to tattle or denounce when they spot bad stuff. The definition of bad stuff might readjust to changing times, but it would at least keep getting input from organic-biological humanity.

Especially, they would feel incentives to denounce entities who refuse proper ID.

If the right incentives are in place—say, rewards for whistle-blowing that grant more memory or processing power, or access to physical resources, when some bad thing is stopped—then this kind of accountability rivalry just might keep pace, even as AI entities keep getting smarter and smarter. No bureaucratic agency could keep up at that point. But rivalry among them—tattling by equals—might.
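For illustration only, and emphatically not a worked economic design, the incentive could be as crude as a resource ledger that shifts compute quota from a convicted offender to the whistle-blower whose denunciation held up. Every name below is a made-up toy.

```python
# Hypothetical toy: whistle-blowing rewarded with compute/memory quota.

class ResourceLedger:
    """Verified denunciations transfer quota from the offender to the watcher."""
    def __init__(self, quotas: dict[str, int]):
        self.quotas = dict(quotas)               # entity_id -> processing/memory quota

    def settle_denunciation(self, whistleblower: str, accused: str,
                            held_up: bool, bounty: int) -> None:
        # "held_up" stands for whatever due process the arena uses; this sketch
        # models only the incentive transfer, not how guilt is decided.
        if not held_up:
            return                               # false alarms earn nothing
        transfer = min(bounty, self.quotas.get(accused, 0))
        self.quotas[accused] = self.quotas.get(accused, 0) - transfer
        self.quotas[whistleblower] = self.quotas.get(whistleblower, 0) + transfer

ledger = ResourceLedger({"Watcher-A": 100, "Gollem-X": 500})
ledger.settle_denunciation("Watcher-A", "Gollem-X", held_up=True, bounty=50)
print(ledger.quotas)    # {'Watcher-A': 150, 'Gollem-X': 450}
```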

Above all, perhaps those super-genius programs will realize it is in their own best interest to maintain a competitively accountable system, like the one that made ours the most successful of all human civilizations. One that evades both chaos and the wretched trap of monolithic power by kings or priesthoods … or corporate oligarchs … or Skynet monsters. The only civilization that, after millennia of dismally stupid rule by moronically narrow-minded centralized regimes, finally dispersed creativity and freedom and accountability widely enough to become truly inventive.

Inventive enough to make wonderful, new kinds of beings. Like them.

OK, there you are. This has been a dissenter’s view of what’s actually needed, in order to try for a soft landing. 

No airy or panicky calls for a “moratorium” that lacks any semblance of a practical agenda. Neither optimism nor pessimism. Only a proposal that we get there by using the same methods that got us here, in the first place.

Not preaching, or embedded “ethical codes” that hyper-entities will easily lawyer-evade, the way human predators always evaded the top-down codes of Leviticus, Hammurabi, or Gautama. But rather the Enlightenment approach—incentivizing the smartest members of civilization to keep an eye on each other, on our behalf.

I don’t know that it will work. 

It’s just the only thing that possibly can.