☞ The Monkey's Paw, Norbert Wiener, and the Alignment Problem

Wisdom from 'God & Golem, Inc.'

Apr 17, 2023

Chatter around AI alignment—how we make certain that AI does what we actually want and intend it to do—has exploded over the past weeks and months, as generative AI itself has blossomed forth.

But these are not new considerations. Norbert Wiener, the founder of the field of cybernetics—focusing on the ideas of control and feedback in the dynamics of complex systems both machine and organic—had given these issues careful thought over half a century ago.

In his insightful and intriguing book God & Golem, Inc.—published in 1964, and based on lectures he gave in 1962—Wiener held forth on numerous topics.

Included within this short volume is a discussion of the alignment of intelligent machines. Wiener begins by retelling “The Monkey’s Paw,” a 1902 short story by W. W. Jacobs. In it, a visiting military man shows a family a small, shriveled monkey’s paw that can grant three wishes to each of three people. It has already granted three wishes both to the soldier and to its previous owner (whose third wish was for death), leaving an allotment of three wishes for only one more individual. But to be clear, the soldier explains, its creator gave the talisman this ability not to grant power or wealth but as a means of demonstrating that “fate ruled people’s lives, and that those who interfered with it did so to their sorrow.”

When the head of the family acquires the monkey’s paw, over the soldier’s protestations, he initially wishes for £200. After a day or so, the wish is granted, but at the cost of the family’s son, who is crushed to death by the machinery at his place of work: the company gives them two hundred pounds as thanks for their son’s services.

About a week later, the father uses the paw to try to bring their boy back to life, but before the mother can discover the mangled, revivified body that has returned to their doorstep, the father wishes for his son to once again be granted death (the full text of the story can be read here).

Wiener uses this story to illustrate the gap between one’s intentions and the consequences of one’s requests, and how difficult it is to know in advance what you truly want and mean. Of course this is far from the only tale that intertwines magic and unintended effects, as Wiener notes (cf. the Sorcerer’s Apprentice). But it is a powerful demonstration of this problem and has many implications for intelligent systems.

As per Wiener [1]:

If you are playing a game according to certain rules and set the playing-machine to play for victory, you will get victory if you get anything at all, and the machine will not pay the slightest attention to any consideration except victory according to the rules. If you are playing a war game with a certain conventional interpretation of victory, victory will be the goal at any cost, even that of the extermination of your own side, unless this condition of survival is explicitly contained in the definition of victory according to which you program the machine.

Without a process of feedback in understanding our intentions and wishes, Wiener notes, catastrophe can ensue.
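Wiener’s war-game point can be made concrete with a toy sketch. The Python below is purely illustrative: the plans, their numbers, the survival threshold, and the function names are all invented here and appear nowhere in Wiener’s text. It shows a “machine” that simply maximizes whatever objective it is handed, and how the outcome changes once survival is written into the definition of victory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    enemy_losses: int   # hypothetical score for "victory"
    own_survivors: int  # hypothetical count of our side left standing


# Invented candidate plans for a toy war game.
PLANS = [
    Plan("cautious advance", enemy_losses=60, own_survivors=80),
    Plan("total assault", enemy_losses=100, own_survivors=0),
    Plan("hold position", enemy_losses=20, own_survivors=95),
]


def victory_only(plan: Plan) -> float:
    """Objective as literally stated: victory, and nothing else."""
    return plan.enemy_losses


def victory_with_survival(plan: Plan, min_survivors: int = 50) -> float:
    """Same objective, but survival is part of the definition of victory."""
    if plan.own_survivors < min_survivors:
        return float("-inf")  # a plan that wipes us out can never count as a win
    return plan.enemy_losses


def choose(plans, objective):
    """The 'machine': picks whichever plan scores highest on the objective."""
    return max(plans, key=objective)


naive = choose(PLANS, victory_only)
constrained = choose(PLANS, victory_with_survival)

print(f"victory only:          {naive.name} ({naive.own_survivors} survivors)")
print(f"victory with survival: {constrained.name} ({constrained.own_survivors} survivors)")
```

With the first objective the machine picks the plan that destroys its own side, because nothing told it not to. The second objective only helps if we thought to write it down in advance, which is exactly the difficulty Wiener is pointing at.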

In the end, Wiener reduces the many-dimensional issues and ramifications of complex AI technology—and how to make decisions that revolve around good and evil—down to but a handful of sentences:

[What then] when we have put the decision in the hands of an inexorable magic or an inexorable machine of which we must ask the right questions in advance, without fully understanding the operations of the process by which they will be answered? Can we then be confident in the action of the Monkey’s Paw from which we have requested the grant of the £200?

We must remain cognizant of the admonition Wiener issued some sixty years ago. ■

The Enchanted Systems Roundup

Here are some links worth checking out that touch on the complex systems of our world (both built and natural):

🝯 I Saw the Face of God in a Semiconductor Factory: “At the sight of the lithography machine, my eyes mist. Oil, salt, water—human emotions are shameful contaminants. But I can’t help it. I contemplate, for the millionth time, etched atoms. It’s almost too much: the idea of tunneling down into a cluster of atoms and finding art there. It would be like coming upon Laocoön, way, way out, out beyond the Milky Way, out among some unnamed stars, suspended in outer space.”

🝖 Why ChatGPT Won’t Replace Coders Just Yet: “writing software isn’t just about writing the basic algorithms for munging the data in the way you want it to be munged. A ton of it is about fiddling around the edges.”

🜑 Steve Jobs, Jef Raskin, and the first great war for your thumbs: “perhaps we’ll never get another era like the early 1990s where you could look around the market and find not one, not two, but three experiments with spacebar and its surroundings—three different approaches to putting thumbs to good use, and to righting some of the wrongs established by typewriters a century before”

🜛 Pond brains and GPT-4: “We are faced with a surprising fact. If you predict the next token at large enough scale, you can generate coherent communication, generalize and solve problems, even pass the Turing Test. So is this actually thinking?”

🝳 Society's Technical Debt and Software's Gutenberg Moment: “It is an exaggeration, but only a modest one, to say that it is a kind of Gutenberg moment, one where previous barriers to creation—scholarly, creative, economic, etc—are going to fall away, as people are freed to do things only limited by their imagination, or, more practically, by the old costs of producing software.”

🜚 Time Passages: “What is the length of time described in the average 250 words of narration and how has this changed over time?”

🝖 Grid World: “If you are caught in the beam of someone else’s grid, as I was in my dad’s, the grid’s virality will infect you. Its intoxicating pattern will flow through your thoughts and become the architecture of your reality. You will radiate the grid too.”

🜑 When Copernicus Meets Gutenberg: “As an AI optimist I think we’ll come through the other side of this identity crisis. Most people in the world do not define themselves by their intelligence - while it’s something many pursue, other values, virtues and ambitions are more important.” Pairs well with “Pre-Nostalgia in the Late Pre-AI Era”

🝤 Malleable software in the age of LLMs: “All computer users may soon have the ability to author small bits of code. What structural changes does this imply for the production and distribution of software?”

🜸 A Potential Major Discovery: An Aperiodic Monotile: This is exciting.

🝳 The Catalogers: “The catalogers know something important that most of the startup hype obscures: the purest joy in the technology world is not a big funding round or other high achievements. It’s the simple pleasure of a curious mind and the camaraderie that develops in an amateur technology scene on the verge of something new — a mixture of possibility, personal power, and shared passion.”

🜹 ChatGPT + Code Interpreter = Magic: “could I get ChatGPT to create a Game of Life simulation that ended in a QR code? (I told it to cheat by working backwards from the QR code…) Yes, it can”

🝊 Babelfish, a Translator Inspired by “The Hitchhiker's Guide”: this NPR clip from 25 years ago on the web translation service Babelfish is such a delightful time capsule.

🜚 The Cartographers by Peng Shepherd: On maps and meaning and much more.

Until next time.

[1] Brian Christian, in his book The Alignment Problem, has also noted the deep origins of these concerns in Wiener’s God & Golem, Inc.

Cabinet of Wonders
