☞ Programming Languages and Natural Language

Oct 25, 2021

[Image: code on a computer monitor (Unsplash). Caption: Some code.]

I’m a sucker for generative analogies when it comes to computer programming: the kinds of analogies or metaphors that get you to think about code in a new way, e.g. coding as magic. One fun exercise is to examine the ways that natural language is similar to, and different from, programming languages. So I was delighted to come across the following exploration from Venkatesh Rao of how we use language (I’m going to quote at length, because it’s so intriguing):

Natural human languages are sometimes analogized to computer languages, and they do share many key features, but there is one key difference. A single natural human language like English maps not to a specific programming language like Python, but to an entire stack of computing languages, from the lowest to the highest level.
Sometimes we use English in ways that look like machine code (think step-by-step instructions or UI text strings), sometimes in ways that look like shell scripts, sometimes like C, sometimes like Javascript. In each case, the “compile target” is the human brain. As with computer languages, sometimes you compile one language into a lower-level language — esoteric scientific journal jargon into pop-science language for example.
Given a piece of natural language text, it is always interesting to ask — what level of abstraction in a computing stack is it analogous to?
A PowerPoint-based speech is perhaps like Javascript and CSS. It belongs in the public presentation layer of the stack.
A legal contract is perhaps at the C level. A to-do list is perhaps at the bytecode level.
The precise mappings do not matter. The point is, human natural languages fluidly operate at all levels of abstraction, and there are no sharp boundaries. They can be used to program other minds at any level from machine code to CSS.

The rest of the article extends this idea to memes and phrases that stand for entire conceptual frameworks, and to the need for rebuilding a type of higher-level writing appropriate for our current moment.

But let’s return to thinking about programming languages and natural human languages. One theory in linguistics holds that the structure and nature of each language has an impact on how we think, or at least how we communicate. In other words, when a language has certain grammatical structures, or certain words and phrases that describe specific concepts, this can affect how its speakers think about the world. This is essentially the Sapir-Whorf hypothesis, which some readers might recognize from the Ted Chiang short story “Story of Your Life” (or its movie version, Arrival).

Of course, the question then becomes: can programming languages have the same property? To borrow the analogy from above, are specific programming languages better suited to each mental layer, helping the programmer manipulate the world more effectively (or at least think better about what they want to build)?

Well, one project is working on this:

“The idea is to examine the way in which computer languages can both expand and limit how individual and collective minds work,” says James Evans, director of the Knowledge Lab. “We’re trying to figure out how human minds respond to different functions and different domains, both in programming languages and in popular data science environments.”
The idea that learning and using certain computer languages can influence how people solve problems resonates with the famous Sapir–Whorf hypothesis. This theory holds that spoken languages differ in psychologically important ways, such that learning the grammar and vocabulary of a language can nudge thinking in certain directions.

If you want to get involved in this kind of work, it looks like the researchers are interested in hiring a postdoc (or they were at one point), who will get to do, among other things, the following:

This project involves analysis of all public GitHub and other code repositories with statistical and machine learning approaches that generate insights that link programming language properties to individual and group behavior to coding and analytical outputs. Based on insights from these large-scale analyses and ongoing surveys of programming communities, we will generate programming experiments (e.g., with the Jupyter interface) to test whether discovered associations are causal—whether changing languages can predictably improve the efficiency, collaboration, and creativity of coders and coding communities.

So provocative.

A few links worth checking out:

What a Crossword AI Reveals About Humans' Way With Words: “The program hasn’t been explicitly taught that a question mark signals some sort of semantic shenanigans, Klein explains, but through machine learning it can gradually surmise that it needs to look for less straightforward options than it would for a regular clue.”
The Insane Innovation of TI Calculator Hobbyists
Space probe software bug: “…it was a software bug in the fault protection software’s internal clock. And we only worked that out because of one extremely telling number: we lost contact with the probe exactly 429,496,729.6 seconds after midnight, January 1, 2000.”
Remystifying Supply Chains (also by Venkatesh Rao)
Slowed canonical progress in large fields of science: one of the authors is James Evans, who is working on the programming language project discussed above.
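
A quick aside on why the number in the space probe item is so telling: 429,496,729.6 is exactly 2^32 divided by 10. The quoted snippet doesn't spell out the mechanism, so treat the following as a sketch under one assumption: that the clock was an unsigned 32-bit counter ticking in tenths of a second from midnight, January 1, 2000. The names below (COUNTER_BITS, TICK_MS, EPOCH) are mine, purely for illustration.

```python
from datetime import datetime, timedelta

# Assumption (not stated in the quote): the clock is an unsigned 32-bit
# counter that advances once every tenth of a second (100 ms).
COUNTER_BITS = 32
TICK_MS = 100
EPOCH = datetime(2000, 1, 1)  # midnight, January 1, 2000, as in the quote

# An unsigned 32-bit counter can hold 2**32 distinct values before wrapping.
max_ticks = 2 ** COUNTER_BITS

# Use integer milliseconds so the arithmetic stays exact.
time_to_overflow = timedelta(milliseconds=max_ticks * TICK_MS)
overflow_at = EPOCH + time_to_overflow

print(f"{time_to_overflow.total_seconds():,.1f} seconds until the counter wraps")
print(f"Counter wraps at {overflow_at.isoformat()}")
```

Under that assumption, the seconds figure printed matches the one in the quote, which is what makes it look like a counter overflow and not a coincidence.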

Cabinet of Wonders
