There has been a great deal of buzz about the latest developments in AI such as ChatGPT. There have been practical considerations about how dangerous it might be to develop it, but there have also been concerns that the current incarnations of AI are overblown, that they are merely large language models that use massive databases of language to seek out patterns and then use those patterns to provide merely a facsimile of intelligence, similar in principle to Siri and Alexa and to the algorithms that autocorrect words or suggest the next words in our text messages, except that these are far more sophisticated.
Leaving aside those issues, Elizabeth Kolbert writes about a very practical application of large language models, and that is to try and decipher whale communication, because they seem to use regular patterns.
The world’s largest predators, sperm whales spend most of their lives hunting. To find their prey—generally squid—in the darkness of the depths, they rely on echolocation. By means of a specialized organ in their heads, they generate streams of clicks that bounce off any solid (or semi-solid) object. Sperm whales also produce quick bursts of clicks, known as codas, which they exchange with one another. The exchanges seem to have the structure of conversation.
Since then, cetologists have spent thousands of hours listening to codas, trying to figure out what that function might be. Gero, who wrote his Ph.D. thesis on vocal communication between sperm whales, told me that one of the “universal truths” about codas is their timing. There are always four seconds between the start of one coda and the beginning of the next. Roughly two of those seconds are given over to clicks; the rest is silence. Only after the pause, which may or may not be analogous to the pause a human speaker would put between words, does the clicking resume.
Codas are clearly learned or, to use the term of art, socially transmitted. Whales in the eastern Pacific exchange one set of codas, those in the eastern Caribbean another, and those in the South Atlantic yet another. Baby sperm whales pick up the codas exchanged by their relatives, and before they can click them out proficiently they “babble.”
The whales around Dominica have a repertoire of around twenty-five codas. These codas differ from one another in the number of their clicks and also in their rhythms. The coda known as three regular, or 3R, for example, consists of three clicks issued at equal intervals. The coda 7R consists of seven evenly spaced clicks. In seven increasing, or 7I, by contrast, the interval between the clicks grows longer; it’s about five-hundredths of a second between the first two clicks, and between the last two it’s twice that long. In four decreasing, or 4D, there’s a fifth of a second between the first two clicks and only a tenth of a second between the last two. Then, there are syncopated codas. The coda most frequently issued by members of Unit R, which has been dubbed 1+1+3, has a cha-cha-esque rhythm and might be rendered in English as click . . . click . . . click-click-click.
If codas are in any way comparable to words, a repertoire of twenty-five represents a pretty limited vocabulary. But, just as no one can yet say what, if anything, codas mean to sperm whales, no one can say exactly what features are significant to them. It may be that there are nuances in, say, pacing or pitch that have so far escaped human detection. Already, ceti team members have identified a new kind of signal—a single click—that may serve as some kind of punctuation mark.
At present, we know that these codas seem to serve the purpose of language for whales but we don’t know what they are saying. The idea occurred to some researchers that these large language models could be fed a database of codas and that they may be able to interpret the meanings. To do that would require the collection of a large database of whale codas.
CETI [Cetacean Translation Initiative] has its unofficial headquarters in a rental house above Roseau, [Dominica’s] capital. The group’s plan is to turn Dominica’s west coast into a giant whale-recording studio. This involves installing a network of underwater microphones to capture the codas of passing whales. It also involves planting recording devices on the whales themselves—cetacean bugs, as it were. The data thus collected can then be used to “train” machine-learning algorithms.
The most famous whale calls are the long, melancholy “songs” issued by humpbacks. Sperm-whale codas are neither mournful nor musical. Some people compare them to the sound of bacon frying, others to popcorn popping. That morning, as I listened through the headphones, I thought of horses clomping over cobbled streets. Then I changed my mind. The clatter was more mechanical, as if somewhere deep beneath the waves someone was pecking out a memo on a manual typewriter.
As anyone who has been conscious for the past ten months knows, ChatGPT is capable of amazing feats. It can write essays, compose sonnets, explain scientific concepts, and produce jokes (though these last are not necessarily funny). If you ask ChatGPT how it was created, it will tell you that first it was trained on a “massive corpus” of data from the Internet. This phase consisted of what’s called “unsupervised machine learning,” which was performed by an intricate array of processing nodes known as a neural network. Basically, the “learning” involved filling in the blanks; according to ChatGPT, the exercise entailed “predicting the next word in a sentence given the context of the previous words.” By digesting millions of Web pages—and calculating and recalculating the odds—ChatGPT got so good at this guessing game that, without ever understanding English, it mastered the language. (Other languages it is “fluent” in include Chinese, Spanish, and French.)
In theory at least, what goes for English (and Chinese and French) also goes for sperm whale. Provided that a computer model can be trained on enough data, it should be able to master coda prediction. It could then—once again in theory—generate sequences of codas that a sperm whale would find convincing. The model wouldn’t understand sperm whale-ese, but it could, in a manner of speaking, speak it. Call it ClickGPT.
Andreas told me that ceti had already made significant strides, just by reanalyzing Gero’s archive. Not only had the team uncovered the new kind of signal but also it had found that codas have much more internal structure than had previously been recognized. “The amount of information that this system can carry is much bigger,” he said.
“The holy grail here—the thing that separates human language from all other animal communication systems—is what’s called ‘duality of patterning,’ ” Andreas went on. “Duality of patterning” refers to the way that meaningless units—in English, sounds like “sp” or “ot”—can be combined to form meaningful units, like “spot.” If, as is suspected, clicks are empty of significance but codas refer to something, then sperm whales, too, would have arrived at duality of patterning. “Based on what we know about how the coda inventory works, I’m optimistic—though still not sure—that this is going to be something that we find in sperm whales,” Andreas said.
There have long been heated debates as to whether language is something that is unique to humans. While developments like ChatGPT are interesting, I must say that this particular application of the technology is what I have found to be the most exciting so far. Communicating with other species would open up a wonderful new world.