On a crisp spring morning in 2008, Shane Gero overheard a pair of whales having a chat. Gero, a Canadian biologist, had been tracking sperm whales off the Caribbean island nation of Dominica when two males, babies from the same family, popped up not far from his boat. The animals, nicknamed Drop and Doublebend, nuzzled their enormous boxy heads and began to talk.
Sperm whales “speak” in clicks, which they make in rhythmic series called codas. For three years Gero had been using underwater recorders to capture codas from hundreds of whales. But he’d never heard anything quite like this. The whales clicked back and forth for 40 minutes, sometimes while motionless, sometimes twirling their silver bodies together like strands of rope, rarely going silent for long. Never had Gero so desperately wished he understood what whales were saying. He felt as if he were eavesdropping on brothers wrestling in their room. “They were talking and playing and being siblings,” he says. “There was clearly so much going on.”
Over the next 13 years, Gero, a National Geographic Explorer, would record and get to know hundreds of sperm whales. But he kept coming back to a revelation that struck him as he’d listened to Drop and Doublebend: If humans were ever to decode the language of whales, or even determine if whales possessed something we might truly call language, we’d need to pair their clicks with the context. The key to unlocking whale communication would be knowing who the animals are and what they’re doing as they make their sounds.
One of humanity’s most enduring dreams is that we might one day converse with other species. In the years since Gero’s insight, and partly because of it, the potential to bridge this communication gap has grown less fanciful. On Monday, a team of scientists announced that they have embarked on a five-year odyssey to build on Gero’s work with a cutting-edge research project to try to decipher what sperm whales are saying to one another.
Such an attempt would have seemed folly even just a few years ago. But this effort won’t rely solely on Gero. The team includes experts in linguistics, robotics, machine learning, and camera engineering. They will lean heavily on advances in artificial intelligence, which can now translate one human language to another without help from a Rosetta Stone, or key. The quest, dubbed Project CETI (Cetacean Translation Initiative), is likely the largest interspecies communication effort in history.
Already, these scientists have been at work building specialized video and audio recording devices. They aim to capture millions of whale codas and analyze them. The hope is to expose the underlying architecture of whale chatter: What units make up whale communication? Is there grammar, syntax, or anything analogous to words and sentences? These experts will track how whales behave when making, or hearing, clicks. And using breakthroughs in natural language processing—the branch of artificial intelligence that helps Alexa and Siri respond to voice commands—researchers will attempt to interpret this information.
Nothing like this has ever been attempted. We’ve trained dogs to respond to our commands, and dolphins have learned to mimic human whistles. We’ve taught chimpanzees and gorillas to use sign language and bonobos to answer questions by tapping symbols on a keyboard. An elephant in Seoul, named Koshik, can even speak a few words in Korean—really.
But the goal isn’t to get whales to understand humans. It’s to understand what sperm whales say to one another as they go about their lives in the wild.
‘They sound like Morse code’
The project started with another marine biologist and with a simple conceit: Great advances often come when top experts from different rapidly advancing disciplines collaborate.
David Gruber is also a National Geographic Explorer, but his interests have long crossed traditional boundaries. The City University of New York professor of biology and environmental science has used submarines to examine coral reefs. But he’s also discovered a biofluorescent sea turtle in the Solomon Islands, found that schools of flashlight fish use their glowing light to coordinate movements, studied the molecules that make catsharks and some eels appear to glow, and built a camera to mimic one shark’s view of the world. He once teamed up with a roboticist to develop a delicate six-tentacled device that lets researchers pick up jellyfish without harming them.
In 2017, while a fellow at Harvard University’s Radcliffe Institute, Gruber, a diver, became fascinated with sperm whales, the largest toothed whales, after reading a book about free divers who study them. One day, while he was listening to whale codas on his laptop, another Radcliffe fellow, Shafi Goldwasser, happened by.
“‘Those are really interesting—they sound like Morse code,’” Gruber recalls Goldwasser saying. She had been hosting lectures for a group of Radcliffe Fellows on machine learning, a subfield of artificial intelligence that employs algorithms to find and predict patterns in data. Today machine learning drives everything from search engines, to home robot vacuums like Roomba, to autonomous vehicles. She urged Gruber to share the clicks with her Radcliffe group.
That group included some unusually sharp computer minds. Goldwasser is a computer scientist and one of the world’s foremost experts in cryptography. Michael Bronstein, chair of machine learning at Imperial College London, created a machine learning company to detect fake news, which he later sold to Twitter. The group was intrigued by Gruber’s presentation. Could machine learning help humans understand animal communication?
Gruber saw an opportunity. He’d spent an eclectic career trying to get people to embrace the magic of the oceans by focusing on things he found remarkable, such as corals, biofluorescence, and jellyfish. Maybe this was the project that could ignite the public’s imagination, inspiring people to revel in the mystery and wonder of the sea. “I’d had this idea that if I could get people to fall in love with jellyfish, they could fall in love with anything,” Gruber says. “But there’s something about whales that really taps into human curiosity.”
Gruber needed to talk to someone who understood the whales. So he looked up Gero, founder of the Dominica Sperm Whale Project, which tracks whale family dynamics, and sent him an email. Gero agreed to hear Gruber out.
Linguists contend that even the most intelligent non-human animals lack a communication system that could be called language. But could whales prove an exception? Human language evolved at least partly to mediate social relationships, and Gero has shown that sperm whales lead complex social lives.
Sperm whales have the animal kingdom’s biggest brains, six times larger than ours. They live in female-dominated social networks and exchange codas in a type of staccato duet, especially when near the surface. They segregate into clans of hundreds or thousands, which identify themselves using different click codas. In a sense, clans speak different dialects. The whales also identify one another by specific click patterns, which they appear to use like names. And they learn their codas much as humans learn language, by babbling clicks as juveniles until they pick up their family’s repertoire.
Through the years, Gero has identified hundreds of individuals from two large clans off Dominica. He can recognize many on sight, through unique markings on their flukes. By analyzing DNA from whale poop and skin samples, he has identified grandmothers, aunts, brothers, and sisters.
And he has kept detailed records, including thousands of exhaustively annotated recordings of clicks that described who was speaking, which clan they belonged to, who they were with, and what they were doing at the time.
That was more than enough for a test. Applying AI techniques to some of Gero’s audio, Gruber’s machine learning colleagues trained a computer to identify individual sperm whales from their sounds. The computer was right more than 94 percent of the time.
Excited, Gruber put together a working group to expand on this promising result. In addition to Gero and Gruber’s Radcliffe computer colleagues, there is whale biologist Roger Payne, a MacArthur Award winner, who had popularized the mesmerizing songs of humpbacks in the 1960s and 1970s, helping to ignite the “Save the Whales” movement. There is Robert Wood, a Harvard roboticist who, with Gruber, constructed the jellyfish handler and whose lab has built self-folding origami and an insect-sized flying drone. And there is Daniela Rus, another MacArthur recipient and director of computer science and artificial intelligence at the Massachusetts Institute of Technology.
They agreed that, for the first time, humans might finally have the tools to begin to more thoroughly understand what animals are saying—even beings that live mostly in darkness and hunt squid a thousand feet below the surface of the sea.
The fact that these animals rely almost exclusively on acoustic information might even simplify the task. At a restaurant a few blocks from Harvard Yard, the team sketched plans for a new Apollo program, one focused on translating speech from aliens of the deep. At one point someone even suggested their work, if successful, might provide a framework for conversing with extraterrestrial life. “I kept looking around waiting for someone to laugh and saw nothing but a lot of nods,” Gruber says.
Machine learning could spur breakthroughs
That doesn’t mean the odds are in the scientists’ favor.
We’ve learned a lot in the last few decades about the unique ways animals communicate. Prairie dogs vary their calls depending on whether they’re being approached by hawks, coyotes, or people. They’ll even produce different sounds if the person they see is tall or short, or wearing white or red. Some monkey species make distinct alarm sounds for specific dangers. They screech differently when leopards approach than they do at the sight of an eagle.
Increasingly, animal communication discoveries are assisted by AI. Through machine learning, researchers in 2016 decoded call differences between Egyptian fruit bats squabbling over food and those fighting over resting spots. Rats and mice communicate far above the range of human hearing. By transforming those sounds into sonograms and running the images through artificial neural networks loosely inspired by human brain circuitry, scientists in 2019 linked different sounds to different behaviors, such as fleeing danger or trying to attract a mate. Researchers dubbed their algorithm “DeepSqueak.”
These insights are now possible because breakthroughs in machine learning have come at a lightning clip in the last decade as algorithms get more sophisticated and computer processing power explodes.
Some computer learning is “supervised,” meaning scientists give algorithms examples annotated by humans to train them. By analyzing thousands of pictures labeled “cats,” for example, algorithms can learn to recognize cats in other photographs.
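To make the idea of supervised learning concrete, here is a minimal sketch in Python. It is not the method the CETI team used. The numbers are made-up timing features for two hypothetical callers, `whale_A` and `whale_B`, and the nearest-centroid classifier is simply one of the smallest examples of learning from labeled data.

```python
from statistics import mean

# Made-up, human-labeled training examples (an assumption for illustration):
# each tuple of numbers stands in for the gaps, in seconds, between clicks.
TRAINING = [
    ((0.20, 0.21, 0.19), "whale_A"),
    ((0.22, 0.20, 0.18), "whale_A"),
    ((0.10, 0.30, 0.10), "whale_B"),
    ((0.12, 0.28, 0.11), "whale_B"),
]


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid lies closest to the new example."""
    def squared_distance(centroid):
        return sum((x - y) ** 2 for x, y in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


model = train(TRAINING)
print(predict(model, (0.21, 0.20, 0.20)))  # an unlabeled example close to whale_A's pattern
```

The labels do the teaching here: the program never learns what a whale is, only which patterns humans have tagged with which name.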
But neural networks can find patterns in things like language without an initial assist from humans. By feeding a network millions of stories from Google News along with phrases with missing elements—“To _ or not to be”—that network was able to build a mathematical model for the language. That model then learned associations between words, for example, that “Paris” is to “France” as “Rome” is to “Italy.” Such models are now a cornerstone of natural language processing, used, for example, to predict if a restaurant review on Yelp is negative or to detect spam email.
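The “Paris is to France” relationship can be pictured as arithmetic on lists of numbers. The sketch below is only an illustration: the two-number vectors are hand-picked rather than learned from news stories, and a real model would use hundreds of dimensions. The function name `analogy` is hypothetical.

```python
import math

# Hand-picked toy vectors (an assumption; real models learn these from text).
# First number: which country the word belongs to. Second: city (1) or country (0).
VECTORS = {
    "Paris": (1.0, 1.0),
    "France": (1.0, 0.0),
    "Rome": (2.0, 1.0),
    "Italy": (2.0, 0.0),
    "Berlin": (3.0, 1.0),
    "Germany": (3.0, 0.0),
}


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by computing b - a + c."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])
    )
    # Pick the closest remaining word, ignoring the three words in the question.
    candidates = [word for word in VECTORS if word not in (a, b, c)]
    return min(candidates, key=lambda word: math.dist(VECTORS[word], target))


print(analogy("Paris", "France", "Rome"))  # prints "Italy"
```

Subtracting “Paris” from “France” leaves a direction that means roughly “the country this city is in,” and adding that direction to “Rome” lands on “Italy.”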
But the challenges are many. Machine translation is possible for humans in part because word associations are usually similar across languages; “moon” and “sky” relate to each other the same way as the French words “lune” and “ciel.” “With whales, the big question is whether any of this stuff is even present,” says Jacob Andreas, a natural language processing expert at MIT and a Project CETI team member. “Are there minimal units inside this communication system that behave like language, and are there rules for putting them together?”
To find out, the team expects to use a host of techniques. For example, one deep network approach takes random stabs at outlining a system of rules for language. Then it checks to see if “units” of conversation meet those rules. If they don’t, it makes tweaks and tries again. Computers perform “this process of tweaking-and-validating rules very quickly, repeating it thousands or millions of times to produce a set of rules that do a good job of explaining data,” Andreas says.
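The tweak-and-validate loop Andreas describes can be sketched in a few lines of Python. This is not the deep network approach itself. It swaps in a much simpler technique, random hill climbing, and runs on invented data: each letter stands in for one hypothetical unit of conversation, and a “rule” simply says one unit may follow another.

```python
import random
from itertools import product

# Invented toy sequences (an assumption); each letter is one hypothetical unit.
SEQUENCES = ["ABAB", "ABC", "CAB", "BCA"]
ALPHABET = sorted(set("".join(SEQUENCES)))
ALL_PAIRS = list(product(ALPHABET, repeat=2))  # every possible "X may follow Y" rule


def observed_pairs(sequences):
    """List every adjacent pair of units that actually occurs in the data."""
    return [(a, b) for seq in sequences for a, b in zip(seq, seq[1:])]


def score(rules, sequences, penalty=0.5):
    """Count the observed pairs the rules explain, minus a cost per rule kept."""
    explained = sum(1 for pair in observed_pairs(sequences) if pair in rules)
    return explained - penalty * len(rules)


def tweak_and_validate(sequences, steps=2000, seed=0):
    rng = random.Random(seed)
    rules = set()  # start with no rules at all
    best = score(rules, sequences)
    for _ in range(steps):
        candidate = rules ^ {rng.choice(ALL_PAIRS)}  # tweak: add or drop one rule
        candidate_score = score(candidate, sequences)
        if candidate_score > best:  # validate: keep the tweak only if it helps
            rules, best = candidate, candidate_score
    return rules, best


rules, best = tweak_and_validate(SEQUENCES)
print(sorted(rules), best)
```

The penalty on the number of rules matters: without it, a rule set that allows everything would explain the data perfectly while revealing nothing about its structure.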
Of course, progress depends on researchers gathering enough data. Machine learning requires vast amounts of information, but Gero’s recordings number only in the thousands. Finding patterns in whale speak will likely require tens of millions of codas, perhaps more.
Plus, as Gero suspected with Drop and Doublebend, scientists believe they’ll need to match communication with behavior. Is there a specific coda that shows up prior to hunting, or a sequence that gets made when whales decide to mate?
“It’s the cocktail party problem,” Gruber says. Scatter a few microphones around a party, and they’ll pick up snatches of conversation. But watch people—tracking who touches someone’s arm, who scans the room for better company—“and the whole scene starts to make more sense,” Gruber says.
Revolutionizing the study of animal communication
On Monday the team unveiled the major steps it’s taking toward that end. CETI leaders have established a partnership with Dominica to deploy more whale-monitoring technology in the country’s waters. CETI also has been designated a TED Audacious Project, which has linked the effort with eight major philanthropic donors interested in tackling bold ideas. The team also has received funding from the National Geographic Society.
CETI researchers already have spent a year developing a massive array of sophisticated high-resolution underwater sensors that will record sound 24 hours a day across a vast portion of Gero’s whale study area. Three of these listening systems, each attached to a buoy at the surface, will drop straight down thousands of feet to the bottom, with hydrophones every few hundred meters.
National Geographic’s Exploration Technology Lab and Wood, the Harvard roboticist who is also a National Geographic Explorer, helped design a new iteration of video camera that attaches to whales with suction cups. This camera, unlike previous versions, can withstand pressure at the depth where whales hunt, take images in near-total darkness, and record high-quality audio.
Rus, at MIT, is working on additional robotics, helping develop aerial, floating, and underwater drones that can unobtrusively record sound and video. She recently helped build a swimming robot that travels silently, mimicking the undulating tail movements of reef fish.
“We want to know as much as we can,” Gruber says. “What’s the weather doing? Who’s talking to who? What’s happening 10 kilometers away? Is the whale hungry, sick, pregnant, mating? But we want to be as invisible as possible as we do it.”
Outside experts say CETI could revolutionize elements of wildlife research. Janet Mann, a Georgetown University professor who has studied dolphins in Australia for decades, says the project could be “groundbreaking for sperm whales, but also for the study of other animal communication systems as well.”
Michelle Fournet, an acoustic ecologist at Cornell University, says the project addresses a key difficulty of animal research. People, including scientists, tend to see human-like patterns in animal behavior. “We see a humpback waving its pectoral fin and think they’re saying hello,” she says. But humpbacks are usually just being aggressive. Artificial intelligence can weed out our biases and more accurately find meaning in communication and behavior, Fournet says.
For the CETI researchers, much of the value will be in the journey of discovery itself. The Apollo mission put people on the moon, but along the way humans invented calculators, Velcro, and transistors, and they helped launch the digital age that makes this project possible. Even if CETI never cracks the sperm whale code, researchers are bound to make significant advancements in machine learning, animal communication, and our understanding of one of the world’s most mysterious creatures.
And, years from now, if the structure of sperm whale vocalizations becomes clearer, the team may attempt to communicate with the whales—not to hold an interspecies dialogue but to see if the whales respond predictably. The goal would be to validate the team’s assessment of sperm whale communication.
“The question comes up: What are you going to say to them? That kind of misses the point,” Gero says. “It assumes they have a language to talk about us and boats or the weather or whatever we might want to ask them about.”
Gruber agrees. “It’s not about us talking to them,” he says. “This is about listening to the whales in their own setting, on their own terms. It’s the idea that we want to know what they’re saying—that we care.”
The National Geographic Society, committed to illuminating and protecting the wonder of our world, funded Explorers David Gruber, Shane Gero, and Robert Wood.