Decoding Jeff Jonas, Wizard of Big Data

Jonas solves problems the high-tech way—mining the "data warehouse in his head."

Jeff Jonas was 14 years old when he saw his first computer. It would not be an overstatement to label the experience profound, and completely accurate to call it a revelation. The scriptural resonance of the word "revelation" is not inappropriate; the discovery would launch Jonas on a Pilgrim's Progress through the landscape of digital information and personal growth.

The journey began in his hometown of Healdsburg, California, on the day his mother, a lawyer, said, "I am going to look at a computer to help me with time and billing." It was 1979, and the TRS-80 they went to check out at the RadioShack store was the first mass-produced, preassembled PC. It had a giant floppy disk drive and was shipped with four kilobytes of memory.

The salesman gave a demonstration, and added, "It's connected to a research network; you can search for articles. Pick a topic."

"Cooling with copper wires," said a family friend and self-styled inventor who had come along and was working on that idea, convinced of its uniqueness. The salesman tapped in the words, and a hundred articles on the subject popped up on the screen.

At that moment, whatever other career aspirations Jonas may have had vanished.

"I do that," he said.

He has been doing that ever since.

Cowboy of the 21st Century

Jonas is a data scientist, a member of an elite cadre of scientists able to pan the accumulating silt of data for gold. He plies his art at IBM, but his style is far from corporate buttoned-up. He dresses in black, skips the tie and coat, doesn't wear a watch ("my phone keeps time for me"), and keeps his head clean-shaven. The 49-year-old is perpetually, almost relentlessly, upbeat.

His inventions (he has about a hundred to his credit) are based on programs that "data mine" torrents of information into usable form, built with a cunning purpose: What can the data tell you that you didn't think to ask? The applications for these programs include fingering potential terrorists, catching fraudulent behavior in casinos, smoking out evidence of global money laundering, and reuniting families separated by natural disasters. He also created a newly hatched system that can flag possible asteroid collisions over a 25-year arc.

Data scientist, the Harvard Business Review has said, is the "sexiest job of the 21st century." It might be more instructive, though, to think of the data scientist as the cowboy of the 21st century, except instead of cows being wrangled, it's information that gets rounded up and wrestled into shape.

"And the herd keeps getting bigger," adds James Cortada, a historian of computers and a senior research fellow at the University of Minnesota in Minneapolis. "The herd doesn't consist of one kind of cow, but also hogs, giraffes, cockroaches, horses—and you have to make sense of the whole thing. The good news is [the computer scientist] now has the tools to do that: He has a lasso that works as well on a cockroach as it does on a cow."

The cows, hogs, and cockroaches are different data sets. And analytics, the act of lassoing them, Cortada says, "is about connecting the dots."

Bankrupt at 20

Jonas's high school offered two computer classes. Only juniors could enroll, but he talked his way in as a sophomore. As a summer project he created a word-processing program that his computer teacher sold to the Los Angeles County School District. The $200 check he received prompted a second revelation: "You could do something for fun and people would send you money."

He was a junior when his teacher said, "There's nothing more we can teach you," so he went to Santa Rosa Junior College but, realizing there wasn't anything he could learn there either, dropped out after three months to start his own computer business.

He had 21 people working for him when he was 19 years old. The problem was he had no experience running a business. He overpromised and underdelivered. Payroll checks bounced; office furniture was repossessed; he couldn't pay the rent. He owed creditors $200,000. He pared the debt down but not nearly enough, and on December 31, 1983, he was forced to declare bankruptcy.

"I was 20, couldn't even legally drink, and I owed $160,000," he says. "I sat on the floor and cried."

That wasn't all. He was about to make his father a grandfather. His father was incredulous. "You don't even have a girlfriend," he said. But in fact Jonas had gotten a girl pregnant.

"Son," his father told him, "you can't be living on my coattails. You move out in two weeks."

Data Set of Reef Fish

So Jonas was homeless. He couch-surfed or slept in his car. The memento mori of his failure—the manila folder containing his bankruptcy file, with its list of 50 creditors—stayed in the trunk of his car and followed him everywhere.

"It sounds strange," he admits, "but I immediately started another software company. I realized how the bankruptcy had happened and was determined not to let it happen again."

He borrowed money from friends to make ends meet, and took back-road routes between cities because he lacked the money for tolls—but he eventually found his footing and landed his first programming job in Las Vegas at the Mirage Hotel, which was about to open.

He walked in and was shown the 20,000-gallon (75,700-liter), 50-foot-long (15-meter-long) saltwater aquarium in the hotel lobby. The data set he would be working with was reef fish.

"They wanted to better understand what was happening in the aquarium, so I built a system to track the fish," he explains. "How many did you put in? How many did you take out? How many ended up floating on top?" The inventory program allowed better decisions to be made in stocking the aquarium.

"After that it was, 'Can you build this?' 'Can you build that?'" Jonas says. "There was so much work, I decided I should just move here."

Ultimately, he invented the software program that is the protoplasm of his innovative genius: NORA—an acronym for Non-Obvious Relationship Awareness.

Enter NORA

Last year nearly 40 million people visited the visual cacophony that is Las Vegas. About $6.5 billion was spent on gambling. A casino can lose a quarter of a million dollars in 15 minutes to fraud: a player may bend the corner of a card to mark it, or a dealer can collude with a player to use a deck of cards that's been put in order beforehand.

Gaming is strictly regulated in Vegas, and the state gaming commission keeps a watch list of no-admits, including known cheaters, felons, and people who have voluntarily declared themselves to be gambling addicts. A casino doing business with someone on the list can be fined or lose its license. So casinos want to know who they are hiring—for example, is there a relationship between an employee who works at the gaming tables and a known criminal? Casinos also want to know about desirable guests, high rollers they want to pamper.

Enter NORA, which assembles data from different sources and then sorts through and sniffs out the connective circuitry.

Here, simplified, is how Jonas explains it:

Let's hover over a Las Vegas hotel and put on special glasses so all we see is the data.

Data about hotel employees is a pile of blue puzzle pieces.

There's another pile of puzzle pieces about the hotel guests. These are gold.

Then there are the bad guys—cheats or felons. Let's make those puzzle pieces red.

The puzzle pieces form a picture that tells a story.

Using available and legally obtained data (Jonas emphasizes the program has built-in privacy safeguards)—such as employee records, phone numbers, addresses, job applications, hotel reservations, customer loyalty program information, and the gaming commission's list of banned players—NORA figures out if an employee and a bad guy are related, live near each other, or share the same phone number; it may also detect if a guest has links to an employee.
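The matching idea described above can be illustrated with a toy sketch. This is not NORA's actual method, which the article does not detail; it is a minimal illustration that assumes each record is a small dictionary and that a "link" simply means two records from different piles share an identical phone number or address. The record fields, the sample values, and the function name `find_links` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records from three "piles": employees, guests, and a watch list.
# All names, phone numbers, and addresses are invented for illustration.
RECORDS = [
    {"source": "employee", "name": "Dealer A", "phone": "555-0101", "address": "12 Palm St"},
    {"source": "guest", "name": "Guest B", "phone": "555-0199", "address": "12 Palm St"},
    {"source": "watchlist", "name": "Cheat C", "phone": "555-0101", "address": "9 Mesa Rd"},
    {"source": "guest", "name": "Guest D", "phone": "555-0142", "address": "77 Dune Ave"},
]


def find_links(records, keys=("phone", "address")):
    """Return sorted (name_a, name_b, shared_key) tuples for records that
    come from different sources but share an identical value for a key."""
    # Index every record under each (key, value) pair it carries.
    index = defaultdict(list)
    for rec in records:
        for key in keys:
            value = rec.get(key)
            if value:
                index[(key, value)].append(rec)

    # Any two records filed under the same (key, value) are candidates;
    # keep only pairs that cross from one pile to another.
    links = set()
    for (key, _value), group in index.items():
        for a, b in combinations(group, 2):
            if a["source"] != b["source"]:
                first, second = sorted((a["name"], b["name"]))
                links.add((first, second, key))
    return sorted(links)


for name_a, name_b, key in find_links(RECORDS):
    print(f"{name_a} and {name_b} share the same {key}")
```

In this made-up data the sketch reports that the dealer shares a phone number with someone on the watch list and an address with a guest. A real system would also have to cope with misspellings, formatting differences, and deliberate obfuscation, which exact matching ignores.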

But the information is not just useful to casinos. In one instance, NORA figured out that two out of every thousand people a retail operation hired had already been arrested for stealing at the same store. "But they didn't know it because the data lived in separate piles. It's about assembling what you know," Jonas says. "And the closer to real time you know, the faster you can do something about it."

Big Data

Where, other than Las Vegas, do analytics apply? Everywhere. Analytics can reveal traffic patterns: Should I take this route or that? It can evaluate medical conditions: What are the chances a mammogram is suspicious? Advise on purchase decisions: Do I buy my airline ticket now or wait a few days in hopes the price will come down? And help determine retail strategy: What is selling and to whom? How much of what product should a store stock?

But what about privacy, you ask—all that personal information available for the gleaning or subject to ferreting out with a bit of computer tinkering? If knowledge is power, what about the potential to abuse that power?

"Companies are essentially obligated to understand their customers as best they can. If companies don't do a good job of understanding their customers and delivering what their customers want ... the customers punish them by wandering off," Jonas says in answer to the question of why the mining of personal data is important.

"Can it be misused?" he asks rhetorically. "I think about this often. Pencils are generally used for good. But now and then someone uses a pencil to plan a crime."

Jonas emphasizes that the programs he builds have privacy safeguards "baked in," as he puts it. The answer, he says, is to make software in ways that reduce the risk of misuse and secondary harm while maintaining utility, like having what he calls "data anonymization" options to protect privacy.

The raw material of analytics is Big Data—a phrase that came into currency about ten years ago—and it is worth pausing for an explanation. The world is an accumulating snowball of data: not just the words and numbers in records available in the public domain like addresses, phone numbers, property records, and the Internet exhaust trail of our spending and site-visiting history, but also "unstructured data" like videos, photographs, a traffic-camera shot of your car going through a red light.

Gigabytes? Terabytes? Bah, small potatoes. These days the world is full of exabytes—zettabytes, even. Quantifying it is tricky, but Cortada of the University of Minnesota says at least 2.5 billion gigabytes of data are created daily. "One gigabyte has been likened to ten yards of books on a shelf. Now multiply that by 2.5 billion and that is what probably got created in the past 24 hours."
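To put Cortada's figures in scale, the arithmetic can be worked through in a few lines. This is a rough sketch that assumes decimal units (one gigabyte as a billion bytes) and takes the "ten yards of books" comparison at face value; the variable names are illustrative only.

```python
GIGABYTE = 10**9        # bytes, decimal (SI) convention assumed
EXABYTE = 10**18        # bytes
YARDS_PER_MILE = 1760

daily_gigabytes = 2_500_000_000          # "at least 2.5 billion gigabytes" a day
daily_exabytes = daily_gigabytes * GIGABYTE / EXABYTE

yards_of_books_per_gigabyte = 10         # the quoted comparison
shelf_yards = daily_gigabytes * yards_of_books_per_gigabyte
shelf_miles = shelf_yards / YARDS_PER_MILE

print(f"{daily_exabytes} exabytes per day")
print(f"about {shelf_miles:,.0f} miles of bookshelf per day")
```

Under those assumptions a day's output is 2.5 exabytes, or a bookshelf a little over 14 million miles long.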

After NORA, G2

In 2005, IBM acquired Jonas's company, Systems Research and Development. (A nondisclosure contract precludes him from revealing the amount, but it was significant.) As soon as he received the money, Jonas repaid his creditors, with 3 percent compound interest. He even paid an outstanding Diners Club bill from the bankruptcy. The astonished, but appreciative, company sent it back; it had long since retired the debt.

The buyout money was irrelevant. "I would have sold my company for a dollar," he says. He was chasing his work, not money. The lure was the global platform a multinational company offered for his ideas. "Jeff doesn't care about accolades or money," says a client. "He appreciates that people have burdens and that his problem-solving can lift some of those burdens."

At IBM, where he was named a fellow and chief scientist of context computing, Jonas developed NORA's progeny—G2. It is faster, more sophisticated, and works in different languages.

In 2012, G2 was used to solve problems that had plagued the voter registration process for years. "Mobility in the United States is high," explains David Becker, director of election initiatives for the Pew Charitable Trusts. "If someone moves from state to state, election officials usually don't know it."

They also often don't know if someone has died, or who is eligible to vote but hasn't registered.

G2 matches multiple data sets from different states—the red, blue, and gold puzzle pieces—to resolve those issues. The system that resulted, known as ERIC (Electronic Registration Information Center), has brought in 300,000 to 400,000 new voters. "It's a tool that enhances democracy," Becker says.

In 2009, after he'd been brought into the process, Jonas explained how G2 could solve the conundrums of voter registration to a group of local and state election officials. "There was an actual gasp from one woman," Becker recalls. "If there was a light bulb above her, you would have literally seen it light up."

Jonas doesn't just think out of the box, Becker says. "For Jonas there is no box."

Thinking in Four Dimensions

Ever since the Mirage job in 1990, Jonas has been based in Las Vegas. "Based" is used loosely: Last year he slept in his bed 11 days.

You could say Jonas lives his life in fast forward, 30,000 feet (9,144 meters) up. In one eight-day stretch last June, for example, he racked up 13,612 miles (21,906 kilometers) hopping from one Pacific destination to another. In addition to his work life, Jonas, who has been married three times, remains deeply involved with his kids—three of his own, and four step-children.

Where Jonas works is simple. "Seat 12A," he says when asked. "Do you know how much work you can get done on a 17-hour flight to Singapore?"

How he works is more complicated.

Usually there has to be a problem to solve. A "help wanted" sign goes up, and "Jonas brainstorms into infinity," says Arte Nathan, former head of human resources for Wynn Casinos, who worked with Jonas on systems that, among other things, ensured fair hiring practices. (When the Mirage opened, there were 55,000 applications to sort through.)

"He thinks in three—no, four dimensions," Nathan says. "He has a data warehouse in his head." And that's where the work takes place—in his head. Not on paper. Not on a computer. He resorts to paper only to work the details out. When asked about his thought process, Jonas reaches for words, then says: "It's like a Rubik's Cube. It all clicks into place.

"The solution," he says, is "simply there to find."

After an idea is hatched, the concept is translated into code written in the programming language C++. Jonas has programmers to actually write the code (he supervises 60 people on his team at IBM) and take the idea across the finish line. The kick for him is in dreaming up solutions, whether the problem is voter registration or tracking asteroids, that help people and companies make better decisions.

In one recent case, the decision was whether to approve a money transfer. A 100-year-old grandmother received a call saying her grandson was in a Mexican jail and needed $2,500 for bail. The grandmother tried to send funds by MoneyGram, but the request was denied despite her insistence. It was a scam, and later, when the fraud was revealed, the grandmother was beyond grateful. She lived on Social Security benefits and couldn't afford the loss.

When he read that his computer program had saved a 100-year-old grandmother from being bilked out of her savings, Jonas, who has an unabashedly sentimental streak, cried. This time, not out of despair, but from happiness.
