Photograph by Jim Richardson, National Geographic Creative

Scientists began in 2011 to sequence the genome of the type of wheat used to make bread.

Complex Bread Wheat Genome Cracked

The genome of bread wheat contains a staggering 100,000 genes.

The genome of bread wheat, the grass-turned-crop whose cultivation ushered in the rise of civilization, has been mapped by an international consortium.

The genome's unusual size and form made the sequencing especially difficult for the team, which was led by scientists from Germany, the United States, the Czech Republic, and Canada, the researchers report in today's Science. The gene map reveals surprises about the evolution of the crop behind the staff of life.

"It's always astonishing [that] the number of genes does not directly translate into the complexity of the organism," said Klaus Mayer, director of genome analytics with the Plant Genome and Systems Biology Group at the Helmholtz Center Munich and one of the leaders of the project.

The genome of bread wheat contains a staggering 100,000 or so genes (the human genome contains roughly 20,000). What determines complexity is not the absolute number of genes, he said, but how and when the genes are activated and the interplay between genes and tissues.

The huge wheat genome can be traced directly to three ancient, closely related grasses that underwent a hybridization process known as "polyploidization," in which multiple excess copies of genes are passed along to offspring. Wheat essentially combines three grasses in one genetic package. While the process is relatively common in plants (but rare in animals), what's unusual about wheat is that some strains went through polyploidization more than once.

Here's what happened: Plants usually carry two copies of their DNA, one from each parent. But sometimes, very rarely, both egg and sperm accidentally get two copies instead of one. If such an egg and sperm fuse, the offspring will have four copies, making it a tetraploid.

In wheat, this happened once several hundred thousand years ago, when two grass species, Triticum urartu and Aegilops speltoides, produced the tetraploid emmer wheat, typically used to produce chewy pasta. And then, surprisingly, it happened again when emmer wheat hybridized with another grass species, Aegilops tauschii, to produce the six-copy (hexaploid) bread wheat. The gluten proteins of the hexaploid version have the right properties to produce airy, spongy loaves of bread.
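The copy counting in the two hybridization steps above can be followed with a small bookkeeping sketch. It is a simplified model that only tracks how many genome copies each plant carries, assuming (as described above) that each parent passes on all of its copies. The letters A, B, and D are illustrative labels for the three ancestral grasses, and the `Plant` class and `hybridize_unreduced` function are hypothetical names invented for this example, not part of the consortium's work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    """A plant described only by the genome copies it carries."""
    name: str
    genomes: tuple  # one label per genome copy, e.g. ("A", "A") for a diploid

    @property
    def ploidy(self) -> int:
        # Ploidy is simply the number of genome copies.
        return len(self.genomes)


def hybridize_unreduced(mother: Plant, father: Plant, name: str) -> Plant:
    """Offspring of a cross in which each parent passes on ALL of its copies.

    Simplifying assumption: normally each parent would pass on half.
    """
    return Plant(name, mother.genomes + father.genomes)


# Illustrative labels: A, B, D stand for the three ancestral grasses.
urartu = Plant("Triticum urartu", ("A", "A"))
speltoides = Plant("Aegilops speltoides", ("B", "B"))
tauschii = Plant("Aegilops tauschii", ("D", "D"))

# First event: two diploid grasses give tetraploid emmer wheat.
emmer = hybridize_unreduced(urartu, speltoides, "emmer wheat")

# Second event: emmer plus a third grass gives hexaploid bread wheat.
bread = hybridize_unreduced(emmer, tauschii, "bread wheat")

print(emmer.name, emmer.ploidy)  # emmer wheat 4
print(bread.name, bread.ploidy)  # bread wheat 6
print("".join(bread.genomes))    # AABBDD
```

The sketch ignores everything about real inheritance except the number of copies, which is the point the two events illustrate: four copies after the first, six after the second.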

Bread wheats retain three subgenomes, each of which represents about 35,000 genes from the three original grass species, and about 80 percent to 90 percent of bread wheat's genome is made up of long, repetitive sequences of 12,000 to 15,000 base pairs. These repeats defy conventional sequencing methods.
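As a rough check, the per-subgenome figure lines up with the "100,000 or so" total quoted earlier. The short calculation below uses only the approximate numbers given in this article; the variable names are made up for the example.

```python
# Approximate figures quoted in the article.
GENES_PER_SUBGENOME = 35_000   # "about 35,000 genes" per subgenome
SUBGENOMES = 3                 # one per ancestral grass species
HUMAN_GENES = 20_000           # "roughly 20,000"

# Total wheat genes implied by the subgenome count.
wheat_genes = GENES_PER_SUBGENOME * SUBGENOMES

# How many times larger the wheat gene count is than the human one.
ratio_to_human = wheat_genes / HUMAN_GENES

print(wheat_genes)      # 105000
print(ratio_to_human)   # 5.25
```

Three subgenomes of about 35,000 genes each give roughly 105,000 genes, in line with the rounded figure of 100,000 and about five times the human count.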

"It was simply size and complexity that was putting up a roadblock for a fairly long time," Mayer said.

Work on the bread wheat genome began in 2011, while the genomes for rice and corn were published in 2002 and 2009, respectively.

Evolutionary "Playground"

The mapping methods for wheat were unusually labor intensive. What resulted is still a "draft" sequence: The order of all the genes along their respective chromosomes is understood, but still missing are the genes' orientation and the sequences of the regions between the genes.

Mayer was surprised to find that unlike in other polyploid plants, ostensibly redundant genes that perform the same functions on all three subgenomes have been retained. In addition, wheat seems to have many more duplicated genes than other grains within individual chromosomes. The reason for these redundancies is unknown, but Mayer suggested that multiple gene copies may have provided an evolutionary "playground" for the generation of novel traits.

The new draft genome is expected to dramatically decrease the time it will take to identify and isolate genes of interest to plant breeders, such as genes for resistance to heat, stress, insects, or disease.

It has previously taken about ten years to isolate a single gene from the bread wheat genome, but Mayer expects the process to speed up, just as gene isolation in other plants sped up after their sequences were published. For example, in the late 1990s it took Mayer four years to isolate a gene from the model plant Arabidopsis thaliana; since its sequence was published in 2000, the same task can be done in six to eight weeks.

Wheat Family Tree

Robert Bowden, supervisory research plant pathologist at the U.S. Department of Agriculture's Hard Winter Wheat Genetics Research Unit in Manhattan, Kansas, who was not involved in the sequencing research, said he was impressed with the work that went into the draft genome. He expects it to be "extremely useful" to plant breeders and speed up their research, in part by helping them identify and use better markers for the traits they are trying to select.

Bowden also mentioned a historical anecdote that points to the value of basic research. The genetic stocks used to do the sequencing came from Ernie Sears, a USDA researcher who worked in Columbia, Missouri. In the 1960s and 1970s, Sears created stocks of wheat DNA that were missing single arms from single chromosomes.

"At the time, people were saying, 'Exactly what are we going to do with that?'" Bowden recalled. Their full use didn't emerge until now, he said: the ability to isolate individual arms using these stocks is what made the draft wheat sequence possible. "It's an example of basic research finally having its day," he said.

From an evolutionary perspective, said Bowden, what makes the story of wheat especially intriguing is that hybridization seems to have occurred twice in two different ways between the same two species, and at least three times overall. If it could happen that way in wheat, he said, "speciation by hybridization may be even more common than we thought."

Follow Jennifer Frazer on Twitter.