Editor’s note: On July 24, 2017, police made an arrest in the killing of Sierra Bouzigard, whose unsolved murder opens this story. Authorities in Louisiana charged Blake A. Russell with second-degree murder after they said they had matched his DNA with samples found on Bouzigard’s body. This story appears in the July 2016 issue of National Geographic magazine.
On the morning of November 23, 2009, a cyclist riding near Lake Charles, Louisiana, discovered the body of a young woman lying near a country road. Her face had been beaten beyond recognition, but an unusual tattoo led the police to identify her as 19-year-old Sierra Bouzigard. Investigators from the Calcasieu Parish Sheriff’s Office, headed by Sheriff Tony Mancuso, immediately set about reconstructing her final hours. The people who last saw Bouzigard alive had let her use their phone. The number she dialed gave police a lead.
Bouzigard’s assailant had also left behind a promising clue. From tissue caught under her fingernails as she struggled for her life, the detectives were able to pick up a clear DNA sample. To find the killer, all they needed was a match. The number she had dialed led police to a crew of undocumented Mexican workers. “So we started getting warrants for DNA swabs, getting translators, working with immigration,” Mancuso recalls.
But none of the Mexicans’ DNA matched the sample from the crime scene. Nor was there a hit in the FBI’s database of prior felons, missing persons, and arrestees, a system known as CODIS—the Combined DNA Index System. The investigators continued to issue calls for people with any information to come forward, and Bouzigard’s family offered a $10,000 reward. But the case grew cold.
Then, in June 2015, Monica Quaal, a lead DNA analyst at the lab that works with the sheriff’s office, learned about an intriguing new way of exploiting the information contained in a DNA sample—one that would not require a suspect’s DNA or a match in a database. Called DNA phenotyping, the technique conjures up a physical likeness of the person who left the sample behind, including traits such as geographic ancestry, eye and natural hair color, and even a possible shape for facial features. Quaal immediately thought of the Bouzigard case, in which the DNA left at the scene was virtually the only lead. She contacted Mancuso and Lt. Les Blanchard, a detective on the case, and they sent their sample to Ellen Greytak, director of bioinformatics at Parabon NanoLabs, a company specializing in DNA phenotyping.
Here the investigation took an unexpected turn. Based on the available evidence, the detectives still believed Bouzigard’s killer was likely Hispanic, perhaps a member of the Mexican crew who had fled the area soon after committing the crime. But the person in the DNA-generated portrait Parabon produced had pale skin and freckles. His hair was brown, and his eyes were probably green or blue. His ancestry, the analysis said, was northern European.
“We kind of had to take a step back and say all this time, we’re not even in the right direction,” Mancuso says. But armed with this new evidence, he is optimistic. “I think at some point we can solve this case, because we have such a good DNA sample and this profile,” he says. “We know who the killer is. We just don’t know who the killer is.”
DNA phenotyping is a relatively recent arrival in forensic science, and some critics question how useful it will be. The facial composites it produces are predictions from genetics, not photographs. Many aspects of a person’s appearance are not encoded in DNA and thus can never be unearthed from it, like whether someone has a beard, or dyed hair. Nevertheless, Parabon, which calls its facial composite service Snapshot, has had more than 40 law enforcement organizations as customers. Human genome pioneer Craig Venter, as part of his new personalized health company called Human Longevity, is also investigating facial reconstruction from DNA, as are many academic labs.
Meanwhile other high-tech forensic methods are coming on the scene. CT scanners allow doctors to perform virtual autopsies, peering into bodies for signs of murder undetected by standard autopsies. Researchers are studying whether bacteria on corpses can provide a more accurate clock to gauge when death occurred. And they’re even investigating whether culprits might be identified not just by the DNA left at a crime scene but also by the microbial signature of the bacteria they leave behind.
The forensic techniques we’re more familiar with from movies and television shows such as CSI have far longer histories. In 1910 Thomas Jennings became the first American convicted of murder based primarily on fingerprint evidence. He was accused of shooting one Clarence Hiller during a bungled burglary. The culprit had left his fingerprints behind on a freshly painted windowsill, and the testimony of four fingerprint experts was nearly the entire basis on which Jennings was found guilty and sentenced to death. In response to his appeal, a higher court pointed both to the long heritage of using fingerprints for identification—pharaohs employed thumbprints as signatures, they said—and to “the great success of the system in England, where it has been used since 1891 in thousands of cases without error.” The court did caution that because such evidence fell beyond the purview of the average person’s experience, it must be presented by experts who could explain it to the jury. The verdict was upheld, and Jennings was hanged.
By the late 20th century, there were numerous investigative techniques in the courtroom. FBI analysts gave testimony comparing hairs found at a crime scene with those from suspects. Hair-analysis experts note the shape of the microscopic scales that coat hairs, the thickness and coloration of the hair, and the organization of pigment granules in it, among other qualities. Bite-mark analysis, in which experts compare the pattern left by a bite on a victim to a suspect’s teeth, was widely adopted in the early 1970s, including in a 1974 court case that hinged on marks identified on a dead woman’s nose after she’d been exhumed. Other visual comparisons (between tire tracks, shoe prints, and patterns on bullet casings) also moved from being clues that helped law enforcement identify suspects to evidence presented in court to help prove guilt. In thousands of cases, judges tasked with deciding whether evidence is reliable have leaned on ample precedent to allow such forensic results to be admitted in court. Experts with years of experience at their craft have testified with assurance.
But over the past decade or so, it’s become apparent that many forensic methodologies offer far less certitude than TV dramas suggest. And when forensic evidence is oversold in court, innocent people go to jail, or worse.
In 1981 a woman in Washington, D.C., was attacked in her apartment—gagged, blindfolded, and raped. She worked with a police artist to create a composite of her attacker, and about a month later an officer tipped off detectives to 18-year-old Kirk Odom, who he believed resembled the sketch. Odom’s mother testified that he’d been at home; she remembered the day because his sister had just had a baby. The victim uncertainly picked a picture of Odom out of a photo lineup, then positively identified him in a live version. An FBI analyst’s subsequent testimony that Odom’s hair was microscopically indistinguishable from a single hair found on the victim’s nightgown helped clinch the case against him. He spent more than 22 years in prison and eight on parole as a sex offender before D.C.’s Public Defender Service pursued new evidence that proved him innocent.
In 1992 Cameron Todd Willingham was accused of setting the fire in his house in Corsicana, Texas, that killed his three young daughters. Fire investigators interpreted charred patterns on the home’s floor and what appeared to be multiple places where the fire started as signs of an intentional, gasoline-lit blaze. In 2011 the state of Texas found that the interpretation of evidence in the case had been fatally flawed. But it was too late for Willingham: He had been executed seven years earlier.
And then there’s Oregon attorney Brandon Mayfield, who was arrested by the FBI at his law office in May 2004. Mayfield remembers an agent shouting obscenities at him during the arrest. The agents didn’t clarify the reason behind the arrest; to find out, he had to read the warrant with his hands cuffed behind him. His fingerprints had turned up in a search of the Integrated Automated Fingerprint Identification System, and were determined by two FBI fingerprint examiners to be a match to those found on a plastic bag containing materials used in the terrorist bombings in Madrid, which had killed 191 people. The Spanish authorities, however, didn’t agree. Two weeks after Mayfield’s arrest, they sent word that they had found their own match to the prints—an Algerian man, still at large, now regarded as one of the key planners of the attacks.
What all these stories have in common is their reliance on methods and interpretations that involve more craft than science. The power of hair analysis, for instance, has been vastly overstated. The FBI admits that its analysts have made erroneous statements in more than 90 percent of the microscopic-hair-comparison cases it has reviewed.
Arson evidence is also being challenged. For many years, arson investigators examined patterns on windows where a fire occurred to see if they were cracked—or “crazed”—in a characteristic way. They looked for whether a metal doorsill had melted, or a concrete floor had burst under the heat, a phenomenon called spalling. If temperatures were high enough to cause such damage, it was regarded as evidence that a substance such as gasoline was used to start the blaze. But fire investigator John Lentini, who co-authored a report to the Texas Forensic Science Commission about the Willingham case, says that such assumptions are outdated.
“The theory was that after a short time, a fire started with gasoline is throwing off much more heat than a fire burning wood only,” Lentini says. “Therefore, the flame temperature must be higher, right? Wrong!” Research shows that ventilation, much more than what started the fire, is what determines the heat and speed of a blaze. Crazed glass, spalled concrete, melted metal—in tests with burning rooms, all can happen in the absence of gasoline, if the ventilation and other factors are right.
Even the reliability of fingerprint evidence has been called into question. While computers do a good job of matching a set of standard ink-recorded or electronically scanned fingerprints through a database search, they’re still not as good as the human eye when it comes to matching latent fingerprints with those of a suspect. And because latent prints often are distorted or smudged, matches rely on the judgment of experts who, however skilled, are providing a subjective opinion. One study found that examiners sometimes came to different conclusions about the same fingerprint if they were told the print had come from a suspect who had confessed to the crime or was in custody. In the case of Brandon Mayfield, a federal report revealed that the analysts had convinced themselves of similarities that didn’t exist.
In 2009 the National Academy of Sciences released a blistering report calling into question the scientific validity of the analysis of fingerprints, bite marks, blood spatters, clothing fiber, handwriting, bullet markings, and many other mainstays of forensic investigation. It concluded that with one exception, no forensic method could be relied on with a high degree of certainty to “demonstrate a connection between evidence and a specific individual or source.”
Not coincidentally, the one forensic methodology that passed scrutiny in the NAS report was developed not by law enforcement to aid the investigation of crimes but by a scientist working in an academic laboratory. In 1984 British geneticist Alec Jeffreys stumbled upon a surprising truth: He could tell people in his experiment apart solely by patterns in each person’s minced-up DNA, the genetic code we all inherit from our parents.
Jeffreys’s discovery formed the basis of the first generation of DNA tests. Three years later Jeffreys’s lab processed DNA from a 17-year-old suspect in the rape and murder of two teenage girls in central England, and saw that it did not match DNA from semen found in the victims. Thus the first use of DNA in a criminal case led not to a conviction but to an exoneration. (The true killer later confessed, after he tried to elude DNA screening of a group of men in the area.)
Soon other, more sensitive tests were in use, and by 1997 the FBI was employing one that looked at 13 places on the genome where stutters in the DNA code cropped up. The odds of any two unrelated people having the same 13 patterns were one in at least hundreds of billions. It was these patterns that wound up forming the basis of the FBI’s CODIS database. By the 1990s, DNA profiling was being widely used in court cases around the world—in the United States, most famously in the murder trial of O. J. Simpson.
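A figure like “one in hundreds of billions” comes from multiplying together how common a person’s pattern is at each of the 13 locations, on the assumption that the locations are inherited independently. The sketch below illustrates that arithmetic only. The per-location frequencies are invented for the example and are not real CODIS statistics, and the function name is hypothetical.

```python
from math import prod

# Hypothetical per-location frequencies: the share of the population
# expected to carry the observed pattern at each of 13 locations.
# These values are invented for illustration, not real CODIS statistics.
locus_frequencies = [0.10, 0.08, 0.12, 0.15, 0.09, 0.11, 0.07,
                     0.13, 0.10, 0.14, 0.06, 0.12, 0.09]


def random_match_probability(frequencies):
    """Multiply per-location frequencies, assuming the locations are independent."""
    return prod(frequencies)


rmp = random_match_probability(locus_frequencies)
print(f"Random match probability: {rmp:.2e}")
print(f"Roughly one in {1 / rmp:,.0f}")
```

Even though each individual pattern in this made-up example is shared by 6 to 15 percent of people, the product across 13 locations becomes vanishingly small, which is why a full profile is so discriminating and a partial one is much less so.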
DNA evidence is hardly incontrovertible. Its value can be compromised by contamination from extraneous DNA anywhere along the chain from the crime scene to the laboratory where the sample is sequenced. A robust signal from semen, saliva, or tissue can narrow the probability of a false match to virtually zero, but trace amounts of DNA left on an object handled by a suspect can yield much less accurate results. And a DNA sequence in a lab is only as good as the training of the person conducting the analysis. In April 2015 DNA analysis in the D.C. crime lab was suspended for 10 months and more than a hundred of its cases were reviewed, after an accreditation board found that analysts there were “not competent” and were using “inadequate procedures.”
It’s been seven years since the National Academy of Sciences report called for a complete overhaul of forensic science. Some of its recommendations—to create a National Institute of Forensic Sciences, for example—are unlikely to come to pass for financial reasons, say government sources. Others, like an increase in research to establish how reliable fingerprints, bite marks, and other patterns really are at identifying individuals, are under way. In the first five years after the NAS report, the National Institute of Justice spent $100 million on projects that have resulted in more than 600 scientific studies and reports. But the going is slow.
“There are the rudiments of some important, beginning changes,” says Jennifer Mnookin, dean of the UCLA law school, “in fingerprint evidence in particular.” In 2012 the federal government released new guidelines for a fingerprint-analysis work flow aimed at preventing opportunities for error. And some fingerprint experts say that they’re in the midst of a paradigm shift, away from the long-standing professional protocol that required fingerprint analysts to present their courtroom opinions with absolute certainty. Instead these experts now argue that they should express their findings in probabilities, as DNA experts do.
Cedric Neumann, a professor of statistics at South Dakota State University who specializes in fingerprints, is one of those arguing for a better way for analysts to express the uncertainty in their results. Neumann and others also hope to develop a more objective way to look at the loops, arches, and whorls used to compare fingerprints.
The development of such standards is key to making forensic science, well, scientific. The National Institute of Standards and Technology (NIST) is helping to hammer out a list of best practices for how to calibrate instruments, what processes to use when comparing fingerprints, and how to interpret bullet casings and DNA typing and drug analysis results, along with many others. “Eventually there will be this registry of standards, which says this is the level at which the bar is set,” says John Butler, an analytical chemist who is special assistant to the director for forensic science at NIST. “Right now there’s nothing there.”
Even once the bar is set, NIST cannot require facilities to meet its guidelines. What will likely happen is that facilities that can show they meet the standards for a procedure will be eligible for accreditation—an optional certification process offered by a third party. Currently more than 80 percent of forensic labs have general accreditation, indicating that they’ve fulfilled basic requirements for best practices. But plenty of forensic work takes place outside of labs, in forensic units of police departments. When a 2014 survey looked at more than a thousand forensics providers, including both labs and police departments, the authors found that upwards of 70 percent didn’t have a general accreditation.
Another impediment, says Mnookin, is that judges, who are the gatekeepers of the courts, continue to admit questionable forensic evidence—including hair and bite-mark analysis. As long as such testimony continues to be admitted in court, there’s little incentive for forensics experts to make substantive changes. “Judges haven’t actually taken seriously the need to establish validity, to have a known error rate,” Mnookin says. “The judiciary has largely punted on this.”
But while the pace of change is slow, there are some hopeful signs. In February of this year, the Texas Forensic Science Commission became the first in the country to recommend a moratorium on the use of bite marks as evidence in court until their validity could be studied and confirmed. The decision was prompted in part by the release the previous fall of Steven Mark Chaney, a Texas man who had served 28 years in prison after a murder conviction that hung largely on bite-mark evidence that has since been dismissed as scientifically unsound.
With traditional forensic techniques facing such scrutiny, does a new science-based one such as DNA phenotyping offer more hope, or another source of uncertainty?
On September 1, 2015, the Calcasieu Parish Sheriff’s Office in Lake Charles, Louisiana, released to the media a likeness of the white male suspected in the killing of Sierra Bouzigard. The image produced by Parabon NanoLabs expresses both how much of a person’s appearance can be coaxed from DNA and how much cannot. The face is eerily devoid of personality or affect. Nothing lurking in the eyes suggests a troubled childhood; there’s no sneer on the full lips that might betray a penchant for evil or contempt for the law. It could be your second cousin, or the guy who served you at the deli yesterday, or the fellow you have a crush on in your graduate economics seminar. Or it could be the man who in 2009 battered a young woman to death.
The Bouzigard case is not the first time DNA phenotyping has come to the aid of a criminal investigation. A cruder form of the technology, using DNA to identify only the geographic ancestry of a suspect, was instrumental in catching a serial killer in Louisiana in 2003. Much as in the Bouzigard case, the police had been looking for someone of a particular ancestry, and the DNA profile indicated they were looking at the wrong segment of the population.
Newer versions of the technique flesh out the ancestral profiles with physical traits that have known genetic roots. Both East Asians and Europeans have pale skin—but because of different underlying genetic influences. Pale skin in Europeans is linked to a particular version of a gene called SLC24A5. Almost all Europeans have two copies of that version of the gene; non-European people with one copy of the version have much lighter skin than those with none at all. “I can probably sit down with a room of African Americans and tell you who has that gene with pretty good accuracy,” says Mark Shriver, a professor of biological anthropology at Pennsylvania State University. “It has that strong of an effect.”
In addition to tracking versions of genes known to code for certain traits, creators of a DNA phenotype can look for tiny variations, called single nucleotide polymorphisms (SNPs), sprinkled throughout the genome. They’re known to be associated with physical features, such as hair and eye color, a tendency to freckle, and whether the earlobes of the individual are attached or unattached.
Researchers at Parabon and Venter’s Human Longevity go a step further and use huge computer databases to seek out connections between a pattern of SNPs and the shape of a person’s facial features. Volunteers are asked to fill out questionnaires about their appearance, including details such as whether they have freckles. A sample of each volunteer’s DNA is then checked at about a million points where SNPs are known to occur. A 3-D scanner records the shape of the volunteer’s face (the particular angles of cheekbones, jawline, nose, and so on) to create a computerized image of the face. Computer algorithms can then look for associations between particular patterns of SNPs and salient features in the 3-D scans of the same individual, such as jawline or nose shape. It is a massive data crunch that can take weeks, running on hundreds of computers. The resulting correlations between SNP patterns and known features can then be used to reverse engineer a face from a sample of DNA from an unknown individual, such as the killer whose tissue was found under Bouzigard’s fingernails.
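The basic idea of linking a SNP to a facial measurement, then running the link in reverse to predict a feature from DNA, can be sketched in a few lines. This is a toy illustration and not Parabon’s or Human Longevity’s actual method: it uses a single hypothetical SNP, one invented measurement (nose width), made-up data for eight volunteers, and a simple straight-line fit, where real systems weigh about a million SNPs against many facial features.

```python
from statistics import mean

# Invented training data: each volunteer has a genotype at one hypothetical SNP
# (0, 1, or 2 copies of a variant) and a nose-width measurement in millimeters
# taken from a 3-D scan. None of these numbers come from a real study.
genotypes = [0, 0, 1, 1, 1, 2, 2, 2]
nose_width_mm = [33.0, 34.0, 35.0, 36.0, 35.5, 37.0, 38.0, 37.5]


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Step 1: learn the association from volunteers whose faces are known.
slope, intercept = fit_line(genotypes, nose_width_mm)


def predict_width(copies):
    """Step 2: predict the measurement for an unknown donor from the SNP count."""
    return intercept + slope * copies


print(f"Each extra copy shifts the predicted width by {slope:.2f} mm")
print(f"Predicted width for a donor with 2 copies: {predict_width(2):.1f} mm")
```

The prediction is only an average for people with that genotype. Individuals in the invented data vary around it, which is one reason a DNA-derived face is a statistical estimate and not a portrait.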
The question, of course, is how closely that face resembles the person who contributed the DNA from which it was derived, to the exclusion of other people, even ones with similar geographic ancestry. Theoretically, the more people of different ethnicities and facial profiles contribute to the DNA database, the better it will become at predicting the faces of crime suspects. What’s missing so far, says Manfred Kayser, a professor at Erasmus University Rotterdam who has developed a test that predicts eye and hair color from DNA, is clear, published proof that models built from a database of even many thousands of people can generate an accurate picture of someone from outside the database. “The key thing is that whatever they are doing, it has to be validated, it has to be replicated,” he says. Parabon is currently working on a small test of its methodology with Bruce Budowle, head of the University of North Texas Institute of Applied Genetics and a former DNA expert for the FBI.
Steven Armentrout, Parabon’s CEO, says it’s important to be clear about how the company’s facial reconstructions should best be used: not to identify a particular suspect but to eliminate people who clearly don’t resemble the image, such as the Mexican laborers in the Bouzigard case.
“In the future,” says Armentrout, “we would be doing this at the beginning of the investigation—who should and shouldn’t be on your suspect list.” As the field of inquiry narrows, the DNA of a suspect not excluded by the Parabon Snapshot could be tested against the actual sample left at the crime scene. And Parabon’s phenotyping is not intended to identify specific individuals.
“I would underscore that message,” Armentrout says. “These new technologies are really just making the process of law enforcement more efficient.”
Les Blanchard, the detective in Lake Charles who hopes to solve the killing of Sierra Bouzigard, says he and his team have received multiple tips since releasing the Parabon Snapshot to the public last September. They’ve started knocking on doors.
As of this writing, no matches yet.
Max Aguilera-Hellweg is a photographer, filmmaker, and physician—though he doesn’t practice medicine.
What’s it like working at a crime scene? I shot at the site where Sierra Bouzigard was murdered. It was calm and peaceful, and a deep sense of respect overtook me for the person who died there. I wanted the picture I took to help tell her story. Maybe it could even help catch her killer.
Grant: Your National Geographic Society membership helped fund DNA-phenotyping research.