Image by Humphrey King, via Flickr

In Defense of Brain Imaging

Brain imaging has fared pretty well in its three decades of existence, all in all. A quick search of the PubMed database for one of the most popular methods, functional magnetic resonance imaging (fMRI), yields some 22,000 studies. In 2010 the federal government promised $40 million for the Human Connectome Project, which aims to map all of the human brain’s connections. And brain imaging will no doubt play a big part in the president’s new $4.5 billion BRAIN Initiative. If you bring up brain scanning at a summer BBQ, your neighbors may think you’re weird, but they’ll be somewhat familiar with what you’re talking about. (Not so for, say, calcium imaging of zebrafish neurons…)

And yet, like any youngster, neuroimaging has suffered its share of embarrassing moments. In 2008, researchers from MIT reported that many high-profile imaging studies used statistical methods resulting in ‘voodoo correlations’: artificially inflated links between emotions or personality traits and specific patterns of brain activity. The next year, a Dartmouth team put a dead salmon in a scanner, showed it a bunch of photos of people, and then asked the salmon to determine what emotion the people in the photos were feeling. Thanks to random noise in the data, a small region in the fish’s brain appeared to “activate” when it was “thinking” about others’ emotions. Books like Brainwashed, A Skeptic’s Guide to the Mind, Neuro: The New Brain Sciences and the Management of the Mind, and the upcoming The Myth of Mirror Neurons have all added fuel to the skeptical fire.

There are many valid concerns about brain imaging — I’ve called them out, on occasion. But a new commentary in the Hastings Center Report has me wondering if the criticism itself has gone a bit overboard. In the piece, titled “Brain Images, Babies, and Bathwater: Critiquing Critiques of Functional Neuroimaging,” neuroscientist Martha Farah makes two compelling counterpoints. One is that brain imaging methods have improved a great deal since the technology’s inception. The second is that its drawbacks — statistical pitfalls, inappropriate interpretations, and the like — are not much different from those of other scientific fields.

First, the improvements. At the dawn of brain imaging, Farah notes, researchers were concerned largely with mapping which parts of the brain light up during specific tasks, such as reading words or seeing colors. This garnered criticism from many who said that imaging was just a flashy, expensive, modern phrenology. “If the mind happens in space at all, it happens somewhere north of the neck. What exactly turns on knowing how far north?” wrote philosopher Jerry Fodor in the London Review of Books.

But the purpose of those early localization experiments, according to Farah, was mostly to validate the new technology — to make sure that the areas that were preferentially activated in the scanner during reading, say, were the same regions that older methods (such as lesion studies) had identified as being important for reading. Once validated, researchers moved on to more interesting questions. “The bulk of functional neuroimaging research in the 21st century is not motivated by localization per se,” Farah writes.

Researchers have developed new ways of analyzing imaging data that don’t have anything to do with matching specific regions to specific behaviors. Last year, for example, I wrote about a method developed by Farah’s colleague Geoffrey Aguirre that allows researchers to study how a brain adapts to seeing (or hearing or smelling or whatever) the same stimulus again and again, or how the brain responds to a stimulus differently depending on what it experienced just before.

Other groups are using brain scanners to visualize not the activity of a single region, but rather the coordinated synchrony of many regions across the entire brain. This method, called ‘resting-state functional connectivity’, has revealed, among other things, that there is a network of regions that are most active when we are daydreaming, or introspecting, not engaged in anything in particular.

All that is to say: Today’s neuroimaging is more sophisticated than it used to be. But yes, it still has problems.

Its statistics, for one thing, are complicated as hell. Researchers divide brain scans into tens of thousands of ‘voxels’, or three-dimensional pixels, and each voxel gets its own statistical test to determine whether its activity really differs between two experimental conditions (reading and not-reading, say). A test is conventionally considered legit if it reaches a ‘significance level’ of .05 or less, meaning that if there were no real effect, there would be no more than a 5 percent chance of seeing a signal that strong by fluke. But if you have 50,000 voxels, a significance level of .05 means that some 2,500 of them could look significant by chance alone!

This problem, known as ‘multiple comparisons’, is what caused a dead salmon to show brain activity. “There’s no simple solution to it,” Farah writes. The salmon study, in fact, used a much more stringent significance level of .001, meaning that any given voxel had just a .1 percent chance of “activating” by fluke. And yet even that cut-off would leave about 50 spurious signals in a brain of 50,000 voxels.
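The arithmetic is easy to check yourself with a quick simulation. The sketch below (plain Python; the voxel count comes from the text, everything else is illustrative) leans on a handy statistical fact: when there is no real signal, p-values behave like uniform random draws between 0 and 1.

```python
import random

random.seed(0)
n_voxels = 50_000

# Pure noise, as in the dead-salmon scan: with no real signal anywhere,
# each voxel's p-value is effectively a uniform random draw on [0, 1].
p_values = [random.random() for _ in range(n_voxels)]

# How many voxels clear each significance cut-off by chance alone?
hits_05 = sum(p < 0.05 for p in p_values)
hits_001 = sum(p < 0.001 for p in p_values)

print(hits_05)   # hovers around 2,500 spurious "activations" at .05
print(hits_001)  # hovers around 50 spurious "activations" at .001
```

Run it a few times with different seeds and the counts wobble around 2,500 and 50, the numbers in the text. No salmon required.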

Researchers can control for multiple comparisons by focusing on a smaller region of interest to begin with, or by using various statistical tricks. Some studies don’t control for it properly. But then again — and here’s Farah’s strongest point — the same could be said for lots of other fields. To make this point, a 2006 study in the Journal of Clinical Epidemiology compared the astrological signs and hospital diagnoses for all 10.7 million adult residents of Ontario, finding that “residents born under Leo had a higher probability of gastrointestinal hemorrhage, while Sagittarians had a higher probability of humerus fracture.”
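The simplest of those statistical tricks is the Bonferroni correction, which isn’t specific to neuroimaging but shows the logic: divide the significance level by the number of tests, so the chance of even one false positive across all of them stays near 5 percent. A minimal sketch on simulated noise (plain Python, illustrative numbers only):

```python
import random

random.seed(1)
n_voxels = 50_000
alpha = 0.05

# Simulated null data: no real signal, so each voxel's
# p-value is a uniform random draw on [0, 1].
p_values = [random.random() for _ in range(n_voxels)]

# Bonferroni: shrink the per-voxel threshold by the number of tests,
# so the chance of even ONE false positive overall stays near alpha.
threshold = alpha / n_voxels  # here, 0.000001
false_positives = sum(p < threshold for p in p_values)

print(false_positives)  # almost always 0
```

The trade-off is power: real but modest signals can get thrown out along with the noise, which is one reason researchers also use less conservative schemes such as false-discovery-rate control.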

A different statistical snag led to the aforementioned voodoo correlations. These false associations between brain and behavior arose because researchers used the same dataset both to discover a trend and to test that trend’s predictive power. It’s obviously a problem that many headline-grabbing studies (several were published in top journals) made this mistake. Here again, though, the error is not unique to brain imaging. The same kind of double-dipping happens in epidemiology, genetics, and finance. For example, some economists will use a dataset to group assets into portfolios and then use the same dataset to test pricing models of said assets.
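The inflation is easy to reproduce. In the hypothetical sketch below (plain Python; the subject and voxel counts are made up for illustration), every “voxel” is pure noise, yet cherry-picking the best one on the full dataset produces an impressive-looking correlation that shrivels when selection and evaluation use separate halves of the data:

```python
import random

random.seed(2)

def corr(xs, ys):
    # Pearson correlation coefficient, stdlib only.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

n_subjects, n_voxels = 40, 1_000
behavior = [random.gauss(0, 1) for _ in range(n_subjects)]
# Every voxel is pure noise: none has any real link to behavior.
voxels = [[random.gauss(0, 1) for _ in range(n_subjects)]
          for _ in range(n_voxels)]

# Double dipping: pick the best-correlated voxel using ALL the data,
# then report its correlation on that same data.
best = max(voxels, key=lambda v: abs(corr(v, behavior)))
dipped_r = abs(corr(best, behavior))

# Honest version: select on one half, evaluate on the held-out half.
half = n_subjects // 2
best_half = max(voxels, key=lambda v: abs(corr(v[:half], behavior[:half])))
honest_r = abs(corr(best_half[half:], behavior[half:]))

print(round(dipped_r, 2))  # looks impressive
print(round(honest_r, 2))  # much closer to zero
```

Splitting the data this way, or collecting a fresh sample for the prediction step, is the standard fix, in neuroimaging and everywhere else.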

Perhaps the stickiest criticism lodged against brain imaging is the idea that it is more “seductive” to the public than other forms of scientific data. One 2008 study reported that people are more likely to find news articles about cognitive neuroscience convincing if the text appears next to brain scans, as opposed to other images or no image. “These data lend support to the notion that part of the fascination, and the credibility, of brain imaging research lies in the persuasive power of the actual brain images themselves,” the authors wrote. Farah points out, however, that four other laboratories (including hers) have tried — and failed — to replicate that study.

Anecdotally, I’ve certainly noticed that my non-scientist friends are often awe-struck by brain imaging in a way that they aren’t by, oh, optogenetics. But even if that’s the case, and brain imaging is especially attractive to the public, why would that be a valid argument against its continued use? It would be like saying that because the public is interested in genetic testing, and because genetic testing is often misinterpreted, scientists should stop studying genetics. It doesn’t make much sense.

Brain imaging isn’t a perfect scientific tool; nothing is. But there are many good reasons why it has revolutionized neuroscience over the past few decades. We — the media, the public, scientists themselves — should always be skeptical of neuroimaging data, and be quick to acknowledge shoddy statistics and hype. Just as we should for data of any kind.