Looking under the lamppost

In previous posts I’ve discussed the limitations of genomics, especially as compared to other fundamental advances in biomedicine. It’s not that I think genomics is a fraud or is bad science; on the contrary, it is providing significant insights into human biology and history every day. But genomics is just one field of science among many, and its funding is all out of proportion to its potential to improve health care. Its limitations are not technical and will not be solved by more and better data and analysis. Instead, its potential is limited by this basic fact: our genes just don’t affect our health all that much.

This fact was knowable (and indeed was largely known) long before the first genome sequences were produced. You don’t need sequence data to know the limitations of sequences — you just need twins with medical records.

Since monozygotic (identical) twins have essentially identical genomes, discordances in their health histories can be used to estimate the relative contributions of genes and environment to disease. For instance, if you found that when one twin has breast cancer the likelihood that the other twin has breast cancer is 80% (as opposed to 12% in the general population), you would conclude that the genetic component of breast cancer likelihood is 80% - 12% = 68%. It’s a little more complicated than that, but not much. And if this is all the information you have, then this estimate of genetic risk is an upper limit: twins share environmental exposures to a greater degree than non-twins, and at this level of analysis those shared exposures are indistinguishable from genetic influences.
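For readers who like to see the arithmetic spelled out, here is a minimal sketch of that back-of-the-envelope subtraction. It uses the hypothetical breast cancer figures from the paragraph above, not real data, and the function name excess_concordance is my own invention for illustration. It is the simplified version only and is not the method used in the published twin studies.

```python
def excess_concordance(twin_concordance: float, population_risk: float) -> float:
    """Rough upper bound on the genetic share of risk.

    twin_concordance: probability that the second identical twin has the
        disease given that the first one does.
    population_risk: probability of the disease in the general population.

    Simplifying assumption: anything the twins share beyond the baseline
    risk is credited to genes, so shared environment is counted as genetic.
    """
    for value in (twin_concordance, population_risk):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be probabilities between 0 and 1")
    return twin_concordance - population_risk


# Hypothetical numbers from the example in the text: 80% vs. 12%.
estimate = excess_concordance(0.80, 0.12)
print(f"Upper-bound genetic component: {estimate:.0%}")  # prints 68%
```

Because shared environment is folded into the result, the true genetic share can only be at or below this number.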

Stephen Rapoport carried out this analysis recently for 28 major chronic diseases using large databases of Western European twins. His estimates of the contribution of genetics plus shared environmental exposures (“population attributable fraction”) are summarized in this figure:

About half the burden of asthma can be attributed to genes and shared environment; almost none of the burden of leukemia is. Another way to look at these data is to scale them to their overall contribution to mortality:

The entire potential for disease risk estimation by genomics is contained in the red portions of those bars. The efficacy of any preventive interventions based on genomics, such as gene editing, is also limited to the red portions, and in practice will be much smaller.

That is the top-down macro approach. When we take the bottom-up micro approach and look at the association of genetic variants with disease, the value of genomics shrinks even more. The ten most common genetic loci associated with colorectal cancer collectively account for 1.3% of risk. This low contribution of genetic variants to disease risk is typical for all common chronic diseases. Looking at large panels of these variants does not help much. A panel of 101 different genetic variants proved to have no value in predicting the risk of coronary heart disease. Lots of weak data do not constitute strong evidence.

Clearly, if we want to prevent disease, investigation of environmental exposures and behaviors has much more potential and should be the focus of research. But a search of the biomedical literature using the terms “disease causes AND genetics” yields 609,879 hits, while “disease causes AND exposure” yields 76,858. These proportions are exactly the opposite of what they should be.

And that is the real problem with genomic research. Genomic technology makes collecting and analyzing sequence data easy: get samples, get health histories, generate mountains of data and start digging. You are sure to find something publishable that will look like cutting-edge research. Teasing out the associations between environmental exposures and health outcomes, by contrast, is grindingly hard and frustrating work. Ideally we would follow people around for decades, tally up their environmental exposures and match them to health outcomes. That is a very tough way to launch or sustain a career, and there are few studies of this kind. Instead, researchers are forced to rely on imperfect instruments like subjects’ memories of exposure. Not surprisingly, these methods are unreliable and we are constantly bombarded with contradictory results from flawed studies.

In truth, we don’t want to know how our environment and behavior contribute to disease, because then we might feel obliged to change them and that could be unsettling. Genomics is comfortable because nobody controls their genome. Genomics displaces the onus of disease from society and individuals onto something akin to fate or karma. And if genomic research leads to new pills to treat disease, then so much the better. We would much rather treat disease than prevent it.

Ultimately, genomics is about looking under the lamppost, searching where the light is brightest. There is nothing wrong with that, because we will find a few coins on the sidewalk that way. But many more coins lie outside in the dark, and we need to find them too.