You know, I’ve been looking, and I don’t think any human genes have 100% penetrance, at least in the strict way I’d define it. Penetrance is loosely defined as the percentage of people (or organisms) with a genotype who show the phenotype. If it’s 100%, 100% of people with that genotype should show that phenotype. And I don’t think that happens, if we define phenotypes appropriately strictly.
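To make that definition concrete, here is a minimal sketch of penetrance as a plain fraction. The function name and the cohort numbers are made up for illustration; nothing here comes from a real dataset.

```python
def penetrance(carriers_with_phenotype: int, total_carriers: int) -> float:
    """Fraction of people with a genotype who show the phenotype."""
    if total_carriers <= 0:
        raise ValueError("need at least one carrier of the genotype")
    if not 0 <= carriers_with_phenotype <= total_carriers:
        raise ValueError("affected carriers must be between 0 and total carriers")
    return carriers_with_phenotype / total_carriers


# Hypothetical cohort: 100 people carry the genotype, 97 show the phenotype.
print(f"{penetrance(97, 100):.0%}")  # 97%
```

The interesting part is not the arithmetic but the numerator: how strictly you define "shows the phenotype" decides who gets counted.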
I’ve long known that very few traits are actually 100% penetrant in the traditional Mendelian, Punnett square sense. You know, the one where there are dominant and recessive alleles, and there’s the homozygous AA and heterozygous Aa, and recessive x recessive = recessive but dominant x recessive = dominant. How unrealistic Punnett squares are is one of the first things you learn in Intro to Molecular Bio. In fact, Mendel knew it too, which is why he carefully chose pea traits that actually fit that pattern.
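For anyone who wants the idealized Punnett square spelled out, here is a small sketch. It assumes exactly the textbook conditions this post is skeptical of: one gene, two alleles, complete dominance, and full penetrance. The function names are my own, and uppercase letters stand for the dominant allele.

```python
from collections import Counter
from itertools import product


def punnett(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes for a single-gene cross, e.g. 'Aa' x 'Aa'."""
    # Sort each allele pair so that 'aA' and 'Aa' count as the same genotype.
    return Counter("".join(sorted(pair)) for pair in product(parent1, parent2))


def phenotype_counts(genotypes: Counter) -> Counter:
    """Map genotypes to phenotypes, assuming complete dominance and full penetrance."""
    counts = Counter()
    for genotype, n in genotypes.items():
        # Any uppercase (dominant) allele is assumed to fully determine the trait.
        is_dominant = any(allele.isupper() for allele in genotype)
        counts["dominant" if is_dominant else "recessive"] += n
    return counts


offspring = punnett("Aa", "Aa")
print(offspring)
print(phenotype_counts(offspring))
```

Crossing two heterozygotes gives the classic 1:2:1 genotype split and 3:1 phenotype split, but only because `phenotype_counts` hard-codes the assumption that genotype fully determines phenotype.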
But I’ve been thinking a lot about genes and genetic inheritance recently, and I think the number of human genes with 100% penetrance might actually be zero. Now, I know this is a bold claim. There are quite a few diseases that are said to be genetically caused with a 100% guarantee. Take Huntington’s, for example: if you have the wrong mutation in the gene that encodes the huntingtin protein, the repeated sequence in the gene, the trinucleotide repeat expansion, can get too long.
If the repeated section gets above 36 repeats, you’re at risk for Huntington’s. If it gets above 40 repeats, you’re guaranteed to get Huntington’s. If it gets above 60 repeats, you’re guaranteed to get juvenile Huntington’s. These mutations are almost always inherited from parents, and they are inherited in classical autosomal dominant fashion: if your parent has it, you have a 50% chance of getting it. This is about as close a case for 100% penetrance as you can get.
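Those thresholds can be written down as a tiny lookup. This is only a sketch of the cutoffs as stated above, reading "above" as strictly greater than, which is my assumption; other sources may draw the boundaries a repeat or so differently. The function name and labels are made up for illustration.

```python
def classify_repeats(repeat_count: int) -> str:
    """Label a huntingtin trinucleotide repeat count using the cutoffs in this post."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    # Check the largest threshold first so each count gets the most severe label.
    if repeat_count > 60:
        return "juvenile Huntington's"
    if repeat_count > 40:
        return "Huntington's"
    if repeat_count > 36:
        return "at risk"
    return "below the at-risk threshold"


for count in (20, 38, 45, 70):
    print(count, "->", classify_repeats(count))
```

Notice what the lookup leaves out: it returns a label, not an age of onset or a prognosis.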
But I still don’t think it’s actually 100% penetrant if we define the phenotype narrowly. Our Mendelian pea plants were born with their wrinkly, yellow skin or purple flowers. The timing was incontrovertible and so were the traits. The same is not true for Huntington’s. Huntington’s has variable timing and variable prognosis, regardless of the number of repeats. There’s no guarantee on when you’ll get it or what form it will take.
And that remains true even if we describe the phenotype as just misshapen proteins. While the Huntington’s mutation does cause the huntingtin protein to be misshapen, it doesn’t guarantee exactly how it will be misshapen, which is part of the reason the prognosis is so variable. So, again, we’re a far cry from our yellow and green peas.
The lack of penetrance, then, comes from both bottom-up uncertainty and top-down uncertainty. By bottom-up uncertainty, I mean the uncertainty that comes from going from a gene to a functional (or dysfunctional) protein. This is not a straightforward process. When the concept of gene penetrance was invented, the general belief was that the process was DNA -> mRNA -> tRNA -> amino acids -> proteins.[1]
But regulatory and environmental factors interfere at every part of that. If I were to annotate this process with these factors, it might actually look more like this:
1. DNA -> mRNA, modified by transcription factors including Sp1 and p53 response elements
2. mRNA, modified by splicing, capping, polyadenylation, and miRNA
3. Amino acids -> proteins, modified by environmental factors, like temperature and pH, and quantum effects (van der Waals forces acting on electrons)
4. Proteins, modified by phosphorylation, ubiquitination, and/or palmitoylation
This means that there are no guarantees on what will happen to any given huntingtin protein, regardless of whether it’s correctly encoded or not. The only real guarantee is that, if it’s encoded the wrong way, it will be messed up and not function well, but exactly how it will be messed up and malfunction is unclear.
That’s the bottom-up uncertainty. The top-down uncertainty comes from the actual actions of huntingtin protein in the body. Most proteins are used in lots of places, but huntingtin is unusually widely used. It’s most used in the brain for axonal transport, but it’s also used throughout the cell, the body, and the mitochondria. When it malfunctions, it impacts a lot of things, and the impact can be variable depending on which part of the body or cell we’re talking about.
I think this kind of uncertainty is best understood through a factory analogy. The body, in many ways, is like a series of nested factories. Like, if you think of our body as Apple trying to make an iPhone, our liver tissues might be the equivalent of the cogeneration plants that Apple sets up around their factories. Then, inside our livers, the cells with their mitochondria might be the equivalent of the solar panels the cogeneration plant uses to keep itself running. Proteins, then, would be the individual workers.
This is an imperfect analogy, sure, but it can give you an idea of how unpredictable errors can be. If one guy/protein is drunk at work, his coworkers can cover for him. The factory would barely be affected and you would not notice anything about the quality of your iPhone. If all of the solar panel technicians at a cogeneration plant are drunk (i.e. all of the mitochondria in one part of your liver malfunction), this would probably ruin the cogeneration plant, at least temporarily. It might also temporarily knock out the individual factory or data center that relies on the cogeneration plant, and Apple would have to scramble to find a temporary replacement for that factory. You might notice that your iCloud is a lot slower than normal, but hopefully it wouldn’t ruin your iPhone, unless that happens at a crucial point in the iPhone manufacturing process.
But then, imagine if a class of worker who’s used ubiquitously is just perpetually drunk. Like, let’s say Apple only hires alcoholic solar panel technicians (the equivalent of malformed huntingtin proteins in the mitochondria). This would definitely have a bad effect on the cogeneration plants and on the factories, but it’s hard to say exactly what would happen. The solar panel technicians would make a lot more mistakes, sure, and solar panels would be knocked offline much more often. Other people in the cogeneration plants would have to adjust for that, and the factories would have to adjust for intermittent power. This would make it way harder to run a factory. From an end user’s perspective, you’d definitely notice things are wrong with your iPhone and its services, but the bugs would be intermittent and it’d be unclear why or how they’re happening. At some point, too many solar panel technicians would be drunk at the wrong time, they’d all spill beers on the solar panels, and the lights would go out in the cogeneration plant. The cogeneration plant would shut down and the factory would malfunction. If the power goes out when the factory’s in the middle of making a motherboard, then that iPhone would be a brick. By analogy, if mitochondria fail at the moment the liver is processing a crucial toxin, or huntingtin proteins interfere with the brain instructing the autonomic nervous system to breathe, the person that liver/brain belongs to would be very unhappy.
To sum up the analogy, this sort of uncertainty comes from interactions between the systems. The cell and the body are complicated and living, and they can adjust for errors, depending on when or how the errors are made. The adjustments and the errors manifest as phenotypes, but predicting these phenotypes relies on predicting not just the protein’s malfunction (the bottom-up uncertainty), but how that malfunction will interact with other proteins and the function of the cell or body as a whole, and how the cell or body will adjust for that malfunction.
The immune system adds a huge amount of additional complexity and uncertainty to this as well. In its search for cancers and foreign invaders in the factories, the immune system adds complexity through its extra interactions with the cells, but that is not the biggest source. The biggest source of complexity that it adds is its insistence on legibility. The immune system doesn’t just check whether cells are cancerous or infected; it makes healthy cells show that they’re not.
In this sense, the immune system is like Apple’s security. They’re not only checking each factory and each worker, but they’re going up to the factories and the workers and demanding that they prove they’re working on what they’re supposed to be working on. The factories, and the data centers, and the cogeneration plants all have to be prepared to show that they have not gone off course or risk being terminated.
And this, I believe, is ultimately why 100% penetrance is impossible, at least in humans. The complexity from our proteins is a lot, but Mendel’s peas also were made of proteins. The complexity from our interacting systems is a lot, but Mendel’s peas also had interacting systems.
But our immune system is uniquely complex compared to peas. It has to be. Compared to peas, we are much longer lived, have multiple immune-privileged areas of the body, have much more complicated moving parts, and are much more susceptible to being killed by cancer. Our immune system doesn’t just have to work harder than peas’ immune systems. Our immune system demands legibility from the rest of the body, which increases the top-down complexity to the point that 100% penetrance becomes impossible.
[1] Note that Crick himself did not believe in this strict pathway. He emphasized that the central dogma was just that information always goes from nucleic acids to proteins, and never the reverse. But he was ignored on this for a long time.
I found this a really interesting supporting argument for genes not being destiny.
Love your factory analogy! I have been pondering the penetrance puzzle for years, and the Apple factory seems to capture many of its features and subtleties.
An additional wrinkle to perhaps consider is genetic variation. To extend your analogy, say Apple were to dip into its bags of cash and build multiple additional factories. Just as no two humans are identical and might differ at thousands or even millions of genetic loci, any pair of factories might, for example, have slightly (or markedly) different levels of security (i.e. immune variation), say, or might be built in cities/countries with significant differences in drinking/drug use, work habits, education level, etc.
Whatever the source(s) of variation, the point is that it's hard to imagine that any two of these factories would be precisely the same. And so, even if these factories used the same centralized operations manuals and produced iPhones to the same specs, inevitably there would be differences in how they looked and perhaps even operated internally.
Now, going back to your huntingtin analogy, say Apple hired only alcoholic solar panel technicians in two factories, one in China and the other in India or Mexico City. Given the size and complexity of the networks in these factories, it's similarly hard to imagine that they would respond identically to the same insult.