The other day I ran across the biochemical pathways poster from Roche.
This is the first of two posters. Both posters are about 26 square feet in area. Below is a close-up of about one square foot in the middle of the first poster.
I’d seen this before, I think while I was working at MD Anderson Cancer Center, and it’s quite an impressive piece of work. A lot of work went into the production of the posters themselves, but the amount of hard-won knowledge the posters represent is mind-boggling. One little arrow on one poster might represent a career’s work.
A paper distributed with the charts explains that the pathways included represent a small selection of what is known.
Some indication of the degree of selection can be taken from the fact that in the present “Pathways” about 1,000 enzymes are shown. … Estimations of the number of proteins (with and without enzymatic activity) in a single mammalian cell are in the order of magnitude of 30,000.
I told a friend that I was thinking about getting a copy of the poster as a reminder of complexity, an analog of a memento mori meant to serve as a reminder of one’s mortality. The Latin phrase memento mori means “remember that you must die.” The biochemical pathways poster makes me think “remember that you are complex” or “remember the world is complex.”
I asked a Latin scholar what an appropriate aphorism would be, and she suggested the phrase memento complexitatis, which translates as “be mindful of complexity.” Another suggestion was omnia contexta sunt, meaning “all things have been braided.” As Rich Hickey explains in his popular video, complex literally means braided together.
Everything impacts everything. Independence is always a simplifying assumption. The assumption may be justified, but it is an assumption.
The poster is also a reminder that we need not throw up our hands in the face of complexity. Often the implication of mathematical discussions of complexity is that predicting the future is hopeless; there’s no point trying to understand the behavior of a complex system except in some abstract qualitative way. The Roche posters are inspiring reminders that with hard work, we can learn something about complex systems. In fact, over time we can learn quite a lot.
There’s a debate in data science that is relevant to this discussion: machine learning vs causal inference. We can build highly predictive models, but those predictions can fall apart with novel data. One explanation is that the models fail to capture cause and effect: the new data were generated under a slightly different intervention. Over the last 20 years we have learned that we can estimate causal effects, but only if the causes and effects are known a priori. That is, you have to construct a directed acyclic graph (DAG). If you are able to do so, you are in a good place. Because you can ask cause and effect questions, the model will generalize very well to novel data, etc. But constructing that DAG is very difficult and requires a lot of thinking, even though the result is not nearly as complex as these biochemical pathways. Do we give up and resort to correlation? That depends on the problem, of course: a recommender system is not a medical decision. But I feel like it’s a similar problem.
I have a copy of the original poster from when it was published by Boehringer Mannheim. In the early 1980s, when I was a biochemistry student, you could get a copy just by writing and asking for it.
Impressive! I’d be satisfied to understand the relationships between probability distributions, as immortalized by L. Leemis (2008) in a poster-sized graphic, which you mentioned in 2012:
https://www.johndcook.com/blog/2012/12/10/extended-distribution-chart/
Several of my work colleagues have printed versions on their walls. As you say, each arrow might represent a career’s work.