Science Stories
December 17th, 2008 by Biologic Staff
The parting advice given to Caltech’s graduating class of 2008 was to tell good stories. In his commencement address, science journalist Robert Krulwich emphasized that “scientists have to tell stories to nonscientists, because science stories have to compete with other stories about how the universe works and how it came to be.” [1] He warned that “to protect science and scientists—and this is not a gentle competition—you’ve got to get in there and tell your version of how things are, and why things came to be.”
Krulwich is right about the importance of communicating science clearly to nonscientists. But his suggestion that the strength of science lies in storytelling is troublesome. Quoting E. O. Wilson, Krulwich proposed that “science, like the rest of culture, is based on the manufacture of narrative…. We all live by narrative.”
Huh?
Narrative is clearly a component of science, but the basis? Shouldn’t less manufactured things like observation and analysis be given that spot? If not, then the “protection” that Krulwich advocates looks to be nothing more than a power grab. He surely doesn’t intend this, but neither does he articulate just how a story-based science can escape it.
Science is more than stories, though. Science is the human study of nature, so stories are in the mix because humans are. But if nature itself is to remain the object of study, then all stories have to bow to data at some point.
As the current evolution controversy shows, this process of yielding to data isn’t always easy, even for scientists. To see this we need to look not at popular versions of the Darwinian story but at the technical versions that scientists are telling each other in science journals. Several forthcoming Perspectives articles will be devoted to this, aiming to make the main ideas accessible to nonscientists.
Here we’ll look at a recent paper by Dryden, Thomson, and White (DTW), published in the Journal of the Royal Society Interface. [2] Its stated conclusion is: “It is entirely feasible that for all practical (i.e. functional and structural) purposes, protein sequence space has been fully explored during the course of evolution of life on Earth.” What this means, in simple language, is that the functions we see proteins performing in cells are not so extraordinary that we should be surprised to see them. This claim is a response to a number of other technical papers that have drawn the opposite conclusion: that the functional proteins of biology are highly extraordinary when compared to the whole set of possible proteins.

To understand how the DTW paper attempts to justify its claim, consider the following analogy between proteins and sentences. Just as sentences are written by arranging characters in sequence, so proteins are built by linking amino acids into strings with specified sequences. The amino acid ‘alphabet’ has twenty members, comparable to the size of actual alphabets, and the length of protein ‘sentences’ written in their alphabet is similar to the length of actual written sentences. In both cases the ability to do many useful things by arranging characters into appropriate sequences opens up a world of possibilities.
By this way of viewing things, cells depend on several thousand protein ‘sentences’, each with its own important meaning. Considering the complexity of this biological ‘text’, chance-based explanations of it certainly call for careful probabilistic evaluation. But as DTW point out, the actual probabilistic difficulty of such a thing depends on several factors. Their paper focuses on two of these: the length of the required ‘sentences’, and the size of the ‘alphabet’ needed to write them. Their claim is that neither of these requirements is really as stringent as it appears to be.
We’ll use the analogy to get a feel for this claim, keeping in mind of course that the claim is about proteins rather than sentences. Consider the DTW conclusion quoted above. That sentence is 185 characters long, making it similar in length to biological proteins [3]. According to DTW, the functions that biological proteins perform could be adequately performed with proteins that are much shorter and incorporate considerably fewer kinds of amino acids. So let’s start by asking whether that works for their sentence. Can it be simplified the way they claim proteins can?
We might try shortening it to “Earthly life has fully explored protein functions.” This brings the length down to 50 characters, though not without affecting the meaning. The bigger problem, though, is that the DTW proposal also calls for radical reduction of the alphabet size. In fact, for this shortened sentence to meet their proposal, we would need to re-write it with a tiny alphabet of four or five symbols—and that mini-alphabet would have to work not just for this sentence but for all sentences in a text the size of the DTW paper.
Now, there is a tradeoff between sentence length and alphabet size, which means that a larger alphabet could be purchased by further reduction of sentence length. It’s tempting to think that normal sentences might be broken into several mini-sentences in hopes of achieving this. But keep in mind that the DTW claim has as much to do with sentence function as sentence length. The claim is that every single function performed by a biological protein can also be performed adequately by a simplified mini-protein. If we find that many sentence meanings cannot be conveyed adequately by mini-sentences each written from the same mini-alphabet (as seems to be the case) then at least the sentence version of the DTW claim fails.
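To get a rough feel for the numbers behind this tradeoff, here is a minimal Python sketch. It is an illustration only, not a calculation taken from the DTW paper. The function names are hypothetical, and the sketch assumes that the number of possible sequences is simply the alphabet size raised to the power of the length. The lengths of 185 and 50 characters come from the sentences discussed above, and the five-symbol alphabet is one of the mini-alphabet sizes mentioned there.

```python
import math


def log10_sequence_count(alphabet_size: int, length: int) -> float:
    """Return log10 of the number of distinct sequences of a given length.

    Assumes every position can hold any symbol, so the count is
    alphabet_size ** length.
    """
    return length * math.log10(alphabet_size)


def equivalent_length(alphabet_size: int, length: int, new_alphabet_size: int) -> float:
    """Length that gives the same sequence count with a different alphabet."""
    return length * math.log(alphabet_size) / math.log(new_alphabet_size)


if __name__ == "__main__":
    # Full-size case: 20-symbol alphabet, 185 characters.
    print(f"20 symbols, 185 long: about 10^{log10_sequence_count(20, 185):.0f} sequences")
    # Reduced case: 5-symbol alphabet (illustrative choice), 50 characters.
    print(f" 5 symbols,  50 long: about 10^{log10_sequence_count(5, 50):.0f} sequences")
    # Same-sized space as the reduced case, but using all 20 symbols.
    print(f"20 symbols, same space: about {equivalent_length(5, 50, 20):.1f} characters")
```

On these assumptions the full-size space holds about 10^241 sequences and the reduced one about 10^35. Buying back the full twenty-symbol alphabet while staying within the reduced space would leave room for only about 27 characters.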
Of course, the apparent failure of the DTW proposal in this analogy doesn’t prove that it fails for proteins. But it does provide rational grounds for skepticism. In particular, we would need to see convincing evidence that protein functions are very much less fussy about their amino acids than sentence functions are about their characters in order to think that the analogy may have misled us.
Dryden, Thomson and White refer to a number of studies of various kinds in support of their claim that protein functions are indeed very relaxed in this way. We won’t take the time to discuss these. What we can say is that in our judgment none of them provides any firm support for their claim.
In fact, the most conclusive scientific evidence on this matter seems to contradict their claim. First and foremost is the very observation they seek to explain: the functional proteins we see in nature. The mere fact that these proteins are far longer and employ far more kinds of amino acids than the DTW restrictions allow ought to make us assume that their functions cannot be performed within those restrictions, absent a convincing case that they can. After all, why would cells go to so much trouble making all twenty amino acids if far fewer would do? And if fewer really would do, why do cells so meticulously avoid mistaking any of the twenty for any other in their manufacture of proteins? [4]
The cellular apparatus for making proteins does occasionally incorporate wrong amino acids, but the rarity of these errors makes the process remarkably well tuned for accurate synthesis of long proteins, not mini-proteins. A popular biochemistry textbook puts it this way: “An error frequency of about 0.0001 per amino acid residue was selected in the course of evolution to accurately produce proteins consisting of as many as 1000 amino acids while maintaining a remarkably rapid rate for protein synthesis.” [5] So, while the textbook ignores the problem that the DTW paper addresses (how on earth such things could evolve), the DTW paper ignores the aspects of proteins that plainly defy simplification.
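The textbook’s figures can be checked with a little arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that errors at different positions are independent of one another, which the quoted passage does not state.

```python
def error_free_fraction(error_rate: float, length: int) -> float:
    """Fraction of chains made with no wrong amino acid at any position.

    Assumes each position carries the same, independent chance of error.
    """
    return (1.0 - error_rate) ** length


if __name__ == "__main__":
    # Error rate of 0.0001 per residue, as quoted from the textbook.
    for length in (100, 300, 1000):
        fraction = error_free_fraction(0.0001, length)
        print(f"{length:5d} residues: {fraction:.1%} error-free")
```

Under that assumption, roughly nine out of ten chains of 1000 amino acids come out free of errors at the quoted rate of 0.0001 per residue.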
What’s more, the inherent complexity of biological proteins is confirmed by experiments that test it directly. We know that amino acid changes tend to be functionally disruptive even when the replacements are similar to the originals [6], and we know what typically causes this—reduced structural stability of the functional form [7]. So, not only does the DTW claim suffer from a lack of direct supportive evidence—it also suffers from a substantial body of directly contrary evidence.
Having said that, we can conceive of supportive evidence strong enough to outweigh all the contrary evidence. Genuine examples of these simplified mini-proteins doing the work of life-size proteins inside cells would at least be a start: a mini-protein doing the work of an RNA polymerase, perhaps. Even a handful of clear and convincing examples like that would call for a serious rethink.
But all the clear and convincing evidence seems to be contrary to the DTW claim. And as aggravating as contrary evidence can be for scientists, it’s the only thing that keeps the whole enterprise honest. Because no matter how well stories are told… no matter how many people are captivated by them, sooner or later the false ones will be spoiled by the facts. The wait may be uncomfortably long, but eventually a simple spin-free description of reality is bound to capture people’s attention in a way that well packaged falsehoods don’t.
That’s the real strength of science.
[1] http://pr.caltech.edu/periodicals/EandS/articles/LXXI3/Krulwich.pdf
[2] http://journals.royalsociety.org/content/vx617715116m973h/fulltext.pdf
[3] http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1150220
[4] http://www.sciencedirect.com/science
[5] Berg JM, Tymoczko JL, Stryer L (2002) Biochemistry (5th edition). Freeman.