Over at Recursivity, I left a comment in response to some things Kirk Durston said about my analysis of a paper by Doug Axe. Blogspot doesn’t like my formatting, so I have moved the last few paragraphs here, and posted the link to this message on Jeffrey Shallit’s blog.
The last two paragraphs:
4. The occurrence of functional islands in sequence space is far less rare than you let on, Kirk. The scope of the problem for you is illustrated (briefly, and among other places – I apologize for the shameless self-promotion) here, here, and here.
5. Said occurrence may be far more common than you let on, Kirk. Studies such as those of Chiarabelli et al. (Chemistry and Biodiversity 3, 840-859, 2006) show that folded sequences are rather common (20%!) in collections of random polypeptides. This is a large part of the problem of function, and it’s not nearly the impossibly inaccessible one you imply.
Here is the rest of the comment, in case anyone is wondering what the fuss is about:
Hmmm…. my ears were burning, now I see why.
Kirk said, about my analysis of Axe (2004):
I just skimmed through this, because I do my own work and do not rely on anyone else when it comes to understanding protein sequence space. However, Hunt’s 3-D graphics where very, very misleading. They make it look like finding a folding, functional protein is a hill-climbing problem with a large base. While it is true that for most proteins, the functional efficiency of the sequence space that defines that protein tends to drop off toward the edges, it drops off very rapidly, such that the island is more like an area bounded by steep cliffs. It is also a reality of sequence space that the distance between these islands is vast. I’ve done some preliminary computations using real data from Pfam, and a very rough but conservative result is that if all the sequences that define a particular structure or fold-set where gathered into an area 1 square meter in area, the next island would be more than a thousand light years away. There are increasing numbers of scientists beginning to notice the paucity of folding sequences in sequence space. That is why we need functional information to locate these functional biological sequences. Biology cannot make them up. Biology has to ‘find’ the sequences that physics pre-determines will do the job. It is the ‘finding’ that is the problem, unless of course, one is permitted to search using intelligence. Intelligence can do stuff like that.
1. My illustrations were spot-on, IMO. A good way to see this is to think about studies such as those of Meinke et al. (Biochemistry 47, 6859-6869, 2008). Notice in particular the results obtained with Fip1. Add these to studies such as those of Lange et al. (Science 320, 1471, 2008) and one can begin to uncover a remarkable universe of protein dynamics, and how accessible this universe is to the engine of mutation, variation, and selection. (I’ll have more to say about Meinke et al. on my own blog in the next few weeks.)
2. Kirk, you have to pay attention to the details of Axe (2004) to see why your arguments (and his) make no sense. Axe’s own method leads one to the conclusion that a fully functional, robust beta-lactamase is more “abundant” in sequence space than the severely crippled variant he dissected. My discussion is pretty much the only way to resolve this paradox. That it also demolishes your arguments is tangential but revealing.
3. It’s an experimentally observed fact that there are many ways to “leap” from functional island to functional island.