Late last fall, I published a short review in WIRES RNA that discussed some curious findings coming out of the growing community of plant scientists whose research touches on mRNA polyadenylation. When we think about the polyadenylation machinery, it is reflexive to consider that the core subunits (the CPSF, CstF, CFIm, and CFIIm subunits) should be essential. Indeed, this is the case in yeast and mammals, as far as one can tell. It is thus very surprising that Arabidopsis is able to grow (sometimes, with almost imperceptible phenotypes) in the absence of several supposedly core subunits. The list of dispensable proteins in plants includes CPSF30, FIP1, CstF77, and CstF64.
This list raises many questions. For example, CPSF30 is one of two proteins that cooperate to bind the AAUAAA polyadenylation signal in mammals (see the pictures at the end). Plants have an analogous motif (we call it the Near Upstream Element, or NUE), and it seems to be utilized just fine in plants that do not make CPSF30. To be sure, most poly(A) sites that require CPSF30 also have the typical plant NUE, and most sites that are seen only in CPSF30 mutants do not have the NUE. But it seems clear that general NUE usage happens just fine in the absence of CPSF30. (I have talked about polyadenylation signals here, and the CPSF30 mutant here.)
FIP1 is an essential protein in yeast, and tethers poly(A) polymerase (PAP) to the processed RNA and the rest of the complex. Its absence in plants (see this and this) raises the question of how PAP is linked to the processed RNA in the fip1 mutants.
The situation with CstF is equally perplexing. This complex binds sequences downstream from the poly(A) site in mammals, and its subunits are essential in yeast. Mammals have more than one CstF64 isoform, and it is remotely conceivable that plants may also have more than one. But bioinformatics analysis of even the most complete genomes (such as Arabidopsis) doesn’t really support this possibility. It’s worse for CstF77. There seems to be only one isoform in plants, and it can be dispensed with (although CstF77 mutants are pretty feeble, they do grow and can set a few seed). How plants can get by without it is hard to imagine.
I’m not going to speculate about the possibilities here. I did some of that in the WIRES review, and have more thoughts that need additional massaging. Instead, I would point out another publication and an area that I believe is going to grow in “popularity”. A few years ago, we published a study done by an undergraduate in our labs, Ashley Stevens. By “we”, I mean Dr. Dan Howe and myself. Dan is a professor in the Dept. of Veterinary Science here at the University of Kentucky and works on animal parasites, mainly apicomplexans that are related to Toxoplasma gondii and Plasmodium species. Ashley did a combined bioinformatics and transcriptomics project. When she did BLAST searches of different apicomplexan genomes, she had a hard time finding orthologs for most of the subunits of the mammalian complex. Notable by their absence were FIP1 and the three CstF subunits.
Ashley’s analysis is corroborated (for the most part) by another study published more recently, and it is in line with a somewhat dated study that I discussed on this blog. To be sure, there is not perfect agreement between these studies, mainly because the authors of the more recent study used a much more permissive e-value cut-off when identifying possible orthologs of the mammalian polyadenylation complex. This is questionable, since many polyadenylation complex subunits consist of common protein domains (such as RRM, TPR, WD40, and zinc fingers), and a permissive filter leads to the flagging of many proteins that are not really involved in polyadenylation. With that caveat, it seems safe to say that apicomplexan parasites, in all likelihood, do not possess canonical CstF subunits.
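To make the point about e-value cut-offs concrete, here is a minimal Python sketch that applies two different cut-offs to the same set of BLAST hits. It assumes the standard 12-column tabular BLAST output (the e-value is the 11th column). The query name, subject names, scores, and both thresholds are invented purely for illustration; they are not taken from any of the studies mentioned above.

```python
# Hypothetical BLAST tabular hits (12 standard columns, tab-separated):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
# All names and numbers below are made up for illustration.
EXAMPLE_HITS = "\n".join([
    "\t".join(["CstF64_query", "parasite_protB", "48.0", "210", "100", "4",
               "1", "210", "3", "212", "3e-25", "110.0"]),
    "\t".join(["CstF64_query", "parasite_protA", "31.0", "80", "50", "2",
               "10", "89", "5", "84", "2e-04", "45.1"]),
    "\t".join(["CstF64_query", "parasite_protC", "27.0", "75", "52", "3",
               "12", "86", "40", "114", "5e-02", "33.0"]),
])


def filter_hits(tabular_text, max_evalue):
    """Return (query, subject, evalue) for every hit with evalue <= max_evalue."""
    hits = []
    for line in tabular_text.strip().splitlines():
        # Skip blank lines and comment lines (some tabular variants add them).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column holds the e-value
        if evalue <= max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


# A strict and a permissive cut-off (both values are assumptions).
strict_hits = filter_hits(EXAMPLE_HITS, 1e-10)
permissive_hits = filter_hits(EXAMPLE_HITS, 1e-3)
```

With the made-up data, the permissive cut-off admits a second candidate that the strict one rejects. A weak hit like that could easily reflect nothing more than a shared RRM or WD40 domain, which is why such candidates deserve a second look (a reciprocal search, or a check of domain architecture) before being called orthologs.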
So, what may be going on? How can different eukaryotes seemingly dispense with CstF? Might any of these considerations affect how we think about the mammalian complex? Might there be completely unrelated proteins that perform similar functions in parasites (and plants, for that matter)? I don’t have many answers at the moment. But I believe there is opportunity for lots of discovery. I am interested to see where all this goes. I’d be delighted to discuss ideas in the comments.
I’ll close this with some pictures that help to illustrate some of the possibilities for plants. These were made by downloading the PDB file for the mammalian polyadenylation complex from the recent study by Zhang et al. and playing around with it using the Molegro Molecular Viewer to delete some of these subunits. When viewing these, it probably is good to remember that these structures may not be exactly what exists in plants. But they are good starting points for speculation.
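For readers who would rather script this than click through a viewer, the same kind of subunit deletion can be sketched in a few lines of Python. This is not how the pictures here were made (those came from Molegro Molecular Viewer); it is just an illustration that relies on the fixed-column PDB format, where the chain identifier sits in column 22 of coordinate records. The tiny example structure and the chain letters are hypothetical. The real chain assignments have to be read from the deposited file.

```python
# Record types that carry a chain identifier in column 22 (index 21).
COORD_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")

# A made-up two-chain fragment in PDB fixed-column layout (not a real structure).
EXAMPLE_PDB = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "ATOM      2  CA  MET A   1      12.560  13.300   2.300  1.00 20.00           C",
    "ATOM      3  N   GLY B   1      21.000  23.000   5.000  1.00 20.00           N",
    "TER       4      GLY B   1",
    "END",
])


def drop_chains(pdb_text, chains_to_drop):
    """Return PDB text with coordinate records of the given chains removed."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        is_coord = record in COORD_RECORDS and len(line) > 21
        if is_coord and line[21] in chains_to_drop:
            continue  # this record belongs to a chain we are deleting
        kept.append(line)
    return "\n".join(kept) + "\n"


# Pretend chain "B" is the subunit to remove (hypothetical chain assignment).
without_b = drop_chains(EXAMPLE_PDB, {"B"})
```

The filtered text can then be written to a new file and opened in any molecular viewer. Header and connectivity records are passed through untouched, so this is only good enough for making pictures, not for anything that needs a fully consistent file.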
Here is the wild-type structure of the module that recognizes the poly(A) signal (PAS, or AAUAAA) in complex with CstF77. (In this structure, there are two “copies” of CstF77, owing to the fact it works as a dimer.) Note that I choose to refer to WDR33 by its proper name, FY (well, at least that’s what plant scientists call the protein):
While not in the study, we can conceptually place CstF64 here, since we know where CstF64 binds to CstF77 in plants:
In plants, we can remove CPSF30:
and still retain at least some viability. (The CPSF30 mutant grows pretty normally.)