Novelty in the polyadenylation machinery

Late last fall, I published a short review in WIRES RNA that discussed some curious findings coming out of the growing community of plant scientists whose research touches on mRNA polyadenylation. When we think about the polyadenylation machinery, it is reflexive to consider that the core subunits (the CPSF, CstF, CFIm, and CFIIm subunits) should be essential. Indeed, this is the case in yeast and mammals, as far as one can tell. It is thus very surprising that Arabidopsis is able to grow (sometimes, with almost imperceptible phenotypes) in the absence of several supposedly core subunits. The list of dispensable proteins in plants includes CPSF30, FIP1, CstF77, and CstF64.

This list raises many questions. For example, CPSF30 is one of two proteins that cooperate to bind the AAUAAA polyadenylation signal in mammals (see the pictures at the end). Plants have an analogous motif (we call it the Near Upstream Element, or NUE), and it seems to be utilized just fine in plants that do not make CPSF30. To be sure, most poly(A) sites that require CPSF30 also have the typical plant NUE, and most sites that are seen only in CPSF30 mutants do not have the NUE. But it seems clear that general NUE usage happens just fine in the absence of CPSF30. (I have talked about polyadenylation signals here, and the CPSF30 mutant here.)

FIP1 is an essential protein in yeast, and tethers poly(A) polymerase to the processed RNA and the rest of the complex. Its absence in plants (see this and this) raises the question of how PAP is linked to the processed RNA in the fip1 mutants.

The situation with CstF is equally perplexing. This complex binds sequences downstream from the poly(A) site in mammals, and its subunits are essential in yeast. Mammals have more than one CstF64 isoform, and it is remotely conceivable that plants may also have more than one. But bioinformatics analysis of even the most complete genomes (such as Arabidopsis) doesn’t really support this possibility. It’s worse for CstF77. There seems to be only one isoform in plants, and it can be dispensed with (although CstF77 mutants are pretty feeble, they do grow and can set a few seed). How plants can get by without it is hard to imagine.

I’m not going to speculate about the possibilities here. I did some of that in the WIRES review, and have more thoughts that need additional massaging. Instead, I would point out another publication and an area that I believe is going to grow in “popularity”. A few years ago, we published a study done by an undergraduate in our labs, Ashley Stevens. By “we”, I mean Dr. Dan Howe and myself. Dan is a professor in the Dept. of Veterinary Science here at the University of Kentucky and works on animal parasites, mainly apicomplexans that are related to Toxoplasma gondii and Plasmodium species. Ashley did a combined bioinformatics and transcriptomics project. When she did BLAST searches of different apicomplexan genomes, she had a hard time finding orthologs for most of the subunits of the mammalian complex. Notable by their absence were FIP1 and the three CstF subunits.

Ashley’s analysis is corroborated (for the most part) by another study published more recently, and it is in line with a somewhat dated study that I discussed on this blog. To be sure, there is not perfect agreement between these studies, mainly because the authors of the more recent study used a much more permissive e-value cut-off when identifying possible orthologs of the mammalian polyadenylation complex. This is questionable, since many polyadenylation complex subunits consist of common protein domains (such as RRM, TPR, WD40, and zinc fingers), and a permissive filter leads to the flagging of many proteins that are not really involved in polyadenylation. With that caveat, it seems safe to say that apicomplexan parasites, in all likelihood, do not possess canonical CstF subunits.
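To make the e-value point concrete, here is a minimal Python sketch of how the choice of cut-off changes the list of candidate orthologs. Everything in it is an assumption for illustration: the hits, the protein names, and both thresholds are invented, and none of them are the values used in either study.

```python
# Hypothetical BLAST hits: (query subunit, subject protein, e-value).
# All names and numbers are invented for illustration only.
HITS = [
    ("CstF64", "protein_A", 1e-40),
    ("CstF64", "protein_B", 2e-4),   # e.g. an unrelated RRM-containing protein
    ("CstF77", "protein_C", 5e-3),   # e.g. an unrelated TPR-like protein
    ("CstF50", "protein_D", 0.5),    # e.g. an unrelated WD40 protein
    ("CPSF73", "protein_E", 1e-120),
]


def candidate_orthologs(hits, max_evalue):
    """Return {query: [subjects]} for hits with e-value <= max_evalue."""
    kept = {}
    for query, subject, evalue in hits:
        if evalue <= max_evalue:
            kept.setdefault(query, []).append(subject)
    return kept


# Both thresholds are assumptions, chosen only to contrast the two regimes.
strict = candidate_orthologs(HITS, 1e-10)
permissive = candidate_orthologs(HITS, 1.0)

print("strict:    ", strict)
print("permissive:", permissive)
```

With the strict threshold only the two strong hits survive. The permissive threshold also reports "orthologs" for CstF77 and CstF50 that, in this made-up data, are nothing more than weak matches to shared domains.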

So, what may be going on? How can different eukaryotes seemingly dispense with CstF? Might any of these considerations affect how we think about the mammalian complex? Might there be completely unrelated proteins that perform similar functions in parasites (and plants, for that matter)? I don’t have many answers at the moment. But I believe there is opportunity for lots of discovery. I am interested to see where all this goes. I’d be delighted to discuss ideas in the comments.

I’ll close this with some pictures that help to illustrate some of the possibilities for plants. These were made by downloading the PDB file for the mammalian polyadenylation complex from the recent study by Zhang et al. and playing around with it using the Molegro Molecular Viewer to delete some of these subunits. When viewing these, it probably is good to remember that these structures may not be exactly what exists in plants. But they are good starting points for speculation.
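For readers who would rather script the subunit deletions than do them in a viewer, the following is a rough Python sketch. It is not how the pictures here were made (those came from Molegro Molecular Viewer). It assumes the structure is in the legacy fixed-column PDB format, where the chain ID sits in column 22, and the chain letters in the demo are hypothetical; the real file would have to be inspected to see which chain corresponds to which subunit.

```python
def drop_chains(pdb_text, chains_to_drop):
    """Remove coordinate records whose chain ID is in chains_to_drop."""
    coordinate_records = ("ATOM", "HETATM", "ANISOU", "TER")
    kept = []
    for line in pdb_text.splitlines():
        # In the fixed-column PDB format the chain ID is column 22 (index 21).
        is_coordinate = line.startswith(coordinate_records) and len(line) > 21
        if is_coordinate and line[21] in chains_to_drop:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Tiny made-up input: four CA atoms on hypothetical chains A, B, C, A.
demo = "".join(
    f"ATOM  {i:>5}  CA  ALA {chain}{i:>4}\n"
    for i, chain in enumerate("ABCA", start=1)
)

# Pretend chain C is the subunit to delete (a hypothetical assignment).
trimmed = drop_chains(demo, {"C"})
print(trimmed)
```

The output can be saved to a new file and opened in any molecular viewer to see the complex without the dropped chain.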

Here is the wild-type structure of the module that recognizes the poly(A) signal (PAS, or AAUAAA) in complex with CstF77. (In this structure, there are two “copies” of CstF77, owing to the fact it works as a dimer.) Note that I choose to refer to WDR33 by its proper name, FY (well, at least that’s what plant scientists call the protein):

While not in the study, we can conceptually place CstF64 here, since we know where CstF64 binds to CstF77 in plants:

In plants, we can remove CPSF30:

or CstF77:

and still retain at least some viability. (The CPSF30 mutant grows pretty normally.) 

3 Responses to Novelty in the polyadenylation machinery

  1. Clinton C. MacDonald says:

    Dr. Hunt:

    Interesting stuff, as always.

    I will add that we have found that CstF-64 is dispensable in several mammalian systems, albeit with an asterisk in each case. When we knocked out tauCstF-64 (Cstf2t, the Cstf2 paralog) in mice, the mice were fine. Males were infertile, since germ cells have the highest expression of tauCstF-64. But otherwise the mice were fine. Of interest, polyadenylation continued in the germ cells, although spermatogenesis was messed up severely.

    Similarly, when we knocked out CstF-64 (Cstf2) in mouse embryonic stem cells, the cells were fine, too. They grew more slowly and had a partially differentiated phenotype, but we were surprised that they even survived. In this case, tauCstF-64 probably compensated for the absence of CstF-64. But CstF-64 is not genetically necessary for stem cells.

    Although it pains me to say this, I think that CstF-64 is “optional” for polyadenylation in most organisms. The role it plays is in fine-tuning polyadenylation site choice, but is not strictly necessary for the overall process.

    But please don’t tell anybody. My career has been based on trying to prove that CstF-64 is important. 😀

    Best wishes,

  2. Arthur Hunt says:

    CstF64 may be particularly important for male gametogenesis in plants. A long time ago, we noted that the Arabidopsis CstF64 gene was one of a handful with somewhat higher expression levels in pollen. (Hunt, A.G. et al., BMC Genomics 9, 220 (2008).)

    Pollen seems to be a special case unto itself. In addition to the behavior of CstF64 and a few other genes, the expression of CPSF160 and CstF77 genes in pollen was less than 10% of what is seen in most other tissues. Most remarkably, plants have a distinctive PAP gene expressed almost exclusively in pollen. We really don’t know what may be going on in pollen. (The near-absence of CPSF160 in pollen intrigues me – maybe pollen is more like the apicomplexans than I could ever have imagined….)

  3. Clinton C. MacDonald says:

    Dr. Hunt:

    Some interesting parallels between male gametes in plants (pollen is male, right?) and in mammals. CstF-64 is greatly overexpressed (250-fold) in mammalian male germ cells compared to other tissues (Dass et al. Biol Reprod 64, 1722–1729, 2001) as it is in pollen. Unlike pollen though, CPSF-160 is also overexpressed in mammalian germ cells.

    Along with your observations that lots of seemingly important polyadenylation genes are absent or underexpressed in different species, I wonder if there is an interesting, though highly speculative review article in here somewhere? What does it mean for a gene to be genetically essential (knock it out and the organism dies) versus important-but-optional (some tissues or some organisms get along without them)? What do you think?

    Best wishes,
