Almost* all mRNAs in a eukaryotic cell have a poly(A) tail that is added by a conserved complex of proteins. Given that the poly(A) tail is added by a processing event, the question arises as to how this complex knows when and where to process the mRNA. The answer, to a first approximation, is that polyadenylation is guided by specific sequence signals carried by the precursor mRNA itself – the so-called polyadenylation signal.
That this is so was first established by direct mutational analysis of putative signals. Briefly, the regions surrounding poly(A) sites were systematically altered, and the effects of these changes on the polyadenylation of the altered genes were analyzed using standard (some might say “old-school”) techniques. These studies were first done in mammalian cells, with mammalian (usually human or viral) genes, and the results yielded the picture that is probably most widely taught in college classes (see Fig. 1). (In this and subsequent figures, the vicinity of the polyadenylation site of a pre-mRNA is depicted in a linear fashion, reading along the RNA from 5’ -> 3’ left to right.) Additional study has revealed a more subtle structure of the mammalian poly(A) signal, shown in Fig. 2. The canonical signal consists of the hexamer AAUAAA, a U- or UG-rich downstream element, and less-recognizable upstream elements that affect the overall strength of the polyadenylation signal. There is considerable variability in these motifs in mammalian genes. This is true even for the AAUAAA motif, which may be suboptimal (varying from the consensus by one or more bases) or absent from more than 40% of all poly(A) signals (Tian et al., 2005). There are “strong” and “weak” variants of all three elements, a facet that adds a fascinating range of regulatory possibilities to the process. (One consequence of this, the matter of alternative polyadenylation, will be the subject of a future essay on this blog.)
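For readers who like to see such things spelled out, here is a minimal Python sketch of how one might scan a sequence for the AAUAAA hexamer and its single-base variants, and get a crude measure of how U-rich a stretch of RNA is. The function names, the one-mismatch cutoff, and the example sequence are assumptions made purely for illustration. This is not the method used in any of the studies cited here, and a real analysis would weigh the upstream and downstream elements together with the hexamer.

```python
CANONICAL = "AAUAAA"  # the consensus hexamer described above


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_hexamers(rna, max_mismatches=1):
    """Return (position, hexamer, mismatch count) for every hexamer that
    lies within max_mismatches of AAUAAA. The cutoff of 1 is an arbitrary
    illustrative choice, not a biologically validated threshold."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for i in range(len(rna) - len(CANONICAL) + 1):
        hexamer = rna[i:i + len(CANONICAL)]
        d = mismatches(hexamer, CANONICAL)
        if d <= max_mismatches:
            hits.append((i, hexamer, d))
    return hits


def u_fraction(segment):
    """Fraction of U residues in a segment: a very crude stand-in for
    'U-rich', used here only to illustrate the idea."""
    segment = segment.upper().replace("T", "U")
    if not segment:
        return 0.0
    return segment.count("U") / len(segment)


# Made-up example sequence containing one consensus and one variant hexamer.
example = "CCGAAUAAAGGCCCAUUAAAGG"
print(find_hexamers(example))
# [(3, 'AAUAAA', 0), (14, 'AUUAAA', 1)]
print(u_fraction("UUGUGUUU"))
# 0.75
```

Raising `max_mismatches` quickly floods the output with chance matches in A-rich sequence, which gives some feel for why a suboptimal hexamer cannot define a poly(A) site by itself.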
This is the state of the art, as it were, in mammals. Interestingly enough, poly(A) signals are somewhat different in other eukaryotes. As was done with mammalian genes, mutational analyses were performed in yeast (Saccharomyces cerevisiae) and plants. The outcomes of these studies are summarized in Fig. 2 and reveal a common theme, along with subtle differences between these systems and distinctions from what has been found in mammals. Briefly, poly(A) signals in plants and yeast seem to be tripartite, and each sub-element consists of rather heterogeneous signals (as opposed to an easily recognizable motif such as AAUAAA). Curiously, there seems to be no required downstream element. While one might expect that there are “strong” or “weak” members of each class of motif, this has not been systematically explored. However, the potential for regulation and alternative polyadenylation is obvious.
The preceding body of knowledge (summarized in admittedly brief form here) was first derived from direct mutational studies of a relative handful of genes and polyadenylation signals. Large-scale computational analysis of finished genomes has confirmed that these features are general properties of genes in mammals, yeast, and plants (Graber et al., 1999; Loke et al., 2005). However, bioinformatics studies have also revealed an interesting variability in poly(A) signals amongst eukaryotes. Examples are the occurrence of very restrictive signals in some photosynthetic eukaryotes (Fig. 3) and in other protists (Fig. 4). While these novel putative signals have not been experimentally confirmed by mutational analysis, it seems likely that there is a considerable range of poly(A) signal make-up in eukaryotes. The significance of this, with respect to the nature of the complex responsible for mRNA polyadenylation as well as possibly novel aspects of gene expression in eukaryotes, is not yet known.
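To give a flavor of what such a computational survey involves, the following is a minimal Python sketch that tallies the hexamers found in a window upstream of known poly(A) sites across a set of sequences. Everything in it is an assumption made for illustration: the function name, the window of 10 to 35 bases upstream of the site, and the toy input. It is not the procedure used by Graber et al. or Loke et al., whose analyses are considerably more sophisticated.

```python
from collections import Counter


def count_upstream_kmers(records, k=6, near=10, far=35):
    """Tally every k-mer found between `far` and `near` bases upstream of
    each poly(A) site.

    records: iterable of (sequence, site) pairs, where `site` is the
    0-based index of the cleavage position in `sequence`. The window
    bounds are illustrative defaults, not values from the literature.
    """
    counts = Counter()
    for seq, site in records:
        seq = seq.upper().replace("T", "U")  # accept DNA-style input
        window = seq[max(0, site - far):max(0, site - near)]
        for i in range(len(window) - k + 1):
            counts[window[i:i + k]] += 1
    return counts


# Toy input: two identical made-up sequences with the site at position 36.
toy = "G" * 10 + "AAUAAA" + "C" * 20
counts = count_upstream_kmers([(toy, 36), (toy, 36)])
print(counts["AAUAAA"])
# 2
```

With real data, a motif such as AAUAAA would be expected to stand out in a tally like this, while heterogeneous signals of the kind seen in yeast and plants would be spread across many related k-mers and would be harder to pick out by eye.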
To summarize this essay, I would state what I hope is running through the minds of readers of this blog: this seems to be an awfully complicated way to “tag” or mark mRNAs for such a simple process (cutting an RNA and then adding the poly(A) tail). Why this is so is a subject of active study by researchers, and will likely be the focus of the occasional entry on this blog in the future.
References for more information:
Cann et al. (2004), Mol. Biochem. Parasitol. 137, 239-245.
Espinosa et al. (2002), Gene 289, 81-86.
Gilmartin (2005), Genes Develop. 19, 2517-2521.
Graber et al. (1999), Proc. Natl. Acad. Sci. USA 96, 14055-14060.
Hu et al. (2005), RNA 11, 1485-1493.
Loke et al. (2005), Plant Physiol. 138, 1457-1468.
Peattie et al. (1989), J. Cell Biol. 109, 2323-2335.
Que et al. (1996), Mol. Biochem. Parasitol. 81, 101-110.
Tian et al. (2005), Nucl. Acids Res. 33, 201-212.
Zamorano et al. (2008), Computational Biology and Chemistry 32, 256-263.
Zhao et al. (1999), Microbiol. Mol. Biol. Rev. 63, 405-445.
* – while most eukaryotic mRNAs are polyadenylated, those that encode cell cycle-regulated histones are not. The 3’ ends of these mRNAs consist of a distinctive secondary structure, and these 3’ ends are formed by a complex that includes the U7 snRNP as well as a subset of the proteins that mediate mRNA polyadenylation. The review by Dominski and Marzluff (Gene 396, 373-390, 2007) provides a nice and current overview of this subject.