Not Star Trek. What we’re talking about here is DNA sequencing, and the impact it is having, and will have, on studies of polyadenylation.
Since last summer, there has been a spate of papers describing the application of so-called Next Generation DNA sequencing (in its many manifestations) to the matter of polyadenylation. The general idea is simple: generate and analyze large numbers of short DNA tags that are derived from the junctions between the poly(A) tails and the bodies of mRNAs. The expected outcomes of such studies are qualitative and quantitative descriptions of the genome-wide distributions of polyadenylation sites. This information would help to better annotate (or describe) the genome and to identify unusual occurrences. The latter might include alternative poly(A) sites, sites associated with as-yet unidentified transcripts, and sites that define antisense RNAs.
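To make the general idea concrete, here is a minimal sketch of how mapped tag ends might be collapsed into poly(A) sites with tag counts. It is not the pipeline used in any of the four studies: the function name `cluster_sites`, the 10-nt `window`, and the assumption that the input is a list of 3'-end coordinates from a single chromosome and strand are all illustrative choices of mine.

```python
from collections import Counter

def cluster_sites(tag_ends, window=10):
    """Group tag 3'-end coordinates into candidate poly(A) sites.

    tag_ends: iterable of integer coordinates (one chromosome, one strand).
    window:   hypothetical maximum gap between neighbouring positions that
              are still treated as belonging to the same site.
    Returns a list of dicts with the cluster span, its most-used position
    ("peak"), and the total number of supporting tags.
    """
    counts = Counter(tag_ends)
    clusters = []
    for pos in sorted(counts):
        if clusters and pos - clusters[-1]["end"] <= window:
            # Close enough to the previous position: extend that cluster.
            cluster = clusters[-1]
            cluster["end"] = pos
            cluster["tags"] += counts[pos]
            if counts[pos] > cluster["peak_tags"]:
                cluster["peak"] = pos
                cluster["peak_tags"] = counts[pos]
        else:
            # Too far away: start a new cluster.
            clusters.append({
                "start": pos,
                "end": pos,
                "peak": pos,
                "peak_tags": counts[pos],
                "tags": counts[pos],
            })
    return clusters

if __name__ == "__main__":
    for site in cluster_sites([100, 101, 101, 103, 250, 251]):
        print(site)
```

The tag count per cluster gives the quantitative side of the description (how often a site is used), while the cluster coordinates give the qualitative side (where the sites are).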
I’ll (very) briefly summarize four studies that have been published since last July, focusing on the different approaches to making and sequencing DNA tags that query poly(A) sites. I hope to say more in later essays about some of the findings that these studies yield. But one thing that I find interesting and important is that, even though the experimental strategies differ, the general findings are fairly consistent.
The first study was co-authored by 23 persons affiliated with 11 institutions. (For brevity’s sake, I won’t list the authors or institutions; readers are welcome to visit the publication’s web page and get the information for themselves.) This paper, entitled “The Landscape of C. elegans 3’UTRs”, took a multifaceted approach to identifying poly(A) sites. This included preparing and sequencing full-length cDNAs from worms at different developmental stages, “mining” sites from the sequence traces that may be found at NCBI, performing high-throughput 3’-RACE (involving more than 7000 different genes!), mining extant RNA-Seq data, and, most pertinent to this essay, producing cDNA fragments that query poly(A)-mRNA junctions and sequencing these cDNA fragments using 454 pyrosequencing technology.
The second study (Ozsolak et al.) took a somewhat different approach to producing large numbers of sequences that query poly(A) sites. These authors utilized so-called Helicos sequencing technology and capitalized on the fact that the desired targets for sequencing, polyadenylated RNAs, are “natural” substrates for the sequencing method. (Other nucleic acid molecules need to be polyadenylated prior to capture with immobilized oligo-dT.) They used this method to sequence the 3’ ends of human liver and brain mRNAs as well as yeast mRNAs.
The third study (Jan et al.) took yet another approach to generate sequences that define the mRNA-poly(A) junction. The two-tiered strategy involved an initial capture of polyadenylated RNAs using immobilized oligo-dT followed by a series of steps intended to permit reverse transcription and subsequent amplification of DNA tags, but without using oligo-dT for the reverse transcription. (The concern obviously is that oligo-dT may prime internally and not just within poly(A) tails.) The DNA fragments so generated were sequenced using Illumina sequencing technology. This technology was applied to the characterization of poly(A) sites in C. elegans.
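The internal-priming concern is often handled computationally as well, by checking whether the genomic sequence just downstream of a candidate site is itself A-rich (in which case oligo-dT could have annealed there rather than to a true poly(A) tail). The sketch below illustrates that kind of check. It is not the method of Jan et al., who addressed the problem experimentally; the function name and the thresholds (a 10-nt window, 7 or more A's, or a run of 6 consecutive A's) are assumptions chosen only for illustration.

```python
def looks_internally_primed(downstream, window=10, min_a=7, run_length=6):
    """Flag a candidate poly(A) site as a possible internal-priming artifact.

    downstream: genomic sequence immediately 3' of the candidate site,
                given in the orientation of the transcript.
    window:     number of downstream nucleotides to inspect (assumed value).
    min_a:      A count within the window that triggers the flag (assumed).
    run_length: length of an uninterrupted A run that triggers the flag
                (assumed).
    """
    seq = downstream[:window].upper()
    if seq.count("A") >= min_a:
        return True
    return "A" * run_length in seq

if __name__ == "__main__":
    print(looks_internally_primed("AAAAAAAAGC"))  # A-rich genomic stretch
    print(looks_internally_primed("GCTAGCTAGC"))  # no A-rich stretch
```

A filter of this sort is a blunt instrument: it will also discard genuine sites that happen to sit next to A-rich genomic sequence, which is one reason an experimental strategy that avoids oligo-dT priming is attractive.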
The fourth study (Shepard et al.) implemented a rather simple and clever method to produce and sequence DNA tags that query the mRNA-poly(A) junction. This group fragmented RNA and performed reverse transcription reactions using an anchored oligo-dT primer that had, at its 5’ end, sequences compatible with Illumina sequencing. They also incorporated a sort of “cap-capture” or SMART step, in which strand-switching was capitalized upon to place the opposing Illumina adapter at the other end of the DNA tags. Thus, a one-step RT reaction followed by PCR amplification and work-up sufficed to generate DNA tags for sequencing. This method was applied to human and mouse mRNA preparations and the resulting sequences analyzed.
Without getting too lost in the numerous details, this group of studies brings into sharper focus several interesting aspects of polyadenylation. Regardless of the organism, it is apparent from these studies that many genes (from 30% to 70%, the latter figure for yeast) possess more than one polyadenylation site, suggesting important roles for alternative polyadenylation in gene expression. Previously noted developmental trends in poly(A) site choice are confirmed in these studies. There are strong suggestions of somewhat different classes of polyadenylation signal in C. elegans and mammals. A possible interplay between poly(A) site choice and microRNA targeting (such as is discussed here and here) is suggested by these studies.
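For readers wondering how a figure like "the fraction of genes with more than one poly(A) site" is arrived at, the arithmetic is simple once each site has been assigned to a gene. The sketch below assumes a hypothetical input of (gene, site position) pairs; the function name and the toy data are mine, and the numbers are not taken from any of the four papers.

```python
from collections import defaultdict

def fraction_with_multiple_sites(gene_site_pairs):
    """Return the fraction of genes that have more than one distinct site.

    gene_site_pairs: iterable of (gene_id, site_position) tuples, where each
                     site has already been assigned to a gene (assumed input).
    """
    sites_per_gene = defaultdict(set)
    for gene_id, site in gene_site_pairs:
        sites_per_gene[gene_id].add(site)  # a set ignores duplicate sites
    if not sites_per_gene:
        return 0.0
    multi = sum(1 for sites in sites_per_gene.values() if len(sites) > 1)
    return multi / len(sites_per_gene)

if __name__ == "__main__":
    toy = [("g1", 100), ("g1", 400), ("g2", 50), ("g3", 10), ("g3", 10)]
    print(f"{fraction_with_multiple_sites(toy):.2f}")
```

Note that the answer depends heavily on upstream choices, such as how nearby tag ends are merged into sites and how many tags a site needs before it is counted, so figures from different studies are not strictly comparable.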
There is much more to be gleaned from this group of papers than I am inclined to discuss in this brief essay. Beyond this, there is great potential in this approach in studying various and sundry aspects of polyadenylation. We shall see what the future holds in this regard.
The four papers:
Mangone M, Manoharan AP, Thierry-Mieg D, Thierry-Mieg J, Han T, Mackowiak SD, Mis E, Zegar C, Gutwein MR, Khivansara V, Attie O, Chen K, Salehi-Ashtiani K, Vidal M, Harkins TT, Bouffard P, Suzuki Y, Sugano S, Kohara Y, Rajewsky N, Piano F, Gunsalus KC, Kim JK. The landscape of C. elegans 3’UTRs. Science. 2010 Jul 23;329(5990):432-5.
Ozsolak F, Kapranov P, Foissac S, Kim SW, Fishilevich E, Monaghan AP, John B, Milos PM. Comprehensive polyadenylation site maps in yeast and human reveal pervasive alternative polyadenylation. Cell. 2010 Dec 10;143(6):1018-29.
Jan CH, Friedman RC, Ruby JG, Bartel DP. Formation, regulation and evolution of Caenorhabditis elegans 3’UTRs. Nature. 2011 Jan 6;469(7328):97-101. Epub 2010 Nov 17.
Shepard PJ, Choi EA, Lu J, Flanagan LA, Hertel KJ, Shi Y. Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA. 2011 Feb 22. [Epub ahead of print]