Genome-wide identification of transcript start and end sites by transcript isoform sequencing

Pelechano V, Wei W, Jakob P, Steinmetz LM

Nat Protoc 9 (7) 1740-1759 [2014-07; online 2014-06-26]

Hundreds of transcript isoforms with varying boundaries and alternative regulatory signals are transcribed from the genome, even in a genetically homogeneous population of cells. To study this transcriptional heterogeneity, we developed transcript isoform sequencing (TIF-seq), a method that allows the genome-wide profiling of full-length transcript isoforms defined by their exact 5' and 3' boundaries. TIF-seq entails the generation of full-length cDNA libraries, followed by their circularization and the sequencing of the junction fragments spanning the 5' and 3' transcript ends. By determining the co-occurrence of start and end sites in individual transcript molecules, TIF-seq can distinguish variations that conventional approaches for mapping single ends cannot, such as short abortive transcripts, bicistronic messages and overlapping transcripts that differ in length. The TIF-seq protocol we describe here can be applied to any eukaryotic organism (e.g., yeast, human), and it requires 6-10 d for generating TIF-seq libraries, 10 d for sequencing and 2-3 d for analysis.
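
The value of pairing start and end sites can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the `Boundary` type, the function names and the toy coordinates are all hypothetical, and it assumes that each sequenced junction has already been mapped to one 5' and one 3' genomic position. It shows only that counting (start, end) pairs separates isoforms that single-end counts merge.

```python
from collections import Counter
from typing import NamedTuple, Sequence


class Boundary(NamedTuple):
    """One transcript molecule, described by its mapped ends (hypothetical type)."""
    chrom: str
    strand: str
    start: int  # 5' end (transcript start site)
    end: int    # 3' end (polyadenylation site)


def count_isoforms(pairs: Sequence[Boundary]) -> Counter:
    """Count molecules that share the same start AND end site."""
    return Counter(pairs)


def count_single_ends(pairs: Sequence[Boundary]):
    """Count start and end sites independently, as single-end mapping would."""
    starts = Counter((p.chrom, p.strand, p.start) for p in pairs)
    ends = Counter((p.chrom, p.strand, p.end) for p in pairs)
    return starts, ends


# Toy data (invented coordinates, for illustration only).
reads = [
    Boundary("chrI", "+", 100, 900),
    Boundary("chrI", "+", 100, 900),
    Boundary("chrI", "+", 100, 400),  # same start, shorter transcript
    Boundary("chrI", "+", 350, 900),  # same end, downstream start
]

isoforms = count_isoforms(reads)
starts, ends = count_single_ends(reads)

for boundary, n in isoforms.most_common():
    print(boundary, n)
```

In this toy input, the single-end counts report one dominant start site and one dominant end site, whereas the paired counts show three distinct isoforms, including a short transcript that shares its start with the full-length form.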

Vicent Pelechano

PubMed 24967623

DOI 10.1038/nprot.2014.121

Crossref 10.1038/nprot.2014.121