General information
2.1.1 release. Trinity, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works like so:
- Inchworm assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.
- Chrysalis clusters the Inchworm contigs into clusters and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptonal complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.
- Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that corresponds to paralogous genes.
How to use
You can use the
send_trinity
command to submit jobs to the queue system. After answering few questions a script will be created and submitted to the queue system. For advanced users it can be used to generate a sample script.
Performance
Trinity can be run in parallel but it is not very efficient above 4 cores with low performance, as can be seen in the the table. Trinity consumes high amounts of RAM.
Cores | 1 | 4 | 8 | 12 |
Time | 5189 | 2116 | 1754 | 1852 |
Speddup | 1 | 2.45 | 2.96 | 2.80 |
Efficiency (%) | 100 | 61 | 37 | 23 |