General information
IDBA-UD 1.1.1 is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively, with the cutoff threshold raised from low to high, to reduce errors in both low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.
How to use
To send jobs to the queue, use the command
send_idba-ud
which asks a few questions and then configures the job.
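The questions asked by send_idba-ud are not listed here, so the following is only a minimal sketch of the kind of idba_ud command line such a job might end up running. The helper name build_idba_ud_command, the file name reads.fa and the output directory assembly_out are hypothetical placeholders; the options -r, -o, --mink, --step, --min_support and --num_threads are those of the idba_ud program itself, not of the send_idba-ud wrapper.

```python
import shlex


def build_idba_ud_command(reads, out_dir, mink, step, threads, min_support=None):
    """Build the argument list for a direct idba_ud run.

    Hypothetical helper for illustration only: it is not part of IDBA-UD
    or of the send_idba-ud wrapper.
    """
    cmd = [
        "idba_ud",
        "-r", reads,                     # input reads in FASTA format (placeholder name)
        "-o", out_dir,                   # output directory (placeholder name)
        "--mink", str(mink),             # smallest k of the iteration
        "--step", str(step),             # increment of k between iterations
        "--num_threads", str(threads),   # number of cores requested for the job
    ]
    if min_support is not None:
        cmd += ["--min_support", str(min_support)]
    return cmd


# Options of the first benchmark (--mink 40 --step 20) on 8 cores.
cmd = build_idba_ud_command("reads.fa", "assembly_out", mink=40, step=20, threads=8)
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run inside a job script; shlex.join is used only to print it as a single shell line.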
Performance
IDBA-UD performs and scales well up to 8 cores; beyond that we did not measure any improvement. The benchmark used the --mink 40 --step 20 options. When the step is decreased, the scaling gets worse. This trend can also be seen in the second table.
Cores | Time (s) | Speed up (1 core as base) | Performance (%) (1 core as base) | Speed up (2 cores as base) | Performance (%) (2 cores as base) |
1 | 480 | 1 | 100 | | |
2 | 296 | 1.6 | 81 | 1.0 | 100 |
4 | 188 | 2.6 | 64 | 1.6 | 79 |
8 | 84 | 5.7 | 71 | 3.5 | 88 |
12 | 92 | 5.2 | 43 | 3.2 | 54 |
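The speed up and performance columns above can be reproduced from the measured times alone. The sketch below assumes that speed up is the base time divided by the measured time, and that performance is the speed up divided by the ratio of cores to base cores, expressed as a percentage; this reading is an assumption, but it matches the figures in the table. The function name scaling is illustrative.

```python
def scaling(times, base_cores):
    """Return {cores: (speed_up, performance_percent)} relative to base_cores.

    times maps a core count to the measured wall time in seconds.
    """
    base_time = times[base_cores]
    result = {}
    for cores, seconds in sorted(times.items()):
        if cores < base_cores:
            continue  # runs with fewer cores than the base are left out
        speed_up = base_time / seconds
        # Ideal speed up would be cores / base_cores; report the achieved fraction.
        performance = 100 * speed_up / (cores / base_cores)
        result[cores] = (round(speed_up, 1), round(performance))
    return result


# Times from the first benchmark (--mink 40 --step 20).
times = {1: 480, 2: 296, 4: 188, 8: 84, 12: 92}
for base in (1, 2):
    print(f"{base} core(s) as base")
    for cores, (speed_up, performance) in scaling(times, base).items():
        print(f"  {cores:>2} cores: speed up {speed_up}, performance {performance}%")
```

Using 2 cores as the base shows how well the run scales once the step from 1 to 2 cores is left out of the comparison.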
The second benchmark was run on a bigger file with 10 million bases, using the --mink 20 --step 10 --min_support 2 options. We observe more regular behaviour than in the previous benchmark, and the parallelization is good up to 4 cores.
Cores | Time (s) | Speed up | Performance (%) |
1 | 13050 | 1 | 100 |
2 | 6675 | 2.0 | 98 |
4 | 3849 | 3.4 | 85 |
8 | 3113 | 4.2 | 52 |
16 | 2337 | 5.6 | 35 |
20 | 2409 | 5.4 | 27 |
More information
IDBA-UD web page.