IDBA-UD

General information

IDBA-UD 1.1.1 is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively, with the cutoff threshold raised from low to high, to reduce errors in both low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to recover some of the k-mers that are missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.
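To make the iteration over k concrete, the following minimal Python sketch lists the k values that would be visited for a given minimum k, maximum k and step. It assumes a plain arithmetic schedule; the function name k_schedule and the upper bound of 100 are illustrative assumptions, not values taken from the benchmarks on this page.

```python
def k_schedule(mink: int, maxk: int, step: int) -> list[int]:
    """Return the k values visited when going from mink up to maxk in fixed steps.

    Assumption: a simple arithmetic progression; the assembler's internal
    schedule may differ in detail.
    """
    if mink <= 0 or step <= 0 or maxk < mink:
        raise ValueError("expected 0 < mink <= maxk and step > 0")
    return list(range(mink, maxk + 1, step))


# A smaller step means more iterations, each building a new graph.
print(k_schedule(40, 100, 20))  # 4 iterations
print(k_schedule(20, 100, 10))  # 9 iterations
```

A smaller step gives more iterations for the same range of k, which is one plausible reason why the choice of step affects run time.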

How to use

To send jobs to the queue you can use the command

send_idba-ud

which after a few questions configures the job.

Performance

IDBA-UD performs and scales well up to 8 cores; beyond that we did not measure any improvement. This benchmark used the --mink 40 --step 20 options. With a smaller step the scaling is worse, a trend that can also be seen in the second table.
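As an illustration of how these options fit together, the sketch below assembles an idba_ud argument list in Python without running it. The file name reads.fa, the output directory assembly_out and the helper idba_ud_command are hypothetical. On this cluster jobs are normally submitted with send_idba-ud, whose questions are not reproduced here.

```python
import shlex


def idba_ud_command(reads, out_dir, mink, step, threads, min_support=None):
    """Build (but do not run) an idba_ud command line as a list of arguments.

    reads and out_dir are hypothetical placeholders for a FASTA read file
    and an output directory.
    """
    cmd = [
        "idba_ud",
        "-r", reads,
        "-o", out_dir,
        "--mink", str(mink),
        "--step", str(step),
        "--num_threads", str(threads),
    ]
    if min_support is not None:
        cmd += ["--min_support", str(min_support)]
    return cmd


# Options matching the benchmark described above, on 8 cores.
print(shlex.join(idba_ud_command("reads.fa", "assembly_out", 40, 20, 8)))
```

Building the list first and printing it makes it easy to check the options before submitting the job through the queue.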

                    1 core as base                 2 cores as base
Cores   Time (s)    Speed up   Performance (%)     Speed up   Performance (%)
1       480         1          100
2       296         1.6        81                  1.0        100
4       188         2.6        64                  1.6        79
8       84          5.7        71                  3.5        88
12      92          5.2        43                  3.2        54
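The speed up and performance columns follow directly from the measured times: speed up is the base time divided by the measured time, and performance is that speed up divided by the ratio of cores to base cores, as a percentage. A minimal Python sketch of the calculation (the function name scaling is illustrative):

```python
def scaling(times, base_cores):
    """Return (cores, speed up, performance %) rows relative to base_cores.

    times maps a core count to the measured wall time in seconds.
    """
    base_time = times[base_cores]
    rows = []
    for cores in sorted(times):
        if cores < base_cores:
            continue  # runs with fewer cores than the base are not compared
        speedup = base_time / times[cores]
        performance = 100 * speedup * base_cores / cores
        rows.append((cores, round(speedup, 1), round(performance)))
    return rows


first_benchmark = {1: 480, 2: 296, 4: 188, 8: 84, 12: 92}

for base in (1, 2):
    print(f"{base} core(s) as base")
    for cores, speedup, performance in scaling(first_benchmark, base):
        print(f"{cores:>5} {speedup:>8} {performance:>5}")
```

Taking 2 cores as the base removes the cost of going from a serial to a parallel run, which is why the 8-core run looks better in the right-hand columns.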

The second benchmark was run on a larger input file of 10 million bases with the --mink 20 --step 10 --min_support 2 options. The behaviour is more regular than in the previous benchmark, and the parallelization is good up to 4 cores.

Cores   Time (s)    Speed up   Performance (%)
1       13050       1          100
2       6675        2.0        98
4       3849        3.4        85
8       3113        4.2        52
16      2337        5.6        35
20      2409        5.4        27
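One way to turn timings like these into a core count for a job is to pick the largest number of cores whose performance stays above a chosen threshold. The Python sketch below does this; the 80% threshold and the function name largest_efficient_core_count are assumptions made for illustration, not a site policy.

```python
def largest_efficient_core_count(times, threshold=80.0):
    """Return the largest core count whose performance (%) is >= threshold.

    times maps a core count to the measured wall time in seconds; the
    smallest core count present is used as the base.
    """
    base_cores = min(times)
    base_time = times[base_cores]
    best = base_cores
    for cores in sorted(times):
        performance = 100 * (base_time / times[cores]) * base_cores / cores
        if performance >= threshold:
            best = cores
    return best


second_benchmark = {1: 13050, 2: 6675, 4: 3849, 8: 3113, 16: 2337, 20: 2409}
print(largest_efficient_core_count(second_benchmark))
```

With an 80% threshold this selects 4 cores for the second benchmark, in line with the observation that the parallelization is good up to 4 cores.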

More information

IDBA-UD web page.