General information
Version: 0.2.2
clean_reads cleans NGS (Sanger, 454, Illumina and SOLiD) reads. It can trim:
- bad quality regions
- adaptors
- vectors
- regular expressions
It also filters out reads that do not meet minimum quality criteria based on sequence length and mean quality. It can run in parallel.
How to use
To submit clean_reads jobs to the queue system, execute the command
send_clean_reads
It will ask a few questions to build the job script and then submit it to the queue.
Performance
clean_reads can be executed in parallel and scales well up to 8 cores. With 12 cores the run is barely faster than with 8 (238 s versus 246 s), so the extra cores are largely wasted. Table 1 shows the benchmark results, which were obtained on a 12-core node with E5645 Xeon processors.
Cores | 1 | 4 | 8 | 12 |
Time (s) | 1600 | 422 | 246 | 238 |
Speedup | 1 | 3.8 | 6.5 | 6.7 |
Parallel efficiency (%) | 100 | 95 | 81 | 56 |
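The last two rows of the table follow directly from the timings: speedup is the 1-core time divided by the n-core time, and the percentage is that speedup divided by the number of cores. A minimal Python sketch of the arithmetic is shown here; the function name scaling is only illustrative and is not part of clean_reads.

```python
# Wall-clock times in seconds from the benchmark table, keyed by core count.
times = {1: 1600, 4: 422, 8: 246, 12: 238}

def scaling(times):
    """Return (cores, speedup, efficiency %) rows relative to the 1-core run."""
    base = times[1]
    rows = []
    for cores, seconds in sorted(times.items()):
        speedup = base / seconds
        efficiency = 100 * speedup / cores
        rows.append((cores, round(speedup, 1), round(efficiency)))
    return rows

for cores, speedup, efficiency in scaling(times):
    print(f"{cores:>2} cores: speedup {speedup}, efficiency {efficiency}%")
```

The rounded values match the table, which confirms that the percentage row is the speedup per core rather than an independent measurement.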
The command used was:
clean_reads -i in.fastq -o out.fastq -p illumina -f fastq -g fastq -a a.fna -d UniVec -n 20 --qual_threshold=20 --only_3_end False -m 60 -t 12
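To repeat the measurement with other core counts, the same command can be rebuilt with a different -t value and timed. The following Python sketch assumes that -t sets the number of threads, as the 12-core run above suggests; build_command and time_run are hypothetical helper names, and the file names are the ones from the command above.

```python
import subprocess
import time

def build_command(threads):
    """Return the benchmark command shown above with a different -t value."""
    return [
        "clean_reads", "-i", "in.fastq", "-o", "out.fastq",
        "-p", "illumina", "-f", "fastq", "-g", "fastq",
        "-a", "a.fna", "-d", "UniVec", "-n", "20",
        "--qual_threshold=20", "--only_3_end", "False",
        "-m", "60", "-t", str(threads),  # assumption: -t is the thread count
    ]

def time_run(threads):
    """Run clean_reads once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(build_command(threads), check=True)
    return time.perf_counter() - start
```

Calling time_run(n) for n in 1, 4, 8 and 12 from inside a queue job, rather than on a login node, would give timings comparable to the "Time (s)" row.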