Program gto_fastq_exclude_n

The gto_fastq_exclude_n program discards the FASTQ reads that contain more than a given maximum number of "N" symbols. If a second header is present (after the +), it is erased.

For help type:

./gto_fastq_exclude_n -h


In the following subsections, we explain the input and output parameters.

Input parameters

The gto_fastq_exclude_n program uses two streams for the computation: it reads from standard input and writes to standard output. The input stream must be a FASTQ file.

The parameters are given according to:

Usage: ./gto_fastq_exclude_n [options] [[--] args]
or: ./gto_fastq_exclude_n [options]

It discards the FASTQ reads with the minimum number of "N" symbols.
If present, it will erase the second header (after +).

-h, --help show this help message and exit

Basic options
-m, --max= The maximum number of "N" symbols in the read
< input.fastq Input FASTQ file format (stdin)
> output.fastq Output FASTQ file format (stdout)

Example: ./gto_fastq_exclude_n -m <max> < input.fastq > output.fastq

Console output example :

Total reads : value
Filtered reads : value


An example of such an input file is:

@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72
GNNTGATGGCCGCTGCCGATGGCGNANAATCCCACCAANATACCCTTAACAACTTAAGGGTTNTCAAATAGA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
NTTCAGGGATACGACGNTTGTATTTTAAGAATCTGNAGCAGAAGTCGATGATAATACGCGNCGTTTTATCAN
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I


Output

The output of the gto_fastq_exclude_n program is the set of FASTQ reads that pass the filter, followed by the execution report.

The execution report only appears in the console.

Using the input above with the max value set to 5, the first read contains six "N" symbols and is discarded, while the second contains exactly five and is kept. The output is the following:

@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
NTTCAGGGATACGACGNTTGTATTTTAAGAATCTGNAGCAGAAGTCGATGATAATACGCGNCGTTTTATCAN
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I
Total reads : 2
Filtered reads : 1
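
The filtering rule illustrated by this example can be sketched in Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the function name filter_fastq is hypothetical, records are assumed to span exactly four lines, only uppercase "N" is counted, and the threshold is treated as inclusive (a read with exactly max "N" symbols is kept), which is inferred from the example above.

```python
import io


def filter_fastq(in_stream, out_stream, max_n):
    """Copy reads with at most max_n 'N' symbols; return (total, kept).

    Hypothetical helper: assumes four-line FASTQ records and an
    inclusive threshold, as inferred from the documented example.
    """
    total = kept = 0
    while True:
        header = in_stream.readline()
        if not header.strip():
            break  # end of input (or a trailing blank line)
        sequence = in_stream.readline().rstrip("\n")
        in_stream.readline()  # second header line ("+..."), not reused
        quality = in_stream.readline().rstrip("\n")
        total += 1
        if sequence.count("N") <= max_n:
            kept += 1
            # The second header is always written back as a bare "+".
            out_stream.write(f"{header.rstrip()}\n{sequence}\n+\n{quality}\n")
    return total, kept


# The two reads from the input example above.
example = (
    "@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72\n"
    "GNNTGATGGCCGCTGCCGATGGCGNANAATCCCACCAANATACCCTTAACAACTTAAGGGTTNTCAAATAGA\n"
    "+\n"
    "IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/\n"
    "@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72\n"
    "NTTCAGGGATACGACGNTTGTATTTTAAGAATCTGNAGCAGAAGTCGATGATAATACGCGNCGTTTTATCAN\n"
    "+\n"
    "IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I\n"
)

result = io.StringIO()
total, kept = filter_fastq(io.StringIO(example), result, max_n=5)
print(result.getvalue(), end="")
```

With max_n=5 the sketch keeps only the SRR001666.2 read, matching the output shown above. Passing sys.stdin and sys.stdout as the two streams would mirror the way the program itself is invoked.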