The gto_fastq_clust_reads program groups reads and creates an index file. It clusters reads in terms of the lexicographical order of their sequence k-mers.
For help type:
./gto_fastq_clust_reads -h
In the following subsections, we explain the input and output parameters.
The gto_fastq_clust_reads program needs two streams for the computation, namely the standard input and the standard output. The input stream is a FASTQ file. The program sorts the FASTQ reads according to the lexicographic order of the genomic sequences.
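The ordering described above can be illustrated with a minimal Python sketch. It is not the program's implementation: it assumes strict four-line FASTQ records (no wrapped sequences), compares whole sequence lines rather than k-mers, and does not model the -c option or the index file. The names parse_fastq and sort_fastq_by_sequence are hypothetical.

```python
from typing import List, Tuple

# One FASTQ record: header, sequence, separator, quality.
Record = Tuple[str, ...]


def parse_fastq(text: str) -> List[Record]:
    """Split FASTQ text into 4-line records (assumes no wrapped sequences)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected a multiple of 4 non-empty lines")
    # Group by position, not by '@', because quality lines may start with '@'.
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines), 4)]


def sort_fastq_by_sequence(text: str) -> str:
    """Return the records reordered by lexicographic order of the sequence line."""
    records = parse_fastq(text)
    # list.sort() is stable, so reads with identical sequences keep input order.
    records.sort(key=lambda rec: rec[1])
    return "".join("\n".join(rec) + "\n" for rec in records)
```

Under these assumptions, passing the text of a FASTQ file to sort_fastq_by_sequence returns the same records with their sequence lines in ascending lexicographic order.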
The parameters are described in the program's help message:
Usage: ./gto_fastq_clust_reads [options] [[--] args]
or: ./gto_fastq_clust_reads [options]
It agroups reads and creates an index file.
It cluster reads in therms of Seq k-mer Lexicographical order
-h, --help Show this help message and exit
Basic options
-c, --ctx=
< input.fastq Input FASTQ file format (stdin)
> output.fastq Output FASTQ file format (stdout)
Example: ./gto_fastq_clust_reads -c < input.fastq > output.fastq
An example of such an input file is:
@SRR001661.1 071112_SLXA-EAS1_s_7:5:1:817:345
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCAAGTTACCCTTAACAACTTAAGGG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIII
@SRR001661.2 071112_SLXA-EAS1_s_7:5:1:801:338
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGAAGCAGAAGTCGATGATAATACGCG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGI
@SRR001661.3 071112_SLXA-EAS1_s_7:5:1:821:328
AACGCGTATTCGGAGCTTCTTCGTTGGGTACGTGCGCCTATTATGCGGCGCGATTGCTAT
+
IIIIIII6BBB6BBBBBBBBBBBBBBBBBDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
@SRR001661.4 071112_SLXA-EAS1_s_7:5:1:943:128
ATCGCGCATTCGACTGGTACGTGTACGTGTAGTCGTAGCGTATGTTCGGTCGTATGCGTG
+
II77777LPMMMPPMMMMIIIIIIIIIIIIII777777777BBBBBBBBDDDDDIIIIII
The output of the gto_fastq_clust_reads program is a FASTQ file whose reads are clustered according to the lexicographical order of their genomic sequence k-mers.
An example of the output is:
@SRR001661.3 071112_SLXA-EAS1_s_7:5:1:821:328
AACGCGTATTCGGAGCTTCTTCGTTGGGTACGTGCGCCTATTATGCGGCGCGATTGCTAT
+
IIIIIII6BBB6BBBBBBBBBBBBBBBBBDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
@SRR001661.4 071112_SLXA-EAS1_s_7:5:1:943:128
ATCGCGCATTCGACTGGTACGTGTACGTGTAGTCGTAGCGTATGTTCGGTCGTATGCGTG
+
II77777LPMMMPPMMMMIIIIIIIIIIIIII777777777BBBBBBBBDDDDDIIIIII
@SRR001661.1 071112_SLXA-EAS1_s_7:5:1:817:345
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCAAGTTACCCTTAACAACTTAAGGG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIII
@SRR001661.2 071112_SLXA-EAS1_s_7:5:1:801:338
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGAAGCAGAAGTCGATGATAATACGCG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGI
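In the output example above, the sequence lines appear in ascending lexicographic order (AACG..., ATCG..., GGGT..., GTTC...). The following Python sketch shows one way to check that property on an output file. It assumes four-line FASTQ records and compares whole sequence lines; the helper names sequences and is_sorted_by_sequence are hypothetical and not part of the toolkit.

```python
from typing import Iterable, List


def sequences(fastq_lines: Iterable[str]) -> List[str]:
    """Return the sequence line (2nd of every 4) from 4-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in fastq_lines if ln.strip()]
    return lines[1::4]


def is_sorted_by_sequence(fastq_lines: Iterable[str]) -> bool:
    """True if the sequence lines are in non-decreasing lexicographic order."""
    seqs = sequences(fastq_lines)
    # Compare each sequence with its successor.
    return all(a <= b for a, b in zip(seqs, seqs[1:]))
```

For instance, calling is_sorted_by_sequence on the lines of output.fastq should return True for the output shown above, and False for the unsorted input example.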