The gto_fasta_extract_pattern_coords program extracts the header and coordinates of each occurrence of a given pattern/motif in the sequences of a Multi-FASTA file.
For help type:
./gto_fasta_extract_pattern_coords -h
In the following subsections, we explain the input and output parameters.
The gto_fasta_extract_pattern_coords program needs two streams for the computation, namely the standard input and the standard output. The input stream is a Multi-FASTA file.
The parameters are assigned as follows:
Usage: ./gto_fasta_extract_pattern_coords [options] [[--] args]
or: ./gto_fasta_extract_pattern_coords [options]
It extracts the header and coordinates from a Multi-FASTA file format given a
pattern/motif in the sequence.
-h, --help show this help message and exit
Basic options
-p, --pattern=<str> Pattern to search in the file header
< input.fasta Input Multi-FASTA file format (stdin)
> output.coords Output coordinates (stdout)
Example: ./gto_fasta_extract_pattern_coords -p <pattern> < input.fasta > output.coords
An example of such an input file is:
>AB000264 |acc=AB000264|descr=Homo sapiens mRNA
ACAAGACGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCCTGGAGGGTCCACCGCTGCCCTGCTGCCATTGTCCCC
GGCCCCACCTAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAA
GTGGTTTGAGTGGACCTCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGCAGGCCAGTGCC
GCGAATCCGCGCGCCGGGACAGAATCTCCTGCAAAGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCACCCCCCCAGC
TAAAACCTCACCCATGAATGCTCGCAACACGCAAGTTTAATTCGCAAGTTAGACCTGAACGGGAGGTGGCCACGCAAGTT
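To make the layout of such a file concrete, the following is a minimal Python sketch of how Multi-FASTA records can be split into a header and a sequence. It is not the program's own code: the function name read_multi_fasta is hypothetical, the sequences are shortened for illustration, and the second record is invented to show the multi-record case. It assumes that every record starts with a line beginning with ">" and that the sequence lines that follow are joined without the line breaks.

```python
def read_multi_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of Multi-FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None and line:
            # Sequence lines are collected and later joined without newlines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Shortened, illustrative input; the second record is hypothetical.
example = [
    ">AB000264 |acc=AB000264|descr=Homo sapiens mRNA\n",
    "ACAAGACGGCC\n",
    "GGCCCCACCTA\n",
    ">seq2 hypothetical second record\n",
    "TTGACA\n",
]

records = list(read_multi_fasta(example))
for header, sequence in records:
    print(header, len(sequence))
```

In a pipeline, the same function could be fed from the standard input by passing sys.stdin instead of the example list.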
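The output of the gto_fasta_extract_pattern_coords program is a list of coordinates, one line per match: the start and end positions of the pattern in the sequence, followed by the header of the sequence in which it was found. As the example shows, positions are 1-based and inclusive, and line breaks in the sequence are not counted.
Using the input above with the pattern ACA, the output is the following: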
1 3 >AB000264 |acc=AB000264|descr=Homo sapiens mRNA
131 133 >AB000264 |acc=AB000264|descr=Homo sapiens mRNA
259 261 >AB000264 |acc=AB000264|descr=Homo sapiens mRNA
347 349 >AB000264 |acc=AB000264|descr=Homo sapiens mRNA
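The coordinates above can be reproduced with a short Python sketch. It is only an illustration of how the positions relate to the sequence, not the program's implementation: the function name pattern_coords is hypothetical, and the sketch assumes exact string matching, 1-based inclusive positions, sequence lines joined without line breaks, and overlapping matches being reported (the example does not show how the program itself treats overlaps).

```python
def pattern_coords(header, sequence, pattern):
    """Return (start, end, header) tuples with 1-based, inclusive positions."""
    coords = []
    pos = sequence.find(pattern)
    while pos != -1:
        coords.append((pos + 1, pos + len(pattern), header))
        # Advance by one character, so overlapping matches are kept (assumption).
        pos = sequence.find(pattern, pos + 1)
    return coords


header = ">AB000264 |acc=AB000264|descr=Homo sapiens mRNA"
# The five sequence lines of the example input, joined without line breaks.
sequence = "".join([
    "ACAAGACGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCCTGGAGGGTCCACCGCTGCCCTGCTGCCATTGTCCCC",
    "GGCCCCACCTAAGGAAAAGCAGCCTCCTGACTTTCCTCGCTTGGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAA",
    "GTGGTTTGAGTGGACCTCCGGGCCCCTCATAGGAGAGGAAGCTCGGGAGGTGGCCAGGCGGCAGGAAGCAGGCCAGTGCC",
    "GCGAATCCGCGCGCCGGGACAGAATCTCCTGCAAAGCCCTGCAGGAACTTCTTCTGGAAGACCTTCTCCACCCCCCCAGC",
    "TAAAACCTCACCCATGAATGCTCGCAACACGCAAGTTTAATTCGCAAGTTAGACCTGAACGGGAGGTGGCCACGCAAGTT",
])

result = pattern_coords(header, sequence, "ACA")
for start, end, name in result:
    print(start, end, name)
```

For this input, the sketch yields the same four coordinate pairs as the program output listed above.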