exonerate_to_gff

The exonerate_to_gff command converts exonerate output to GFF.

usage

$ ingenannot -v 2 exonerate_to_gff > match_prot.gff

positional arguments:

Gff_genes

Gene Annotation file in GFF/GTF file format

optional arguments:

-h, --help

show this help message and exit

-m {prot,nuc}, --mode {prot,nuc}

Mode: [prot, nuc], default=prot

-p PREFIX, --prefix PREFIX

Add a prefix to the feature name, useful if you ran exonerate in split mode

inputs

Output of exonerate, run with the p2g (protein2genome) model and options such as:

exonerate --model p2g --showvulgar no --showalignment no --showquerygff no --showtargetgff yes --percent 80 --ryo "AveragePercentIdentity: %pi\n" protein_db.pep target_genome.fasta

Expected “compliant” output of exonerate:

Command line: [exonerate --model protein2genome --showvulgar no --showalignment no --showtargetgff yes --showquerygff no --minintron 4 --maxintron 5000 --percent 50 --ryo AveragePercentIdentity: %pi\n ../data/UNIPROTKB_no_zymo/UNIPROTKB.Dothideomycetes.15072020.NoZymo.fasta_chunk_0000000 /work/nlapalu/gmove/data/chr/chr_1.fasta]
Hostname: [node021]
# --- START OF GFF DUMP ---
#
#
##gff-version 2
##source-version exonerate:protein2genome:local 2.2.0
##date 2020-08-17
##type DNA
#
#
# seqname source feature start end score strand frame attributes
#
chr_1 exonerate:protein2genome:local  gene    3034345 3034872 685     -       .       gene_id 1 ; sequence tr|W6YK43|W6YK43_COCCA ; gene_orientation .
chr_1 exonerate:protein2genome:local  cds     3034345 3034872 .       -       .
chr_1 exonerate:protein2genome:local  exon    3034345 3034872 .       -       .       insertions 0 ; deletions 0
chr_1 exonerate:protein2genome:local  similarity      3034345 3034872 685     -       .       alignment_id 1 ; Query tr|W6YK43|W6YK43_COCCA ; Align 3034873 8 528
# --- END OF GFF DUMP ---
#
AveragePercentIdentity: 71.59
# --- START OF GFF DUMP ---
#
#
##gff-version 2
##source-version exonerate:protein2genome:local 2.2.0
##date 2020-08-17
##type DNA
#
#
# seqname source feature start end score strand frame attributes
#
chr_1 exonerate:protein2genome:local  gene    3357525 3358309 868     -       .       gene_id 1 ; sequence tr|W6YD21|W6YD21_COCCA ; gene_orientation +
chr_1 exonerate:protein2genome:local  cds     3358268 3358309 .       -       .
chr_1 exonerate:protein2genome:local  exon    3358268 3358309 .       -       .       insertions 0 ; deletions 0
chr_1 exonerate:protein2genome:local  splice5 3358266 3358267 .       -       .       intron_id 1 ; splice_site "GT"
chr_1 exonerate:protein2genome:local  intron  3358206 3358267 .       -       .       intron_id 1
chr_1 exonerate:protein2genome:local  splice3 3358206 3358207 .       -       .       intron_id 0 ; splice_site "AG"
chr_1 exonerate:protein2genome:local  cds     3357525 3358205 .       -       .
chr_1 exonerate:protein2genome:local  exon    3357525 3358205 .       -       .       insertions 9 ; deletions 1
chr_1 exonerate:protein2genome:local  similarity      3357525 3358309 868     -       .       alignment_id 1 ; Query tr|W6YD21|W6YD21_COCCA ; Align 3358310 1 42 ; Align 3358206 15 456 ; Align 3357741 167 114 ; Align 3357627 206 102
# --- END OF GFF DUMP ---
#
AveragePercentIdentity: 72.69
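
To make the layout above concrete, here is a minimal Python sketch of how such a file could be read: each GFF dump is paired with the AveragePercentIdentity line that follows it. This is an illustration only, not the code used by exonerate_to_gff. The function name parse_exonerate and the dictionary keys are hypothetical, and the sketch assumes whitespace-separated columns and exactly one AveragePercentIdentity line per dump.

```python
import re


def parse_exonerate(lines):
    """Yield one dict per alignment found in exonerate output.

    Assumption: every GFF dump is followed by one
    'AveragePercentIdentity:' line, as produced by the --ryo option.
    """
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("# --- START OF GFF DUMP"):
            # A new alignment block begins.
            current = {"exons": []}
        elif line.startswith("AveragePercentIdentity:"):
            # The --ryo line closes the block opened above.
            if current is not None:
                current["identity"] = float(line.split(":", 1)[1])
                yield current
            current = None
        elif current is not None and line and not line.startswith("#"):
            # Feature line: 8 fixed columns, optional attributes column.
            fields = line.split(None, 8)
            if len(fields) < 8:
                continue
            seqid, _source, feature, start, end, _score, strand = fields[:7]
            attributes = fields[8] if len(fields) > 8 else ""
            if feature == "gene":
                query = re.search(r"sequence (\S+)", attributes)
                current.update(
                    seqid=seqid,
                    start=int(start),
                    end=int(end),
                    strand=strand,
                    query=query.group(1) if query else None,
                )
            elif feature == "exon":
                current["exons"].append((int(start), int(end)))


# Shortened copy of the first alignment shown above.
SAMPLE = """\
# --- START OF GFF DUMP ---
##gff-version 2
chr_1 exonerate:protein2genome:local gene 3034345 3034872 685 - . gene_id 1 ; sequence tr|W6YK43|W6YK43_COCCA ; gene_orientation .
chr_1 exonerate:protein2genome:local cds 3034345 3034872 . - .
chr_1 exonerate:protein2genome:local exon 3034345 3034872 . - . insertions 0 ; deletions 0
# --- END OF GFF DUMP ---
AveragePercentIdentity: 71.59
"""

for hit in parse_exonerate(SAMPLE.splitlines()):
    print(hit["seqid"], hit["start"], hit["end"], hit["strand"],
          hit["query"], hit["identity"], hit["exons"])
```

Lines outside a dump, such as the Command line and Hostname headers, are ignored by this sketch because no block is open when they are read.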

outputs

Output on stdout:

chr_1 exonerate_to_gff        match   3034345 3034872 71.59   -       .       ID=match.1;Dbxref=exonerate:0;Name=tr|W6YK43|W6YK43_COCCA
chr_1 exonerate_to_gff        match_part      3034345 3034872 71.59   -       .       ID=match.1.0;Parent=match.1;Dbxref=exonerate:tr|W6YK43|W6YK43_COCCA;Target=tr|W6YK43|W6YK43_COCCA 8 184
chr_1 exonerate_to_gff        match   3357525 3358309 72.69   -       .       ID=match.2;Dbxref=exonerate:1;Name=tr|W6YD21|W6YD21_COCCA
chr_1 exonerate_to_gff        match_part      3357525 3358205 72.69   -       .       ID=match.2.1;Parent=match.2;Dbxref=exonerate:tr|W6YD21|W6YD21_COCCA;Target=tr|W6YD21|W6YD21_COCCA 15 167
chr_1 exonerate_to_gff        match_part      3358268 3358309 72.69   -       .       ID=match.2.0;Parent=match.2;Dbxref=exonerate:tr|W6YD21|W6YD21_COCCA;Target=tr|W6YD21|W6YD21_COCCA 1 15
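
For downstream use, the match and match_part lines above can be regrouped by their ID and Parent attributes. The following Python sketch shows one way to do that. It is not part of ingenannot: the function names parse_attributes and group_matches are hypothetical, and the sketch assumes whitespace-separated columns with the attributes in the ninth column, as in the sample output.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn 'ID=match.1;Name=x' into {'ID': 'match.1', 'Name': 'x'}."""
    pairs = (item.split("=", 1)
             for item in column9.strip().split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def group_matches(lines):
    """Return {match ID: match info with its match_part list}."""
    matches = {}
    parts = defaultdict(list)
    for line in lines:
        # Keep the attributes column whole (Target values contain spaces).
        fields = line.strip().split(None, 8)
        if len(fields) < 9 or fields[0].startswith("#"):
            continue
        attrs = parse_attributes(fields[8])
        start, end = int(fields[3]), int(fields[4])
        if fields[2] == "match":
            matches[attrs["ID"]] = {
                "name": attrs.get("Name"),
                "score": float(fields[5]),
                "span": (start, end),
            }
        elif fields[2] == "match_part":
            parts[attrs["Parent"]].append((start, end, attrs.get("Target")))
    for match_id, match in matches.items():
        # Order parts by genomic coordinates.
        match["parts"] = sorted(parts.get(match_id, []), key=lambda p: p[:2])
    return matches


# Second match from the sample output above.
SAMPLE_OUT = """\
chr_1 exonerate_to_gff match 3357525 3358309 72.69 - . ID=match.2;Dbxref=exonerate:1;Name=tr|W6YD21|W6YD21_COCCA
chr_1 exonerate_to_gff match_part 3357525 3358205 72.69 - . ID=match.2.1;Parent=match.2;Dbxref=exonerate:tr|W6YD21|W6YD21_COCCA;Target=tr|W6YD21|W6YD21_COCCA 15 167
chr_1 exonerate_to_gff match_part 3358268 3358309 72.69 - . ID=match.2.0;Parent=match.2;Dbxref=exonerate:tr|W6YD21|W6YD21_COCCA;Target=tr|W6YD21|W6YD21_COCCA 1 15
"""

for match_id, match in group_matches(SAMPLE_OUT.splitlines()).items():
    print(match_id, match["name"], match["score"], match["parts"])
```

Splitting with a maximum of eight splits keeps the Target attribute intact, since its value holds the protein name followed by two coordinates separated by spaces.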