rescue_effectors
The rescue_effector command predicts effector genes that may have been missed during annotation.
This command analyzes assembled transcripts that were not used in the annotation (each treated as a transcribed gene) to find new potential effector genes. It uses the effector_predictor module to compute the probability that a protein is an effector. Fungal effector genes can be difficult to predict or annotate from evidence sources because of their short length and mono-exonic structure. rescue_effector searches for unannotated transcripts and tests their potential as effector genes. To limit false positives, the ratio of CDS length to mRNA length must reach a minimal value (--size_ratio, default 0.2). The process clusters overlapping/colinear transcripts and predicts only one gene per cluster, which can prevent the prediction of colinear effectors.
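The clustering of overlapping transcripts mentioned above can be illustrated with a short Python sketch. This is not the ingenannot implementation: the `Transcript` type and the `cluster_overlapping` function are hypothetical names introduced for illustration, and the sketch assumes that transcripts are grouped per sequence and strand whenever their coordinates overlap.

```python
from typing import List, NamedTuple


class Transcript(NamedTuple):
    """Hypothetical minimal transcript record (GFF-style 1-based inclusive coordinates)."""
    seqid: str
    start: int
    end: int
    strand: str
    name: str


def cluster_overlapping(transcripts: List[Transcript]) -> List[List[Transcript]]:
    """Group transcripts that overlap on the same sequence and strand.

    Assumption: two transcripts belong to the same cluster when the second
    one starts at or before the current end of the cluster.
    """
    clusters: List[List[Transcript]] = []
    current: List[Transcript] = []
    current_key = None
    current_end = 0
    ordered = sorted(transcripts, key=lambda t: (t.seqid, t.strand, t.start, t.end))
    for t in ordered:
        key = (t.seqid, t.strand)
        if current and key == current_key and t.start <= current_end:
            # Overlaps the running cluster: extend it.
            current.append(t)
            current_end = max(current_end, t.end)
        else:
            # New sequence, new strand, or a gap: close the previous cluster.
            if current:
                clusters.append(current)
            current = [t]
            current_key = key
            current_end = t.end
    if current:
        clusters.append(current)
    return clusters
```

Because only one gene is reported per cluster, two distinct effectors carried by overlapping transcripts would end up in the same group in a scheme like this one, which is the limitation described above.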
usage
$ ingenannot -v 2 rescue_effector genes.gff transcripts.gff genome.fasta
positional arguments:
Gff_genes |
Gene Annotation file in GFF/GTF file format |
Gff_transcripts |
Gff file of transcript evidence, compressed with bgzip and indexed with tabix |
Genome |
Genome in fasta format |
optional arguments:
-h, --help |
show this help message and exit |
--signalp SIGNALP |
Path to signalp, default=/usr/local/bin/signalp (from system lookup) |
--tmhmm TMHMM |
Path to tmhmm, default=/usr/local/bin/tmhmm-2.0c/bin/tmhmm (from system lookup) |
--targetp TARGETP |
Path to targetp, default=/usr/local/bin/targetp (from system lookup) |
--effectorp EFFECTORP |
Path to effectorp, default=None (from system lookup) |
--signalp_cpos SIGNALP_CPOS |
Maximal position of signal peptide cleavage site, default=25 |
--effectorp_score EFFECTORP_SCORE |
Minimal effectorp score, default=0.7 |
--max_len MAX_LEN |
Maximal length of protein in aa, default=300 |
--min_len MIN_LEN |
Minimal length of protein in aa, default=30 |
--min_intergenic_len MIN_INTERGENIC_LEN |
Minimal intergenic length to consider, default=100 |
--size_ratio SIZE_RATIO |
Minimal ratio length of CDS/mRNA, default=0.2 |
--unstranded |
Allow analysis of unstranded transcripts, default=False (only stranded transcripts are considered) |
--nested |
Consider nested proteins, not only first start, default=False |
-o OUTPUT, --output OUTPUT |
Output Annotation file in GFF file format, default=effectors.gff3 |
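To make the combined effect of the numeric thresholds easier to see, here is a minimal Python sketch of a candidate filter built from the documented defaults. It is an illustration only, not the tool's code: the `Candidate` record and the `passes_thresholds` function are hypothetical, the inclusive bounds are an assumption, and the tmhmm and targetp checks are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical summary of one candidate protein."""
    len_aa: int                 # protein length in amino acids
    cds_len: int                # CDS length in nucleotides
    mrna_len: int               # mRNA length in nucleotides (assumed > 0)
    signalp_pos: Optional[int]  # cleavage site position, None if no signal peptide
    effectorp_score: float


def passes_thresholds(
    c: Candidate,
    min_len: int = 30,             # --min_len
    max_len: int = 300,            # --max_len
    size_ratio: float = 0.2,       # --size_ratio
    signalp_cpos: int = 25,        # --signalp_cpos
    effectorp_score: float = 0.7,  # --effectorp_score
) -> bool:
    """Return True when the candidate satisfies every threshold (bounds assumed inclusive)."""
    if not (min_len <= c.len_aa <= max_len):
        return False
    if c.cds_len / c.mrna_len < size_ratio:
        return False
    if c.signalp_pos is None or c.signalp_pos > signalp_cpos:
        return False
    return c.effectorp_score >= effectorp_score
```

For instance, a 58 aa protein whose 177 nt CDS sits on a 1000 nt mRNA has a ratio of about 0.18 and would be rejected under the default --size_ratio in this sketch, even with a strong effectorp score.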
inputs
Gene annotation file in GFF/GTF format; GFF file of transcript evidence, compressed with bgzip and indexed with tabix; genome in FASTA format
outputs
Output Gff with effectors:
# gff file
chr_1 ingenannot-effector-rescue gene 847863 848039 . - . ID=gene:effector_1;
chr_1 ingenannot-effector-rescue mRNA 847863 848039 . - . gene_id=MSTRG.188;transcript_id=MSTRG.188.2;signalp=Y;signalp_pos=21;effectorp_score=0.831;tmhmm=0;targetp=S;len_aa=58;ID=mRNA::effector_1;Parent=gene:effector_1;
chr_1 ingenannot-effector-rescue exon 847863 848039 . - . ID=exon:effector_1_1;Parent=mRNA::effector_1
chr_1 ingenannot-effector-rescue CDS 847863 848039 . - 0 ID=cds:effector_1;Parent=mRNA::effector_1
chr_1 ingenannot-effector-rescue gene 2513243 2513563 . - . ID=gene:effector_2;
chr_1 ingenannot-effector-rescue mRNA 2513243 2513563 . - . gene_id=MSTRG.666;transcript_id=MSTRG.666.1;signalp=Y;signalp_pos=23;effectorp_score=0.857;tmhmm=0;targetp=S;len_aa=58;ID=mRNA::effector_2;Parent=gene:effector_2;
chr_1 ingenannot-effector-rescue exon 2513243 2513285 . - . ID=exon:effector_2_1;Parent=mRNA::effector_2
chr_1 ingenannot-effector-rescue exon 2513373 2513440 . - . ID=exon:effector_2_2;Parent=mRNA::effector_2
chr_1 ingenannot-effector-rescue exon 2513498 2513563 . - . ID=exon:effector_2_3;Parent=mRNA::effector_2
chr_1 ingenannot-effector-rescue CDS 2513243 2513285 . - 1 ID=cds:effector_2;Parent=mRNA::effector_2
chr_1 ingenannot-effector-rescue CDS 2513373 2513440 . - 0 ID=cds:effector_2;Parent=mRNA::effector_2
chr_1 ingenannot-effector-rescue CDS 2513498 2513563 . - 0 ID=cds:effector_2;Parent=mRNA::effector_2
chr_1 ingenannot-effector-rescue gene 2591231 2591416 . + . ID=gene:effector_3;
chr_1 ingenannot-effector-rescue mRNA 2591231 2591416 . + . gene_id=MSTRG.690;transcript_id=MSTRG.690.1;signalp=Y;signalp_pos=19;effectorp_score=0.891;tmhmm=0;targetp=S;len_aa=61;ID=mRNA::effector_3;Parent=gene:effector_3;
chr_1 ingenannot-effector-rescue exon 2591231 2591416 . + . ID=exon:effector_3_1;Parent=mRNA::effector_3
chr_1 ingenannot-effector-rescue CDS 2591231 2591416 . + 0 ID=cds:effector_3;Parent=mRNA::effector_3
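The mRNA records carry the prediction details in their attribute column. The following Python sketch shows one way to read them back, assuming the output is a standard tab-separated, nine-column GFF file; `parse_attributes` and `iter_effector_mrnas` are hypothetical helper names, not part of ingenannot.

```python
from typing import Dict, Iterable, Iterator


def parse_attributes(field: str) -> Dict[str, str]:
    """Split a GFF attribute column of the form 'key=value;key=value;' into a dict."""
    attrs: Dict[str, str] = {}
    for item in field.strip().split(";"):
        if item:  # skip the empty item left by a trailing ';'
            key, _, value = item.partition("=")
            attrs[key] = value
    return attrs


def iter_effector_mrnas(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one summary dict per mRNA record, skipping comments and other feature types."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        yield {
            "id": attrs.get("ID"),
            "seqid": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "transcript_id": attrs.get("transcript_id"),
            "effectorp_score": float(attrs["effectorp_score"]),
            "len_aa": int(attrs["len_aa"]),
        }


if __name__ == "__main__":
    # Assumes the default output file name, effectors.gff3.
    with open("effectors.gff3") as handle:
        for record in iter_effector_mrnas(handle):
            print(record["id"], record["transcript_id"], record["effectorp_score"])
```

Keeping the parser separate from the file handling makes it easy to rank or re-filter the predictions, for example by sorting on `effectorp_score`.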