Skip to content

digenoma-lab/prok-annotate

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

61 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

metashot/prok-annotate workflow

A Nextflow pipeline for the functional annotation of prokaryotic genomes and metagenomes.

Cite as: DOI

MetaShot Home

Main features

Quick start

Prokka only

  1. Install Docker (or Singulariry) and Nextflow (see Dependences);

  2. Start running the analysis (Prokka only, bacteria mode):

    nextflow run metashot/prok-annotate \
      --genomes "data/*.fa" \
      --outdir results

Documentation

Run eggNOG, KofamScan and Anvi'o Kofam

eggNOG, KofamScan and Anvi'o Kofam annotations can be performed by setting --run_eggnog, --run_kofamscan and --run_anvio_kofam options:

  nextflow run metashot/prok-annotate \
    --genomes "data/*.fa" \
    --run_eggnog \
    --run_kofamscan \
    --run_anvio_kofam \
    --outdir results

EggNOG, KofamScan and Anvi'o Kofam databases will be automatically downloaded from the Internet and and they will be put in the dbs directory in the results folder. If you want to download the databases manually, run the following lines:

  • EggNOG:

    docker run --rm -v /path/to/eggnog_db:/eggnog_db metashot/eggnog-mapper:2.0.1-3 \
      python /eggnog_mapper/download_eggnog_data.py -y --data_dir /eggnog_db

    Add --eggnog_db /path/to/eggnog_db option to the nextflow run command.

  • KofamScan

    Download KOfam database from ftp://ftp.genome.jp/pub/db/kofam/ and decompress it (e.g. into the /path/to/kofamscan_db folder:

    mkdir /path/to/kofamscan_db
    curl -s -L ftp://ftp.genome.jp/pub/db/kofam/profiles.tar.gz | \
      tar -zxf - -C /path/to/kofamscan_db
    curl -s -L ftp://ftp.genome.jp/pub/db/kofam/ko_list.gz -o ko_list.gz && \
      gunzip -c ko_list.gz > /path/to/kofamscan_db/ko_list

    Add --kofamscan_db /path/to/kofamscan_db option to the nextflow run command.

  • Anvi'o Kofam

    docker run --rm -v /path/to/anvio_kofam_db:/anvio_kofam_db metashot/metashot/anvio:7-2 \
      anvi-setup-kegg-kofams --kegg-data-dir /anvio_kofam_db

    Add --anvio_kofam_db /path/to/anvio_kofam_db option to the nextflow run command.

Metagenomes

For metagenomes, we suggest to run the pipeline using the --metagenome option using EggNOG mapper only:

  nextflow run metashot/prok-annotate \
    --genomes "contigs/*.fa" \
    --metagenome \
    --skip_prokka \
    --run_eggnog \
    --eggnog_db_mem \
    --outdir results

Parameters

See the file nextflow.config for the complete list of parameters.

Output

The files and directories listed below will be created in the output directory after the pipeline has finished.

Main

  • prokka: Prokka outputs (documentation);
  • eggnog_*.tsv: count matrix for each transferred annotation;
  • kofamscan_hits.tsv: count matrix for KofamScan hits;
  • anvio_kofam_hits.tsv: count matrix for Anvio Kofam hits;
  • anvio_modules.tsv: module completeness matrix.

Secondary

System requirements

Please refer to System requirements for the complete list of system requirements options.

About

Annotation pipeline for prokaryotic genomes

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Nextflow 63.3%
  • Python 36.7%