
asm4pg

This is an automatic and reproducible genome assembly workflow for pangenomic applications using PacBio HiFi data.

This workflow uses Snakemake to quickly assemble genomes and produces an HTML report summarizing the assembly statistics.

[Figure: workflow DAG]

Repo directory structure

├── README.md
├── job.sh
├── local_run.sh
├── doc
├── workflow
│   ├── scripts
│   └── Snakefile
└── .config
    ├── snakemake_profile
    └── masterconfig.yaml

Requirements

Miniforge (Snakemake), Singularity/Apptainer

How to Use

1. Set up

Clone the Git repository

git clone https://forgemia.inra.fr/asm4pg/GenomAsm4pg.git && cd GenomAsm4pg

All other tools run in Singularity/Apptainer images that Snakemake downloads automatically. The total size of the images is ~5.5 GB.

2. Configure the pipeline

  • Edit the masterconfig.yaml file in the .config/ directory with your sample information.

3. Run the workflow

A. On an HPC cluster

  • Edit job.sh with the paths to the Singularity/Apptainer and Miniforge modules
  • Provide an environment with Snakemake and snakemake-executor-plugin-slurm in job.sh, under source activate wf_env. You can create it like this:
conda create -n wf_env -c conda-forge -c bioconda snakemake=8.4.7 snakemake-executor-plugin-slurm

Use Miniforge with the conda-forge channel; see why here (in French).

  • Add the log directory for SLURM
mkdir slurm_logs
  • Run the workflow :
sbatch job.sh dry # Check for warnings
sbatch job.sh run # Then

Note: If your account name can't be determined automatically, add it to the .config/snakemake/profiles/slurm/config.yaml file.
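
Because sbatch returns as soon as a job is queued, the two submissions above can be chained so that the real run starts only after the dry run has finished successfully. The sketch below assumes that job.sh exits with a non-zero status when the dry run fails, which this README does not state. The function name run_after_dry and the SUBMIT variable are illustrative and not part of asm4pg. It relies on the --wait option of sbatch, which blocks until the job terminates and returns the job's exit code.

```shell
#!/bin/sh
# SUBMIT can be overridden to exercise the logic on a machine without SLURM.
SUBMIT=${SUBMIT:-sbatch}

# Submit the dry run, wait for it, and submit the real run only on success.
run_after_dry() {
    # --wait makes sbatch block until the job ends and return its exit code.
    if "$SUBMIT" --wait job.sh dry; then
        "$SUBMIT" job.sh run
    else
        echo "dry run failed; check slurm_logs/ before submitting the run" >&2
        return 1
    fi
}

# Only submit when the scheduler command is actually available.
if command -v "$SUBMIT" >/dev/null 2>&1; then
    run_after_dry
fi
```

A zero exit status does not rule out warnings, so reading the dry-run log before trusting the result is still worthwhile.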

B. Locally

  • Make sure you have Snakemake and Singularity/Apptainer installed
  • Run the workflow :
./local_run.sh dry # Check for warnings
./local_run.sh run # Then
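
A quick way to confirm the prerequisites before launching is to look for the executables on PATH. The following is a minimal sketch: the helper name check_tool is illustrative, and it assumes the tools are exposed under the command names snakemake, apptainer and singularity.

```shell
#!/bin/sh
# Print whether a command is available on PATH; return non-zero if it is not.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

if ! check_tool snakemake; then
    echo "Snakemake is required to run the workflow" >&2
fi

# Either container runtime is sufficient, so only complain if both are absent.
if ! check_tool apptainer && ! check_tool singularity; then
    echo "Singularity or Apptainer is required to run the workflow" >&2
fi
```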

Input Conversion

Currently, asm4pg requires fasta.gz files. To convert your fastq or bam files to this format, you can use the following script:

./workflow/scripts/input_conversion.sh -i <input_file> -o <output_file>
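
When several read files need converting, the command above can be wrapped in a loop. This is a sketch only: the directory names raw_reads and converted and the helper output_name are hypothetical, and it assumes that the script takes one input and one output path per call through -i and -o, as shown above, and that it can read gzip-compressed fastq files.

```shell
#!/bin/sh
# Derive the output file name: drop the directory and the known
# read-file extension, then append .fasta.gz.
output_name() {
    base=${1##*/}
    case "$base" in
        *.fastq.gz) base=${base%.fastq.gz} ;;
        *.fq.gz)    base=${base%.fq.gz} ;;
        *.fastq)    base=${base%.fastq} ;;
        *.fq)       base=${base%.fq} ;;
        *.bam)      base=${base%.bam} ;;
    esac
    printf '%s.fasta.gz\n' "$base"
}

IN_DIR=raw_reads    # hypothetical directory holding fastq/bam files
OUT_DIR=converted   # hypothetical directory for the fasta.gz results

for f in "$IN_DIR"/*.fastq.gz "$IN_DIR"/*.fastq "$IN_DIR"/*.bam; do
    [ -e "$f" ] || continue   # skip glob patterns that matched nothing
    mkdir -p "$OUT_DIR"
    ./workflow/scripts/input_conversion.sh -i "$f" -o "$OUT_DIR/$(output_name "$f")"
done
```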

Using the full potential of the workflow

Asm4pg has many options. To change the default values or learn more about the workflow, please refer to the documentation.

How to cite asm4pg?

We are currently writing a publication about asm4pg. Meanwhile, if you use the pipeline, please cite it using the address of this repository.

License

The content of this repository is licensed under the GNU GPLv3.

Contacts

For troubleshooting, bug reports, or feature suggestions, please use the issue tab of this repository. For any other question, or if you want to help develop asm4pg, please contact Ludovic Duvaux at ludovic.duvaux@inrae.fr