16S Amplicon Initial Processing Pipeline

This is a snakemake pipeline for initial handling of 16S amplicon datasets. It seeks to automate many of the steps that I describe here.

It takes as raw input paired-end sequence files named in the format SAMPLE-NAME_R1_001.fastq.gz, which our MiSeq produces by default. The final output is an unfiltered OTU table suitable for further work with R, MOTHUR, or QIIME. This repository is unlikely to be maintained: I will probably end up testing QIIME2 and may migrate to newer OTU picking methods. However, feel free to use it as a (crude) reference snakemake pipeline, and don't hesitate to contact me at waoverholt@gmail.com if you have specific questions about it.
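As a rough illustration of the naming convention, the sketch below pairs forward and reverse files by sample name. It is not taken from the Snakefile: the `pair_fastq_files` helper and the regular expression are hypothetical, and it assumes the reverse reads follow the same scheme with `_R2_001.fastq.gz`.

```python
import re
from collections import defaultdict

# Assumed naming scheme: SAMPLE-NAME_R1_001.fastq.gz (forward reads) and
# SAMPLE-NAME_R2_001.fastq.gz (reverse reads). The R2 form is an assumption.
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_001\.fastq\.gz$")


def pair_fastq_files(filenames):
    """Return {sample: (R1 file, R2 file)} for samples that have both reads."""
    reads = defaultdict(dict)
    for name in filenames:
        match = FASTQ_PATTERN.match(name)
        if match is None:
            continue  # ignore files that do not follow the naming scheme
        reads[match.group("sample")][match.group("read")] = name
    return {
        sample: (found["1"], found["2"])
        for sample, found in sorted(reads.items())
        if "1" in found and "2" in found
    }


# Hypothetical file names, for illustration only.
example_files = [
    "soil-01_R1_001.fastq.gz",
    "soil-01_R2_001.fastq.gz",
    "soil-02_R1_001.fastq.gz",  # no matching R2, so this sample is dropped
]
print(pair_fastq_files(example_files))
```

Samples missing either read are silently skipped here; a real pipeline would more likely report them.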

Dependencies

This pipeline was written with snakemake and requires python3. It is set up so that its dependencies can be installed through miniconda. Follow the installation instructions for python and miniconda before continuing.

Once you have miniconda installed, you can use the environment.yaml file to install all dependencies in a virtual environment:

    conda env create -n snakemake_16S python=3.5 --file environment.yaml

The environment name can be anything you'd like.

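To set up your own environment instead, the pipeline needs its packages and programs on your path: at minimum snakemake and python3, plus the programs used in the Pipeline Summary (pear, vsearch, MOTHUR, and swarm).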

Config File

The pipeline requires a config.yaml file to run. Please modify the existing config file for your datasets.

Pipeline Summary

Merge paired reads with pear.
Quality control with vsearch.
Adapter trimming with MOTHUR.
Dereplication with vsearch.
De novo and reference chimera detection with vsearch.
OTU picking with swarm.
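To make the order of the stages concrete, here is a minimal sketch that builds one command line per step without running anything. It is not the Snakefile's actual rules: the `build_commands` helper, the intermediate file names, the oligos file, and parameter values such as `--fastq_maxee 1.0` and `-d 1` are all assumptions chosen for illustration.

```python
import shlex


def build_commands(sample, r1, r2, chimera_db, oligos):
    """Return (stage, argv) pairs in pipeline order. Nothing is executed."""
    # Intermediate file names below are hypothetical, except where noted.
    merged = f"{sample}.assembled.fastq"  # PEAR writes <prefix>.assembled.fastq
    filtered = f"{sample}.filtered.fasta"
    trimmed = f"{sample}.filtered.trim.fasta"  # trim.seqs inserts ".trim" before the extension
    derep = f"{sample}.derep.fasta"
    denovo = f"{sample}.denovo.fasta"
    clean = f"{sample}.nonchimeras.fasta"
    return [
        ("merge", ["pear", "-f", r1, "-r", r2, "-o", sample]),
        ("quality", ["vsearch", "--fastq_filter", merged,
                     "--fastq_maxee", "1.0", "--fastaout", filtered]),
        ("trim", ["mothur", f"#trim.seqs(fasta={filtered}, oligos={oligos})"]),
        ("dereplicate", ["vsearch", "--derep_fulllength", trimmed,
                         "--output", derep, "--sizeout"]),
        ("chimera_denovo", ["vsearch", "--uchime_denovo", derep,
                            "--nonchimeras", denovo]),
        ("chimera_reference", ["vsearch", "--uchime_ref", denovo,
                               "--db", chimera_db, "--nonchimeras", clean]),
        ("otu_picking", ["swarm", "-d", "1", "-f", "-z",
                         "-w", f"{sample}.seeds.fasta",
                         "-o", f"{sample}.swarms.txt", clean]),
    ]


# Hypothetical inputs, printed as shell-quoted command lines for inspection.
for stage, argv in build_commands(
    "sample1", "sample1_R1_001.fastq.gz", "sample1_R2_001.fastq.gz",
    "chimera_ref.fasta", "primers.oligos",
):
    print(f"{stage}: {shlex.join(argv)}")
```

Printing the commands rather than running them keeps the sketch usable on a machine without the tools installed; the Snakefile and config.yaml remain the reference for the real parameters.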

Chimera reference database

The chimera reference database I use is too large to be hosted on github. I am currently using the 97% rep_set from the SILVA128 database.

The resulting OTU table is not abundance filtered. The tab-delimited and biom-format tables produced contain identical data. Further analyses can proceed using QIIME or R. To work with QIIME, you will need to deactivate the conda environment first.
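Because the table comes out unfiltered, a common next step is to drop very low-abundance OTUs. The sketch below shows one way to do that on the tab-delimited table with the Python standard library only. The layout it assumes (a single header row, OTU identifiers in the first column, one count column per sample) and the `filter_otu_table` helper are assumptions, not something the pipeline guarantees.

```python
import csv


def filter_otu_table(lines, min_total=2):
    """Keep the header plus every OTU whose summed count is >= min_total.

    `lines` is any iterable of tab-delimited text lines, such as an open file.
    Assumed layout: header row first, OTU ID in column 1, counts afterwards.
    """
    rows = csv.reader(lines, delimiter="\t")
    kept = [next(rows)]  # header row is passed through unchanged
    for row in rows:
        if not row:
            continue  # skip blank lines
        total = sum(float(value) for value in row[1:])
        if total >= min_total:
            kept.append(row)
    return kept


# Hypothetical table, for illustration only.
example = [
    "OTU_ID\tsampleA\tsampleB",
    "otu1\t10\t5",
    "otu2\t1\t0",  # singleton, removed with min_total=2
    "otu3\t0\t2",
]
for row in filter_otu_table(example):
    print("\t".join(row))
```

With the default `min_total=2` this removes singletons only; the right threshold depends on the dataset and is left to the analysis.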
