ScaleBio Seq Suite: Tagmentation Toolkit Workflow

This Nextflow workflow analyzes ScaleBio Tagmentation Toolkit sequencing libraries. It processes data from sequencing reads to alignments, single-cell outputs (peak-count matrix, etc.), and QC reports. Optionally, an initial downstream ATAC analysis with ArchR can be run automatically.

Getting started

- First install Nextflow
- Download this workflow to your machine
- Install dependencies
- Get the bead barcode sequence file
- Launch the small pipeline test run
- Download / configure a reference genome for your samples
- Create a samples.csv table for your samples
- Create runParams.yml, specifying inputs and analysis options for your run
- Launch the workflow for your run

Inputs

Sequencing reads
- Path to the Illumina Sequencer RunFolder (bcl files)
- If you prefer to start from fastq files generated before running this workflow, see Fastq generation.
Sample Table
- A .csv file listing all samples in the analysis with their library (PCR) index and (optional) tagmentation sample barcode sequences. See samples.csv.
Reference Genome
- The workflow requires a reference genome, including a bowtie2 index for alignment and gene annotation; see Reference Genomes.
- Pre-built reference genomes are currently available for human (grch38) and mouse (mm39).

Barcodes

The workflow requires information on the location and expected sequences of all cell barcodes; see analysisParameters.md.

Before running any analysis, you must add a file with the bead barcode sequences from the underlying droplet system software. For ATAC, this file is named 737K-cratac-v1.txt.gz and must be copied to ScaleTagToolkit/references.
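
The copy step can be scripted so that a missing file is caught before a run starts. The following is a minimal sketch and not part of the workflow: the function name install_bead_barcodes and both example paths are hypothetical, and the location of 737K-cratac-v1.txt.gz inside your droplet system software installation has to be looked up locally.

```shell
# Hypothetical helper: copy the bead barcode file into the workflow's
# references directory.
#   $1 = path to 737K-cratac-v1.txt.gz (depends on your droplet software install)
#   $2 = path to the ScaleTagToolkit checkout
install_bead_barcodes() {
    src="$1"
    toolkit="$2"
    if [ ! -f "$src" ]; then
        echo "Barcode file not found: $src" >&2
        return 1
    fi
    if [ ! -d "$toolkit/references" ]; then
        echo "No references directory under: $toolkit" >&2
        return 1
    fi
    # Keep the file name given in this README
    cp "$src" "$toolkit/references/737K-cratac-v1.txt.gz"
}

# Example call (placeholder paths):
# install_bead_barcodes /PATH/TO/737K-cratac-v1.txt.gz /PATH/TO/ScaleTagToolkit
```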

Outputs

The workflow produces alignments (.bam and fragments.bed), a cell-by-peak count-matrix (.mtx), QC reports and many other files. See Outputs for a full list.

Workflow Execution

Workflow test

A small test run, with all input data stored online, can be launched with:

nextflow run /PATH/TO/ScaleTagToolkit -profile PROFILE -params-file /PATH/TO/ScaleTagToolkit/docs/examples/runParams.yml --outDir output

See dependencies for the best PROFILE to use on your system.

Nextflow Command-line

Note that Nextflow options are given with a single dash (e.g. -profile), while workflow parameters are given with a double dash (e.g. --outDir).

See the Nextflow command-line documentation for the options to run nextflow on different systems (including HPC clusters and cloud compute).

Rerunning Reporting

Once output is generated, you may wish to rerun the QC filtering and HTML report generation scripts, for example to change the filter thresholds that determine which cells are included in the report. See Rerunning Scripts for details.

Configuration

Specifying Analysis Parameters

Analysis parameters (inputs, options, etc.) can be defined either in a runParams.yml file or directly on the nextflow command-line (e.g. --samples samples.csv). See analysisParameters for details on the options.
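
As an illustration of the two routes, the sketch below writes a small runParams.yml and shows the matching command lines as comments. Only samples and outDir are taken from this README; their values are placeholders, and PROFILE is left unfilled as in the test command. A real run also needs the sequencing-read and reference-genome inputs described under Inputs, whose parameter names should be taken from analysisParameters.

```shell
# Write a minimal parameter file (values are placeholders)
cat > runParams.yml <<'EOF'
samples: samples.csv
outDir: output
EOF

# Route 1: pass the parameters through the file
#   nextflow run /PATH/TO/ScaleTagToolkit -profile PROFILE -params-file runParams.yml
#
# Route 2: give the same parameters directly on the command line
#   nextflow run /PATH/TO/ScaleTagToolkit -profile PROFILE --samples samples.csv --outDir output
```

The two routes can be combined. In Nextflow, a parameter given on the command line takes precedence over the same parameter in the -params-file, which is convenient for changing --outDir between runs without editing the file.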

Quality Control and ArchR Analysis Parameters

Additional parameters used by the QC filtering process (cellFilter) and the automated ArchR analysis process (archrAnalysis) are defined in qcAndArchR.yml. For guidance on modifying them, see additionalInputParams.

Config File

In addition to the analysis parameters, a user-specific Nextflow configuration file can be used for system settings (compute and storage resources, resource limits, storage paths, etc.):

-c path/to/user.config
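
A user configuration file might look like the sketch below. The process, executor and workDir settings are standard Nextflow configuration options, but the specific values (CPU and memory defaults, queue size, scratch path) are illustrative assumptions and not recommendations from ScaleBio.

```shell
# Write an example user.config (illustrative values only)
cat > user.config <<'EOF'
// Default resources requested per process
process {
    cpus = 4
    memory = '16 GB'
}
// Maximum number of tasks the executor handles in parallel
executor {
    queueSize = 20
}
// Location for Nextflow's intermediate files (hypothetical path)
workDir = '/scratch/nextflow-work'
EOF

# Then add it to the launch command:
#   nextflow run /PATH/TO/ScaleTagToolkit -profile PROFILE -params-file runParams.yml -c user.config
```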

See Nextflow configuration for the way different configuration files, parameter files and the command-line interact.

Dependency Management

The options for providing all required dependencies are described in dependencies. Follow one of the approaches described there, then run Nextflow with the corresponding -profile.

Running in the cloud

Nextflow itself supports execution using AWS, Azure and Google Cloud.

In addition, Nextflow Tower offers another way to manage and execute Nextflow workflows online.

Versions and Updates

See the change log.

License

By purchasing product(s) and downloading the software product(s) of ScaleBio, You accept all of the terms of the License Agreement. If You do not agree to these terms and conditions, You may not use or download any of the software product(s) of ScaleBio.
