Workflow Inputs

The Toil RNA-seq workflow requires several input files to run. These files are hosted on Synapse and by UC Santa Cruz. They were built using GENCODE v23 and hg38 (see Methods).

Building Your Own Indices

To build your own indices, you need a FASTA reference file and an annotation GTF. Supply both files to toil-rnaseq-inputs.

An example command to generate indices for a mouse genome:

toil-rnaseq-inputs --ref /mnt/mm10.fa --gtf /mnt/mouse-annotation.gtf --star --rsem --kallisto --hera
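
If you build indices for several genomes, it may help to assemble this command programmatically. The following is a minimal sketch that uses only the flags shown in the example above. The helper names build_inputs_command and run_inputs_command are hypothetical and not part of toil-rnaseq, and the sketch assumes that each index flag can be passed independently of the others.

```python
import subprocess

# Index flags that appear in the example command above.
SUPPORTED_TOOLS = ("star", "rsem", "kallisto", "hera")


def build_inputs_command(ref, gtf, tools=SUPPORTED_TOOLS):
    """Return the toil-rnaseq-inputs argument list for a reference and GTF."""
    unknown = [t for t in tools if t not in SUPPORTED_TOOLS]
    if unknown:
        raise ValueError(f"unsupported tool(s): {unknown}")
    cmd = ["toil-rnaseq-inputs", "--ref", ref, "--gtf", gtf]
    # Assumption: each index flag may be given on its own.
    cmd.extend(f"--{tool}" for tool in tools)
    return cmd


def run_inputs_command(ref, gtf, tools=SUPPORTED_TOOLS):
    """Run the command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(build_inputs_command(ref, gtf, tools), check=True)
```

Calling build_inputs_command('/mnt/mm10.fa', '/mnt/mouse-annotation.gtf') reproduces the argument list of the mouse example, which can be inspected before anything is executed.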

Direct Links

STAR [25G]
RSEM [1.1G]
Kallisto [2.4G]
Hera [2.2G]
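
Because the STAR index alone is 25G, it can be worth checking free disk space before downloading. The sketch below sums the sizes listed above and compares the total with the free space reported by the standard library. The helper names are hypothetical, and treating "G" as 10**9 bytes is an assumption.

```python
import shutil

# Approximate sizes in GB, copied from the list above.
INDEX_SIZES_GB = {"star": 25.0, "rsem": 1.1, "kallisto": 2.4, "hera": 2.2}


def required_gb(names=None):
    """Sum the listed sizes for the selected indices (all by default)."""
    selected = INDEX_SIZES_GB if names is None else names
    return sum(INDEX_SIZES_GB[name] for name in selected)


def has_room(path=".", names=None):
    """Return True if the filesystem holding `path` has enough free space."""
    # Assumption: 1 G in the list above means 10**9 bytes.
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb(names)
```

With all four indices selected, the listed sizes add up to roughly 30.7 GB.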

Synapse

Register for a Synapse account
Either download the files from the website GUI or use the Python API:
pip install synapseclient
python
- import synapseclient
- syn = synapseclient.Synapse()
- syn.login('foo@bar.com', 'password')
- Get the RSEM reference (1 GB)
  - syn.get('syn5889216', downloadLocation='.')
- Get the Kallisto index (2 GB)
  - syn.get('syn5886142', downloadLocation='.')
- Get the STAR index (25 GB)
  - syn.get('syn5886182', downloadLocation='.')
- Get the Hera index (2 GB)
  - syn.get('syn11678373', downloadLocation='.')
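
The individual calls above can also be wrapped in a small loop. This is a minimal sketch rather than part of the workflow: download_indices is a hypothetical helper, and it expects a client that is already logged in, such as the syn object from the session above.

```python
# Synapse IDs copied from the list above.
INDEX_IDS = {
    "rsem": "syn5889216",
    "kallisto": "syn5886142",
    "star": "syn5886182",
    "hera": "syn11678373",
}


def download_indices(syn, names=None, download_location="."):
    """Fetch the selected indices with an already logged-in Synapse client.

    Returns a dict mapping each index name to the local path of its file.
    """
    selected = list(INDEX_IDS) if names is None else list(names)
    unknown = [n for n in selected if n not in INDEX_IDS]
    if unknown:
        raise KeyError(f"unknown index name(s): {unknown}")
    paths = {}
    for name in selected:
        entity = syn.get(INDEX_IDS[name], downloadLocation=download_location)
        paths[name] = entity.path  # local path of the downloaded file
    return paths
```

For example, download_indices(syn, names=['rsem', 'kallisto']) fetches only the two smaller files and skips the 25 GB STAR index.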

Test Inputs

These inputs are used specifically for a test sample run during continuous integration. The test sample was generated from reads mapped to chromosome 6. If you run these test samples, enable the ci option in the config so that the appropriate resources are requested.

Test Sample

Direct Links

Small RSEM Reference [8.8M]
Small STAR Index [2.0G]

There are no specific test inputs for Hera and Kallisto; use the standard inputs instead.

Synapse

python
- Get the sample (500 KB)
  - syn.get('syn9924961', downloadLocation='.')
  - syn.get('syn9924962', downloadLocation='.')
- Get the small RSEM reference (8 MB)
  - syn.get('syn9772189', downloadLocation='.')
- Get the small STAR reference (2 GB)
  - syn.get('syn9772190', downloadLocation='.')
- Get the Kallisto reference (2 GB, same as regular input)
  - syn.get('syn5886142', downloadLocation='.')

When running the pipeline, set the ci option in the config to true so that the pipeline requests an appropriate amount of memory when running STAR.
