Direct RNA and cDNA Sequencing of a human transcriptome on Oxford Nanopore MinION and GridION

Introduction

We have sequenced RNA from the CEPH1463 (NA12878/GM12878, Ceph/Utah pedigree) human genome reference standard on the Oxford Nanopore MinION, using direct RNA sequencing kits (30 flowcells) and the 1D ligation kit (SQK-LSK108), on R9.4 flowcells (FLO-MIN106). The RNA was extracted from the cultured GM12878 human cell line.

Data reuse and license

This data is released under the Creative Commons CC-BY license, and we encourage its reuse in your own analyses and publications. If you use it, we would be grateful if you would cite the reference below.

Citation

Rachael E. Workman, Alison D. Tang, Paul S. Tang, Miten Jain, John R. Tyson, Roham Razaghi, Philip C. Zuzarte, Timothy Gilpatrick, Alexander Payne, Joshua Quick, Norah Sadowski, Nadine Holmes, Jaqueline Goes de Jesus, Karen L. Jones, Cameron M. Soulette, Terrance P. Snutch, Nicholas Loman, Benedict Paten, Matthew Loose, Jared T. Simpson, Hugh E. Olsen, Angela N. Brooks, Mark Akeson & Winston Timp. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nature Methods. doi:10.1038/s41592-019-0617-2

rel2 [fixed] [Update December 2020] Basecalls (Guppy 4.2.2)

Full Native RNA dataset (30 runs), full cDNA dataset (12 runs), and IVT RNA dataset (2 runs). The data were re-basecalled using the Guppy 4.2.2 flip-flop high-accuracy (hac) models.

| File type | # runs | Link |
|---|---|---|
| Native RNA | 30 | FASTQ, Summary File (gzip), Multi_FAST5 |
| cDNA | 12 | FASTQ, Summary File (gzip), Multi_FAST5 |
| IVT RNA | 2 | FASTQ, Summary File (gzip), Multi_FAST5 |

Contributors

  • Nick Loman, Josh Quick, Andrew Beggs, Jaqueline Goes de Jesus (University of Birmingham)
  • Matt Loose, Nadine Holmes, Matthew Carlile (University of Nottingham)
  • Winston Timp, Roham Razaghi, Timothy Gilpatrick, Norah Sadowski, Rachael E. Workman (JHU)
  • Jared Simpson, Phil Zuzarte, Paul Tang (OICR)
  • Terry Snutch, John Tyson (UBC)
  • Mark Akeson, Angela N. Brooks, Hugh E. Olsen, Benedict Paten, Alison Tang, Miten Jain (UCSC)

Basecalls (Albacore 2.1)

Full Native RNA dataset (30 runs) and full cDNA dataset (12 runs).

| File type | # runs | # reads | Mean read length (b) | Read N50 (b) | Link |
|---|---|---|---|---|---|
| Native RNA Pass | 30 | 10302647 | 1030.24 | 1334 | FASTQ |
| Native RNA Fail | 30 | 2686736 | 430.96 | 840 | FASTQ |
| cDNA Pass | 12 | 15152101 | 932.86 | 1072 | FASTQ |
| cDNA Fail | 12 | 9129338 | 661.90 | 841 | FASTQ |
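
The mean length and read N50 reported above can be recomputed from any of the FASTQ files. The following is a minimal sketch, assuming the usual definition of N50 (the length L such that reads of length L or longer contain at least half of all sequenced bases) and plain four-line FASTQ records. The function names `read_lengths`, `mean_length` and `n50` are illustrative and are not taken from the consortium's pipeline; the demo records are made up.

```python
import io
from typing import Iterable, Iterator, List, TextIO


def read_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the sequence length of each record in a four-line FASTQ stream."""
    for line_number, line in enumerate(handle):
        # In a four-line record the sequence is the second line (index 1, 5, 9, ...).
        if line_number % 4 == 1:
            yield len(line.strip())


def mean_length(lengths: Iterable[int]) -> float:
    """Return the arithmetic mean read length, or 0.0 for no reads."""
    values: List[int] = list(lengths)
    return sum(values) / len(values) if values else 0.0


def n50(lengths: Iterable[int]) -> int:
    """Return the read N50, or 0 for no reads."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first read where the longest reads cover half the bases.
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    # Made-up records for illustration only, not consortium data.
    demo = io.StringIO(
        "@r1\nACGTACGT\n+\nIIIIIIII\n"
        "@r2\nACG\n+\nIII\n"
        "@r3\nACGTA\n+\nIIIII\n"
    )
    lengths = list(read_lengths(demo))
    print(len(lengths), mean_length(lengths), n50(lengths))
```

For a gzip-compressed FASTQ, the same functions can be fed a handle opened with `gzip.open(path, "rt")`.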

Combined Albacore Summary

FASTQ (Sequence Data), FAST5 (Raw Signal Data), and Bulk FAST5 (Continuous Data)

FASTQ and FAST5 files for the dataset (split by centre and sample) can be found here. The continuous Bulk FAST5 files can be visualized using bulkvis.

Alignment Files

All alignments were performed using minimap2.

| File type | Reference | Params | BAM | BAI |
|---|---|---|---|---|
| Native RNA Pass | GRCh38_full_analysis_set_plus_decoy_hla.fa | -ax splice -uf -k14 | hg38 BAM | hg38 BAI |
| Native RNA Pass | SIRVome_isoforms_ERCCs_170612a.fasta | -ax splice --splice-flank=no | SIRVome BAM | SIRVome BAI |
| Native RNA Fail | GRCh38_full_analysis_set_plus_decoy_hla.fa | -ax splice -uf -k14 | hg38 BAM | hg38 BAI |
| Native RNA Fail | SIRVome_isoforms_ERCCs_170612a.fasta | -ax splice --splice-flank=no | SIRVome BAM | SIRVome BAI |
| cDNA Pass | GRCh38_full_analysis_set_plus_decoy_hla.fa | -ax splice -uf -k14 | hg38 BAM | hg38 BAI |
| cDNA Pass | SIRVome_isoforms_ERCCs_170612a.fasta | -ax splice --splice-flank=no | SIRVome BAM | SIRVome BAI |
| cDNA Fail | GRCh38_full_analysis_set_plus_decoy_hla.fa | -ax splice -uf -k14 | hg38 BAM | hg38 BAI |
| cDNA Fail | SIRVome_isoforms_ERCCs_170612a.fasta | -ax splice --splice-flank=no | SIRVome BAM | SIRVome BAI |
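
The parameter strings in the table map directly onto a minimap2 command line of the form `minimap2 [options] <reference> <reads>`. As a sketch only, the helper below assembles such a command as an argument list without running it. The function name `minimap2_command`, the dictionary keys and the FASTQ file name are hypothetical; the flags and reference file names are copied verbatim from the table.

```python
import shlex
from typing import Dict, List

# Parameter strings copied from the Params column of the alignment table.
PARAMS: Dict[str, str] = {
    "hg38": "-ax splice -uf -k14",
    "SIRVome": "-ax splice --splice-flank=no",
}

# Reference file names copied from the Reference column of the same table.
REFERENCES: Dict[str, str] = {
    "hg38": "GRCh38_full_analysis_set_plus_decoy_hla.fa",
    "SIRVome": "SIRVome_isoforms_ERCCs_170612a.fasta",
}


def minimap2_command(target: str, fastq: str) -> List[str]:
    """Build (but do not execute) a minimap2 argument list for one target."""
    if target not in PARAMS:
        raise ValueError(f"unknown target: {target!r}")
    return ["minimap2", *shlex.split(PARAMS[target]), REFERENCES[target], fastq]


if __name__ == "__main__":
    # "native_rna_pass.fastq" is a placeholder name, not a file from the dataset.
    command = minimap2_command("hg38", "native_rna_pass.fastq")
    print(shlex.join(command))
```

With `-a`, minimap2 writes SAM to standard output. Producing sorted BAM and BAI files like those listed above would need a further conversion and indexing step, for example with samtools; that step is an assumption here, since this page does not say how the BAM files were generated.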

Analyses

Various analyses from the consortium work and the associated files can be found here.

Reference Files

Details on the reference files used for analyses, and their download links, can be found here.

External Links

Heng Li has made a custom track for the UCSC genome browser from the direct RNA dataset. Thanks Heng! [1]

References

[1] Li, H. Twitter link

Acknowledgements

We are most grateful to Daniel Garalde, Daniel Jachimowicz, Andy Heron and Rosemary Dokos at Oxford Nanopore Technologies for technical and logistical assistance. We are grateful to Angel Pizarro and Jed Sundwall at Amazon Web Services for hosting this dataset as an AWS Open Data set.